These datasets contain reviews from the Steam video game platform, and information about which games were bundled together.
Metadata includes
reviews
purchases, plays, recommends (likes)
product bundles
pricing information
Basic Statistics:
Reviews: 7,793,069
Users: 2,567,538
Items: 15,474
Bundles: 615
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset provides a comprehensive collection of features for building a content-based recommendation system in an ecommerce environment. Content filtering, which relies on users' interests and past activities, is a prevalent method for suggesting products tailored to individual preferences.
Each entry in the dataset represents a product along with various attributes that can be leveraged for recommendation purposes. Here's an overview of the features included:
1) Number of clicks on similar products: Indicates the popularity or engagement level of similar items.
2) Number of similar products purchased so far: Reflects the conversion rate of similar products.
3) Average rating given to similar products: Offers insight into the perceived quality of comparable items.
4) Gender: Allows for gender-specific recommendations.
5) Median purchasing price (in rupees): Provides pricing information for segmentation or pricing strategy analysis.
6) Rating of the product: The rating of the product itself, indicating its overall quality.
7) Brand of the product: Brand loyalty or preference can influence recommendations.
8) Customer review sentiment score (overall): Sentiment analysis of customer reviews, indicating overall satisfaction.
9) Price of the product: The actual price of the product.
10) Holiday: Seasonal or holiday-specific buying patterns.
11) Season: Seasonal preferences may influence product choices.
12) Geographical locations: Regional preferences or availability may impact recommendations.
13) Probability for the product to be recommended to the person: The likelihood of recommending the product to a specific user based on their profile and past behavior.
With this rich set of features, businesses can implement sophisticated recommendation algorithms to personalize the shopping experience for users, ultimately leading to increased customer satisfaction, engagement, and sales.
These datasets contain 1.48 million question and answer pairs about products from Amazon.
Metadata includes
question and answer text
is the question binary (yes/no), and if so does it have a yes/no answer?
timestamps
product ID (to reference the review dataset)
Basic Statistics:
Questions: 1.48 million
Answers: 4,019,744
Labeled yes/no questions: 309,419
Number of unique products with questions: 191,185
These datasets include ratings as well as social (or trust) relationships between users. Data are from LibraryThing (a book review website) and epinions (general consumer reviews).
Metadata includes
reviews
price paid (epinions)
helpfulness votes (librarything)
flags (librarything)
This file contains purchase data from April 2020 to November 2020 from a large home appliances and electronics online store.
Each row in the file represents an event. All events are related to products and users. Each event is like a many-to-many relation between products and users.
The data was collected by the Open CDP project. Feel free to use this open source customer data platform.
Check out other datasets:
A session can have multiple purchase events. It's ok, because it's a single order.
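As a rough illustration of the point above, purchase events that share a session can be collapsed into a single order. The field names (`user_session`, `product_id`, `price`) and the sample rows are assumptions made for this sketch, not the documented schema of the file.

```python
from collections import defaultdict

# Hypothetical purchase events; field names and values are assumed for illustration.
events = [
    {"user_session": "s1", "product_id": 101, "price": 199.0},
    {"user_session": "s1", "product_id": 102, "price": 25.5},
    {"user_session": "s2", "product_id": 103, "price": 349.0},
]


def group_into_orders(events):
    """Group purchase events by session so that each session becomes one order."""
    orders = defaultdict(lambda: {"items": [], "total": 0.0})
    for event in events:
        order = orders[event["user_session"]]
        order["items"].append(event["product_id"])
        order["total"] += event["price"]
    return dict(orders)


orders = group_into_orders(events)
for session, order in orders.items():
    print(session, order["items"], order["total"])
```

Grouping on the session key keeps the two purchase events of session `s1` together as one order instead of counting them as two.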
Thanks to REES46 Marketing Platform for this dataset.
You can use this dataset for free. Just mention the source of it: link to this page and link to REES46 Marketing Platform.
These datasets contain attributes about products sold on ModCloth and Amazon which may be sources of bias in recommendations (in particular, attributes about how the products are marketed). Data also includes user/item interactions for recommendation.
Metadata includes
ratings
product images
user identities
item sizes, user genders
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is a dataset obtained from an online survey conducted in August 2020.
In the survey, participants were introduced to the concept of a smartphone-based shopping assistant application with the help of pictures and videos when shopping with and without the application. Participants were presented with three different shopping scenarios. In each scenario, we showed products on a shelf (groceries, luxury chocolate, shoes, books). The first shopping scenario was a regular shopping scenario (RSS), the second was an augmented reality shopping scenario (ARSS), and the third was an augmented reality shopping scenario with explainable AI features (XARSS). For each scenario participants had to answer questions about how they perceived the scenario and how it influenced their overall purchase intention.
The present work was conducted within the Innovative Training Network project PERFORM funded by the European Union’s Horizon 2020 research and innovation program under the Marie Skłodowska-Curie grant agreement No. 765395. The EU Research Executive Agency is not responsible for any use that may be made of the information it contains.
This is a multi-modal dataset for restaurants from Google Local (Google Maps). Data includes images and reviews posted by users, as well as metadata for each restaurant.
This is a dataset of users consuming streaming content on Twitch. We retrieved all streamers, and all users connected in their respective chats, every 10 minutes during 43 days.
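The sampling schedule described above implies an upper bound on the number of collection rounds. The short calculation below assumes uninterrupted collection over the full 43 days, which the description does not confirm.

```python
# Upper bound on collection rounds implied by the description:
# one round every 10 minutes, over 43 days (assuming no gaps in collection).
MINUTES_PER_DAY = 24 * 60
INTERVAL_MINUTES = 10
DAYS = 43

rounds_per_day = MINUTES_PER_DAY // INTERVAL_MINUTES
total_rounds = rounds_per_day * DAYS
print(rounds_per_day, total_rounds)
```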
https://brightdata.com/license
Use our G2 dataset to collect product descriptions, ratings, reviews, and pricing information from the world's largest tech marketplace. You may purchase a full or partial dataset depending on your business needs. The G2 Software Products Dataset, with a focus on top-rated products, serves as a valuable resource for software buyers, businesses, and technology enthusiasts. This use case highlights products that have received exceptional ratings and positive reviews on the G2 platform, offering insights into customer satisfaction and popularity. For software buyers, this dataset acts as a trusted guide, presenting a curated selection of G2's top-rated software products, ensuring a higher likelihood of satisfaction with purchases. Businesses and technology professionals can leverage this dataset to identify popular and well-reviewed software solutions, optimizing their decision-making process. This use case emphasizes the dataset's utility for those specifically interested in exploring and acquiring top-rated software products from G2.
Product Overview
The G2 software products and reviews dataset offers a detailed and thorough overview of leading software companies. The dataset includes all major data points:
Product descriptions
Average rating (1-5)
Sellers number of reviews
Key features (highest and lowest rated)
Competitors
Website & social media links
and more.
https://brightdata.com/license
Use our Best Buy products dataset to collect ratings, prices, and descriptions of products from an e-commerce website. You can purchase either the entire dataset or a customized subset, depending on your requirements. The Best Buy Products Dataset stands as a comprehensive resource for businesses, researchers, and analysts aiming to navigate the vast array of products offered by Best Buy, a leading retailer in consumer electronics and technology. Tailored to provide a deep understanding of Best Buy's e-commerce ecosystem, this dataset facilitates market analysis, pricing optimization, customer behavior comprehension, and competitor assessment. At its core, the dataset encompasses essential attributes such as product ID, title, descriptions, ratings, reviews, pricing details, and seller information. These fundamental data elements empower users to glean insights into product performance, customer sentiment, and seller credibility, thereby facilitating informed decision-making processes. Whether you're a retailer looking to enhance your product portfolio, a researcher investigating trends in consumer electronics, or an analyst seeking to refine e-commerce strategies, the Best Buy Products Dataset offers a valuable resource for uncovering opportunities and driving success in the ever-evolving landscape of retail.
https://brightdata.com/license
Utilize our Amazon reviews dataset for diverse applications to enrich business strategies and market insights. Analyzing this dataset can aid in understanding customer behavior, product performance, and market trends, empowering organizations to refine their product and marketing strategies. Access the entire dataset or tailor a subset to fit your requirements.
Popular use cases include:
Product Performance Analysis: Analyze Amazon reviews to assess product performance, uncovering customer satisfaction levels, common issues, and highly praised features to inform product improvements and marketing messages.
Customer Behavior Insights: Gain insights into customer behavior, purchasing patterns, and preferences, enabling more personalized marketing and product recommendations.
Demand Forecasting: Leverage Amazon reviews to predict future product demand by analyzing historical review data and identifying trends, helping to optimize inventory management and sales strategies.
Accessing and analyzing the Amazon reviews dataset supports market strategy optimization by leveraging insights to analyze key market trends and customer preferences, enhancing overall business decision-making.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
A dataset consisting of 751,500 English app reviews of 12 online shopping apps. The dataset was scraped from the internet using a Python script. This ShoppingAppReviews dataset contains app reviews of the 12 most popular online shopping Android apps: Alibaba, Aliexpress, Amazon, Daraz, eBay, Flipcart, Lazada, Meesho, Myntra, Shein, Snapdeal and Walmart. Each review entry contains metadata such as review score, thumbsupcount, review posting time, and reply content. The dataset is organized in a zip file containing 12 json files and 12 csv files, one of each per shopping app. This dataset can be used to obtain valuable information about customers' feedback regarding their user experience of these financially important apps.
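A quick way to check the archive layout described above is to count its members by file extension. This is a minimal sketch using only the Python standard library; the function name and the archive name in the usage comment are placeholders, not names taken from the dataset.

```python
import zipfile
from collections import Counter
from pathlib import PurePosixPath


def count_files_by_extension(archive):
    """Count zip members by file extension (e.g. '.json', '.csv').

    `archive` may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(archive) as zf:
        return Counter(
            PurePosixPath(name).suffix.lower()
            for name in zf.namelist()
            if not name.endswith("/")  # skip directory entries
        )


# Usage (the archive name is a placeholder, not the real file name):
# counts = count_files_by_extension("shopping_app_reviews.zip")
# print(counts[".json"], counts[".csv"])
```

If the archive matches the description, both counts should come out as 12.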
These datasets contain reviews from the Goodreads book review website, and a variety of attributes describing the items. Critically, these datasets have multiple levels of user interaction, ranging from adding to a shelf, rating, and reading.
Metadata includes
reviews
add-to-shelf, read, review actions
book attributes: title, isbn
graph of similar books
Basic Statistics:
Items: 1,561,465
Users: 808,749
Interactions: 225,394,930
https://crawlfeeds.com/privacy_policy
Unlock valuable insights with our IKEA Product Reviews Dataset, a comprehensive collection of customer feedback on a wide range of IKEA products.
This dataset includes detailed reviews, ratings, and comments from customers who have purchased and used IKEA products. Analyze customer sentiments, identify product strengths and weaknesses, and uncover trends in consumer preferences.
Ideal for market research, product development, and competitive analysis, this dataset provides a wealth of information to help businesses and analysts make data-driven decisions and improve product offerings.
Dataset customization is available, including changing data formats and removing or adding fields.
https://brightdata.com/license
The Google Shopping dataset is perfect for obtaining detailed product information worldwide. Easily filter by product title, seller, price, and other factors to find the exact data you need. The Google Shopping dataset includes key data points such as URL, product ID, title, description, rating, reviews count, images, seller name, delivery price, return policy, item price, total price, specifications, related items, and more.
This dataset is an updated version of the Amazon review dataset released in 2014. As in the previous version, this dataset includes reviews (ratings, text, helpfulness votes), product metadata (descriptions, category information, price, brand, and image features), and links (also viewed/also bought graphs). In addition, this version provides the following features:
More reviews:
New reviews:
Metadata: - We have added transaction metadata for each review shown on the review page.
If you publish articles based on this dataset, please cite the following paper:
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The best overall results are highlighted in bold.
https://www.futurebeeai.com/policies/ai-data-license-agreement
Introducing the Turkish Shopping List Image Dataset - a diverse and comprehensive collection of handwritten text images carefully curated to propel the advancement of text recognition and optical character recognition (OCR) models designed specifically for the Turkish language.
Dataset Content & Diversity: Containing more than 2000 images, this Turkish OCR dataset offers a wide distribution of different types of shopping list images. Within this dataset, you'll discover a variety of handwritten text on shopping lists, including sentences, individual item names, quantities, comments, etc. The images in this dataset showcase distinct handwriting styles, fonts, font sizes, and writing variations.
To ensure diversity and robustness in training your OCR model, we allow only a limited number (fewer than three) of unique images per handwriting. This ensures we have diverse types of handwriting to train your OCR model on. Stringent measures have been taken to exclude any personally identifiable information (PII) and to ensure that in each image a minimum of 80% of the space contains visible Turkish text.
The images have been captured under varying lighting conditions, including day and night, as well as different capture angles and backgrounds. This diversity helps build a balanced OCR dataset, featuring images in both portrait and landscape modes.
All these shopping lists were written and images were captured by native Turkish people to ensure text quality, prevent toxic content, and exclude PII text. We utilized the latest iOS and Android mobile devices with cameras above 5MP to maintain image quality. Images in this training dataset are available in both JPEG and HEIC formats.
Metadata: In addition to the image data, you will receive structured metadata in CSV format. For each image, this metadata includes information on image orientation, country, language, and device details. Each image is correctly named to correspond with the metadata.
This metadata serves as a valuable resource for understanding and characterizing the data, aiding informed decision-making in the development of Turkish text recognition models.
Update & Custom Collection: We are committed to continually expanding this dataset by adding more images with the help of our native Turkish crowd community.
If you require a customized OCR dataset containing shopping list images tailored to your specific guidelines or device distribution, please don't hesitate to contact us. We have the capability to curate specialized data to meet your unique requirements.
Additionally, we can annotate or label the images with bounding boxes or transcribe the text in the images to align with your project's specific needs using our crowd community.
License: This image dataset, created by FutureBeeAI, is now available for commercial use.
Conclusion: Leverage this shopping list image OCR dataset to enhance the training and performance of text recognition, text detection, and optical character recognition models for the Turkish language. Your journey to improved language understanding and processing begins here.
https://creativecommons.org/publicdomain/zero/1.0/
TechCorner Mobile Sales & Customer Insights is a real-world dataset capturing 10 months of mobile phone sales transactions from a retail shop in Bangladesh. This dataset was designed to analyze customer location, buying behavior, and the impact of Facebook marketing efforts.
The primary goal was to identify whether customers are from the local area (Rangamati Sadar, Inside Rangamati) or completely outside Rangamati. Since TechCorner operates a Facebook page, the dataset also includes insights into whether Facebook marketing is effectively reaching potential buyers.
Additionally, the dataset helps in determining:
✔ How many customers are new vs. returning buyers
✔ If customers are followers of the shop’s Facebook page
✔ Whether a customer was recommended by an existing buyer
Retail sales analysis to understand product demand fluctuations.
Marketing impact measurement (Facebook engagement vs. actual purchase behavior).
Customer segmentation (local vs. non-local buyers, social media influence, word-of-mouth impact).
Sales trend analysis based on preferred phone models and price ranges.
With a realistic, non-uniform distribution of daily sales and some intentional missing values, this dataset reflects actual retail business conditions rather than artificially smooth AI-generated data.
Does he/she Come from Facebook Page? → Whether the customer came from a Facebook page (Yes/No). Used to analyze Facebook marketing reach.
Does he/she Followed Our Page? → Whether the customer is already a follower of the shop’s Facebook page (Yes/No). Helps measure brand loyalty and organic engagement.
Did he/she buy any mobile before? → Whether the customer is a repeat buyer (Yes/No). Determines the percentage of returning customers.
Did he/she hear of our shop before? → Whether the customer knew about the shop before purchasing (Yes/No). Identifies the impact of referrals or previous marketing efforts.
Was this customer recommended by an old customer? → Whether an existing customer referred them to the shop (Yes/No). Helps evaluate the effectiveness of word-of-mouth marketing.
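The Yes/No columns above lend themselves to simple share calculations. The sketch below computes the share of returning buyers while skipping missing answers; the column name comes from the description, but the sample rows and the helper `yes_share` are invented for illustration.

```python
# Column name taken from the description; the rows are invented sample data.
COLUMN = "Did he/she buy any mobile before?"
rows = [
    {COLUMN: "Yes"},
    {COLUMN: "No"},
    {COLUMN: ""},  # missing value
    {COLUMN: "No"},
    {COLUMN: "Yes"},
]


def yes_share(rows, column):
    """Share of 'Yes' among rows that have a non-missing Yes/No answer."""
    answers = [(row.get(column) or "").strip().lower() for row in rows]
    valid = [a for a in answers if a in ("yes", "no")]
    if not valid:
        return None  # no usable answers
    return sum(a == "yes" for a in valid) / len(valid)


returning_share = yes_share(rows, COLUMN)
print(returning_share)
```

Dropping missing answers from the denominator is one possible design choice; counting them as "No" would give a lower share.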
This dataset is derived from real-world mobile sales transactions recorded at TechCorner, a retail shop in Bangladesh. It accurately reflects customer purchasing behavior, pricing trends, and the effectiveness of Facebook marketing in driving sales. Special appreciation to TechCorner for providing comprehensive insights into daily sales patterns, customer demographics, and market dynamics.
📊 Predictive modeling of sales trends based on customer demographics and marketing channels.
📈 Marketing effectiveness analysis (impact of Facebook promotions vs. organic sales).
🔍 Clustering customers based on purchasing habits (new vs. returning buyers, Facebook users vs. walk-ins).
📌 Understanding demand for different smartphone brands in a local retail market.
🚀 Analyzing how word-of-mouth recommendations influence new customer acquisition.
💡 Can you build a model to predict if a customer is likely to return?
💬 How effective is Facebook in driving actual sales compared to walk-ins?
🔍 Can we cluster customers based on behavior and brand preferences?