These datasets contain 1.48 million question and answer pairs about products from Amazon.
Metadata includes
question and answer text
is the question binary (yes/no), and if so does it have a yes/no answer?
timestamps
product ID (to reference the review dataset)
Basic Statistics:
Questions: 1.48 million
Answers: 4,019,744
Labeled yes/no questions: 309,419
Number of unique products with questions: 191,185
https://brightdata.com/license
Gain extensive insights with our Amazon datasets, encompassing detailed product information including pricing, reviews, ratings, brand names, product categories, sellers, ASINs, images, and much more. Ideal for market researchers, data analysts, and eCommerce professionals looking to excel in the competitive online marketplace. Over 425M records available. Price starts at $250/100K records. Data formats available: JSON, NDJSON, CSV, XLSX, and Parquet. 100% ethical and compliant data collection. Included datapoints:
Title Asin Main Image Brand Name Description Availability Subcategory Categories Parent Asin Type Product Type Name Model Number Manufacturer Color Size Date First Available Released Model Year Item Model Number Part Number Price Total Reviews Total Ratings Average Rating Features Best Sellers Rank Subcategory Buybox Buybox Seller Id Buybox Is Amazon Images Product URL And more
https://cubig.ai/store/terms-of-service
1) Data Introduction
• The Amazon Products Sales Dataset 2023 is a large e-commerce dataset that summarizes product information in tabular form, including product name, price, rating, discount information, images, and links, across 142 major categories collected from Amazon's website.
2) Data Utilization
(1) Amazon Products Sales Dataset 2023 has the following characteristics:
• Each row contains 10 key attributes, including product name, main/subcategory, image, Amazon link, rating, number of ratings, discount price, and actual price.
• The data encompasses a wide range of products and is structured to enable multi-faceted analysis such as price policy, customer evaluation, and trend by category.
(2) Amazon Products Sales Dataset 2023 can be used for:
• Product Recommendation and Marketing Strategy: Use rating, price, and category data to develop a customized recommendation system, analyze popular products, and establish a category-specific marketing strategy.
• Price and Discount Policy Analysis: Discounted prices, actual prices, ratings, and reviews can inform pricing policies, promotion strategies, and market competitiveness analyses.
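As an illustration of the price and discount analysis mentioned above, the following is a minimal sketch that derives a discount percentage from the discount price and the actual price. The field names `actual_price` and `discount_price` and the sample rows are assumptions made for this example; the real column names and price formats in the dataset may differ.

```python
def discount_percent(actual_price: float, discount_price: float) -> float:
    """Return the discount as a percentage of the actual (list) price."""
    if actual_price <= 0:
        raise ValueError("actual_price must be positive")
    return 100.0 * (actual_price - discount_price) / actual_price


# Hypothetical rows; names and values are made up for illustration.
rows = [
    {"name": "Example item A", "actual_price": 200.0, "discount_price": 150.0},
    {"name": "Example item B", "actual_price": 80.0, "discount_price": 80.0},
]

for row in rows:
    pct = discount_percent(row["actual_price"], row["discount_price"])
    print(f'{row["name"]}: {pct:.1f}% off')
```

Prices stored as strings with currency symbols would need to be cleaned into numbers before a function like this could be applied.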
This Dataset is an updated version of the Amazon review dataset released in 2014. As in the previous version, this dataset includes reviews (ratings, text, helpfulness votes), product metadata (descriptions, category information, price, brand, and image features), and links (also viewed/also bought graphs). In addition, this version provides the following features:
More reviews:
New reviews:
Metadata: - We have added transaction metadata for each review shown on the review page.
If you publish articles based on this dataset, please cite the following paper:
https://crawlfeeds.com/privacy_policy
Access a comprehensive dataset of over 240,000 shoe product listings directly from Amazon UK. This dataset is ideal for researchers, e-commerce analysts, and AI developers looking to explore pricing trends, brand performance, product features, or build training data for retail-focused models.
All data is neatly packaged in a downloadable ZIP archive containing files in JSON format, making it easy to integrate with your preferred analytics or database tools.
Price and discount trend analysis
Competitor benchmarking
Product attribute extraction and modeling
AI/ML training datasets (e.g., shoe recommendation systems)
Retail assortment planning
This dataset is available as a static snapshot, but you can request weekly or monthly updates through the Crawl Feeds dashboard. Upon purchase, the data will be bundled and delivered via a direct download link.
https://choosealicense.com/licenses/other/
Amazon Customer Reviews (a.k.a. Product Reviews) is one of Amazon's iconic products. In a period of over two decades since the first review in 1995, millions of Amazon customers have contributed over a hundred million reviews to express opinions and describe their experiences regarding products on the Amazon.com website. This makes Amazon Customer Reviews a rich source of information for academic researchers in the fields of Natural Language Processing (NLP), Information Retrieval (IR), and Machine Learning (ML), amongst others. Accordingly, we are releasing this data to further research in multiple disciplines related to understanding customer product experiences. Specifically, this dataset was constructed to represent a sample of customer evaluations and opinions, variation in the perception of a product across geographical regions, and promotional intent or bias in reviews.
More than 130 million customer reviews are available to researchers as part of this release. The data is available in TSV files in the amazon-reviews-pds S3 bucket in the AWS US East Region. Each line in the data files corresponds to an individual review (tab delimited, with no quote and escape characters).
Each Dataset contains the following columns:
Note: This dataset contains the 'Apparel' data from many of the datasets previously made available by Amazon for academic research purposes. The original source links are provided below: Dataset Readme, provided by Amazon: https://s3.amazonaws.com/amazon-reviews-pds/readme.html All Customer Review Datasets by Amazon: https://s3.amazonaws.com/amazon-reviews-pds/tsv/index.txt
Content
- marketplace: Country code of the marketplace where the review was written
- customer_id: ID of the customer who reviewed the product
- review_id: The unique ID of the review
- product_id: The unique ID of the product
- product_parent: ID of the parent category
- product_title: Title of the product
- product_category: Broad product category; here only 'Apparel' data is available
- star_rating: The 1-5 star rating of the review
- helpful_votes: Number of helpful votes
- total_votes: Number of total votes the review received
- vine: Whether the review was written as part of the Vine program
- verified_purchase: Whether the review is on a verified purchase
- review_headline: The title of the review
- review_body: The review text
- review_date: The date the review was written
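Because the files are tab delimited with no quote or escape characters, a parser should be told not to treat quotation marks specially. The following is a minimal sketch using Python's standard `csv` module. It assumes that each file starts with a header row naming the columns listed above; the sample row, including the `N`/`Y` flag values and the date format, is made up for illustration.

```python
import csv
import io

# Column order as documented above.
COLUMNS = [
    "marketplace", "customer_id", "review_id", "product_id", "product_parent",
    "product_title", "product_category", "star_rating", "helpful_votes",
    "total_votes", "vine", "verified_purchase", "review_headline",
    "review_body", "review_date",
]

# Hypothetical one-review sample; every value here is invented.
SAMPLE_ROW = [
    "US", "12345", "R0000000000000", "B000000000", "67890",
    "Example jacket", "Apparel", "5", "3", "4", "N", "Y",
    "Great fit", 'Fits "true to size".', "2015-01-31",
]
sample = "\t".join(COLUMNS) + "\n" + "\t".join(SAMPLE_ROW) + "\n"


def read_reviews(handle):
    """Yield one dict per review, converting the numeric columns to int."""
    # QUOTE_NONE keeps quotation marks inside review text as literal characters.
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        for key in ("star_rating", "helpful_votes", "total_votes"):
            row[key] = int(row[key])
        yield row


reviews = list(read_reviews(io.StringIO(sample)))
print(reviews[0]["product_title"], reviews[0]["star_rating"])
```

For a real file, the same function could be given an open file handle instead of the in-memory sample.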
License By accessing the Amazon Customer Reviews Library ("Reviews Library"), you agree that the Reviews Library is an Amazon Service subject to the Amazon.com Conditions of Use (https://www.amazon.com/gp/help/customer/display.html/ref=footer_cou?ie=UTF8&nodeId=508088) and you agree to be bound by them, with the following additional conditions:
In addition to the license rights granted under the Conditions of Use, Amazon or its content providers grant you a limited, non-exclusive, non-transferable, non-sublicensable, revocable license to access and use the Reviews Library for purposes of academic research. You may not resell, republish, or make any commercial use of the Reviews Library or its contents, including use of the Reviews Library for commercial research, such as research related to a funding or consultancy contract, internship, or other relationship in which the results are provided for a fee or delivered to a for-profit organization. You may not (a) link or associate content in the Reviews Library with any personal information (including Amazon customer accounts), or (b) attempt to determine the identity of the author of any content in the Reviews Library. If you violate any of the foregoing conditions, your license to access and use the Reviews Library will automatically terminate without prejudice to any of the other rights or remedies Amazon may have. https://s3.amazonaws.com/amazon-reviews-pds/license.txt
Useful Links Provided by Amazon: https://s3.amazonaws.com/amazon-reviews-pds/readme.html Amazon Customer Review Available Datasets: https://s3.amazonaws.com/amazon-reviews-pds/tsv/index.txt
NOTE: This dataset is made available on Kaggle because the above links are no longer accessible.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Total-Cash-From-Operating-Activities Time Series for Amazon.com Inc. Amazon.com, Inc. engages in the retail sale of consumer products, advertising, and subscriptions service through online and physical stores in North America and internationally. The company operates through three segments: North America, International, and Amazon Web Services (AWS). It also manufactures and sells electronic devices, including Kindle, fire tablets, fire TVs, echo, ring, blink, and eero; and develops and produces media content. In addition, the company offers programs that enable sellers to sell their products in its stores; and programs that allow authors, independent publishers, musicians, filmmakers, Twitch streamers, skill and app developers, and others to publish and sell content. Further, it provides compute, storage, database, analytics, machine learning, and other services, as well as advertising services through programs, such as sponsored ads, display, and video advertising. Additionally, the company offers Amazon Prime, a membership program. The company's products offered through its stores include merchandise and content purchased for resale and products offered by third-party sellers. It also provides AgentCore services, such as AgentCore Runtime, AgentCore Memory, AgentCore Observability, AgentCore Identity, AgentCore Gateway, AgentCore Browser, and AgentCore Code Interpreter. It serves consumers, sellers, developers, enterprises, content creators, advertisers, and employees. Amazon.com, Inc. was incorporated in 1994 and is headquartered in Seattle, Washington.
https://brightdata.com/license
Utilize our Amazon reviews dataset for diverse applications to enrich business strategies and market insights. Analyzing this dataset can aid in understanding customer behavior, product performance, and market trends, empowering organizations to refine their product and marketing strategies. Access the entire dataset or tailor a subset to fit your requirements. Popular use cases include: Product Performance Analysis: Analyze Amazon reviews to assess product performance, uncovering customer satisfaction levels, common issues, and highly praised features to inform product improvements and marketing messages. Customer Behavior Insights: Gain insights into customer behavior, purchasing patterns, and preferences, enabling more personalized marketing and product recommendations. Demand Forecasting: Leverage Amazon reviews to predict future product demand by analyzing historical review data and identifying trends, helping to optimize inventory management and sales strategies. Accessing and analyzing the Amazon reviews dataset supports market strategy optimization by leveraging insights to analyze key market trends and customer preferences, enhancing overall business decision-making.
This dataset provides comprehensive real-time data from Amazon's global marketplaces. It includes detailed product information, reviews, seller profiles, best sellers, deals, influencers, and more across all Amazon domains worldwide. The data covers product attributes like pricing, availability, specifications, reviews and ratings, as well as seller information including profiles, contact details, and performance metrics. Users can leverage this dataset for price monitoring, competitive analysis, market research, and building e-commerce applications. The API enables real-time access to Amazon's vast product catalog and marketplace data, helping businesses make data-driven decisions about pricing, inventory, and market positioning. Whether you're conducting market analysis, tracking competitors, or building e-commerce tools, this dataset provides current and reliable Amazon marketplace data. The dataset is delivered in a JSON format via REST API.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
To read any dataset, you can use the following code:
>>> import numpy as np
>>> embed_image = np.load('embed_image.npy')
>>> embed_image.shape
(33962, 768)
>>> embed_text = np.load('embed_text.npy')
>>> embed_text.shape
(33962, 768)
>>> import pandas as pd
>>> items = pd.read_csv('items.txt')
>>> m = len(items)
>>> print(f'{m} items in dataset')
33962 items in dataset
>>> users = pd.read_csv('users.txt')
>>> n = len(users)
>>> print(f'{n} users in dataset')
14790 users in dataset
>>> train = pd.read_csv('train.txt')
>>> train
user item
0 13444 23557
1 13444 33739
... ... ...
317109 13506 29993
317110 13506 13931
>>> from scipy.sparse import csr_matrix
>>> train_matrix = csr_matrix((np.ones(len(train)), (train.user, train.item)), shape=(n,m))
This dataset contains six datasets. Each dataset is duplicated with seven combinations of different Image and Text encoders, so you should see 42 folders.
Each folder name combines the name of the dataset and the encoders used for the visual and textual parts. For example: bookcrossing-vit_bert.
The datasets are: - Clothing, Shoes and Jewelry (Amazon) - Home and Kitchen (Amazon) - Musical Instruments (Amazon) - Movies and TV (Amazon) - Book-Crossing - Movielens 25M
And the encoders are:
- CLIP (Image and Text) (*-clip_clip). This is the main one used in the experiments.
- ViT and BERT (*-vit_bert)
- CLIP (only visual data) *-clip_none
- ViT only *-vit_none
- BERT only *-none_bert
- CLIP (text only) *-none_clip
- No textual or visual information *-none_none
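The folder naming scheme above can be unpacked programmatically. The following is a minimal sketch, assuming every folder follows the `<dataset>-<image encoder>_<text encoder>` pattern of the `bookcrossing-vit_bert` example and that `none` marks a missing modality. The helper name `parse_folder` is hypothetical.

```python
from typing import Optional, Tuple


def parse_folder(name: str) -> Tuple[str, Optional[str], Optional[str]]:
    """Split a folder name into (dataset, image_encoder, text_encoder).

    An encoder given as "none" is returned as None.
    """
    # Split on the last hyphen so dataset names may themselves contain hyphens.
    dataset, encoders = name.rsplit("-", 1)
    image_encoder, text_encoder = encoders.split("_", 1)

    def to_optional(value: str) -> Optional[str]:
        return None if value == "none" else value

    return dataset, to_optional(image_encoder), to_optional(text_encoder)


print(parse_folder("bookcrossing-vit_bert"))
print(parse_folder("bookcrossing-none_none"))
```

A tuple with `None` in either encoder position signals that the corresponding embedding carries no information for that variant.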
For each dataset, we have the following files, considering we have M items and N users, textual embeddings with D dimensions (like 1024), and visual embeddings with E dimensions (like 768):
- embed_image.npy: A NumPy array of MxE elements.
- embed_text.npy: A NumPy array of MxD elements.
- items.csv: A CSV with the Item ID in the original dataset (like the Amazon ASIN, the Movie ID, etc.) and the item number, an integer from 0 to M-1.
- users.csv: A CSV with the User ID in the original dataset (like the Amazon Reviewer Id) and the user number, an integer from 0 to N-1.
- train.txt, validation.txt and test.txt: CSV files with the portions of the reviews used for training, validation, and testing. Each row is a positive user-item pair, that is, an item the user liked or reviewed positively.
We consider a review "positive" if the rating is four or more (or 8 or more for Book-crossing).
The vector is zeroed out if an Item does not have an image or text.
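The two rules above (the rating threshold and the zeroed-out embeddings) can be expressed as small helpers. This is a sketch under stated assumptions, not the authors' extraction code: the dataset key `"bookcrossing"` is taken from the folder name example, and both function names are hypothetical.

```python
from typing import Iterable


def is_positive(rating: float, dataset: str) -> bool:
    """Return True if a rating counts as positive for the given dataset.

    The threshold is 8 for Book-Crossing and 4 for every other dataset.
    """
    threshold = 8 if dataset == "bookcrossing" else 4
    return rating >= threshold


def has_embedding(vector: Iterable[float]) -> bool:
    """Return False for an all-zero vector, i.e. an item without image or text."""
    return any(value != 0 for value in vector)


print(is_positive(4, "movielens"))     # four or more on the default scale
print(is_positive(7, "bookcrossing"))  # below the Book-Crossing threshold
print(has_embedding([0.0, 0.0, 0.0]))  # zeroed-out vector
```

Checking `has_embedding` before using a row of `embed_image.npy` or `embed_text.npy` avoids treating a placeholder vector as real content.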
Dataset | Users | Items | Ratings | Density |
---|---|---|---|---|
Clothing & Shoes & Jewelry | 23318 | 38493 | 178944 | 0.020% |
Home & Kitchen | 5968 | 57645 | 135839 | 0.040% |
Movies & TV | 21974 | 23958 | 216110 | 0.041% |
Musical Instruments | 14429 | 29040 | 93923 | 0.022% |
Book-crossing | 14790 | 33962 | 519613 | 0.103% |
Movielens 25M | 162541 | 59047 | 25000095 | 0.260% |
Only a tiny fraction of the dataset was taken for the Amazon Datasets by considering reviews in a specific date range.
For the Bookcrossing dataset, only items with images were considered.
There are various other minor tweaks on how to obtain images and texts. The repo https://github.com/igui/MultimodalRecomAnalysis has the Notebook and scripts to reproduce the dataset extraction from scratch.
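The Density column in the table above appears to be the number of ratings divided by the number of possible user-item pairs, expressed as a percentage. The following sketch recomputes it from three of the table's rows under that assumption; the function name is hypothetical.

```python
def density_percent(users: int, items: int, ratings: int) -> float:
    """Ratings as a percentage of all possible user-item pairs."""
    return 100.0 * ratings / (users * items)


# (users, items, ratings) copied from the table above.
stats = {
    "Clothing & Shoes & Jewelry": (23318, 38493, 178944),
    "Book-crossing": (14790, 33962, 519613),
    "Movielens 25M": (162541, 59047, 25000095),
}

for name, (users, items, ratings) in stats.items():
    print(f"{name}: {density_percent(users, items, ratings):.3f}%")
```

The recomputed values round to the figures listed in the table, which supports this reading of the column.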
• 200K+ Seller Leads
• Seller Type: Brand/PL Seller, 1P/Amazon Vendor Central and 3P Sellers
• Selling Platforms: Amazon USA, UK, EU, CA, AU
• C-Suite/Marketing/Sales Contacts
• FBA/FBM Sellers
• Filter your leads by revenue, categories, location, SKUs and more
• 100% manually researched and verified.
For over a decade, we have been manually collecting Amazon seller data from various sources such as Amazon, LinkedIn, Google, and others. We specialize in getting valid data so that you can run ads and begin selling without hesitation.
We designed our data packages for all types of organizations, thus they are reasonably priced. We are always trying to reduce our prices to better suit all of your requirements.
So, if you’re looking to reach out to your targeted Amazon sellers, now is the greatest time to do so and offer your goods, services, and promotions. You can get your targeted Amazon Sellers List with seller contact information.
Alternatively, if you provide Amazon Seller Names or IDs, we will conduct Custom Research and deliver the customized list to you.
Data Points Available:
Full Name Linkedin URL Direct Email Generic Phone Number Business Name and Address Company Website Seller IDs and URLs Revenue Seller Review Count Niche FBA/Non-FBA Country and More
DES is publishing the Amazon spend for state agencies collected through the Washington State Amazon Business account. The data set only includes closed orders. Any orders that are still in process or have been cancelled are not included. This data is for Fiscal Year 22 (July 1, 2021 to June 30, 2022). Data is updated monthly.
SpaceKnow USA Supply Chain Premium Dataset gives you data (by locations and company) of US Supply Chain choke points in near-real-time as seen from satellite images. The uniqueness of this dataset lies in its granularity.
About dataset: We apply proprietary algorithms to SAR satellite imagery of key industrial, transportation, storage, and logistics locations to create daily indices of industry activity. Data was collected from more than 5,000 locations across the USA. Thanks to the use of SAR satellite technology, the quality of the SpaceKnow dataset is not influenced by weather fluctuations.
In total SpaceKnow USA Supply Chain dataset offers +50 specific indices with real-time insights. The premium dataset includes company-focused indices. This type of data can be used by investors to get insight on important KPIs such as revenue.
This dataset is:
Daily frequency
History from Jan 2017 to present
Within one package we provide you with real-time insights into:
Port Container country-level indices (a container port or container terminal is a facility where cargo containers are transshipped between different transport vehicles for onward transportation)
Port Container indices for the major ports in the US:
- Port of Los Angeles
- Port of Long Beach
- Port of New York & New Jersey
- Port of Savannah
- Port of Houston
- Port of Virginia
- Port of Oakland in California
- Port of South Carolina
- Port of Miami
Trucking Stop indices for the most important locations in the supply chain, like: Iowa, Nevada, South Carolina, Oregon, North Carolina
Inland Containers index on a country-level
Logistics Center index on a country-level (Logistics centers are distribution hubs for finished goods that need to be transported to another location. We include logistics centers from companies like Amazon, Walmart, Fedex and others)
Logistics Center indices for states like: California, New York, Illinois, Indiana, South Carolina, and many more…
Logistics Center indices for companies: Amazon, Walmart, Fedex
Research Reports Don't have the capacity to analyze the data? Let SpaceKnow's in-house economists do the heavy lifting so that you can focus on what's important. SpaceKnow writes research reports based on what the data from the US Supply Chain dataset package is showing. The document includes a detailed explanation of what is happening with supporting charts and tables. The reports are published on a monthly basis.
Delivery Mechanisms
All of the delivery mechanisms detailed below are available as part of this package. Data is distributed only in the flat-table CSV format. Ways to access the data:
- Dashboard: an option that also offers data visualization within the webpage
- Automatic email delivery
- API access to our dataset
- Research reports: provided via email in PDF format
Client Support
Each client is assigned an account representative who will reach out periodically to make sure that the data packages are meeting your needs. Here are some other ways to contact SpaceKnow in case you have a specific question.
For delivery questions and issues: Please reach out to support@spaceknow.com
For data questions: Please reach out to product@spaceknow.com
For pricing/sales support: Please reach out to info@spaceknow.com or sales@spaceknow.com
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset comprises customer reviews for Amazon, an online retail giant, featuring insights into customer experiences, including ratings, review titles, texts, and metadata. It is valuable for analyzing customer satisfaction, sentiment, and trends.
Column Descriptions:
- Reviewer Name: Identifies the reviewer.
- Profile Link: Links to the reviewer's profile for additional insights.
- Country: Indicates the reviewer's location.
- Review Count: Number of reviews by the same user, showing engagement level.
- Review Date: When the review was posted, useful for time analysis.
- Rating: Numerical satisfaction measure.
- Review Title: Summarizes the review sentiment.
- Review Text: Detailed customer feedback.
- Date of Experience: When the service/product was experienced.
Prospective applications:
- Sentiment Analysis: Analyze review texts and titles to assess overall customer sentiment toward products, enabling the identification of strengths and weaknesses.
- Customer Satisfaction Tracking: Track and visualize rating trends over time to understand fluctuations in customer satisfaction.
- Product Improvement: Identify common themes in reviews to highlight areas for product enhancement or development.
- Market Segmentation: Use country and demographic information to customize marketing strategies and gain insights into regional preferences.
- Competitor Analysis: Evaluate customer feedback on Amazon products in comparison to competitors to determine market positioning.
- Recommendation Systems: Leverage review data to enhance recommendation algorithms, improving personalized shopping experiences.
- Trend Analysis: Investigate temporal patterns in reviews to link sentiment changes with marketing efforts or product launches.
This extensive dataset serves as a valuable asset for various analyses focused on enhancing customer engagement and refining business strategies.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Operating-Income Time Series for Amazon.com Inc. Amazon.com, Inc. engages in the retail sale of consumer products, advertising, and subscriptions service through online and physical stores in North America and internationally. The company operates through three segments: North America, International, and Amazon Web Services (AWS). It also manufactures and sells electronic devices, including Kindle, fire tablets, fire TVs, echo, ring, blink, and eero; and develops and produces media content. In addition, the company offers programs that enable sellers to sell their products in its stores; and programs that allow authors, independent publishers, musicians, filmmakers, Twitch streamers, skill and app developers, and others to publish and sell content. Further, it provides compute, storage, database, analytics, machine learning, and other services, as well as advertising services through programs, such as sponsored ads, display, and video advertising. Additionally, the company offers Amazon Prime, a membership program. The company's products offered through its stores include merchandise and content purchased for resale and products offered by third-party sellers. It also provides AgentCore services, such as AgentCore Runtime, AgentCore Memory, AgentCore Observability, AgentCore Identity, AgentCore Gateway, AgentCore Browser, and AgentCore Code Interpreter. It serves consumers, sellers, developers, enterprises, content creators, advertisers, and employees. Amazon.com, Inc. was incorporated in 1994 and is headquartered in Seattle, Washington.
These datasets contain peer-to-peer trades from various recommendation platforms.
Metadata includes
peer-to-peer trades
have and want lists
image data (tradesy)
Get the Amazon product data you need right from the data extractor! Collect Amazon product information from 19 Amazon countries, including the following domains: - amazon.com - amazon.com.au - amazon.com.br - amazon.ca - amazon.cn - amazon.fr - amazon.de - amazon.in - amazon.it - amazon.com.mx - amazon.nl - amazon.sg - amazon.es - amazon.com.tr
Request Ecommerce Product Data dataset by: - keyword - category - seller
Amazon E-commerce Data datasets gathered by keyword and category contain: - product page position in search - product position on the page - product global position
Data attributes contain: - ASIN - URL - Price (current price and discount information) - Reviews (total reviews and total rating) - Reviews information: each product can be enriched with a sub-dataset of reviews - Title - Description - Audio and Video - and dozens of additional attributes.
Amazon extraction results can be delivered by schedule or API request, so the data can be extracted in real-time.
DATAANT uses the in-house web scraping service with no concurrency limitations, so unlimited data extractions can be performed simultaneously.
Output and attributes can be customized to fit your particular needs.
In 2024, Amazon's net revenue from its subscription services segment amounted to 44.37 billion U.S. dollars. Subscription services include Amazon Prime, for which Amazon reported 200 million paying members worldwide at the end of 2020. The AWS category generated 107.56 billion U.S. dollars in annual sales. During the most recently reported fiscal year, the company's net revenue amounted to 638 billion U.S. dollars.
Amazon revenue segments
Amazon is one of the biggest online companies worldwide. In 2019, the company's revenue increased by 21 percent, compared to Google's revenue growth during the same fiscal period, which was just 18 percent. The majority of Amazon's net sales are generated through its North American business segment, which accounted for 236.3 billion U.S. dollars in 2020. The United States is the company's leading market, followed by Germany and the United Kingdom.
Business segment: Amazon Web Services
Amazon Web Services, commonly referred to as AWS, is one of the strongest-growing business segments of Amazon. AWS is a cloud computing service that provides individuals, companies and governments with a wide range of computing, networking, storage, database, analytics and application services, among many others. As of the third quarter of 2020, AWS accounted for approximately 32 percent of the global cloud infrastructure services vendor market.
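To put the segment figures in proportion, the following sketch computes each segment's share of total net revenue from the numbers quoted above. It assumes that the "most recently reported fiscal year" is 2024, the same year as the segment figures, and that the rounded total of 638 billion U.S. dollars is precise enough for an approximate share.

```python
# Figures quoted in the text above, in billions of U.S. dollars.
# Assumption: the 638 billion total refers to the same year (2024).
TOTAL_NET_REVENUE = 638.0
SEGMENTS = {
    "Subscription services": 44.37,
    "AWS": 107.56,
}


def share_percent(segment_revenue: float, total_revenue: float) -> float:
    """Segment revenue as a percentage of total net revenue."""
    return 100.0 * segment_revenue / total_revenue


shares = {
    name: share_percent(revenue, TOTAL_NET_REVENUE)
    for name, revenue in SEGMENTS.items()
}

for name, pct in shares.items():
    print(f"{name}: {pct:.1f}% of net revenue")
```

Because the total is rounded, the resulting shares should be read as approximations.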
These datasets contain attributes about products sold on ModCloth and Amazon which may be sources of bias in recommendations (in particular, attributes about how the products are marketed). Data also includes user/item interactions for recommendation.
Metadata includes
ratings
product images
user identities
item sizes, user genders