60 datasets found

Amazon Dataset
brightdata.com
.json, .csv, .xlsx
Updated Mar 31, 2022
+ more versions
Cite
Bright Data (2022). Amazon Dataset [Dataset]. https://brightdata.com/products/datasets/amazon
Explore at:
Available download formats: .json, .csv, .xlsx
Dataset updated
Mar 31, 2022
Dataset authored and provided by
Bright Data (https://brightdata.com/)
License
https://brightdata.com/license
Area covered
Worldwide
Description
Buy Amazon datasets and get access to over 300 million records from any Amazon domain. Get insights on Amazon products, sellers, and reviews.
Rainforest API | Amazon Product & Search Results Data
datarade.ai
.json, .csv
Updated Nov 23, 2023
Cite
Traject Data (2023). Rainforest API | Amazon Product & Search Results Data [Dataset]. https://datarade.ai/data-categories/amazon-sellers-data/datasets
Explore at:
Available download formats: .json, .csv
Dataset updated
Nov 23, 2023
Dataset authored and provided by
Traject Data
Area covered
Australia, Angola, Malawi, Swaziland, Switzerland, Ireland, Denmark, Montserrat, Ghana, Albania
Description
Capture all Amazon product listing details with confidence that you are getting complete and current data. Rainforest API offers comprehensive coverage of each of the product listings or search results in a cleanly structured output.

Rainforest API's advanced parsing means the results returned are exactly what a human user would see. You can request data from any Amazon domain and originate your request from any country in the world. The high-capacity, global infrastructure of the Rainforest API assures you the highest level of performance and reliability. For easy integration with your apps, data is delivered in JSON or CSV format. A convenient CSV Builder allows customization of data columns.

Data is retrieved in real time, by search term, or for single products, by global identifiers such as GTIN, ISBN, UPC and EAN rather than Amazon ASIN. The API automatically performs the ASIN conversion for each request. You can also submit a product page URL (product results), or a category ID (category search results) instead.

So what's in the data from Rainforest API?

Product:
- Brand & manufacturer
- Manufacturer & Amazon product descriptions
- Specifications
- Buy Box Winner: price, etc.
- 1st party, 2nd party & 3rd party seller data
- Additional product details (e.g. energy efficiency, add-ons)
- A-Plus content
- Imagery
- Product videos
- Category details (category, bestseller category)
- Deals (types, states)
- Bundles
- Seller offers (including delivery options)
- Frequently bought together / Also bought
- Also viewed / Similar item to consider
- Rating & reviews (incl. full review, top positive, top negative, manufacturer replies)
- Stock estimation
- Sales estimation (for select Amazon domains)

Search Results:
- Product details per search result
- Position
- Related searches
- Related brands

...and more, depending on your request parameters or the search result.

How can Traject Data: Amazon Product Results Data be used?
- Product listing management
- Price monitoring
- Brand protection
- Category & product trends monitoring
- Market research & competitor intelligence
- Location-specific & cross-border Amazon shipping data
- Rank tracking on Amazon

Who uses Traject Data: Amazon Product Results Data? This data is leveraged by software developers, marketers, founders, sales & business development teams, researchers, and data analysts & engineers in ecommerce, other retail/wholesale business, agencies and SaaS platforms.

Anyone in your organization who works with your digital presence can develop business intelligence and strategy using this advanced product data.
Amazon revenue 2004-2024
statista.com
flwrdeptvarieties.store
+1more
Updated Feb 21, 2025
+ more versions
Cite
Statista (2025). Amazon revenue 2004-2024 [Dataset]. https://www.statista.com/statistics/266282/annual-net-revenue-of-amazoncom/
Explore at:
Dataset updated
Feb 21, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
Worldwide, United States
Description
From 2004 to 2024, the net revenue of Amazon e-commerce and service sales has increased tremendously. In the fiscal year ending December 31, 2024, the multinational e-commerce company's net revenue was almost 638 billion U.S. dollars, up from 575 billion U.S. dollars in 2023.

Amazon.com, a U.S. e-commerce company originally founded in 1994, is the world’s largest online retailer of books, clothing, electronics, music, and many more goods. As of 2024, the company generates the majority of its net revenues through online retail product sales, followed by third-party retail seller services, cloud computing services, and retail subscription services including Amazon Prime.

From seller to digital environment

Through Amazon, consumers are able to purchase goods at a rather discounted price from both small and large companies as well as from other users. Both new and used goods are sold on the website. Due to the wide variety of goods available at prices which often undercut local brick-and-mortar retail offerings, Amazon has dominated the retailer market. As of 2024, Amazon’s brand worth amounts to over 185 billion U.S. dollars, topping the likes of companies such as Walmart, Ikea, as well as digital competitors Alibaba and eBay. One of Amazon's first forays into the world of hardware was its e-reader Kindle, one of the most popular e-book readers worldwide. More recently, Amazon has also released several series of own-branded products and a voice-controlled virtual assistant, Alexa.

Headquartered in North America

Due to its location, Amazon offers more services in North America than worldwide. As a result, the majority of the company’s net revenue in 2023 was actually earned in the United States, Canada, and Mexico. In 2023, approximately 353 billion U.S. dollars was earned in North America compared to only roughly 131 billion U.S. dollars internationally.
Amazon Product Data Dataset
paperswithcode.com
opendatalab.com
Updated Mar 5, 2024
Cite
Ruining He; Julian McAuley (2024). Amazon Product Data Dataset [Dataset]. https://paperswithcode.com/dataset/amazon-product-data
Explore at:
Dataset updated
Mar 5, 2024
Authors
Ruining He; Julian McAuley
Description
This dataset contains product reviews and metadata from Amazon, including 142.8 million reviews spanning May 1996 - July 2014.

This dataset includes reviews (ratings, text, helpfulness votes), product metadata (descriptions, category information, price, brand, and image features), and links (also viewed/also bought graphs).
Amazon Web Services annual revenue 2013-2023
statista.com
flwrdeptvarieties.store
Updated Sep 29, 2024
Cite
Statista (2024). Amazon Web Services annual revenue 2013-2023 [Dataset]. https://www.statista.com/statistics/233725/development-of-amazon-web-services-revenue/
Explore at:
Dataset updated
Sep 29, 2024
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
Worldwide
Description
In 2023, Amazon Web Services (AWS) generated 90.8 billion US dollars with its cloud services. From 2013 until today, the annual revenue of AWS cloud computing and hosting solutions continually increased.

Amazon - additional information

Amazon.com went online in 1995, initially as a book store, and achieved almost immediate success. In 1998 the store expanded to include a music and video store, and various other products, such as apparel and consumer electronics, followed in later years. The company is the undisputed leader of the e-retail market in the United States, ranking ahead of walmart.com and apple.com in terms of revenue.

Amazon Web Services

In 2006, AWS launched as a cloud computing platform to provide online services. Amazon Elastic Compute Cloud and Amazon S3, which provide large virtual computing capacity, are the most well-known of these services. The company has dozens of locations in 25 different regions across the world and is continually expanding its global infrastructure to ensure low latency through proximity to the user.

From these data centers, Amazon offers more than 200 fully featured services to its global customer base. Video streaming service Netflix is one of AWS’s largest customers, using Amazon’s services to store its content on servers throughout the world. Among its more than one million active users, AWS also lists other well-known organizations from various industries, such as Disney, the UK Ministry of Justice, Kellogg’s, Guardian News and Media, and the European Space Agency.
More than 1,070,574 Verified Contacts of companies that use Amazon AWS
datarade.ai
Updated Aug 20, 2021
Cite
DataCaptive (2021). More than 1,070,574 Verified Contacts of companies that use Amazon AWS [Dataset]. https://datarade.ai/data-products/more-than-1-070-574-verified-contacts-of-companies-that-use-a-datacaptive
Explore at:
Available download formats: .json, .xml, .csv, .xls, .sql, .txt
Dataset updated
Aug 20, 2021
Dataset authored and provided by
DataCaptive
Area covered
Cyprus, Turkmenistan, Ascension and Tristan da Cunha, Italy, Sweden, Timor-Leste, Suriname, Serbia, Pitcairn, Bosnia and Herzegovina
Description
Amazon AWS - Cloud Platforms & Services

Companies using Amazon AWS

We have data on 1,070,574 companies that use Amazon AWS. The companies using Amazon AWS are most often found in the United States and in the Computer Software industry. Amazon AWS is most often used by companies with 10-50 employees and 1M-10M dollars in revenue. Our data for Amazon AWS usage goes back as far as 2 years and 1 month.

What is Amazon AWS?

Amazon Web Services (AWS) is a collection of remote computing services, also called web services, that make up a cloud computing platform offered by Amazon.com.

Top Industries that use Amazon AWS

Looking at Amazon AWS customers by industry, we find that Computer Software (6%) is the largest segment.

Distribution of companies using Amazon AWS by Industry

- Computer Software - 67,537 companies
- Hospitals & Healthcare - 54,293 companies
- Retail - 39,543 companies
- Information Technology and Services - 35,382 companies
- Real Estate - 31,676 companies
- Restaurants - 30,302 companies
- Construction - 29,207 companies
- Automotive - 28,469 companies
- Financial Services - 23,680 companies
- Education Management - 21,548 companies

Top Countries that use Amazon AWS

49% of Amazon AWS customers are in the United States and 7% are in the United Kingdom.

Distribution of companies using Amazon AWS by country

- United States – 616 2275 companies
- United Kingdom – 68 219 companies
- Australia – 44 601 companies
- Canada – 42 770 companies
- Germany – 31 541 companies
- India – 30 949 companies
- Netherlands – 19 543 companies
- Brazil – 17 165 companies
- Italy – 14 876 companies
- Spain – 14 675 companies

Contact information fields include:

• Company Name
• Business contact number
• Title
• Name
• Email Address
• Country, State, City, Zip Code
• Phone, Mobile and Fax
• Website
• Industry
• SIC & NAICS Code
• Employees Size
• Revenue Size
• And more…

Why Buy AWS Users List from DataCaptive?

• More than 1,070,574 companies
• Responsive database
• Customizable as per your requirements
• Email and Tele-verified list
• Team of 100+ market researchers
• Authentic data sources

What’s in for you?

By choosing us, you gain the following advantages:

• Locate, target, and prospect leads from 170+ countries
• Design and execute ABM and multi-channel campaigns
• Seamless and smooth pre-and post-sale customer service
• Connect with old leads and build a fruitful customer relationship
• Analyze the market for product development and sales campaigns
• Boost sales and ROI with increased customer acquisition and retention

Our security compliance

We comply with globally recognized data laws such as GDPR, CCPA, ACMA, EDPS, CAN-SPAM and ANTI CAN-SPAM to ensure the privacy and security of our database. We engage certified auditors to validate our security and privacy by providing us with certificates to represent our security compliance.

Our USPs- what makes us your ideal choice?

At DataCaptive™, we strive consistently to improve our services and cater to the needs of businesses around the world while keeping up with industry trends.

• Elaborate data mining from credible sources
• 7-tier verification, including manual quality check
• Strict adherence to global and local data policies
• Guaranteed 95% accuracy or cash-back
• Free sample database available on request

Guaranteed benefits of our Amazon AWS users email database!

85% email deliverability and 95% accuracy on other data fields

We understand the importance of data accuracy and employ every avenue to keep our database fresh and updated. We execute a multi-step QC process backed by our Patented AI and Machine learning tools to prevent anomalies in consistency and data precision. This cycle repeats every 45 days. Although maintaining 100% accuracy is quite impractical, since data such as email, physical addresses, and phone numbers are subject to change, we guarantee 85% email deliverability and 95% accuracy on other data points.

100% replacement in case of hard bounces

Every data point is meticulously verified and then re-verified to ensure you get the best. Data Accuracy is paramount in successfully penetrating a new market or working within a familiar one. We are committed to precision. However, in an unlikely event where hard bounces or inaccuracies exceed the guaranteed percentage, we offer replacement with immediate effect. If need be, we even offer credits and/or refunds for inaccurate contacts.

Other promised benefits

• Contacts are for perpetual use
• The database comprises consent-based opt-in contacts only
• The list is free of duplicate contacts and generic emails
• Round-the-clock customer service assistance
• 360-degree database solutions
Amazon Review Polarity
figshare.com
bin
Updated Nov 13, 2020
Cite
Luís Fred (2020). Amazon Review Polarity [Dataset]. http://doi.org/10.6084/m9.figshare.13232501.v1
Explore at:
Available download formats: bin
Unique identifier
https://doi.org/10.6084/m9.figshare.13232501.v1
Dataset updated
Nov 13, 2020
Dataset provided by
figshare
Authors
Luís Fred
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Amazon Review Polarity Dataset
Version 3, Updated 09/09/2015

ORIGIN

The Amazon reviews dataset consists of reviews from Amazon. The data span a period of 18 years, including ~35 million reviews up to March 2013. Reviews include product and user information, ratings, and a plaintext review. For more information, please refer to the following paper: J. McAuley and J. Leskovec. Hidden factors and hidden topics: understanding rating dimensions with review text. RecSys, 2013.

The Amazon reviews polarity dataset is constructed by Xiang Zhang (xiang.zhang@nyu.edu) from the above dataset. It is used as a text classification benchmark in the following paper: Xiang Zhang, Junbo Zhao, Yann LeCun. Character-level Convolutional Networks for Text Classification. Advances in Neural Information Processing Systems 28 (NIPS 2015).

DESCRIPTION

The Amazon reviews polarity dataset is constructed by taking review scores 1 and 2 as negative, and 4 and 5 as positive. Samples with score 3 are ignored. In the dataset, class 1 is the negative and class 2 is the positive. Each class has 1,800,000 training samples and 200,000 testing samples.

The files train.csv and test.csv contain all the training samples as comma-separated values. There are 3 columns in them, corresponding to class index (1 or 2), review title and review text. The review title and text are escaped using double quotes ("), and any internal double quote is escaped by 2 double quotes (""). New lines are escaped by a backslash followed with an "n" character, that is "\n".
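
The file layout described above can be read with Python's standard csv module. The following is a minimal sketch, not the dataset authors' loader: the function name read_polarity_rows and the two sample rows are hypothetical, and it assumes the files follow the quoting and new-line conventions exactly as stated in the description.

```python
import csv
import io


def read_polarity_rows(handle):
    """Yield (class_index, title, text) tuples from a polarity-format CSV stream."""
    # csv.reader already understands quoted fields and doubled ("") quotes.
    for class_index, title, text in csv.reader(handle):
        # New lines are stored as a literal backslash followed by "n"; restore them.
        yield (
            int(class_index),
            title.replace("\\n", "\n"),
            text.replace("\\n", "\n"),
        )


# Hypothetical two-row sample for illustration; not taken from the dataset.
sample = io.StringIO(
    '"2","Great ""value""","Works well.\\nWould buy again."\n'
    '"1","Broken","Stopped working after a week."\n'
)
rows = list(read_polarity_rows(sample))
```

To read the real files, the same function could be given an open file handle, for example one created with open("train.csv", newline="", encoding="utf-8"); the encoding is an assumption.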
Paypal Email Receipt Data | Consumer Transaction Data | Payment Data | Asia,...
datarade.ai
.json, .xml, .csv
Updated Nov 15, 2023
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Measurable AI (2023). Paypal Email Receipt Data | Consumer Transaction Data | Payment Data | Asia, EMEA, LATAM, MENA, India | Granular & Aggregate Data available [Dataset]. https://datarade.ai/data-categories/paypal-transaction-data/datasets
Explore at:
Available download formats: .json, .xml, .csv
Dataset updated
Nov 15, 2023
Dataset authored and provided by
Measurable AI
Area covered
Colombia, United States of America, Chile, Mexico, Japan, Brazil, Argentina
Description
The Measurable AI Amazon Consumer Transaction Dataset is a leading source of email receipts and consumer transaction data, offering data collected directly from users via Proprietary Consumer Apps, with millions of opt-in users.

We source our email receipt consumer data panel via two consumer apps which garner the express consent of our end-users (GDPR compliant). We then aggregate and anonymize all the transactional data to produce raw and aggregate datasets for our clients.

Use Cases

Our clients leverage our datasets to produce actionable consumer insights such as:
- Market share analysis
- User behavioral traits (e.g. retention rates)
- Average order values
- Promotional strategies used by the key players

Several of our clients also use our datasets for forecasting and understanding industry trends better.

Granular Data

Itemized, high-definition data per transaction level with metrics such as:
- Order value
- Items ordered
- No. of orders per user
- Delivery fee
- Service fee
- Promotions used
- Geolocation data and more

Aggregate Data

- Weekly/monthly order volume
- Revenue delivered in aggregate form, with historical data dating back to 2018

All the transactional e-receipts are sent from app to users’ registered accounts.

Most of our clients are fast-growing Tech Companies, Financial Institutions, Buyside Firms, Market Research Agencies, Consultancies and Academia.

Our dataset is GDPR compliant, contains no PII information and is aggregated & anonymized with user consent. Contact michelle@measurable.ai for a data dictionary and to find out our volume in each country.
Amazon Polarity Dataset
paperswithcode.com
Updated May 15, 2024
Cite
(2024). Amazon Polarity Dataset [Dataset]. https://paperswithcode.com/dataset/amazon-polarity-1
Explore at:
Dataset updated
May 15, 2024
Description
The Amazon Polarity dataset is a set of reviews from Amazon. The dataset is constructed by taking review scores 1 and 2 as negative (class 1), and 4 and 5 as positive (class 2). Reviews with a score of 3 are ignored. The dataset spans a period of 18 years, including approximately 35 million reviews up to March 2013. Each class in the dataset has 1,800,000 training samples and 200,000 testing samples. The dataset includes product and user information, ratings, and a plaintext review.
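
The labelling rule in this description can be written as a small function. This is an illustrative sketch only: the name polarity_class is hypothetical and the code is not taken from the dataset's own tooling.

```python
from typing import Optional


def polarity_class(score: int) -> Optional[int]:
    """Map a 1-5 review score to a polarity class, or None if the review is dropped."""
    if score in (1, 2):
        return 1  # negative
    if score in (4, 5):
        return 2  # positive
    if score == 3:
        return None  # score-3 reviews are ignored
    raise ValueError(f"score must be between 1 and 5, got {score}")


labels = [polarity_class(score) for score in range(1, 6)]
```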
The statistics of Amazon and Bookcross datasets.
plos.figshare.com
xls
Updated May 31, 2023
Cite
Wei Zeng; An Zeng; Ming-Sheng Shang; Yi-Cheng Zhang (2023). The statistics of Amazon and Bookcross datasets. [Dataset]. http://doi.org/10.1371/journal.pone.0079354.t001
Explore at:
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0079354.t001
Dataset updated
May 31, 2023
Dataset provided by
PLOS (http://plos.org/)
Authors
Wei Zeng; An Zeng; Ming-Sheng Shang; Yi-Cheng Zhang
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The sparsity is obtained by , where and are the number of users and items, respectively.
E-commerce Product Image Classification Dataset Dataset
paperswithcode.com
Updated Mar 23, 2025
Cite
(2025). E-commerce Product Image Classification Dataset Dataset [Dataset]. https://paperswithcode.com/dataset/e-commerce-product-image-classification
Explore at:
Dataset updated
Mar 23, 2025
Description
This dataset is specifically designed for the classification of e-commerce products based on their images, forming a critical part of an experimental study aimed at improving product categorization using computer vision techniques. Accurate categorization is essential for e-commerce platforms as it directly influences customer satisfaction, enhances user experience, and optimizes sales by ensuring that products are presented in the correct categories.

Data Collection and Sources

The dataset comprises a comprehensive collection of e-commerce product images gathered from a diverse range of sources, including prominent online marketplaces such as Amazon, Walmart, and Google, as well as additional resources obtained through web scraping. Additionally, the Amazon Berkeley Objects (ABO) project has been utilized to enhance the dataset in certain categories, though its contribution is limited to specific classes.

Dataset Composition and Structure

The dataset is organized into 9 distinct classes, primarily reflecting major product categories prevalent on Amazon. These categories were chosen based on a balance between representation and practicality, ensuring sufficient diversity and relevance for training and testing computer vision models. The dataset's structure includes:

18,175 images: Resized to 224x224 pixels, suitable for use in various pretrained CNN architectures.

9 Classes: Representing major e-commerce product categories, offering a broad spectrum of items typically found on online retail platforms.

Train-Val-Check Sets: The dataset is split into training, validation, and check sets. The training and validation sets are designated for model training and hyperparameter tuning, while a smaller check set is reserved for model deployment, providing a visual evaluation of the model's performance in a real-world scenario.

Application and Relevance

E-commerce platforms face significant challenges in product categorization due to the vast number of categories, the variety of products, and the need for precise classification. This dataset addresses these challenges by offering a well-balanced collection of images across multiple categories, allowing for robust model training and evaluation.

This dataset is sourced from Kaggle.
Amazon Web Services Public Data Sets
rrid.site
scicrunch.org
+1more
Updated Mar 22, 2025
+ more versions
Cite
(2025). Amazon Web Services Public Data Sets [Dataset]. http://identifiers.org/RRID:SCR_006318
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_006318
Dataset updated
Mar 22, 2025
Description
A multidisciplinary repository of public data sets such as the Human Genome and US Census data that can be seamlessly integrated into AWS cloud-based applications. AWS is hosting the public data sets at no charge for the community. Anyone can access these data sets from their Amazon Elastic Compute Cloud (Amazon EC2) instances and start computing on the data within minutes. Users can also leverage the entire AWS ecosystem and easily collaborate with other AWS users. If you have a public domain or non-proprietary data set that you think is useful and interesting to the AWS community, please submit a request and the AWS team will review your submission and get back to you. Typically the data sets in the repository are between 1 GB and 1 TB in size (based on the Amazon EBS volume limit), but the AWS team can work with you to host larger data sets as well. You must have the right to make the data freely available.
AI Training Data Market will grow at a CAGR of 23.50% from 2024 to 2031.
cognitivemarketresearch.com
pdf,excel,csv,ppt
Updated Jan 15, 2025
Cite
Cognitive Market Research (2025). AI Training Data Market will grow at a CAGR of 23.50% from 2024 to 2031. [Dataset]. https://www.cognitivemarketresearch.com/ai-training-data-market-report
Explore at:
Available download formats: pdf, excel, csv, ppt
Dataset updated
Jan 15, 2025
Dataset authored and provided by
Cognitive Market Research
License
https://www.cognitivemarketresearch.com/privacy-policy
Time period covered
2021 - 2033
Area covered
Global
Description
According to Cognitive Market Research, the global AI Training Data market size was USD 1865.2 million in 2023 and will expand at a compound annual growth rate (CAGR) of 23.50% from 2023 to 2030.
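
For readers who want to see what such a growth rate implies, the standard compound-growth relation is V_n = V_0 * (1 + r)^n. The sketch below applies it to the figures quoted above; the function and variable names are illustrative, and the result is a plain extrapolation of the stated rate, not a figure taken from the report.

```python
def project_value(start_value: float, cagr: float, years: int) -> float:
    """Compound a starting value at a constant annual growth rate."""
    return start_value * (1.0 + cagr) ** years


# Figures quoted in the description: USD 1865.2 million in 2023, CAGR of 23.50%.
market_2023_musd = 1865.2
cagr = 0.235

# Extrapolated 2030 market size in USD million, assuming the rate holds every year.
projected_2030_musd = project_value(market_2023_musd, cagr, 2030 - 2023)
print(f"Projected 2030 market size: USD {projected_2030_musd:,.1f} million")
```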

The demand for AI Training Data is rising due to the growing need for labelled data and the diversification of AI applications. Demand for Image/Video remains higher in the AI Training Data market. The Healthcare category held the highest AI Training Data market revenue share in 2023. The North American AI Training Data market will continue to lead, whereas the Asia-Pacific AI Training Data market will experience the most substantial growth until 2030.

Market Dynamics of AI Training Data Market

Key Drivers of AI Training Data Market

Rising Demand for Industry-Specific Datasets to Provide Viable Market Output

A key driver in the AI Training Data market is the escalating demand for industry-specific datasets. As businesses across sectors increasingly adopt AI applications, the need for highly specialized and domain-specific training data becomes critical. Industries such as healthcare, finance, and automotive require datasets that reflect the nuances and complexities unique to their domains. This demand fuels the growth of providers offering curated datasets tailored to specific industries, ensuring that AI models are trained with relevant and representative data, leading to enhanced performance and accuracy in diverse applications.

In July 2021, Amazon and Hugging Face, a provider of open-source natural language processing (NLP) technologies, announced a collaboration. The objective of this partnership was to accelerate the deployment of sophisticated NLP capabilities while making it easier for businesses to use cutting-edge machine-learning models. As part of this partnership, Hugging Face will suggest Amazon Web Services as a cloud service provider for its clients.

Advancements in Data Labelling Technologies to Propel Market Growth

The continuous advancements in data labelling technologies serve as another significant driver for the AI Training Data market. Efficient and accurate labelling is essential for training robust AI models. Innovations in automated and semi-automated labelling tools, leveraging techniques like computer vision and natural language processing, streamline the data annotation process. These technologies not only improve the speed and scalability of dataset preparation but also contribute to the overall quality and consistency of labelled data. The adoption of advanced labelling solutions addresses industry challenges related to data annotation, driving the market forward amidst the increasing demand for high-quality training data.

In June 2021, Scale AI and MIT Media Lab, a Massachusetts Institute of Technology research centre, began working together. To help doctors treat patients more effectively, this cooperation attempted to utilize ML in healthcare.

www.ncbi.nlm.nih.gov/pmc/articles/PMC7325854/

Restraint Factors Of AI Training Data Market

Data Privacy and Security Concerns to Restrict Market Growth

A significant restraint in the AI Training Data market is the growing concern over data privacy and security. As the demand for diverse and expansive datasets rises, so does the need for sensitive information. However, the collection and utilization of personal or proprietary data raise ethical and privacy issues. Companies and data providers face challenges in ensuring compliance with regulations and safeguarding against unauthorized access or misuse of sensitive information. Addressing these concerns becomes imperative to gain user trust and navigate the evolving landscape of data protection laws, which, in turn, poses a restraint on the smooth progression of the AI Training Data market.

How did COVID-19 impact the AI Training Data market?

The COVID-19 pandemic has had a multifaceted impact on the AI Training Data market. While the demand for AI solutions has accelerated across industries, the availability and collection of training data faced challenges. The pandemic disrupted traditional data collection methods, leading to a slowdown in the generation of labeled datasets due to restrictions on physical operations. Simultaneously, the surge in remote work and the increased reliance on AI-driven technologies for various applications fueled the need for diverse and relevant training data. This duali...
Steam Video Game and Bundle Data
cseweb.ucsd.edu
json
Cite
UCSD CSE Research Project, Steam Video Game and Bundle Data [Dataset]. https://cseweb.ucsd.edu/~jmcauley/datasets.html
Explore at:
Available download formats: json
Dataset authored and provided by
UCSD CSE Research Project
Description
These datasets contain reviews from the Steam video game platform, and information about which games were bundled together.

Metadata includes

reviews

purchases, plays, recommends (likes)

product bundles

pricing information

Basic Statistics:

Reviews: 7,793,069

Users: 2,567,538

Items: 15,474

Bundles: 615
Amazon_Customer_Review_2023
huggingface.co
+ more versions
Cite
Amazon_Customer_Review_2023 [Dataset]. https://huggingface.co/datasets/kevykibbz/Amazon_Customer_Review_2023
Explore at:
Croissant
Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Authors
kevin kibebe
Description
Amazon Product Review Dataset (2023)

Dataset Overview

The Amazon Product Review Dataset (2023) contains product reviews from Amazon customers. The dataset includes product information, review details, and metadata about the customers who left the reviews. This dataset can be used for various natural language processing (NLP) tasks, including sentiment analysis, review prediction, recommendation systems, and more.

Dataset Name: Amazon Product Review Dataset (2023)… See the full description on the dataset page: https://huggingface.co/datasets/kevykibbz/Amazon_Customer_Review_2023.
Amazon Seller Contact Intent Sequence
registry.opendata.aws
Updated Jun 21, 2023
Cite
Amazon (2023). Amazon Seller Contact Intent Sequence [Dataset]. https://registry.opendata.aws/amazon-seller-contact-intent-sequence/
Explore at:
Dataset updated
Jun 21, 2023
Dataset provided by
Amazon.com (http://amazon.com/)
Description
When sellers need help from Amazon, such as how to create a listing, they often reach out to Amazon seller support through email, chat or phone. For each contact, we assign an intent so that we can manage the request more easily. The data we present in this release includes 548k contacts with 118 intents from 70k sellers sampled from recent years. There are 3 columns: 1. De-identified seller id - seller_id_anon; 2. Noisy inter-arrival time, in hours, between contacts - interarrival_time_hr_noisy; 3. An integer that represents the contact intent - contact_intent. Note that, to balance data anonymization against usefulness, we randomly perturbed the interarrival time in an intricate way such that the temporal patterns are preserved and seller identities are anonymized to the largest extent. We also note that for each seller_id_anon, the interarrival_time_hr_noisy values are already arranged in chronological order, the first contact_intent_id_anon is always the origin when sellers begin to sell with us, and the interarrival_time_hr_noisy values for each seller_id_anon are all relative to the previous contact. A straightforward use case of the data is to predict the next timestamp and intent of a user given the user's history.
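
Because each inter-arrival time is relative to the previous contact, a per-seller timeline can be rebuilt with a running sum. The sketch below assumes rows arrive as (seller_id_anon, interarrival_time_hr_noisy, contact_intent) tuples in the chronological order described above; the function name contact_timelines and the sample rows are hypothetical.

```python
from collections import defaultdict


def contact_timelines(rows):
    """Return {seller_id: [(hours_since_origin, contact_intent), ...]}."""
    elapsed = defaultdict(float)   # running total of hours per seller
    timelines = defaultdict(list)
    for seller_id, gap_hours, intent in rows:
        # Each gap is relative to the seller's previous contact, so accumulate it.
        elapsed[seller_id] += gap_hours
        timelines[seller_id].append((elapsed[seller_id], intent))
    return dict(timelines)


# Hypothetical rows for illustration; not taken from the released data.
sample_rows = [
    ("s1", 0.0, 7),
    ("s1", 12.5, 3),
    ("s2", 0.0, 7),
    ("s1", 30.0, 3),
]
timelines = contact_timelines(sample_rows)
```

A next-contact predictor could then be trained on these per-seller sequences of elapsed time and intent.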
Amazon-Reviews-2023
huggingface.co
Updated Apr 7, 2024
Cite
McAuley-Lab (2024). Amazon-Reviews-2023 [Dataset]. https://huggingface.co/datasets/McAuley-Lab/Amazon-Reviews-2023
Dataset updated
Apr 7, 2024
Dataset authored and provided by
McAuley-Lab
Description
Amazon Review 2023 is an updated version of the Amazon Review 2018 dataset. This dataset mainly includes reviews (ratings, text) and item metadata (descriptions, category information, price, brand, and images). Compared to the previous versions, the 2023 version features larger size, newer reviews (up to Sep 2023), richer and cleaner metadata, and finer-grained timestamps (from day to millisecond).
Facebook posts for analyzing Content Strategies for Digital Consumer...
zenodo.org
data.niaid.nih.gov
csv
Updated Jul 20, 2021
Cite
Gustavo N. de Sousa; Antonio F. L. Jacob Junior; Fábio M. F. Lobato (2021). Facebook posts for analyzing Content Strategies for Digital Consumer Engagement: a curated dataset [Dataset]. http://doi.org/10.5281/zenodo.5113266
Explore at:
Available download formats: csv
Unique identifier
https://doi.org/10.5281/zenodo.5113266
Dataset updated
Jul 20, 2021
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Gustavo N. de Sousa; Antonio F. L. Jacob Junior; Fábio M. F. Lobato
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
This database contains public data from publications made on Facebook by small and medium companies in the tourism sector operating in the Amazon region in Brazil. The collection was carried out from January to June 2018 using the public API provided by the platform.

These data were processed and classified by different evaluators according to the content categories proposed by Gavilanes. The dataset therefore has a column called "category" that contains the classification of each publication, with each number associated with a category, as described below:

“0” - No category

“1” - New product announcement

“2” - Sweepstakes and contest

“3” - Sales

“4” - Consumer Feedback

“5” - Infotainment

“6” - Organization Branding

“N\A” - Non-agreement between evaluators

The columns present in the CSV file are:

“status_id” - Identifier of each publication on the social network;

“status_message” - The textual content of each publication;

“link_name” - The part of the profile to which the post relates;

“status_type” - The type of the publication: ’photo’, ’video’, ’status’ and/or ’link’;

“status_link” - URL of the publication on the platform;

“status_published” - Publication date in the social network;

“num_comments” - Number of comments made by users;

Reactions - Numbers of emoticon reactions from users on the post; these numbers are present in the columns "num_reactions", "num_shares", "num_likes", "num_loves", "num_wows", "num_hahas", "num_sads", "num_angrys", "num_special";

“category” - The category of content that we mentioned before.
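
The category codes listed above can be decoded into readable labels when reading the CSV. The following is a minimal sketch using Python's standard csv module; the in-memory sample rows, the `label_for` helper, and the "Unknown code" fallback are assumptions made for illustration, and only the code-to-label mapping comes from the description.

```python
import csv
import io
from collections import Counter

# Mapping taken from the category list in the description.
CATEGORY_LABELS = {
    "0": "No category",
    "1": "New product announcement",
    "2": "Sweepstakes and contest",
    "3": "Sales",
    "4": "Consumer Feedback",
    "5": "Infotainment",
    "6": "Organization Branding",
    "N\\A": "Non-agreement between evaluators",
}

def label_for(code):
    """Translate a raw category value into its label (hypothetical helper)."""
    return CATEGORY_LABELS.get(code.strip(), "Unknown code")

# Hypothetical sample standing in for the real file; values are invented.
sample_csv = "status_id,category\n101,3\n102,5\n103,N\\A\n104,3\n"

# Count how many publications fall into each category.
counts = Counter(
    label_for(row["category"])
    for row in csv.DictReader(io.StringIO(sample_csv))
)
```

To run this against the real file, `io.StringIO(sample_csv)` would be replaced by an opened file handle; the exact spelling of the non-agreement value in the file should be checked before relying on the mapping.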
Amazon Fine Foods Dataset
paperswithcode.com
Updated Feb 7, 2021
+ more versions
Cite
Julian McAuley; Jure Leskovec (2021). Amazon Fine Foods Dataset [Dataset]. https://paperswithcode.com/dataset/amazon-fine-foods
Dataset updated
Feb 7, 2021
Authors
Julian McAuley; Jure Leskovec
Description
Amazon Fine Foods is a dataset of fine-food reviews from Amazon. The data span a period of more than 10 years, including all ~500,000 reviews up to October 2012. Reviews include product and user information, ratings, and a plain-text review.
Allegheny County Anxiety Medication
data.wprdc.org
catalog.data.gov
csv, html, xlsx
Updated Jun 8, 2024
Cite
Allegheny County (2024). Allegheny County Anxiety Medication [Dataset]. https://data.wprdc.org/dataset/anxiety
Explore at:
Available download formats: xlsx, html, csv, csv(12405)
Dataset updated
Jun 8, 2024
Dataset provided by
Allegheny County
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Area covered
Allegheny County
Description
The Census Tract-level datasets described here provide de-identified diagnosis data for customers of three managed care organizations in Allegheny County (Gateway Health Plan, Highmark Health, and UPMC) who filed a claim for anxiety medications in 2015 and 2016. The data also include the number of enrolled members in the three participating managed care organizations in 2015 and 2016.

Disclaimer: Users should be cautious about using administrative claims data as a measure of disease prevalence and about interpreting trends over time, as the data provided were collected for purposes other than surveillance. Limitations of these data include but are not limited to: misclassification, duplicate individuals, and exclusion of individuals who did not seek care in the past two years and of those who are uninsured, enrolled in plans not represented in the dataset, or not enrolled in one of the represented plans for at least 90 days.

Support for Health Equity datasets and tools provided by Amazon Web Services (AWS) through their Health Equity Initiative.