https://brightdata.com/license
Gain extensive insights with our Amazon datasets, encompassing detailed product information including pricing, reviews, ratings, brand names, product categories, sellers, ASINs, images, and much more. Ideal for market researchers, data analysts, and eCommerce professionals looking to excel in the competitive online marketplace.
• Over 425M records available
• Price starts at $250/100K records
• Data formats are available in JSON, NDJSON, CSV, XLSX and Parquet
• 100% ethical and compliant data collection
Included datapoints:
Title Asin Main Image Brand Name Description Availability Subcategory Categories Parent Asin Type Product Type Name Model Number Manufacturer Color Size Date First Available Released Model Year Item Model Number Part Number Price Total Reviews Total Ratings Average Rating Features Best Sellers Rank Subcategory Buybox Buybox Seller Id Buybox Is Amazon Images Product URL And more
Amazon's monthly revenue in the United States for beauty and personal care sales is estimated to range from 2.4 to 3.8 billion U.S. dollars in 2024. The sales revenue is estimated to experience two significant peaks, one in July 2024 at 3.6 billion U.S. dollars and the other in December 2024 at 3.8 billion U.S. dollars. In the same year, beauty and personal care products were one of Amazon's most profitable product categories.
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0) https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Amazon is one of the most recognisable brands in the world, and the third largest by revenue. It was the fourth tech company to reach a $1 trillion market cap, and a market leader in e-commerce,...
Over the course of 2024, in the United States, the retailer market share of home improvement sales on Amazon varied. The retailer market share started at just over 14 percent at the beginning of the year. Throughout the year, the market share fell, before rising to around 18 percent in the last month of the year.
https://brightdata.com/license
Utilize our Amazon reviews dataset for diverse applications to enrich business strategies and market insights. Analyzing this dataset can aid in understanding customer behavior, product performance, and market trends, empowering organizations to refine their product and marketing strategies. Access the entire dataset or tailor a subset to fit your requirements. Popular use cases include:
• Product Performance Analysis: Analyze Amazon reviews to assess product performance, uncovering customer satisfaction levels, common issues, and highly praised features to inform product improvements and marketing messages.
• Customer Behavior Insights: Gain insights into customer behavior, purchasing patterns, and preferences, enabling more personalized marketing and product recommendations.
• Demand Forecasting: Leverage Amazon reviews to predict future product demand by analyzing historical review data and identifying trends, helping to optimize inventory management and sales strategies.
Accessing and analyzing the Amazon reviews dataset supports market strategy optimization by leveraging insights to analyze key market trends and customer preferences, enhancing overall business decision-making.
YouTube has emerged as the dominant social media platform for driving traffic to Amazon.com, accounting for nearly 60 percent of referrals to the e-commerce platform in December 2023. Facebook.com and Twitter.com followed, contributing about ten and nine percent of social media referrals respectively, while Reddit and WhatsApp rounded out the top five sources.

Amazon's dominance
Amazon's position as the leading online retailer in the United States is evident in its traffic and sales figures. In December 2023, Amazon still recorded an impressive 2.7 billion combined visits. The company's financial performance remains strong, with a net income of approximately 13.5 billion U.S. dollars in the second quarter of 2024, up from the previous quarters.

Mobile presence
Amazon's mobile presence continues to grow, with its shopping app downloads reaching a nine-year peak in August 2022 at approximately 25 million. As of July 2024, the Amazon Shopping app reached over 18 million downloads across iOS and Android platforms. That month, Amazon’s shopping app was the most popular app published by the e-commerce and tech giant.
ODC Public Domain Dedication and Licence (PDDL) v1.0 http://www.opendatacommons.org/licenses/pddl/1.0/
License information was derived automatically
This dataset contains information on smart watches from Amazon, scraped using Selenium. It can be used for similarity search. Attributes included are: Name, price, Brand, Model Name, Style, Colour, Screen Size Series, Special Feature, Target Audience, Age Range (Description), Shape, Item Dimensions LxWxH, Item Weight, Battery Life.
Amazon Customer Reviews (a.k.a. Product Reviews) is one of Amazon's iconic products. In a period of over two decades since the first review in 1995, millions of Amazon customers have contributed over a hundred million reviews to express opinions and describe their experiences regarding products on the Amazon.com website. This makes Amazon Customer Reviews a rich source of information for academic researchers in the fields of Natural Language Processing (NLP), Information Retrieval (IR), and Machine Learning (ML), amongst others. Accordingly, we are releasing this data to further research in multiple disciplines related to understanding customer product experiences. Specifically, this dataset was constructed to represent a sample of customer evaluations and opinions, variation in the perception of a product across geographical regions, and promotional intent or bias in reviews.
More than 130 million customer reviews are available to researchers as part of this release. The data is available in TSV files in the amazon-reviews-pds S3 bucket in the AWS US East Region. Each line in the data files corresponds to an individual review (tab delimited, with no quote and escape characters).
Each dataset contains the following columns:
marketplace - 2 letter country code of the marketplace where the review was written.
customer_id - Random identifier that can be used to aggregate reviews written by a single author.
review_id - The unique ID of the review.
product_id - The unique Product ID the review pertains to. In the multilingual dataset the reviews for the same product in different countries can be grouped by the same product_id.
product_parent - Random identifier that can be used to aggregate reviews for the same product.
product_title - Title of the product.
product_category - Broad product category that can be used to group reviews (also used to group the dataset into coherent parts).
star_rating - The 1-5 star rating of the review.
helpful_votes - Number of helpful votes.
total_votes - Number of total votes the review received.
vine - Review was written as part of the Vine program.
verified_purchase - The review is on a verified purchase.
review_headline - The title of the review.
review_body - The review text.
review_date - The date the review was written.
To use this dataset:
import tensorflow_datasets as tfds

# Load the training split of the Amazon US reviews dataset.
ds = tfds.load('amazon_us_reviews', split='train')
# Print the first four examples.
for ex in ds.take(4):
    print(ex)
See the guide for more information on tensorflow_datasets.
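For readers working with the raw TSV files instead of tensorflow_datasets, the tab-delimited layout described above (no quote or escape characters) could be parsed roughly as follows. This is a minimal sketch using only the Python standard library. It assumes each file starts with a header row naming the listed columns, and the sample row is invented purely for illustration.

```python
import csv
import io

# Column names as listed in the dataset description.
COLUMNS = [
    "marketplace", "customer_id", "review_id", "product_id", "product_parent",
    "product_title", "product_category", "star_rating", "helpful_votes",
    "total_votes", "vine", "verified_purchase", "review_headline",
    "review_body", "review_date",
]


def read_reviews(handle):
    """Yield one dict per review from a tab-delimited file object.

    Assumption: the first line of the file is a header row.
    """
    # QUOTE_NONE treats quote characters as ordinary text, matching the
    # "no quote and escape characters" note in the description.
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        row["star_rating"] = int(row["star_rating"])
        yield row


# Invented sample data, only to show the expected shape of a record.
sample_row = [
    "US", "12345", "R1EXAMPLE", "B00EXAMPLE", "67890", 'Example 7" Tablet',
    "PC", "5", "3", "4", "N", "Y", "Great", "Works well", "2015-08-31",
]
sample = "\t".join(COLUMNS) + "\n" + "\t".join(sample_row) + "\n"
rows = list(read_reviews(io.StringIO(sample)))
print(rows[0]["product_title"], rows[0]["star_rating"])
```

Disabling quoting matters here because product titles and review text can contain stray double quotes, which a default CSV reader would try to interpret.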
Amazon is known as an e-commerce company, but in recent years, the retailer has invested in opening physical stores across the United States with more international expansion in mind. Amazon’s physical retail stores come in different formats, including Amazon Fresh grocery stores, Amazon Go, Amazon Books, Amazon 4 Star, and Amazon Pop-up. Typically, Amazon’s branded devices, books and other merchandise are available in these stores. In the fourth quarter of 2024, net sales from Amazon’s physical retailing amounted to nearly 5.8 billion U.S. dollars.

Whole Foods acquisition and Amazon Fresh
Amazon’s venture into brick-and-mortar grocery store retailing started with the acquisition of Whole Foods Market in 2017. By that year, just before it was bought out by Amazon, Whole Foods had registered a net sales revenue of over 16 billion U.S. dollars. In addition to some 500 Whole Foods locations, Amazon’s grocery retail business is supported by Amazon Fresh with stores predominantly in the United States. Outside of the United States, Amazon opened its first Amazon Fresh stores in the United Kingdom in March 2021.

Amazon’s retail portfolio
Amazon has a diverse retail portfolio, both in terms of merchandise and the business models it offers across its platforms. While it started its e-commerce business as an online retailer acting as the first-party owner of the products on offer, third-party selling on the Amazon marketplace increasingly became the norm among online sellers, who often employ both models when working with Amazon. Since 2017, more than half of Amazon’s paid units have been attributed to third-party sellers using the Amazon marketplace to sell their products.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
CC-BY-NC-SA Credit to NovelRank.com for compiling the data and Amazon.com as the data source.
I've been collecting salesrank for authors publishing through Amazon worldwide for almost a decade via the site NovelRank.com. The data is collected as frequently as hourly and as infrequently as once every 24 hours. Over a single year this represents GBs of data. I think this would be a great time to let the Kaggle community play with it.
The earliest data is from Jan 1, 2017. The latest data is from June 29, 2018. Within the 61,000+ unique books, there is roughly a 50/50 split between Kindle Editions and Print Editions. This is critically important because Amazon sales rankings are grouped under the Books umbrella into those two categories. Thus two books in the data set can have the same sales rank at the same time if one is in the Kindle group and the other is in the book group.
Within the data set there is a small subset of books that have more consistent sales rank collection, specifically hourly salesrank collection. (Future Goal: offer a .zip file of only these ASINs.) These titles are tracked by NovelRank Pro users, who have the benefit of unthrottled tracking. Books that don't sell for a while will have tracking throttled to as low as once every 24 hours until a drop in sales rank is detected, hence the variability in most of the data collection timestamps.
Finally, when salesrank has not changed, NovelRank does not record it. In other words, taking the books that have hourly checks mentioned above, if salesrank has not changed, then there would be a gap, possibly 2 hours between the data points or more due to this housekeeping detail. This is true for books that maintain a very good ranking (where it is harder for the book to manage but more likely to occur) as well as for books with a very low ranking.
Sales rank is updated on an hourly basis (at best) by Amazon.
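Because unchanged ranks are not recorded, an analysis of the hourly-tracked books may want to rebuild a regular hourly series first. The following is a minimal sketch of one way to do that, by carrying the last recorded rank forward. The function name and the sample values are hypothetical, and the approach is only reasonable for titles with hourly checks: for throttled titles a gap may simply mean that no check took place.

```python
from datetime import datetime, timedelta


def fill_hourly(observations, start, end):
    """Carry the last recorded rank forward onto an hourly grid.

    observations: list of (datetime, rank) tuples sorted by time, where a
    row exists only when the rank changed.
    Returns a list of (datetime, rank or None) for each hour in [start, end].
    """
    filled = []
    last_rank = None  # stays None until the first observation is reached
    i = 0
    t = start
    while t <= end:
        # Consume every observation recorded at or before this hour.
        while i < len(observations) and observations[i][0] <= t:
            last_rank = observations[i][1]
            i += 1
        filled.append((t, last_rank))
        t += timedelta(hours=1)
    return filled


# Invented example: the rank changed at 00:00 and again at 03:00.
obs = [
    (datetime(2017, 1, 1, 0), 1500),
    (datetime(2017, 1, 1, 3), 1200),
]
series = fill_hourly(obs, datetime(2017, 1, 1, 0), datetime(2017, 1, 1, 4))
ranks = [rank for _, rank in series]
print(ranks)  # [1500, 1500, 1500, 1200, 1200]
```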
For years I've used salesrank changes to estimate the number of sales for authors. Because of inherent flaws in ranking data as a primary source, this has worked better for low-volume sellers than for high-volume sellers. The problem is aggravated by unreliable data collection when matching actual sales to sales rank changes.
Some of the flaws:
https://www.datainsightsmarket.com/privacy-policy
The AI training data market is experiencing robust growth, driven by the increasing adoption of artificial intelligence across diverse sectors. The market's expansion is fueled by the escalating demand for high-quality data to train sophisticated AI models, enabling improved accuracy and performance in applications like computer vision, natural language processing, and machine learning. The market size in 2025 is estimated at $15 billion, projecting a Compound Annual Growth Rate (CAGR) of 25% from 2025 to 2033. This significant growth trajectory is underpinned by several key factors: the proliferation of AI-powered applications across industries, advancements in AI algorithms requiring larger and more diverse datasets, and the rising availability of data annotation tools and platforms. However, challenges remain, including data privacy concerns, the high cost of data acquisition and annotation, and the need for skilled professionals to manage and curate these vast datasets. The market is segmented by data type (text, image, video, audio), application (autonomous vehicles, healthcare, finance), and region, with North America currently holding the largest market share due to early adoption of AI technologies and the presence of major technology companies. Key players in the market, such as Google (Kaggle), Amazon Web Services, Microsoft, and Appen Limited, are strategically investing in developing advanced data annotation tools and expanding their data acquisition capabilities to cater to this burgeoning demand. The competitive landscape is characterized by both established players and emerging startups, leading to innovation in data acquisition techniques, data quality control, and the development of specialized data annotation services. 
The future of the market is poised for further expansion, driven by the growing adoption of AI in emerging technologies like the metaverse and the Internet of Things (IoT), along with increasing government investments in AI research and development. Addressing data privacy concerns and fostering ethical data collection practices will be crucial to sustainable growth in the coming years. This will involve greater transparency and robust regulatory frameworks.
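The growth figures quoted above can be sanity-checked with the standard compound annual growth rate formula. The sketch below uses hypothetical helper names and only the two numbers stated in the description (an estimated $15 billion in 2025 and a 25% CAGR through 2033). The resulting 2033 value is an implication of those inputs, not a figure taken from the report.

```python
def project_value(start_value, cagr, years):
    """Compound start_value at the given annual growth rate for `years` years."""
    return start_value * (1 + cagr) ** years


def implied_cagr(start_value, end_value, years):
    """Annual growth rate that turns start_value into end_value over `years` years."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures from the description: USD 15 billion in 2025, 25% CAGR, 2025 to 2033.
projected_2033 = project_value(15.0, 0.25, 2033 - 2025)
print(f"Implied 2033 market size: about USD {projected_2033:.1f} billion")
```

The second helper runs the calculation in the other direction, which is useful when a report states start and end values and the quoted growth rate needs checking.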
In January 2025, around 3.8 million users visited pharmacy.amazon.com. The site is Amazon's homepage for its online pharmacy services. Visitors stayed on the site for an average duration of three minutes and 15 seconds.
https://www.archivemarketresearch.com/privacy-policy
The B2C e-commerce market was valued at USD 6.23 trillion in 2023 and is projected to reach USD 21.18 trillion by 2032, exhibiting a CAGR of 19.1% during the forecast period. B2C e-commerce can be defined as the sale of commercial products or services over the internet between buyers and sellers. The market spans several industries, including retail, travel, electronics, and digital products. Some of the most common implementations are e-commerce sites, mobile applications, and membership services. Notable aspects of the B2C e-commerce market include the increased popularity of omnichannel retailing, which combines online and offline environments, and a shift toward individualization driven by digitalization and data processing using artificial intelligence and machine learning. Growth is also noted in mobile commerce (m-commerce) as a result of the increase in the number of mobile devices and more effective mobile payments. Social commerce and sustainability have also become significant, owing to the increasing importance of ethical and convenient shopping. Recent developments include:
• In March 2024, Blink, an Amazon company, launched the Blink Mini 2 camera. The new compact plug-in camera offers enhanced features such as person detection, a broader field of view, a built-in LED spotlight for night view in color, and improved image quality. The Blink Mini 2 is designed to work indoors and outdoors, with the option to purchase the Blink Weather Resistant Power Adapter for outdoor use.
• In October 2023, Flipkart.com introduced the 'Flipkart Commerce Cloud,' a customized suite of AI-driven retail technology solutions for global retailers and e-commerce businesses. This extensive offering includes marketplace technology, retail media solutions, pricing, and inventory management features rigorously assessed by Flipkart.com. The company aims to equip international sellers with reliable and secure tools to enhance business expansion and efficiency within the competitive global market.
• In August 2023, Shopify and Amazon.com, Inc. announced a strategic partnership that will allow Shopify merchants to seamlessly implement Amazon's "Buy with Prime" option on their sites. As a result of the agreement, Amazon Prime customers will enjoy a more efficient checkout process on various platforms. This collaboration allows Amazon Prime members to utilize their existing Amazon payment options, while Shopify will handle the transaction processing through its system.
• In February 2023, eBay acquired 3PM Shield, a developer of AI-powered online retail solutions. 3PM Shield uses machine learning and artificial intelligence to analyze extensive data sets, enhancing marketplace compliance and user experience. This acquisition aligns with eBay's goal to offer a "safe and reliable" platform by boosting its ability to block the sale of counterfeit and prohibited items. By incorporating 3PM Shield's monitoring technologies, eBay seeks to enhance its capability to address problematic seller behavior and spot problematic listings, fostering a safer e-commerce space for its worldwide community of sellers and buyers.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset contains cloud network performance data related to the Amazon S3 storage service. The dataset refers to experimental campaigns conducted in May 2016. The dataset was collected leveraging 77 Bismark VPs, instructed as detailed in the following. Each VP performed repeated download cycles over 7 days. Each cycle is composed of 40 sequential download requests spaced out by 10 seconds and uniquely identified by a combination of factors, i.e. cloud region, file size, and storage class. Downloads within cycles are randomly scheduled and repeated from each VP every 2 hours. After every download, VPs run TCP-traceroute towards the IP address that served the request in order to trace the information related to the path and estimate the RTT to the S3 cloud datacenter (note that this information is not always available, due to the version of the firmware of the Bismark nodes and to the measurement tools available on them).
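As a rough guide to the scale of the campaign, the schedule described above implies a nominal number of downloads per vantage point (VP). The short calculation below is only a back-of-the-envelope sketch: it assumes every VP completed every cycle for the full 7 days, so the result is an upper bound and the actual number of records in the dataset may be lower.

```python
# Parameters taken from the description of the measurement campaign.
DAYS = 7
HOURS_BETWEEN_CYCLES = 2
DOWNLOADS_PER_CYCLE = 40
SPACING_SECONDS = 10
VANTAGE_POINTS = 77

# One cycle every 2 hours over 7 days.
cycles_per_vp = DAYS * 24 // HOURS_BETWEEN_CYCLES
# 40 download requests per cycle.
downloads_per_vp = cycles_per_vp * DOWNLOADS_PER_CYCLE
# Nominal total if all 77 VPs completed every cycle (an assumption).
nominal_total = downloads_per_vp * VANTAGE_POINTS
# Waiting time between the 40 requests of one cycle, excluding transfer time.
min_cycle_seconds = (DOWNLOADS_PER_CYCLE - 1) * SPACING_SECONDS

print(cycles_per_vp, downloads_per_vp, nominal_total, min_cycle_seconds)
```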
When referring to our data set, please cite the following reference: Valerio Persico, Antonio Montieri, Antonio Pescapè: On the Network Performance of Amazon S3 Cloud-Storage Service. CloudNet 2016: 113-118
Comprehensive dataset covering Amazon's 1.9 million active third-party sellers worldwide, including regional distribution, growth trends, and marketplace dynamics from 2000-2025.
The net cash of Amazon.com, headquartered in the United States, amounted to 107.95 billion U.S. dollars in 2024. The reported fiscal year ends on December 31. Compared to the earliest depicted value from 2020, this is a total increase of approximately 41.89 billion U.S. dollars. The trend from 2020 to 2024 shows, however, that this increase did not happen continuously.
Open Data Commons Attribution License (ODC-By) v1.0 https://www.opendatacommons.org/licenses/by/1.0/
License information was derived automatically
Programming Languages Infrastructure as Code (PL-IaC) enables IaC programs written in general-purpose programming languages like Python and TypeScript. The currently available PL-IaC solutions are Pulumi and the Cloud Development Kits (CDKs) of Amazon Web Services (AWS) and Terraform. This dataset provides metadata and initial analyses of all public GitHub repositories in August 2022 with an IaC program, including their programming languages, applied testing techniques, and licenses. Further, we provide a shallow copy of the head state of those 7104 repositories whose licenses permit redistribution. The dataset is available under the Open Data Commons Attribution License (ODC-By) v1.0.
Contents:
This artifact is part of the ProTI Infrastructure as Code testing project: https://proti-iac.github.io.
The dataset's metadata comprises three tabular CSV files containing metadata about all analyzed repositories, IaC programs, and testing source code files.
repositories.csv:
programs.csv:
testing-files.csv:
scripts-and-logs.zip contains all scripts and logs of the creation of this dataset. In it, executions/executions.log documents the commands that generated this dataset in detail. On a high level, the dataset was created as follows:
The repositories are searched through search-repositories.py and saved in a CSV file. The script takes these arguments in the following order:
Pulumi projects have a Pulumi.yaml or Pulumi.yml (case-sensitive file name) file in their root folder, i.e., (3) is Pulumi and (4) is yml,yaml. https://www.pulumi.com/docs/intro/concepts/project/
AWS CDK projects have a cdk.json (case-sensitive file name) file in their root folder, i.e., (3) is cdk and (4) is json. https://docs.aws.amazon.com/cdk/v2/guide/cli.html
CDK for Terraform (CDKTF) projects have a cdktf.json (case-sensitive file name) file in their root folder, i.e., (3) is cdktf and (4) is json. https://www.terraform.io/cdktf/create-and-deploy/project-setup
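The three file-name rules above can be expressed as a small lookup. The sketch below is not the dataset's search-repositories.py, which queries the GitHub code search API. It is a hypothetical local check, with invented names, that applies the same case-sensitive file names to a folder on disk.

```python
import tempfile
from pathlib import Path

# Marker file names from the rules above, mapped to hypothetical type labels.
MARKERS = {
    "Pulumi.yaml": "pulumi",
    "Pulumi.yml": "pulumi",
    "cdk.json": "aws-cdk",
    "cdktf.json": "cdktf",
}


def detect_iac_types(project_root):
    """Return the sorted PL-IaC types whose marker file is in project_root."""
    # Compare against the real directory listing so the match stays
    # case-sensitive even on case-insensitive file systems.
    names = {p.name for p in Path(project_root).iterdir() if p.is_file()}
    return sorted({kind for marker, kind in MARKERS.items() if marker in names})


# Small demonstration on a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "cdk.json").write_text("{}")
    (Path(tmp) / "pulumi.yaml").write_text("")  # wrong case, must not match
    detected = detect_iac_types(tmp)

print(detected)  # ['aws-cdk']
```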
The script uses the GitHub code search API and inherits its limitations:
More details: https://docs.github.com/en/search-github/searching-on-github/searching-code
The results of the GitHub code search API are not stable. However, the generally more robust GraphQL API does not support searching for files in repositories: https://stackoverflow.com/questions/45382069/search-for-code-in-github-using-graphql-v4-api
download-repositories.py downloads all repositories in CSV files generated through search-repositories.py and generates an overview CSV file of the downloads. The script takes these arguments in the following order:
The script only downloads a shallow recursive copy of the HEAD of the repo, i.e., only the main branch's most recent state, including submodules, without the rest of the git history. Each repository is downloaded to a subfolder named by the repository's ID.
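A shallow recursive copy of this kind can be obtained with standard git options. The sketch below only builds the command line. It is an assumption about how such a download might be done, not the actual contents of download-repositories.py, and the function name, the example URL, and the `repos` target folder are hypothetical.

```python
def shallow_clone_command(clone_url, repo_id, target_root="repos"):
    """Build a git command that fetches only the latest state of the default branch."""
    target = f"{target_root}/{repo_id}"  # one subfolder per repository ID
    return [
        "git", "clone",
        "--depth", "1",            # no history beyond the most recent commit
        "--recurse-submodules",    # also fetch submodules
        "--shallow-submodules",    # and keep the submodules shallow as well
        clone_url,
        target,
    ]


cmd = shallow_clone_command("https://github.com/example/repo.git", 123456)
print(" ".join(cmd))
# To run it: import subprocess; subprocess.run(cmd, check=True)
```

With `--depth 1`, git clones a single branch by default, which matches the description of downloading only the main branch's most recent state.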
Discover the unparalleled potential of our comprehensive eCommerce leads database, featuring essential data fields such as Store Name, Website, Contact First Name, Contact Last Name, Email Address, Physical Address, City, State, Country, Zip Code, Phone Number, Revenue Size, Employee Size, and more on demand.
With a focus on Shopify, Amazon, eBay, and other global retail stores, this database equips you with accurate information for successful marketing campaigns. Supercharge your marketing efforts with our enriched contact and company database, providing real-time, verified data insights for strategic market assessments and effective buyer engagement across digital and traditional channels.
• 4M+ eCommerce Companies
• 40M+ Worldwide eCommerce Leads
• Direct Contact Info for Shop Owners
• 47+ eCommerce Platforms
• 40+ Data Points
• Lifetime Access
• 10+ Data Segmentations
• Sample Data
https://www.marketresearchintellect.com/privacy-policy
Market Research Intellect presents the Amazon RDS Consulting Service Market Report. The market is estimated at USD 2.1 billion in 2024 and predicted to grow to USD 4.3 billion by 2033, with a CAGR of 8.8% over the forecast period. Gain clarity on regional performance, future innovations, and major players worldwide.
Open Database License (ODbL) v1.0 https://www.opendatacommons.org/licenses/odbl/1.0/
License information was derived automatically
This dataset is from the Amazon ML Challenge 2021. The Amazon catalog consists of billions of products that belong to thousands of browse nodes (each browse node represents a collection of items for sale). Browse nodes are used to help customers navigate through our website and classify products to product type groups. Hence, it is important to predict the node assignment at the time of listing of the product or when the browse node information is absent.
This dataset is part of the "Amazon ML Challenge" hackathon, which was held on July 30, 2021. It contains:
Key column – PRODUCT_ID
Input features – TITLE, DESCRIPTION, BULLET_POINTS, BRAND
Target column – BROWSE_NODE_ID
Train dataset size – 2,903,024
Number of classes in Train – 9,919
Overall Test dataset size – 110,775
Please do UPVOTE it if you find it useful 😊 Currently it has 5k+ views and 300+ downloads. Help it reach out to more users!!
All the credit for the dataset goes to Amazon and HackerEarth (the platform on which the Hackathon was hosted)
The contest used accuracy as the evaluation metric to measure submission quality. Since this is a multiclass classification problem, the relevant measure is subset accuracy: the set of labels predicted for a sample must exactly match the corresponding set of ground truth labels.
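With a single BROWSE_NODE_ID per product, the metric reduces to the fraction of products whose predicted node equals the true node. The following is a minimal sketch of that computation. The function name and the node IDs are made up for illustration, and this is not the contest's official scoring code.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label exactly matches the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Invented browse node IDs: three of the four predictions are correct.
score = accuracy([101, 205, 101, 330], [101, 205, 330, 330])
print(score)  # 0.75
```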