68 datasets found
  1. Amazon review data 2018

    • mcauleylab.ucsd.edu
    • nijianmo.github.io
    • +1 more
    Updated May 31, 2023
    Cite
    UCSD CSE Research Project (2023). Amazon review data 2018 [Dataset]. https://mcauleylab.ucsd.edu:8443/public_datasets/data/amazon_v2/
    Explore at:
    Dataset updated
    May 31, 2023
    Dataset authored and provided by
    UCSD CSE Research Project
    Description

    Context

    This dataset is an updated version of the Amazon review dataset released in 2014. As in the previous version, this dataset includes reviews (ratings, text, helpfulness votes), product metadata (descriptions, category information, price, brand, and image features), and links (also viewed/also bought graphs). In addition, this version provides the following features:

    • More reviews:

      • The total number of reviews is 233.1 million (142.8 million in 2014).
    • New reviews:

      • Current data includes reviews in the range May 1996 - Oct 2018.
    • Metadata:

      • We have added transaction metadata for each review shown on the review page.

      • Added more detailed metadata of the product landing page.

    Acknowledgements

    If you publish articles based on this dataset, please cite the following paper:

    • Jianmo Ni, Jiacheng Li, Julian McAuley. Justifying recommendations using distantly-labeled reviews and fine-grained aspects. EMNLP, 2019.
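
    As a rough illustration of working with review records like those described above, the sketch below reads gzip-compressed JSON with one review per line and averages the ratings. The file layout and the field names (asin, overall, reviewText) are assumptions made for this example, not something the listing states, and the sample records are invented.

```python
import gzip
import io
import json


def iter_reviews(fileobj):
    """Yield one dict per non-empty line of a JSON-lines text stream."""
    for line in fileobj:
        line = line.strip()
        if line:
            yield json.loads(line)


# Invented sample records; the field names are assumptions for illustration.
SAMPLE = [
    {"asin": "B000000001", "overall": 5.0, "reviewText": "Great strings."},
    {"asin": "B000000001", "overall": 2.0, "reviewText": "Broke quickly."},
]

# Build an in-memory gzip payload so the example runs without any download.
raw = gzip.compress("\n".join(json.dumps(r) for r in SAMPLE).encode("utf-8"))

with gzip.open(io.BytesIO(raw), "rt", encoding="utf-8") as fh:
    reviews = list(iter_reviews(fh))

mean_rating = sum(r["overall"] for r in reviews) / len(reviews)
print(len(reviews), mean_rating)
```

    Streaming line by line keeps memory use flat, which matters for a collection with hundreds of millions of reviews.
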
  2. Amazon Product Reviews

    • kaggle.com
    Updated Nov 26, 2023
    Cite
    The Devastator (2023). Amazon Product Reviews [Dataset]. https://www.kaggle.com/datasets/thedevastator/amazon-product-reviews/discussion
    Dataset updated
    Nov 26, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    The Devastator
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Amazon Product Reviews

    18 Years of Customer Ratings and Experiences

    By Huggingface Hub [source]

    About this dataset

    The Amazon Reviews Polarity Dataset discloses eighteen years of customers' ratings and reviews from Amazon.com, offering an unparalleled trove of insight and knowledge. Drawing from the immense pool of over 35 million customer reviews, this dataset presents a broad spectrum of customer opinions on products they have bought or used. This invaluable data is a gold mine for improving products and services as it contains comprehensive information regarding customers' experiences with a product including ratings, titles, and plaintext content. At the same time, this dataset contains both customer-specific data along with product information which encourages deep analytics that could lead to great advances in providing tailored solutions for customers. Has your product been favored by the majority? Are there any aspects that need extra care? Use Amazon Reviews Polarity to gain deeper insights into what your customers want - explore now!

    How to use the dataset

    • Analyze customer ratings to identify trends: Take a look at how many customers have rated the same product or service with the same score (e.g., 4 stars). You can use this information to identify what customers like or don’t like about it by examining common sentiment throughout the reviews. Identifying these patterns can help you make decisions on which features of your products or services to emphasize in order to boost sales and satisfaction rates.

    • Review content analysis: Analyzing review content is one of the best ways to gauge customer sentiment toward specific features or aspects of a product/service. Using natural language processing tools such as Word2Vec, Latent Dirichlet Allocation (LDA), or even simple keyword search algorithms can quickly reveal general topics that are discussed in relation to your product/service across multiple reviews, allowing you to quickly pinpoint areas that may need improvement for particular items within your lines of business.

    • Track associated scores over time: By tracking customer ratings over time, you may be able to better understand when there has been an issue with something specific related to your product/service, such as a negative response toward a feature that was introduced but proved unpopular among customers and was removed shortly afterward. This can save time and money by identifying issues before they become widespread concerns among larger sets of consumers who spend money on your company's items.

    • Visualize sentiment data over time: Visualizations such as bar graphs can help identify trends across different categories more quickly than raw numbers alone. Combining numeric values with color differences between scores makes anomalies easier to spot, which speeds up working out why certain spikes occurred where others stayed stable (or vice versa) when comparing similar data points through time-series visualizations.
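
    The "simple keyword search" idea mentioned above can be sketched in a few lines. This is only a minimal illustration: the function name keyword_counts, the keyword list, and the sample reviews are all invented for the example.

```python
import re
from collections import Counter


def keyword_counts(reviews, keywords):
    """Count how many reviews mention each keyword (case-insensitive, whole word)."""
    patterns = {
        kw: re.compile(r"\b" + re.escape(kw) + r"\b", re.IGNORECASE)
        for kw in keywords
    }
    counts = Counter()
    for text in reviews:
        for kw, pattern in patterns.items():
            if pattern.search(text):
                counts[kw] += 1
    return counts


# Invented review texts, for illustration only.
sample_reviews = [
    "Battery life is great, but the screen scratches easily.",
    "The battery died after a week.",
    "Lovely screen and fast shipping.",
]

counts = keyword_counts(sample_reviews, ["battery", "screen", "shipping"])
print(counts.most_common())
```

    Counting reviews rather than raw occurrences keeps one long, repetitive review from dominating the totals.
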

    Research Ideas

    • Developing a customer sentiment analysis system that can be used to quickly analyze the sentiment of reviews and identify any potential areas of improvement.
    • Building a product recommendation service that takes into account the ratings and reviews of customers when recommending similar products they may be interested in purchasing.
    • Training a machine learning model to accurately predict customers’ ratings on new products they have not yet tried and leverage this for further product development optimization initiatives.

    Acknowledgements

    If you use this dataset in your research, please credit the original authors.

    License

    License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.

    Columns

    File: train.csv

    | Column name | Description |
    |:------------|:------------|
    | label | The sentiment of the review, either positive or negative. (String) |
    | title | The title of the review. (String) |
    ...
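
    A minimal sketch of checking the class balance of a file shaped like train.csv follows. It uses only the two columns visible in the listing (label and title); the sample rows and the literal label values "positive" and "negative" are assumptions for illustration, since the actual encoding is not shown here.

```python
import csv
import io
from collections import Counter

# Invented rows; only the two columns visible in the listing are used.
SAMPLE_CSV = """label,title
positive,Works as described
negative,Stopped working after a month
positive,"Great value, would buy again"
"""


def label_counts(fileobj):
    """Count rows per value of the 'label' column."""
    reader = csv.DictReader(fileobj)
    return Counter(row["label"] for row in reader)


counts = label_counts(io.StringIO(SAMPLE_CSV))
print(counts)
```

    For a real file, the same function can be called on open("train.csv", newline="", encoding="utf-8").
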

  3. Amazon Seller Directory 2025 | Amazon Seller Database USA, FR, Germany, ESP,...

    • datarade.ai
    .csv, .xls
    Updated Feb 21, 2022
    Cite
    Lead for Business (2022). Amazon Seller Directory 2025 | Amazon Seller Database USA, FR, Germany, ESP, UK, Italy, CA | List of Amazon Sellers | 200K+ Amazon Seller Leads| [Dataset]. https://datarade.ai/data-products/amazon-seller-directory-amazon-fba-seller-database-with-sto-lead-for-business
    Explore at:
    .csv, .xls (available download formats)
    Dataset updated
    Feb 21, 2022
    Dataset authored and provided by
    Lead for Business
    Area covered
    United Kingdom, United States, Italy
    Description

    • 500K+ Active Amazon Stores
    • 200K+ Seller Leads
    • Platforms USA, Germany, UK, Italy, France, Spain, CA
    • C-Suite/Marketing/Sales Contacts
    • FBA/Non-FBA Sellers
    • 15+ data points available for each prospect
    • Filter your leads by store size, niche, location, and many more
    • 100% manually researched and verified.

    For over a decade, we have been manually collecting Amazon seller data from various data sources such as Amazon, LinkedIn, Google, and others. We specialize in collecting valid, promising data so that you can run ads and begin selling without hesitation.

    We designed our data packages for all types of organizations, thus they are reasonably priced. We are always trying to reduce our prices to better suit all of your requirements.

    So, if you’re looking to reach out to your targeted Amazon sellers, now is the greatest time to do so and offer your goods, services, and promotions. You can get your targeted Amazon Sellers List with seller contact information.

    Alternatively, if you provide Amazon Seller Names or IDs, we will conduct Custom Research and deliver the customized list to you.

    Data Points Available:

    • Full Name
    • LinkedIn URL
    • Direct Email
    • Generic Phone Number
    • Business Name and Address
    • Company Website
    • Seller IDs and URLs
    • Revenue
    • Seller Review Count
    • Niche
    • FBA/Non-FBA
    • Country
    • And more

  4. amazon_us_reviews

    • huggingface.co
    • tensorflow.org
    Updated Jun 30, 2023
    Cite
    Polina Kazakova (2023). amazon_us_reviews [Dataset]. https://huggingface.co/datasets/polinaeterna/amazon_us_reviews
    Explore at:
    Dataset updated
    Jun 30, 2023
    Authors
    Polina Kazakova
    License

    https://choosealicense.com/licenses/other/

    Description

    Amazon Customer Reviews (a.k.a. Product Reviews) is one of Amazon's iconic products. In a period of over two decades since the first review in 1995, millions of Amazon customers have contributed over a hundred million reviews to express opinions and describe their experiences regarding products on the Amazon.com website. This makes Amazon Customer Reviews a rich source of information for academic researchers in the fields of Natural Language Processing (NLP), Information Retrieval (IR), and Machine Learning (ML), amongst others. Accordingly, we are releasing this data to further research in multiple disciplines related to understanding customer product experiences. Specifically, this dataset was constructed to represent a sample of customer evaluations and opinions, variation in the perception of a product across geographical regions, and promotional intent or bias in reviews.

    Over 130 million customer reviews are available to researchers as part of this release. The data is available in TSV files in the amazon-reviews-pds S3 bucket in AWS US East Region. Each line in the data files corresponds to an individual review (tab delimited, with no quote and escape characters).

    Each dataset contains the following columns:

    • marketplace: 2 letter country code of the marketplace where the review was written.
    • customer_id: Random identifier that can be used to aggregate reviews written by a single author.
    • review_id: The unique ID of the review.
    • product_id: The unique Product ID the review pertains to. In the multilingual dataset the reviews for the same product in different countries can be grouped by the same product_id.
    • product_parent: Random identifier that can be used to aggregate reviews for the same product.
    • product_title: Title of the product.
    • product_category: Broad product category that can be used to group reviews (also used to group the dataset into coherent parts).
    • star_rating: The 1-5 star rating of the review.
    • helpful_votes: Number of helpful votes.
    • total_votes: Number of total votes the review received.
    • vine: Review was written as part of the Vine program.
    • verified_purchase: The review is on a verified purchase.
    • review_headline: The title of the review.
    • review_body: The review text.
    • review_date: The date the review was written.
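
    Because the files are described as tab delimited with no quote or escape characters, a parser should be told not to treat quotation marks specially. The sketch below does this with Python's csv module and the column names listed above. The sample rows are invented, and the presence of a header row is an assumption of this example.

```python
import csv
import io

COLUMNS = [
    "marketplace", "customer_id", "review_id", "product_id", "product_parent",
    "product_title", "product_category", "star_rating", "helpful_votes",
    "total_votes", "vine", "verified_purchase", "review_headline",
    "review_body", "review_date",
]

# Invented rows; the product title deliberately starts with a quote character.
ROWS = [
    ["US", "1001", "R1", "B0001", "501", '"Pro" Skillet, 12 inch', "Kitchen",
     "5", "3", "4", "N", "Y", "Great pan", "Heats evenly.", "2015-08-31"],
    ["US", "1002", "R2", "B0001", "501", '"Pro" Skillet, 12 inch', "Kitchen",
     "2", "0", "0", "N", "N", "Loose handle", "Handle got loose.", "2015-08-30"],
]
SAMPLE_TSV = "\n".join("\t".join(r) for r in [COLUMNS] + ROWS) + "\n"


def read_reviews(fileobj):
    """Parse tab-delimited reviews, keeping quote characters as literal text."""
    reader = csv.DictReader(fileobj, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        for key in ("star_rating", "helpful_votes", "total_votes"):
            row[key] = int(row[key])
        yield row


def helpful_share(review):
    """Fraction of votes marked helpful, or None when there are no votes."""
    if review["total_votes"] == 0:
        return None
    return review["helpful_votes"] / review["total_votes"]


reviews = list(read_reviews(io.StringIO(SAMPLE_TSV)))
print(reviews[0]["product_title"], helpful_share(reviews[0]))
```

    With csv.QUOTE_NONE, a field that begins with a quotation mark is returned unchanged instead of being interpreted as a quoted field.
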
  5. Amazon-Fraud Dataset

    • paperswithcode.com
    Updated Dec 23, 2024
    + more versions
    Cite
    Yingtong Dou; Zhiwei Liu; Li Sun; Yutong Deng; Hao Peng; Philip S. Yu (2024). Amazon-Fraud Dataset [Dataset]. https://paperswithcode.com/dataset/amazon-fraud
    Explore at:
    Dataset updated
    Dec 23, 2024
    Authors
    Yingtong Dou; Zhiwei Liu; Li Sun; Yutong Deng; Hao Peng; Philip S. Yu
    Description

    Amazon-Fraud is a multi-relational graph dataset built upon the Amazon review dataset, which can be used in evaluating graph-based node classification, fraud detection, and anomaly detection models.

    Dataset Statistics

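    | # Nodes | % Fraud Nodes (Class=1) |
    |:--------|:------------------------|
    | 11,944 | 9.5 |

    | Relation | # Edges |
    |:---------|:--------|
    | U-P-U | |
    | U-S-U | |
    | U-V-U | 1,036,737 |
    | All | |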

    Graph Construction

    The Amazon dataset includes product reviews under the Musical Instruments category. Similar to this paper, we label users with more than 80% helpful votes as benign entities and users with less than 20% helpful votes as fraudulent entities. We conduct a fraudulent user detection task on the Amazon-Fraud dataset, which is a binary classification task. We take 25 handcrafted features from this paper as the raw node features for Amazon-Fraud. We take users as nodes in the graph and design three relations: 1) U-P-U: it connects users reviewing at least one same product; 2) U-S-U: it connects users having at least one same star rating within one week; 3) U-V-U: it connects users with top 5% mutual review text similarities (measured by TF-IDF) among all users.
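
    The labeling rule and the U-P-U relation described above can be sketched as follows. This is an illustration, not the authors' code: the function names, the treatment of users with no votes as unlabeled, and the sample data are assumptions. Class 1 denotes fraud, following the statistics table.

```python
from collections import defaultdict
from itertools import combinations


def label_user(helpful_votes, total_votes):
    """Return 0 (benign), 1 (fraudulent), or None (unlabeled).

    Users with more than 80% helpful votes are benign and users with less
    than 20% are fraudulent. Leaving users without votes unlabeled is an
    assumption of this sketch.
    """
    if total_votes == 0:
        return None
    share = helpful_votes / total_votes
    if share > 0.8:
        return 0
    if share < 0.2:
        return 1
    return None


def upu_edges(reviews):
    """Build U-P-U edges: connect users who reviewed at least one same product."""
    users_by_product = defaultdict(set)
    for user, product in reviews:
        users_by_product[product].add(user)
    edges = set()
    for users in users_by_product.values():
        for u, v in combinations(sorted(users), 2):
            edges.add((u, v))
    return edges


# Invented (user, product) pairs, for illustration only.
sample = [("u1", "p1"), ("u2", "p1"), ("u3", "p2"), ("u1", "p2")]
edges = upu_edges(sample)
print(sorted(edges), label_user(9, 10), label_user(1, 10))
```

    Storing each edge as a sorted pair in a set avoids duplicates when two users share more than one product.
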

    To download the dataset, please visit this GitHub repo. For any other questions, please email ytongdou(AT)gmail.com.

  6. Global net revenue of Amazon 2014-2024, by product group

    • statista.com
    • ai-chatbox.pro
    Updated Feb 24, 2025
    Cite
    Statista (2025). Global net revenue of Amazon 2014-2024, by product group [Dataset]. https://www.statista.com/statistics/672747/amazons-consolidated-net-revenue-by-segment/
    Explore at:
    Dataset updated
    Feb 24, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    Worldwide
    Description

    In 2024, Amazon's net revenue from its subscription services segment amounted to 44.37 billion U.S. dollars. Subscription services include Amazon Prime, for which Amazon reported 200 million paying members worldwide at the end of 2020. The AWS category generated 107.56 billion U.S. dollars in annual sales. During the most recently reported fiscal year, the company’s net revenue amounted to 638 billion U.S. dollars.

    Amazon revenue segments

    Amazon is one of the biggest online companies worldwide. In 2019, the company’s revenue increased by 21 percent, compared to Google’s revenue growth during the same fiscal period, which was just 18 percent. The majority of Amazon’s net sales are generated through its North American business segment, which accounted for 236.3 billion U.S. dollars in 2020. The United States is the company’s leading market, followed by Germany and the United Kingdom.

    Business segment: Amazon Web Services

    Amazon Web Services, commonly referred to as AWS, is one of the strongest-growing business segments of Amazon. AWS is a cloud computing service that provides individuals, companies and governments with a wide range of computing, networking, storage, database, analytics and application services, among many others. As of the third quarter of 2020, AWS accounted for approximately 32 percent of the global cloud infrastructure services vendor market.

  7. State Agency Amazon Spend Fiscal Year 22

    • data.wa.gov
    • catalog.data.gov
    application/rdfxml +5
    Updated Aug 1, 2022
    + more versions
    Cite
    (2022). State Agency Amazon Spend Fiscal Year 22 [Dataset]. https://data.wa.gov/Procurements-and-Contracts/State-Agency-Amazon-Spend-Fiscal-Year-22/agvw-ch2s
    Explore at:
    csv, xml, json, tsv, application/rdfxml, application/rssxml (available download formats)
    Dataset updated
    Aug 1, 2022
    Description

    DES is publishing the Amazon spend for state agencies collected through the Washington State Amazon Business account. The data set only includes closed orders. Any orders that are still in process or have been cancelled are not included. This data is for Fiscal Year 22 (July 1, 2021 to June 30, 2022). Data is updated monthly.

  8. Johns Hopkins Multi-Domain Sentiment Dataset ∑∞

    • kaggle.com
    Updated Jan 14, 2020
    Cite
    Jérøme E. Blanch∑xt (2020). Johns Hopkins Multi-Domain Sentiment Dataset ∑∞ [Dataset]. https://www.kaggle.com/jeromeblanchet/multidomain-sentiment-analysis-dataset/activity
    Dataset updated
    Jan 14, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Jérøme E. Blanch∑xt
    Description

    Multidomain sentiment analysis dataset

    Amazon reviews from Johns Hopkins University’s Department of Computer Science

    Source: https://www.cs.jhu.edu/~mdredze/datasets/sentiment/

    Kaggle kernels take care of the tar.gz files for you :-)

    This dataset features slightly older product reviews from Amazon and derives from the Johns Hopkins University’s Department of Computer Science.

    Datasets included

    • unprocessed.tar.gz
    • processed_acl.tar.gz
    • processed_stars.tar.gz

    This sentiment dataset has been used in several papers:

    John Blitzer, Mark Dredze, Fernando Pereira. Biographies, Bollywood, Boom-boxes and Blenders: Domain Adaptation for Sentiment Classification. Association for Computational Linguistics (ACL), 2007.

    John Blitzer, Koby Crammer, Alex Kulesza, Fernando Pereira, and Jenn Wortman. Learning Bounds for Domain Adaptation. Neural Information Processing Systems (NIPS), 2008.

    Mark Dredze, Koby Crammer, and Fernando Pereira. Confidence-Weighted Linear Classification. International Conference on Machine Learning (ICML), 2008.

    Yishay Mansour, Mehryar Mohri, and Afshin Rostamizadeh. Domain Adaptation with Multiple Sources. Neural Information Processing Systems (NIPS), 2009.

    If you use this data for your research or a publication, please cite the first (ACL 2007) paper as the reference for the data. Also, please drop me a line so I know that you found the data useful.

    The Multi-Domain Sentiment Dataset contains product reviews taken from Amazon.com from many product types (domains). Some domains (books and dvds) have hundreds of thousands of reviews. Others (musical instruments) have only a few hundred. Reviews contain star ratings (1 to 5 stars) that can be converted into binary labels if needed. This page contains some descriptions about the data. If you have questions, please email Mark Dredze or John Blitzer.
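
    The conversion from star ratings to binary labels mentioned above might look like the sketch below. The cut-offs (more than 3 stars positive, fewer than 3 negative, exactly 3 discarded as ambiguous) are a common convention and an assumption of this example; the description does not state which thresholds to use.

```python
def star_to_binary(stars):
    """Map a 1-5 star rating to 'positive', 'negative', or None (ambiguous).

    Thresholds are an assumption of this sketch, not taken from the listing.
    """
    if not 1 <= stars <= 5:
        raise ValueError(f"star rating out of range: {stars}")
    if stars > 3:
        return "positive"
    if stars < 3:
        return "negative"
    return None


ratings = [5, 4, 3, 2, 1]
labels = [star_to_binary(s) for s in ratings]
print(labels)
```

    Dropping the middle rating keeps the two classes cleanly separated, at the cost of discarding some reviews.
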

    A few notes regarding the data sets.

    1) unprocessed.tar.gz contains the original data.
    2) processed.acl.tar.gz contains the data pre-processed and balanced, that is, in the format of Blitzer et al. (ACL 2007).
    3) processed.realvalued.tar.gz contains the data pre-processed and balanced, but with the number of stars rather than just positive or negative, that is, in the format of Mansour et al. (NIPS 2009).

  9. Product Exchange/Bartering Data

    • cseweb.ucsd.edu
    json
    + more versions
    Cite
    UCSD CSE Research Project, Product Exchange/Bartering Data [Dataset]. https://cseweb.ucsd.edu/~jmcauley/datasets.html
    Explore at:
    json (available download formats)
    Dataset authored and provided by
    UCSD CSE Research Project
    Description

    These datasets contain peer-to-peer trades from various recommendation platforms.

    Metadata includes

    • peer-to-peer trades

    • have and want lists

    • image data (tradesy)

  10. Pinterest Fashion Compatibility

    • cseweb.ucsd.edu
    • beta.data.urbandatacentre.ca
    json
    + more versions
    Cite
    UCSD CSE Research Project, Pinterest Fashion Compatibility [Dataset]. https://cseweb.ucsd.edu/~jmcauley/datasets.html
    Explore at:
    json (available download formats)
    Dataset authored and provided by
    UCSD CSE Research Project
    Description

    This dataset contains images (scenes) containing fashion products, which are labeled with bounding boxes and links to the corresponding products.

    Metadata includes

    • product IDs

    • bounding boxes

    Basic Statistics:

    • Scenes: 47,739

    • Products: 38,111

    • Scene-Product Pairs: 93,274

  11. Amazon Bin Image Dataset File List

    • kaggle.com
    Updated Apr 23, 2022
    + more versions
    Cite
    William Hyun (2022). Amazon Bin Image Dataset File List [Dataset]. https://www.kaggle.com/datasets/williamhyun/amazon-bin-image-dataset-file-list
    Dataset updated
    Apr 23, 2022
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    William Hyun
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    Amazon Bin Image Dataset

    The Amazon Bin Image Dataset contains 536,434 images and metadata from bins of a pod in an operating Amazon Fulfillment Center. The bin images in this dataset are captured as robot units carry pods as part of normal Amazon Fulfillment Center operations. This dataset has many images and the corresponding metadata.

    The image files fall into three groups according to their naming scheme.

    • A file name with 1~4 digits (1,200): 1.jpg ~ 1200.jpg
    • A file name with 5 digits (99,999): 00001.jpg ~ 99999.jpg
    • A file name with 6 digits (435,235): 100000.jpg ~ 535234.jpg

    The Amazon Bin Image Dataset File List dataset provides a CSV file containing all file locations and quantities to help with analysis and distributed learning.
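
    The three naming groups listed above can be enumerated to check that their sizes add up to the stated total of 536,434 images. The sketch assumes each group is a contiguous range of integers, which the listing implies but does not state outright.

```python
def bin_image_filenames():
    """Yield every image file name implied by the three naming groups."""
    # Group 1: unpadded names 1.jpg through 1200.jpg (1,200 files).
    for i in range(1, 1201):
        yield f"{i}.jpg"
    # Group 2: zero-padded five-digit names 00001.jpg through 99999.jpg.
    for i in range(1, 100000):
        yield f"{i:05d}.jpg"
    # Group 3: six-digit names 100000.jpg through 535234.jpg.
    for i in range(100000, 535235):
        yield f"{i}.jpg"


names = list(bin_image_filenames())
print(len(names), names[0], names[1200], names[-1])
```

    Note that 1.jpg and 00001.jpg are distinct names under this scheme, so the file name rather than the numeric value identifies an image.
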

  12. Amazon Bin Image Dataset

    • kaggle.com
    Updated Jan 30, 2021
    Cite
    Dhruvil Dave (2021). Amazon Bin Image Dataset [Dataset]. http://doi.org/10.34740/kaggle/dsv/1887853
    Dataset updated
    Jan 30, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Dhruvil Dave
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    The Amazon Bin Image Dataset contains 50,000 images and metadata from bins of a pod in an operating Amazon Fulfillment Center. The bin images in this dataset are captured as robot units carry pods as part of normal Amazon Fulfillment Center operations. This dataset can be used for research in a variety of areas like computer vision, counting generic items and learning from weakly-tagged data.

    For each image, there is a corresponding metadata entry in JSON format stored in metadata.sqlite. For example, for image 01290.jpg there is a corresponding JSON object in the data field of the metadata file, which can be retrieved with the query SELECT data FROM metadata WHERE img_id = 01290;
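
    The lookup described above can be sketched with Python's sqlite3 module. To stay runnable without the real file, the example builds a tiny in-memory table. The schema (img_id stored as text), the JSON key "quantity", and the sample row are all assumptions for illustration; the real metadata.sqlite may differ.

```python
import json
import sqlite3

# In-memory stand-in for metadata.sqlite; schema and row are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE metadata (img_id TEXT PRIMARY KEY, data TEXT)")
conn.execute(
    "INSERT INTO metadata (img_id, data) VALUES (?, ?)",
    ("01290", json.dumps({"quantity": 3})),
)
conn.commit()


def load_metadata(connection, img_id):
    """Return the decoded JSON metadata for one image, or None if absent."""
    row = connection.execute(
        "SELECT data FROM metadata WHERE img_id = ?", (img_id,)
    ).fetchone()
    return json.loads(row[0]) if row else None


meta = load_metadata(conn, "01290")
missing = load_metadata(conn, "99999")
print(meta, missing)
```

    Binding the image ID as a parameter keeps a leading zero intact when IDs are stored as text, and avoids building SQL by string concatenation.
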

    Refer to the Starter Notebook to see how to work with the dataset.

    Amazon uses a random storage scheme where items are placed into accessible bins with available space, so the contents of each bin are random, rather than organized by specific product types. Thus, each bin image may show only one type of product or a diverse range of products. Occasionally, items are misplaced while being handled, so the contents of some bin images may not match the recorded inventory of that bin.

    These are some typical images in the dataset. A bin contains multiple object categories and a varying number of instances. Corresponding metadata exists for each bin image and includes the object category identification (ASIN, Amazon Standard Identification Number), quantity, and dimensions of the objects. Bin sizes vary depending on the size of the objects in them. The tape in front of the bins prevents items from falling out, and it sometimes obscures the objects. Objects are sometimes heavily occluded by other objects or by the limited viewpoint of the images.

    Image Credits: Unsplash - helloimnik

  13. 70,000 Active buyer Email list ( from Amazon & ebay ) for Market

    • dataandsons.com
    csv, zip
    Updated Dec 12, 2020
    + more versions
    Cite
    boobxff.blogspot.com (2020). 70,000 Active buyer Email list ( from Amazon & ebay ) for Market [Dataset]. https://www.dataandsons.com/categories/product-lists/70-000-active-buyer-email-list-from-amazon-and-ebay-for-market
    Explore at:
    zip, csv (available download formats)
    Dataset updated
    Dec 12, 2020
    Dataset provided by
    Authors
    boobxff.blogspot.com
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    About this Dataset

    You will get an active email list of real, active buyers who make regular purchases through Amazon and other e-commerce sites. This email list contains 100% original email addresses. You can also use these emails to increase visits to your website, blog, or YouTube channel. I offer you now a great treasure to use whenever you want.

    So don't waste your time and start boosting your ecommerce business online.

    The buyers will be from:

    • United States of America
    • Canada
    • European Union

    • There are no duplicate emails
    • No fake IDs
    • Audiences ready to buy

    Category

    Product Lists

    Keywords

    email marketing,emails,Email List,buyer

    Row Count

    70150

    Price

    $90.00

  14. Amazon revenue 2004-2024

    • statista.com
    Updated Jun 25, 2025
    Cite
    Statista (2025). Amazon revenue 2004-2024 [Dataset]. https://www.statista.com/statistics/266282/annual-net-revenue-of-amazoncom/
    Explore at:
    Dataset updated
    Jun 25, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    Worldwide, United States
    Description

    From 2004 to 2024, the net revenue of Amazon e-commerce and service sales has increased tremendously. In the fiscal year ending December 31, the multinational e-commerce company's net revenue was almost *** billion U.S. dollars, up from *** billion U.S. dollars in 2023. Amazon.com, a U.S. e-commerce company originally founded in 1994, is the world’s largest online retailer of books, clothing, electronics, music, and many more goods. As of 2024, the company generates the majority of its net revenues through online retail product sales, followed by third-party retail seller services, cloud computing services, and retail subscription services including Amazon Prime.

    From seller to digital environment

    Through Amazon, consumers are able to purchase goods at a rather discounted price from both small and large companies as well as from other users. Both new and used goods are sold on the website. Due to the wide variety of goods available at prices which often undercut local brick-and-mortar retail offerings, Amazon has dominated the retailer market. As of 2024, Amazon’s brand worth amounts to over *** billion U.S. dollars, topping the likes of companies such as Walmart, Ikea, as well as digital competitors Alibaba and eBay. One of Amazon's first forays into the world of hardware was its e-reader Kindle, one of the most popular e-book readers worldwide. More recently, Amazon has also released several series of own-branded products and a voice-controlled virtual assistant, Alexa.

    Headquartered in North America

    Due to its location, Amazon offers more services in North America than worldwide. As a result, the majority of the company’s net revenue in 2023 was actually earned in the United States, Canada, and Mexico. In 2023, approximately *** billion U.S. dollars was earned in North America compared to only roughly *** billion U.S. dollars internationally.

  15. 2021 Amazon Last Mile Routing Research Challenge Dataset

    • registry.opendata.aws
    Updated Sep 16, 2022
    Cite
    Amazon (2022). 2021 Amazon Last Mile Routing Research Challenge Dataset [Dataset]. https://registry.opendata.aws/amazon-last-mile-challenges/
    Explore at:
    Dataset updated
    Sep 16, 2022
    Dataset provided by
    Amazon.com (http://amazon.com/)
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    The 2021 Amazon Last Mile Routing Research Challenge was an innovative research initiative led by Amazon.com and supported by the Massachusetts Institute of Technology’s Center for Transportation and Logistics. Over a period of 4 months, participants were challenged to develop innovative machine learning-based methods to enhance classic optimization-based approaches to solve the travelling salesperson problem, by learning from historical routes executed by Amazon delivery drivers. The primary goal of the Amazon Last Mile Routing Research Challenge was to foster innovative applied research in route planning, building on recent advances in predictive modeling, and using a real-world problem and data. The dataset released for the research challenge includes route-, stop-, and package-level features for 9,184 historical routes performed by Amazon drivers in 2018 in five metropolitan areas in the United States. This real-world dataset excludes any personally identifiable information (all route and package identifiers have been randomly regenerated and related location data have been obfuscated to ensure anonymity). Although multiple synthetic benchmark datasets are available in the literature, the dataset of the 2021 Amazon Last Mile Routing Research Challenge is the first large and publicly available dataset to include instances based on real-world operational routing data. The dataset is fully described and formally introduced in the following Transportation Science article: https://pubsonline.informs.org/doi/10.1287/trsc.2022.1173

  16. Amazon-Reviews-2023

    • huggingface.co
    Updated Sep 15, 2023
    Cite
    McAuley-Lab (2023). Amazon-Reviews-2023 [Dataset]. https://huggingface.co/datasets/McAuley-Lab/Amazon-Reviews-2023
    Explore at:
    Dataset updated
    Sep 15, 2023
    Dataset authored and provided by
    McAuley-Lab
    Description

    Amazon Review 2023 is an updated version of the Amazon Review 2018 dataset. This dataset mainly includes reviews (ratings, text) and item metadata (descriptions, category information, price, brand, and images). Compared to the previous versions, the 2023 version features larger size, newer reviews (up to Sep 2023), richer and cleaner metadata, and finer-grained timestamps (from day to millisecond).

  17. Amazon Web Services: MOD13Q1

    • catalog.data.gov
    • gimi9.com
    • +4more
    Updated Apr 10, 2025
    AWS NEX (2025). Amazon Web Services: MOD13Q1 [Dataset]. https://catalog.data.gov/dataset/amazon-web-services-mod13q1
    Dataset updated: Apr 10, 2025
    Dataset provided by: Amazon Web Services (http://aws.amazon.com/)
    Description

    Global MODIS vegetation indices are designed to provide consistent spatial and temporal comparisons of vegetation conditions. Blue, red, and near-infrared reflectances, centered at 469 nanometers, 645 nanometers, and 858 nanometers, respectively, are used to determine the MODIS daily vegetation indices. The MODIS Normalized Difference Vegetation Index (NDVI) complements NOAA's Advanced Very High Resolution Radiometer (AVHRR) NDVI products and provides continuity for time series historical applications. MODIS also includes a new Enhanced Vegetation Index (EVI) that minimizes canopy background variations and maintains sensitivity over dense vegetation conditions. The EVI also uses the blue band to remove residual atmospheric contamination caused by smoke and sub-pixel thin clouds. The MODIS NDVI and EVI products are computed from atmospherically corrected bi-directional surface reflectances that have been masked for water, clouds, heavy aerosols, and cloud shadows. Global MOD13Q1 data are provided every 16 days at 250-meter spatial resolution as a gridded level-3 product in the Sinusoidal projection. Lacking a 250m blue band, the EVI algorithm uses the 500m blue band to correct for residual atmospheric effects, with negligible spatial artifacts. Vegetation indices are used for global monitoring of vegetation conditions and are used in products displaying land cover and land cover changes. These data may be used as input for modeling global biogeochemical and hydrologic processes and global and regional climate. These data also may be used for characterizing land surface biophysical properties and processes, including primary production and land cover conversion.
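
    As a rough illustration of how the two indices relate to the band reflectances, the sketch below uses the commonly published formulas: NDVI = (NIR - Red) / (NIR + Red) and EVI = G * (NIR - Red) / (NIR + C1 * Red - C2 * Blue + L). The coefficients G = 2.5, C1 = 6, C2 = 7.5, L = 1 are the standard MODIS EVI values; they are not stated in the description above, and the function names and sample reflectances are hypothetical.

```python
import math


def ndvi(nir: float, red: float) -> float:
    """Normalized Difference Vegetation Index from surface reflectances (0..1)."""
    denom = nir + red
    if denom == 0:
        return math.nan  # undefined when both bands are zero
    return (nir - red) / denom


def evi(nir: float, red: float, blue: float,
        gain: float = 2.5, c1: float = 6.0, c2: float = 7.5,
        soil: float = 1.0) -> float:
    """Enhanced Vegetation Index; the blue band corrects residual aerosol effects."""
    denom = nir + c1 * red - c2 * blue + soil
    if denom == 0:
        return math.nan
    return gain * (nir - red) / denom


# Hypothetical reflectances for a densely vegetated pixel.
print(round(ndvi(nir=0.5, red=0.1), 4))
print(round(evi(nir=0.5, red=0.1, blue=0.05), 4))
```

    The MOD13Q1 files themselves store scaled integers, so a real workflow would first apply the product's scale factor before using functions like these.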

  18. Multi-Domain Sentiment Dataset v2.0

    • opendatalab.com
    zip
    Updated Aug 28, 2022
    + more versions
    University of Pennsylvania (2022). Multi-Domain Sentiment Dataset v2.0 [Dataset]. https://opendatalab.com/OpenDataLab/Multi-Domain_Sentiment_Dataset_etc
    Available download formats: zip (5222228453 bytes)
    Dataset updated: Aug 28, 2022
    Dataset provided by: University of Pennsylvania
    Description

    The Multi-Domain Sentiment Dataset contains product reviews taken from Amazon.com from many product types (domains). Some domains (books and DVDs) have hundreds of thousands of reviews. Others (musical instruments) have only a few hundred. Reviews contain star ratings (1 to 5 stars) that can be converted into binary labels if needed. This page contains some descriptions of the data. If you have questions, please email Mark Dredze or John Blitzer.
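
    One way the star-to-binary conversion might look is sketched below. The thresholds are an assumption, not something the description specifies: ratings above 3 are treated as positive, ratings below 3 as negative, and 3-star reviews are dropped as ambiguous. The function names are hypothetical.

```python
from typing import List, Optional


def star_to_binary(stars: float) -> Optional[int]:
    """Map a 1-5 star rating to 1 (positive), 0 (negative), or None (ambiguous)."""
    if not 1 <= stars <= 5:
        raise ValueError(f"star rating out of range: {stars}")
    if stars > 3:
        return 1
    if stars < 3:
        return 0
    return None  # assumed convention: neutral 3-star reviews are discarded


def binarize(ratings: List[float]) -> List[int]:
    """Convert a list of ratings, skipping the ambiguous ones."""
    labels = (star_to_binary(r) for r in ratings)
    return [label for label in labels if label is not None]


print(binarize([5, 4, 3, 2, 1]))
```

    Other cut-offs (for example, treating 3 stars as negative) are equally possible; the right choice depends on the task.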

  19. OA-Mine - annotations Dataset

    • paperswithcode.com
    Updated Jun 30, 2024
    Xinyang Zhang; Chenwei Zhang; Xian Li; Xin Luna Dong; Jingbo Shang; Christos Faloutsos; Jiawei Han (2024). OA-Mine - annotations Dataset [Dataset]. https://paperswithcode.com/dataset/oa-mine-annotations
    Dataset updated: Jun 30, 2024
    Authors: Xinyang Zhang; Chenwei Zhang; Xian Li; Xin Luna Dong; Jingbo Shang; Christos Faloutsos; Jiawei Han
    Description

    The dataset contains Amazon products from 10 product categories with full human annotations. The dataset was collected in 2021. The products may have been taken down from Amazon since the collection of the dataset.

  20. Amazon-Google, Augmented Version, Fixed Splits

    • linkagelibrary.icpsr.umich.edu
    Updated Nov 23, 2020
    Anna Primpeli; Christian Bizer (2020). Amazon-Google, Augmented Version, Fixed Splits [Dataset]. http://doi.org/10.3886/E127241V1
    Dataset updated: Nov 23, 2020
    Dataset provided by: University of Mannheim (Germany)
    Authors: Anna Primpeli; Christian Bizer
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Motivation: Entity Matching is the task of determining which records from different data sources describe the same real-world entity. It is an important task for data integration and has been the focus of many research works. A large number of entity matching/record linkage tasks have been made available for evaluating entity matching methods. However, the lack of fixed development and test splits, as well as of correspondence sets that include both matching and non-matching record pairs, hinders the reproducibility and comparability of benchmark experiments. In an effort to enhance the reproducibility and comparability of the experiments, we complement existing entity matching benchmark tasks with fixed sets of non-matching pairs as well as fixed development and test splits.

    Dataset Description: An augmented version of the amazon-google products dataset for benchmarking entity matching/record linkage methods found at: https://dbs.uni-leipzig.de/research/projects/object_matching/benchmark_datasets_for_entity_resolutio... The augmented version adds a fixed set of non-matching pairs to the original dataset. In addition, fixed splits for training, validation and testing as well as their corresponding feature vectors are provided. The feature vectors are built using data type specific similarity metrics. The dataset contains 1,363 records describing products deriving from amazon which are matched against 3,226 product records from google. The gold standards have manual annotations for 1,298 matching and 6,306 non-matching pairs. The total number of attributes used to describe the product records is 4, while the attribute density is 0.75. The augmented dataset enhances the reproducibility of matching methods and the comparability of matching results. The dataset is part of the CompERBench repository which provides 21 complete benchmark tasks for entity matching for public download: http://data.dws.informatik.uni-mannheim.de/benchmarkmatchingtasks/index.html
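
    To make the idea of data type specific similarity metrics concrete, here is a minimal sketch of a feature vector for one record pair. It is not the authors' implementation: the choice of token-level Jaccard similarity for strings, relative difference for numbers, the -1.0 marker for missing values, and the attribute names "title" and "price" are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Union

Value = Optional[Union[str, float]]


def jaccard(a: str, b: str) -> float:
    """Token-level Jaccard similarity between two strings."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def numeric_similarity(x: float, y: float) -> float:
    """1 minus the relative difference, clamped to the range [0, 1]."""
    if x == y:
        return 1.0
    return max(0.0, 1.0 - abs(x - y) / max(abs(x), abs(y)))


def feature_vector(rec_a: Dict[str, Value], rec_b: Dict[str, Value],
                   attributes: List[str]) -> List[float]:
    """One similarity score per attribute; -1.0 marks a missing value."""
    features = []
    for name in attributes:
        va, vb = rec_a.get(name), rec_b.get(name)
        if va is None or vb is None:
            features.append(-1.0)
        elif isinstance(va, str) and isinstance(vb, str):
            features.append(jaccard(va, vb))
        else:
            features.append(numeric_similarity(float(va), float(vb)))
    return features


# Hypothetical records, not taken from the dataset.
amazon_rec = {"title": "adobe photoshop cs3", "price": 100.0}
google_rec = {"title": "adobe photoshop elements", "price": 80.0}
print(feature_vector(amazon_rec, google_rec, ["title", "price"]))
```

    With 1,298 matching and 6,306 non-matching pairs, roughly 17% of the labeled pairs are matches, so a classifier trained on such vectors would need to account for class imbalance.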

UCSD CSE Research Project (2023). Amazon review data 2018 [Dataset]. https://mcauleylab.ucsd.edu:8443/public_datasets/data/amazon_v2/

Amazon review data 2018

Dataset updated: May 31, 2023
Dataset authored and provided by: UCSD CSE Research Project
Description

Context

This Dataset is an updated version of the Amazon review dataset released in 2014. As in the previous version, this dataset includes reviews (ratings, text, helpfulness votes), product metadata (descriptions, category information, price, brand, and image features), and links (also viewed/also bought graphs). In addition, this version provides the following features:

  • More reviews:

    • The total number of reviews is 233.1 million (142.8 million in 2014).
  • New reviews:

    • Current data includes reviews in the range May 1996 - Oct 2018.
  • Metadata:

    • We have added transaction metadata for each review shown on the review page.
    • Added more detailed metadata of the product landing page.

Acknowledgements

If you publish articles based on this dataset, please cite the following paper:

  • Jianmo Ni, Jiacheng Li, Julian McAuley. Justifying recommendations using distantly-labeled reviews and fine-grained aspects. EMNLP, 2019.