Amazon Review 2023 is an updated version of the Amazon Review 2018 dataset. This dataset mainly includes reviews (ratings, text) and item metadata (descriptions, category information, price, brand, and images). Compared to the previous versions, the 2023 version features larger size, newer reviews (up to Sep 2023), richer and cleaner metadata, and finer-grained timestamps (from day to millisecond).
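To illustrate what millisecond-granularity timestamps allow, here is a minimal sketch. It assumes timestamps are stored as integer milliseconds since the Unix epoch, which the description above does not confirm. The sample value and the function name `ms_to_datetime` are hypothetical.

```python
from datetime import datetime, timezone

def ms_to_datetime(ts_ms: int) -> datetime:
    """Convert integer milliseconds since the Unix epoch to an aware UTC datetime."""
    seconds, millis = divmod(ts_ms, 1000)
    whole_seconds = datetime.fromtimestamp(seconds, tz=timezone.utc)
    # Re-attach the millisecond part as microseconds to avoid float rounding.
    return whole_seconds.replace(microsecond=millis * 1000)

# Hypothetical review timestamp (assumed format: epoch milliseconds).
sample_ms = 1_694_000_000_123
review_time = ms_to_datetime(sample_ms)

# Day-level granularity, as in older dataset versions, is a simple truncation.
review_day = review_time.date()
```

Keeping the integer arithmetic separate from the float-based conversion preserves the exact millisecond value.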
Get the Amazon product review data you need directly from the data extractor. Collect Amazon review information from 19 Amazon countries, including the following domains:
- amazon.com
- amazon.com.au
- amazon.com.br
- amazon.ca
- amazon.cn
- amazon.fr
- amazon.de
- amazon.in
- amazon.it
- amazon.com.mx
- amazon.nl
- amazon.sg
- amazon.es
- amazon.com.tr
Request Ecommerce Product Review dataset by:
- keyword
- category
- seller
- product ID (ASIN)
Amazon E-commerce Reviews Data datasets gathered by keyword, seller, category, or ASIN contain:
- Product ID (can be extended to the full product information)
- Review content and rating
- Review metadata
Amazon extraction results can be delivered by schedule or API request, so the data can be extracted in real-time.
DATAANT uses the in-house web scraping service with no concurrency limitations, so unlimited data extractions can be performed simultaneously.
Output and attributes can be customized to fit your particular needs.
This dataset contains product reviews and metadata from Amazon, including 142.8 million reviews spanning May 1996 - July 2014.
This dataset includes reviews (ratings, text, helpfulness votes), product metadata (descriptions, category information, price, brand, and image features), and links (also viewed/also bought graphs).
Attribution 1.0 (CC BY 1.0) https://creativecommons.org/licenses/by/1.0/
License information was derived automatically
About
This is a mock dataset with Amazon product reviews. Classes are structured: 6 "level 1" classes, 64 "level 2" classes, and 510 "level 3" classes.
3 files are shared:
Level 1 classes are: health personal care, toys games, beauty, pet supplies, baby products, and grocery gourmet food.
Dataset originally from https://www.kaggle.com/datasets/kashnitsky/hierarchical-text-classification
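A three-level class hierarchy like the one described above can be represented as a nested mapping. This is a minimal sketch, not the dataset's actual schema. The level 1 names come from the description, while the level 2 and level 3 names and the `rows` layout are hypothetical.

```python
from collections import defaultdict

# Hypothetical (level 1, level 2, level 3) label triples, one per review.
rows = [
    ("beauty", "skin care", "face"),
    ("beauty", "skin care", "body"),
    ("beauty", "hair care", "shampoos"),
    ("pet supplies", "dogs", "toys"),
]

# level 1 -> level 2 -> set of level 3 classes
tree = defaultdict(lambda: defaultdict(set))
for level1, level2, level3 in rows:
    tree[level1][level2].add(level3)

# Count distinct classes at each level of the hierarchy.
n_level1 = len(tree)
n_level2 = sum(len(children) for children in tree.values())
n_level3 = sum(len(leaves) for children in tree.values() for leaves in children.values())
```

On the full dataset, the same counts would be expected to come out as 6, 64, and 510 if the description is accurate.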
These datasets contain reviews from the Steam video game platform, and information about which games were bundled together.
Metadata includes
reviews
purchases, plays, recommends (likes)
product bundles
pricing information
Basic Statistics:
Reviews: 7,793,069
Users: 2,567,538
Items: 15,474
Bundles: 615
Amazon Product Review Dataset (2023)
Dataset Overview
The Amazon Product Review Dataset (2023) contains product reviews from Amazon customers. The dataset includes product information, review details, and metadata about the customers who left the reviews. This dataset can be used for various natural language processing (NLP) tasks, including sentiment analysis, review prediction, recommendation systems, and more.
Dataset Name: Amazon Product Review Dataset (2023)… See the full description on the dataset page: https://huggingface.co/datasets/kevykibbz/Amazon_Customer_Review_2023.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
MuMu is a Multimodal Music dataset with multi-label genre annotations that combines information from the Amazon Reviews dataset and the Million Song Dataset (MSD). The former contains millions of album customer reviews and album metadata gathered from Amazon.com. The latter is a collection of metadata and precomputed audio features for a million songs.
To map the information from both datasets we use MusicBrainz. This process yields the final set of 147,295 songs, which belong to 31,471 albums. For the mapped set of albums, there are 447,583 customer reviews from the Amazon dataset. The dataset has been used for multi-label music genre classification experiments in the related publication. In addition to genre annotations, this dataset provides further information about each album, such as average rating, selling rank, similar products, and cover image URL. For every text review it also provides the review's helpfulness score, average rating, and summary.
The mapping between the three datasets (Amazon, MusicBrainz and MSD), genre annotations, metadata, data splits, text reviews and links to images are available here. Images and audio files cannot be released due to copyright issues.
MuMu dataset (mapping, metadata, annotations and text reviews)
Data splits and multimodal feature embeddings for ISMIR multi-label classification experiments
These data can be used together with the Tartarus deep learning library https://github.com/sergiooramas/tartarus.
NOTE: This version provides simplified files with metadata and splits.
Scientific References
Please cite the following papers if using MuMu dataset or Tartarus library.
Oramas, S., Barbieri, F., Nieto, O., & Serra, X. (2018). Multimodal Deep Learning for Music Genre Classification. Transactions of the International Society for Music Information Retrieval, V(1).
Oramas, S., Nieto, O., Barbieri, F., & Serra, X. (2017). Multi-label Music Genre Classification from audio, text and images using Deep Features. In Proceedings of the 18th International Society for Music Information Retrieval Conference (ISMIR 2017). https://arxiv.org/abs/1707.04916
This dataset was created by lievgarcia
This dataset was created by Prateek Pal
This dataset was created by Atul Krishnan
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The River Sediment Database-Amazon (RivSed-Amazon) database contains surface suspended sediment concentrations (SSC) derived from Landsat 5, 7, and 8 Level 1 Collection 1 surface reflectance from all rivers in the Amazon River Basin that are ~60 meters wide or greater. SSC represent spatially integrated "reach" median concentrations over the footprint of SWOT River Database (SWORD, Altenau et al., 2021) centerlines (median reach length = 10 km) where high quality river water pixels were detected within each Landsat image from 1984-2018.
The methods used to produce this database were initially developed in the following publications:
Gardner, J., Pavelsky, T. M., Topp, S., Yang, X., Ross, M. R., & Cohen, S. (2023). Human activities change suspended sediment concentration along rivers. Environmental Research Letters. https://iopscience.iop.org/article/10.1088/1748-9326/acd8d8 and
Gardner et al. (2020). The color of rivers. Geophysical Research Letters. https://agupubs.onlinelibrary.wiley.com/doi/full/10.1029/2020GL088946
The publication associated with RivSed-Amazon is in review.
Files:
1) Metadata (rivSed_Amazon_metadata_v1.01.pdf): Description of key data files associated with this repository.
2) RiverSed (RiverSed_Amazon_v1.1.txt): Table of SSC and associated data that is joinable to SWORD based on the "reach_id".
3) Shapefile of river centerlines over South America to which the reflectance data can be attached (SWORD_SA.shp).
4) Shapefile of the reach polygons associated with SWORD_SA over the Amazon Basin. (reach_polygons_amazon.shp).
5) SSC-Landsat matchup database with extended metadata on locations and in-situ data (train_full_v1.1.csv).
6) The final training data used to build the xgboost machine learning model (train_v1.1.csv).
7) The xgboost model that can make SSC predictions over inland waters in USA using Landsat bands/band combinations (tssAmazon_model_v1.1.rds and .rda). The model can only be loaded and used in R at this time.
8) The correction coefficients applied to Landsat 5 and 8 to harmonize surface reflectance across Landsat 5, 7, and 8 and over all bands to enable time series analysis.
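The file list above notes that the SSC table is joinable to SWORD on "reach_id". Here is a minimal sketch of such a join using only the standard library. Only the "reach_id" key comes from the description. The other column names, the comma delimiter, and all sample values are hypothetical stand-ins for the real files.

```python
import csv
import io
from collections import defaultdict

# Hypothetical excerpt of the SSC table (one row per reach per Landsat image).
ssc_table = io.StringIO(
    "reach_id,date,ssc_mg_l\n"
    "101,2001-07-01,85.0\n"
    "101,2001-07-17,110.0\n"
    "202,2001-07-03,40.0\n"
)

# Hypothetical excerpt of SWORD reach attributes.
sword_reaches = io.StringIO(
    "reach_id,reach_len_m\n"
    "101,10400\n"
    "202,9800\n"
    "303,10000\n"
)

# Group SSC observations by reach.
ssc_by_reach = defaultdict(list)
for row in csv.DictReader(ssc_table):
    ssc_by_reach[row["reach_id"]].append(float(row["ssc_mg_l"]))

# Left join: keep every reach, attaching whatever observations exist for it.
joined = []
for reach in csv.DictReader(sword_reaches):
    values = ssc_by_reach.get(reach["reach_id"], [])
    joined.append({**reach, "n_obs": len(values), "ssc_values": values})
```

A left join keeps reaches with no valid Landsat observations, which makes data gaps visible instead of silently dropping them.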
Open Government Licence - Canada 2.0 https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
History of annual reviews of the quality of water bodies in Montreal. The aquatic environment monitoring network (RSMA) takes surface water samples in order to draw up a state of the situation in the Montreal agglomeration. **This third party metadata element was translated using an automated translation tool (Amazon Translate).**
Amazon All Beauty Dataset
Directory Structure
metadata: Contains product information.
reviews: Contains user reviews about the products.
filtered:
e5-base-v2_embeddings.jsonl: Contains "asin" and "embeddings" created with e5-base-v2.
metadata.jsonl: Contains "asin" and "text", where text is created from the title, description, brand, main category, and category.
reviews.jsonl: Contains "reviewerID", "reviewTime", and "asin". Reviews are filtered to include… See the full description on the dataset page: https://huggingface.co/datasets/smartcat/Amazon_All_Beauty_2018.
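Since the filtered files share the "asin" key, reviews can be linked to product text with a simple lookup. This is a minimal sketch, assuming each file holds one JSON object per line. The field names come from the description above, but the sample records are made up and in-memory buffers stand in for the real files.

```python
import io
import json

def read_jsonl(handle):
    """Parse one JSON object per non-empty line."""
    return [json.loads(line) for line in handle if line.strip()]

# Hypothetical stand-ins for metadata.jsonl and reviews.jsonl.
metadata_file = io.StringIO(
    '{"asin": "B000000001", "text": "Example hand cream, beauty"}\n'
    '{"asin": "B000000002", "text": "Example shampoo, beauty"}\n'
)
reviews_file = io.StringIO(
    '{"reviewerID": "U1", "reviewTime": "01 1, 2018", "asin": "B000000001"}\n'
    '{"reviewerID": "U2", "reviewTime": "02 3, 2018", "asin": "B000000002"}\n'
    '{"reviewerID": "U3", "reviewTime": "03 5, 2018", "asin": "B000000009"}\n'
)

# Index product text by ASIN, then attach it to each review that has a match.
text_by_asin = {row["asin"]: row["text"] for row in read_jsonl(metadata_file)}
joined = [
    {**review, "text": text_by_asin[review["asin"]]}
    for review in read_jsonl(reviews_file)
    if review["asin"] in text_by_asin
]
```

The same lookup pattern would apply to the embeddings file, keyed on "asin" in the same way.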