13 datasets found

Amazon-Reviews-2023
huggingface.co
Updated Apr 7, 2024
Cite
McAuley-Lab (2024). Amazon-Reviews-2023 [Dataset]. https://huggingface.co/datasets/McAuley-Lab/Amazon-Reviews-2023
Dataset updated
Apr 7, 2024
Dataset authored and provided by
McAuley-Lab
Description
Amazon Review 2023 is an updated version of the Amazon Review 2018 dataset. This dataset mainly includes reviews (ratings, text) and item metadata (descriptions, category information, price, brand, and images). Compared to the previous versions, the 2023 version features larger size, newer reviews (up to Sep 2023), richer and cleaner metadata, and finer-grained timestamps (from day to millisecond).
DATAANT | Amazon Data | E-commerce Product Review | Dataset, API | Reviews...
datarade.ai
Updated Nov 22, 2022
+ more versions
Cite
Dataant (2022). DATAANT | Amazon Data | E-commerce Product Review | Dataset, API | Reviews by keyword, by category, by seller, by product ASIN | 19 countries [Dataset]. https://datarade.ai/data-products/amazon-data-reviews-by-keyword-by-category-by-seller-by-p-dataant
Available download formats: .bin, .json, .xml, .csv, .xls, .sql
Dataset updated
Nov 22, 2022
Dataset authored and provided by
Dataant
Area covered
Spain, Canada, Poland, Turkey, China, Brazil, United Arab Emirates, Germany, Netherlands, France
Description
Get the needed Amazon product review data right from the data extractor! Collect Amazon review information from 19 Amazon countries from the following domains:
- amazon.com
- amazon.com.au
- amazon.com.br
- amazon.ca
- amazon.cn
- amazon.fr
- amazon.de
- amazon.in
- amazon.it
- amazon.com.mx
- amazon.nl
- amazon.sg
- amazon.es
- amazon.com.tr

Request Ecommerce Product Review dataset by:
- keyword
- category
- seller
- product ID (ASIN)

Amazon E-commerce Reviews Data datasets gathered by keyword, seller, category, or ASIN contain:
- Product ID (can be extended to the full product information)
- Review content and rating
- Review metadata

Amazon extraction results can be delivered by schedule or API request, so the data can be extracted in real-time.

DATAANT uses the in-house web scraping service with no concurrency limitations, so unlimited data extractions can be performed simultaneously.

Output and attributes can be customized to fit your particular needs.
Amazon Product Data Dataset
paperswithcode.com
opendatalab.com
Updated Mar 5, 2024
Cite
Ruining He; Julian McAuley (2024). Amazon Product Data Dataset [Dataset]. https://paperswithcode.com/dataset/amazon-product-data
Dataset updated
Mar 5, 2024
Authors
Ruining He; Julian McAuley
Description
This dataset contains product reviews and metadata from Amazon, including 142.8 million reviews spanning May 1996 - July 2014.

This dataset includes reviews (ratings, text, helpfulness votes), product metadata (descriptions, category information, price, brand, and image features), and links (also viewed/also bought graphs).
Amazon product reviews (mock dataset)
zenodo.org
csv
Updated Jun 18, 2022
Cite
Yury Kashnitsky (2022). Amazon product reviews (mock dataset) [Dataset]. http://doi.org/10.5281/zenodo.6657410
Available download formats: csv
Unique identifier
https://doi.org/10.5281/zenodo.6657410
Dataset updated
Jun 18, 2022
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Yury Kashnitsky
License
Attribution 1.0 (CC BY 1.0), https://creativecommons.org/licenses/by/1.0/
License information was derived automatically
Description
About

This is a mock dataset with Amazon product reviews. Classes are structured: 6 "level 1" classes, 64 "level 2" classes, and 510 "level 3" classes.

3 files are shared:

train_40k.csv - training 40k Amazon product reviews

valid_10k.csv - 10k reviews left for validation

unlabeled_150k.csv - raw 150k Amazon product reviews; these can be used for language model fine-tuning.

Level 1 classes are: health personal care, toys games, beauty, pet supplies, baby products, and grocery gourmet food.

Dataset originally from https://www.kaggle.com/datasets/kashnitsky/hierarchical-text-classification
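
The three-level class structure described above can be checked with a short script. The following is a minimal sketch, not the dataset author's code: the column names `Cat1`, `Cat2`, and `Cat3` are assumptions (check the real CSV header first), and the sample rows are made up to stand in for train_40k.csv.

```python
import csv
import io
from collections import defaultdict

# Hypothetical column names; verify them against the real CSV header.
LEVEL_COLUMNS = ("Cat1", "Cat2", "Cat3")


def class_hierarchy(csv_file):
    """Map each level-1 class to its level-2 classes and their level-3 classes."""
    tree = defaultdict(lambda: defaultdict(set))
    for row in csv.DictReader(csv_file):
        level1, level2, level3 = (row[col].strip() for col in LEVEL_COLUMNS)
        tree[level1][level2].add(level3)
    return tree


def count_classes(tree):
    """Return the number of distinct classes at levels 1, 2, and 3.

    Level-2 and level-3 classes are counted per parent, so the same name
    under two different parents counts twice.
    """
    n1 = len(tree)
    n2 = sum(len(children) for children in tree.values())
    n3 = sum(len(leaves) for children in tree.values() for leaves in children.values())
    return n1, n2, n3


# Tiny made-up sample standing in for train_40k.csv.
SAMPLE = """Text,Cat1,Cat2,Cat3
Great shampoo,beauty,hair care,shampoos
Nice conditioner,beauty,hair care,conditioners
Dog loves it,pet supplies,dogs,toys
"""

sample_tree = class_hierarchy(io.StringIO(SAMPLE))
sample_counts = count_classes(sample_tree)
print(sample_counts)
```

To run this on the real file, pass `open("train_40k.csv", newline="", encoding="utf-8")` in place of the in-memory sample, assuming the header matches the column names above.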
Steam Video Game and Bundle Data
cseweb.ucsd.edu
json
Cite
UCSD CSE Research Project, Steam Video Game and Bundle Data [Dataset]. https://cseweb.ucsd.edu/~jmcauley/datasets.html
Available download formats: json
Dataset authored and provided by
UCSD CSE Research Project
Description
These datasets contain reviews from the Steam video game platform, and information about which games were bundled together.

Metadata includes

reviews

purchases, plays, recommends (likes)

product bundles

pricing information

Basic Statistics:

Reviews: 7,793,069

Users: 2,567,538

Items: 15,474

Bundles: 615
Amazon_Customer_Review_2023
huggingface.co
+ more versions
Cite
Amazon_Customer_Review_2023 [Dataset]. https://huggingface.co/datasets/kevykibbz/Amazon_Customer_Review_2023
Available format: Croissant, a format for machine-learning datasets (see mlcommons.org/croissant).
Authors
kevin kibebe
Description
Amazon Product Review Dataset (2023)

Dataset Overview

The Amazon Product Review Dataset (2023) contains product reviews from Amazon customers. The dataset includes product information, review details, and metadata about the customers who left the reviews. This dataset can be used for various natural language processing (NLP) tasks, including sentiment analysis, review prediction, recommendation systems, and more.

Dataset Name: Amazon Product Review Dataset (2023)… See the full description on the dataset page: https://huggingface.co/datasets/kevykibbz/Amazon_Customer_Review_2023.
MuMu: Multimodal Music Dataset
data.niaid.nih.gov
zenodo.org
Updated Dec 6, 2022
Cite
Oramas, Sergio (2022). MuMu: Multimodal Music Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_831188
Dataset updated
Dec 6, 2022
Dataset authored and provided by
Oramas, Sergio
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
MuMu is a Multimodal Music dataset with multi-label genre annotations that combines information from the Amazon Reviews dataset and the Million Song Dataset (MSD). The former contains millions of album customer reviews and album metadata gathered from Amazon.com. The latter is a collection of metadata and precomputed audio features for a million songs.

To map the information from both datasets we use MusicBrainz. This process yields the final set of 147,295 songs, which belong to 31,471 albums. For the mapped set of albums, there are 447,583 customer reviews from the Amazon Dataset. The dataset has been used for multi-label music genre classification experiments in the related publication. In addition to genre annotations, this dataset provides further information about each album, such as average rating, selling rank, similar products, and cover image URL. For every text review it also provides the helpfulness score, average rating, and summary of the review.

The mapping between the three datasets (Amazon, MusicBrainz and MSD), genre annotations, metadata, data splits, text reviews and links to images are available here. Images and audio files cannot be released due to copyright issues.

MuMu dataset (mapping, metadata, annotations and text reviews)

Data splits and multimodal feature embeddings for ISMIR multi-label classification experiments

These data can be used together with the Tartarus deep learning library https://github.com/sergiooramas/tartarus.

NOTE: This version provides simplified files with metadata and splits.

Scientific References

Please cite the following papers if using MuMu dataset or Tartarus library.

Oramas, S., Barbieri, F., Nieto, O., and Serra, X. (2018). Multimodal Deep Learning for Music Genre Classification. Transactions of the International Society for Music Information Retrieval, V(1).

Oramas S., Nieto O., Barbieri F., & Serra X. (2017). Multi-label Music Genre Classification from audio, text and images using Deep Features. In Proceedings of the 18th International Society for Music Information Retrieval Conference (ISMIR 2017). https://arxiv.org/abs/1707.04916
amazon_reviews
kaggle.com
zip
Updated Jan 29, 2019
+ more versions
Cite
lievgarcia (2019). amazon_reviews [Dataset]. https://www.kaggle.com/lievgarcia/amazon-reviews
Available download formats: zip (4593980 bytes)
Dataset updated
Jan 29, 2019
Authors
lievgarcia
Description
Dataset

This dataset was created by lievgarcia

Amazon Reviews
kaggle.com
zip
Updated Feb 16, 2022
Cite
Prateek Pal (2022). Amazon Reviews [Dataset]. https://www.kaggle.com/prateekiet/amazon-reviews
Available download formats: zip (4593980 bytes)
Dataset updated
Feb 16, 2022
Authors
Prateek Pal
Description
Dataset

This dataset was created by Prateek Pal

600_Amazon_Reviews
kaggle.com
zip
Updated Feb 15, 2022
+ more versions
Cite
Atul Krishnan (2022). 600_Amazon_Reviews [Dataset]. https://www.kaggle.com/atulkrishnan25/600-amazon-reviews
Available download formats: zip (98245 bytes)
Dataset updated
Feb 15, 2022
Authors
Atul Krishnan
Description
Dataset

This dataset was created by Atul Krishnan

River Sediment Database-Amazon (RivSed-Amazon)
data.niaid.nih.gov
zenodo.org
Updated Sep 25, 2024
Cite
John Gardner (2024). River Sediment Database-Amazon (RivSed-Amazon) [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8377852
Dataset updated
Sep 25, 2024
Dataset authored and provided by
John Gardner
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The River Sediment Database-Amazon (RivSed-Amazon) database contains surface suspended sediment concentrations (SSC) derived from Landsat 5, 7, and 8 Level 1 Collection 1 surface reflectance from all rivers in the Amazon River Basin that are ~60 meters wide or greater. SSC represent spatially integrated "reach" median concentrations over the footprint of SWOT River Database (SWORD, Altenau et al., 2021) centerlines (median reach length = 10 km) where high quality river water pixels were detected within each Landsat image from 1984-2018.

The methods used to produce this database were initially developed in the following publications:

Gardner, J., Pavelsky, T. M., Topp, S., Yang, X., Ross, M. R., & Cohen, S. (2023). Human activities change suspended sediment concentration along rivers. Environmental Research Letters. https://iopscience.iop.org/article/10.1088/1748-9326/acd8d8 and

Gardner et al. (2020). The color of rivers. Geophysical Research Letters. https://agupubs.onlinelibrary.wiley.com/doi/full/10.1029/2020GL088946

The publication associated with RivSed-Amazon is in review.

Files:

1) Metadata (rivSed_Amazon_metadata_v1.01.pdf): Description of the key data files associated with this repository.

2) RiverSed (RiverSed_Amazon_v1.1.txt): Table of SSC and associated data that is joinable to SWORD based on the "reach_id".

3) Shapefile of river centerlines over South America to which the reflectance data can be attached (SWORD_SA.shp).

4) Shapefile of the reach polygons associated with SWORD_SA over the Amazon Basin. (reach_polygons_amazon.shp).

5) SSC-Landsat matchup database with extended metadata on locations and in-situ data (train_full_v1.1.csv).

6) The final training data used to build the xgboost machine learning model (train_v1.1.csv).

7) The xgboost model that can make SSC predictions over inland waters in USA using Landsat bands/band combinations (tssAmazon_model_v1.1.rds and .rda). The model can only be loaded and used in R at this time.

8) The correction coefficients applied to Landsat 5 and 8 to harmonize surface reflectance across Landsat 5, 7, and 8 and over all bands to enable time series analysis.
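
The join on "reach_id" mentioned in the file list can be sketched as follows. This is an illustration under stated assumptions, not the database author's workflow: only `reach_id` comes from the description, while the `ssc` and `reach_len_km` column names, the comma delimiter, and all sample values are hypothetical.

```python
import csv
import io
from collections import defaultdict
from statistics import median


def median_ssc_by_reach(ssc_file, value_column="ssc"):
    """Group SSC observations by reach_id and return the median per reach.

    `value_column` is a hypothetical column name for the SSC value.
    """
    per_reach = defaultdict(list)
    for row in csv.DictReader(ssc_file):
        per_reach[row["reach_id"]].append(float(row[value_column]))
    return {reach_id: median(values) for reach_id, values in per_reach.items()}


def join_to_reaches(reach_rows, ssc_by_reach):
    """Left join: keep every reach and attach its median SSC (None if absent)."""
    return [
        dict(reach, median_ssc=ssc_by_reach.get(reach["reach_id"]))
        for reach in reach_rows
    ]


# Made-up observations standing in for the SSC table.
SSC_SAMPLE = """reach_id,date,ssc
A1,2001-01-01,10.0
A1,2001-02-01,30.0
A1,2001-03-01,20.0
B2,2001-01-01,5.0
"""

# Made-up reach attributes standing in for the SWORD centerline table.
REACHES = [
    {"reach_id": "A1", "reach_len_km": 10.2},
    {"reach_id": "B2", "reach_len_km": 9.7},
    {"reach_id": "C3", "reach_len_km": 11.0},
]

ssc_medians = median_ssc_by_reach(io.StringIO(SSC_SAMPLE))
joined = join_to_reaches(REACHES, ssc_medians)
for record in joined:
    print(record)
```

A left join is used so that reaches with no SSC observations remain in the output with a missing value rather than being dropped silently.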
RSMA - Annual review of the quality of water bodies
open.canada.ca
catalogue.arctic-sdi.org
+1 more
html, pdf
Updated Mar 5, 2025
+ more versions
Cite
Government and Municipalities of Québec (2025). RSMA - Annual review of the quality of water bodies [Dataset]. https://open.canada.ca/data/dataset/7ec7db2d-72e5-405c-a9ae-7f4269758178
Available download formats: pdf, html
Dataset updated
Mar 5, 2025
Dataset provided by
Government and Municipalities of Québec
License
Open Government Licence - Canada 2.0, https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
Time period covered
Jan 1, 2004 - Dec 31, 2023
Description
History of annual reviews of the quality of water bodies in Montreal. The aquatic environment monitoring network (RSMA) takes surface water samples in order to draw up a state of the situation in the Montreal agglomeration.

This third-party metadata element was translated using an automated translation tool (Amazon Translate).
Amazon_All_Beauty_2018
huggingface.co
Updated Oct 21, 2024
+ more versions
Cite
SmartCat (2024). Amazon_All_Beauty_2018 [Dataset]. https://huggingface.co/datasets/smartcat/Amazon_All_Beauty_2018
Available format: Croissant, a format for machine-learning datasets (see mlcommons.org/croissant).
Dataset updated
Oct 21, 2024
Dataset authored and provided by
SmartCat
Description
Amazon All Beauty Dataset

Directory Structure

metadata: Contains product information.

reviews: Contains user reviews about the products.

filtered:

e5-base-v2_embeddings.jsonl: Contains "asin" and "embeddings" created with e5-base-v2.

metadata.jsonl: Contains "asin" and "text", where text is created from the title, description, brand, main category, and category.

reviews.jsonl: Contains "reviewerID", "reviewTime", and "asin". Reviews are filtered to include… See the full description on the dataset page: https://huggingface.co/datasets/smartcat/Amazon_All_Beauty_2018.
Amazon-Reviews-2023 (McAuley-Lab/Amazon-Reviews-2023): 23 scholarly articles cite this dataset.
