98 datasets found
  1. SYNERGY - Open machine learning dataset on study selection in systematic...

    • dataverse.nl
    csv, json, txt, zip
    Updated Apr 24, 2023
    Cite
    Jonathan De Bruin; Yongchao Ma; Gerbrich Ferdinands; Jelle Teijema; Rens Van de Schoot (2023). SYNERGY - Open machine learning dataset on study selection in systematic reviews [Dataset]. http://doi.org/10.34894/HE6NAQ
    Available download formats: csv, json, txt, zip
    Dataset updated
    Apr 24, 2023
    Dataset provided by
    DataverseNL
    Authors
    Jonathan De Bruin; Yongchao Ma; Gerbrich Ferdinands; Jelle Teijema; Rens Van de Schoot
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    SYNERGY is a free and open dataset on study selection in systematic reviews, comprising 169,288 academic works from 26 systematic reviews. Only 2,834 (1.67%) of the academic works in the binary classified dataset are included in the systematic reviews. This makes SYNERGY a unique dataset for the development of information retrieval algorithms, especially for sparse labels. Because many variables are available per record (i.e. titles, abstracts, authors, references, topics), this dataset is useful for researchers in NLP, machine learning, network analysis, and more. In total, the dataset contains 82,668,134 trainable data points. The easiest way to get the SYNERGY dataset is via the synergy-dataset Python package. See https://github.com/asreview/synergy-dataset for all information.
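
    The label sparsity quoted above can be checked with a few lines of arithmetic. The sketch below uses only the two counts from the description; the function names and the inverse-frequency weighting scheme are illustrative assumptions, not part of the SYNERGY tooling.

```python
# Counts quoted in the SYNERGY description; everything else is illustrative.
TOTAL_RECORDS = 169_288
INCLUDED_RECORDS = 2_834


def inclusion_rate(included: int, total: int) -> float:
    """Fraction of records labelled as included in a review."""
    return included / total


def balanced_class_weights(included: int, total: int) -> dict:
    """Inverse-frequency weights: n_samples / (n_classes * n_class_samples)."""
    excluded = total - included
    return {
        "included": total / (2 * included),
        "excluded": total / (2 * excluded),
    }


rate = inclusion_rate(INCLUDED_RECORDS, TOTAL_RECORDS)
weights = balanced_class_weights(INCLUDED_RECORDS, TOTAL_RECORDS)

print(f"inclusion rate: {rate:.2%}")
print(weights)
```

    Weights like these are one common way to keep a classifier from ignoring the rare "included" class; other strategies (resampling, ranking metrics) may suit a given screening task better.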

  2. Goodreads Book Reviews

    • cseweb.ucsd.edu
    json
    Cite
    UCSD CSE Research Project, Goodreads Book Reviews [Dataset]. https://cseweb.ucsd.edu/~jmcauley/datasets.html
    Available download formats: json
    Dataset authored and provided by
    UCSD CSE Research Project
    Description

    These datasets contain reviews from the Goodreads book review website, and a variety of attributes describing the items. Critically, these datasets have multiple levels of user interaction, ranging from adding to a shelf to rating and reading.

    Metadata includes

    • reviews

    • add-to-shelf, read, review actions

    • book attributes: title, isbn

    • graph of similar books

    Basic Statistics:

    • Items: 1,561,465

    • Users: 808,749

    • Interactions: 225,394,930

  3. Booking dot com reviews datasets

    • crawlfeeds.com
    csv, zip
    Updated Jun 15, 2025
    Cite
    Crawl Feeds (2025). Booking dot com reviews datasets [Dataset]. https://crawlfeeds.com/datasets/booking-dot-com-reviews-datasets
    Available download formats: csv, zip
    Dataset updated
    Jun 15, 2025
    Dataset authored and provided by
    Crawl Feeds
    License

    https://crawlfeeds.com/privacy_policy

    Description

    The Booking.com Reviews Dataset is a comprehensive collection of user-generated reviews for hotels, hostels, bed & breakfasts, and other accommodations listed on Booking.com. This dataset provides detailed information on customer reviews, including ratings, review text, review dates, customer demographics, and more. It is a valuable resource for analyzing customer sentiment, service quality, and overall guest experiences across different types of accommodations worldwide.

    Key Features:

    • Review Data: Includes detailed customer reviews with both positive and negative feedback, providing insights into customer experiences and satisfaction levels.
    • Ratings: Features individual ratings for various aspects of the accommodations, such as cleanliness, location, service, value for money, and overall satisfaction.
    • Review Dates: Provides the dates of each review, enabling trend analysis over time.
    • Accommodation Details: Includes information about the accommodations being reviewed, such as name and location.
    • Language Support: Reviews are available in multiple languages, reflecting the diverse user base of Booking.com.

    Use Cases:

    • Sentiment Analysis: Ideal for businesses and researchers conducting sentiment analysis to understand customer opinions and trends in the hospitality industry.
    • Market Research: Useful for market research and competitive analysis, identifying strengths and weaknesses of different accommodation types and regions.
    • Machine Learning: Beneficial for developing machine learning models for natural language processing, sentiment classification, and recommendation systems.
    • Customer Experience Improvement: Helps hotel managers and owners understand customer feedback to improve services and guest experiences.
    • Academic Research: Suitable for academic research in hospitality management, consumer behavior, data science, and artificial intelligence.

    Dataset Format:

    The dataset is available in CSV format, making it easy to use for data analysis, machine learning, and application development.

    Access 3 million+ US hotel reviews — submit your request today.

  4. Amazon Customer Review Data

    • zenodo.org
    pdf
    Updated Jul 22, 2024
    Cite
    Akash Shashikant Vaykar; Abhishek Kaushik (2024). Amazon Customer Review Data [Dataset]. http://doi.org/10.5281/zenodo.3549704
    Available download formats: pdf
    Dataset updated
    Jul 22, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Akash Shashikant Vaykar; Abhishek Kaushik
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dataset: Amazon Customer Review Data for sentiment analysis

    Size: approx. 60,889

    Format: .CSV

    Period: 2013 to 2019

    Categories: 5 (Mobiles, Smart TV, Books, Mobile Accessories, Refrigerator)

    Unique_ID: Customized (Primary Key)

    Review_Header: User's comment in a few words

    Review_Text: User's comment in detail (3-4 lines)

    Rating: 1 = Very Low, 2 = Low, 3 = Avg, 4 = Good, 5 = Excellent

    Posting Period: 2013 to 2019

    Own_Rating: 1-2 = Negative, 3 = Neutral, 4-5 = Positive
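
    The Rating to Own_Rating mapping listed above is simple enough to express directly. The following is a minimal sketch of that rule; the function name `own_rating` is hypothetical and not taken from the dataset authors' code.

```python
def own_rating(rating: int) -> str:
    """Map a 1-5 star rating to the sentiment label described for Own_Rating."""
    if rating not in (1, 2, 3, 4, 5):
        raise ValueError(f"rating must be an integer from 1 to 5, got {rating!r}")
    if rating <= 2:
        return "Negative"
    if rating == 3:
        return "Neutral"
    return "Positive"


# Label every possible rating once to show the full mapping.
labels = {stars: own_rating(stars) for stars in range(1, 6)}
print(labels)
```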

  5. Amazon Product Reviews

    • kaggle.com
    Updated Nov 26, 2023
    Cite
    The Devastator (2023). Amazon Product Reviews [Dataset]. https://www.kaggle.com/datasets/thedevastator/amazon-product-reviews/discussion
    Available format: Croissant, a format for machine-learning datasets (see mlcommons.org/croissant).
    Dataset updated
    Nov 26, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    The Devastator
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Amazon Product Reviews

    18 Years of Customer Ratings and Experiences

    By Huggingface Hub [source]

    About this dataset

    The Amazon Reviews Polarity Dataset discloses eighteen years of customers' ratings and reviews from Amazon.com, offering an unparalleled trove of insight and knowledge. Drawing from the immense pool of over 35 million customer reviews, this dataset presents a broad spectrum of customer opinions on products they have bought or used. This invaluable data is a gold mine for improving products and services as it contains comprehensive information regarding customers' experiences with a product including ratings, titles, and plaintext content. At the same time, this dataset contains both customer-specific data along with product information which encourages deep analytics that could lead to great advances in providing tailored solutions for customers. Has your product been favored by the majority? Are there any aspects that need extra care? Use Amazon Reviews Polarity to gain deeper insights into what your customers want - explore now!

    How to use the dataset

    • Analyze customer ratings to identify trends: Take a look at how many customers have rated the same product or service with the same score (e.g., 4 stars). You can use this information to identify what customers like or don't like about it by examining common sentiment throughout the reviews. Identifying these patterns can help you decide which features of your products or services to emphasize in order to boost sales and satisfaction rates.

    • Review content analysis: Analyzing review content is one of the best ways to gauge customer sentiment toward specific features or aspects of a product/service. Natural language processing tools such as Word2Vec, Latent Dirichlet Allocation (LDA), or even simple keyword search algorithms can quickly reveal general topics that are discussed in relation to your product/service across multiple reviews, allowing you to quickly pinpoint areas that may need improvement for particular items within your lines of business.

    • Track associated scores over time: By tracking customer ratings over time, you may be able to better understand when there has been an issue with something specific related to your product/service, such as a negative response toward a feature that was introduced but didn't seem popular among customers and was removed shortly after introduction. This can save time and money by identifying issues before they become widespread concerns among larger sets of consumers who spend money on your company's item(s).

    • Visualize sentiment data over time with graphs: Visualizations such as bar graphs can help identify trends across different categories more quickly than raw numbers alone. Combining numeric values with color differences between scores lets you spot anomalies more easily, which speeds up resolution when trying to figure out why certain spikes occurred where others stayed stable (or vice versa) when comparing similar data points in time-series visualizations.

    Research Ideas

    • Developing a customer sentiment analysis system that can be used to quickly analyze the sentiment of reviews and identify any potential areas of improvement.
    • Building a product recommendation service that takes into account the ratings and reviews of customers when recommending similar products they may be interested in purchasing.
    • Training a machine learning model to accurately predict customers' ratings on new products they have not yet tried, and leveraging this for further product development optimization initiatives.

    Acknowledgements

    If you use this dataset in your research, please credit the original authors.

    License

    License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.

    Columns

    File: train.csv

    | Column name | Description |
    |:------------|:------------|
    | label | The sentiment of the review, either positive or negative. (String) |
    | title | The title of the review. (String) |

    ...

  6. Apple mobile phones reviews

    • crawlfeeds.com
    json, zip
    Updated Apr 29, 2025
    Cite
    Crawl Feeds (2025). Apple mobile phones reviews [Dataset]. https://crawlfeeds.com/datasets/apple-mobile-phones-reviews
    Available download formats: zip, json
    Dataset updated
    Apr 29, 2025
    Dataset authored and provided by
    Crawl Feeds
    License

    https://crawlfeeds.com/privacy_policy

    Description

    Apple mobile phones reviews structured dataset. This small dataset is ideal for NLP and for testing machine learning algorithms.

    Get a larger dataset from our resources.

    Extracted from Amazon.

    Data is included only for Apple mobile phones.

    Reach out to us for large datasets.

  7. Consumer Review of Clothing Product

    • data.mendeley.com
    Updated Feb 19, 2024
    Cite
    Nadhif Girawan (2024). Consumer Review of Clothing Product [Dataset]. http://doi.org/10.17632/pg3s4hw68k.3
    Dataset updated
    Feb 19, 2024
    Authors
    Nadhif Girawan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    We collected the dataset ourselves from various sources. It comprises a comprehensive collection of reviews pertaining to clothing products and serves as a valuable resource for multilabel classification research. Each data entry is annotated with relevant labels, allowing researchers to explore various dimensions of the clothing products being reviewed. The dataset offers a rich diversity of perspectives and opinions, enabling the development and evaluation of robust classification models that can accurately predict multiple aspects of a given clothing item. With its focus on multilabel classification, this data contributes to advancing the understanding and application of machine learning algorithms in the fashion industry.

  8. drug-reviews

    • huggingface.co
    Cite
    Mouwiya S. A. Al-Qaisieh, drug-reviews [Dataset]. https://huggingface.co/datasets/Mouwiya/drug-reviews
    Available format: Croissant, a format for machine-learning datasets (see mlcommons.org/croissant).
    Authors
    Mouwiya S. A. Al-Qaisieh
    License

    https://choosealicense.com/licenses/odbl/

    Description

    Dataset Details

      1. Dataset Loading:

    Initially, we load the Drug Review Dataset from the UC Irvine Machine Learning Repository. This dataset contains patient reviews of different drugs, along with the medical condition being treated and the patients' satisfaction ratings.

      2. Data Preprocessing:

    The dataset is preprocessed to ensure data integrity and consistency. We handle missing values and ensure that each patient ID is unique across the dataset.

      3. Text… See the full description on the dataset page: https://huggingface.co/datasets/Mouwiya/drug-reviews.
    
  9. IMDb Movie Reviews Dataset

    • ieee-dataport.org
    Updated Aug 2, 2022
    Cite
    Aditya Pal (2022). IMDb Movie Reviews Dataset [Dataset]. https://ieee-dataport.org/open-access/imdb-movie-reviews-dataset
    Dataset updated
    Aug 2, 2022
    Authors
    Aditya Pal
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    R

  10. SaudiShopInsights Dataset: Saudi Customer Reviews in Clothes and Electronics...

    • ieee-dataport.org
    Updated Dec 19, 2023
    Cite
    razan alrefaey (2023). SaudiShopInsights Dataset: Saudi Customer Reviews in Clothes and Electronics [Dataset]. https://ieee-dataport.org/documents/saudishopinsights-dataset-saudi-customer-reviews-clothes-and-electronics
    Dataset updated
    Dec 19, 2023
    Authors
    razan alrefaey
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Saudi Arabia
    Description

    natural language processing

  11. amazon_us_reviews

    • huggingface.co
    • tensorflow.org
    Updated Jun 30, 2023
    Cite
    Polina Kazakova (2023). amazon_us_reviews [Dataset]. https://huggingface.co/datasets/polinaeterna/amazon_us_reviews
    Dataset updated
    Jun 30, 2023
    Authors
    Polina Kazakova
    License

    https://choosealicense.com/licenses/other/

    Description

    Amazon Customer Reviews (a.k.a. Product Reviews) is one of Amazon's iconic products. In a period of over two decades since the first review in 1995, millions of Amazon customers have contributed over a hundred million reviews to express opinions and describe their experiences regarding products on the Amazon.com website. This makes Amazon Customer Reviews a rich source of information for academic researchers in the fields of Natural Language Processing (NLP), Information Retrieval (IR), and Machine Learning (ML), amongst others. Accordingly, we are releasing this data to further research in multiple disciplines related to understanding customer product experiences. Specifically, this dataset was constructed to represent a sample of customer evaluations and opinions, variation in the perception of a product across geographical regions, and promotional intent or bias in reviews.

    Over 130 million customer reviews are available to researchers as part of this release. The data is available in TSV files in the amazon-reviews-pds S3 bucket in AWS US East Region. Each line in the data files corresponds to an individual review (tab delimited, with no quote and escape characters).

    Each Dataset contains the following columns:

    • marketplace: 2 letter country code of the marketplace where the review was written.
    • customer_id: Random identifier that can be used to aggregate reviews written by a single author.
    • review_id: The unique ID of the review.
    • product_id: The unique Product ID the review pertains to. In the multilingual dataset the reviews for the same product in different countries can be grouped by the same product_id.
    • product_parent: Random identifier that can be used to aggregate reviews for the same product.
    • product_title: Title of the product.
    • product_category: Broad product category that can be used to group reviews (also used to group the dataset into coherent parts).
    • star_rating: The 1-5 star rating of the review.
    • helpful_votes: Number of helpful votes.
    • total_votes: Number of total votes the review received.
    • vine: Review was written as part of the Vine program.
    • verified_purchase: The review is on a verified purchase.
    • review_headline: The title of the review.
    • review_body: The review text.
    • review_date: The date the review was written.
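
    Because the files are described as tab delimited with no quote or escape characters, a reader should treat quote marks as ordinary text. The sketch below shows one way to do that with Python's standard `csv` module; the sample row is invented for illustration, and only the column names come from the list above.

```python
import csv
import io

# Column names as listed in the dataset description.
COLUMNS = [
    "marketplace", "customer_id", "review_id", "product_id", "product_parent",
    "product_title", "product_category", "star_rating", "helpful_votes",
    "total_votes", "vine", "verified_purchase", "review_headline",
    "review_body", "review_date",
]

# Hypothetical row (not real data); note the literal double quotes.
sample_values = [
    "US", "12345678", "R1EXAMPLE", "B00EXAMPLE", "987654321",
    'A 7" tablet case', "PC", "5", "3", "4", "N", "Y",
    "Great fit", 'Fits my 7" tablet perfectly', "2015-08-31",
]
sample_tsv = "\t".join(COLUMNS) + "\n" + "\t".join(sample_values) + "\n"

# QUOTE_NONE tells the reader not to give quote characters special meaning.
reader = csv.DictReader(
    io.StringIO(sample_tsv), delimiter="\t", quoting=csv.QUOTE_NONE
)
rows = list(reader)
review = rows[0]
star_rating = int(review["star_rating"])

print(review["product_title"], star_rating)
```

    With the default quoting mode, a stray `"` inside a title or review body could cause fields to be merged, which is why `csv.QUOTE_NONE` is used here.
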
  12. Language Generation Dataset: 200M Samples

    • kaggle.com
    zip
    Updated Sep 7, 2019
    Cite
    Abhishek Chatterjee (2019). Language Generation Dataset: 200M Samples [Dataset]. https://www.kaggle.com/datasets/imdeepmind/language-generation-dataset-200m-samples
    Available download formats: zip (3416608411 bytes)
    Dataset updated
    Sep 7, 2019
    Authors
    Abhishek Chatterjee
    Description

    Context

    Amazon Customer Reviews Dataset is a dataset of user-generated product reviews on the shopping website Amazon. It contains over 130 million product reviews.

    This dataset contains a tiny fraction of that dataset processed and prepared specifically for language generation.

    To see how the dataset was prepared, check the GitHub repository for this dataset: https://github.com/imdeepmind/AmazonReview-LanguageGenerationDataset

    Content

    The dataset is stored in an SQLite database. The database contains one table called reviews. This table contains two columns, sequence and next.

    The sequence column contains sequences of characters. In this dataset, each sequence is 40 characters long.

    The next column contains the next character after the sequence.

    There are about 200 million samples in the dataset.
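
    To make the table layout concrete, the sketch below builds a tiny in-memory SQLite database with the same shape: a reviews table whose sequence column holds 40-character windows and whose next column holds the character that follows. The sample text and the helper `make_samples` are illustrative assumptions; the dataset's actual preparation code lives in the linked repository and may differ.

```python
import sqlite3

SEQ_LEN = 40  # sequence length stated in the dataset description


def make_samples(text: str, seq_len: int = SEQ_LEN) -> list:
    """Slide a window over the text, pairing each window with its next character."""
    return [
        (text[i:i + seq_len], text[i + seq_len])
        for i in range(len(text) - seq_len)
    ]


# Made-up review text, long enough to yield several samples.
text = "This phone has a great camera and the battery lasts all day."

conn = sqlite3.connect(":memory:")
# Column names are quoted so they are always read as identifiers.
conn.execute('CREATE TABLE reviews ("sequence" TEXT, "next" TEXT)')
conn.executemany("INSERT INTO reviews VALUES (?, ?)", make_samples(text))

total = conn.execute("SELECT COUNT(*) FROM reviews").fetchone()[0]
rows = conn.execute(
    'SELECT "sequence", "next" FROM reviews ORDER BY rowid LIMIT 3'
).fetchall()

for seq, nxt in rows:
    print(repr(seq), "->", repr(nxt))
```

    Reading the real database would follow the same query pattern, with `sqlite3.connect` pointed at the downloaded file instead of `:memory:`.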

    Acknowledgements

    Thanks to Amazon for making this awesome dataset. Here is the link for the dataset: https://s3.amazonaws.com/amazon-reviews-pds/readme.html

    Inspiration

    This dataset can be used for Language Generation. As it contains 200 million samples, complex Deep Learning models can be trained on this data.

  13. Mobile App Logo and User Reviews Recommendation

    • data.mendeley.com
    Updated Aug 15, 2024
    + more versions
    Cite
    Iconix Sas (2024). Mobile App Logo and User Reviews Recommendation [Dataset]. http://doi.org/10.17632/v4ndw78f9b.1
    Dataset updated
    Aug 15, 2024
    Authors
    Iconix Sas
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset offers thorough app metadata from the Google Play Store and a sentiment analysis of user reviews for the apps. The first dataset (App_Sentiment_Analysis.csv) provides insights into user views and experiences via translated review texts, sentiment classifications, and numerical ratings for sentiment polarity and subjectivity. The second dataset (Review.csv) covers various app parameters, including ratings, review counts, sizes, installation counts, content ratings, genres, and more. When combined, these datasets allow for an in-depth examination of user reviews and app performance, which supports strategies for app recommendation and improvement. App logo images are also used for recommendations in this dataset.

  14. Dataset for Machine Learning Assisted Citation Screening for Systematic...

    • data.niaid.nih.gov
    Updated Dec 22, 2023
    Cite
    Dhrangadhariya, Anjani (2023). Dataset for Machine Learning Assisted Citation Screening for Systematic Reviews [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10423426
    Dataset updated
    Dec 22, 2023
    Dataset provided by
    Dhrangadhariya, Anjani
    Müller, Henning
    Hilfiker, Roger
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The work "Machine Learning Assisted Citation Screening for Systematic Reviews" explored the problem of citation screening automation using machine learning (ML) with the aim of accelerating the process of generating systematic reviews. The manual process of citation screening involves two reviewers manually screening the retrieved studies using predefined inclusion criteria. If a study passes the inclusion criteria, it is included for further analysis; otherwise it is excluded. Following the manual screening process, the work treated citation screening as a binary classification problem in which any ML classifier could be trained to separate the retrieved studies into these two classes (include and exclude).

    A physiotherapy citation screening dataset was used to test automation approaches and the dataset includes the studies identified for citation screening in an update to the systematic review by Hilfiker et al. The dataset included titles and abstracts (citations) from 31,279 (deduplicated: 25,540) studies identified during the search phase of this SR. These studies were already manually assessed for relevance and labelled by two reviewers into two mutually exclusive labels. The uploaded file consists of 25,540 data samples, with each data sample separated by a new line. It is a tab separated file and the data in it is structured as shown below. This dataset was manually labelled into include and exclude by Hilfiker et al.

    Title PMID Abstract Class MeSH terms (separated by a pipe)

    Structured exercise improves physical functioning in women with stages I and II breast cancer: results of a randomized controlled trial.
    11157015 Abstract PURPOSE: Self-directed and supervised exercise were compared with usual care in a clinical trial designed to evaluate the effect of structured exercise on physical functioning and other dimensions of health-related quality of life in women with stages I and II breast cancer. PATIENTS AND METHODS: One hundred twenty-three women with stages I and II breast cancer completed baseline evaluations of generic and disease- and site-specific health-related quality of life, aerobic capacity, and body weight. Participants were randomly allocated to one of three intervention groups: usual care (control group), self-directed exercise, or supervised exercise. Quality of life, aerobic capacity, and body weight measures were repeated at 26 weeks... include or exclude Clinical Trial | Comparative Study | Randomized Controlled Trial | Research Support, Non-U.S. Gov't | Antineoplastic Combined Chemotherapy Protocols | Breast Neoplasms | Breast Neoplasms | Breast Neoplasms | Chemotherapy, Adjuvant | Exercise | Female | Humans | Middle Aged | Neoplasm Staging | Quality of Life | Radiotherapy, Adjuvant
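
    A record in this layout can be split on tabs and its MeSH terms on pipes. The sketch below assumes the five-field order shown in the header line above (Title, PMID, Abstract, Class, MeSH terms); the field keys, the shortened abstract, and the choice of "include" as the sample label are illustrative, not taken from the file itself.

```python
import csv
import io

# Hypothetical keys for the five tab-separated fields shown in the header.
FIELDS = ["title", "pmid", "abstract", "label", "mesh_terms"]

# Shortened sample record; the label value is chosen for illustration only.
sample_line = "\t".join([
    "Structured exercise improves physical functioning in women with "
    "stages I and II breast cancer: results of a randomized controlled trial.",
    "11157015",
    "PURPOSE: Self-directed and supervised exercise were compared with usual care...",
    "include",
    "Clinical Trial | Comparative Study | Randomized Controlled Trial | "
    "Exercise | Female | Humans",
])


def parse_record(line: str) -> dict:
    """Split one tab-separated line into named fields and a MeSH term list."""
    values = next(csv.reader(io.StringIO(line), delimiter="\t", quoting=csv.QUOTE_NONE))
    record = dict(zip(FIELDS, values))
    record["mesh_terms"] = [term.strip() for term in record["mesh_terms"].split("|")]
    record["is_included"] = record["label"] == "include"
    return record


record = parse_record(sample_line)
print(record["pmid"], record["is_included"], record["mesh_terms"])
```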

    If you use this dataset in your research, please cite our papers.

  15. Amazon Reviews Full

    • figshare.com
    application/x-gzip
    Updated Nov 13, 2020
    Cite
    Luís Fred (2020). Amazon Reviews Full [Dataset]. http://doi.org/10.6084/m9.figshare.13232537.v1
    Available download formats: application/x-gzip
    Dataset updated
    Nov 13, 2020
    Dataset provided by
    figshare
    Authors
    Luís Fred
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Amazon Review Full Score Dataset

    Version 3, Updated 09/09/2015

    ORIGIN

    The Amazon reviews dataset consists of reviews from Amazon. The data span a period of 18 years, including ~35 million reviews up to March 2013. Reviews include product and user information, ratings, and a plaintext review. For more information, please refer to the following paper: J. McAuley and J. Leskovec. Hidden factors and hidden topics: understanding rating dimensions with review text. RecSys, 2013.

    The Amazon reviews full score dataset is constructed by Xiang Zhang (xiang.zhang@nyu.edu) from the above dataset. It is used as a text classification benchmark in the following paper: Xiang Zhang, Junbo Zhao, Yann LeCun. Character-level Convolutional Networks for Text Classification. Advances in Neural Information Processing Systems 28 (NIPS 2015).

    DESCRIPTION

    The Amazon reviews full score dataset is constructed by randomly taking 600,000 training samples and 130,000 testing samples for each review score from 1 to 5. In total there are 3,000,000 training samples and 650,000 testing samples.

    The files train.csv and test.csv contain all the training samples as comma-separated values. There are 3 columns in them, corresponding to class index (1 to 5), review title and review text. The review title and text are escaped using double quotes ("), and any internal double quote is escaped by 2 double quotes (""). New lines are escaped by a backslash followed by an "n" character, that is "\n".
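
    The escaping rules described above map cleanly onto Python's standard `csv` module, which already treats a doubled double quote as a literal quote. A minimal sketch follows; the sample row is invented, and `read_reviews` is a hypothetical helper name.

```python
import csv
import io

# Invented sample row: doubled quotes in the title, an escaped newline in the text.
sample_csv = '"4","Solid ""budget"" pick","Works well.\\nWould buy again."\n'


def read_reviews(handle):
    """Yield (class_index, title, text) with escaped newlines restored."""
    for class_index, title, text in csv.reader(handle):
        yield (
            int(class_index),
            title.replace("\\n", "\n"),
            text.replace("\\n", "\n"),
        )


reviews = list(read_reviews(io.StringIO(sample_csv)))
class_index, title, text = reviews[0]

print(class_index)
print(title)
print(text)
```

    For the real files, the same function could be applied to a handle opened with `open("train.csv", newline="", encoding="utf-8")`; the encoding is an assumption, since the description does not state it.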

  16. walmart-reviews-dataset

    • huggingface.co
    Updated Feb 17, 2021
    Cite
    Crawl Feeds (2021). walmart-reviews-dataset [Dataset]. https://huggingface.co/datasets/crawlfeeds/walmart-reviews-dataset
    Dataset updated
    Feb 17, 2021
    Authors
    Crawl Feeds
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    🛒 Walmart Product Reviews Dataset (6.7K Records)

    This dataset contains 6,700+ structured customer reviews from Walmart.com. Each entry includes product-level metadata along with review details, making it ideal for small-scale machine learning models, sentiment analysis, and ecommerce insights.

    📑 Dataset Fields

    | Column | Description |
    |:-------|:------------|
    | url | Direct product page URL |
    | name | Product name/title |
    | sku | Product SKU (Stock Keeping Unit) |
    | price | Product price (numeric, USD) |

    … See the full description on the dataset page: https://huggingface.co/datasets/crawlfeeds/walmart-reviews-dataset.

  17. Chemical product and function dataset

    • data.wu.ac.at
    • catalog.data.gov
    xls
    Updated Jun 4, 2017
    Cite
    U.S. Environmental Protection Agency (2017). Chemical product and function dataset [Dataset]. https://data.wu.ac.at/schema/data_gov/NWM4NDc4MWItYTE3NS00NjdhLWJkZDYtYjkyNDRlYTMzZjgw
    Available download formats: xls
    Dataset updated
    Jun 4, 2017
    Dataset provided by
    U.S. Environmental Protection Agency
    Description

    Merged product weight fraction and chemical function data.

    This dataset is associated with the following publication: Isaacs, K., M. Goldsmith, P. Egeghy, K. Phillips, R. Brooks, T. Hong, and J. Wambaugh. Characterization and prediction of chemical functions and weight fractions in consumer products. Toxicology Reports. Elsevier B.V., Amsterdam, NETHERLANDS, 3: 723-732, (2016).

  18. c

    Apple iPhone SE reviews & ratings Dataset

    • cubig.ai
    Updated Feb 25, 2025
    Cite
    CUBIG (2025). Apple iPhone SE reviews & ratings Dataset [Dataset]. https://cubig.ai/store/products/143/apple-iphone-se-reviews-ratings-dataset
    Dataset updated
    Feb 25, 2025
    Dataset authored and provided by
    CUBIG
    License

    https://cubig.ai/store/terms-of-service

    Measurement technique
    Synthetic data generation using AI techniques for model training, Privacy-preserving data transformation via differential privacy
    Description

    1) Data introduction
    • The Apple-iphone-se-reviews dataset consists of data scraped from the Flipkart website using Selenium and BeautifulSoup.

    2) Data utilization
    (1) Characteristics of the Apple-iphone-se-reviews data:
    • It contains user ratings for the Apple iPhone SE on the Indian e-commerce website Flipkart. It targets NLP text classification using user ratings, review titles, and review text.
    (2) The Apple-iphone-se-reviews data can be used for:
    • Rating prediction: Machine learning models that predict ratings from review text can support automated review analysis and summarization.
    • Product improvement: Insights gained from reviews can help identify common issues and areas for improvement in the iPhone SE and guide product development and quality improvements.

  19. The Artificial Intelligence in Retail Market size was USD 4951.2 Million in...

    • cognitivemarketresearch.com
    pdf,excel,csv,ppt
    Updated Mar 1, 2024
    Cite
    Cognitive Market Research (2024). The Artificial Intelligence in Retail Market size was USD 4951.2 Million in 2023 [Dataset]. https://www.cognitivemarketresearch.com/artificial-intelligence-in-retail-market-report
    Available download formats: pdf, excel, csv, ppt
    Dataset updated
    Mar 1, 2024
    Dataset authored and provided by
    Cognitive Market Research
    License

    https://www.cognitivemarketresearch.com/privacy-policy

    Time period covered
    2021 - 2033
    Area covered
    Global
    Description

    According to Cognitive Market Research, the global Artificial Intelligence in Retail market size was USD 4951.2 million in 2023 and will expand at a compound annual growth rate (CAGR) of 39.50% from 2023 to 2030.

    Enhanced customer personalization to provide viable market output
    Demand for online remains higher in Artificial Intelligence in the Retail market.
    The machine learning and deep learning category held the highest Artificial Intelligence in Retail market revenue share in 2023.
    North American Artificial Intelligence In Retail will continue to lead, whereas the Asia-Pacific Artificial Intelligence In Retail market will experience the most substantial growth until 2030.
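
The growth figures quoted above can be turned into a year-by-year projection with the usual compounding formula, size = base × (1 + CAGR)^years. The sketch below only illustrates that arithmetic: the function name project_market_size is hypothetical, and the values it prints are implied by the two quoted numbers, not figures taken from the report.

```python
def project_market_size(base_value, cagr, years):
    """Compound base_value forward by `years` at an annual growth rate `cagr`.

    Hypothetical helper: plain compound growth, nothing report-specific.
    """
    return base_value * (1.0 + cagr) ** years


BASE_2023_USD_MILLION = 4951.2  # 2023 market size quoted in the summary
CAGR = 0.3950                   # 39.50% per year, quoted for 2023 to 2030

for year in range(2023, 2031):
    size = project_market_size(BASE_2023_USD_MILLION, CAGR, year - 2023)
    print(f"{year}: USD {size:,.1f} million")
```

Under these assumptions, seven years of compounding at 39.50% puts the 2030 value at roughly ten times the 2023 base.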
    

    Market Dynamics of the Artificial Intelligence in the Retail Market

    Key Drivers for Artificial Intelligence in Retail Market

    Enhanced Customer Personalization to Provide Viable Market Output
    

    A primary driver of Artificial Intelligence in the Retail market is the pursuit of enhanced customer personalization. A.I. algorithms analyze vast datasets of customer behaviors, preferences, and purchase history to deliver highly personalized shopping experiences. Retailers leverage this insight to offer tailored product recommendations, targeted marketing campaigns, and personalized promotions. The drive for superior customer personalization not only enhances customer satisfaction but also increases engagement and boosts sales. This focus on individualized interactions through A.I. applications is a key driver shaping the dynamic landscape of A.I. in the retail market.

    January 2023 - Microsoft and digital start-up AiFi worked together to offer Smart Store Analytics. It is a cloud-based tracking solution that helps merchants with operational and shopper insights for intelligent, cashierless stores.

    Source: techcrunch.com/2023/01/10/aifi-microsoft-smart-store-analytics/

    Improved Operational Efficiency to Propel Market Growth
    

    Another pivotal driver is the quest for improved operational efficiency within the retail sector. A.I. technologies streamline various aspects of retail operations, from inventory management and demand forecasting to supply chain optimization and cashier-less checkout systems. By automating routine tasks and leveraging predictive analytics, retailers can enhance efficiency, reduce costs, and minimize errors. The pursuit of improved operational efficiency is a key motivator for retailers to invest in AI solutions, enabling them to stay competitive, adapt to dynamic market conditions, and meet the evolving demands of modern consumers in the highly competitive artificial intelligence (AI) retail market.

    January 2023 - EY, a professional services firm, introduced the EY Retail Intelligence solution, which is based on Microsoft Cloud, to give customers a safe and efficient shopping experience. To deliver insights, the solution uses Microsoft Cloud for Retail and its technologies, which include image recognition, analytics, and artificial intelligence (A.I.).

    Source: www.ey.com/en_gl/news/2023/01/ey-announces-launch-of-retail-solution-that-builds-on-the-microsoft-cloud-to-help-achieve-seamless-consumer-shopping-experiences

    Key Restraints for Artificial Intelligence in Retail Market

    Data Security Concerns to Restrict Market Growth
    

    A prominent restraint in Artificial Intelligence in the Retail market is the pervasive concern over data security. As retailers increasingly rely on A.I. to process vast amounts of customer data for personalized experiences, there is a growing apprehension regarding the protection of sensitive information. The potential for data breaches and cyberattacks poses a significant challenge, as retailers must navigate the delicate balance between utilizing customer data for AI-driven initiatives and safeguarding it against potential security threats. Addressing these concerns is crucial to building and maintaining consumer trust in A.I. applications within the retail sector.

    Key Trends for Artificial Intelligence in Retail Market

    Surge in Voice-Enabled Shopping Interfaces Reshaping Retail Experiences
    

    Voice-enabled A.I. assistants such as Amazon Alexa and Google Assistant are revolutionizing the way consumers engage with retail platforms. Shoppers can now utilize voice commands to search, compare, and purchase products, thereby streamlining and accelerating the buying process. Retailers...

  20. h

    drugscom_reviews

    • huggingface.co
    Updated Feb 24, 2024
    Cite
    Zakia Salod (2024). drugscom_reviews [Dataset]. https://huggingface.co/datasets/Zakia/drugscom_reviews
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
    Dataset updated
    Feb 24, 2024
    Authors
    Zakia Salod
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dataset Card for "DrugsCom Reviews"

      Dataset Summary
    

    The DrugsCom Reviews dataset is originally sourced from the UCI Machine Learning Repository. It provides patient reviews on specific drugs along with related conditions and a 10-star patient rating reflecting overall patient satisfaction. The dataset has been uploaded to Hugging Face to facilitate easier access and use by the machine learning community. It contains 161,297 instances in the training set and 53,766… See the full description on the dataset page: https://huggingface.co/datasets/Zakia/drugscom_reviews.

Cite
Jonathan De Bruin; Yongchao Ma; Gerbrich Ferdinands; Jelle Teijema; Rens Van de Schoot (2023). SYNERGY - Open machine learning dataset on study selection in systematic reviews [Dataset]. http://doi.org/10.34894/HE6NAQ

SYNERGY - Open machine learning dataset on study selection in systematic reviews

13 scholarly articles cite this dataset
Available download formats: csv, json, txt, zip
Dataset updated
Apr 24, 2023
Dataset provided by
DataverseNL
Authors
Jonathan De Bruin; Yongchao Ma; Gerbrich Ferdinands; Jelle Teijema; Rens Van de Schoot
License

CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically

Description

SYNERGY is a free and open dataset on study selection in systematic reviews, comprising 169,288 academic works from 26 systematic reviews. Only 2,834 (1.67%) of the academic works in the binary classified dataset are included in the systematic reviews. This makes the SYNERGY dataset a unique dataset for the development of information retrieval algorithms, especially for sparse labels. Because many variables are available per record (i.e., titles, abstracts, authors, references, topics), this dataset is useful for researchers in NLP, machine learning, network analysis, and more. In total, the dataset contains 82,668,134 trainable data points. The easiest way to get the SYNERGY dataset is via the synergy-dataset Python package. See https://github.com/asreview/synergy-dataset for all information.
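
The label sparsity quoted above can be checked with a few lines of arithmetic. The sketch below recomputes the inclusion rate from the two counts in the description and derives per-class weights with the common "balanced" heuristic, n_samples / (n_classes × n_class). The helper names are hypothetical, the weighting scheme is an assumption about how one might handle the imbalance, and nothing here uses the synergy-dataset package itself.

```python
TOTAL_RECORDS = 169_288  # academic works, from the description
INCLUDED = 2_834         # works included in the reviews, from the description


def inclusion_rate(included, total):
    """Fraction of records labelled as included."""
    return included / total


def balanced_class_weights(included, total):
    """Per-class weights n_samples / (n_classes * n_class) for the two labels.

    Hypothetical helper; label 1 means included, label 0 means excluded.
    """
    excluded = total - included
    return {0: total / (2 * excluded), 1: total / (2 * included)}


rate = inclusion_rate(INCLUDED, TOTAL_RECORDS)
weights = balanced_class_weights(INCLUDED, TOTAL_RECORDS)
print(f"inclusion rate: {rate:.2%}")
print(f"class weights: {weights}")
```

With these counts the included class receives a weight close to 30 and the excluded class a weight close to 0.5, which shows how strongly a classifier would need to up-weight the rare positive label.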
