5 datasets found
  1. h

    Data from: imdb

    • huggingface.co
    Updated Aug 3, 2003
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Stanford NLP (2003). imdb [Dataset]. https://huggingface.co/datasets/stanfordnlp/imdb
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Aug 3, 2003
    Dataset authored and provided by
    Stanford NLP
    License

    https://choosealicense.com/licenses/other/https://choosealicense.com/licenses/other/

    Description

    Dataset Card for "imdb"

      Dataset Summary
    

    Large Movie Review Dataset. This is a dataset for binary sentiment classification containing substantially more data than previous benchmark datasets. We provide a set of 25,000 highly polar movie reviews for training, and 25,000 for testing. There is additional unlabeled data for use as well.

      Supported Tasks and Leaderboards
    

    More Information Needed

      Languages
    

    More Information Needed

      Dataset Structure… See the full description on the dataset page: https://huggingface.co/datasets/stanfordnlp/imdb.
    
  2. i

    IMDb Movie Reviews Dataset

    • ieee-dataport.org
    Updated Aug 2, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Aditya Pal (2022). IMDb Movie Reviews Dataset [Dataset]. https://ieee-dataport.org/open-access/imdb-movie-reviews-dataset
    Explore at:
    Dataset updated
    Aug 2, 2022
    Authors
    Aditya Pal
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    R

  3. o

    IMDB Movie Reviews (Binary Sentiment)

    • opendatabay.com
    .csv
    Updated Jun 18, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Datasimple (2025). IMDB Movie Reviews (Binary Sentiment) [Dataset]. https://www.opendatabay.com/data/ai-ml/c48f7110-3d06-45be-9cae-aa8799720eec
    Explore at:
    .csvAvailable download formats
    Dataset updated
    Jun 18, 2025
    Dataset authored and provided by
    Datasimple
    License

    CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Area covered
    Entertainment & Media Consumption
    Description

    Source Huggingface Hub: link

    About this dataset This is a large dataset for binary sentiment classification containing a substantial amount of data compared to previous benchmark datasets. Provided are 25,000 highly polar movie reviews for training and 25,000 for testing. There is also additional unlabeled data available for use. The data fields are consistent among all splits of the dataset

    How to use the dataset In order to use this dataset, you will need to first download the IMDB Large Movie Review Dataset. Once you have downloaded the dataset, you can either use it in its original form or split it into training and testing sets. To split the dataset, you will need to create a new file called unsupervised.csv and copy the text column from train.csv into it. You can then split unsupervised.csv into two files: train_unsupervised.csv and test_unsupervised.csv.

    Once you have either the original dataset or the training and testing sets, you can begin using them for binary sentiment classification. In order to do this, you will need to use a machine learning algorithm that is capable of performing binary classification, such as logistic regression or support vector machines. Once you have trained your model on the training set, you can then evaluate its performance on the test set by predicting the labels of the reviews in test_unsupervised.csv

    Research Ideas This dataset can be used to train a binary sentiment classification model. This dataset can be used to train a model to classify movie reviews into positive and negative sentiment categories. This dataset can be used to build a large movie review database for research purposes

    License

    CC0

    Original Data Source: IMDB Movie Reviews (Binary Sentiment)

  4. P

    IMDB-MULTI Dataset

    • paperswithcode.com
    • opendatalab.com
    Updated Sep 1, 2021
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Pinar Yanardag; S. V. N. Vishwanathan (2021). IMDB-MULTI Dataset [Dataset]. https://paperswithcode.com/dataset/imdb-multi
    Explore at:
    Dataset updated
    Sep 1, 2021
    Authors
    Pinar Yanardag; S. V. N. Vishwanathan
    Description

    IMDB-MULTI is a relational dataset that consists of a network of 1000 actors or actresses who played roles in movies in IMDB. A node represents an actor or actress, and an edge connects two nodes when they appear in the same movie. In IMDB-MULTI, the edges are collected from three different genres: Comedy, Romance and Sci-Fi.

  5. How I Met Your Mother Episodes Data

    • kaggle.com
    Updated Jan 2, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Bill Cruise (2022). How I Met Your Mother Episodes Data [Dataset]. https://www.kaggle.com/datasets/bcruise/how-i-met-your-mother-episodes-data
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jan 2, 2022
    Dataset provided by
    Kagglehttp://kaggle.com/
    Authors
    Bill Cruise
    License

    https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    Series created by: Carter Bays and Craig Thomas Number of seasons: 9 Number of episodes: 208 Original air dates: September 19, 2005 – March 31, 2014

    Content

    Data was acquired through downloading IMDb TV episodes datasets and scraping information from Wikipedia.

    Acknowledgements

    Thanks to IMDb, Wikipedia, and community curators.

    Use

    It should be easy to join these data files together on Title and Air Date fields to compare (for example) US viewers and IMDb ratings.

    Motivation

    I wanted to share a dataset about How I Met Your Mother, one of my favorite TV shows to binge watch.

  6. Not seeing a result you expected?
    Learn how you can add new datasets to our index.

Share
FacebookFacebook
TwitterTwitter
Email
Click to copy link
Link copied
Close
Cite
Stanford NLP (2003). imdb [Dataset]. https://huggingface.co/datasets/stanfordnlp/imdb

Data from: imdb

IMDB

stanfordnlp/imdb

Related Article
Explore at:
21 scholarly articles cite this dataset (View in Google Scholar)
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Aug 3, 2003
Dataset authored and provided by
Stanford NLP
License

https://choosealicense.com/licenses/other/https://choosealicense.com/licenses/other/

Description

Dataset Card for "imdb"

  Dataset Summary

Large Movie Review Dataset. This is a dataset for binary sentiment classification containing substantially more data than previous benchmark datasets. We provide a set of 25,000 highly polar movie reviews for training, and 25,000 for testing. There is additional unlabeled data for use as well.

  Supported Tasks and Leaderboards

More Information Needed

  Languages

More Information Needed

  Dataset Structure… See the full description on the dataset page: https://huggingface.co/datasets/stanfordnlp/imdb.
Search
Clear search
Close search
Google apps
Main menu