100+ datasets found
  1. Data from: imdb

    • huggingface.co
    Updated Aug 3, 2003
    + more versions
    Cite
    Stanford NLP (2003). imdb [Dataset]. https://huggingface.co/datasets/stanfordnlp/imdb
    Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Aug 3, 2003
    Dataset authored and provided by
    Stanford NLP
    License

    https://choosealicense.com/licenses/other/

    Description

    Dataset Card for "imdb"

      Dataset Summary
    

    Large Movie Review Dataset. This is a dataset for binary sentiment classification containing substantially more data than previous benchmark datasets. We provide a set of 25,000 highly polar movie reviews for training, and 25,000 for testing. There is additional unlabeled data for use as well.

      Supported Tasks and Leaderboards
    

    More Information Needed

      Languages
    

    More Information Needed

      Dataset Structure… See the full description on the dataset page: https://huggingface.co/datasets/stanfordnlp/imdb.
    
  2. Data from: imdb

    • huggingface.co
    Updated May 10, 2025
    + more versions
    Cite
    scikit-learn (2025). imdb [Dataset]. https://huggingface.co/datasets/scikit-learn/imdb
    Croissant
    Dataset updated
    May 10, 2025
    Dataset authored and provided by
    scikit-learn
    License

    https://choosealicense.com/licenses/other/

    Description

    This is the sentiment analysis dataset based on IMDB reviews initially released by Stanford University. This is a dataset for binary sentiment classification containing substantially more data than previous benchmark datasets. We provide a set of 25,000 highly polar movie reviews for training, and 25,000 for testing. There is additional unlabeled data for use as well. Raw text and already processed bag of words formats are provided. See the README file contained in the release for more… See the full description on the dataset page: https://huggingface.co/datasets/scikit-learn/imdb.

  3. IMDb Movie Reviews Dataset

    • ieee-dataport.org
    Updated Aug 2, 2022
    Cite
    Aditya Pal (2022). IMDb Movie Reviews Dataset [Dataset]. https://ieee-dataport.org/open-access/imdb-movie-reviews-dataset
    Dataset updated
    Aug 2, 2022
    Authors
    Aditya Pal
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    R

  4. IMDb Actors and Movies

    • kaggle.com
    Updated Apr 27, 2024
    Cite
    Rishab Jadhav (2024). IMDb Actors and Movies [Dataset]. https://www.kaggle.com/datasets/rishabjadhav/imdb-actors-and-movies
    Croissant
    Dataset updated
    Apr 27, 2024
    Dataset provided by
    Kaggle
    Authors
    Rishab Jadhav
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    From IMDB's database, I downloaded two datasets of actors and movies. I then cleaned and merged the datasets for a combined dataset containing known actors and relevant information, including a movie they appeared in.

  5. imdb-genres

    • huggingface.co
    Updated Sep 18, 2024
    Cite
    Jack Quigley (2024). imdb-genres [Dataset]. https://huggingface.co/datasets/jquigl/imdb-genres
    Croissant
    Dataset updated
    Sep 18, 2024
    Authors
    Jack Quigley
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    Dataset Card for IMDb Movie Dataset: All Movies by Genre

      Dataset Summary
    

    This dataset is an adapted version of "IMDb Movie Dataset: All Movies by Genre" found at: https://www.kaggle.com/datasets/rajugc/imdb-movies-dataset-based-on-genre?select=history.csv. Within the dataset, the movie title and year columns were combined, the genre was extracted from the separate CSV files, the pre-existing genre column was renamed to expanded-genres, any movies missing a description… See the full description on the dataset page: https://huggingface.co/datasets/jquigl/imdb-genres.

  6. IMDB-MULTI Dataset

    • paperswithcode.com
    • opendatalab.com
    Updated Sep 1, 2021
    + more versions
    Cite
    Pinar Yanardag; S. V. N. Vishwanathan (2021). IMDB-MULTI Dataset [Dataset]. https://paperswithcode.com/dataset/imdb-multi
    Dataset updated
    Sep 1, 2021
    Authors
    Pinar Yanardag; S. V. N. Vishwanathan
    Description

    IMDB-MULTI is a relational dataset that consists of a network of 1000 actors or actresses who played roles in movies in IMDB. A node represents an actor or actress, and an edge connects two nodes when they appear in the same movie. In IMDB-MULTI, the edges are collected from three different genres: Comedy, Romance and Sci-Fi.
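    The graph construction described above (one node per actor or actress, one edge whenever two of them appear in the same movie) can be sketched as follows. This is only an illustration under stated assumptions: the function name `build_coappearance_graph`, the input layout, and the toy cast lists are hypothetical and are not taken from the dataset or its tooling.

```python
from collections import defaultdict
from itertools import combinations


def build_coappearance_graph(casts):
    """Return an adjacency map: actor -> set of actors sharing at least one movie.

    `casts` maps a movie id to its list of actors (hypothetical input layout).
    """
    graph = defaultdict(set)
    for cast in casts.values():
        unique = sorted(set(cast))
        for actor in unique:
            graph[actor]  # register the node even if it ends up with no edges
        # Every pair of actors in the same movie gets an undirected edge.
        for a, b in combinations(unique, 2):
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)


# Hypothetical toy data, not taken from IMDB-MULTI.
toy_casts = {
    "movie_1": ["actor_a", "actor_b", "actor_c"],
    "movie_2": ["actor_b", "actor_d"],
    "movie_3": ["actor_e"],
}
graph = build_coappearance_graph(toy_casts)
# Each undirected edge is stored twice in the adjacency map.
edge_count = sum(len(neighbors) for neighbors in graph.values()) // 2
```

    In this toy example `actor_e` stays an isolated node because no co-star is listed for `movie_3`.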

  7. IMDb Top Rated Titles (Movies & TV Series)

    • kaggle.com
    Updated Jun 9, 2025
    Cite
    OctopusTeam (2025). IMDb Top Rated Titles (Movies & TV Series) [Dataset]. https://www.kaggle.com/datasets/octopusteam/imdb-top-rated-titles-movies-and-tv-series
    Croissant
    Dataset updated
    Jun 9, 2025
    Dataset provided by
    Kaggle
    Authors
    OctopusTeam
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset contains a list of over 6,000 top-rated titles on IMDb, including both movies and TV series, with a minimum average user rating of 7 and over 10,000 votes.

    The dataset is updated daily at 10:00 AM CET. If you find this dataset helpful, feel free to give it an upvote! 😊

    You can find the IMDb (Unofficial) API at this link: IMDb API on RapidAPI. This API offers access to the entire IMDb database, including detailed ratings, episode information, cast details, and much more.


  8. IMDB-BINARY

    • huggingface.co
    Updated Mar 13, 2023
    + more versions
    Cite
    Graph Datasets (2023). IMDB-BINARY [Dataset]. https://huggingface.co/datasets/graphs-datasets/IMDB-BINARY
    Croissant
    Dataset updated
    Mar 13, 2023
    Dataset authored and provided by
    Graph Datasets
    License

    https://choosealicense.com/licenses/unknown/

    Description

    Dataset Card for IMDB-BINARY (IMDb-B)

      Dataset Summary
    

    The IMDb-B dataset is "a movie collaboration dataset that consists of the ego-networks of 1,000 actors/actresses who played roles in movies in IMDB. In each graph, nodes represent actors/actress, and there is an edge between them if they appear in the same movie. These graphs are derived from the Action and Romance genres".

      Supported Tasks and Leaderboards
    

    IMDb-B should be used for graph classification… See the full description on the dataset page: https://huggingface.co/datasets/graphs-datasets/IMDB-BINARY.

  9. Data from: imdb

    • huggingface.co
    Updated May 4, 2025
    + more versions
    Cite
    testtest (2025). imdb [Dataset]. https://huggingface.co/datasets/test3534/imdb
    Dataset updated
    May 4, 2025
    Dataset authored and provided by
    testtest
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    test3534/imdb dataset hosted on Hugging Face and contributed by the HF Datasets community

  10. IMDb-Face Dataset

    • library.toponeai.link
    • paperswithcode.com
    Updated Feb 13, 2024
    + more versions
    Cite
    Fei Wang; Liren Chen; Cheng Li; Shiyao Huang; Yanjie Chen; Chen Qian; Chen Change Loy (2024). IMDb-Face Dataset [Dataset]. https://library.toponeai.link/dataset/imdb-face
    Dataset updated
    Feb 13, 2024
    Authors
    Fei Wang; Liren Chen; Cheng Li; Shiyao Huang; Yanjie Chen; Chen Qian; Chen Change Loy
    Description

    IMDb-Face is a large-scale noise-controlled dataset for face recognition research. The dataset contains about 1.7 million faces of 59k identities, manually cleaned from 2.0 million raw images. All images are obtained from the IMDb website.

  11. imdb_pos-neg

    • kaggle.com
    Updated Jul 14, 2023
    Cite
    Nodirbek Kamalov (2023). imdb_pos-neg [Dataset]. https://www.kaggle.com/datasets/nodirbekkamolov/imdb-pos-neg
    Croissant
    Dataset updated
    Jul 14, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Nodirbek Kamalov
    Description

    This dataset was created by Nodirbek Kamalov.

  12. Data from: IMDB Dataset

    • kaggle.com
    Updated Feb 29, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Farshad Tofighi (2024). IMDB Dataset [Dataset]. https://www.kaggle.com/datasets/farshadtofighi/imdb-dataset
    Croissant
    Dataset updated
    Feb 29, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Farshad Tofighi
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset was created by Farshad Tofighi.

    Released under CC0: Public Domain

  13. IMDB-Clean Dataset

    • paperswithcode.com
    • opendatalab.com
    • +1 more
    Updated Mar 4, 2022
    Cite
    Yiming Lin; Jie Shen; Yujiang Wang; Maja Pantic (2022). IMDB-Clean Dataset [Dataset]. https://paperswithcode.com/dataset/imdb-clean
    Dataset updated
    Mar 4, 2022
    Authors
    Yiming Lin; Jie Shen; Yujiang Wang; Maja Pantic
    Description

    We have cleaned the noisy IMDB-WIKI dataset using a constrained clustering method, resulting in this new benchmark for in-the-wild age estimation. The annotations also allow this dataset to be used for other tasks, such as gender classification and face recognition/verification. For more details, please refer to our FPAge paper.

  14. IMDB Movie Dataset

    • gts.ai
    json
    Updated Jan 24, 2025
    Cite
    GTS (2025). IMDB Movie Dataset [Dataset]. https://gts.ai/dataset-download/imdb-movie-dataset/
    Available download formats: json
    Dataset updated
    Jan 24, 2025
    Dataset provided by
    GLOBOSE TECHNOLOGY SOLUTIONS PRIVATE LIMITED
    Authors
    GTS
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Explore the IMDB Movie Dataset to uncover trends, audience preferences, and success factors like ratings, revenue, and genres. Perfect for analysis!

  15. Data from: IMDB Dataset

    • kaggle.com
    Updated Mar 9, 2024
    Cite
    ShivamYadav11321 (2024). IMDB Dataset [Dataset]. https://www.kaggle.com/datasets/shivamyadav11321/imdb-dataset
    Croissant
    Dataset updated
    Mar 9, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    ShivamYadav11321
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset was created by ShivamYadav11321.

    Released under CC0: Public Domain

  16. Sentiment analysis in Galaxy with IMDB movie review dataset

    • data.niaid.nih.gov
    Updated Aug 4, 2022
    Cite
    Kaivan Kamali (2022). Sentiment analysis in Galaxy with IMDB movie review dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4477880
    Dataset updated
    Aug 4, 2022
    Dataset authored and provided by
    Kaivan Kamali
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    IMDB movie review sentiment classification dataset (Andrew L. Maas, Raymond E. Daly, Peter T. Pham, Dan Huang, Andrew Y. Ng, and Christopher Potts. (2011). Learning Word Vectors for Sentiment Analysis. The 49th Annual Meeting of the Association for Computational Linguistics (ACL 2011)). For more information please refer to: https://ai.stanford.edu/~amaas/data/sentiment/

    The IMDB dataset was modified as follows to prepare it for use in a Galaxy Training Tutorial (https://training.galaxyproject.org/):

    The top 50 words are excluded (mostly stop words), and the next 10,000 most frequent words are included. Reviews are limited to 500 words (longer reviews are trimmed and shorter reviews are padded). 25,000 reviews each are used for training and testing. Files are in TSV (tab-separated values) format to be consumed by Galaxy (www.usegalaxy.org).
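    The preprocessing described above can be sketched roughly as follows. This is a minimal illustration, not the tutorial's actual pipeline: the function names, the choice to drop out-of-vocabulary words, truncation from the end, right-side padding, and the padding id 0 are all assumptions that the description does not specify.

```python
from collections import Counter


def build_vocab(token_lists, skip_top=50, keep=10_000):
    """Map words to integer ids, skipping the `skip_top` most frequent words
    and keeping the next `keep` words by frequency (ids start at 1)."""
    counts = Counter(tok for toks in token_lists for tok in toks)
    ranked = [word for word, _ in counts.most_common()]
    kept = ranked[skip_top:skip_top + keep]
    return {word: idx for idx, word in enumerate(kept, start=1)}


def encode(tokens, vocab, max_len=500, pad_id=0):
    """Encode one review as exactly `max_len` ids.

    Assumptions: words outside the vocabulary are dropped, long reviews are
    cut at the end, and short reviews are padded on the right with `pad_id`.
    """
    ids = [vocab[tok] for tok in tokens if tok in vocab][:max_len]
    return ids + [pad_id] * (max_len - len(ids))


# Hypothetical toy reviews with small limits so the effect is visible.
toy_reviews = [
    ["the", "movie", "was", "great"],
    ["the", "plot", "was", "dull"],
]
toy_vocab = build_vocab(toy_reviews, skip_top=2, keep=3)
encoded_first = encode(toy_reviews[0], toy_vocab, max_len=5)
encoded_trimmed = encode(["movie", "great", "plot"], toy_vocab, max_len=2)
```

    With the defaults (`skip_top=50`, `keep=10_000`, `max_len=500`) the same functions mirror the numbers quoted in the description.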

  17. Data from: imdb

    • huggingface.co
    Updated Dec 17, 2021
    + more versions
    Cite
    Andrew Zhang (2021). imdb [Dataset]. https://huggingface.co/datasets/zapsdcn/imdb
    Croissant
    Dataset updated
    Dec 17, 2021
    Authors
    Andrew Zhang
    Description

    zapsdcn/imdb dataset hosted on Hugging Face and contributed by the HF Datasets community

  18. PostgreSQL Dump of IMDB Data for JOB Workload

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Nov 22, 2023
    Cite
    Marcus, Ryan (2023). PostgreSQL Dump of IMDB Data for JOB Workload [Dataset]. http://doi.org/10.7910/DVN/2QYZBT
    Dataset updated
    Nov 22, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Marcus, Ryan
    Description

    This is a dump generated by pg_dump -Fc of the IMDb data used in the "How Good are Query Optimizers, Really?" paper. PostgreSQL compatible SQL queries and scripts to automatically create a VM with this dataset can be found here: https://git.io/imdb

  19. imdb-movie-reviews

    • huggingface.co
    Updated Aug 23, 2024
    Cite
    Ajay Karthick Senthil Kumar (2024). imdb-movie-reviews [Dataset]. https://huggingface.co/datasets/ajaykarthick/imdb-movie-reviews
    Croissant
    Dataset updated
    Aug 23, 2024
    Authors
    Ajay Karthick Senthil Kumar
    Description

    IMDB Movie Reviews

    This is a dataset for binary sentiment classification containing a substantial amount of data. It contains 50,000 highly polar movie reviews for training text classification models. The dataset is downloaded from https://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz. The data is processed and split into training and test datasets (20% test split). The training dataset contains 40,000 reviews and the test dataset contains 10,000… See the full description on the dataset page: https://huggingface.co/datasets/ajaykarthick/imdb-movie-reviews.
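    The split sizes quoted above (40,000 training and 10,000 test reviews out of 50,000) correspond to a test fraction of 0.2, that is 20%, rather than 0.2 percent. A quick arithmetic check, using a hypothetical helper `split_sizes` that is not part of the dataset's own code:

```python
def split_sizes(total, test_fraction):
    """Return (train_count, test_count) for a given test fraction."""
    test_count = round(total * test_fraction)
    return total - test_count, test_count


# A fraction of 0.2 reproduces the sizes stated in the description.
train_n, test_n = split_sizes(50_000, 0.2)

# A literal 0.2 percent (fraction 0.002) would give a far smaller test set.
tiny_train_n, tiny_test_n = split_sizes(50_000, 0.002)
```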

  20. IMDB Shows data with scenes and locations ontology

    • zenodo.org
    • data.niaid.nih.gov
    bin
    Updated Oct 26, 2020
    Cite
    Seth van der Bijl; Alex Hoorn; Ramon Cremers (2020). IMDB Shows data with scenes and locations ontology [Dataset]. http://doi.org/10.5281/zenodo.4126948
    Available download formats: bin
    Dataset updated
    Oct 26, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Seth van der Bijl; Alex Hoorn; Ramon Cremers
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    We proudly present the IMDB show ontology. This ontology is based on IMDB data and geocoded location data for many scenes from shows, which was not previously available in a single dataset. The ontology is extensively documented in our GitHub repository: https://github.com/AlexHoorn/group51-kdd. Relations are aligned with the foaf and schema ontologies, and every show is explicitly aligned with Wikidata via an owl:sameAs predicate.

    For the contents and structure of this ontology we would kindly refer you here: https://github.com/AlexHoorn/MovieLocationsOntology

    For the creation and data in this ontology we would kindly refer you here: https://github.com/AlexHoorn/MovieLocationsOntology/tree/main/data

    We highly recommend visiting our movie location app to explore this data.

Data from: imdb (stanfordnlp/imdb)
19 scholarly articles cite this dataset (View in Google Scholar)