100+ datasets found
  1. Data from: imdb

    • huggingface.co
    Updated Aug 3, 2003
    Cite
    Stanford NLP (2003). imdb [Dataset]. https://huggingface.co/datasets/stanfordnlp/imdb
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
    Dataset updated
    Aug 3, 2003
    Dataset authored and provided by
    Stanford NLP
    License

    https://choosealicense.com/licenses/other/

    Description

    Dataset Card for "imdb"

      Dataset Summary
    

    Large Movie Review Dataset. This is a dataset for binary sentiment classification containing substantially more data than previous benchmark datasets. We provide a set of 25,000 highly polar movie reviews for training, and 25,000 for testing. There is additional unlabeled data for use as well.

      Supported Tasks and Leaderboards
    

    More Information Needed

      Languages
    

    More Information Needed

      Dataset Structure… See the full description on the dataset page: https://huggingface.co/datasets/stanfordnlp/imdb.
    
  2. Data from: imdb

    • huggingface.co
    Updated May 10, 2025
    Cite
    scikit-learn (2025). imdb [Dataset]. https://huggingface.co/datasets/scikit-learn/imdb
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
    Dataset updated
    May 10, 2025
    Dataset authored and provided by
    scikit-learn
    License

    https://choosealicense.com/licenses/other/

    Description

    This is the sentiment analysis dataset based on IMDB reviews initially released by Stanford University. This is a dataset for binary sentiment classification containing substantially more data than previous benchmark datasets. We provide a set of 25,000 highly polar movie reviews for training, and 25,000 for testing. There is additional unlabeled data for use as well. Raw text and already processed bag of words formats are provided. See the README file contained in the release for more… See the full description on the dataset page: https://huggingface.co/datasets/scikit-learn/imdb.

  3. IMDb Dataset (2024) updated

    • kaggle.com
    zip
    Updated Jul 6, 2024
    Cite
    Parth (2024). IMDb Dataset (2024) updated [Dataset]. https://www.kaggle.com/datasets/parthdande/imdb-dataset-2024-updated
    Explore at:
    Available download formats: zip (335942 bytes)
    Dataset updated
    Jul 6, 2024
    Authors
    Parth
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset contains detailed information about movies listed on IMDb, including titles, genres, release dates, and ratings. It also includes user reviews and ratings, making it an excellent resource for sentiment analysis and trend analysis in the movie industry. This dataset can be used to gain insights into movie trends, audience preferences, and the correlation between movie attributes and ratings. The second file has an additional feature called poster_src, which is a link to the movie's poster image. The second file is larger than the first and covers a wider range of movies.

  4. IMDB Dataset - Sentiment Analysis

    • kaggle.com
    zip
    Updated Dec 19, 2023
    Cite
    Bhavik Jikadara (2023). IMDB Dataset - Sentiment Analysis [Dataset]. https://www.kaggle.com/datasets/bhavikjikadara/imdb-dataset-sentiment-analysis
    Explore at:
    Available download formats: zip (26962657 bytes)
    Dataset updated
    Dec 19, 2023
    Authors
    Bhavik Jikadara
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    The IMDb dataset is a collection of 50,000 reviews from the Internet Movie Database (IMDb). The reviews are labeled as either positive or negative and are split into two sets of 25,000 reviews for training and testing. Each set contains an equal number of positive and negative reviews.

    The IMDb dataset is a binary sentiment analysis dataset for natural language processing or text analytics. It contains more data than previous benchmark datasets.

    IMDb is a rich source of film data that includes cast and crew lists, movie release dates, box office information, plot summaries, trailers, actor and director biographies, and other trivia. Information on IMDb comes from a variety of sources, such as filmmakers, film studios, on-screen credits, and other official sources.

  5. IMDB-Dataset-of-50K-Movie-Reviews-Backup

    • huggingface.co
    Cite
    Q-b1t, IMDB-Dataset-of-50K-Movie-Reviews-Backup [Dataset]. https://huggingface.co/datasets/Q-b1t/IMDB-Dataset-of-50K-Movie-Reviews-Backup
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
    Authors
    Q-b1t
    License

    https://choosealicense.com/licenses/other/

    Description

    Q-b1t/IMDB-Dataset-of-50K-Movie-Reviews-Backup dataset hosted on Hugging Face and contributed by the HF Datasets community

  6. IMDB Movies From 1920 to 2025

    • kaggle.com
    zip
    Updated Mar 27, 2025
    Cite
    Raed Addala (2025). IMDB Movies From 1920 to 2025 [Dataset]. https://www.kaggle.com/datasets/raedaddala/imdb-movies-from-1960-to-2023
    Explore at:
    Available download formats: zip (46688739 bytes)
    Dataset updated
    Mar 27, 2025
    Authors
    Raed Addala
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Over 60,000 Movies, 100+ Years of Data, and Rich Metadata!

    Links:

    For details about the scraping process, explore the complete code repository on GitHub.

    About the Dataset

    This dataset provides annual data for the most popular 500–600 movies per year from 1920 to 2025, extracted from IMDb. It includes over 60,000 movies, spanning more than 100 years of cinematic history. Each year’s data is divided into three CSV files for flexibility and ease of use:
    - imdb_movies_[year].csv: Basic movie details.
    - advanced_movies_details_[year].csv: Comprehensive metadata and financial details.
    - merged_movies_data_[year].csv: A unified dataset combining both files.

    File Descriptions

    1. imdb_movies_[year].csv

    Essential movie information, including:
    - Title: Movie title.
    - Description: Movie description.
    - meta_score: IMDb's meta score.
    - Movie Link: IMDb URL for the movie.
    - Year: Year of release.
    - Duration: Runtime (in minutes).
    - MPA: Motion Picture Association rating (e.g., PG, R).
    - Rating: IMDb rating (scale of 1–10).
    - Votes: Total user votes on IMDb.

    2. advanced_movies_details_[year].csv

    Detailed movie metadata:
    - Link: IMDb URL (for linking with other data).
    - budget: Production budget (in USD).
    - grossWorldWide: Global box office revenue.
    - gross_US_Canada: North American box office earnings.
    - opening_weekend_Gross: Opening weekend revenue.
    - directors: List of directors.
    - writers: List of writers.
    - stars: Main cast members.
    - genres: Movie genres.
    - countries_origin: Countries of production.
    - filming_locations: Primary filming locations.
    - production_companies: Associated production companies.
    - Languages: Languages spoken in the movie.
    - Award_information: Information about awards, nominations and wins.
    - release_date: Official release date.

    3. merged_movies_data_[year].csv

    A unified dataset combining all columns from the previous two files:
    - Basic Details: Title, Year, Rating, Votes.
    - Advanced Features: budget, grossWorldWide, directors, genres, and awards.

    Data Structure

    Template Columns:
    - imdb_movies_[year].csv:
    Title, Year, Duration, MPA, Rating, Votes, meta_score, description, Movie Link

    • advanced_movies_details_[year].csv:
      link, writers, directors, stars, budget, opening_weekend_Gross, grossWorldWide, gross_US_Canada, release_date, countries_origin, filming_locations, production_company, awards_content, genres, Languages

    • merged_movies_data_[year].csv:
      Title, Year, Duration, MPA, Rating, Votes, meta_score, description, Movie Link, writers, directors, stars, budget, opening_weekend_Gross, grossWorldWide, gross_US_Canada, release_date, countries_origin, filming_locations, production_company, awards_content, genres, Languages
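The relationship between the three per-year files can be sketched as a join on the IMDb URL. This is a minimal illustration and not the dataset author's actual merge script: it assumes the key columns are named `Movie Link` and `link` as in the template columns above, and the sample rows (titles, URLs, values) are hypothetical placeholders.

```python
import csv
import io

# Hypothetical sample rows; real files would be read with open(path, newline="").
BASIC_CSV = """Title,Year,Rating,Votes,Movie Link
Example Movie A,1999,8.1,1200,https://www.imdb.com/title/tt0000001/
Example Movie B,2005,6.4,300,https://www.imdb.com/title/tt0000002/
"""

ADVANCED_CSV = """link,budget,genres
https://www.imdb.com/title/tt0000001/,1000000,Drama
"""


def merge_on_link(basic_text, advanced_text):
    """Left-join basic rows to advanced rows using the IMDb URL as the key."""
    advanced_by_link = {
        row["link"]: row for row in csv.DictReader(io.StringIO(advanced_text))
    }
    merged = []
    for row in csv.DictReader(io.StringIO(basic_text)):
        combined = dict(row)
        extra = advanced_by_link.get(row["Movie Link"], {})
        # Copy every advanced column except the duplicate join key.
        for key, value in extra.items():
            if key != "link":
                combined[key] = value
        merged.append(combined)
    return merged


merged_rows = merge_on_link(BASIC_CSV, ADVANCED_CSV)
```

A left join keeps movies that have no advanced details, so the output has one row per basic record; whether the published merged files behave the same way is not stated in the description.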

    Updates

    The dataset is updated annually in December to include the latest data.

    Applications

    This dataset is ideal for:
    - Trend Analysis: Explore changes in the movie industry over more than a century.
    - Predictive Modeling: Build models to forecast box office revenue, ratings, or awards.
    - Recommendation Systems: Use attributes like genres, cast, and ratings for personalized recommendations.
    - Comparative Analysis: Study differences across eras, genres, or regions.

    Dataset Features

    • Over 60,000 Movies: Detailed data from 1920 to 2025.
    • Rich Metadata: Financial, creative, and recognition-related attributes.
    • User-friendly: Modular files for tailored use or comprehensive merged files.
    • Consistency: Uniform structure enables seamless analysis.

    Notes

    • For issues, suggestions, or feature requests, please feel free to contact me: send me an email or open an issue on GitHub. Your input is highly appreciated.
  7. IMDB movie details dataset

    • crawlfeeds.com
    csv, zip
    Updated Nov 9, 2025
    Cite
    Crawl Feeds (2025). IMDB movie details dataset [Dataset]. https://crawlfeeds.com/datasets/imdb-movie-details-dataset
    Explore at:
    Available download formats: zip, csv
    Dataset updated
    Nov 9, 2025
    Dataset authored and provided by
    Crawl Feeds
    License

    https://crawlfeeds.com/privacy_policy

    Description
    The IMDB Movie Details Dataset is a comprehensive collection of movie datasets that offers a treasure trove of information about movies, TV shows, and streaming content listed on IMDB. This dataset includes detailed data such as titles, release years, genres, cast, crew, ratings, and more, making it a go-to resource for film and entertainment enthusiasts. Ideal for data analysis, IMDB movie dataset applications span machine learning projects, predictive modeling, and insights into industry trends.
    Researchers can explore patterns in movie ratings and genre popularity, while developers can use the dataset to build recommendation systems or applications. Movie buffs can dive deep into historical and contemporary trends in the world of cinema. This dataset not only supports academic and professional pursuits but also opens doors for creative projects in storytelling, content creation, and audience engagement. Whether you’re a developer, researcher, or film enthusiast, the IMDB movie dataset is a powerful tool for uncovering trends and gaining deeper insights into the evolving entertainment landscape.
  8. imdb-movie-reviews

    • huggingface.co
    Updated Mar 31, 2023
    Cite
    Ajay Karthick Senthil Kumar (2023). imdb-movie-reviews [Dataset]. https://huggingface.co/datasets/ajaykarthick/imdb-movie-reviews
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
    Dataset updated
    Mar 31, 2023
    Authors
    Ajay Karthick Senthil Kumar
    Description

    IMDB Movie Reviews

    This is a large dataset for binary sentiment classification. It contains 50,000 highly polar movie reviews for training text classification models. The dataset is downloaded from https://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz. The data is processed and split into training and test datasets (20% test split). The training dataset contains 40,000 reviews and the test dataset contains 10,000… See the full description on the dataset page: https://huggingface.co/datasets/ajaykarthick/imdb-movie-reviews.
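The 40,000/10,000 figures correspond to holding out one fifth of the 50,000 reviews. A minimal sketch of such a split follows; the function name, the seed, and the placeholder review strings are assumptions for illustration, and the dataset author's actual procedure (for example, whether the split was stratified by label) is not described.

```python
import random


def split_reviews(reviews, test_fraction=0.2, seed=0):
    """Shuffle a copy of the reviews and hold out test_fraction of them."""
    shuffled = list(reviews)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    # The first n_test shuffled items become the test set, the rest training.
    return shuffled[n_test:], shuffled[:n_test]


# Placeholder strings standing in for the 50,000 real reviews.
reviews = [f"review {i}" for i in range(50_000)]
train, test = split_reviews(reviews)
```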

  9. IMDb Users' Ratings Dataset

    • ieee-dataport.org
    Updated Nov 21, 2025
    Cite
    vahid baghi (2025). IMDb Users' Ratings Dataset [Dataset]. https://ieee-dataport.org/open-access/imdb-users-ratings-dataset
    Explore at:
    Dataset updated
    Nov 21, 2025
    Authors
    vahid baghi
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    9

  10. mini-imdb

    • huggingface.co
    Updated Sep 30, 2022
    Cite
    Daniel Vila (2022). mini-imdb [Dataset]. https://huggingface.co/datasets/dvilasuero/mini-imdb
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
    Dataset updated
    Sep 30, 2022
    Authors
    Daniel Vila
    Description

    dvilasuero/mini-imdb dataset hosted on Hugging Face and contributed by the HF Datasets community

  11. IMDB 5000 Movie Dataset

    • kaggle.com
    zip
    Updated Dec 16, 2017
    Cite
    Yueming (2017). IMDB 5000 Movie Dataset [Dataset]. https://www.kaggle.com/datasets/carolzhangdc/imdb-5000-movie-dataset
    Explore at:
    Available download formats: zip (567524 bytes)
    Dataset updated
    Dec 16, 2017
    Authors
    Yueming
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    Dataset

    This dataset was created by Yueming

    Released under Database: Open Database, Contents: Database Contents

    Contents

  12. Data from: imdb

    • huggingface.co
    Cite
    Massive Text Embedding Benchmark, imdb [Dataset]. https://huggingface.co/datasets/mteb/imdb
    Explore at:
    Dataset authored and provided by
    Massive Text Embedding Benchmark
    License

    https://choosealicense.com/licenses/unknown/

    Description

    ImdbClassification An MTEB dataset Massive Text Embedding Benchmark

    Large Movie Review Dataset

    Task category t2c

    Domains Reviews, Written

    Reference http://www.aclweb.org/anthology/P11-1015

      How to evaluate on this task
    

    You can evaluate an embedding model on this dataset using the following code:

        import mteb

        task = mteb.get_tasks(["ImdbClassification"])
        evaluator = mteb.MTEB(task)

        model = mteb.get_model(YOUR_MODEL)
        evaluator.run(model)

    To learn more… See the full description on the dataset page: https://huggingface.co/datasets/mteb/imdb.

  13. PostgreSQL Dump of IMDB Data for JOB Workload

    • dataverse.harvard.edu
    • search.dataone.org
    Updated Sep 24, 2019
    Cite
    Ryan Marcus (2019). PostgreSQL Dump of IMDB Data for JOB Workload [Dataset]. http://doi.org/10.7910/DVN/2QYZBT
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
    Dataset updated
    Sep 24, 2019
    Dataset provided by
    Harvard Dataverse
    Authors
    Ryan Marcus
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This is a dump generated by pg_dump -Fc of the IMDb data used in the "How Good are Query Optimizers, Really?" paper. PostgreSQL compatible SQL queries and scripts to automatically create a VM with this dataset can be found here: https://git.io/imdb

  14. IMDb Movies Metadata Dataset – 4.5M Records (Global Coverage)

    • crawlfeeds.com
    csv, zip
    Updated Nov 9, 2025
    Cite
    Crawl Feeds (2025). IMDb Movies Metadata Dataset – 4.5M Records (Global Coverage) [Dataset]. https://crawlfeeds.com/datasets/imdb-movies-metadata-dataset-4-5m-records-global-coverage
    Explore at:
    Available download formats: csv, zip
    Dataset updated
    Nov 9, 2025
    Dataset authored and provided by
    Crawl Feeds
    License

    https://crawlfeeds.com/privacy_policy

    Description

    Unlock one of the most comprehensive movie datasets available—4.5 million structured IMDb movie records, extracted and enriched for data science, machine learning, and entertainment research.

    This dataset includes a vast collection of global movie metadata, including details on title, release year, genre, country, language, runtime, cast, directors, IMDb ratings, reviews, and synopsis. Whether you're building a recommendation engine, benchmarking trends, or training AI models, this dataset is designed to give you deep and wide access to cinematic data across decades and continents.

    Perfect for use in film analytics, OTT platforms, review sentiment analysis, knowledge graphs, and LLM fine-tuning, the dataset is cleaned, normalized, and exportable in multiple formats.

    What’s Included:

    • Genres: Drama, Comedy, Horror, Action, Sci-Fi, Documentary, and more

    • Delivery: Direct download

    Use Cases:

    • Train LLMs or chatbots on cinematic language and metadata

    • Build or enrich movie recommendation engines

    • Run cross-lingual or multi-region film analytics

    • Benchmark genre popularity across time periods

    • Power academic studies or entertainment dashboards

    • Feed into knowledge graphs, search engines, or NLP pipelines

  15. IMDb Movie Reviews Genres Description and Emotions

    • kaggle.com
    zip
    Updated Mar 27, 2024
    Cite
    Fahad Rehman (2024). IMDb Movie Reviews Genres Description and Emotions [Dataset]. https://www.kaggle.com/datasets/fahadrehman07/movie-reviews-and-emotion-dataset
    Explore at:
    Available download formats: zip (32966193 bytes)
    Dataset updated
    Mar 27, 2024
    Authors
    Fahad Rehman
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    🟡Please upvote the dataset if you like it.🍒

    The "IMDB Dataset of Movies Reviews and Translation" dataset has been expanded significantly and is now available on Kaggle in a modified version. Three new columns have been added to the dataset: genres, descriptions, and emotions. The original dataset only had four columns: ratings, reviews, movies, and resenhas. This extension adds to the dataset's richness and offers insightful information about movie genres, in-depth synopses, and the sentimentality of the reviews.

    The addition of the Genres column provides an extensive movie classification that enables scholars and film aficionados to explore particular genres and their traits in greater detail. By examining patterns, trends, and preferences across various genres, analysts can use this data to create more specialized research and moviegoer suggestions.

    The newly added Descriptions column is a valuable addition as it provides textual summaries or synopses of each movie. These descriptions offer a concise overview of the plot, characters, and themes, making it easier for users to understand and evaluate movies of interest. Researchers can leverage this information to conduct sentiment analysis, topic modeling, or recommendation systems based on movie summaries.

    Finally, the Emotions column adds an intriguing dimension to the dataset. By capturing the emotional tone expressed within each description, this column allows for a deeper understanding of sentiments toward the movies. Sentiment analysis techniques can be applied to this data, enabling researchers to gain insights into the emotions, such as joy, anger, and sadness, associated with different movies. This information can be particularly valuable for filmmakers, production companies, and marketers looking to gauge audience reactions and tailor their strategies accordingly, and especially for moviegoers who like to choose movies based on emotions.

    Overall, the expanded version of the "50k Movie Reviews" dataset offers a wealth of new information that fosters detailed analysis and exploration of movie genres, descriptions, and emotional responses. This dataset presents a valuable resource for researchers, data scientists, and movie enthusiasts alike, enabling a deeper understanding of the movie landscape and facilitating the development of innovative tools and applications in the field of movie analysis and recommendation systems.

  16. IMDb-Dataset

    • huggingface.co
    Updated Nov 9, 2024
    Cite
    Sahil (2024). IMDb-Dataset [Dataset]. https://huggingface.co/datasets/labofsahil/IMDb-Dataset
    Explore at:
    Dataset updated
    Nov 9, 2024
    Dataset authored and provided by
    Sahil
    License

    https://choosealicense.com/licenses/other/

    Description

    title.akas.csv

    - titleId (string): a tconst, an alphanumeric unique identifier of the title
    - ordering (integer): a number to uniquely identify rows for a given titleId
    - title (string): the localized title
    - region (string): the region for this version of the title
    - language (string): the language of the title
    - types (array): enumerated set of attributes for this alternative title. One or more of the following: "alternative", "dvd", "festival", "tv", "video", "working", "original"…

    See the full description on the dataset page: https://huggingface.co/datasets/labofsahil/IMDb-Dataset.

  17. IMDb Actors and Movies

    • kaggle.com
    zip
    Updated Apr 27, 2024
    Cite
    Rishab Jadhav (2024). IMDb Actors and Movies [Dataset]. https://www.kaggle.com/datasets/rishabjadhav/imdb-actors-and-movies
    Explore at:
    Available download formats: zip (474725643 bytes)
    Dataset updated
    Apr 27, 2024
    Authors
    Rishab Jadhav
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    From IMDB's database, I downloaded two datasets of actors and movies. I then cleaned and merged the datasets for a combined dataset containing known actors and relevant information, including a movie they appeared in.

  18. ‘IMDB Movies Dataset’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Nov 13, 2021
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2021). ‘IMDB Movies Dataset’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-imdb-movies-dataset-f301/9b433bd2/?iid=018-445&v=presentation
    Explore at:
    Dataset updated
    Nov 13, 2021
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of ‘IMDB Movies Dataset’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/harshitshankhdhar/imdb-dataset-of-top-1000-movies-and-tv-shows on 13 November 2021.

    --- Dataset description provided by original source is as follows ---

    Context

    IMDB Dataset of top 1000 movies and tv shows. You can find the EDA Process on - https://www.kaggle.com/harshitshankhdhar/eda-on-imdb-movies-dataset

    Please consider UPVOTE if you found it useful.

    Content

    Data:
    - Poster_Link - Link to the poster that IMDb uses
    - Series_Title - Name of the movie
    - Released_Year - Year in which the movie was released
    - Certificate - Certificate earned by the movie
    - Runtime - Total runtime of the movie
    - Genre - Genre of the movie
    - IMDB_Rating - Rating of the movie on the IMDb site
    - Overview - Mini story/summary
    - Meta_score - Score earned by the movie
    - Director - Name of the director
    - Star1, Star2, Star3, Star4 - Names of the stars
    - No_of_votes - Total number of votes
    - Gross - Money earned by the movie

    Inspiration

    • Analysis of the gross of a movie vs. directors.
    • Analysis of the gross of a movie vs. different stars.
    • Analysis of the No_of_votes of a movie vs. directors.
    • Analysis of the No_of_votes of a movie vs. different stars.
    • Which actors prefer which genre?
    • Which combinations of actors most often earn a good IMDB_Rating?
    • Which combinations of actors earn a good gross?

    --- Original source retains full ownership of the source dataset ---

  19. Sentiment analysis in Galaxy with IMDB movie review dataset

    • data.niaid.nih.gov
    Updated Aug 4, 2022
    Cite
    Kaivan Kamali (2022). Sentiment analysis in Galaxy with IMDB movie review dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4477880
    Explore at:
    Dataset updated
    Aug 4, 2022
    Dataset provided by
    Penn State University
    Authors
    Kaivan Kamali
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    IMDB movie review sentiment classification dataset (Andrew L. Maas, Raymond E. Daly, Peter T. Pham, Dan Huang, Andrew Y. Ng, and Christopher Potts. (2011). Learning Word Vectors for Sentiment Analysis. The 49th Annual Meeting of the Association for Computational Linguistics (ACL 2011)). For more information please refer to: https://ai.stanford.edu/~amaas/data/sentiment/

    The IMDB dataset was modified as follows to prepare it for use in a Galaxy Training Tutorial (https://training.galaxyproject.org/):

    The top 50 words (mostly stop words) are excluded, and the next 10,000 most frequent words are included. Reviews are limited to 500 words (longer reviews are trimmed and shorter reviews are padded). 25,000 reviews each are used for training and testing. Files are in TSV (tab-separated values) format for use in Galaxy (www.usegalaxy.org).
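The preprocessing described above (skip the most frequent words, keep the next block of words, then trim or pad each review to a fixed length) can be sketched in plain Python. This is an illustrative sketch, not the tutorial's actual pipeline: the function names, the reserved ids 0 (padding) and 1 (excluded or unknown word), and padding on the right are all assumptions, and the tiny corpus uses small parameters so the behaviour is easy to follow.

```python
from collections import Counter


def build_vocab(tokenized_reviews, skip_top=50, num_words=10_000):
    """Rank words by frequency, drop the skip_top most common, keep the next num_words."""
    counts = Counter(tok for review in tokenized_reviews for tok in review)
    ranked = [tok for tok, _ in counts.most_common()]
    kept = ranked[skip_top:skip_top + num_words]
    # Ids 0 and 1 are reserved (assumption): 0 = padding, 1 = excluded/unknown word.
    return {tok: i + 2 for i, tok in enumerate(kept)}


def encode(review, vocab, max_len=500, pad_id=0, oov_id=1):
    """Map tokens to ids, trim to max_len, and right-pad shorter reviews."""
    ids = [vocab.get(tok, oov_id) for tok in review][:max_len]
    return ids + [pad_id] * (max_len - len(ids))


# Tiny hypothetical corpus with scaled-down parameters.
corpus = [
    ["the", "film", "was", "great"],
    ["the", "film", "was", "bad"],
    ["the", "plot"],
]
vocab = build_vocab(corpus, skip_top=1, num_words=2)
encoded = [encode(review, vocab, max_len=4) for review in corpus]
```

With the dataset's stated settings, the calls would use `skip_top=50`, `num_words=10_000`, and `max_len=500`.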

  20. IMDB-BINARY

    • huggingface.co
    Updated Mar 13, 2023
    Cite
    Graph Datasets (2023). IMDB-BINARY [Dataset]. https://huggingface.co/datasets/graphs-datasets/IMDB-BINARY
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
    Dataset updated
    Mar 13, 2023
    Dataset authored and provided by
    Graph Datasets
    License

    https://choosealicense.com/licenses/unknown/

    Description

    Dataset Card for IMDB-BINARY (IMDb-B)

      Dataset Summary
    

    The IMDb-B dataset is "a movie collaboration dataset that consists of the ego-networks of 1,000 actors/actresses who played roles in movies in IMDB. In each graph, nodes represent actors/actress, and there is an edge between them if they appear in the same movie. These graphs are derived from the Action and Romance genres".

      Supported Tasks and Leaderboards
    

    IMDb-B should be used for graph classification… See the full description on the dataset page: https://huggingface.co/datasets/graphs-datasets/IMDB-BINARY.
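The graph construction quoted in the IMDB-BINARY summary (actors as nodes, an edge when two actors appear in the same movie, one ego-network per actor) can be sketched as follows. The function names and the toy cast lists are hypothetical, and this is not the code used to build IMDb-B.

```python
from itertools import combinations


def collaboration_edges(casts):
    """Return undirected edges between every pair of actors sharing a movie."""
    edges = set()
    for cast in casts.values():
        # Sorting gives each undirected edge a single canonical (a, b) form.
        for a, b in combinations(sorted(set(cast)), 2):
            edges.add((a, b))
    return edges


def ego_network(edges, ego):
    """Return the ego, its direct collaborators, and all edges among those nodes."""
    neighbours = {b if a == ego else a for a, b in edges if ego in (a, b)}
    nodes = neighbours | {ego}
    return nodes, {(a, b) for a, b in edges if a in nodes and b in nodes}


# Hypothetical cast lists keyed by movie id.
casts = {
    "movie_1": ["A", "B", "C"],
    "movie_2": ["C", "D"],
    "movie_3": ["D", "E"],
}
edges = collaboration_edges(casts)
nodes, ego_edges = ego_network(edges, "C")
```

Here actor E is excluded from the ego-network of C because the two never share a movie.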

Cite
Stanford NLP (2003). imdb [Dataset]. https://huggingface.co/datasets/stanfordnlp/imdb

Data from: imdb

IMDB

stanfordnlp/imdb

Related Article
Explore at:
22 scholarly articles cite this dataset (View in Google Scholar)