https://choosealicense.com/licenses/other/
Dataset Card for "imdb"
Dataset Summary
Large Movie Review Dataset. This is a dataset for binary sentiment classification containing substantially more data than previous benchmark datasets. We provide a set of 25,000 highly polar movie reviews for training, and 25,000 for testing. There is additional unlabeled data for use as well.
Supported Tasks and Leaderboards
More Information Needed
Languages
More Information Needed
Dataset Structure… See the full description on the dataset page: https://huggingface.co/datasets/stanfordnlp/imdb.
https://choosealicense.com/licenses/other/
This is the sentiment analysis dataset based on IMDB reviews initially released by Stanford University. This is a dataset for binary sentiment classification containing substantially more data than previous benchmark datasets. We provide a set of 25,000 highly polar movie reviews for training, and 25,000 for testing. There is additional unlabeled data for use as well. Raw text and already processed bag of words formats are provided. See the README file contained in the release for more… See the full description on the dataset page: https://huggingface.co/datasets/scikit-learn/imdb.
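For readers unfamiliar with the term, a bag-of-words representation keeps only how often each word occurs and discards word order. The snippet below is a minimal illustrative sketch, not the file format shipped in the release; the function name `bag_of_words` and the tokenization rule are assumptions made for this example.

```python
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Count lowercase word occurrences, ignoring order and punctuation."""
    # Assumed tokenization: runs of letters, digits, and apostrophes.
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


bow = bag_of_words("A great film. Great cast, great score!")
print(bow.most_common(1))
```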
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
https://crawlfeeds.com/privacy_policy
IMDB Movie Reviews
This is a dataset for binary sentiment classification. It contains 50,000 highly polar movie reviews for training text classification models. The dataset was downloaded from https://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz and then processed and split into training and test sets (20% test split). The training set contains 40,000 reviews and the test set contains 10,000… See the full description on the dataset page: https://huggingface.co/datasets/ajaykarthick/imdb-movie-reviews.
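As a quick sanity check on the quoted sizes, 10,000 test reviews out of 50,000 in total corresponds to a 20% test split. The sketch below only does that arithmetic; the function name `holdout_fraction` is hypothetical and is not part of the dataset's tooling.

```python
def holdout_fraction(n_train: int, n_test: int) -> float:
    """Return the share of all examples that fall in the test set."""
    total = n_train + n_test
    if total == 0:
        raise ValueError("at least one example is required")
    return n_test / total


# Sizes quoted in the description: 40,000 training and 10,000 test reviews.
fraction = holdout_fraction(40_000, 10_000)
print(f"test split: {fraction:.0%}")
```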
https://creativecommons.org/publicdomain/zero/1.0/
"Movie Recommendation on the IMDB Dataset: A Journey into Machine Learning" is an exciting project focused on leveraging the IMDB Dataset for developing an advanced movie recommendation system. This project aims to explore the vast potential of machine learning techniques in providing personalized movie recommendations to users.
The IMDB Dataset, comprising a wealth of movie information including genres, ratings, and user reviews, serves as the foundation for this project. By harnessing the power of machine learning algorithms and data analysis, the project seeks to build a recommendation system that can accurately suggest movies tailored to each individual's preferences.
This is the same IMDB dataset as the ImDb Movie Reviews Dataset; it contains the movie reviews.
The original dataset contains text files for training and testing, but I created two CSV files from those text files to make the task easier ✌️. Now you only need to download the files and apply your model. Each file contains 25,000 reviews, labeled 0 for negative and 1 for positive. Each file has two columns named 0 and 1: column 0 holds the reviews and column 1 holds the labels.
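Reading such a file might look like the sketch below. It assumes the CSV has a header row whose column names are literally `0` and `1`, which the description suggests but does not confirm; the sample rows and the function name `read_reviews` are invented for illustration.

```python
import csv
import io

# Hypothetical two-row sample in the layout described above:
# a header row "0,1", then one review and its label per row.
SAMPLE_CSV = (
    '0,1\n'
    '"A moving, beautifully acted film.",1\n'
    '"Dull plot, wooden dialogue.",0\n'
)


def read_reviews(handle):
    """Return (texts, labels) from a CSV whose columns are named 0 and 1."""
    texts, labels = [], []
    for row in csv.DictReader(handle):
        texts.append(row["0"])        # column 0: review text
        labels.append(int(row["1"]))  # column 1: 0 = negative, 1 = positive
    return texts, labels


texts, labels = read_reviews(io.StringIO(SAMPLE_CSV))
print(len(texts), labels)
```

For a real file, the same function can be given a handle opened with `open(path, newline="", encoding="utf-8")`; the actual file names are not stated in the description.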
https://choosealicense.com/licenses/other/
Q-b1t/IMDB-Dataset-of-50K-Movie-Reviews-Backup dataset hosted on Hugging Face and contributed by the HF Datasets community
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset was created by Sidra Kousar
https://choosealicense.com/licenses/other/
title.akas.csv
titleId (string) - a tconst, an alphanumeric unique identifier of the title
ordering (integer) - a number to uniquely identify rows for a given titleId
title (string) - the localized title
region (string) - the region for this version of the title
language (string) - the language of the title
types (array) - Enumerated set of attributes for this alternative title. One or more of the following: "alternative", "dvd", "festival", "tv", "video", "working", "original"… See the full description on the dataset page: https://huggingface.co/datasets/labofsahil/IMDb-Dataset.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
IMDB movie review sentiment classification dataset (Andrew L. Maas, Raymond E. Daly, Peter T. Pham, Dan Huang, Andrew Y. Ng, and Christopher Potts. (2011). Learning Word Vectors for Sentiment Analysis. The 49th Annual Meeting of the Association for Computational Linguistics (ACL 2011)). For more information please refer to: https://ai.stanford.edu/~amaas/data/sentiment/
The IMDB dataset was modified as follows to prepare it for use in a Galaxy Training Tutorial (https://training.galaxyproject.org/):
The 50 most frequent words are excluded (mostly stop words), and the next 10,000 most frequent words are kept. Reviews are limited to 500 words (longer reviews are trimmed and shorter reviews are padded). 25,000 reviews each are used for training and testing. Files are in TSV (tab-separated values) format to be consumed by Galaxy (www.usegalaxy.org).
Open Database License (ODbL) v1.0: https://www.opendatacommons.org/licenses/odbl/1.0/
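The preprocessing steps described above can be sketched as follows. This is an illustration under stated assumptions, not the code used for the Galaxy tutorial: the function names, the use of id 0 for padding, dropping out-of-vocabulary words, and trimming and padding at the end of a review are all choices made for this example.

```python
from collections import Counter


def build_vocab(tokenized_reviews, skip_top=50, num_words=10_000):
    """Map words to integer ids, dropping the `skip_top` most frequent
    words and keeping the next `num_words`. Id 0 is reserved for padding."""
    counts = Counter(word for review in tokenized_reviews for word in review)
    ranked = [word for word, _ in counts.most_common()]
    kept = ranked[skip_top:skip_top + num_words]
    return {word: idx for idx, word in enumerate(kept, start=1)}


def encode(review, vocab, max_len=500, pad_id=0):
    """Convert one tokenized review to a fixed-length list of ids.
    Assumptions: out-of-vocabulary words are dropped, long reviews are
    trimmed at the end, and short reviews are padded at the end."""
    ids = [vocab[word] for word in review if word in vocab][:max_len]
    return ids + [pad_id] * (max_len - len(ids))


# Tiny demonstration with scaled-down limits (skip 2 words, keep 3, length 4).
reviews = [
    ["the", "film", "was", "great"],
    ["the", "film", "was", "bad"],
    ["the", "plot", "was", "thin"],
]
vocab = build_vocab(reviews, skip_top=2, num_words=3)
encoded = encode(reviews[0], vocab, max_len=4)
print(vocab, encoded)
```

With the defaults (`skip_top=50`, `num_words=10_000`, `max_len=500`) the same functions follow the limits quoted in the description.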
License information was derived automatically
IMDB dataset of the top 10000 movies. It contains the basic details as updated on IMDB.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘IMDB Movies Dataset’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/harshitshankhdhar/imdb-dataset-of-top-1000-movies-and-tv-shows on 13 November 2021.
--- Dataset description provided by original source is as follows ---
IMDB Dataset of top 1000 movies and tv shows. You can find the EDA Process on - https://www.kaggle.com/harshitshankhdhar/eda-on-imdb-movies-dataset
Please consider upvoting if you found it useful.
Data:
- Poster_Link - Link to the poster that IMDB uses
- Series_Title - Name of the movie
- Released_Year - Year in which the movie was released
- Certificate - Certificate earned by the movie
- Runtime - Total runtime of the movie
- Genre - Genre of the movie
- IMDB_Rating - Rating of the movie on the IMDB site
- Overview - Mini story/summary
- Meta_score - Score earned by the movie
- Director - Name of the director
- Star1, Star2, Star3, Star4 - Names of the stars
- No_of_votes - Total number of votes
- Gross - Money earned by the movie
--- Original source retains full ownership of the source dataset ---
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Dataset Card for IMDb Movie Dataset: All Movies by Genre
Dataset Summary
This dataset is an adapted version of "IMDb Movie Dataset: All Movies by Genre" found at: https://www.kaggle.com/datasets/rajugc/imdb-movies-dataset-based-on-genre?select=history.csv. Within the dataset, the movie title and year columns were combined, the genre was extracted from the separate CSV files, the pre-existing genre column was renamed to expanded-genres, any movies missing a description… See the full description on the dataset page: https://huggingface.co/datasets/jquigl/imdb-genres.
https://crawlfeeds.com/privacy_policy
Unlock one of the most comprehensive movie datasets available—4.5 million structured IMDb movie records, extracted and enriched for data science, machine learning, and entertainment research.
This dataset includes a vast collection of global movie metadata, including details on title, release year, genre, country, language, runtime, cast, directors, IMDb ratings, reviews, and synopsis. Whether you're building a recommendation engine, benchmarking trends, or training AI models, this dataset is designed to give you deep and wide access to cinematic data across decades and continents.
Perfect for use in film analytics, OTT platforms, review sentiment analysis, knowledge graphs, and LLM fine-tuning, the dataset is cleaned, normalized, and exportable in multiple formats.
Genres: Drama, Comedy, Horror, Action, Sci-Fi, Documentary, and more
Train LLMs or chatbots on cinematic language and metadata
Build or enrich movie recommendation engines
Run cross-lingual or multi-region film analytics
Benchmark genre popularity across time periods
Power academic studies or entertainment dashboards
Feed into knowledge graphs, search engines, or NLP pipelines
https://academictorrents.com/nolicensespecified
A dataset for binary sentiment classification containing 25,000 highly polarized movie reviews for training, and 25,000 for testing. There is additional unlabeled data for use as well.
https://creativecommons.org/publicdomain/zero/1.0/
Column Name | Description |
---|---|
Rank | The ranking of the movie based on popularity or ratings. |
Title | The title of the movie. |
Genre | The genre(s) of the movie (e.g., Action, Adventure, Sci-Fi). |
Description | A brief description or synopsis of the movie. |
Director | The director of the movie. |
Actors | The main cast or leading actors in the movie. |
Year | The release year of the movie. |
Runtime (Minutes) | The runtime of the movie in minutes. |
Rating | The IMDb user rating of the movie on a scale from 1 to 10. |
Votes | The number of user votes for the movie on IMDb. |
Revenue (Millions) | The box office revenue of the movie in millions of dollars. |
Metascore | The Metascore of the movie, representing the aggregated critic reviews score on a scale of 1 to 100. |
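As an illustration of working with the columns listed above, the sketch below averages `Rating` per genre. It assumes the data is available as a CSV with these column names and that `Genre` holds a comma-separated list, as the example in the table suggests; the sample rows and the function name `mean_rating_by_genre` are invented for this example.

```python
import csv
import io
from collections import defaultdict

# Hypothetical rows using a subset of the columns listed above.
SAMPLE = """Rank,Title,Genre,Year,Rating
1,Movie A,"Action,Adventure",2014,8.0
2,Movie B,"Action,Sci-Fi",2012,7.0
3,Movie C,Drama,2016,6.5
"""


def mean_rating_by_genre(handle):
    """Average the Rating column per genre, splitting Genre on commas."""
    totals = defaultdict(lambda: [0.0, 0])  # genre -> [sum of ratings, count]
    for row in csv.DictReader(handle):
        for genre in row["Genre"].split(","):
            entry = totals[genre.strip()]
            entry[0] += float(row["Rating"])
            entry[1] += 1
    return {genre: total / count for genre, (total, count) in totals.items()}


averages = mean_rating_by_genre(io.StringIO(SAMPLE))
print(averages)
```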
The MovieLens-IMDB dataset is a collection of user ratings for movies, with each rating indicating the user's preference for the movie.
Accuracies of methods after ensemble for IMDB dataset.