Description: This dataset contains information about 616 movies spanning various genres, years of release, and creative talents involved in their production. The dataset is intended for use in data analysis, visualization, and machine learning projects related to the film industry. Each row represents a single movie entry, and the dataset includes the following columns:
- Movie: The title of the movie.
- Year: The year of release for the movie.
- Genres: The genres or categories associated with the movie.
- Certification/Rating: The film's certification or rating according to the relevant rating board or organization.
- IMDb ID: The unique IMDb identifier for the movie.
- Writer: The name(s) of the writer(s) or screenwriter(s) responsible for the movie's screenplay.
- Director: The name of the movie's director.
Potential Use Cases:
- Film industry analysis: Analyze trends in movie genres and ratings over time.
- Predicting movie success: Build predictive models to forecast a movie's success based on its features.
- Recommender systems: Develop movie recommendation systems for users based on their preferences.
- Creative insights: Explore relationships between directors, writers, and movie genres.
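As a rough illustration of the genre-trend use case, the sketch below counts genres per release year. The `Movie`, `Year`, and `Genres` column names come from the description above; the sample rows and the comma delimiter inside `Genres` are assumptions, since the description does not say how multiple genres are stored.

```python
import csv
import io
from collections import Counter, defaultdict

# Hypothetical sample rows; the real file and its genre delimiter may differ.
SAMPLE_CSV = """Movie,Year,Genres
Example A,1999,"Drama, Crime"
Example B,1999,Drama
Example C,2005,"Comedy, Drama"
"""


def genre_counts_by_year(csv_text, delimiter=","):
    """Return {year: Counter of genres} for the rows in csv_text."""
    counts = defaultdict(Counter)
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Split the multi-genre cell and drop empty fragments.
        genres = [g.strip() for g in row["Genres"].split(delimiter) if g.strip()]
        counts[int(row["Year"])].update(genres)
    return counts


counts = genre_counts_by_year(SAMPLE_CSV)
for year in sorted(counts):
    print(year, counts[year].most_common())
```

For the real file, replace `SAMPLE_CSV` with the file contents and adjust `delimiter` once the actual format of `Genres` is known.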
Open Data Commons Attribution License (ODC-By) v1.0: https://www.opendatacommons.org/licenses/by/1.0/
License information was derived automatically
The TMDb (The Movie Database) is a comprehensive movie database that provides information about movies, including details like titles, ratings, release dates, revenue, genres, and much more.
This dataset contains a collection of 1,000,000 movies from the TMDB database.
The dataset is updated daily.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Dataset Card for IMDb Movie Dataset: All Movies by Genre
Dataset Summary
This dataset is an adapted version of "IMDb Movie Dataset: All Movies by Genre" found at: https://www.kaggle.com/datasets/rajugc/imdb-movies-dataset-based-on-genre?select=history.csv. Within the dataset, the movie title and year columns were combined, the genre was extracted from the separate CSV files, the pre-existing genre column was renamed to expanded-genres, any movies missing a description… See the full description on the dataset page: https://huggingface.co/datasets/jquigl/imdb-genres.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
For details about the scraping process, explore the complete code repository on GitHub.
This dataset provides annual data for the most popular 500–600 movies per year from 1920 to 2025, extracted from IMDb. It includes over 60,000 movies, spanning more than 100 years of cinematic history. Each year's data is divided into three CSV files for flexibility and ease of use:
- imdb_movies_[year].csv: Basic movie details.
- advanced_movies_details_[year].csv: Comprehensive metadata and financial details.
- merged_movies_data_[year].csv: A unified dataset combining both files.
imdb_movies_[year].csv
Essential movie information, including:
- Title: Movie title.
- Description: Movie Description.
- méta_score: IMDB's meta score.
- Movie Link: IMDb URL for the movie.
- Year: Year of release.
- Duration: Runtime (in minutes).
- MPA: Motion Picture Association rating (e.g., PG, R).
- Rating: IMDb rating (scale of 1–10).
- Votes: Total user votes on IMDb.
advanced_movies_details_[year].csv
Detailed movie metadata:
- Link: IMDb URL (for linking with other data).
- budget: Production budget (in USD).
- grossWorldWide: Global box office revenue.
- gross_US_Canada: North American box office earnings.
- opening_weekend_Gross: Opening weekend revenue.
- directors: List of directors.
- writers: List of writers.
- stars: Main cast members.
- genres: Movie genres.
- countries_origin: Countries of production.
- filming_locations: Primary filming locations.
- production_companies: Associated production companies.
- Languages: Languages spoken in the movie.
- Award_information: Information about awards, nominations and wins.
- release_date: Official release date.
merged_movies_data_[year].csv
A unified dataset combining all columns from the previous two files:
- Basic Details: Title, Year, Rating, Votes.
- Advanced Features: budget, grossWorldWide, directors, genres, and awards.
Template Columns:
- imdb_movies_[year].csv:
Title, Year, Duration, MPA, Rating, Votes, meta_score, description, Movie Link
- advanced_movies_details_[year].csv:
link, writers, directors, stars, budget, opening_weekend_Gross, grossWorldWide, gross_US_Canada, release_date, countries_origin, filming_locations, production_company, awards_content, genres, Languages
- merged_movies_data_[year].csv:
Title, Year, Duration, MPA, Rating, Votes, meta_score, description, Movie Link, writers, directors, stars, budget, opening_weekend_Gross, grossWorldWide, gross_US_Canada, release_date, countries_origin, filming_locations, production_company, awards_content, genres, Languages
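The merged file is described as the combination of the two per-year files, and the `Link` column is described as the field for linking. A minimal sketch of such a join is shown below, assuming the key is `Movie Link` in the basic file and `link` in the advanced file (as in the template columns). The rows, URLs, and the function name `left_join_on_link` are made up for illustration; this is not the dataset author's actual merge code.

```python
import csv
import io

# Tiny in-memory stand-ins for imdb_movies_[year].csv and
# advanced_movies_details_[year].csv. All rows and URLs are hypothetical.
BASIC = """Title,Year,Rating,Movie Link
Example A,2001,7.1,https://example.org/title/a
Example B,2001,6.4,https://example.org/title/b
"""
ADVANCED = """link,directors,budget
https://example.org/title/a,Jane Doe,1000000
"""


def left_join_on_link(basic_text, advanced_text):
    """Attach advanced columns to each basic row that shares the same URL."""
    advanced = {
        row["link"]: row for row in csv.DictReader(io.StringIO(advanced_text))
    }
    merged = []
    for row in csv.DictReader(io.StringIO(basic_text)):
        extra = advanced.get(row["Movie Link"], {})
        # Keep every basic row; drop the duplicate key column from the extras.
        merged.append({**row, **{k: v for k, v in extra.items() if k != "link"}})
    return merged


merged = left_join_on_link(BASIC, ADVANCED)
for row in merged:
    print(row)
```

A left join keeps movies that have no advanced details, which makes missing financial data visible instead of silently dropping rows.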
The dataset is updated annually in December to include the latest data.
This dataset is ideal for:
- Trend Analysis: Explore changes in the movie industry over more than a century.
- Predictive Modeling: Build models to forecast box office revenue, ratings, or awards.
- Recommendation Systems: Use attributes like genres, cast, and ratings for personalized recommendations.
- Comparative Analysis: Study differences across eras, genres, or regions.
https://crawlfeeds.com/privacy_policy
We provide a high-quality Rotten Tomatoes movie dataset that includes key metadata for thousands of movies. This dataset is ideal for anyone working with movie-related platforms, entertainment analytics, content curation, or movie discovery tools.
Our collection is structured, clean, and designed to support real-time apps, dashboards, and research use cases.
Each record in the dataset contains core information pulled directly from Rotten Tomatoes, including:
Movie Name: The official title of the movie.
Poster URL: High-resolution image link to the movie poster.
Trailer URL: Direct link to the official trailer (when available).
Genre: One or more genres associated with the movie, such as Action, Drama, Comedy, or Horror.
Release Date: The date the movie was released to the public.
Actors: Main cast members listed on Rotten Tomatoes.
Directors: Director(s) responsible for the movie.
Rating: Audience or critic scores, where available.
This dataset spans a wide range of movies across all major genres and decades. From modern releases to timeless classics, from Hollywood blockbusters to independent films, we've included movies of all types with relevant data points.
You can expect data on:
U.S. theatrical releases
Netflix, Amazon, and other streaming exclusives
Festival films and limited releases
Animated and documentary films
Here are just a few ways this dataset can be useful:
Movie Recommendation Engines: Use metadata and genre info to power personalized movie suggestions.
Entertainment Search Tools: Build searchable movie listings with visual poster previews and trailer links.
Data Visualization Projects: Create dashboards showing trends by genre, release periods, or actor participation.
AI/ML Training: Use metadata to train classification models or sentiment prediction tools.
Research & Academic Use: Analyze patterns in movie releases, cast dynamics, and genre evolution.
Clean & ready-to-use: No raw HTML, just clean structured data.
Minimal but meaningful fields: Focused on useful movie attributes without clutter.
Updated info: Covers both classic and current titles.
Simple integration: Easy to use for developers, analysts, and product teams.
If you're working on a movie-based product or looking for reliable film metadata for your project, this dataset offers an ideal foundation.
Let us know if you'd like to explore it further.
https://choosealicense.com/licenses/other/
TMDB 5000 Movies (Teeny-Tiny Castle)
This dataset is part of a tutorial tied to the Teeny-Tiny Castle, an open-source repository containing educational tools for AI Ethics and Safety research.
How to Use
from datasets import load_dataset
dataset = load_dataset("AiresPucrs/tmdb-5000-movies", split = 'train')
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
veswaran/movie-dataset dataset hosted on Hugging Face and contributed by the HF Datasets community
yashvoladoddi37/movie-posters-dataset dataset hosted on Hugging Face and contributed by the HF Datasets community
https://choosealicense.com/licenses/other/
Q-b1t/IMDB-Dataset-of-50K-Movie-Reviews-Backup dataset hosted on Hugging Face and contributed by the HF Datasets community
By Himanshu Sekhar Paul
This IMDB Movie Dataset is a comprehensive database of movie ratings, featuring director_name, duration, actor_2_name, genres, actor_1_name, movie title, and more. Whether you're a fan of dramatic thrillers or nostalgic '90s classics, here you'll find information about the movies most voted on by users across the world. Explore num_voted_users trends and discover the language each movie was released in to build your own film library of country-specific titles released in any given year. With this dataset, comparing IMDB scores is straightforward.
This dataset offers a comprehensive overview of the movie ratings from IMDB. It includes data about director name, duration, actors, genres, movie title, number of votes, language, country of origin, year released and IMDB score.
To use this dataset to get a deeper understanding of how movies are rated on IMDB, you can take the following steps:
- Look through each column of the data to get an overall understanding. This will help you identify trends or correlations that you can analyze further in later steps.
- Explore relationships between columns such as 'Number Voted Users' and 'IMDB Score'. Looking at how these numbers relate to each other can help you better understand rating trends on IMDB.
- Analyze how particular sub-groups perform within categories such as genre or country. This could provide insight into preferences for certain types of movies, or into countries whose movies have higher scores than others.
- Through your analysis, try to answer questions about specific demographic groups on IMDB. Are there distinct preferences among age groups in what they watch? Are there clear correlations between rating and genre within certain countries?
By working through the questions above and taking an initial 'big picture' view before diving into more detailed analysis, users should be able to find value in this dataset by uncovering useful insights about movie ratings on IMDB.
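One way to start on the vote-count versus score question is a plain Pearson correlation. The sketch below uses made-up numbers; in practice the two lists would come from the `num_voted_users` column and the IMDB score column (its exact column name is not shown in the truncated table, so treat it as an assumption).

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (spread_x * spread_y)


# Hypothetical values standing in for vote counts and IMDB scores.
votes = [100, 200, 300, 400]
scores = [5.0, 6.0, 7.0, 8.0]
r = pearson(votes, scores)
print(f"correlation: {r:.3f}")
```

Vote counts tend to be heavily skewed, so it may be worth repeating the calculation on `math.log(votes)` before drawing conclusions.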
- Movie Recommendation System: The dataset can be used to build a movie recommendation system using machine learning algorithms like k-nearest neighbors or collaborative filtering. Based on the user's past ratings, the system can suggest relevant movies with similar genres, actors and directors.
- Movie Popularity Index: Using the data, a metric could be designed that provides an overall popularity index for movies released over the years. This index could be constructed from factors such as IMDb score, number of votes, and number of reviews.
- Genre-based Over/Under Performance Analysis: Based on the genres of the movies released each year, this dataset can provide insight into which genres are performing well and which are not. This kind of analysis could inform decisions about allocating production budgets or marketing resources for upcoming films in different genres across different regions or markets.
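The popularity index idea can be sketched with a vote-weighted rating, a common shrinkage formula that pulls scores with few votes toward the global mean. The dataset description does not prescribe any formula; the function name, the `min_votes` threshold, and the sample numbers below are all assumptions for illustration.

```python
def weighted_rating(score, votes, global_mean, min_votes):
    """Blend a movie's score with the global mean, weighted by vote count.

    WR = v / (v + m) * R + m / (v + m) * C
    where R is the movie's score, v its vote count, C the global mean
    score, and m a tunable vote threshold (an assumption, not from the data).
    """
    total = votes + min_votes
    return (votes / total) * score + (min_votes / total) * global_mean


# Hypothetical movies: (title, score, number of votes).
movies = [
    ("Few votes, high score", 9.0, 10),
    ("Many votes, good score", 8.0, 10_000),
]
GLOBAL_MEAN = 6.0  # assumed average score across the dataset
MIN_VOTES = 100    # assumed threshold

ranked = sorted(
    movies,
    key=lambda m: weighted_rating(m[1], m[2], GLOBAL_MEAN, MIN_VOTES),
    reverse=True,
)
for title, score, votes in ranked:
    index = weighted_rating(score, votes, GLOBAL_MEAN, MIN_VOTES)
    print(f"{title}: {index:.2f}")
```

With these numbers the heavily voted movie ranks first even though its raw score is lower, which is the behaviour a popularity index usually needs.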
If you use this dataset in your research, please credit the original authors.
See the dataset description for more information.
File: movie_data.csv

| Column name | Description |
|:-------------------------|:---------------------------------------------------|
| director_name | Name of the director of the movie. (String) |
| duration | Length of the movie in minutes. (Integer) |
| actor_2_name | Name of the second actor in the movie. (String) |
| genres | Genre of the movie. (String) |
| actor_1_name | Name of the first actor in the movie. (String) |
| movie_title | Title of the movie. (String) |
| num_voted_users | Number of users who voted for the movie. (Integer) |
| actor_3_name | Name of the third actor in the movie. (String) |
| movie_imdb_link | Link to the movie's IMDB page. (String) |
| num_user_for_reviews |...
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Dataset Card for Wikipedia Movie Plots with AI Plot Summaries
Dataset Summary
Context
Wikipedia Movies Plots dataset by JustinR ( https://www.kaggle.com/jrobischon/wikipedia-movie-plots )
Content
Everything is the same as in https://www.kaggle.com/jrobischon/wikipedia-movie-plots
Acknowledgements
Please, go upvote https://www.kaggle.com/jrobischon/wikipedia-movie-plots dataset, since this is 100% based on that.
Supported Tasks and… See the full description on the dataset page: https://huggingface.co/datasets/vishnupriyavr/wiki-movie-plots-with-summaries.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Letterboxd Film Dataset
This dataset contains a comprehensive collection of 847,209 films from the Letterboxd platform, including movie information, user reviews, and ratings.
Dataset Summary
Total Films: 847,209
File Size: ~1.12 GB (1,120,572,122 bytes)
Format: JSONL (JSON Lines)
Language: Primarily English, with some multilingual content
Data Structure
Each line contains a JSON object with the following fields: { "url":… See the full description on the dataset page: https://huggingface.co/datasets/pkchwy/letterboxd-all-movie-data.
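Because the file is JSON Lines, it can be read one record at a time without loading the full ~1.12 GB into memory. The sketch below assumes only the `url` field, the single field visible in the truncated description; the sample records and the helper name `iter_jsonl` are hypothetical.

```python
import io
import json


def iter_jsonl(lines):
    """Yield one parsed JSON object per non-empty line."""
    for line_number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        try:
            yield json.loads(line)
        except json.JSONDecodeError as exc:
            raise ValueError(f"line {line_number}: invalid JSON") from exc


# In-memory stand-in for the real file; with a file on disk, pass
# open(path, encoding="utf-8") instead.
sample = io.StringIO(
    '{"url": "https://example.org/film/a"}\n'
    "\n"
    '{"url": "https://example.org/film/b"}\n'
)
records = list(iter_jsonl(sample))
print(len(records), records[0]["url"])
```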
http://researchdatafinder.qut.edu.au/display/n15252
This file contains the features for the test portion of the movie dataset. The data has been converted into average word vectors. This is 50% of the total movie results. QUT Research Data Repository dataset resource, available for download.
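An average word vector is typically the element-wise mean of the embeddings of the words in a text. The sketch below shows that idea with a toy two-dimensional embedding table. The table, the whitespace tokenization, and the handling of unknown words are assumptions; the description does not say how the QUT features were actually produced.

```python
def average_word_vector(tokens, embeddings, dim):
    """Element-wise mean of the embeddings of the known tokens."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return [0.0] * dim  # no known words: fall back to a zero vector
    return [sum(column) / len(vectors) for column in zip(*vectors)]


# Toy embedding table (hypothetical values).
EMBEDDINGS = {
    "great": [1.0, 0.0],
    "movie": [0.0, 1.0],
}

features = average_word_vector("a great movie".split(), EMBEDDINGS, dim=2)
print(features)
```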
https://crawlfeeds.com/privacy_policy
Movies Dataset from AllMovie is a comprehensive collection featuring over 430,000 records, encompassing a wide range of films across various genres and languages. This extensive dataset includes essential data points such as movie titles, genres, release dates, posters, languages, directors, durations, synopses, trailers, average ratings, cast information, and URLs. Such detailed metadata is invaluable for developers, researchers, and enthusiasts aiming to analyze trends, build recommendation systems, or conduct in-depth studies of the film industry.
For those interested in alternative datasets, the IMDb Non-Commercial Datasets provide subsets of IMDb data accessible for personal and non-commercial use. These datasets allow users to hold local copies of movie information, facilitating various analytical projects.
Additionally, the MovieLens datasets offer a range of movie rating data suitable for research purposes. For instance, the MovieLens 20M dataset comprises 20 million ratings and 465,000 tag applications applied to 27,000 movies by 138,000 users, making it a valuable resource for studies in user preferences and recommendation algorithms.
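The MovieLens 20M figures quoted above imply a very sparse user-by-movie rating matrix. The short calculation below works this out from the rounded numbers in the text, so the results are approximate.

```python
# Rounded figures quoted in the text for MovieLens 20M.
ratings = 20_000_000
movies = 27_000
users = 138_000

ratings_per_user = ratings / users
ratings_per_movie = ratings / movies
# Share of all possible (user, movie) pairs that actually have a rating.
density = ratings / (users * movies)

print(f"ratings per user:  {ratings_per_user:.0f}")
print(f"ratings per movie: {ratings_per_movie:.0f}")
print(f"matrix density:    {density:.2%}")
```

Roughly 145 ratings per user and a density of about half a percent is why recommendation algorithms on this data usually rely on sparse representations.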
Incorporating these datasets into your projects can significantly enhance the quality and depth of your analyses, providing a solid foundation for exploring various aspects of the cinematic world.
Why Choose Crawl Feeds for Your Data Needs?
Crawl Feeds is your trusted partner in acquiring high-quality, curated datasets tailored to your specific requirements. With a vast repository that includes the Movies Dataset, we empower developers and businesses to drive innovation. Explore our easy-to-use platform and transform your ideas into actionable insights.
mc-ai/movie dataset hosted on Hugging Face and contributed by the HF Datasets community
DSWF/movie-dataset dataset hosted on Hugging Face and contributed by the HF Datasets community
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
MansaT/Movie-Dataset dataset hosted on Hugging Face and contributed by the HF Datasets community
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
The "IMDB Dataset of Movies Reviews and Translation" dataset has been expanded significantly and is now available on Kaggle in a modified version. Three new columns have been added to the dataset: genres, descriptions, and emotions. The original dataset only had four columns: ratings, reviews, movies, and resenhas. This extension adds to the dataset's richness and offers insightful information about movie genres, in-depth synopses, and the sentimentality of the reviews.
The addition of the Genres column provides an extensive movie classification that enables scholars and film aficionados to explore particular genres and their traits in greater detail. By examining patterns, trends, and preferences across various genres, analysts can use this data to create more specialized research and moviegoer suggestions.
The newly added Descriptions column is a valuable addition as it provides textual summaries or synopses of each movie. These descriptions offer a concise overview of the plot, characters, and themes, making it easier for users to understand and evaluate movies of interest. Researchers can leverage this information to conduct sentiment analysis, topic modeling, or recommendation systems based on movie summaries.
Finally, the Emotions column adds an intriguing dimension to the dataset. By capturing the emotional tone expressed within each description, this column allows for a deeper understanding of sentiments toward the movies. Sentiment analysis techniques can be applied to this data, enabling researchers to gain insights into emotions such as joy, anger, and sadness associated with different movies. This information can be particularly valuable for filmmakers, production companies, and marketers looking to gauge audience reactions and tailor their strategies accordingly, and especially for moviegoers who like to choose movies based on emotions.
Overall, the expanded version of the "50k Movie Reviews" dataset offers a wealth of new information that fosters detailed analysis and exploration of movie genres, descriptions, and emotional responses. This dataset presents a valuable resource for researchers, data scientists, and movie enthusiasts alike, enabling a deeper understanding of the movie landscape and facilitating the development of innovative tools and applications in the field of movie analysis and recommendation systems.