100+ datasets found

The Ultimate Film Statistics Dataset - for ML🏆🎬
kaggle.com
Updated Jul 9, 2023
Cite
Alessandro Lo Bello (2023). The Ultimate Film Statistics Dataset - for ML🏆🎬 [Dataset]. https://www.kaggle.com/datasets/alessandrolobello/the-ultimate-film-statistics-dataset-for-ml/data
Explore at:
Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
Dataset updated
Jul 9, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Alessandro Lo Bello
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Description: This dataset provides comprehensive movie statistics compiled from multiple sources, including Wikipedia, The Numbers, and IMDb. It offers a rich collection of information and insights into various aspects of movies, such as movie titles, production dates, genres, runtime minutes, director information, average ratings, number of votes, approval index, production budgets, domestic gross earnings, and worldwide gross earnings.

The dataset combines data scraped from Wikipedia, which includes details about movie titles, production dates, genres, runtime minutes, and director information, with data from The Numbers, a reliable source for box office statistics. Additionally, IMDb data is integrated to provide information on average ratings, number of votes, and other movie-related attributes.

With this dataset, users can analyze and explore trends in the film industry, assess the financial success of movies, identify popular genres, and investigate the relationship between average ratings and box office performance. Researchers, movie enthusiasts, and data analysts can leverage this dataset for various purposes, including data visualization, predictive modeling, and deeper understanding of the movie landscape.

Features:
- Movie_title
- Production_date
- Genres
- Runtime_minutes
- Director_name (primaryName)
- Director_professions (primaryProfession)
- Director_birthYear
- Director_deathYear
- Movie_averageRating: the average rating given by online users for a particular movie
- Movie_numberOfVotes: the number of votes given by online users for a particular movie
- Approval_Index: a normalized indicator (on a 0-10 scale) calculated by multiplying the logarithm of the number of votes by the average user rating. It provides a concise measure of a movie's overall popularity and approval among online viewers, penalizing both films that got too few reviews and blockbusters that got too many.
- Production_budget ($)
- Domestic_gross ($)
- Worldwide_gross ($)
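
The Approval_Index definition can be illustrated with a small sketch. The description only states "logarithm of votes times average rating, normalized to 0-10", so the natural logarithm, the min-max rescaling, and the helper name `approval_index` are all assumptions here, not the author's actual method.

```python
import math

def approval_index(ratings, votes):
    """Return rating * log(votes) per movie, rescaled to the range 0-10.

    Assumptions (not stated in the dataset description): natural log,
    min-max rescaling, and every vote count being at least 1.
    """
    raw = [r * math.log(v) for r, v in zip(ratings, votes)]
    lo, hi = min(raw), max(raw)
    if hi == lo:  # all movies score the same: avoid dividing by zero
        return [0.0 for _ in raw]
    return [10 * (x - lo) / (hi - lo) for x in raw]

# Three hypothetical movies: average ratings and vote counts
scores = approval_index([8.0, 6.5, 8.0], [1_000, 50_000, 500_000])
```

With these made-up inputs, the two movies rated 8.0 land at opposite ends of the scale because their vote counts differ so much, which shows how strongly the vote term can drive the index.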

Potential Applications:

- Box office analysis: Analyze the relationship between production budgets, domestic and worldwide gross earnings, and profitability.
- Genre analysis: Identify the most popular genres based on movie counts and analyze their performance.
- Rating analysis: Explore the relationship between average ratings, number of votes, and financial success.
- Director analysis: Investigate the impact of directors on movie ratings and financial performance.
- Time-based analysis: Study movie trends over different production years and observe changes in production budgets, box office earnings, and genre preferences.

By utilizing this dataset, users can gain valuable insights into the movie industry and uncover patterns that can inform decision-making, market research, and creative strategies.
National box office statistics
data.gov.tw
csv, json
Updated Aug 27, 2025
Cite
Ministry of Culture (2025). National box office statistics [Dataset]. https://data.gov.tw/en/datasets/94224
Explore at:
Available download formats: json, csv
Dataset updated
Aug 27, 2025
Dataset authored and provided by
Ministry of Culture
License
https://data.gov.tw/license
Description
This dataset provides national theater box office statistics for films distributed by the Administrative Institution National Film and Audiovisual Culture Center. The data runs up to the last Sunday before the announcement date and does not include films that have been screened for fewer than 7 calendar days. The earliest CSV format data in this dataset begins on July 30, 2018, and the earliest JSON format data begins on March 1, 2020. JSON format queries require a start date and an end date (as year, month, and day) and return data for at most 90 days at a time.
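
Because a JSON query covers at most 90 days, a longer period has to be split into several requests. The sketch below shows only that date arithmetic. It assumes the 90-day cap counts both the start and the end day, which the description does not specify; `date_windows` is a hypothetical helper, and no request URL or parameter names are shown because the description gives none.

```python
from datetime import date, timedelta

def date_windows(start, end, max_days=90):
    """Yield (window_start, window_end) pairs that cover start..end
    inclusive, each spanning at most max_days calendar days."""
    cursor = start
    while cursor <= end:
        window_end = min(cursor + timedelta(days=max_days - 1), end)
        yield cursor, window_end
        cursor = window_end + timedelta(days=1)

# Split March 1 to December 31, 2020 into query-sized windows
windows = list(date_windows(date(2020, 3, 1), date(2020, 12, 31)))
```

Each pair in `windows` would then supply the start and end dates for one JSON query.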
IMDB movie details dataset
crawlfeeds.com
csv, zip
Updated Jul 5, 2025
Cite
Crawl Feeds (2025). IMDB movie details dataset [Dataset]. https://crawlfeeds.com/datasets/imdb-movie-details-dataset
Explore at:
Available download formats: zip, csv
Dataset updated
Jul 5, 2025
Dataset authored and provided by
Crawl Feeds
License
https://crawlfeeds.com/privacy_policy
Description

The IMDB Movie Details Dataset is a comprehensive collection of movie datasets covering movies, TV shows, and streaming content listed on IMDB. It includes detailed data such as titles, release years, genres, cast, crew, ratings, and more, making it a go-to resource for film and entertainment enthusiasts. The dataset is suited to data analysis, with applications spanning machine learning projects, predictive modeling, and insights into industry trends.

Researchers can explore patterns in movie ratings and genre popularity, while developers can use the dataset to build recommendation systems or applications. Movie buffs can dive deep into historical and contemporary trends in the world of cinema. This dataset not only supports academic and professional pursuits but also opens doors for creative projects in storytelling, content creation, and audience engagement. Whether you’re a developer, researcher, or film enthusiast, the IMDB movie dataset is a powerful tool for uncovering trends and gaining deeper insights into the evolving entertainment landscape.
Average revenue of films in the U.S. & Canada 1995-2025, by selected source...
statista.com
Updated Jan 31, 2025
Cite
Statista (2025). Average revenue of films in the U.S. & Canada 1995-2025, by selected source material [Dataset]. https://www.statista.com/statistics/188689/movie-sources-in-north-america-by-average-box-office-revenue/
Explore at:
Dataset updated
Jan 31, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
Canada, United States
Description
Between 1995 and 2025, a movie based on comics or graphic novels grossed, on average, about 88.36 million U.S. dollars across the United States and Canada – collectively known as the North American box office. Spin-offs followed as the second-most commercially successful film source material, with average box office revenue of around 86.32 million dollars.
rotten_tomatoes
huggingface.co
Updated Jun 4, 2024
Cite
cornell-movie-review-data (2024). rotten_tomatoes [Dataset]. https://huggingface.co/datasets/cornell-movie-review-data/rotten_tomatoes
Explore at:
Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
Dataset updated
Jun 4, 2024
Dataset authored and provided by
cornell-movie-review-data
License
https://choosealicense.com/licenses/unknown/
Description
Dataset Card for "rotten_tomatoes"

Dataset Summary

Movie Review Dataset. This dataset contains 5,331 positive and 5,331 negative processed sentences from Rotten Tomatoes movie reviews. This data was first used in Bo Pang and Lillian Lee, "Seeing stars: Exploiting class relationships for sentiment categorization with respect to rating scales", Proceedings of the ACL, 2005.

Supported Tasks and Leaderboards

More Information Needed

Languages… See the full description on the dataset page: https://huggingface.co/datasets/cornell-movie-review-data/rotten_tomatoes.
The Complete Movie Dataset
kaggle.com
Updated Aug 6, 2022
Cite
Maya Soffer (2022). The Complete Movie Dataset [Dataset]. https://www.kaggle.com/datasets/mayasoffer/the-complete-movie-dataset
Explore at:
Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
Dataset updated
Aug 6, 2022
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Maya Soffer
Description
Introduction

This data set was scraped from the site https://www.the-numbers.com/ using Python 3. It has data on more than 13k movies and contains monetary data (Domestic Box Office, Infl. Adj. Dom. BO, Opening Weekend, and more) as well as "creative" cinema data (Comparisons, Creative Type, Genre, and more). The complete scraping code I wrote to create the data set is available on my profile: https://www.kaggle.com/code/mayasoffer/movies-data-scraper

Important Info

Please note that the data was scraped entirely from The Numbers website, therefore:
- Some data is missing, in line with the data missing on the site.
- The scraping was performed on 01.03.22 (March 2022), so all the data is accurate as of that time.
- For more detail on how the columns were created and where the site originally got the data, please consult the site itself.
- Lastly, note that I scraped the data and saved it as CSV. However, all the columns were scraped in their original form, as written on the website, so some "cleaning" of the columns is necessary before any analysis can take place.
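
As an illustration of the kind of column cleaning mentioned above, here is a minimal sketch for monetary columns. It assumes the scraped values look like "$1,234,567" in whole dollars, which the description does not confirm; `parse_money` is a hypothetical helper, not part of the author's scraper.

```python
import re

def parse_money(text):
    """Convert a scraped amount such as "$1,234,567" to an int.

    Returns None for blank or non-numeric cells. Assumes whole-dollar
    amounts: a decimal point would be stripped along with the commas.
    """
    digits = re.sub(r"[^\d]", "", text or "")
    return int(digits) if digits else None

# Hypothetical raw cells from a box office column
cleaned = [parse_money(cell) for cell in ["$1,234,567", "", "$50,000"]]
```

Returning None rather than 0 for empty cells keeps missing values distinguishable from films that genuinely earned nothing.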

Inspiration

The data is very diverse, contains many different columns, and goes back to 1995, so the analysis options are many. Here are a few analysis leads I thought about:
- How have genres changed throughout the years? Which genres are the most popular over the years (revenue-wise, legs, opening week...)? Which new genres gained popularity (animation, for example)?
- Does MPAA rating impact revenue?

And much more...

Thank you for using my dataset!
Film Industry Statistics 2024
pzaz.io
Updated Mar 28, 2024
Cite
Pzaz (2024). Film Industry Statistics 2024 [Dataset]. https://pzaz.io/producer-blog/film-industry-statistics/
Explore at:
Dataset updated
Mar 28, 2024
Dataset authored and provided by
Pzaz
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
2024
Area covered
World
Description
Using data from Polly, drawn from an independent sample of 2,950,625 people on Twitter, Reddit and TikTok worldwide from March 13, 2023, to March 13, 2024, we delved deeper into what people really think about the state of the film industry. The USA, aka Hollywood (68%), overwhelmingly leads in engagement over India, aka Bollywood (5.8%), followed by Italy (5.6%), Japan (5%), South Korea (4.1%), France (3.5%), Nigeria, aka Nollywood (2.9%), and China (1.1%). This report has a breakdown by gender, age and worldwide region.
Box office revenue in the U.S. & Canada 1995-2024, by movie rating
statista.com
Updated Jan 6, 2025
Cite
Statista (2025). Box office revenue in the U.S. & Canada 1995-2024, by movie rating [Dataset]. https://www.statista.com/statistics/433709/highest-grossing-movies-domestic-box-office-rating/
Explore at:
Dataset updated
Jan 6, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
Canada, United States
Description
Between 1995 and 2024, PG-13-rated movies grossed approximately 126.64 billion U.S. dollars at the North American box office – a term that excludes Mexico and includes Canada and the United States. R-rated and PG-rated films grossed around 69.28 billion and 56.04 billion dollars, respectively.
"9,565 Top-Rated Movies Dataset"
kaggle.com
Updated Aug 19, 2024
Cite
Harshit@85 (2024). "9,565 Top-Rated Movies Dataset" [Dataset]. https://www.kaggle.com/datasets/harshit85/9565-top-rated-movies-dataset
Explore at:
Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
Dataset updated
Aug 19, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Harshit@85
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
About the Dataset

Title: 9,565 Top-Rated Movies Dataset

Description:
This dataset offers a comprehensive collection of 9,565 of the highest-rated movies according to audience ratings on the Movie Database (TMDb). The dataset includes detailed information about each movie, such as its title, overview, release date, popularity score, average vote, and vote count. It is designed to be a valuable resource for anyone interested in exploring trends in popular cinema, analyzing factors that contribute to a movie’s success, or building recommendation engines.

Key Features:
- Title: The official title of each movie.
- Overview: A brief synopsis or description of the movie's plot.
- Release Date: The release date of the movie, formatted as YYYY-MM-DD.
- Popularity: A score indicating the current popularity of the movie on TMDb, which can be used to gauge current interest.
- Vote Average: The average rating of the movie, based on user votes.
- Vote Count: The total number of votes the movie has received.

Data Source: The data was sourced from the TMDb API, a well-regarded platform for movie information, using the /movie/top_rated endpoint. The dataset represents a snapshot of the highest-rated movies as of the time of data collection.

Data Collection Process:
- API Access: Data was retrieved programmatically using TMDb’s API.
- Pagination Handling: Multiple API requests were made to cover all pages of top-rated movies, ensuring the dataset’s comprehensiveness.
- Data Aggregation: Collected data was aggregated into a single, unified dataset using the pandas library.
- Cleaning: Basic data cleaning was performed to remove duplicates and handle missing or malformed data entries.
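
The pagination and de-duplication steps above can be sketched without touching the network. This is an illustration, not the author's code: `fetch_page` is a hypothetical callable standing in for one API request, and the `results`, `total_pages`, and `id` fields are assumptions about the shape of each response. The original used pandas for aggregation, whereas this sketch uses plain lists to stay dependency-free.

```python
def collect_all_pages(fetch_page):
    """Call fetch_page(1), fetch_page(2), ... until every page is read,
    merging the results and dropping movies whose id was already seen."""
    seen, movies = set(), []
    page, total_pages = 1, 1
    while page <= total_pages:
        payload = fetch_page(page)
        total_pages = payload["total_pages"]  # assumed response field
        for movie in payload["results"]:      # assumed response field
            if movie["id"] not in seen:
                seen.add(movie["id"])
                movies.append(movie)
        page += 1
    return movies

# Fake two-page source used in place of a real API call
FAKE_PAGES = {
    1: {"total_pages": 2, "results": [{"id": 1, "title": "A"}, {"id": 2, "title": "B"}]},
    2: {"total_pages": 2, "results": [{"id": 2, "title": "B"}, {"id": 3, "title": "C"}]},
}
movies = collect_all_pages(FAKE_PAGES.__getitem__)
```

Passing the fetcher in as an argument keeps the loop testable offline. A real fetcher would wrap an authenticated HTTP request.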

Potential Uses:
- Trend Analysis: Analyze trends in movie ratings over time or compare ratings across different genres.
- Recommendation Systems: Build and train models to recommend movies based on user preferences.
- Sentiment Analysis: Perform text analysis on movie overviews to understand common themes and sentiments.
- Statistical Analysis: Explore the relationship between popularity, vote count, and average ratings.

Data Format: The dataset is provided in a structured tabular format (e.g., CSV), making it easy to load into data analysis tools like Python, R, or Excel.

Usage License: The dataset is shared under [appropriate license], ensuring that it can be used for educational, research, or commercial purposes, with proper attribution to the data source (TMDb).

CGI and animated movie box office revenue in the U.S. 2008-2018
statista.com
Updated Jul 11, 2025
Cite
Statista (2025). CGI and animated movie box office revenue in the U.S. 2008-2018 [Dataset]. https://www.statista.com/statistics/1020938/cgi-animated-movie-revenue-us/
Explore at:
Dataset updated
Jul 11, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
United States
Description
This statistic shows the box office revenue of CGI, 3D and animated movies in the United States from 2008 to 2018. According to RenderThat, the total revenue in the U.S. for all movies containing CGI (computer-generated imagery), animation and 3D effects amounted to **** billion U.S. dollars in 2018.
Film, television and video production, summary statistics
www150.statcan.gc.ca
open.canada.ca
+1 more
Updated Mar 17, 2025
Cite
Government of Canada, Statistics Canada (2025). Film, television and video production, summary statistics [Dataset]. http://doi.org/10.25318/2110005901-eng
Explore at:
Unique identifier
https://doi.org/10.25318/2110005901-eng
Dataset updated
Mar 17, 2025
Dataset provided by
Statistics Canada (https://statcan.gc.ca/en)
Area covered
Canada
Description
Summary statistics by North American Industry Classification System (NAICS), including operating revenue (dollars x 1,000,000), operating expenses (dollars x 1,000,000), salaries, wages and benefits (dollars x 1,000,000), and operating profit margin (percent), for motion picture and video production (NAICS 512110), annual, for five years of data.
Movies and Ratings
zenodo.org
zip
Updated Aug 23, 2023
Cite
Michael Kaufmann (2023). Movies and Ratings [Dataset]. http://doi.org/10.5281/zenodo.8276077
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.8276077
Dataset updated
Aug 23, 2023
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Michael Kaufmann
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Transformed, cleaned dataset with reduced number of columns for all 45,000 movies listed in the full MovieLens dataset of movies released in July 2017 or earlier. Data points include movie ID, title, budget, languages, and genres. This dataset also includes 26 million ratings from 270,000 users for all 45,000 movies. Ratings are given on a scale of 1 to 5 and include user ID, movie ID, rating, and timestamp.

This dataset consists of the following files:

* movies.csv: The main movie metadata file. Contains information on 45,000 movies included in the full MovieLens dataset.

* ratings.csv: The full MovieLens dataset with 26 million ratings and 750,000 tag applications from 270,000 users on all 45,000 movies in this dataset.
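
A minimal sketch of how the ratings file could be summarized per movie, using only the standard library. The header names (`userId`, `movieId`, `rating`, `timestamp`) are assumptions based on the fields listed in the description, and the three sample rows are made up.

```python
import csv
import io
from collections import defaultdict

# Tiny in-memory stand-in for ratings.csv; header names are assumed
SAMPLE = """userId,movieId,rating,timestamp
1,10,4.0,1500000000
2,10,5.0,1500000100
1,20,3.0,1500000200
"""

def mean_ratings(csv_file):
    """Return {movie_id: (mean_rating, n_ratings)} from a ratings CSV."""
    totals = defaultdict(lambda: [0.0, 0])  # movie_id -> [sum, count]
    for row in csv.DictReader(csv_file):
        entry = totals[row["movieId"]]
        entry[0] += float(row["rating"])
        entry[1] += 1
    return {movie: (total / n, n) for movie, (total, n) in totals.items()}

summary = mean_ratings(io.StringIO(SAMPLE))
```

Reading row by row keeps memory use flat, which matters for a file with 26 million ratings. With the real file, `open("ratings.csv", newline="")` would replace the `io.StringIO` wrapper.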

This dataset is a further development of the following public domain dataset published on Kaggle:

https://www.kaggle.com/datasets/rounakbanik/the-movies-dataset

This data was obtained from the official GroupLens website. The data was originally obtained from The Movies DataBase (TMDB) via the TMDB AP
Movie Data - X - Test - w2v
data.researchdatafinder.qut.edu.au
Updated Apr 8, 2018
Cite
(2018). Movie Data - X - Test - w2v [Dataset]. https://data.researchdatafinder.qut.edu.au/dataset/survey-word-vector/resource/e638fc06-7ef3-4a41-85e2-21f7fad2dfb3
Explore at:
Dataset updated
Apr 8, 2018
License
http://researchdatafinder.qut.edu.au/display/n15252
Description
This file contains the features for the test portion of the movie dataset. The data has been converted into an average word vector. This is 50% of the total movie results. QUT Research Data Repository Dataset Resource available for download.
Rotten Tomatoes Movie Dataset – Clean Movie Metadata
crawlfeeds.com
csv, zip
Updated Jul 21, 2025
Cite
Crawl Feeds (2025). Rotten Tomatoes Movie Dataset – Clean Movie Metadata [Dataset]. https://crawlfeeds.com/datasets/rotten-tomatoes-movie-dataset-clean-movie-metadata
Explore at:
Available download formats: csv, zip
Dataset updated
Jul 21, 2025
Dataset authored and provided by
Crawl Feeds
License
https://crawlfeeds.com/privacy_policy
Description
We provide a high-quality Rotten Tomatoes movie dataset that includes key metadata for thousands of movies. This dataset is ideal for anyone working with movie-related platforms, entertainment analytics, content curation, or movie discovery tools.

Our collection is structured, clean, and designed to support real-time apps, dashboards, and research use cases.

What the Dataset Includes

Each record in the dataset contains core information pulled directly from Rotten Tomatoes, including:

Movie Name – The official title of the movie.

Poster URL – High-resolution image link to the movie poster.

Trailer URL – Direct link to the official trailer (when available).

Genre – One or more genres associated with the movie, such as Action, Drama, Comedy, or Horror.

Release Date – The date the movie was released to the public.

Actors – Main cast members listed on Rotten Tomatoes.

Directors – Director(s) responsible for the movie.

Rating – Audience or critic scores, where available.

Broad Coverage

This dataset spans a wide range of movies across all major genres and decades. From modern releases to timeless classics, from Hollywood blockbusters to independent films — we’ve included movies of all types with relevant data points.

You can expect data on:

U.S. theatrical releases

Netflix, Amazon, and other streaming exclusives

Festival films and limited releases

Animated and documentary films

Use Cases

Here are just a few ways this dataset can be useful:

Movie Recommendation Engines – Use metadata and genre info to power personalized movie suggestions.

Entertainment Search Tools – Build searchable movie listings with visual poster previews and trailer links.

Data Visualization Projects – Create dashboards showing trends by genre, release periods, or actor participation.

AI/ML Training – Use metadata to train classification models or sentiment prediction tools.

Research & Academic Use – Analyze patterns in movie releases, cast dynamics, and genre evolution.

Why Use Our Dataset?

Clean & ready-to-use: No raw HTML, just clean structured data.

Minimal but meaningful fields: Focused on useful movie attributes without clutter.

Updated info: Covers both classic and current titles.

Simple integration: Easy to use for developers, analysts, and product teams.

If you're working on a movie-based product or looking for reliable film metadata for your project, this dataset offers an ideal foundation.

Let us know if you’d like to explore it further.
All time worldwide box office collection
kaggle.com
Updated Jan 4, 2023
Cite
Somnath Malik (2023). All time worldwide box office collection [Dataset]. https://www.kaggle.com/datasets/somnath2/box-office
Explore at:
Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
Dataset updated
Jan 4, 2023
Dataset provided by
Kaggle
Authors
Somnath Malik
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
The data was scraped from Box Office Mojo. This data set includes all-time worldwide box office collections from 2010 to 2022.

Web scrape source: https://github.com/Somnath4/Python-Web-Sraping

Photo by Kilyan Sockalingum on Unsplash Photo link: https://unsplash.com/photos/nW1n9eNHOsc

About Box Office Mojo: Box Office Mojo is a website that provides box office collection data for movies. It is a valuable resource for movie studios, producers, and film fans alike. The website allows users to view box office collection data for movies released in the US and around the world. It also provides data on movie budgets and box office performance compared to the budget. Users can access this data by browsing through the website or using the search function to find specific movies. The website is updated regularly, so users can always stay up-to-date on the latest box office collection data.

About the data set: The data set, which contains worldwide box office collections, includes information on the top 200 grossing films of each year from 2010 to 2022. The data consists of the title of the film, its worldwide box office collection, domestic box office collection, the percentage of domestic box office collection, foreign box office collection, and the percentage of foreign box office collection. This data can be beneficial for analyzing trends in the film industry, understanding the performance of different films, and predicting future box office success.
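
The percentage columns described above follow directly from the worldwide and domestic totals. A minimal sketch of that arithmetic, assuming foreign collection is simply worldwide minus domestic (the description does not say how the source computes it); `box_office_shares` and the sample figures are hypothetical.

```python
def box_office_shares(worldwide, domestic):
    """Return (foreign, domestic_pct, foreign_pct) for one film.

    Assumes foreign = worldwide - domestic, with both amounts in the
    same currency.
    """
    if worldwide <= 0:
        raise ValueError("worldwide collection must be positive")
    foreign = worldwide - domestic
    return foreign, 100 * domestic / worldwide, 100 * foreign / worldwide

# Hypothetical film: 1,000 worldwide, of which 250 domestic
foreign, domestic_pct, foreign_pct = box_office_shares(1000, 250)
```

Recomputing the percentages this way is also a quick consistency check on the scraped columns.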
Movie genres viewers want to see more in theaters worldwide 2025, by age
statista.com
Updated Mar 24, 2025
Cite
Statista (2025). Movie genres viewers want to see more in theaters worldwide 2025, by age [Dataset]. https://www.statista.com/statistics/1607642/movie-genres-viewers-want-to-see-cinemas-theaters-world/
Explore at:
Dataset updated
Mar 24, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
Jan 2025
Area covered
Worldwide
Description
According to a survey conducted in several markets around the world in January 2025, more than half of respondents across all age brackets wanted to see more action and adventure movies. While younger consumers would like to see more horror movies in theaters, older viewers were hoping to see more dramas.
movies-dataset
huggingface.co
Updated Mar 3, 2022
Cite
Pablo Merchán-Rivera (2022). movies-dataset [Dataset]. https://huggingface.co/datasets/Pablinho/movies-dataset
Explore at:
Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
Dataset updated
Mar 3, 2022
Authors
Pablo Merchán-Rivera
License
https://choosealicense.com/licenses/cc0-1.0/
Description
+9000 Movie Dataset

Overview

This dataset is sourced from Kaggle and has been granted CC0 1.0 Universal (CC0 1.0) Public Domain Dedication by the original author. This means you can copy, modify, distribute, and perform the work, even for commercial purposes, all without asking permission. I would like to express my gratitude to the original author for their contribution to the data community.

License

This dataset is released under the CC0 1.0 Universal… See the full description on the dataset page: https://huggingface.co/datasets/Pablinho/movies-dataset.
Bollywood Movies data
data.mendeley.com
Updated May 12, 2020
Cite
Prashant Premkumar (2020). Bollywood Movies data [Dataset]. http://doi.org/10.17632/3c57btcxy9.1
Explore at:
Unique identifier
https://doi.org/10.17632/3c57btcxy9.1
Dataset updated
May 12, 2020
Authors
Prashant Premkumar
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Using a Python script to scrape data from the web, we collected data on all 1698 Hindi-language movies released in India over a 13-year period (2005-2017) from the website of Box Office India.
Movies and Tv Shows Dataset
crawlfeeds.com
csv, zip
Updated Jul 4, 2025
Cite
Crawl Feeds (2025). Movies and Tv Shows Dataset [Dataset]. https://crawlfeeds.com/datasets/movies-and-tv-shows-dataset
Explore at:
Available download formats: zip, csv
Dataset updated
Jul 4, 2025
Dataset authored and provided by
Crawl Feeds
License
https://crawlfeeds.com/privacy_policy
Description
Explore our meticulously curated Movies dataset and TV shows dataset, designed to cater to diverse analytical and research needs. Whether you're a data scientist, a student, or a business professional, these datasets provide valuable insights into the entertainment industry.

Key Features of the Movies Dataset:

Extensive collection of global movies across various genres and languages.

Detailed metadata, including titles, release dates, genres, directors, cast, and ratings.

Regularly updated to ensure relevance and accuracy.

Why Choose Our TV Shows Dataset?

Our TV shows dataset is your gateway to understanding trends in episodic content. It includes:

Comprehensive details about popular and niche TV shows.

Information on episode counts, seasons, ratings, and networks.

Insights into audience preferences and regional programming.

Applications of These Datasets

These datasets are perfect for:

Machine learning models for recommendation systems.

Academic research on media trends and audience behavior.

Business strategies for entertainment platforms.

Unlock the power of TV show data with our Crawl Feeds TV Shows Dataset. Start analyzing today and gain valuable insights into your favorite shows!
🎥 Movie Plot Database
kaggle.com
Updated Aug 7, 2024
Cite
mexwell (2024). 🎥 Movie Plot Database [Dataset]. https://www.kaggle.com/datasets/mexwell/movie-plot-database/data
Explore at:
Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
Dataset updated
Aug 7, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
mexwell
License
Attribution-ShareAlike 3.0 (CC BY-SA 3.0): https://creativecommons.org/licenses/by-sa/3.0/
License information was derived automatically
Description
Dataset of movie plot summaries and associated metadata. This data was collected by David Bamman, Brendan O'Connor, and Noah Smith at the Language Technologies Institute and Machine Learning Department at Carnegie Mellon University.

Data

plot_summaries.csv

Plot summaries of 42,306 movies extracted from the November 2, 2012 dump of English-language Wikipedia. Each line contains the Wikipedia movie ID (which indexes into movie.metadata.tsv) followed by the summary.

movie_metadata.csv

Metadata for 81,741 movies, extracted from the November 4, 2012 dump of Freebase. Tab-separated; columns:
- Wikipedia movie ID
- Freebase movie ID
- Movie name
- Movie release date
- Movie box office revenue
- Movie runtime
- Movie languages (Freebase ID:name tuples)
- Movie countries (Freebase ID:name tuples)
- Movie genres (Freebase ID:name tuples)
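
Since the Wikipedia movie ID links plot summaries to the metadata, the two files can be joined on it. The sketch below assumes both files are tab-separated with no header row and use the column order listed above, which may not hold for the CSV copies on Kaggle. The sample rows and `join_plots` are made up for illustration.

```python
import csv
import io

# Made-up excerpts standing in for the two files (tab-separated, no header)
METADATA = (
    "101\t/m/aaa\tExample Film\t2001-05-04\t1000000\t95.0\n"
    "102\t/m/bbb\tOther Film\t1999\t\t110.0\n"
)
SUMMARIES = "101\tA detective returns to her home town.\n"

def join_plots(metadata_file, summaries_file):
    """Map Wikipedia movie ID -> (movie name, plot summary or None)."""
    # QUOTE_NONE keeps quotation marks inside summaries as literal text
    read = lambda f: csv.reader(f, delimiter="\t", quoting=csv.QUOTE_NONE)
    plots = {row[0]: row[1] for row in read(summaries_file)}
    return {row[0]: (row[2], plots.get(row[0])) for row in read(metadata_file)}

joined = join_plots(io.StringIO(METADATA), io.StringIO(SUMMARIES))
```

Movies without a summary map to None, which is the common case here because the metadata covers 81,741 movies and the summaries only 42,306.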

character_metadata.csv

Metadata for 450,669 characters aligned to the movies above, extracted from the November 4, 2012 dump of Freebase. Tab-separated; columns:

Wikipedia movie ID

Freebase movie ID

Movie release date

Character name

Actor date of birth

Actor gender

Actor height (in meters)

Actor ethnicity (Freebase ID)

Actor name

Actor age at movie release

Freebase character/actor map ID

Freebase character ID

Freebase actor ID

tvtropes.clusters.txt

72 character types drawn from tvtropes.com, along with 501 instances of those types. The ID field indexes into the Freebase character/actor map ID in character.metadata.tsv.

name.clusters.txt

970 unique character names used in at least two different movies, along with 2,666 instances of those types. The ID field indexes into the Freebase character/actor map ID in character.metadata.tsv.

Acknowledgments

This research was supported in part by U.S. National Science Foundation grant IIS-0915187.

All data is released under a Creative Commons Attribution-ShareAlike License. For questions or comments, please contact David Bamman (dbamman@cs.cmu.edu).

Photo by Jakob Owens on Unsplash

Facebook

Twitter

Click to copy link

Link copied

Cite

Alessandro Lo Bello (2023). The Ultimate Film Statistics Dataset - for ML🏆🎬 [Dataset]. https://www.kaggle.com/datasets/alessandrolobello/the-ultimate-film-statistics-dataset-for-ml/data

The Ultimate Film Statistics Dataset - for ML🏆🎬

Unlocking the Secrets of Film Success: A Comprehensive Analysis of Movie Data

Explore at:

CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.

Dataset updated: Jul 9, 2023

Dataset provided by: Kaggle (http://kaggle.com/)

Authors: Alessandro Lo Bello

License: https://creativecommons.org/publicdomain/zero/1.0/

Description

This dataset provides comprehensive movie statistics compiled from multiple sources, including Wikipedia, The Numbers, and IMDb. It offers a rich collection of information and insights into various aspects of movies, such as movie titles, production dates, genres, runtime minutes, director information, average ratings, number of votes, approval index, production budgets, domestic gross earnings, and worldwide gross earnings.

The dataset combines data scraped from Wikipedia, which includes details about movie titles, production dates, genres, runtime minutes, and director information, with data from The Numbers, a reliable source for box office statistics. Additionally, IMDb data is integrated to provide information on average ratings, number of votes, and other movie-related attributes.

With this dataset, users can analyze and explore trends in the film industry, assess the financial success of movies, identify popular genres, and investigate the relationship between average ratings and box office performance. Researchers, movie enthusiasts, and data analysts can leverage this dataset for various purposes, including data visualization, predictive modeling, and deeper understanding of the movie landscape.

Features:

- Movie_title
- Production_date
- Genres
- Runtime_minutes
- Director_name (primaryName)
- Director_professions (primaryProfession)
- Director_birthYear
- Director_deathYear
- Movie_averageRating: the average rating given by online users for a particular movie
- Movie_numberOfVotes: the number of votes given by online users for a particular movie
- Approval_Index: a normalized indicator (on a 0-10 scale) calculated by multiplying the logarithm of the number of votes by the average user rating. It provides a concise measure of a movie's overall popularity and approval among online viewers, penalizing both films that got too few reviews and blockbusters that got too many.
- Production_budget ($)
- Domestic_gross ($)
- Worldwide_gross ($)
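The Approval_Index description gives the core product (logarithm of votes times average rating) but not the logarithm base or the normalization used. The sketch below is one possible reading, assuming a base-10 logarithm and min-max scaling to the 0-10 range over the films supplied; it is not confirmed to be the author's exact formula, and the function name and sample values are hypothetical.

```python
import math

def approval_index(ratings, votes):
    """Scale rating * log10(votes) to the 0-10 range across the given films.

    Assumptions (not stated in the dataset description): base-10 logarithm
    and min-max normalization over the supplied films. Vote counts must be > 0.
    """
    raw = [rating * math.log10(count) for rating, count in zip(ratings, votes)]
    lowest, highest = min(raw), max(raw)
    if highest == lowest:
        # All films have the same raw score; the scale is undefined, return zeros.
        return [0.0 for _ in raw]
    return [10 * (score - lowest) / (highest - lowest) for score in raw]

# Made-up ratings and vote counts for three films.
sample_ratings = [8.0, 6.0, 8.0]
sample_votes = [1_000, 10, 100_000]
scores = approval_index(sample_ratings, sample_votes)
```

Under this reading, two films with the same rating are separated by their vote counts, but the logarithm compresses large differences in votes so that they do not dominate the score.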

Potential Applications:

- Box office analysis: Analyze the relationship between production budgets, domestic and worldwide gross earnings, and profitability.
- Genre analysis: Identify the most popular genres based on movie counts and analyze their performance.
- Rating analysis: Explore the relationship between average ratings, number of votes, and financial success.
- Director analysis: Investigate the impact of directors on movie ratings and financial performance.
- Time-based analysis: Study movie trends over different production years and observe changes in production budgets, box office earnings, and genre preferences.

By utilizing this dataset, users can gain valuable insights into the movie industry and uncover patterns that can inform decision-making, market research, and creative strategies.
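As a rough illustration of the box office analysis idea, profit and return on investment can be derived from the budget and gross columns. The sketch assumes each film is a dict keyed by the feature names listed in the description (`Movie_title`, `Production_budget`, `Worldwide_gross`); the actual CSV headers may differ, and the sample films are made up.

```python
def add_profitability(rows):
    """Return copies of the rows with 'profit' and 'roi' fields added."""
    enriched = []
    for row in rows:
        budget = float(row["Production_budget"])
        gross = float(row["Worldwide_gross"])
        profit = gross - budget
        # ROI is undefined for a zero or missing budget, so store None there.
        roi = profit / budget if budget > 0 else None
        enriched.append({**row, "profit": profit, "roi": roi})
    return enriched

# Made-up films; key names are assumed from the feature list.
sample_films = [
    {"Movie_title": "Film A", "Production_budget": "100", "Worldwide_gross": "150"},
    {"Movie_title": "Film B", "Production_budget": "200", "Worldwide_gross": "100"},
    {"Movie_title": "Film C", "Production_budget": "0", "Worldwide_gross": "50"},
]

results = add_profitability(sample_films)
# Rank films with a defined ROI from most to least profitable per dollar spent.
ranked = sorted(
    (film for film in results if film["roi"] is not None),
    key=lambda film: film["roi"],
    reverse=True,
)
```

The same derived fields could then be averaged per genre, director, or production year for the other analyses listed above.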
