100+ datasets found
  1. Movies Performance and Feature Statistics

    • kaggle.com
    Updated Jan 16, 2023
    Cite
    The Devastator (2023). Movies Performance and Feature Statistics [Dataset]. https://www.kaggle.com/datasets/thedevastator/movies-performance-and-feature-statistics
    Explore at:
    Croissant. Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jan 16, 2023
    Dataset provided by
    Kaggle
    Authors
    The Devastator
    Description

    Movies Performance and Feature Statistics

    Analyzing Box Office Performance, Rating and Audience Reactions

    By Yashwanth Sharaff [source]

    About this dataset

    This dataset contains essential characteristics of a variety of movies, including basic information such as the movie's title, budget, MPAA rating, release date, genre, runtime and summary, as well as performance indicators such as gross revenue and rating count. With this dataset we can better understand the film industry and explore how different features and performance metrics relate to one another and to a movie's success. It can also help you identify which features are key indicators of a high-grossing feature film.

    How to use the dataset

    To get the most out of this dataset, you need to understand what each column represents. The 'Title' column gives the title of the movie, which can be used for further searches on streaming services and on websites that provide detailed information about movies. The 'MPAA Rating' column lists the movie's Motion Picture Association of America (MPAA) rating, such as G (General Audiences), PG (Parental Guidance Suggested), PG-13 (Parents Strongly Cautioned) or R (Under 17 Requires Accompanying Parent or Guardian). The 'Budget' column gives an approximate idea of how much a production cost, while the 'Gross' column gives its earnings if it was released in theaters. The 'Release Date' column shows when each film was released or is scheduled to be released. The 'Genre' column describes the type of movie, 'Runtime' gives its length, and 'Rating Count' is the number of people who have rated or reviewed the movie, whether on IMDb, on other services, or in print media such as newspapers. Finally, the 'Summary' column gives an overview of what to expect from the film, which is worth reading before watching, especially if there are children in your family.

    So go ahead - start exploring this interesting dataset today!

    Research Ideas

    • Creating a box office prediction model using budget, genre, release date and MPAA rating
    • Using the summary data to create a sentiment analysis tool for movie reviews
    • Building a recommendation engine for users based on their prior ratings and what other users with similar tastes have rated as highly

    Acknowledgements

    If you use this dataset in your research, please credit the original authors.

    License

    See the dataset description for more information.

    Columns

    File: movies.csv

    | Column name  | Description                                                                    |
    |:-------------|:-------------------------------------------------------------------------------|
    | Title        | The title of the movie. (String)                                               |
    | MPAA Rating  | The Motion Picture Association of America (MPAA) rating of the movie. (String) |
    | Budget       | The budget of the movie in US dollars. (Integer)                               |
    | Gross        | The gross revenue of the movie in US dollars. (Integer)                        |
    | Release Date | The date the movie was released. (Date)                                        |
    | Genre        | The genre of the movie. (String)                                               |
    | Runtime      | The length of the movie in minutes. (Integer)                                  |
    | Rating Count | The number of ratings the movie has received. (Integer)                        |
    | Summary      | A brief summary of the movie. (String)                                         |
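
    As a minimal sketch of how the columns listed above might be read into typed records, the following Python snippet uses only the standard library. The `Movie` class, the `parse_movies` helper, the sample row and the ISO date format are assumptions made for illustration; the real movies.csv may format dates or missing values differently.

```python
import csv
import io
from dataclasses import dataclass
from datetime import date


@dataclass
class Movie:
    title: str
    mpaa_rating: str
    budget: int        # US dollars
    gross: int         # US dollars
    release_date: date
    genre: str
    runtime: int       # minutes
    rating_count: int
    summary: str


def parse_movies(text: str) -> list[Movie]:
    """Parse CSV text whose header matches the documented column names."""
    reader = csv.DictReader(io.StringIO(text))
    movies = []
    for row in reader:
        movies.append(
            Movie(
                title=row["Title"],
                mpaa_rating=row["MPAA Rating"],
                budget=int(row["Budget"]),
                gross=int(row["Gross"]),
                # Assumption: dates are ISO formatted (YYYY-MM-DD).
                release_date=date.fromisoformat(row["Release Date"]),
                genre=row["Genre"],
                runtime=int(row["Runtime"]),
                rating_count=int(row["Rating Count"]),
                summary=row["Summary"],
            )
        )
    return movies


# Hypothetical sample data, not taken from the real file.
SAMPLE = (
    "Title,MPAA Rating,Budget,Gross,Release Date,Genre,Runtime,Rating Count,Summary\n"
    "Example Film,PG-13,1000000,2500000,2001-05-04,Drama,120,1500,A made-up summary.\n"
)

movies = parse_movies(SAMPLE)
profit = movies[0].gross - movies[0].budget  # simple gross-minus-budget figure
```

    Converting the numeric and date columns up front keeps later comparisons (for example, gross against budget) free of string handling.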

    Acknowledgements

    If you use this dataset in your research, please credit Yashwanth Sharaff.

  2. Film Genre Statistics

    • kaggle.com
    zip
    Updated Dec 19, 2023
    Cite
    The Devastator (2023). Film Genre Statistics [Dataset]. https://www.kaggle.com/datasets/thedevastator/film-genre-statistics
    Explore at:
    Available download formats: zip (36435 bytes)
    Dataset updated
    Dec 19, 2023
    Authors
    The Devastator
    Description

    Film Genre Statistics

    Movie genre statistics and revenue data from 1995-2018

    By Throwback Thursday [source]

    About this dataset

    This dataset contains genre statistics for movies released between 1995 and 2018. It provides information on various aspects of the movies, such as gross revenue, tickets sold, and inflation-adjusted figures. The dataset includes columns for genre, year of release, number of movies released in each genre and year, total gross revenue generated by movies in each genre and year, total number of tickets sold for movies in each genre and year, inflation-adjusted gross revenue that takes into account changes in the value of money over time, title of the highest-grossing movie in each genre and year, gross revenue generated by the highest-grossing movie in each genre and year, and inflation-adjusted gross revenue of the highest-grossing movie in each genre and year. This dataset offers insights into film industry trends over a span of more than two decades

    How to use the dataset

    Understanding the Columns

    Before diving into the analysis, let's familiarize ourselves with the different columns in this dataset:

    • Genre: This column represents the genre of each movie.
    • Year: The year in which the movies were released.
    • Movies Released: The number of movies released in a particular genre and year.
    • Gross: The total gross revenue generated by movies in a specific genre and year.
    • Tickets Sold: The total number of tickets sold for movies in a specific genre and year.
    • Inflation-Adjusted Gross: The gross revenue adjusted for inflation, taking into account changes in the value of money over time.
    • Top Movie: The title of the highest-grossing movie in a specific genre and year.
    • Top Movie Gross (That Year): The gross revenue generated by the highest-grossing movie in a specific genre and year.
    • Top Movie Inflation-Adjusted Gross (That Year): The inflation-adjusted gross revenue of the highest-grossing movie in a specific genre and year.

    Analyzing Data

    To make use of this dataset effectively, here are some potential analyses you can perform:

    • Find popular genres: You can determine which genres are popular by looking at columns like Movies Released or Tickets Sold. Analyzing these numbers will give you insights into what types of movies attract more audiences.

    • Measure financial success: Explore columns like Gross, Inflation-Adjusted Gross, or Top Movie Gross (That Year) to compare the financial success of different genres. This will allow you to identify genres that generate higher revenue.

    • Understand movie trends: By analyzing the dataset over different years, you can observe trends in movie releases and gross revenue for specific genres. This information is crucial for understanding how movie preferences change over time.

    • Identify highest-grossing movies: The column Top Movie gives you the title of the highest-grossing movie in each genre and year. You can use this information to analyze the success of specific movies within their respective genres.
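
    The analyses above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rows below are made-up values that reuse the documented column names, and `top_genre_by_year` and `gross_by_genre` are hypothetical helper names, not part of the dataset.

```python
from collections import defaultdict

# Hypothetical rows; the numbers are invented for illustration only.
rows = [
    {"Genre": "Action", "Year": 1995, "Gross": 1_000_000, "Tickets Sold": 200_000},
    {"Genre": "Comedy", "Year": 1995, "Gross": 1_500_000, "Tickets Sold": 300_000},
    {"Genre": "Action", "Year": 1996, "Gross": 2_400_000, "Tickets Sold": 400_000},
    {"Genre": "Comedy", "Year": 1996, "Gross": 900_000, "Tickets Sold": 150_000},
]


def top_genre_by_year(rows, metric="Tickets Sold"):
    """Return {year: genre} for the genre with the largest `metric` in each year."""
    best = {}
    for row in rows:
        year = row["Year"]
        if year not in best or row[metric] > best[year][metric]:
            best[year] = row
    return {year: row["Genre"] for year, row in best.items()}


def gross_by_genre(rows):
    """Sum the Gross column per genre across all years."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["Genre"]] += row["Gross"]
    return dict(totals)


popular = top_genre_by_year(rows)   # most tickets sold per year
revenue = gross_by_genre(rows)      # total gross per genre
```

    Passing `metric="Gross"` to `top_genre_by_year` ranks genres by revenue instead of ticket sales.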

    Data Visualization

    To enhance your analysis, consider using data visualization techniques

    Research Ideas

    • Predicting the popularity and success of movies in different genres: By analyzing the data on tickets sold and gross revenue, we can identify trends and patterns in movie genres that attract more audiences and generate higher revenue. This information can be useful for filmmakers, production studios, and investors to make informed decisions about which genres to focus on for future movie releases.
    • Comparing the performance of movies over time: With the inclusion of inflation-adjusted figures, this dataset allows us to compare the box office success of movies across different years. We can analyze how movies in specific genres have performed over time in terms of gross revenue and adjust these figures for inflation to get a better understanding of their true financial success.
    • Analyzing the impact of genre popularity on ticket sales: By examining the relationship between genre popularity (measured by tickets sold) and total gross revenue, we can gain insights into audience preferences and behavior. This information is valuable for marketing strategies, as it helps determine which movie genres are most likely to attract a larger audience base and generate higher ticket sales

    Acknowledgements

    If you use this dataset in your research, please credit the original authors.

    License

    See the dataset description for more information.

    Columns...

  3. Movies dataset from allmovie

    • crawlfeeds.com
    json, zip
    Updated Dec 26, 2024
    Cite
    Crawl Feeds (2024). Movies dataset from allmovie [Dataset]. https://crawlfeeds.com/datasets/movies-dataset-form-allmovie
    Explore at:
    Available download formats: json, zip
    Dataset updated
    Dec 26, 2024
    Dataset authored and provided by
    Crawl Feeds
    License

    https://crawlfeeds.com/privacy_policy

    Description

    Movies Dataset from AllMovie is a comprehensive collection featuring over 430,000 records, encompassing a wide range of films across various genres and languages. This extensive dataset includes essential data points such as movie titles, genres, release dates, posters, languages, directors, durations, synopses, trailers, average ratings, cast information, and URLs. Such detailed metadata is invaluable for developers, researchers, and enthusiasts aiming to analyze trends, build recommendation systems, or conduct in-depth studies of the film industry.

    For those interested in alternative datasets, the IMDb Non-Commercial Datasets provide subsets of IMDb data accessible for personal and non-commercial use. These datasets allow users to hold local copies of movie information, facilitating various analytical projects.

    Additionally, the MovieLens datasets offer a range of movie rating data suitable for research purposes. For instance, the MovieLens 20M dataset comprises 20 million ratings and 465,000 tag applications applied to 27,000 movies by 138,000 users, making it a valuable resource for studies in user preferences and recommendation algorithms.

    Incorporating these datasets into your projects can significantly enhance the quality and depth of your analyses, providing a solid foundation for exploring various aspects of the cinematic world.

    Why Choose Crawl Feeds for Your Data Needs?

    Crawl Feeds is your trusted partner in acquiring high-quality, curated datasets tailored to your specific requirements. With a vast repository that includes the Movies Dataset, we empower developers and businesses to drive innovation. Explore our easy-to-use platform and transform your ideas into actionable insights.


  4. IMDB movie details dataset

    • crawlfeeds.com
    csv, zip
    Updated Nov 9, 2025
    Cite
    Crawl Feeds (2025). IMDB movie details dataset [Dataset]. https://crawlfeeds.com/datasets/imdb-movie-details-dataset
    Explore at:
    Available download formats: zip, csv
    Dataset updated
    Nov 9, 2025
    Dataset authored and provided by
    Crawl Feeds
    License

    https://crawlfeeds.com/privacy_policy

    Description
    The IMDB Movie Details Dataset is a collection of information about movies, TV shows, and streaming content listed on IMDB. It includes detailed data such as titles, release years, genres, cast, crew, ratings, and more, which makes it a useful resource for film and entertainment enthusiasts. Applications of the dataset span data analysis, machine learning projects, predictive modeling, and insights into industry trends.
    Researchers can explore patterns in movie ratings and genre popularity, while developers can use the dataset to build recommendation systems or applications. Movie buffs can dive deep into historical and contemporary trends in the world of cinema. This dataset not only supports academic and professional pursuits but also opens doors for creative projects in storytelling, content creation, and audience engagement. Whether you’re a developer, researcher, or film enthusiast, the IMDB movie dataset is a powerful tool for uncovering trends and gaining deeper insights into the evolving entertainment landscape.
  5. Most popular movies dataset

    • kaggle.com
    zip
    Updated Aug 14, 2022
    Cite
    itisnarayan63 (2022). Most popular movies dataset [Dataset]. https://www.kaggle.com/datasets/narayan63/most-popular-movies-dataset
    Explore at:
    Available download formats: zip (657558 bytes)
    Dataset updated
    Aug 14, 2022
    Authors
    itisnarayan63
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Content

    This dataset was collected from TMDB and contains 1 file.

    In the dataset

    The data has more than 43,000 rows and 8 columns.

    Columns

    | Column name       | About data (stored in column)       |
    |:------------------|:------------------------------------|
    | popularity        | How popular the movie is            |
    | release_date      | Release date of the movie           |
    | title             | Title of the movie                  |
    | overview          | Brief description of the movie      |
    | vote_average      | Average of the votes (rating)       |
    | vote_count        | How many people voted for the movie |
    | original_language | Language the movie was made in      |
    | original_title    | Original title of the movie         |

    Acknowledgements

    Data collected from TMDB.

  6. Frequency of streaming movies in the U.S. 2021, by age group

    • statista.com
    Updated Nov 27, 2025
    Cite
    Statista (2025). Frequency of streaming movies in the U.S. 2021, by age group [Dataset]. https://www.statista.com/statistics/935493/movies-watching-streaming-frequency-us-by-age/
    Explore at:
    Dataset updated
    Nov 27, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Sep 27, 2021 - Sep 29, 2021
    Area covered
    United States
    Description

    The findings of a survey held in the United States in September 2021 revealed that ** percent of adults aged between 35 and 44 years old said that they watched or streamed movies every day, making respondents in this age group the most likely to do so. By comparison, ** percent of total respondents reported watching movies on a daily basis.

  7. Film, television and video production, summary statistics

    • www150.statcan.gc.ca
    • open.canada.ca
    • +1 more
    Updated Mar 17, 2025
    + more versions
    Cite
    Government of Canada, Statistics Canada (2025). Film, television and video production, summary statistics [Dataset]. http://doi.org/10.25318/2110005901-eng
    Explore at:
    Dataset updated
    Mar 17, 2025
    Dataset provided by
    Statistics Canada (https://statcan.gc.ca/en)
    Area covered
    Canada
    Description

    Summary statistics by North American Industry Classification System (NAICS) for motion picture and video production (NAICS 512110), including operating revenue (dollars x 1,000,000), operating expenses (dollars x 1,000,000), salaries, wages and benefits (dollars x 1,000,000), and operating profit margin (percent). Annual, for five years of data.

  8. Frequency of going to the movies in the U.S. 2022

    • statista.com
    Updated Nov 27, 2025
    Cite
    Statista (2025). Frequency of going to the movies in the U.S. 2022 [Dataset]. https://www.statista.com/statistics/264396/frequency-of-going-to-the-movies-in-the-us/
    Explore at:
    Dataset updated
    Nov 27, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Apr 30, 2022 - May 3, 2022
    Area covered
    United States
    Description

    During a survey carried out in the United States in April and May 2022, approximately 41 percent of responding internet users said they rarely went to the movies. Roughly one-third stated that they went to see a film in theaters sometimes, while eight percent reported doing it often. Almost one out of five interviewees – 18 percent – said they never went to the movies.

    Do wage and age affect moviegoing frequency?

    According to the same source, little more than one-third of Americans whose household income stood below 50 thousand U.S. dollars reported going to the movies often or sometimes in mid-2022. Meanwhile, more than half of those with an income above 100 thousand dollars said the same. The gap added up to 17 percentage points. There was also a generational gap among cinephiles. About half of respondents aged 18 to 34 stated that they usually went to the movies, whereas little more than one-fourth of consumers aged 65 and over reported doing it.

    Regional and gender differences in film viewing

    The moviegoing frequency also varies across the U.S.'s regions. In the Northeast, for example, the share of interviewees saying they went to see a film in theaters either sometimes or often amounted to 45 percent. Within the Midwest, more than 60 percent of respondents in the South said they rarely or never went to the movies as of May 2022. Furthermore, nearly half of American male adults surveyed stated that they visited a movie theater often or sometimes, while little more than one-third of women said the same.

  9. Support of AI use cases in TV and film industries in the U.S. 2023

    • statista.com
    Updated Jul 18, 2023
    Cite
    Statista (2023). Support of AI use cases in TV and film industries in the U.S. 2023 [Dataset]. https://www.statista.com/statistics/1401588/support-ai-use-cases-film-movies-us/
    Explore at:
    Dataset updated
    Jul 18, 2023
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Jul 18, 2023 - Jul 20, 2023
    Area covered
    United States
    Description

    According to a survey conducted in July 2023 in the United States about AI use cases in the TV and film industries, ** percent of respondents supported the use of such technology to create special effects or to alter actors' appearances. It was the most supported AI use case, while generating voices for animated characters came second with ** percent. The majority of entertainment-industry professionals agreed that the role of AI was a point of contention in the writers' strike that was ongoing at the time.

  10. National box office statistics

    • data.gov.tw
    csv, json
    Cite
    Ministry of Culture, National box office statistics [Dataset]. https://data.gov.tw/en/datasets/94224
    Explore at:
    Available download formats: json, csv
    Dataset authored and provided by
    Ministry of Culture
    License

    https://data.gov.tw/license

    Description

    This dataset provides national theater box office statistics for films distributed by the Administrative Institution National Film and Audiovisual Culture Center. The data is current up to the last Sunday before the announcement date and does not include films that have been screened for fewer than 7 calendar days. The earliest CSV format data in this dataset begins on July 30, 2018, and the earliest JSON format data begins on March 1, 2020. JSON format queries require a start date and an end date (given as year, month, and day) and can return data for a maximum of 90 days at a time.
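
    Because a JSON query can cover at most 90 days, a longer period has to be requested in several pieces. The sketch below shows one way to split a date range into such windows in Python. It deliberately makes no request: the endpoint and its parameter names are not given in this description, `split_into_windows` is a hypothetical helper, and counting the 90 days inclusively of both endpoints is an assumption.

```python
from datetime import date, timedelta


def split_into_windows(start: date, end: date, max_days: int = 90):
    """Split [start, end] (inclusive) into consecutive windows of at most max_days days."""
    if end < start:
        raise ValueError("end must not be before start")
    windows = []
    window_start = start
    while window_start <= end:
        # Assumption: a window of N days spans N calendar dates, endpoints included.
        window_end = min(window_start + timedelta(days=max_days - 1), end)
        windows.append((window_start, window_end))
        window_start = window_end + timedelta(days=1)
    return windows


# March 1, 2020 is the earliest JSON date mentioned in the description.
windows = split_into_windows(date(2020, 3, 1), date(2020, 12, 31))
```

    Each `(start, end)` pair can then be formatted as year, month, and day and passed to whatever query mechanism the portal provides.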

  11. Movie Data - X - Test - w2v

    • data.researchdatafinder.qut.edu.au
    Updated Apr 8, 2018
    Cite
    (2018). Movie Data - X - Test - w2v [Dataset]. https://data.researchdatafinder.qut.edu.au/dataset/survey-word-vector/resource/e638fc06-7ef3-4a41-85e2-21f7fad2dfb3
    Explore at:
    Dataset updated
    Apr 8, 2018
    License

    http://researchdatafinder.qut.edu.au/display/n15252

    Description

    This file contains the features for the test portion of the movie dataset. The data has been converted into an average word vector. This is 50% of the total movie results. QUT Research Data Repository dataset resource, available for download.
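
    To illustrate what an average word vector is, here is a minimal Python sketch. It is not the method used to build this file: the toy two-dimensional embeddings, the `average_word_vector` helper, and the choice to skip unknown tokens are all assumptions made for illustration.

```python
def average_word_vector(tokens, embeddings, dim):
    """Average the vectors of the tokens that have an entry in `embeddings`.

    Tokens without an embedding are skipped. If no token matches,
    a zero vector of length `dim` is returned.
    """
    total = [0.0] * dim
    count = 0
    for token in tokens:
        vector = embeddings.get(token)
        if vector is None:
            continue
        for i, value in enumerate(vector):
            total[i] += value
        count += 1
    if count == 0:
        return total
    return [value / count for value in total]


# Hypothetical 2-dimensional embeddings, for illustration only.
toy_embeddings = {"great": [1.0, 0.0], "movie": [0.0, 1.0]}
features = average_word_vector(["great", "movie", "unknownword"], toy_embeddings, dim=2)
```

    The result is a fixed-length feature vector regardless of how many words the text contains, which is what makes this representation convenient as classifier input.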

  12. IMDb Movies Metadata Dataset – 4.5M Records (Global Coverage)

    • crawlfeeds.com
    csv, zip
    Updated Nov 9, 2025
    Cite
    Crawl Feeds (2025). IMDb Movies Metadata Dataset – 4.5M Records (Global Coverage) [Dataset]. https://crawlfeeds.com/datasets/imdb-movies-metadata-dataset-4-5m-records-global-coverage
    Explore at:
    Available download formats: csv, zip
    Dataset updated
    Nov 9, 2025
    Dataset authored and provided by
    Crawl Feeds
    License

    https://crawlfeeds.com/privacy_policy

    Description

    Unlock one of the most comprehensive movie datasets available—4.5 million structured IMDb movie records, extracted and enriched for data science, machine learning, and entertainment research.

    This dataset includes a vast collection of global movie metadata, including details on title, release year, genre, country, language, runtime, cast, directors, IMDb ratings, reviews, and synopsis. Whether you're building a recommendation engine, benchmarking trends, or training AI models, this dataset is designed to give you deep and wide access to cinematic data across decades and continents.

    Perfect for use in film analytics, OTT platforms, review sentiment analysis, knowledge graphs, and LLM fine-tuning, the dataset is cleaned, normalized, and exportable in multiple formats.

    What’s Included:

    • Genres: Drama, Comedy, Horror, Action, Sci-Fi, Documentary, and more

    • Delivery: Direct download

    Use Cases:

    • Train LLMs or chatbots on cinematic language and metadata

    • Build or enrich movie recommendation engines

    • Run cross-lingual or multi-region film analytics

    • Benchmark genre popularity across time periods

    • Power academic studies or entertainment dashboards

    • Feed into knowledge graphs, search engines, or NLP pipelines

  13. Movie releases in the U.S. & Canada 2000-2024

    • statista.com
    Cite
    Statista, Movie releases in the U.S. & Canada 2000-2024 [Dataset]. https://www.statista.com/statistics/187122/movie-releases-in-north-america-since-2001/
    Explore at:
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    Canada, United States
    Description

    In 2024, a total of 569 movies were released in the United States and Canada, up from 506 in the previous year. Still, these figures are under the 792 titles released in 2019, before the COVID-19 outbreak.

    Will moviegoers return?

    The box office revenue in the U.S. and Canada more than tripled between 2020 and 2022, when it reached almost 7.4 billion U.S. dollars. The 2022 result still fell way behind the 11.3-billion-dollar annual revenue recorded just before the pandemic. But there are ways to attract newcomers to the moviegoing experience. During a mid-2022 survey conducted among members of the Generation Z – aged between 13 and 24 years – more than half of respondents mentioned movie offering as a leading motivation to go to the movies. About 40 percent of interviewees included the quality of the service and the physical comfort of the seats at the movie theater among their main incentives.

    Cinema circuits

    As the industry tries to reinvent itself for a post-pandemic scenario, the top movie theater chains in North America slowly bounce back. Their financial results improved since the coronavirus outbreak, but when or if they will see figures similar to those recorded before 2020 remains an open question. The leading circuit, AMC Theatres, reported a revenue of more than 2.5 billion dollars in 2021, over twice as much as in the previous year.

  14. Movie genres viewers want to see more in theaters worldwide 2025, by age

    • statista.com
    Updated Mar 24, 2025
    Cite
    Statista (2025). Movie genres viewers want to see more in theaters worldwide 2025, by age [Dataset]. https://www.statista.com/statistics/1607642/movie-genres-viewers-want-to-see-cinemas-theaters-world/
    Explore at:
    Dataset updated
    Mar 24, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Jan 2025
    Area covered
    Worldwide
    Description

    According to a survey conducted in several markets around the world in January 2025, more than half of respondents across all age brackets wanted to see more action and adventure movies. While younger consumers would like to see more horror movies in theaters, older viewers were hoping to see more dramas.

  15. Ways of finding new TV shows and movies in the U.S. and Canada 2022-2024

    • statista.com
    Updated Nov 27, 2025
    Cite
    Statista (2025). Ways of finding new TV shows and movies in the U.S. and Canada 2022-2024 [Dataset]. https://www.statista.com/statistics/935428/methods-video-content-discovery/
    Explore at:
    Dataset updated
    Nov 27, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    Canada, United States
    Description

    The most common way in which Americans and Canadians discover new movies and TV shows is through word of mouth or from friends, as reported by ***** percent of respondents to a survey in the fourth quarter of 2024. Getting to know new content via news articles or stories outside social media was less common, reported by around ***** percent of people interviewed.

  16. 60,000+ Movies, 100+ Years of Data, Rich Metadata

    • kaggle.com
    zip
    Updated Sep 28, 2025
    Cite
    Raed Addala (2025). 60,000+ Movies, 100+ Years of Data, Rich Metadata [Dataset]. https://www.kaggle.com/datasets/raedaddala/top-500-600-movies-of-each-year-from-1960-to-2024
    Explore at:
    Available download formats: zip (53341704 bytes)
    Dataset updated
    Sep 28, 2025
    Authors
    Raed Addala
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Links:

    For details about the scraping process, visit the code repository on GitHub.

    About the Dataset

    The final_data.csv file is a consolidated dataset combining data for the most popular 500–600 movies per year from 1920 to 2025, extracted from IMDb. This dataset aggregates all the yearly merged_movies_data_[year].csv files into a comprehensive CSV file for streamlined analysis.

    File Description

    The final_data.csv file includes:
    - Basic movie details: id, title, year, duration, MPA, rating, votes, meta_score, description, Movie_Link.
    - Financial data: budget, opening_weekend_gross, gross_worldwide, gross_us_canada.
    - Credits: directors, writers, stars.
    - Additional details: genres, countries_origin, filming_locations, production_companies, languages.
    - Awards: awards_content (wins, nominations, Oscars).
    - Release info: release_date.

    Columns:
    id,title,year,duration,MPA,rating,votes,meta_score,description,Movie_Link,writers,directors,stars,budget,opening_weekend_gross,gross_worldwide,gross_us_canada,release_date,countries_origin,filming_locations,production_companies,awards_content,genres,languages

    Data Cleaning Notes

    • Uniform Structure: The merged dataset ensures consistent formatting across all years, with cleaned titles, standardized links, and duplicate IDs removed.

    Updates

    The final_data.csv file is updated annually in December to reflect the most recent data additions and corrections.

    Applications

    This dataset is ideal for:
    - Longitudinal Analysis: Studying trends in movie production, popularity, and financial performance over a century.
    - Predictive Analytics: Building models to forecast box office performance or award outcomes.
    - Recommender Systems: Leveraging attributes like genres, cast, and ratings for personalized recommendations.
    - Comparative Studies: Comparing cinematic trends across different eras, regions, or genres.

    Dataset Features

    • Extensive Coverage: Over 60,000 movies spanning 100+ years.
    • Rich Metadata: Comprehensive information on movie attributes, financials, and recognition.
    • Ready for Analysis: Cleaned and consolidated for direct integration into machine learning or analytics workflows.

    Notes

    Please feel free to contact me by mail or open an issue on GitHub to request more features, report errors in the data, or suggest enhancements.

  17. Bollywood Movies data

    • data.mendeley.com
    • kaggle.com
    Updated May 12, 2020
    + more versions
    Cite
    Prashant Premkumar (2020). Bollywood Movies data [Dataset]. http://doi.org/10.17632/3c57btcxy9.1
    Explore at:
    Dataset updated
    May 12, 2020
    Authors
    Prashant Premkumar
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Using a Python script to scrape data from the web, we collected data pertaining to all 1698 Hindi language movies that were released in India across a 13-year period (2005-2017) from the website of Box Office India.

  18. Description of data from IMDb.

    • plos.figshare.com
    zip
    Updated Jun 2, 2023
    Cite
    Marlon Ramos; Angelo M. Calvão; Celia Anteneodo (2023). Description of data from IMDb. [Dataset]. http://doi.org/10.1371/journal.pone.0136083.s001
    Explore at:
    Available download formats: zip
    Dataset updated
    Jun 2, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Marlon Ramos; Angelo M. Calvão; Celia Anteneodo
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    We collected votes (from 1 to 10 stars) for all movies, excluding TV episodes (336,090,882 votes in total for 300,723 movies), from March 19 to 28, 2013 (set #1). Using the same list of movies, we collected the number of votes again from December 8 to 18, 2014 (set #2, 465,292,451 votes) and from January 5 to 10, 2015 (set #3, 471,222,420 votes), as shown in Fig 10. For budgets, we used a new list and collected data from February 5 to 8, 2015. Results with fewer than 5 votes (in 2013) are not shown.

    Number of items by type: 33,941 (Documentary); 133,775 (Feature Film); 3,172 (Mini-Series); 50,408 (Short Film); 1,071 (TV Episode); 25,168 (TV Movie); 33,165 (TV Series); 2,450 (TV Special); 12,120 (Video); 5,453 (Video Game).

    By genre: 24,911 (Action); 93 (Adult); 15,651 (Adventure); 18,918 (Animation); 5,385 (Biography); 74,393 (Comedy); 18,693 (Crime); 37,250 (Documentary); 97,087 (Drama); 16,022 (Family); 8,677 (Fantasy); 567 (Film Noir); 1,575 (Game Show); 5,525 (History); 15,072 (Horror); 10,212 (Music); 5,840 (Musical); 8,170 (Mystery); 1,036 (News); 3,605 (Reality TV); 21,165 (Romance); 8,239 (Sci-Fi); 61,538 (Short); 4,360 (Sport); 1,467 (Talk Show); 16,246 (Thriller); 5,080 (War); 4,549 (Western).

    An item could be defined by more than one genre. As a final observation, it is possible for a user to remove his or her vote; as a consequence, a small fraction of movies show a decreasing number of votes. However, this represents a negligible fraction of the movies. We used the following list: http://www.imdb.com/search/title?title_type=feature,tv_movie,tv_series,tv_special,mini_series,documentary,game,short,video,unknown&user_rating=1.0,10. (ZIP)

  19. letterboxd-all-movie-data

    • huggingface.co
    Updated Jul 21, 2025
    Cite
    Salih Mert Canseven (2025). letterboxd-all-movie-data [Dataset]. https://huggingface.co/datasets/pkchwy/letterboxd-all-movie-data
    Explore at:
    Dataset updated
    Jul 21, 2025
    Authors
    Salih Mert Canseven
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Letterboxd Film Dataset

    This dataset contains a comprehensive collection of 847,209 films from the Letterboxd platform, including movie information, user reviews, and ratings.

    Dataset Summary

    • Total Films: 847,209
    • File Size: ~1.12 GB (1,120,572,122 bytes)
    • Format: JSONL (JSON Lines)
    • Language: Primarily English, with some multilingual content

    Data Structure

    Each line contains a JSON object with the following fields: { "url":… See the full description on the dataset page: https://huggingface.co/datasets/pkchwy/letterboxd-all-movie-data.
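
    Because the file is JSON Lines, it can be read one record at a time instead of loading all ~1.12 GB into memory. The sketch below is only an illustration: the helper name `iter_jsonl` and the sample URLs are hypothetical, and `url` is the only field name visible in the truncated description above, so the remaining fields are not assumed.

```python
import io
import json
from typing import Iterator, TextIO


def iter_jsonl(stream: TextIO) -> Iterator[dict]:
    """Yield one dict per non-empty line of a JSON Lines stream."""
    for line in stream:
        line = line.strip()
        if line:  # skip blank lines rather than failing on them
            yield json.loads(line)


# Tiny in-memory stand-in for the real file; the URLs are placeholders,
# not actual records from the dataset.
sample = io.StringIO(
    '{"url": "https://example.com/film/a"}\n'
    '\n'
    '{"url": "https://example.com/film/b"}\n'
)
records = list(iter_jsonl(sample))
urls = [record["url"] for record in records]
```

    For the real download, the same generator could be fed an open file handle, for example `open(path, encoding="utf-8")`, where `path` is wherever the JSONL file was saved.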

  20. Full TMDB Movies Dataset 2024 (1M Movies)

    • kaggle.com
    zip
    Updated Nov 11, 2025
    Cite
    asaniczka (2025). Full TMDB Movies Dataset 2024 (1M Movies) [Dataset]. https://www.kaggle.com/datasets/asaniczka/tmdb-movies-dataset-2023-930k-movies
    Explore at:
    Available download formats: zip (239404730 bytes)
    Dataset updated
    Nov 11, 2025
    Authors
    asaniczka
    License

    Open Data Commons Attribution License (ODC-By) v1.0: https://www.opendatacommons.org/licenses/by/1.0/
    License information was derived automatically

    Description

    The TMDb (The Movie Database) is a comprehensive movie database that provides information about movies, including details like titles, ratings, release dates, revenue, genres, and much more.

    This dataset contains a collection of 1,000,000 movies from the TMDB database.

    Dataset is updated daily. If you find this dataset valuable, don't forget to hit the upvote button! 😊💝

    Interesting Task Ideas:

    1. Predict movie ratings based on features such as revenue, popularity, genre, and runtime.
    2. Identify trends in movie release dates and analyze their impact on revenue.
    3. Analyze the relationship between budget, revenue, and popularity to determine factors that contribute to a movie's success.
    4. Build a recommendation system that suggests similar movies based on genres, production companies, and language.
    5. Perform sentiment analysis on movie reviews to understand audience reactions.
    6. Explore the impact of movie genres on popularity and revenue.
    7. Investigate the correlation between runtime and audience engagement.
    8. Identify successful production companies and analyze their strategies.
    9. Utilize natural language processing techniques to extract meaningful insights from movie overviews.
    10. Visualize movie popularity over time and identify popular genres in different periods.

    Check out my other datasets:

    Clash of Clans Clans Dataset 2023 (3.5M Clans)

    Black-White Wage Gap in the USA Dataset

    130K Kindle Books

    USA Unemployment Rates by Demographics & Race

    150K TMDb TV Shows

    Photo by Onur Binay on Unsplash

Cite
The Devastator (2023). Movies Performance and Feature Statistics [Dataset]. https://www.kaggle.com/datasets/thedevastator/movies-performance-and-feature-statistics

Movies Performance and Feature Statistics

Analyzing Box Office Performance, Rating and Audience Reactions

Explore at:
Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Jan 16, 2023
Dataset provided by
Kaggle
Authors
The Devastator
Description

By Yashwanth Sharaff [source]

About this dataset

This dataset contains essential characteristics of a variety of movies, including basic information such as the movie's title and budget, as well as attributes like MPAA rating, gross revenue, release date, genre, runtime, rating count, and summary. With this dataset we can better understand the film industry and uncover how different features and performance metrics relate to one another and to a movie's success. The dataset can also help you identify which features are key indicators of a high-grossing feature film.

How to use the dataset

To get the most out of this dataset, you need to understand what each column represents. The 'Title' column gives the title of the movie, which can be used for further searches on streaming services and movie information websites. The 'MPAA Rating' column lists the movie's Motion Picture Association (MPAA) rating, such as G (General Audiences), PG (Parental Guidance Suggested), PG-13 (Parents Strongly Cautioned), or R (Under 17 Requires Accompanying Parent or Guardian). The 'Budget' column gives an approximate idea of how much a production cost, while the 'Gross' column gives its earnings from theatrical release. The 'Release Date' column records when each film was released or is scheduled for release. The 'Genre' column describes the type of movie, 'Runtime' gives its length in minutes, and 'Rating Count' gives the number of people who have rated or reviewed it, whether on IMDb, other online services, or in print media such as newspapers. Finally, the 'Summary' column gives an overview of what to expect from the film, which is worth checking before watching, especially with children in the family.

So go ahead - start exploring this interesting dataset today!

Research Ideas

  • Creating a box office prediction model using budget, genre, release date and MPAA rating
  • Using the summary data to create a sentiment analysis tool for movie reviews
  • Building a recommendation engine based on users' prior ratings and on what other users with similar tastes have rated highly

License

See the dataset description for more information.

Columns

File: movies.csv

| Column name  | Description                                                                    |
|:-------------|:-------------------------------------------------------------------------------|
| Title        | The title of the movie. (String)                                               |
| MPAA Rating  | The Motion Picture Association of America (MPAA) rating of the movie. (String) |
| Budget       | The budget of the movie in US dollars. (Integer)                               |
| Gross        | The gross revenue of the movie in US dollars. (Integer)                        |
| Release Date | The date the movie was released. (Date)                                        |
| Genre        | The genre of the movie. (String)                                               |
| Runtime      | The length of the movie in minutes. (Integer)                                  |
| Rating Count | The number of ratings the movie has received. (Integer)                        |
| Summary      | A brief summary of the movie. (String)                                         |
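
As a rough illustration of how the column types listed above might be applied when reading the file, here is a minimal sketch using only the Python standard library. The helper names `parse_int` and `load_movies` are hypothetical, the sample rows are invented placeholders rather than real records, and 'Release Date' is left as a string because the date format is not documented here.

```python
import csv
import io

# Columns documented as Integer in the column table.
INT_COLUMNS = ("Budget", "Gross", "Runtime", "Rating Count")


def parse_int(value):
    """Return the cell as an int, or None if it is empty or not a whole number."""
    try:
        return int((value or "").strip())
    except ValueError:
        return None


def load_movies(stream):
    """Read movie rows from a CSV stream, converting the integer columns."""
    rows = []
    for row in csv.DictReader(stream):
        for column in INT_COLUMNS:
            row[column] = parse_int(row.get(column))
        rows.append(row)
    return rows


# Invented placeholder rows with the documented header, for demonstration only.
sample = (
    "Title,MPAA Rating,Budget,Gross,Release Date,Genre,Runtime,Rating Count,Summary\n"
    "Example Film,PG,1000000,2500000,2020-01-01,Drama,95,1200,A placeholder summary.\n"
    "Another Film,R,,400000,2021-06-15,Comedy,101,n/a,Another placeholder.\n"
)
movies = load_movies(io.StringIO(sample))
```

Returning `None` for cells that cannot be parsed keeps missing budgets or counts distinguishable from a genuine zero, which matters for any later revenue or prediction analysis.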

Acknowledgements

If you use this dataset in your research, please credit Yashwanth Sharaff.
