100+ datasets found
  1. The Ultimate Film Statistics Dataset - for ML🏆🎬

    • kaggle.com
    Updated Jul 9, 2023
    Cite
    Alessandro Lo Bello (2023). The Ultimate Film Statistics Dataset - for ML🏆🎬 [Dataset]. https://www.kaggle.com/datasets/alessandrolobello/the-ultimate-film-statistics-dataset-for-ml/data
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jul 9, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Alessandro Lo Bello
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Description: This dataset provides comprehensive movie statistics compiled from multiple sources, including Wikipedia, The Numbers, and IMDb. It offers a rich collection of information and insights into various aspects of movies, such as movie titles, production dates, genres, runtime minutes, director information, average ratings, number of votes, approval index, production budgets, domestic gross earnings, and worldwide gross earnings.

    The dataset combines data scraped from Wikipedia, which includes details about movie titles, production dates, genres, runtime minutes, and director information, with data from The Numbers, a reliable source for box office statistics. Additionally, IMDb data is integrated to provide information on average ratings, number of votes, and other movie-related attributes.

    With this dataset, users can analyze and explore trends in the film industry, assess the financial success of movies, identify popular genres, and investigate the relationship between average ratings and box office performance. Researchers, movie enthusiasts, and data analysts can leverage this dataset for various purposes, including data visualization, predictive modeling, and deeper understanding of the movie landscape.

    Features:
    - Movie_title
    - Production_date
    - Genres
    - Runtime_minutes
    - Director_name (primaryName)
    - Director_professions (primaryProfession)
    - Director_birthYear
    - Director_deathYear
    - Movie_averageRating: the average rating given by online users for a particular movie
    - Movie_numberOfVotes: the number of votes given by online users for a particular movie
    - Approval_Index: a normalized indicator (on a 0-10 scale) calculated by multiplying the logarithm of the number of votes by the average user rating. It provides a concise measure of a movie's overall popularity and approval among online viewers, penalizing both films that got too few reviews and blockbusters that got too many.
    - Production_budget ($)
    - Domestic_gross ($)
    - Worldwide_gross ($)
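
    The Approval_Index description can be sketched roughly as follows. This is only an illustration, not the author's actual computation: the natural logarithm and the rescaling to 0-10 by dividing by the largest raw score are assumptions, since the description names neither the log base nor the normalization. The sketch also does not attempt the stated penalty for blockbusters with very many votes. The function names and sample values are hypothetical.

```python
import math
from typing import List, Tuple


def raw_approval(avg_rating: float, num_votes: int) -> float:
    """log(number of votes) multiplied by the average rating."""
    # Natural log is an assumption; the description does not name the base.
    return math.log(num_votes) * avg_rating


def approval_index(movies: List[Tuple[float, int]]) -> List[float]:
    """Rescale raw scores to 0-10 by dividing by the maximum (an assumption)."""
    raw = [raw_approval(rating, votes) for rating, votes in movies]
    top = max(raw)
    if top == 0:
        return [0.0 for _ in raw]
    return [10 * value / top for value in raw]


# Hypothetical (average rating, number of votes) pairs.
sample = [(8.5, 200_000), (9.5, 12), (6.0, 1_500_000)]
scores = approval_index(sample)
```

    In this sketch a film with a very high rating but only a dozen votes scores well below a widely rated film, which matches the stated intent of penalizing titles with too few reviews.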

    Potential Applications:

    - Box office analysis: Analyze the relationship between production budgets, domestic and worldwide gross earnings, and profitability.
    - Genre analysis: Identify the most popular genres based on movie counts and analyze their performance.
    - Rating analysis: Explore the relationship between average ratings, number of votes, and financial success.
    - Director analysis: Investigate the impact of directors on movie ratings and financial performance.
    - Time-based analysis: Study movie trends over different production years and observe changes in production budgets, box office earnings, and genre preferences.

    By utilizing this dataset, users can gain valuable insights into the movie industry and uncover patterns that can inform decision-making, market research, and creative strategies.

  2. National box office statistics

    • data.gov.tw
    csv, json
    Updated Aug 27, 2025
    Cite
    Ministry of Culture (2025). National box office statistics [Dataset]. https://data.gov.tw/en/datasets/94224
    Available download formats: json, csv
    Dataset updated
    Aug 27, 2025
    Dataset authored and provided by
    Ministry of Culture
    License

    https://data.gov.tw/license

    Description

    This dataset provides national theater box office statistics for films distributed by the Administrative Institution National Film and Audiovisual Culture Center. The data is current up to the last Sunday before the announcement date and does not include films that have been screened for fewer than 7 calendar days. The earliest CSV data in this dataset begins on July 30, 2018, and the earliest JSON data begins on March 1, 2020. JSON queries require a start date and an end date (year, month, and day) and return at most 90 days of data per query.
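
    Because a JSON query covers at most 90 days, a longer period has to be split into several requests. The sketch below shows one way to cut a date range into windows of at most 90 days. It assumes the limit counts both the start and end day; the function name is hypothetical, and the actual query URL and parameter names are not given in the description, so they are left out.

```python
from datetime import date, timedelta
from typing import Iterator, Tuple

MAX_DAYS = 90  # per the description: at most 90 days of data per JSON query


def split_range(start: date, end: date,
                max_days: int = MAX_DAYS) -> Iterator[Tuple[date, date]]:
    """Yield inclusive (window_start, window_end) pairs of at most max_days days."""
    cursor = start
    while cursor <= end:
        window_end = min(cursor + timedelta(days=max_days - 1), end)
        yield cursor, window_end
        cursor = window_end + timedelta(days=1)


# Example: everything from the first JSON date through the end of 2020.
windows = list(split_range(date(2020, 3, 1), date(2020, 12, 31)))
```

    Each resulting pair can then be used as the start and end date of one query.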

  3. IMDB movie details dataset

    • crawlfeeds.com
    csv, zip
    Updated Jul 5, 2025
    Cite
    Crawl Feeds (2025). IMDB movie details dataset [Dataset]. https://crawlfeeds.com/datasets/imdb-movie-details-dataset
    Available download formats: zip, csv
    Dataset updated
    Jul 5, 2025
    Dataset authored and provided by
    Crawl Feeds
    License

    https://crawlfeeds.com/privacy_policy

    Description
    The IMDB Movie Details Dataset is a comprehensive collection of information about movies, TV shows, and streaming content listed on IMDB. It includes detailed data such as titles, release years, genres, cast, crew, ratings, and more, making it a go-to resource for film and entertainment enthusiasts. The dataset suits data analysis, with applications spanning machine learning projects, predictive modeling, and insights into industry trends.
    Researchers can explore patterns in movie ratings and genre popularity, while developers can use the dataset to build recommendation systems or applications. Movie buffs can dive deep into historical and contemporary trends in the world of cinema. This dataset not only supports academic and professional pursuits but also opens doors for creative projects in storytelling, content creation, and audience engagement. Whether you’re a developer, researcher, or film enthusiast, the IMDB movie dataset is a powerful tool for uncovering trends and gaining deeper insights into the evolving entertainment landscape.
  4. Average revenue of films in the U.S. & Canada 1995-2025, by selected source...

    • statista.com
    Updated Jan 31, 2025
    Cite
    Statista (2025). Average revenue of films in the U.S. & Canada 1995-2025, by selected source material [Dataset]. https://www.statista.com/statistics/188689/movie-sources-in-north-america-by-average-box-office-revenue/
    Dataset updated
    Jan 31, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    Canada, United States
    Description

    Between 1995 and 2025, a movie based on comics or graphic novels grossed, on average, about 88.36 million U.S. dollars across the United States and Canada – collectively known as the North American box office. Spin-offs followed as the second-most commercially successful film source material, with average box office revenue of around 86.32 million dollars.

  5. rotten_tomatoes

    • huggingface.co
    Updated Jun 4, 2024
    Cite
    cornell-movie-review-data (2024). rotten_tomatoes [Dataset]. https://huggingface.co/datasets/cornell-movie-review-data/rotten_tomatoes
    Explore at:
    Croissant
    Dataset updated
    Jun 4, 2024
    Dataset authored and provided by
    cornell-movie-review-data
    License

    https://choosealicense.com/licenses/unknown/

    Description

    Dataset Card for "rotten_tomatoes"

      Dataset Summary
    

    Movie Review Dataset. This is a dataset containing 5,331 positive and 5,331 negative processed sentences from Rotten Tomatoes movie reviews. This data was first used in Bo Pang and Lillian Lee, "Seeing stars: Exploiting class relationships for sentiment categorization with respect to rating scales", Proceedings of the ACL, 2005.

      Supported Tasks and Leaderboards
    

    More Information Needed

      Languages… See the full description on the dataset page: https://huggingface.co/datasets/cornell-movie-review-data/rotten_tomatoes.
    
  6. The Complete Movie Dataset

    • kaggle.com
    Updated Aug 6, 2022
    Cite
    Maya Soffer (2022). The Complete Movie Dataset [Dataset]. https://www.kaggle.com/datasets/mayasoffer/the-complete-movie-dataset
    Explore at:
    Croissant
    Dataset updated
    Aug 6, 2022
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Maya Soffer
    Description

    Introduction

    This data set was scraped from the site https://www.the-numbers.com/ using Python 3. It has data on more than 13k movies and contains monetary data (Domestic Box Office, Infl. Adj. Dom. BO, Opening Weekend, and more) as well as "creative" cinema data (Comparisons, Creative Type, Genre, and more). The complete scraping code I wrote to create the data set is available in my profile: https://www.kaggle.com/code/mayasoffer/movies-data-scraper

    Important Info

    Please note that the data was scraped entirely from The Numbers website, therefore:
    - Some data is missing, matching what is missing on the site.
    - The scraping was performed on 01.03.22 (March 2022), so all the data is accurate as of that time.
    - For more information on how the columns were created and where the site originally got the data, please refer to the site itself.
    - Lastly, note that I scraped the data and saved it as CSV. However, all the columns were scraped in their original form, as they were written on the website, so some "cleaning" of the columns is necessary before any analysis can take place.
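
    As a rough idea of the kind of cleaning mentioned above, the sketch below converts a monetary string to an integer. The exact raw format of the columns is not shown in the description, so the "$1,234,567" style (whole dollars with a dollar sign and thousands separators) is an assumption, and the function name is hypothetical.

```python
from typing import Optional


def parse_money(text: Optional[str]) -> Optional[int]:
    """Turn a string such as '$1,234,567' into 1234567; return None if unparseable."""
    # Assumed raw format: optional '$', comma thousands separators, whole dollars.
    cleaned = (text or "").replace("$", "").replace(",", "").strip()
    try:
        return int(cleaned)
    except ValueError:
        # Empty cells and non-numeric placeholders are treated as missing.
        return None


examples = [parse_money(value) for value in ["$1,234,567", "", "n/a"]]
```

    Returning None rather than 0 for unparseable cells keeps missing values distinguishable from genuine zero amounts.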

    Inspiration

    The data is very diverse, contains many different columns, and goes back to 1995, so there are many analysis options. Here are a few analysis leads I thought about:
    - How have genres changed throughout the years? Which genres are the most popular over the years (revenue-wise, legs, opening week...)? Which new genres gained popularity (animation, for example)?
    - Does MPAA rating impact revenue?

    And much more...

    Thank you for using my dataset!

  7. Film Industry Statistics 2024

    • pzaz.io
    Updated Mar 28, 2024
    Cite
    Pzaz (2024). Film Industry Statistics 2024 [Dataset]. https://pzaz.io/producer-blog/film-industry-statistics/
    Dataset updated
    Mar 28, 2024
    Dataset authored and provided by
    Pzaz
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    2024
    Area covered
    World
    Description

    Using data from Polly, sourced from an independent sample of 2,950,625 people on Twitter, Reddit and TikTok worldwide from March 13, 2023, to March 13, 2024, we delved deeper into what people really think about the state of the film industry. In engagement, the USA (Hollywood) leads overwhelmingly at 68%, ahead of India (Bollywood, 5.8%), followed by Italy (5.6%), Japan (5%), South Korea (4.1%), France (35%), Nigeria (Nollywood, 29%), and China (1.1%). This report has a breakdown by gender, age and worldwide region.

  8. Box office revenue in the U.S. & Canada 1995-2024, by movie rating

    • statista.com
    Updated Jan 6, 2025
    Cite
    Statista (2025). Box office revenue in the U.S. & Canada 1995-2024, by movie rating [Dataset]. https://www.statista.com/statistics/433709/highest-grossing-movies-domestic-box-office-rating/
    Dataset updated
    Jan 6, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    Canada, United States
    Description

    Between 1995 and 2024, PG-13-rated movies grossed approximately 126.64 billion U.S. dollars at the North American box office – a term that excludes Mexico and includes Canada and the United States. R-rated and PG-rated films grossed around 69.28 billion and 56.04 billion dollars, respectively.

  9. "9,565 Top-Rated Movies Dataset"

    • kaggle.com
    Updated Aug 19, 2024
    Cite
    Harshit@85 (2024). "9,565 Top-Rated Movies Dataset" [Dataset]. https://www.kaggle.com/datasets/harshit85/9565-top-rated-movies-dataset
    Explore at:
    Croissant
    Dataset updated
    Aug 19, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Harshit@85
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    About the Dataset

    Title: 9,565 Top-Rated Movies Dataset

    Description:
    This dataset offers a comprehensive collection of 9,565 of the highest-rated movies according to audience ratings on the Movie Database (TMDb). The dataset includes detailed information about each movie, such as its title, overview, release date, popularity score, average vote, and vote count. It is designed to be a valuable resource for anyone interested in exploring trends in popular cinema, analyzing factors that contribute to a movie’s success, or building recommendation engines.

    Key Features:
    - Title: The official title of each movie.
    - Overview: A brief synopsis or description of the movie's plot.
    - Release Date: The release date of the movie, formatted as YYYY-MM-DD.
    - Popularity: A score indicating the current popularity of the movie on TMDb, which can be used to gauge current interest.
    - Vote Average: The average rating of the movie, based on user votes.
    - Vote Count: The total number of votes the movie has received.

    Data Source: The data was sourced from the TMDb API, a well-regarded platform for movie information, using the /movie/top_rated endpoint. The dataset represents a snapshot of the highest-rated movies as of the time of data collection.

    Data Collection Process:
    - API Access: Data was retrieved programmatically using TMDb’s API.
    - Pagination Handling: Multiple API requests were made to cover all pages of top-rated movies, ensuring the dataset’s comprehensiveness.
    - Data Aggregation: Collected data was aggregated into a single, unified dataset using the pandas library.
    - Cleaning: Basic data cleaning was performed to remove duplicates and handle missing or malformed data entries.
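
    The pagination, aggregation, and duplicate-removal steps above might look roughly like the sketch below. It is not the author's script: it uses plain Python instead of pandas, and a stand-in `fetch_page` callable with in-memory pages instead of real TMDb requests. The response field names (`total_pages`, `results`, `id`) and the `collect_all` helper are assumptions made for illustration.

```python
from typing import Any, Callable, Dict, List

Page = Dict[str, Any]


def collect_all(fetch_page: Callable[[int], Page]) -> List[dict]:
    """Fetch pages 1..total_pages and merge their results, dropping duplicate ids."""
    first = fetch_page(1)
    pages = [first] + [fetch_page(n) for n in range(2, int(first["total_pages"]) + 1)]
    seen, movies = set(), []
    for page in pages:
        for movie in page["results"]:
            if movie["id"] not in seen:  # the same movie can appear on two pages
                seen.add(movie["id"])
                movies.append(movie)
    return movies


# In-memory stand-in for an API client (hypothetical data).
FAKE_PAGES = {
    1: {"total_pages": 2, "results": [{"id": 1, "title": "A"}, {"id": 2, "title": "B"}]},
    2: {"total_pages": 2, "results": [{"id": 2, "title": "B"}, {"id": 3, "title": "C"}]},
}
movies = collect_all(lambda n: FAKE_PAGES[n])
```

    Passing the page fetcher in as a callable keeps the merging logic testable without network access.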

    Potential Uses:
    - Trend Analysis: Analyze trends in movie ratings over time or compare ratings across different genres.
    - Recommendation Systems: Build and train models to recommend movies based on user preferences.
    - Sentiment Analysis: Perform text analysis on movie overviews to understand common themes and sentiments.
    - Statistical Analysis: Explore the relationship between popularity, vote count, and average ratings.

    Data Format: The dataset is provided in a structured tabular format (e.g., CSV), making it easy to load into data analysis tools like Python, R, or Excel.

    Usage License: The dataset is shared under [appropriate license], ensuring that it can be used for educational, research, or commercial purposes, with proper attribution to the data source (TMDb).


  10. CGI and animated movie box office revenue in the U.S. 2008-2018

    • statista.com
    Updated Jul 11, 2025
    Cite
    Statista (2025). CGI and animated movie box office revenue in the U.S. 2008-2018 [Dataset]. https://www.statista.com/statistics/1020938/cgi-animated-movie-revenue-us/
    Dataset updated
    Jul 11, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    United States
    Description

    This statistic shows the box office revenue of CGI, 3D and animated movies in the United States from 2008 to 2018. According to RenderThat, the total revenue in the U.S. for all movies containing CGI (computer-generated imagery), animation and 3D effects amounted to **** billion U.S. dollars in 2018.

  11. Film, television and video production, summary statistics

    • www150.statcan.gc.ca
    • open.canada.ca
    Updated Mar 17, 2025
    Cite
    Government of Canada, Statistics Canada (2025). Film, television and video production, summary statistics [Dataset]. http://doi.org/10.25318/2110005901-eng
    Dataset updated
    Mar 17, 2025
    Dataset provided by
    Statistics Canada (https://statcan.gc.ca/en)
    Area covered
    Canada
    Description

    Summary statistics by North American Industry Classification System (NAICS) for motion picture and video production (NAICS 512110): operating revenue (dollars x 1,000,000), operating expenses (dollars x 1,000,000), salaries, wages and benefits (dollars x 1,000,000), and operating profit margin (percent). Annual data, for five years.

  12. Movies and Ratings

    • zenodo.org
    zip
    Updated Aug 23, 2023
    Cite
    Michael Kaufmann (2023). Movies and Ratings [Dataset]. http://doi.org/10.5281/zenodo.8276077
    Available download formats: zip
    Dataset updated
    Aug 23, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Michael Kaufmann
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Transformed, cleaned dataset with reduced number of columns for all 45,000 movies listed in the full MovieLens dataset of movies released in July 2017 or earlier. Data points include movie ID, title, budget, languages, and genres. This dataset also includes 26 million ratings from 270,000 users for all 45,000 movies. Ratings are given on a scale of 1 to 5 and include user ID, movie ID, rating, and timestamp.

    This dataset consists of the following files:

    * movies.csv: The main movie metadata file. Contains information on 45,000 movies included in the full MovieLens dataset.

    * ratings.csv: The full MovieLens dataset with 26 million ratings and 750,000 tag applications from 270,000 users on all 45,000 movies in this dataset.
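
    As an illustration of working with the ratings file, the sketch below computes the mean rating per movie with the standard library only. The header names `userId`, `movieId`, `rating`, and `timestamp` are assumptions based on the fields named in the description (user ID, movie ID, rating, timestamp), and the sample rows are made up.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, TextIO

# Made-up rows standing in for ratings.csv; header names are assumed.
SAMPLE = """userId,movieId,rating,timestamp
1,10,4.0,1500000000
2,10,5.0,1500000100
1,20,3.0,1500000200
"""


def mean_ratings(handle: TextIO) -> Dict[str, float]:
    """Stream the file once and return {movie ID: mean rating}."""
    totals = defaultdict(lambda: [0.0, 0])  # movie ID -> [sum of ratings, count]
    for row in csv.DictReader(handle):
        entry = totals[row["movieId"]]
        entry[0] += float(row["rating"])
        entry[1] += 1
    return {movie: total / count for movie, (total, count) in totals.items()}


means = mean_ratings(io.StringIO(SAMPLE))
```

    Reading row by row avoids loading all 26 million ratings into memory at once; for the real file, an open file handle would replace the `io.StringIO` sample.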

    This dataset is a further development of the following public domain dataset published on Kaggle:

    https://www.kaggle.com/datasets/rounakbanik/the-movies-dataset

    This data was obtained from the official GroupLens website. The data was originally obtained from The Movies DataBase (TMDB) via the TMDB AP

  13. Movie Data - X - Test - w2v

    • data.researchdatafinder.qut.edu.au
    Updated Apr 8, 2018
    Cite
    (2018). Movie Data - X - Test - w2v [Dataset]. https://data.researchdatafinder.qut.edu.au/dataset/survey-word-vector/resource/e638fc06-7ef3-4a41-85e2-21f7fad2dfb3
    Dataset updated
    Apr 8, 2018
    License

    http://researchdatafinder.qut.edu.au/display/n15252

    Description

    This file contains the features for the test portion of the movie dataset. The data has been converted into an average word vector. This is 50% of the total movie results. QUT Research Data Repository dataset resource, available for download.

  14. Rotten Tomatoes Movie Dataset – Clean Movie Metadata

    • crawlfeeds.com
    csv, zip
    Updated Jul 21, 2025
    Cite
    Crawl Feeds (2025). Rotten Tomatoes Movie Dataset – Clean Movie Metadata [Dataset]. https://crawlfeeds.com/datasets/rotten-tomatoes-movie-dataset-clean-movie-metadata
    Available download formats: csv, zip
    Dataset updated
    Jul 21, 2025
    Dataset authored and provided by
    Crawl Feeds
    License

    https://crawlfeeds.com/privacy_policy

    Description

    We provide a high-quality Rotten Tomatoes movie dataset that includes key metadata for thousands of movies. This dataset is ideal for anyone working with movie-related platforms, entertainment analytics, content curation, or movie discovery tools.

    Our collection is structured, clean, and designed to support real-time apps, dashboards, and research use cases.

    What the Dataset Includes

    Each record in the dataset contains core information pulled directly from Rotten Tomatoes, including:

    • Movie Name – The official title of the movie.

    • Poster URL – High-resolution image link to the movie poster.

    • Trailer URL – Direct link to the official trailer (when available).

    • Genre – One or more genres associated with the movie, such as Action, Drama, Comedy, or Horror.

    • Release Date – The date the movie was released to the public.

    • Actors – Main cast members listed on Rotten Tomatoes.

    • Directors – Director(s) responsible for the movie.

    • Rating – Audience or critic scores, where available.

    Broad Coverage

    This dataset spans a wide range of movies across all major genres and decades. From modern releases to timeless classics, from Hollywood blockbusters to independent films — we’ve included movies of all types with relevant data points.

    You can expect data on:

    • U.S. theatrical releases

    • Netflix, Amazon, and other streaming exclusives

    • Festival films and limited releases

    • Animated and documentary films

    Use Cases

    Here are just a few ways this dataset can be useful:

    • Movie Recommendation Engines – Use metadata and genre info to power personalized movie suggestions.

    • Entertainment Search Tools – Build searchable movie listings with visual poster previews and trailer links.

    • Data Visualization Projects – Create dashboards showing trends by genre, release periods, or actor participation.

    • AI/ML Training – Use metadata to train classification models or sentiment prediction tools.

    • Research & Academic Use – Analyze patterns in movie releases, cast dynamics, and genre evolution.

    Why Use Our Dataset?

    • Clean & ready-to-use: No raw HTML, just clean structured data.

    • Minimal but meaningful fields: Focused on useful movie attributes without clutter.

    • Updated info: Covers both classic and current titles.

    • Simple integration: Easy to use for developers, analysts, and product teams.

    If you're working on a movie-based product or looking for reliable film metadata for your project, this dataset offers an ideal foundation.

    Let us know if you’d like to explore it further.

  15. All time worldwide box office collection

    • kaggle.com
    Updated Jan 4, 2023
    Cite
    Somnath Malik (2023). All time worldwide box office collection [Dataset]. https://www.kaggle.com/datasets/somnath2/box-office
    Explore at:
    Croissant
    Dataset updated
    Jan 4, 2023
    Dataset provided by
    Kaggle
    Authors
    Somnath Malik
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    The data was scraped from Box Office Mojo. This data set includes all-time worldwide box office collections from 2010 to 2022.

    Web scrape source: https://github.com/Somnath4/Python-Web-Sraping

    Photo by Kilyan Sockalingum on Unsplash Photo link: https://unsplash.com/photos/nW1n9eNHOsc

    About Box Office Mojo: Box Office Mojo is a website that provides box office collection data for movies. It is a valuable resource for movie studios, producers, and film fans alike. The website allows users to view box office collection data for movies released in the US and around the world. It also provides data on movie budgets and box office performance compared to the budget. Users can access this data by browsing the website or using the search function to find specific movies. The website is updated regularly, so users can stay up to date on the latest box office collection data.

    About the data set: The data set contains worldwide box office collections for the top 200 grossing films of each year from 2010 to 2022. The data consists of the title of the film, its worldwide box office collection, domestic box office collection, the percentage of domestic box office collection, foreign box office collection, and the percentage of foreign box office collection. This data can be beneficial for analyzing trends in the film industry, understanding the performance of different films, and predicting future box office success.
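
    The percentage columns can be related to the monetary columns as in the short sketch below. The figures are invented for illustration, and the assumption that the worldwide total equals domestic plus foreign is mine, not something the description states.

```python
def share(part: float, total: float) -> float:
    """Percentage of total represented by part, rounded to one decimal place."""
    return round(100 * part / total, 1)


# Invented example figures in U.S. dollars.
worldwide = 1_000_000_000
domestic = 400_000_000
foreign = worldwide - domestic  # assumes worldwide = domestic + foreign

domestic_pct = share(domestic, worldwide)
foreign_pct = share(foreign, worldwide)
```

    Recomputing the shares this way is also a simple consistency check on the scraped percentage columns.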

  16. Movie genres viewers want to see more in theaters worldwide 2025, by age

    • statista.com
    Updated Mar 24, 2025
    Cite
    Statista (2025). Movie genres viewers want to see more in theaters worldwide 2025, by age [Dataset]. https://www.statista.com/statistics/1607642/movie-genres-viewers-want-to-see-cinemas-theaters-world/
    Dataset updated
    Mar 24, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Jan 2025
    Area covered
    Worldwide
    Description

    According to a survey conducted in several markets around the world in January 2025, more than half of respondents across all age brackets wanted to see more action and adventure movies. While younger consumers would like to see more horror movies in theaters, older viewers were hoping to see more dramas.

  17. movies-dataset

    • huggingface.co
    Updated Mar 3, 2022
    Cite
    Pablo Merchán-Rivera (2022). movies-dataset [Dataset]. https://huggingface.co/datasets/Pablinho/movies-dataset
    Explore at:
    Croissant
    Dataset updated
    Mar 3, 2022
    Authors
    Pablo Merchán-Rivera
    License

    https://choosealicense.com/licenses/cc0-1.0/

    Description

    +9000 Movie Dataset

      Overview
    

    This dataset is sourced from Kaggle and has been granted CC0 1.0 Universal (CC0 1.0) Public Domain Dedication by the original author. This means you can copy, modify, distribute, and perform the work, even for commercial purposes, all without asking permission. I would like to express my gratitude to the original author for their contribution to the data community.

      License
    

    This dataset is released under the CC0 1.0 Universal… See the full description on the dataset page: https://huggingface.co/datasets/Pablinho/movies-dataset.

  18. Bollywood Movies data

    • data.mendeley.com
    Updated May 12, 2020
    Cite
    Prashant Premkumar (2020). Bollywood Movies data [Dataset]. http://doi.org/10.17632/3c57btcxy9.1
    Dataset updated
    May 12, 2020
    Authors
    Prashant Premkumar
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Using a Python script to scrape data from the web, we collected data pertaining to all 1698 Hindi-language movies that were released in India across a 13-year period (2005-2017) from the website of Box Office India.

  19. Movies and Tv Shows Dataset

    • crawlfeeds.com
    csv, zip
    Updated Jul 4, 2025
    Cite
    Crawl Feeds (2025). Movies and Tv Shows Dataset [Dataset]. https://crawlfeeds.com/datasets/movies-and-tv-shows-dataset
    Available download formats: zip, csv
    Dataset updated
    Jul 4, 2025
    Dataset authored and provided by
    Crawl Feeds
    License

    https://crawlfeeds.com/privacy_policy

    Description

    Explore our meticulously curated Movies dataset and TV shows dataset, designed to cater to diverse analytical and research needs. Whether you're a data scientist, a student, or a business professional, these datasets provide valuable insights into the entertainment industry.

    Key Features of the Movies Dataset:

    1. Extensive collection of global movies across various genres and languages.

    2. Detailed metadata, including titles, release dates, genres, directors, cast, and ratings.

    3. Regularly updated to ensure relevance and accuracy.

    Why Choose Our TV Shows Dataset?

    Our TV shows dataset is your gateway to understanding trends in episodic content. It includes:

    • Comprehensive details about popular and niche TV shows.

    • Information on episode counts, seasons, ratings, and networks.

    • Insights into audience preferences and regional programming.

    Applications of These Datasets

    These datasets are perfect for:

    • Machine learning models for recommendation systems.

    • Academic research on media trends and audience behavior.

    • Business strategies for entertainment platforms.

    Unlock the power of TV show data with our Crawl Feeds TV Shows Dataset. Start analyzing today and gain valuable insights into your favorite shows!

  20. 🎥 Movie Plot Database

    • kaggle.com
    Updated Aug 7, 2024
    Cite
    mexwell (2024). 🎥 Movie Plot Database [Dataset]. https://www.kaggle.com/datasets/mexwell/movie-plot-database/data
    Explore at:
    Croissant
    Dataset updated
    Aug 7, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    mexwell
    License

    Attribution-ShareAlike 3.0 (CC BY-SA 3.0): https://creativecommons.org/licenses/by-sa/3.0/
    License information was derived automatically

    Description

    Dataset of movie plot summaries and associated metadata. This data was collected by David Bamman, Brendan O'Connor, and Noah Smith at the Language Technologies Institute and Machine Learning Department at Carnegie Mellon University.

    Data

    plot_summaries.csv

    Plot summaries of 42,306 movies extracted from the November 2, 2012 dump of English-language Wikipedia. Each line contains the Wikipedia movie ID (which indexes into movie_metadata.csv, named movie.metadata.tsv in the original release) followed by the summary.

    movie_metadata.csv

    Metadata for 81,741 movies, extracted from the November 4, 2012 dump of Freebase. Tab-separated; columns:

    • Wikipedia movie ID
    • Freebase movie ID
    • Movie name
    • Movie release date
    • Movie box office revenue
    • Movie runtime
    • Movie languages (Freebase ID:name tuples)
    • Movie countries (Freebase ID:name tuples)
    • Movie genres (Freebase ID:name tuples)
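
    As a rough illustration of how the Wikipedia movie ID ties the summaries to the metadata, the sketch below joins the two and unpacks a "Freebase ID:name" field. It assumes those fields are stored as JSON objects (for example `{"/m/01": "Drama"}`), which the listing does not confirm. The keys `wikipedia_movie_id` and `genres`, the helper names, and the sample rows are all hypothetical.

```python
import json


def parse_freebase_map(cell):
    """Return the readable names from a 'Freebase ID:name' field.

    Assumption: the cell holds a JSON object mapping Freebase IDs to names.
    Empty or malformed cells yield an empty list.
    """
    if not cell or not cell.strip():
        return []
    try:
        mapping = json.loads(cell)
    except json.JSONDecodeError:
        return []
    if not isinstance(mapping, dict):
        return []
    return list(mapping.values())


def join_summaries(metadata_rows, summary_rows):
    """Attach each plot summary to its metadata row via the Wikipedia movie ID."""
    summaries = {wiki_id: text for wiki_id, text in summary_rows}
    joined = []
    for row in metadata_rows:
        wiki_id = row["wikipedia_movie_id"]
        if wiki_id in summaries:  # movies without a summary are skipped
            joined.append({
                **row,
                "genres": parse_freebase_map(row["genres"]),
                "summary": summaries[wiki_id],
            })
    return joined


# Made-up sample rows; the IDs and titles are placeholders, not real data.
metadata = [
    {"wikipedia_movie_id": "1", "name": "Example Film",
     "genres": '{"/m/01": "Drama", "/m/02": "Thriller"}'},
    {"wikipedia_movie_id": "2", "name": "No Summary", "genres": "{}"},
]
summaries = [("1", "A detective investigates."), ("3", "Unmatched summary.")]

result = join_summaries(metadata, summaries)
```

    Only movies present in both inputs survive the join, which matches the counts above: there are fewer summaries (42,306) than metadata rows (81,741).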

    character_metadata.csv

    Metadata for 450,669 characters aligned to the movies above, extracted from the November 4, 2012 dump of Freebase. Tab-separated; columns:

    • Wikipedia movie ID
    • Freebase movie ID
    • Movie release date
    • Character name
    • Actor date of birth
    • Actor gender
    • Actor height (in meters)
    • Actor ethnicity (Freebase ID)
    • Actor name
    • Actor age at movie release
    • Freebase character/actor map ID
    • Freebase character ID
    • Freebase actor ID
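
    Because every character row carries the Wikipedia movie ID, characters can be grouped per movie. The sketch below does that and averages the "actor age at movie release" column, skipping blanks. The key names `wikipedia_movie_id` and `actor_age` and the sample rows are hypothetical stand-ins for whatever headers the files actually use.

```python
from collections import defaultdict
from statistics import mean


def cast_by_movie(character_rows):
    """Group character rows under their Wikipedia movie ID."""
    cast = defaultdict(list)
    for row in character_rows:
        cast[row["wikipedia_movie_id"]].append(row)
    return dict(cast)


def mean_cast_age(rows):
    """Average actor age at release, ignoring rows where the age is blank."""
    ages = [float(r["actor_age"]) for r in rows if r.get("actor_age")]
    return mean(ages) if ages else None


# Made-up sample rows for illustration only.
characters = [
    {"wikipedia_movie_id": "1", "character": "Hero", "actor_age": "30"},
    {"wikipedia_movie_id": "1", "character": "Villain", "actor_age": "50"},
    {"wikipedia_movie_id": "1", "character": "Extra", "actor_age": ""},
    {"wikipedia_movie_id": "2", "character": "Narrator", "actor_age": ""},
]

cast = cast_by_movie(characters)
```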

    tvtropes.clusters.txt

    72 character types drawn from tvtropes.com, along with 501 instances of those types. The ID field indexes into the Freebase character/actor map ID in character_metadata.csv (character.metadata.tsv in the original release).

    name.clusters.txt

    970 unique character names used in at least two different movies, along with 2,666 instances of those types. The ID field indexes into the Freebase character/actor map ID in character_metadata.csv (character.metadata.tsv in the original release).

    Acknowledgments

    This research was supported in part by U.S. National Science Foundation grant IIS-0915187.

    All data is released under a Creative Commons Attribution-ShareAlike License. For questions or comments, please contact David Bamman (dbamman@cs.cmu.edu).

    Photo by Jakob Owens on Unsplash

Cite
Alessandro Lo Bello (2023). The Ultimate Film Statistics Dataset - for ML🏆🎬 [Dataset]. https://www.kaggle.com/datasets/alessandrolobello/the-ultimate-film-statistics-dataset-for-ml/data
The Ultimate Film Statistics Dataset - for ML🏆🎬

Unlocking the Secrets of Film Success: A Comprehensive Analysis of Movie Data

Dataset updated
Jul 9, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Alessandro Lo Bello
License

https://creativecommons.org/publicdomain/zero/1.0/

Description

This dataset provides comprehensive movie statistics compiled from multiple sources, including Wikipedia, The Numbers, and IMDb. It offers a rich collection of information and insights into various aspects of movies, such as movie titles, production dates, genres, runtime minutes, director information, average ratings, number of votes, approval index, production budgets, domestic gross earnings, and worldwide gross earnings.

The dataset combines data scraped from Wikipedia, which includes details about movie titles, production dates, genres, runtime minutes, and director information, with data from The Numbers, a reliable source for box office statistics. Additionally, IMDb data is integrated to provide information on average ratings, number of votes, and other movie-related attributes.

With this dataset, users can analyze and explore trends in the film industry, assess the financial success of movies, identify popular genres, and investigate the relationship between average ratings and box office performance. Researchers, movie enthusiasts, and data analysts can leverage this dataset for various purposes, including data visualization, predictive modeling, and deeper understanding of the movie landscape.

Features:

- Movie_title
- Production_date
- Genres
- Runtime_minutes
- Director_name (primaryName)
- Director_professions (primaryProfession)
- Director_birthYear
- Director_deathYear
- Movie_averageRating: the average rating given to a movie by online users
- Movie_numberOfVotes: the number of votes cast for a movie by online users
- Approval_Index: a normalized indicator (on a 0-10 scale) calculated by multiplying the logarithm of the number of votes by the average user rating. It provides a concise measure of a movie's overall popularity and approval among online viewers, penalizing both films that got too few reviews and blockbusters that got too many.
- Production_budget ($)
- Domestic_gross ($)
- Worldwide_gross ($)
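
The Approval_Index description can be made concrete with a small sketch. The listing does not state the logarithm base or how the product is normalized to the 0-10 range, so the base-10 logarithm and the divide-by-maximum rescaling below are assumptions, and the function names and sample values are hypothetical.

```python
import math


def raw_approval(avg_rating, num_votes):
    """Logarithm of the vote count multiplied by the average rating.

    Assumption: base-10 logarithm; the source only says "the logarithm".
    """
    if num_votes < 1:
        raise ValueError("num_votes must be at least 1")
    return math.log10(num_votes) * avg_rating


def approval_index(movies):
    """Rescale raw scores so the highest-scoring movie gets 10.

    Assumption: dividing by the maximum is one plausible normalization;
    the dataset author may have used a different one.
    """
    raw = [raw_approval(rating, votes) for rating, votes in movies]
    top = max(raw)
    if top == 0:
        return [0.0 for _ in raw]
    return [10 * score / top for score in raw]


# (average rating, number of votes): made-up values for illustration.
sample = [(8.0, 1_000), (8.0, 1_000_000), (5.0, 10)]
scores = approval_index(sample)
```

Under these assumptions, a thousandfold increase in votes at the same rating only doubles the score, because the logarithm compresses large vote counts.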

Potential Applications:

- Box office analysis: Analyze the relationship between production budgets, domestic and worldwide gross earnings, and profitability.
- Genre analysis: Identify the most popular genres based on movie counts and analyze their performance.
- Rating analysis: Explore the relationship between average ratings, number of votes, and financial success.
- Director analysis: Investigate the impact of directors on movie ratings and financial performance.
- Time-based analysis: Study movie trends over different production years and observe changes in production budgets, box office earnings, and genre preferences.

By utilizing this dataset, users can gain valuable insights into the movie industry and uncover patterns that can inform decision-making, market research, and creative strategies.
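
As one possible starting point for the box office and rating analyses, the sketch below computes return on investment from budget and worldwide gross, then a Pearson correlation between average rating and worldwide gross. The dictionary keys mirror the feature names above, but the exact CSV headers are an assumption, and the rows are made-up values rather than data from the file.

```python
from math import sqrt


def roi(budget, worldwide_gross):
    """Return on investment: profit as a fraction of the production budget."""
    return (worldwide_gross - budget) / budget


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Made-up rows for illustration only.
rows = [
    {"Movie_averageRating": 6.0, "Production_budget": 100, "Worldwide_gross": 250},
    {"Movie_averageRating": 7.0, "Production_budget": 200, "Worldwide_gross": 300},
    {"Movie_averageRating": 8.0, "Production_budget": 50, "Worldwide_gross": 400},
]

returns = [roi(r["Production_budget"], r["Worldwide_gross"]) for r in rows]
rating_vs_gross = pearson(
    [r["Movie_averageRating"] for r in rows],
    [r["Worldwide_gross"] for r in rows],
)
```

This ROI ignores marketing and distribution costs, which the dataset does not list, so it overstates real profitability.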
