23 datasets found
  1. IMDB Top 100 Movies

    • kaggle.com
    zip
    Updated May 5, 2025
    Cite
    Prakash Mahara (2025). IMDB Top 100 Movies [Dataset]. https://www.kaggle.com/datasets/prakash27x/imdb-top-100-movies
    Explore at:
    Available download formats: zip (18499 bytes)
    Dataset updated
    May 5, 2025
    Authors
    Prakash Mahara
    Description

    # 🏆 IMDB Top 100 Movies Dataset

    This dataset contains detailed information about the Top 100 movies from IMDb, collected to assist film enthusiasts, data analysts, and machine learning practitioners in exploring trends and insights in the film industry.

    📁 Dataset Features

    Each movie entry includes:

    • 🎬 Title – Name of the movie
    • 📅 Year – Year of release
    • ⭐ Rating – IMDb user rating (out of 10)
    • 📣 Genres – List of genres the movie belongs to
    • 🎥 Director – Director(s) of the movie
    • 👥 Stars – Leading cast
    • ⏱️ Runtime – Duration in minutes
    • 📝 Summary – A brief synopsis of the movie
    • 🧾 Votes – Number of user votes
    • 💰 Gross – Box office gross (if available)

    🎯 Use Cases

    • Data Visualization: Create graphs showing rating trends, genre distributions, etc.
    • Recommendation Systems: Build a content-based movie recommender.
    • NLP Projects: Use summaries for natural language processing tasks.
    • Exploratory Data Analysis: Great dataset for practicing EDA techniques.

    🛠️ Source

    The data is derived from IMDb's public listings and compiled into JSON format for easy use in Python-based projects.
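
A minimal sketch of how such a JSON file might be explored in Python. It assumes the file is a list of objects with lowercase keys (`title`, `year`, `rating`, `genres`); the description does not state the exact key names or the file name, so the inline sample below is hypothetical.

```python
import json
from collections import defaultdict

# Hypothetical sample mirroring the described fields; real key names may differ.
SAMPLE = '''[
  {"title": "Movie A", "year": 1994, "rating": 9.0, "genres": ["Drama"]},
  {"title": "Movie B", "year": 1972, "rating": 8.0, "genres": ["Crime", "Drama"]}
]'''


def mean_rating_by_genre(movies):
    """Return {genre: mean rating} for a list of movie dicts."""
    totals = defaultdict(lambda: [0.0, 0])  # genre -> [rating sum, movie count]
    for movie in movies:
        for genre in movie.get("genres", []):
            totals[genre][0] += movie["rating"]
            totals[genre][1] += 1
    return {genre: total / count for genre, (total, count) in totals.items()}


movies = json.loads(SAMPLE)
print(mean_rating_by_genre(movies))
```

With a real download, `json.loads(SAMPLE)` would be replaced by `json.load` on the opened file.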

  2. imdb-movies

    • kaggle.com
    zip
    Updated Nov 9, 2025
    Cite
    Emmanuel Pogbe (2025). imdb-movies [Dataset]. https://www.kaggle.com/datasets/emmanuelpogbe/imdb-movies
    Explore at:
    Available download formats: zip (4041 bytes)
    Dataset updated
    Nov 9, 2025
    Authors
    Emmanuel Pogbe
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    The dataset is a JSON file that contains movie information from IMDb. Fields in the JSON: title, year, rating, genre, director, votes.

    The first 100 entries were taken directly from IMDB Top 100 Movies (https://www.kaggle.com/datasets/prakash27x/imdb-top-100-movies).

    The next 10 entries are movies produced in 2007 (for a database management project), which I scraped from IMDb.

  3. IMDB-Reviews

    • huggingface.co
    Updated Jul 27, 2025
    Cite
    Daksh Bhardwaj (2025). IMDB-Reviews [Dataset]. http://doi.org/10.57967/hf/6003
    Explore at:
    Dataset updated
    Jul 27, 2025
    Authors
    Daksh Bhardwaj
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    Dataset Card for IMDb Multi-Movie Review Dataset

    Dataset Summary

    The IMDb Multi-Movie Review Dataset contains approximately 114,000 user reviews collected from over 150 movies on IMDb. Each movie is stored as a separate JSON file, identified by its movie_id (IMDb ID). Each JSON file includes a list of structured reviews, where every review consists of:

    • title: A short summary or headline of the review.
    • review: The full detailed user review.
    • rating: A numeric rating (1–10)…

    See the full description on the dataset page: https://huggingface.co/datasets/Daksh0505/IMDB-Reviews.
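
A rough sketch of reading the per-movie files described above. It assumes each file is named `<movie_id>.json` and that its top level is a plain list of review objects with `title`, `review`, and `rating` keys; the dataset page should be checked for the actual layout. The demo writes a throwaway file so the snippet runs on its own.

```python
import json
import tempfile
from pathlib import Path


def summarize_reviews(folder):
    """Map each movie_id (the file stem) to (review count, mean rating)."""
    summary = {}
    for path in sorted(Path(folder).glob("*.json")):
        reviews = json.loads(path.read_text(encoding="utf-8"))
        # Skip reviews without a rating when averaging.
        ratings = [r["rating"] for r in reviews if r.get("rating") is not None]
        mean = sum(ratings) / len(ratings) if ratings else None
        summary[path.stem] = (len(reviews), mean)
    return summary


# Demo with a hypothetical movie file in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    sample = [
        {"title": "Great", "review": "Loved it.", "rating": 9},
        {"title": "Meh", "review": "Too long.", "rating": 5},
    ]
    (Path(tmp) / "tt0000001.json").write_text(json.dumps(sample), encoding="utf-8")
    result = summarize_reviews(tmp)

print(result)
```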

  4. Movies and Tv Shows Dataset

    • crawlfeeds.com
    • kaggle.com
    csv, zip
    Updated Jul 4, 2025
    Cite
    Crawl Feeds (2025). Movies and Tv Shows Dataset [Dataset]. https://crawlfeeds.com/datasets/movies-and-tv-shows-dataset
    Explore at:
    Available download formats: zip, csv
    Dataset updated
    Jul 4, 2025
    Dataset authored and provided by
    Crawl Feeds
    License

    https://crawlfeeds.com/privacy_policy

    Description

    Explore our meticulously curated Movies dataset and TV shows dataset, designed to cater to diverse analytical and research needs. Whether you're a data scientist, a student, or a business professional, these datasets provide valuable insights into the entertainment industry.

    Key Features of the Movies Dataset:

    1. Extensive collection of global movies across various genres and languages.

    2. Detailed metadata, including titles, release dates, genres, directors, cast, and ratings.

    3. Regularly updated to ensure relevance and accuracy.

    Why Choose Our TV Shows Dataset?

    Our TV shows dataset is your gateway to understanding trends in episodic content. It includes:

    • Comprehensive details about popular and niche TV shows.

    • Information on episode counts, seasons, ratings, and networks.

    • Insights into audience preferences and regional programming.

    Applications of These Datasets

    These datasets are perfect for:

    • Machine learning models for recommendation systems.

    • Academic research on media trends and audience behavior.

    • Business strategies for entertainment platforms.

    Unlock the power of TV show data with our Crawl Feeds TV Shows Dataset. Start analyzing today and gain valuable insights into your favorite shows!

  5. All 8000+ Top Rated IMDB Movies Dataset

    • kaggle.com
    zip
    Updated Nov 27, 2023
    Cite
    Rajkumar Dubey 10 (2023). All 8000+ Top Rated IMDB Movies Dataset [Dataset]. https://www.kaggle.com/datasets/rajkumardubey10/all-top-rated-imdb-movies-dataset/discussion
    Explore at:
    Available download formats: zip (1270149 bytes)
    Dataset updated
    Nov 27, 2023
    Authors
    Rajkumar Dubey 10
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context:

    I made this dataset with the following goal: "Unlock cinematic gems with a dataset featuring IMDB's top-rated movies, ensuring precise and exceptional movie recommendations for an unparalleled viewing experience."

    Source: The dataset was collected from The Movie Database (TMDB) using a valid API key. The CSV data was retrieved from https://api.themoviedb.org/3/movie/top_rated/ with proper authorization to access their database.

    The raw data obtained from API responses was processed to extract relevant information. This may include parsing JSON responses, handling pagination, and cleaning the data to ensure consistency.

    Inspiration: You can use this dataset to build a recommendation system for your project, or to practice EDA in a mini project.

  6. IMDB Movies User Reviews

    • kaggle.com
    zip
    Updated Jul 14, 2022
    Cite
    Saad Azam (2022). IMDB Movies User Reviews [Dataset]. https://www.kaggle.com/datasets/sadmadlad/imdb-user-reviews
    Explore at:
    Available download formats: zip (15867263 bytes)
    Dataset updated
    Jul 14, 2022
    Authors
    Saad Azam
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Motivation

    Bringing you another scraping exercise with BeautifulSoup and Selenium. If you are interested in the scraper, you can check out this link.

    Dataset Structure

    MovieFolder/
        -metadata.json
        -movieReviews.csv
    

    Movie: Number of User Reviews

    • SpiderMan No Way Home: 6034
    • Joker: 11357
    • Avengers Endgame: 9513
    • The Dark Knight: 7642
    • Forrest Gump: 2960
    • Pulp Fiction: 3475
    • The Avengers: 2081
    • Morbius: 1910
    • Thor: 1864
    • John Wick 3: 2417

    Challenges you can try

    • Read Morbius User Reviews
    • Make Word cloud of one of the movies

    Best of luck

  7. Amazon prime tv shows and movies dataset

    • crawlfeeds.com
    csv, zip
    Updated Jul 4, 2025
    Cite
    Crawl Feeds (2025). Amazon prime tv shows and movies dataset [Dataset]. https://crawlfeeds.com/datasets/amazon-prime-tv-shows-and-movies-dataset
    Explore at:
    Available download formats: zip, csv
    Dataset updated
    Jul 4, 2025
    Dataset authored and provided by
    Crawl Feeds
    License

    https://crawlfeeds.com/privacy_policy

    Description

    Amazon Prime TV Shows and Movies Dataset offered by Crawl Feeds is an extensive resource containing over 92,000 records in JSON format. This dataset encompasses a wide array of data points, including links, titles, descriptions, release dates, genres, posters, streaming platforms, countries, number of seasons, content ratings, IMDb ratings, cast and crew details, unique identifiers, and scraping timestamps. Such comprehensive information is invaluable for researchers, data analysts, and developers aiming to conduct in-depth analyses, develop recommendation systems, or explore trends within Amazon Prime's content library.

    For those interested in broader media datasets, Crawl Feeds also offers the Movies and TV Shows Dataset, which includes 118,000 records, and the IMDb Movie Details Dataset, comprising 250,000 records. These datasets provide extensive information across various platforms, facilitating comparative studies and cross-platform analyses.

    Integrating these datasets into your projects can significantly enhance the depth and quality of your analyses, providing a robust foundation for exploring various facets of the entertainment industry. Whether you're developing a new application, conducting market research, or performing academic studies, these datasets serve as a valuable resource for gaining insights into the dynamic world of streaming media.

    Explore the Amazon Prime TV Shows and Movies Dataset and other related datasets on Crawl Feeds to elevate your data-driven projects.

  8. IMDB Most Popular by Year

    • kaggle.com
    zip
    Updated Feb 22, 2017
    Cite
    gilad (2017). IMDB Most Popular by Year [Dataset]. https://www.kaggle.com/giladstern/imdb-most-popular-by-year
    Explore at:
    Available download formats: zip (8374255 bytes)
    Dataset updated
    Feb 22, 2017
    Authors
    gilad
    Description

    Content

    Around 100,000 movies acquired from IMDB: the most popular items from each year since 1950. The dataset is organized as a JSON file of the following format:

    {
      year1 : {
        movie_title1 : { 'genre' : [genre1, genre2,...], 'synopsis' : synopsis_string }
        movie_title2 : ...
      }
      year2 : ....
    }

    It should be noted that the list of genres could be empty.
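
The nested year → title → fields layout can be flattened into one row per movie. This is only a sketch: the inline sample is made up, and it assumes the year keys are strings of digits, as they would have to be in a valid JSON object.

```python
import json

# Hypothetical sample following the described year -> title -> {genre, synopsis} layout.
SAMPLE = '''{
  "1950": {
    "Movie A": {"genre": ["Drama"], "synopsis": "A short synopsis."},
    "Movie B": {"genre": [], "synopsis": "Another synopsis."}
  },
  "1951": {
    "Movie C": {"genre": ["Comedy", "Romance"], "synopsis": "Third."}
  }
}'''


def flatten(by_year):
    """Turn the nested mapping into a list of flat row dicts."""
    rows = []
    for year, movies in by_year.items():
        for title, info in movies.items():
            rows.append({
                "year": int(year),
                "title": title,
                "genres": info.get("genre", []),  # may be an empty list
                "synopsis": info.get("synopsis", ""),
            })
    return rows


rows = flatten(json.loads(SAMPLE))
for row in rows:
    print(row["year"], row["title"], row["genres"])
```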

    We only took movies with a short synopsis, and not the longer "summary" format in IMDB.

    The script is also attached as "Crawler.py".

  9. Movies & TV Shows Metadata Dataset (190K+ Records, Horror-Heavy Collection)

    • crawlfeeds.com
    csv, zip
    Updated Aug 23, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Crawl Feeds (2025). Movies & TV Shows Metadata Dataset (190K+ Records, Horror-Heavy Collection) [Dataset]. https://crawlfeeds.com/datasets/movies-tv-shows-metadata-dataset-190k-records-horror-heavy-collection
    Explore at:
    Available download formats: zip, csv
    Dataset updated
    Aug 23, 2025
    Dataset authored and provided by
    Crawl Feeds
    License

    https://crawlfeeds.com/privacy_policy

    Description

    This comprehensive dataset features detailed metadata for over 190,000 movies and TV shows, with a strong concentration in the Horror genre. It is ideal for entertainment research, machine learning models, genre-specific trend analysis, and content recommendation systems.

    Each record contains rich information, making it perfect for streaming platforms, film industry analysts, or academic media researchers.

    Primary Genre Focus: Horror

    Use Cases:

    • Build movie recommendation systems or genre classifiers

    • Train NLP models on movie descriptions

    • Analyze Horror content trends over time

    • Explore box office vs. rating correlations

    • Enrich entertainment datasets with directorial and cast metadata

  10. TMDB Movies (2000-2020) with `imdb_id`

    • kaggle.com
    zip
    Updated May 20, 2020
    Cite
    Hudson Leonardo MENDES (2020). TMDB Movies (2000-2020) with `imdb_id` [Dataset]. https://www.kaggle.com/hudsonmendes/tmdb-movies-20002020-with-imdb-id
    Explore at:
    Available download formats: zip (23899453 bytes)
    Dataset updated
    May 20, 2020
    Authors
    Hudson Leonardo MENDES
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    Summary

    TMDb movies database with id_imdb that can be used for batch processing of TMDb films, as an alternative to requesting the TMDb API 6 million times with the IMDb id to find the links.

    The Problem

    IMDb provides snapshots of their databases on titles, casting, etc. However, they do not provide user reviews. Furthermore, it is against their Terms of Use to do any form of Scraping of their webpages.

    TMDb, an Alternative to IMDb

    TMDb (The Movie Database), on the other hand, does provide user reviews through their API. It is even possible to search for a film by its imdb_id.

    However, if for any reason you must stick to IMDb as your base dataset and collect information for a good portion of IMDb's 6,782,091 entries, you are doomed.

    10% of 6,782,091 would amount to 678,209 API requests, and even though you may not be rate limited, it will still take days.

    Solution

    I therefore created this script (https://github.com/hudsonmendes/lambda-tmdb-distributed-downloader), which can be used to download TMDb movies by their IMDb id with a good level of parallelism.

    Apart from the extra data that TMDb makes available (like full release date, for example), we attach the IMDb ID that was found (as id_imdb) to the TMDB movie JSON, and save it in S3.

    Acknowledgements

    It would not have been possible to put this data together without the data snapshot provided by IMDB or the nice API provided by TMDB. Special thanks to both providers for supplying the data or the API, writing the documentation, running the infrastructure, and allowing us, through their terms, to access such data.

  11. Data from: Extended datasets from MM-IMDB and Ads-Parallelity dataset with...

    • data-staging.niaid.nih.gov
    • zenodo.org
    Updated Feb 24, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Shunsuke Kitada; Yuki Iwazaki; Riku Togashi; Hitoshi Iyatomi (2023). Extended datasets from MM-IMDB and Ads-Parallelity dataset with the features from Google Cloud Vision API [Dataset]. https://data-staging.niaid.nih.gov/resources?id=zenodo_7050923
    Explore at:
    Dataset updated
    Feb 24, 2023
    Dataset provided by
    CyberAgent, Inc.
    Hosei University
    Authors
    Shunsuke Kitada; Yuki Iwazaki; Riku Togashi; Hitoshi Iyatomi
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    These are extended datasets from MM-IMDB [Arevalo+ ICLRW'17] and Ads-Parallelity [Zhang+ BMVC'18], with features from the Google Cloud Vision API. The datasets are stored in jsonl (JSON Lines) format.
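
Since the files are in JSON Lines format (one JSON object per line), they can be read line by line without loading everything at once. A minimal sketch follows; the `id` and `labels` field names in the sample are hypothetical and not taken from the dataset.

```python
import io
import json


def read_jsonl(stream):
    """Yield one parsed object per non-empty line of a JSON Lines stream."""
    for line in stream:
        line = line.strip()
        if line:  # tolerate blank lines between records
            yield json.loads(line)


# Hypothetical records; a real file would be opened with open(path, encoding="utf-8").
sample = io.StringIO('{"id": 1, "labels": ["Drama"]}\n\n{"id": 2, "labels": []}\n')
records = list(read_jsonl(sample))
print(records)
```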

    Abstract (from our paper):

    There is increasing interest in the use of multimodal data in various web applications, such as digital advertising and e-commerce. Typical methods for extracting important information from multimodal data rely on a mid-fusion architecture that combines the feature representations from multiple encoders. However, as the number of modalities increases, several potential problems with the mid-fusion model structure arise, such as an increase in the dimensionality of the concatenated multimodal features and missing modalities. To address these problems, we propose a new concept that considers multimodal inputs as a set of sequences, namely, deep multimodal sequence sets (DM2S2). Our set-aware concept consists of three components that capture the relationships among multiple modalities: (a) a BERT-based encoder to handle the inter- and intra-order of elements in the sequences, (b) intra-modality residual attention (IntraMRA) to capture the importance of the elements in a modality, and (c) inter-modality residual attention (InterMRA) to enhance the importance of elements with modality-level granularity further. Our concept exhibits performance that is comparable to or better than the previous set-aware models. Furthermore, we demonstrate that the visualization of the learned InterMRA and IntraMRA weights can provide an interpretation of the prediction results.

    Dataset (MM-IMDB and Ads-Parallelity):

    We extended two multimodal datasets, namely MM-IMDB [Arevalo+ ICLRW'17] and Ads-Parallelity [Zhang+ BMVC'18], for the empirical experiments. The MM-IMDB dataset contains 25,925 movies with multiple labels (genres). We used the original split provided in the dataset and reported the F1 scores (micro, macro, and samples) of the test set. The Ads-Parallelity dataset contains 670 images and slogans from persuasive advertisements to understand the implicit relationship (parallel and non-parallel) between these two modalities. A binary classification task is used to predict whether the text and image in the same ad convey the same message.

    We transformed the following multimodal information (i.e., visual, textual, and categorical data) into textual tokens and fed these into our proposed model. We used the Google Cloud Vision API for the visual features to obtain the following four pieces of information as tokens: (1) text from the OCR, (2) category labels from the label detection, (3) object tags from the object detection, and (4) the number of faces from the facial detection. We input the labels and object detection results as a sequence in order of confidence, as obtained from the API. We describe the visual, textual, and categorical features of each dataset below.

    MM-IMDB: We used the title and plot of movies as the textual features, and the aforementioned API results based on poster images as visual features.

    Ads-Parallelity: We used the same API-based visual features as in MM-IMDB. Furthermore, we used textual and categorical features consisting of textual inputs of transcriptions and messages, and categorical inputs of natural and text concrete images.

  12. Replication Data for: Movie Scripts Corpus

    • dataone.org
    Updated Sep 24, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Drouet, Lance (2024). Replication Data for: Movie Scripts Corpus [Dataset]. http://doi.org/10.7910/DVN/PZTL2L
    Explore at:
    Dataset updated
    Sep 24, 2024
    Dataset provided by
    Harvard Dataverse
    Authors
    Drouet, Lance
    Description

    Data Source: https://www.kaggle.com/datasets/gufukuro/movie-scripts-corpus

    Data Description: Movie Scripts Corpus

    This corpus was collected to use for screenplay analysis with machine learning methods. The corpus includes movie scripts crawled from different sources, their annotations by script structural elements, and movie metadata.

    Corpus description

    Screenplay data consists of:

    • Movie scripts: TXT documents with raw full text (2858 docs)
    • Movie scripts: TXT documents with full text lemmas (2858 docs)
    • Manual annotation TXT documents for some movie scripts (33 docs, more than 6000 annotated rows)
    • Movie script annotations: TXT documents obtained by BERT
    • Movie script annotations: JSON documents obtained by the rule-based annotator ScreenPy

    Movies metadata consists of:

    • Cut versions of movie reviews and scores from Metacritic (number of reviews: 21025; number of movies with reviews: 2038)
    • Metadata for movies, including: title, akas, launch year, score from Metacritic, IMDb user rating and number of votes from imdb.com, movie awards, opening weekend, producers, budget, script department, production companies, writers, directors, cast info, countries involved in production, age restrict, plot (with outline), keywords, genres, taglines, critics' synopsis
    • Screenplay awards information: Academy Awards adapted screenplay, Academy Awards original screenplay, BAFTA, Golden Globe Award for Best Screenplay, Writers Guild Awards Winners & Nominees 2020-2013; nominations information for 462 movies in total

    Movie characters data consists of:

    • Script text fragments with dialogs and scene descriptions for characters, gathered with annotators: 2153 movies and text fragments for 32114 characters in total
    • Gender labels for 4792 characters

  13. Palestinian Movies JSON Dataset

    • kaggle.com
    zip
    Updated Feb 26, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Sondos Aabed (2024). Palestinian Movies JSON Dataset [Dataset]. https://www.kaggle.com/datasets/sondosaabed/palestinian-movies-json-dataset
    Explore at:
    Available download formats: zip (9292 bytes)
    Dataset updated
    Feb 26, 2024
    Authors
    Sondos Aabed
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    This dataset was scraped from my IMDb (Internet Movie Database) list of Palestinian movies.

    An Apify actor was used to scrape the data and download the file: https://console.apify.com/actors/poWuYPmbfLGBn5Mf8/console

    This is the list: https://www.imdb.com/list/ls563010565/?sort=alpha,asc&st_dt=&mode=detail&page=1

    To use this dataset

    It is available as a raw JSON response:

    https://raw.githubusercontent.com/sondosaabed/Palestinian-Movies-JSON-Dataset/main/palestinian_movies.json

  14. The Movies Dataset

    • kaggle.com
    zip
    Updated Nov 10, 2017
    Cite
    Rounak Banik (2017). The Movies Dataset [Dataset]. https://www.kaggle.com/datasets/rounakbanik/the-movies-dataset/code
    Explore at:
    Available download formats: zip (238862293 bytes)
    Dataset updated
    Nov 10, 2017
    Authors
    Rounak Banik
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    These files contain metadata for all 45,000 movies listed in the Full MovieLens Dataset. The dataset consists of movies released on or before July 2017. Data points include cast, crew, plot keywords, budget, revenue, posters, release dates, languages, production companies, countries, TMDB vote counts and vote averages.

    This dataset also has files containing 26 million ratings from 270,000 users for all 45,000 movies. Ratings are on a scale of 1-5 and have been obtained from the official GroupLens website.

    Content

    This dataset consists of the following files:

    movies_metadata.csv: The main Movies Metadata file. Contains information on 45,000 movies featured in the Full MovieLens dataset. Features include posters, backdrops, budget, revenue, release dates, languages, production countries and companies.

    keywords.csv: Contains the movie plot keywords for our MovieLens movies. Available in the form of a stringified JSON Object.

    credits.csv: Consists of Cast and Crew Information for all our movies. Available in the form of a stringified JSON Object.

    links.csv: The file that contains the TMDB and IMDB IDs of all the movies featured in the Full MovieLens dataset.

    links_small.csv: Contains the TMDB and IMDB IDs of a small subset of 9,000 movies of the Full Dataset.

    ratings_small.csv: The subset of 100,000 ratings from 700 users on 9,000 movies.
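
    Columns described as a "stringified JSON Object" need a second parsing step after the CSV is read. The sketch below is an assumption-laden illustration: the `id` and `keywords` column names and the sample rows are hypothetical, and because the description does not say whether the strings are strict JSON or Python-style literals with single quotes, the helper tries `json.loads` first and falls back to `ast.literal_eval`.

```python
import ast
import csv
import io
import json

# Hypothetical two-row CSV: row 1 holds strict JSON, row 2 a Python-style literal.
SAMPLE_CSV = '''id,keywords
1,"[{""id"": 10, ""name"": ""jealousy""}]"
2,"[{'id': 11, 'name': 'toy'}]"
'''


def parse_stringified(value):
    """Parse a stringified object, accepting JSON or a Python literal."""
    try:
        return json.loads(value)
    except json.JSONDecodeError:
        return ast.literal_eval(value)  # safe evaluation of literals only


def keyword_names(csv_text):
    """Map each row id to the list of keyword names in its stringified column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {
        row["id"]: [kw["name"] for kw in parse_stringified(row["keywords"])]
        for row in reader
    }


names = keyword_names(SAMPLE_CSV)
print(names)
```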

    The Full MovieLens Dataset consisting of 26 million ratings and 750,000 tag applications from 270,000 users on all the 45,000 movies in this dataset can be accessed here

    Acknowledgements

    This dataset is an ensemble of data collected from TMDB and GroupLens. The Movie Details, Credits and Keywords have been collected from the TMDB Open API. This product uses the TMDb API but is not endorsed or certified by TMDb. Their API also provides access to data on many additional movies, actors and actresses, crew members, and TV shows. You can try it for yourself here.

    The Movie Links and Ratings have been obtained from the Official GroupLens website. The files are a part of the dataset available here

    Inspiration

    This dataset was assembled as part of my second Capstone Project for Springboard's Data Science Career Track. I wanted to perform an extensive EDA on Movie Data to narrate the history and the story of Cinema and use this metadata in combination with MovieLens ratings to build various types of Recommender Systems.

    Both my notebooks are available as kernels with this dataset: The Story of Film and Movie Recommender Systems

    Some of the things you can do with this dataset:

    • Predicting movie revenue and/or movie success based on a certain metric.
    • What movies tend to get higher vote counts and vote averages on TMDB?
    • Building Content Based and Collaborative Filtering Based Recommendation Engines.

  15. IMDB Genre Classification Dataset

    • kaggle.com
    zip
    Updated Nov 24, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    aptmess (2022). IMDB Genre Classification Dataset [Dataset]. https://www.kaggle.com/datasets/nelepie/imdb-genre-classification
    Explore at:
    Available download formats: zip (11712640006 bytes)
    Dataset updated
    Nov 24, 2022
    Authors
    aptmess
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    Posters and descriptions of 10,000 films, parsed from the IMDB site for a genre classification task.

    Folder contains:

    • labels.json - Information about film's genre mapping to labels from 0 to 23.

    • parsed_data.json:

    Information about each film, identified by its own film_id (based on the IMDB film_id):

    • title - title of the film
    • description - film description, parsed from IMDB
    • poster_url - link to the film's poster
    • genre - main genre of the film
    • labels - additional genres of the film
    • releaseDate - release date of the film
    • film_year - year of the release date

    Sample:

    "tt6443346": {
      "title": "Black Adam",
      "description": "Nearly 5,000 years after he was bestowed with the almighty powers of the Egyptian gods - and imprisoned just as quickly - Black Adam is freed from his earthly tomb, ready to unleash his unique form of justice on the modern world.",
      "poster_url": "https://m.media-amazon.com/images/M/MV5BYzZkOGUwMzMtMTgyNS00YjFlLTg5NzYtZTE3Y2E5YTA5NWIyXkEyXkFqcGdeQXVyMjkwOTAyMDU@._V1_QL75_UX190_CR0,0,190,281_.jpg",
      "genre": "SuperHero",
      "labels": ["Action", "Adventure", "Fantasy"],
      "releaseDate": null,
      "film_year": 2022
    }
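
For a genre classification task, the `labels` list of each film is typically turned into a multi-hot vector. The sketch below assumes that labels.json maps genre names to integer indices; the description only says it maps genres to labels 0 to 23, so both the direction of the mapping and the four-entry `LABEL_MAP` are hypothetical.

```python
import json

# Trimmed version of the sample entry, wrapped in an object so it is valid JSON.
SAMPLE = '''{
  "tt6443346": {
    "title": "Black Adam",
    "genre": "SuperHero",
    "labels": ["Action", "Adventure", "Fantasy"],
    "releaseDate": null,
    "film_year": 2022
  }
}'''

# Hypothetical genre-to-index mapping standing in for labels.json.
LABEL_MAP = {"Action": 0, "Adventure": 1, "Comedy": 2, "Fantasy": 3}


def multi_hot(film, label_map):
    """Return a 0/1 vector with a 1 at the index of each genre in film['labels']."""
    vector = [0] * len(label_map)
    for name in film.get("labels", []):
        if name in label_map:  # ignore genres missing from the mapping
            vector[label_map[name]] = 1
    return vector


films = json.loads(SAMPLE)
encoded = {film_id: multi_hot(film, LABEL_MAP) for film_id, film in films.items()}
print(encoded)
```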
    
  16. Top Rated Movies Rated By IMDB

    • kaggle.com
    zip
    Updated Mar 21, 2023
    Cite
    PRAJJAL DHAR (2023). Top Rated Movies Rated By IMDB [Dataset]. https://www.kaggle.com/datasets/prajjaldhar/top-rated-movies-rated-by-imdb/versions/1
    Explore at:
    Available download formats: zip (3156322 bytes)
    Dataset updated
    Mar 21, 2023
    Authors
    PRAJJAL DHAR
    Description

    The IMDB movie data is a comprehensive data set that contains information about movies from the Internet Movie Database (IMDB). It is an extensive collection of movie-related data, including movie titles, release dates, genres, ratings, and reviews.

    The data set contains information about a wide range of movies, including both old and new films from various countries and languages. It is an excellent resource for those interested in movie analysis, as it includes information such as the movie's budget, box office revenue, and cast and crew details.

    The IMDB movie data set is widely used by data scientists, researchers, and movie enthusiasts to perform analysis and draw insights. By analyzing the data, one can gain valuable insights about the movie industry, such as the most popular genres, the most successful directors, and the impact of ratings and reviews on box office performance.

    The data set is available for free and can be downloaded in various formats, including CSV, JSON, and SQL. This makes it easily accessible and usable by anyone interested in conducting analysis on movie-related data.

  17. Large Movie Review Dataset

    • kaggle.com
    zip
    Updated Jan 20, 2018
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Kris (2018). Large Movie Review Dataset [Dataset]. https://www.kaggle.com/pankrzysiu/keras-imdb
    Explore at:
    Available download formats: zip (136987095 bytes)
    Dataset updated
    Jan 20, 2018
    Authors
    Kris
    Description

    Context

    This is a huge dataset and takes around 400 seconds to load into a kernel. If you need IMDB data quickly in a Keras kernel, use the following dataset instead:

    https://www.kaggle.com/pankrzysiu/keras-imdb-reviews

    Content

    A set of 50,000 highly-polarized reviews from the Internet Movie Database.

    Usage Instructions

    aclImdb_v1.zip file

    This file is to be used directly in your code. The .zip file will be automatically uncompressed by Kaggle.

    imdb* files

    from os import makedirs
    from os.path import join, expanduser

    # Create ~/.keras/datasets (and ~/.keras) if they do not exist yet.
    datasets_dir = join(expanduser('~'), '.keras', 'datasets')
    makedirs(datasets_dir, exist_ok=True)

    # If you have multiple input files, change the cp command below accordingly, typically:
    # !cp ../input/keras-imdb/imdb* ~/.keras/datasets/
    # Note: the leading "!" is notebook shell syntax (IPython/Kaggle kernels), not plain Python.
    !cp ../input/imdb* ~/.keras/datasets/
    

    Acknowledgements

    The files are on the net in these locations:

    https://s3.amazonaws.com/text-datasets/imdb.npz

    https://s3.amazonaws.com/text-datasets/imdb_word_index.json

    They are used by keras imdb.py:

    https://github.com/keras-team/keras/blob/master/keras/datasets/imdb.py

    Inspiration

    An example from the "Python Deep Learning" book uses this:

    https://github.com/fchollet/deep-learning-with-python-notebooks/blob/master/6.1-using-word-embeddings.ipynb

  18. 48K IMDB Movies Data

    • kaggle.com
    zip
    Updated Jan 2, 2021
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    reza jafari (2021). 48K IMDB Movies Data [Dataset]. https://www.kaggle.com/rezaunderfit/48k-imdb-movies-data
    Explore at:
    Available download formats: zip (82356556 bytes)
    Dataset updated
    Jan 2, 2021
    Authors
    reza jafari
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    These files contain data for 48,000 movies. They are useful for researchers who want to build online recommender systems. The set contains almost all movies that exist in the MovieLens 1M dataset.

    CITATION

    ================================================================================

    To acknowledge use of the dataset in publications, please cite the following:

    RJ Ziarani, 48K IMDB Movies With Datasets, accessed 25 July 2021, 2021

    Data Folder DESCRIPTION

    ================================================================================

    This folder contains the movie data. You can access each movie with the following pattern:

    Data/Year/IMDBID/IMDBID.json

    Example:

    Data/2020/tt4532038/tt4532038.json

    Example for a movie's data:

    {"@context": "http://schema.org", "@type": "Movie", "url": "/title/tt4532038/", "name": "The War with Grandpa", "image": "https://m.media-amazon.com/images/M/MV5BNTlkZDQ1ODEtY2ZiMS00OGNhLWJlZDctYzY0NTFmNmQ2NDAzXkEyXkFqcGdeQXVyMTkxNjUyNQ@@._V1_.jpg", "genre": ["Comedy", "Drama", "Family"], "contentRating": "PG", "actor": [{"@type": "Person", "url": "/name/nm0000134/", "name": "Robert De Niro"}, {"@type": "Person", "url": "/name/nm0000235/", "name": "Uma Thurman"}, {"@type": "Person", "url": "/name/nm1443527/", "name": "Rob Riggle"}, {"@type": "Person", "url": "/name/nm4625502/", "name": "Oakes Fegley"}], "director": {"@type": "Person", "url": "/name/nm0384722/", "name": "Tim Hill"}, "creator": [{"@type": "Person", "url": "/name/nm0040022/", "name": "Tom J. Astle"}, {"@type": "Person", "url": "/name/nm0256079/", "name": "Matt Ember"}, {"@type": "Person", "url": "/name/nm0809759/", "name": "Robert Kimmel Smith"}, {"@type": "Organization", "url": "/company/co0482253/"}, {"@type": "Organization", "url": "/company/co0017712/"}, {"@type": "Organization", "url": "/company/co0639852/"}, {"@type": "Organization", "url": "/company/co0437328/"}, {"@type": "Organization", "url": "/company/co0641417/"}], "description": "The War with Grandpa is a movie starring Robert De Niro, Uma Thurman, and Rob Riggle. 
Upset that he has to share the room he loves with his grandfather, Peter decides to declare war in an attempt to get it back.", "datePublished": "2020-08-27", "keywords": "mother son relationship,christmas,room,family conflict,family relationships", "aggregateRating": {"@type": "AggregateRating", "ratingCount": 9310, "bestRating": "10.0", "worstRating": "1.0", "ratingValue": "5.5"}, "review": {"@type": "Review", "itemReviewed": {"@type": "CreativeWork", "url": "/title/tt4532038/"}, "author": {"@type": "Person", "name": "byron-116"}, "dateCreated": "2020-08-28", "inLanguage": "English", "name": "Suitable for juveniles only...", "reviewBody": "It's pathetic to watch such great stars in this film apt for juveniles only. Watch it if you are under 14 years old.....", "reviewRating": {"@type": "Rating", "worstRating": "1", "bestRating": "10", "ratingValue": "4"}}, "duration": "PT1H34M", "trailer": {"@type": "VideoObject", "name": "Official Trailer", "embedUrl": "/video/imdb/vi911785497", "thumbnail": {"@type": "ImageObject", "contentUrl": "https://m.media-amazon.com/images/M/MV5BMTdhNWI1N2QtMjQ5Yi00M2M5LWE3YWQtMDE5YmNhMmFmZTVkXkEyXkFqcGdeQXRyYW5zY29kZS13b3JrZmxvdw@@._V1_.jpg"}, "thumbnailUrl": "https://m.media-amazon.com/images/M/MV5BMTdhNWI1N2QtMjQ5Yi00M2M5LWE3YWQtMDE5YmNhMmFmZTVkXkEyXkFqcGdeQXRyYW5zY29kZS13b3JrZmxvdw@@._V1_.jpg", "description": "The next big family-fun film is hitting theaters soon! Check out the trailer for THE WAR WITH GRANDPA starring Robert De Niro, Christopher Walken, Uma Thurman, Rob Riggle, Cheech Marin, Laura Marano and Oakes Fegly. Coming soon to theaters!", "uploadDate": "2020-08-13T17:40:20Z"}}
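
    The path pattern and the record above can be combined into a small reader. This is a minimal sketch, assuming every file follows the Data/Year/IMDBID/IMDBID.json layout and carries the same keys as the example; the helper names (movie_path, duration_minutes, summarize, load_movie) are made up for illustration. Passing strict=False to json.loads is a precaution in case description strings contain raw line breaks, which may not be the case in the actual files.

```python
import json
import re
from pathlib import Path


def movie_path(root, year, imdb_id):
    """Build <root>/<year>/<imdb_id>/<imdb_id>.json, following the pattern above."""
    return Path(root) / str(year) / imdb_id / f"{imdb_id}.json"


def duration_minutes(iso_duration):
    """Convert a duration such as 'PT1H34M' to minutes; None if it does not match."""
    m = re.fullmatch(r"PT(?:(\d+)H)?(?:(\d+)M)?", iso_duration or "")
    if not m or not any(m.groups()):
        return None
    hours, minutes = (int(g) if g else 0 for g in m.groups())
    return hours * 60 + minutes


def summarize(movie):
    """Pick a few fields out of one parsed movie record."""
    rating = movie.get("aggregateRating", {})
    genres = movie.get("genre", [])
    if isinstance(genres, str):
        # Assumption: a single-genre record may store a bare string instead of a list
        genres = [genres]
    return {
        "name": movie.get("name"),
        "genres": genres,
        "rating": float(rating["ratingValue"]) if "ratingValue" in rating else None,
        "minutes": duration_minutes(movie.get("duration")),
    }


def load_movie(root, year, imdb_id):
    """Read and parse one movie file."""
    text = movie_path(root, year, imdb_id).read_text(encoding="utf-8")
    # strict=False lets the parser accept control characters inside strings
    return json.loads(text, strict=False)
```

    For the record shown above, summarize(load_movie("Data", 2020, "tt4532038")) would report a rating of 5.5 and a runtime of 94 minutes, provided the file is laid out as described.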

    Poster Folder DESCRIPTION

    48K IMDB Movies With Posters

  19. Netflix Prize Shows Information (9000 Shows)

    • kaggle.com
    zip
    Updated Oct 24, 2021
    Cite
    Akash Guna (2021). Netflix Prize Shows Information (9000 Shows) [Dataset]. https://www.kaggle.com/datasets/akashguna/netflix-prize-shows-information/discussion
    Explore at:
    Available download formats: zip (10833651 bytes)
    Dataset updated
    Oct 24, 2021
    Authors
    Akash Guna
    Description

    Context

    The Netflix Prize data is one of the most popular datasets available today for OTT recommendation. The Netflix Prize dataset contains only title, user ID, rating, and date of rating as attributes for recommendation. We extend the Netflix Prize dataset by scraping IMDB data about its titles. Any copyright to the scraped data belongs to its respective owners.

    Content

    The dataset contains information on approximately 9000 movies and TV shows from the Netflix Prize dataset, such as movie duration, cast and crew, genre, and languages. Columns that hold multiple values per row store them as arrays. Please use the .json file to access the dataset to avoid string-related errors.
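
    The advice to prefer the .json file can be illustrated with a small round trip. This sketch uses a made-up record; the field names title, genre and cast are assumptions, and the real columns in imdb.json may differ.

```python
import csv
import io
import json

# Hypothetical record; the real column names in imdb.json may differ.
record = {"title": "Example Show", "genre": ["Comedy", "Drama"], "cast": ["A. Actor", "B. Actor"]}

# Round trip through JSON: lists come back as lists.
from_json = json.loads(json.dumps(record))

# Round trip through CSV: every cell comes back as a string.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=list(record))
writer.writeheader()
writer.writerow(record)
buf.seek(0)
from_csv = next(csv.DictReader(buf))

print(type(from_json["genre"]).__name__)  # list
print(type(from_csv["genre"]).__name__)   # str
```

    With CSV, the array arrives as the text "['Comedy', 'Drama']" and has to be parsed again before use, which is the kind of string-related error the JSON file avoids.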

    Inspiration

    Can you build a hybrid recommendation system by combining our dataset with the Netflix Prize dataset?

    Update 1

    Some entries in imdb.csv and imdb.json describe movies that share a title with a Netflix Prize dataset entry but were made after 2005 (the release of the Netflix Prize dataset). This has been corrected in imdb_processed.csv and imdb_processed.json. Please use the processed files for tasks specific to the Netflix Prize dataset.

    Link to Netflix Prize Dataset

    https://www.kaggle.com/netflix-inc/netflix-prize-data

  20. Keras IMDB Reviews

    • kaggle.com
    zip
    Updated Jan 26, 2018
    Cite
    Kris (2018). Keras IMDB Reviews [Dataset]. https://www.kaggle.com/pankrzysiu/keras-imdb-reviews
    Explore at:
    Available download formats: zip (18163301 bytes)
    Dataset updated
    Jan 26, 2018
    Authors
    Kris
    Description

    Context

    Using Keras inside Kaggle requires you to provide cached datasets. This dataset loads quickly into kernels and Keras.

    Content

    A set of 50,000 highly-polarized reviews from the Internet Movie Database.

    Usage Instructions

    imdb* files

    from os import makedirs
    from os.path import join, expanduser

    # Keras looks for cached datasets under ~/.keras/datasets
    cache_dir = expanduser(join('~', '.keras'))
    datasets_dir = join(cache_dir, 'datasets')
    # Also creates ~/.keras; exist_ok avoids an error if the folder is already there
    makedirs(datasets_dir, exist_ok=True)

    # Notebook shell command: copy the dataset files into the Keras cache.
    # If you have multiple input files, change the cp command below accordingly, typically:
    # !cp ../input/keras-imdb-reviews/imdb* ~/.keras/datasets/
    !cp ../input/imdb* ~/.keras/datasets/
    

    Acknowledgements

    The files are on the net in these locations:

    https://s3.amazonaws.com/text-datasets/imdb.npz

    https://s3.amazonaws.com/text-datasets/imdb_word_index.json

    They are used by keras imdb.py:

    https://github.com/keras-team/keras/blob/master/keras/datasets/imdb.py
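
    In imdb.npz the reviews are stored as lists of integer word ids, and imdb_word_index.json maps words to ids. The Keras loader shifts stored ids by 3 by default (its index_from parameter), reserving 0, 1 and 2 for padding, start-of-sequence and unknown words. A minimal sketch of turning a review back into text follows; the tiny word_index here is a stand-in with made-up values, not the real file.

```python
def decode_review(encoded, word_index, index_from=3):
    """Map a list of integer word ids back to text."""
    # Stored ids are offset by index_from relative to the word index,
    # so subtract it before looking each id up; unmapped ids become "?".
    reverse_index = {idx: word for word, idx in word_index.items()}
    return " ".join(reverse_index.get(i - index_from, "?") for i in encoded)


# Tiny stand-in for imdb_word_index.json (hypothetical values)
word_index = {"the": 1, "film": 2, "great": 3}
print(decode_review([1, 4, 5, 6], word_index))  # ? the film great
```

    The leading "?" corresponds to the start-of-sequence marker, which has no entry in the word index.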

    Inspiration

    An example from the book "Deep Learning with Python" uses this dataset: https://github.com/fchollet/deep-learning-with-python-notebooks/blob/master/3.5-classifying-movie-reviews.ipynb

Cite
Prakash Mahara (2025). IMDB Top 100 Movies [Dataset]. https://www.kaggle.com/datasets/prakash27x/imdb-top-100-movies

IMDB Top 100 Movies

Explore at:
7 scholarly articles cite this dataset (View in Google Scholar)
