23 datasets found

IMDB Top 100 Movies
kaggle.com
zip
Updated May 5, 2025
Cite
Prakash Mahara (2025). IMDB Top 100 Movies [Dataset]. https://www.kaggle.com/datasets/prakash27x/imdb-top-100-movies
Explore at:
Available download formats: zip (18499 bytes)
Dataset updated
May 5, 2025
Authors
Prakash Mahara
Description
# 🏆 IMDB Top 100 Movies Dataset

This dataset contains detailed information about the Top 100 movies from IMDb, collected to assist film enthusiasts, data analysts, and machine learning practitioners in exploring trends and insights in the film industry.

📁 Dataset Features

Each movie entry includes:

🎬 Title – Name of the movie
📅 Year – Year of release
⭐ Rating – IMDb user rating (out of 10)
📣 Genres – List of genres the movie belongs to
🎥 Director – Director(s) of the movie
👥 Stars – Leading cast
⏱️ Runtime – Duration in minutes
📝 Summary – A brief synopsis of the movie
🧾 Votes – Number of user votes
💰 Gross – Box office gross (if available)

🎯 Use Cases

Data Visualization: Create graphs showing rating trends, genre distributions, etc.
Recommendation Systems: Build a content-based movie recommender.
NLP Projects: Use summaries for natural language processing tasks.
Exploratory Data Analysis: Great dataset for practicing EDA techniques.

🛠️ Source

The data is derived from IMDb's public listings and compiled into JSON format for easy use in Python-based projects.
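
As a rough sketch of the exploratory analysis mentioned above, the snippet below averages ratings per genre. The key names (`title`, `rating`, `genres`) and the file name in the closing comment are assumptions based on the feature list, not confirmed names from the dataset.

```python
import json
from collections import defaultdict


def mean_rating_by_genre(movies):
    """Return {genre: mean rating} for a list of movie dicts.

    Assumes each dict has a numeric "rating" and a list under "genres";
    these key names are guesses based on the feature list above.
    """
    totals = defaultdict(lambda: [0.0, 0])  # genre -> [sum, count]
    for movie in movies:
        for genre in movie.get("genres", []):
            totals[genre][0] += float(movie["rating"])
            totals[genre][1] += 1
    return {genre: s / n for genre, (s, n) in totals.items()}


# Tiny in-memory sample standing in for the real file.
sample = [
    {"title": "A", "rating": 9.0, "genres": ["Drama"]},
    {"title": "B", "rating": 8.0, "genres": ["Drama", "Crime"]},
]
genre_means = mean_rating_by_genre(sample)

# With the real data (hypothetical file name):
# with open("imdb_top_100.json", encoding="utf-8") as f:
#     genre_means = mean_rating_by_genre(json.load(f))
```

A movie with several genres counts toward each of them, which is one reasonable choice for a genre-distribution chart but not the only one.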
imdb-movies
kaggle.com
zip
Updated Nov 9, 2025
Cite
Emmanuel Pogbe (2025). imdb-movies [Dataset]. https://www.kaggle.com/datasets/emmanuelpogbe/imdb-movies
Explore at:
Available download formats: zip (4041 bytes)
Dataset updated
Nov 9, 2025
Authors
Emmanuel Pogbe
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
The dataset is a JSON file containing movie information from IMDb. Fields in the JSON: title, year, rating, genre, director, votes.

The first 100 entries were taken directly from IMDB Top 100 Movies - https://www.kaggle.com/datasets/prakash27x/imdb-top-100-movies

The next 10 entries are movies produced in 2007 (for a database management project) and were scraped from IMDB by me
IMDB-Reviews
huggingface.co
Updated Jul 27, 2025
Cite
Daksh Bhardwaj (2025). IMDB-Reviews [Dataset]. http://doi.org/10.57967/hf/6003
Explore at:
Unique identifier
https://doi.org/10.57967/hf/6003
Dataset updated
Jul 27, 2025
Authors
Daksh Bhardwaj
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
Dataset Card for IMDb Multi-Movie Review Dataset

Dataset Summary

The IMDb Multi-Movie Review Dataset contains approximately 114,000 user reviews collected from over 150 movies on IMDb. Each movie is stored as a separate JSON file, identified by its movie_id (IMDb ID). Each JSON file includes a list of structured reviews, where every review consists of:

title: A short summary or headline of the review.

review: The full detailed user review.

rating: A numeric rating (1–10)… See the full description on the dataset page: https://huggingface.co/datasets/Daksh0505/IMDB-Reviews.
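
The one-file-per-movie layout described above could be read along the following lines. This is a minimal sketch: it assumes each file is named `<movie_id>.json` and holds a list of dicts with a `rating` key, and it builds a made-up file in a temporary directory so that it runs without the real data.

```python
import json
import tempfile
from pathlib import Path
from statistics import mean


def mean_rating_per_movie(reviews_dir):
    """Map each movie_id (file stem) to the mean of its numeric ratings.

    Assumes one <movie_id>.json file per movie, each holding a list of
    review dicts with a "rating" key; reviews without a rating are skipped.
    """
    result = {}
    for path in sorted(Path(reviews_dir).glob("*.json")):
        with path.open(encoding="utf-8") as f:
            reviews = json.load(f)
        ratings = [float(r["rating"]) for r in reviews if r.get("rating") is not None]
        if ratings:
            result[path.stem] = mean(ratings)
    return result


# Self-contained demo with one made-up movie file.
with tempfile.TemporaryDirectory() as tmp:
    sample_reviews = [
        {"title": "Great", "review": "Loved it.", "rating": 8},
        {"title": "Fine", "review": "It was fine.", "rating": 6},
        {"title": "No score", "review": "Unrated.", "rating": None},
    ]
    (Path(tmp) / "tt0000001.json").write_text(json.dumps(sample_reviews), encoding="utf-8")
    demo_means = mean_rating_per_movie(tmp)
```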
Movies and Tv Shows Dataset
crawlfeeds.com
kaggle.com
csv, zip
Updated Jul 4, 2025
Cite
Crawl Feeds (2025). Movies and Tv Shows Dataset [Dataset]. https://crawlfeeds.com/datasets/movies-and-tv-shows-dataset
Explore at:
Available download formats: zip, csv
Dataset updated
Jul 4, 2025
Dataset authored and provided by
Crawl Feeds
License
https://crawlfeeds.com/privacy_policy
Description
Explore our meticulously curated Movies dataset and TV shows dataset, designed to cater to diverse analytical and research needs. Whether you're a data scientist, a student, or a business professional, these datasets provide valuable insights into the entertainment industry.

Key Features of the Movies Dataset:

Extensive collection of global movies across various genres and languages.

Detailed metadata, including titles, release dates, genres, directors, cast, and ratings.

Regularly updated to ensure relevance and accuracy.

Why Choose Our TV Shows Dataset?

Our TV shows dataset is your gateway to understanding trends in episodic content. It includes:

Comprehensive details about popular and niche TV shows.

Information on episode counts, seasons, ratings, and networks.

Insights into audience preferences and regional programming.

Applications of These Datasets

These datasets are perfect for:

Machine learning models for recommendation systems.

Academic research on media trends and audience behavior.

Business strategies for entertainment platforms.

Unlock the power of TV show data with our Crawl Feeds TV Shows Dataset. Start analyzing today and gain valuable insights into your favorite shows!
All 8000+ Top Rated IMDB Movies Dataset
kaggle.com
zip
Updated Nov 27, 2023
Cite
Rajkumar Dubey 10 (2023). All 8000+ Top Rated IMDB Movies Dataset [Dataset]. https://www.kaggle.com/datasets/rajkumardubey10/all-top-rated-imdb-movies-dataset/discussion
Explore at:
Available download formats: zip (1270149 bytes)
Dataset updated
Nov 27, 2023
Authors
Rajkumar Dubey 10
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Context:

I made this dataset for "Unlock cinematic gems with a dataset featuring IMDB's top-rated movies, ensuring precise and exceptional movie recommendations for an unparalleled viewing experience."

Source: The dataset was collected from The Movie Database (TMDB) using a valid API key. The CSV data was retrieved from https://api.themoviedb.org/3/movie/top_rated/ with proper authorization to access their database.

The raw data obtained from API responses was processed to extract relevant information. This may include parsing JSON responses, handling pagination, and cleaning the data to ensure consistency.
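
The pagination handling mentioned above might look roughly like the sketch below. It is not the author's script: `fetch_page` is a hypothetical callable standing in for the HTTP request, and the `results` / `total_pages` response keys are an assumption that should be checked against the API documentation.

```python
def collect_all_pages(fetch_page):
    """Collect the "results" of every page from a paginated endpoint.

    fetch_page(page_number) must return a dict with "results" and
    "total_pages" keys. That response shape is an assumption here.
    """
    items = []
    page = 1
    while True:
        payload = fetch_page(page)
        items.extend(payload.get("results", []))
        if page >= payload.get("total_pages", 1):
            break
        page += 1
    return items


# Stand-in for a real HTTP call, so the sketch runs offline.
def fake_fetch(page):
    data = {1: ["a", "b"], 2: ["c"]}
    return {"page": page, "total_pages": 2, "results": data[page]}


all_items = collect_all_pages(fake_fetch)
```

Passing the fetch function in keeps the loop testable without network access or an API key.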

Inspiration: The inspiration behind making this dataset is that you can build a recommendation system for your project and you can also do EDA on this dataset and make your mini project.
IMDB Movies User Reviews
kaggle.com
zip
Updated Jul 14, 2022
Cite
Saad Azam (2022). IMDB Movies User Reviews [Dataset]. https://www.kaggle.com/datasets/sadmadlad/imdb-user-reviews
Explore at:
Available download formats: zip (15867263 bytes)
Dataset updated
Jul 14, 2022
Authors
Saad Azam
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Motivation

Bringing you another scraping exercise with BeautifulSoup and Selenium. If you are interested in the scraper, you can check out this link.

Dataset Structure

MovieFolder/
  - metadata.json
  - movieReviews.csv

Movie: Number of User Reviews
- SpiderMan No Way Home: 6034
- Joker: 11357
- Avengers Endgame: 9513
- The Dark Knight: 7642
- Forrest Gump: 2960
- Pulp Fiction: 3475
- The Avengers: 2081
- Morbius: 1910
- Thor: 1864
- John Wick 3: 2417

Challenges you can try

Read Morbius User Reviews

Make a word cloud of one of the movies

Best of luck
Amazon prime tv shows and movies dataset
crawlfeeds.com
csv, zip
Updated Jul 4, 2025
Cite
Crawl Feeds (2025). Amazon prime tv shows and movies dataset [Dataset]. https://crawlfeeds.com/datasets/amazon-prime-tv-shows-and-movies-dataset
Explore at:
Available download formats: zip, csv
Dataset updated
Jul 4, 2025
Dataset authored and provided by
Crawl Feeds
License
https://crawlfeeds.com/privacy_policy
Description
Amazon Prime TV Shows and Movies Dataset offered by Crawl Feeds is an extensive resource containing over 92,000 records in JSON format. This dataset encompasses a wide array of data points, including links, titles, descriptions, release dates, genres, posters, streaming platforms, countries, number of seasons, content ratings, IMDb ratings, cast and crew details, unique identifiers, and scraping timestamps. Such comprehensive information is invaluable for researchers, data analysts, and developers aiming to conduct in-depth analyses, develop recommendation systems, or explore trends within Amazon Prime's content library.

For those interested in broader media datasets, Crawl Feeds also offers the Movies and TV Shows Dataset, which includes 118,000 records, and the IMDb Movie Details Dataset, comprising 250,000 records. These datasets provide extensive information across various platforms, facilitating comparative studies and cross-platform analyses.

Integrating these datasets into your projects can significantly enhance the depth and quality of your analyses, providing a robust foundation for exploring various facets of the entertainment industry. Whether you're developing a new application, conducting market research, or performing academic studies, these datasets serve as a valuable resource for gaining insights into the dynamic world of streaming media.

Explore the Amazon Prime TV Shows and Movies Dataset and other related datasets on Crawl Feeds to elevate your data-driven projects.
IMDB Most Popular by Year
kaggle.com
zip
Updated Feb 22, 2017
Cite
gilad (2017). IMDB Most Popular by Year [Dataset]. https://www.kaggle.com/giladstern/imdb-most-popular-by-year
Explore at:
Available download formats: zip (8374255 bytes)
Dataset updated
Feb 22, 2017
Authors
gilad
Description
Content

Around 100,000 movies acquired from IMDB. The most popular items from each year since 1950. The dataset is organized as a JSON file. The JSON is of the following format:

{
  year1 : {
    movie_title1 : { 'genre' : [genre1, genre2,...], 'synopsis' : synopsis_string }
    movie_title2 : ...
  }
  year2 : ....
}

It should be noted that the list of genres could be empty.
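
For tabular work, the nested year/title structure above can be flattened into one row per movie. The sketch below assumes the year keys are strings that parse as integers (as JSON object keys would be) and uses a made-up one-movie sample rather than the real file.

```python
def flatten_by_year(data):
    """Turn {year: {title: {"genre": [...], "synopsis": str}}} into rows."""
    rows = []
    for year, movies in data.items():
        for title, info in movies.items():
            rows.append({
                "year": int(year),  # assumes keys such as "1950"
                "title": title,
                "genres": list(info.get("genre", [])),  # may be empty
                "synopsis": info.get("synopsis", ""),
            })
    return rows


# Made-up sample in the documented shape.
nested = {"1950": {"Some Film": {"genre": [], "synopsis": "A short synopsis."}}}
rows = flatten_by_year(nested)
```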

We only took movies with a short synopsis, and not the longer "summary" format in IMDB.

The script is also attached as "Crawler.py".
Movies & TV Shows Metadata Dataset (190K+ Records, Horror-Heavy Collection)
crawlfeeds.com
csv, zip
Updated Aug 23, 2025
Cite
Crawl Feeds (2025). Movies & TV Shows Metadata Dataset (190K+ Records, Horror-Heavy Collection) [Dataset]. https://crawlfeeds.com/datasets/movies-tv-shows-metadata-dataset-190k-records-horror-heavy-collection
Explore at:
Available download formats: zip, csv
Dataset updated
Aug 23, 2025
Dataset authored and provided by
Crawl Feeds
License
https://crawlfeeds.com/privacy_policy
Description
This comprehensive dataset features detailed metadata for over 190,000 movies and TV shows, with a strong concentration in the Horror genre. It is ideal for entertainment research, machine learning models, genre-specific trend analysis, and content recommendation systems.

Each record contains rich information, making it perfect for streaming platforms, film industry analysts, or academic media researchers.

Primary Genre Focus: Horror

Use Cases:

Build movie recommendation systems or genre classifiers

Train NLP models on movie descriptions

Analyze Horror content trends over time

Explore box office vs. rating correlations

Enrich entertainment datasets with directorial and cast metadata
TMDB Movies (2000-2020) with `imdb_id`
kaggle.com
zip
Updated May 20, 2020
Cite
Hudson Leonardo MENDES (2020). TMDB Movies (2000-2020) with `imdb_id` [Dataset]. https://www.kaggle.com/hudsonmendes/tmdb-movies-20002020-with-imdb-id
Explore at:
Available download formats: zip (23899453 bytes)
Dataset updated
May 20, 2020
Authors
Hudson Leonardo MENDES
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
Summary

TMDb movies database with id_imdb that can be used for batch processing of TMDb films, as an alternative to requesting the TMDb API 6 million times with the id from IMDb to find the links.

The Problem

IMDb provides snapshots of their databases on titles, casting, etc. However, they do not provide user reviews. Furthermore, it is against their Terms of Use to do any form of Scraping of their webpages.

TMDb, an Alternative to IMDb

TMDb (The Movie Database), on the other hand, does provide user reviews through their API. It is even possible to search for a film by its imdb_id.

However, if for any reason you must stick to the IMDB as your base dataset, and collect information for a good portion of IMDB's 6,782,091 entries, you are doomed.

10% of 6,782,091 would amount to 678,209 API requests, and even though you may not be rate limited, it will still take days.
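
The arithmetic behind that estimate can be checked with a few lines. The request rate below is purely illustrative and not a figure from the dataset description; at two sequential requests per second the job comes to roughly four days.

```python
TOTAL_ENTRIES = 6_782_091              # figure quoted above
requests_needed = TOTAL_ENTRIES // 10  # 10% of the entries


def days_needed(n_requests, requests_per_second):
    """Wall-clock days for n_requests at a steady sequential rate."""
    return n_requests / requests_per_second / 86_400  # seconds per day


# Assumed rate of 2 requests per second, chosen only for illustration.
sequential_days = days_needed(requests_needed, 2)
```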

Solution

I then created this script (https://github.com/hudsonmendes/lambda-tmdb-distributed-downloader), which can be used to download TMDb movies by their IMDb id with a good level of parallelism.

Apart from the extra data that TMDb makes available (like full release date, for example), we attach the IMDb ID that was found (as id_imdb) to the TMDB movie JSON, and save it in S3.

Acknowledgements

It would not have been possible to put this data together without the snapshot of data provided by IMDB or the nice API provided by TMDB. Special thanks to both providers for supplying the data or the API and documentation, for running the infrastructure, and for allowing us, through their terms, to access such data.
Data from: Extended datasets from MM-IMDB and Ads-Parallelity dataset with...
data-staging.niaid.nih.gov
zenodo.org
Updated Feb 24, 2023
Cite
Shunsuke Kitada; Yuki Iwazaki; Riku Togashi; Hitoshi Iyatomi (2023). Extended datasets from MM-IMDB and Ads-Parallelity dataset with the features from Google Cloud Vision API [Dataset]. https://data-staging.niaid.nih.gov/resources?id=zenodo_7050923
Explore at:
Dataset updated
Feb 24, 2023
Dataset provided by
CyberAgent, Inc.
Hosei University
Authors
Shunsuke Kitada; Yuki Iwazaki; Riku Togashi; Hitoshi Iyatomi
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
These are extended versions of the MM-IMDB [Arevalo+ ICLRW'17] and Ads-Parallelity [Zhang+ BMVC'18] datasets with features from the Google Cloud Vision API. The datasets are stored in jsonl (JSON Lines) format.

Abstract (from our paper):

There is increasing interest in the use of multimodal data in various web applications, such as digital advertising and e-commerce. Typical methods for extracting important information from multimodal data rely on a mid-fusion architecture that combines the feature representations from multiple encoders. However, as the number of modalities increases, several potential problems with the mid-fusion model structure arise, such as an increase in the dimensionality of the concatenated multimodal features and missing modalities. To address these problems, we propose a new concept that considers multimodal inputs as a set of sequences, namely, deep multimodal sequence sets (DM2S2). Our set-aware concept consists of three components that capture the relationships among multiple modalities: (a) a BERT-based encoder to handle the inter- and intra-order of elements in the sequences, (b) intra-modality residual attention (IntraMRA) to capture the importance of the elements in a modality, and (c) inter-modality residual attention (InterMRA) to enhance the importance of elements with modality-level granularity further. Our concept exhibits performance that is comparable to or better than the previous set-aware models. Furthermore, we demonstrate that the visualization of the learned InterMRA and IntraMRA weights can provide an interpretation of the prediction results.

Dataset (MM-IMDB and Ads-Parallelity):

We extended two multimodal datasets, namely, MM-IMDB [Arevalo+ ICLRW'17], Ads-Parallelity [Zhang+ BMVC'18] for the empirical experiments. The MM-IMDB dataset contains 25,925 movies with multiple labels (genres). We used the original split provided in the dataset and reported the F1 scores (micro, macro, and samples) of the test set. The Ads-Parallelity dataset contains 670 images and slogans from persuasive advertisements to understand the implicit relationship (parallel and non-parallel) between these two modalities. A binary classification task is used to predict whether the text and image in the same ad convey the same message.

We transformed the following multimodal information (i.e., visual, textual, and categorical data) into textual tokens and fed these into our proposed model. We used the Google Cloud Vision API for the visual features to obtain the following four pieces of information as tokens: (1) text from the OCR, (2) category labels from the label detection, (3) object tags from the object detection, and (4) the number of faces from the facial detection. We input the labels and object detection results as a sequence in order of confidence, as obtained from the API. We describe the visual, textual, and categorical features of each dataset below.

MM-IMDB: We used the title and plot of movies as the textual features, and the aforementioned API results based on poster images as visual features.

Ads-Parallelity: We used the same API-based visual features as in MM-IMDB. Furthermore, we used textual and categorical features consisting of textual inputs of transcriptions and messages, and categorical inputs of natural and text concrete images.
Replication Data for: Movie Scripts Corpus
dataone.org
Updated Sep 24, 2024
Cite
Drouet, Lance (2024). Replication Data for: Movie Scripts Corpus [Dataset]. http://doi.org/10.7910/DVN/PZTL2L
Explore at:
Unique identifier
https://doi.org/10.7910/DVN/PZTL2L
Dataset updated
Sep 24, 2024
Dataset provided by
Harvard Dataverse
Authors
Drouet, Lance
Description
Data Source: https://www.kaggle.com/datasets/gufukuro/movie-scripts-corpus

Data Description: Movie Scripts Corpus

This corpus was collected to use for screenplay analysis with machine learning methods. Corpus includes movie scripts, crawled from different sources, their annotations by script structural elements and movies metadata.

Corpus description

Screenplay data consists of:

Movie scripts TXT-documents with raw full text (2858 docs)

Movie scripts TXT-documents with full text lemmas (2858 docs)

Manual annotation TXT-documents for some movie scripts (33 docs, more than 6000 annotated rows)

Movie scripts annotations TXT-documents obtained by BERT

Movie scripts annotations json-documents obtained by rule-based annotator ScreenPy

Movies metadata consists of:

Cut versions of movie reviews and scores from metacritic. Number of reviews: 21025. Number of movies with reviews: 2038

Metadata for movies, including: title, akas, launch year, score from metacritic, imdb user rating and number of votes from imdb.com, movie awards, opening weekend, producers, budget, script department, production companies, writers, directors, cast info, countries involved in production, age restrict, plot (with outline), keywords, genres, taglines, critics' synopsis

Screenplay awards information: Academy Awards adapted screenplay, Academy Awards original screenplay, BAFTA, Golden Globe Award for Best Screenplay, Writers Guild Awards Winners & Nominees 2020-2013; nominations information for 462 movies in total.

Movie characters data consists of:

Script text fragments with dialogs and scene descriptions for characters, gathered with annotators: 2153 movies and text fragments for 32114 characters in total

Gender labels for 4792 characters
Palestinian Movies JSON Dataset
kaggle.com
zip
Updated Feb 26, 2024
Cite
Sondos Aabed (2024). Palestinian Movies JSON Dataset [Dataset]. https://www.kaggle.com/datasets/sondosaabed/palestinian-movies-json-dataset
Explore at:
Available download formats: zip (9292 bytes)
Dataset updated
Feb 26, 2024
Authors
Sondos Aabed
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
This dataset was scraped from my Internet Movie Database (IMDb) list of Palestinian movies.

An Apify actor was used to scrape the data and download the file: https://console.apify.com/actors/poWuYPmbfLGBn5Mf8/console

This is the list: https://www.imdb.com/list/ls563010565/?sort=alpha,asc&st_dt=&mode=detail&page=1

To use this dataset

The raw JSON response can be used directly:

https://raw.githubusercontent.com/sondosaabed/Palestinian-Movies-JSON-Dataset/main/palestinian_movies.json
The Movies Dataset
kaggle.com
zip
Updated Nov 10, 2017
+ more versions
Cite
Rounak Banik (2017). The Movies Dataset [Dataset]. https://www.kaggle.com/datasets/rounakbanik/the-movies-dataset/code
Explore at:
Available download formats: zip (238862293 bytes)
Dataset updated
Nov 10, 2017
Authors
Rounak Banik
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Context

These files contain metadata for all 45,000 movies listed in the Full MovieLens Dataset. The dataset consists of movies released on or before July 2017. Data points include cast, crew, plot keywords, budget, revenue, posters, release dates, languages, production companies, countries, TMDB vote counts and vote averages.

This dataset also has files containing 26 million ratings from 270,000 users for all 45,000 movies. Ratings are on a scale of 1-5 and have been obtained from the official GroupLens website.

Content

This dataset consists of the following files:

movies_metadata.csv: The main Movies Metadata file. Contains information on 45,000 movies featured in the Full MovieLens dataset. Features include posters, backdrops, budget, revenue, release dates, languages, production countries and companies.

keywords.csv: Contains the movie plot keywords for our MovieLens movies. Available in the form of a stringified JSON Object.

credits.csv: Consists of Cast and Crew Information for all our movies. Available in the form of a stringified JSON Object.

links.csv: The file that contains the TMDB and IMDB IDs of all the movies featured in the Full MovieLens dataset.

links_small.csv: Contains the TMDB and IMDB IDs of a small subset of 9,000 movies of the Full Dataset.

ratings_small.csv: The subset of 100,000 ratings from 700 users on 9,000 movies.
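
The "stringified JSON Object" columns in keywords.csv and credits.csv need parsing before use. The description does not state the exact quoting used in the files, so the sketch below tries strict JSON first and then falls back to Python-literal syntax; the cell value shown is made up.

```python
import ast
import json


def parse_stringified(value):
    """Parse a stringified list/dict cell; return [] if it cannot be parsed."""
    try:
        return json.loads(value)          # strict JSON (double quotes)
    except (TypeError, ValueError):
        pass
    try:
        return ast.literal_eval(value)    # Python-literal style (single quotes)
    except (ValueError, SyntaxError, TypeError):
        return []


cell = "[{'id': 1, 'name': 'example keyword'}]"  # made-up cell value
names = [item["name"] for item in parse_stringified(cell)]
```

`ast.literal_eval` only evaluates literals, so it is a safer fallback than `eval` for untrusted CSV content.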

The Full MovieLens Dataset consisting of 26 million ratings and 750,000 tag applications from 270,000 users on all the 45,000 movies in this dataset can be accessed here

Acknowledgements

This dataset is an ensemble of data collected from TMDB and GroupLens. The Movie Details, Credits and Keywords have been collected from the TMDB Open API. This product uses the TMDb API but is not endorsed or certified by TMDb. Their API also provides access to data on many additional movies, actors and actresses, crew members, and TV shows. You can try it for yourself here.

The Movie Links and Ratings have been obtained from the Official GroupLens website. The files are a part of the dataset available here


Inspiration

This dataset was assembled as part of my second Capstone Project for Springboard's Data Science Career Track. I wanted to perform an extensive EDA on Movie Data to narrate the history and the story of Cinema and use this metadata in combination with MovieLens ratings to build various types of Recommender Systems.

Both my notebooks are available as kernels with this dataset: The Story of Film and Movie Recommender Systems

Some of the things you can do with this dataset:

Predicting movie revenue and/or movie success based on a certain metric.

What movies tend to get higher vote counts and vote averages on TMDB?

Building Content Based and Collaborative Filtering Based Recommendation Engines.
IMDB Genre Classification Dataset
kaggle.com
zip
Updated Nov 24, 2022
Cite
aptmess (2022). IMDB Genre Classification Dataset [Dataset]. https://www.kaggle.com/datasets/nelepie/imdb-genre-classification
Explore at:
Available download formats: zip (11712640006 bytes)
Dataset updated
Nov 24, 2022
Authors
aptmess
License
http://opendatacommons.org/licenses/dbcl/1.0/
Description
10,000 film posters and descriptions parsed from the IMDB site for a genre classification task.

Folder contains:

labels.json - Mapping of film genres to labels from 0 to 23.

parsed_data.json:

Information about each film, identified by its own film_id (based on the IMDB film_id):

title - title of the film

description - film description, parsed from IMDB

poster_url - link to the film's poster

genre - main genre of the film

labels - additional genres of the film

releaseDate - release date of the film

film_year - year of the release date

Sample:

"tt6443346": {
  "title": "Black Adam",
  "description": "Nearly 5,000 years after he was bestowed with the almighty powers of the Egyptian gods - and imprisoned just as quickly - Black Adam is freed from his earthly tomb, ready to unleash his unique form of justice on the modern world.",
  "poster_url": "https://m.media-amazon.com/images/M/MV5BYzZkOGUwMzMtMTgyNS00YjFlLTg5NzYtZTE3Y2E5YTA5NWIyXkEyXkFqcGdeQXVyMjkwOTAyMDU@._V1_QL75_UX190_CR0,0,190,281_.jpg",
  "genre": "SuperHero",
  "labels": ["Action", "Adventure", "Fantasy"],
  "releaseDate": null,
  "film_year": 2022
}

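
For a multi-label classifier, the `labels` list of each film can be turned into a multi-hot vector. The sketch below assumes a genre-name-to-index mapping with contiguous indices starting at 0; the real structure of labels.json is not shown in the description, and the four-genre mapping here is invented for illustration.

```python
def multi_hot(genres, genre_to_index):
    """Return a 0/1 vector with a 1 at the index of each known genre."""
    vector = [0] * len(genre_to_index)
    for genre in genres:
        if genre in genre_to_index:  # silently skip unknown genres
            vector[genre_to_index[genre]] = 1
    return vector


# Hypothetical mapping; the real one would come from labels.json.
genre_to_index = {"Action": 0, "Adventure": 1, "Fantasy": 2, "Drama": 3}
film = {"genre": "SuperHero", "labels": ["Action", "Adventure", "Fantasy"]}
vector = multi_hot(film["labels"], genre_to_index)
```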
Top Rated Movies Rated By IMDB
kaggle.com
zip
Updated Mar 21, 2023
Cite
PRAJJAL DHAR (2023). Top Rated Movies Rated By IMDB [Dataset]. https://www.kaggle.com/datasets/prajjaldhar/top-rated-movies-rated-by-imdb/versions/1
Explore at:
Available download formats: zip (3156322 bytes)
Dataset updated
Mar 21, 2023
Authors
PRAJJAL DHAR
Description
The IMDB movie data is a comprehensive data set that contains information about movies from the Internet Movie Database (IMDB). It is an extensive collection of movie-related data, including movie titles, release dates, genres, ratings, and reviews.

The data set contains information about a wide range of movies, including both old and new films from various countries and languages. It is an excellent resource for those interested in movie analysis, as it includes information such as the movie's budget, box office revenue, and cast and crew details.

The IMDB movie data set is widely used by data scientists, researchers, and movie enthusiasts to perform analysis and draw insights. By analyzing the data, one can gain valuable insights about the movie industry, such as the most popular genres, the most successful directors, and the impact of ratings and reviews on box office performance.

The data set is available for free and can be downloaded in various formats, including CSV, JSON, and SQL. This makes it easily accessible and usable by anyone interested in conducting analysis on movie-related data.
Large Movie Review Dataset
kaggle.com
zip
Updated Jan 20, 2018
Cite
Kris (2018). Large Movie Review Dataset [Dataset]. https://www.kaggle.com/pankrzysiu/keras-imdb
Explore at:
Available download formats: zip (136987095 bytes)
Dataset updated
Jan 20, 2018
Authors
Kris
Description
Context

This is a huge dataset and takes around 400 seconds to load into a kernel. If you need IMDB data quickly in a Keras kernel, use the following dataset instead:

https://www.kaggle.com/pankrzysiu/keras-imdb-reviews

Content

A set of 50,000 highly-polarized reviews from the Internet Movie Database.

Usage Instructions

aclImdb_v1.zip file

This file is to be used directly in your code. The .zip file will be automatically uncompressed by Kaggle.

imdb* files

from os import listdir, makedirs
from os.path import join, exists, expanduser

cache_dir = expanduser(join('~', '.keras'))
if not exists(cache_dir):
    makedirs(cache_dir)
datasets_dir = join(cache_dir, 'datasets')
if not exists(datasets_dir):
    makedirs(datasets_dir)

# If you have multiple input files, change the below cp commands accordingly, typically:
# !cp ../input/keras-imdb/imdb* ~/.keras/datasets/
!cp ../input/imdb* ~/.keras/datasets/
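
The `!cp` line above only works inside a notebook. Outside one, the same copy can be sketched in plain Python with `pathlib` and `shutil`. The function name `copy_imdb_files` is made up for this example, and the demo uses temporary directories instead of the real `~/.keras/datasets` cache.

```python
import shutil
import tempfile
from pathlib import Path


def copy_imdb_files(input_dir, datasets_dir=None):
    """Copy every imdb* file from input_dir into the Keras datasets cache."""
    if datasets_dir is None:
        datasets_dir = Path.home() / ".keras" / "datasets"
    datasets_dir = Path(datasets_dir)
    datasets_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in sorted(Path(input_dir).glob("imdb*")):
        if src.is_file():
            shutil.copy(src, datasets_dir / src.name)
            copied.append(src.name)
    return copied


# Demo with throwaway directories and empty placeholder files.
with tempfile.TemporaryDirectory() as src_dir, tempfile.TemporaryDirectory() as dst_dir:
    (Path(src_dir) / "imdb.npz").write_bytes(b"")
    (Path(src_dir) / "other.txt").write_text("ignored")
    copied_names = copy_imdb_files(src_dir, dst_dir)
    copied_exists = (Path(dst_dir) / "imdb.npz").exists()
```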

Acknowledgements

The files are on the net in these locations:

https://s3.amazonaws.com/text-datasets/imdb.npz

https://s3.amazonaws.com/text-datasets/imdb_word_index.json

They are used by keras imdb.py:

https://github.com/keras-team/keras/blob/master/keras/datasets/imdb.py

Inspiration

"Python Deep Learning" Book example is using this:

https://github.com/fchollet/deep-learning-with-python-notebooks/blob/master/6.1-using-word-embeddings.ipynb
48K IMDB Movies Data
kaggle.com
zip
Updated Jan 2, 2021
Cite
reza jafari (2021). 48K IMDB Movies Data [Dataset]. https://www.kaggle.com/rezaunderfit/48k-imdb-movies-data
Explore at:
Available download formats: zip (82356556 bytes)
Dataset updated
Jan 2, 2021
Authors
reza jafari
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Context

These files contain data on 48,000 movies. The dataset is suited to researchers who want to build online recommender systems. It contains almost all of the movies in the MovieLens 1M dataset.

CITATION

================================================================================

To acknowledge use of the dataset in publications, please cite the following:

RJ Ziarani, 48K IMDB Movies With Datasets, accessed 25 July 2021, 2021

Data Folder DESCRIPTION

================================================================================

This folder contains the movies' data. You can access each movie with the following pattern:

Data/Year/IMDBID/IMDBID.json

Example:

Data/2020/tt4532038/tt4532038.json

Example for a movie's data:

{"@context": "http://schema.org", "@type": "Movie", "url": "/title/tt4532038/", "name": "The War with Grandpa", "image": "https://m.media-amazon.com/images/M/MV5BNTlkZDQ1ODEtY2ZiMS00OGNhLWJlZDctYzY0NTFmNmQ2NDAzXkEyXkFqcGdeQXVyMTkxNjUyNQ@@._V1_.jpg", "genre": ["Comedy", "Drama", "Family"], "contentRating": "PG", "actor": [{"@type": "Person", "url": "/name/nm0000134/", "name": "Robert De Niro"}, {"@type": "Person", "url": "/name/nm0000235/", "name": "Uma Thurman"}, {"@type": "Person", "url": "/name/nm1443527/", "name": "Rob Riggle"}, {"@type": "Person", "url": "/name/nm4625502/", "name": "Oakes Fegley"}], "director": {"@type": "Person", "url": "/name/nm0384722/", "name": "Tim Hill"}, "creator": [{"@type": "Person", "url": "/name/nm0040022/", "name": "Tom J. Astle"}, {"@type": "Person", "url": "/name/nm0256079/", "name": "Matt Ember"}, {"@type": "Person", "url": "/name/nm0809759/", "name": "Robert Kimmel Smith"}, {"@type": "Organization", "url": "/company/co0482253/"}, {"@type": "Organization", "url": "/company/co0017712/"}, {"@type": "Organization", "url": "/company/co0639852/"}, {"@type": "Organization", "url": "/company/co0437328/"}, {"@type": "Organization", "url": "/company/co0641417/"}], "description": "The War with Grandpa is a movie starring Robert De Niro, Uma Thurman, and Rob Riggle. 
Upset that he has to share the room he loves with his grandfather, Peter decides to declare war in an attempt to get it back.", "datePublished": "2020-08-27", "keywords": "mother son relationship,christmas,room,family conflict,family relationships", "aggregateRating": {"@type": "AggregateRating", "ratingCount": 9310, "bestRating": "10.0", "worstRating": "1.0", "ratingValue": "5.5"}, "review": {"@type": "Review", "itemReviewed": {"@type": "CreativeWork", "url": "/title/tt4532038/"}, "author": {"@type": "Person", "name": "byron-116"}, "dateCreated": "2020-08-28", "inLanguage": "English", "name": "Suitable for juveniles only...", "reviewBody": "It's pathetic to watch such great stars in this film apt for juveniles only. Watch it if you are under 14 years old.....", "reviewRating": {"@type": "Rating", "worstRating": "1", "bestRating": "10", "ratingValue": "4"}}, "duration": "PT1H34M", "trailer": {"@type": "VideoObject", "name": "Official Trailer", "embedUrl": "/video/imdb/vi911785497", "thumbnail": {"@type": "ImageObject", "contentUrl": "https://m.media-amazon.com/images/M/MV5BMTdhNWI1N2QtMjQ5Yi00M2M5LWE3YWQtMDE5YmNhMmFmZTVkXkEyXkFqcGdeQXRyYW5zY29kZS13b3JrZmxvdw@@._V1_.jpg"}, "thumbnailUrl": "https://m.media-amazon.com/images/M/MV5BMTdhNWI1N2QtMjQ5Yi00M2M5LWE3YWQtMDE5YmNhMmFmZTVkXkEyXkFqcGdeQXRyYW5zY29kZS13b3JrZmxvdw@@._V1_.jpg", "description": "The next big family-fun film is hitting theaters soon! Check out the trailer for THE WAR WITH GRANDPA starring Robert De Niro, Christopher Walken, Uma Thurman, Rob Riggle, Cheech Marin, Laura Marano and Oakes Fegly. Coming soon to theaters!", "uploadDate": "2020-08-13T17:40:20Z"}}
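
Two small helpers illustrate how the folder pattern and the record above might be used: one builds the Data/Year/IMDBID/IMDBID.json path, the other converts the `duration` field to minutes. Both function names are invented for this sketch, and the duration parser deliberately handles only the hours-and-minutes form seen in the example.

```python
import re
from pathlib import PurePosixPath


def movie_json_path(year, imdb_id, root="Data"):
    """Build Data/Year/IMDBID/IMDBID.json following the pattern above."""
    return PurePosixPath(root) / str(year) / imdb_id / f"{imdb_id}.json"


def duration_minutes(iso_duration):
    """Convert a duration such as "PT1H34M" to minutes.

    Only hours and minutes are handled; other ISO 8601 parts are rejected.
    """
    match = re.fullmatch(r"PT(?:(\d+)H)?(?:(\d+)M)?", iso_duration)
    if match is None:
        raise ValueError(f"unsupported duration: {iso_duration!r}")
    hours, minutes = (int(g) if g else 0 for g in match.groups())
    return hours * 60 + minutes


path = movie_json_path(2020, "tt4532038")
runtime = duration_minutes("PT1H34M")
```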

Poster Folder DESCRIPTION

48K IMDB Movies With Posters
Netflix Prize Shows Information (9000 Shows)
kaggle.com
zip
Updated Oct 24, 2021
Cite
Akash Guna (2021). Netflix Prize Shows Information (9000 Shows) [Dataset]. https://www.kaggle.com/datasets/akashguna/netflix-prize-shows-information/discussion
Explore at:
Available download formats: zip (10833651 bytes)
Dataset updated
Oct 24, 2021
Authors
Akash Guna
Description
Context

Netflix Prize data is one of the most popular datasets available today for OTT recommendation. The Netflix Prize Dataset contains title, user ID, rating, and date of rating as its only attributes for recommendation. We extend the Netflix Prize Dataset by scraping IMDB data about the titles it contains. Any copyright to the scraped data belongs to its respective owners.

Content

The dataset contains information on approximately 9000 movies and TV shows from the Netflix Prize Dataset, including duration, cast and crew, genre, and languages. Columns that hold multiple values per row store those values as arrays. Please use the .json file to access the dataset to avoid string-related errors.
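The string-related errors mentioned here typically arise because a CSV cell can only hold text, so an array column comes back as a string. A minimal sketch using made-up in-memory data (the `title` and `genre` field names are assumptions for illustration, not confirmed column names of this dataset):

```python
import ast
import csv
import io
import json

# Hypothetical record; the real files may use different field names.
json_text = '[{"title": "Example Movie", "genre": ["Comedy", "Drama"]}]'
csv_text = 'title,genre\nExample Movie,"[\'Comedy\', \'Drama\']"\n'

# JSON preserves the array: the value is already a Python list.
json_rows = json.loads(json_text)

# CSV flattens the array into a string that must be parsed again.
csv_rows = list(csv.DictReader(io.StringIO(csv_text)))
raw_genre = csv_rows[0]["genre"]
parsed_genre = ast.literal_eval(raw_genre)  # accepts Python literals only
```

Indexing `raw_genre[0]` would return the character `[` rather than the first genre, which is the kind of mistake the .json file avoids.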

Inspiration

Could you build a hybrid recommendation system by combining our dataset with the Netflix Prize Dataset?

Update 1

Some entries in imdb.csv and imdb.json describe movies that share a title with one in the Netflix Prize Dataset but were made after 2005 (the release of the Netflix Prize Dataset). This has been corrected in imdb_processed.csv and imdb_processed.json. Please use the processed data for tasks specific to the Netflix Prize Dataset.
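The correction described here amounts to dropping entries released after the Netflix Prize cutoff. A rough sketch of that kind of filter, assuming each record carries a numeric `year` field (a hypothetical name; the actual schema and the author's actual procedure are not documented here):

```python
NETFLIX_PRIZE_YEAR = 2005  # cutoff year stated in the description

def drop_later_releases(records, cutoff=NETFLIX_PRIZE_YEAR):
    """Keep records released in or before the cutoff year.

    Records without a year are kept so nothing is silently lost.
    """
    kept = []
    for record in records:
        year = record.get("year")  # hypothetical field name
        if year is None or int(year) <= cutoff:
            kept.append(record)
    return kept

sample = [
    {"title": "Example A", "year": 1999},
    {"title": "Example A", "year": 2012},  # same title, later release
    {"title": "Example B", "year": None},
]
filtered = drop_later_releases(sample)
```

The imdb_processed files already have this correction applied, so a filter like this is only relevant when working from the unprocessed files.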

Link to Netflix Prize Dataset

https://www.kaggle.com/netflix-inc/netflix-prize-data
Keras IMDB Reviews
kaggle.com
zip
Updated Jan 26, 2018
Cite
Kris (2018). Keras IMDB Reviews [Dataset]. https://www.kaggle.com/pankrzysiu/keras-imdb-reviews
Explore at:
Available download formats: zip (18163301 bytes)
Dataset updated
Jan 26, 2018
Authors
Kris
Description
Context

Using Keras inside Kaggle requires you to provide cached datasets. This dataset loads quickly into kernels and Keras.

Content

A set of 50,000 highly-polarized reviews from the Internet Movie Database.

Usage Instructions

imdb* files

from os import makedirs
from os.path import join, expanduser

cache_dir = expanduser(join('~', '.keras'))
datasets_dir = join(cache_dir, 'datasets')
makedirs(datasets_dir, exist_ok=True)  # also creates ~/.keras if it is missing

# If you have multiple input files, change the below cp commands accordingly, typically:
# !cp ../input/keras-imdb-reviews/imdb* ~/.keras/datasets/
!cp ../input/imdb* ~/.keras/datasets/

Acknowledgements

The files are on the net in these locations:

https://s3.amazonaws.com/text-datasets/imdb.npz

https://s3.amazonaws.com/text-datasets/imdb_word_index.json

They are used by keras imdb.py:

https://github.com/keras-team/keras/blob/master/keras/datasets/imdb.py
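For context, imdb.npz stores each review as a list of integer word indices, and imdb_word_index.json maps words to indices. With the default arguments of `keras.datasets.imdb.load_data`, indices are shifted by 3 and the values 0, 1, and 2 are reserved for padding, start-of-sequence, and unknown words. A small sketch of decoding under those defaults, using a toy word index in place of the real file (the `decode_review` helper is illustrative, not part of Keras):

```python
INDEX_FROM = 3  # default offset applied by keras.datasets.imdb.load_data
RESERVED = {0: "<pad>", 1: "<start>", 2: "<unk>"}

def decode_review(encoded, word_index):
    """Turn a list of integer indices back into text."""
    reverse = {idx + INDEX_FROM: word for word, idx in word_index.items()}
    reverse.update(RESERVED)
    return " ".join(reverse.get(i, "<unk>") for i in encoded)

# Toy stand-in for imdb_word_index.json (the real file has many more words).
toy_word_index = {"the": 1, "movie": 2, "great": 3}
encoded = [1, 4, 5, 2, 6]
decoded = decode_review(encoded, toy_word_index)
```

If `load_data` is called with a non-default `index_from`, `start_char`, or `oov_char`, the constants above would need to change to match.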

Inspiration

The "Deep Learning with Python" book example uses this dataset: https://github.com/fchollet/deep-learning-with-python-notebooks/blob/master/3.5-classifying-movie-reviews.ipynb


Cite

Prakash Mahara (2025). IMDB Top 100 Movies [Dataset]. https://www.kaggle.com/datasets/prakash27x/imdb-top-100-movies

IMDB Top 100 Movies

Explore at:

7 scholarly articles cite this dataset (View in Google Scholar)


