100+ datasets found
  1. MovieLens 9000 Movies Dataset

    • kaggle.com
    zip
    Updated Aug 24, 2023
    Cite
    Ikram Ali (2023). MovieLens 9000 Movies Dataset [Dataset]. https://www.kaggle.com/datasets/akkefa/movielens-9000-movies-dataset
    Explore at:
    Available download formats: zip (994099 bytes)
    Dataset updated
    Aug 24, 2023
    Authors
    Ikram Ali
    Description

    This dataset (ml-latest-small) describes 5-star rating and free-text tagging activity from MovieLens, a movie recommendation service. It contains 100836 ratings and 3683 tag applications across 9742 movies. These data were created by 610 users between March 29, 1996 and September 24, 2018. This dataset was generated on September 26, 2018.

    Users were selected at random for inclusion. All selected users had rated at least 20 movies. No demographic information is included. Each user is represented by an id, and no other information is provided.

    The data are contained in the files links.csv, movies.csv, ratings.csv and tags.csv.

    Content and Use of Files

    The dataset files are written as comma-separated values files with a single header row. Columns that contain commas (,) are escaped using double-quotes ("). These files are encoded as UTF-8. If accented characters in movie titles or tag values (e.g. Misérables, Les (1995)) display incorrectly, make sure that any program reading the data, such as a text editor, terminal, or script, is configured for UTF-8.
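
The quoting and encoding rules above can be exercised with Python's standard `csv` module. This is a minimal sketch, not part of the dataset's own tooling: the two sample rows are illustrative stand-ins in the movies.csv layout, and `read_rows` is a hypothetical helper name.

```python
import csv
import io

# Illustrative sample in the movies.csv layout (movieId,title,genres);
# titles containing commas are wrapped in double quotes.
SAMPLE = (
    "movieId,title,genres\n"
    '11,"American President, The (1995)",Comedy|Drama|Romance\n'
    '73,"Misérables, Les (1995)",Drama|War\n'
)

def read_rows(stream):
    """Return each data row as a dict keyed by the single header row."""
    return list(csv.DictReader(stream))

# Round-trip through bytes to make the explicit UTF-8 decode visible.
# With a real file this would be open(path, encoding="utf-8", newline="").
raw = SAMPLE.encode("utf-8")
rows = read_rows(io.StringIO(raw.decode("utf-8"), newline=""))

for row in rows:
    print(row["movieId"], row["title"])
```

Passing the encoding explicitly avoids depending on the platform default, which is the usual cause of mangled accented characters.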

    User Ids

    MovieLens users were selected at random for inclusion. Their ids have been anonymized. User ids are consistent between ratings.csv and tags.csv (i.e., the same id refers to the same user across the two files).

    Movie Ids

    Only movies with at least one rating or tag are included in the dataset. These movie ids are consistent with those used on the MovieLens web site (e.g., id 1 corresponds to the URL https://movielens.org/movies/1). Movie ids are consistent between ratings.csv, tags.csv, movies.csv, and links.csv (i.e., the same id refers to the same movie across these four data files).
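
Because movie ids are shared across the four files, a quick referential check is possible before joining them. The sketch below assumes the movieId columns have already been read into lists; `movielens_url` and `unknown_movie_ids` are hypothetical helper names and the id lists are made up.

```python
def movielens_url(movie_id):
    """Build the MovieLens page URL for a movie id, following the pattern in the text."""
    return f"https://movielens.org/movies/{movie_id}"

def unknown_movie_ids(referencing_ids, known_ids):
    """Return ids used in ratings/tags/links that do not appear in movies.csv."""
    return sorted(set(referencing_ids) - set(known_ids))

movies_ids = [1, 2, 3]          # made-up movieId column of movies.csv
ratings_ids = [1, 1, 3, 2, 3]   # made-up movieId column of ratings.csv

missing = unknown_movie_ids(ratings_ids, movies_ids)
print(missing or "all ids resolve", movielens_url(1))
```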

    Ratings Data File Structure (ratings.csv)

    All ratings are contained in the file ratings.csv. Each line of this file after the header row represents one rating of one movie by one user, and has the following format:

    userId,movieId,rating,timestamp

    The lines within this file are ordered first by userId, then, within user, by movieId.

    Ratings are made on a 5-star scale, with half-star increments (0.5 stars - 5.0 stars).

    Timestamps represent seconds since midnight Coordinated Universal Time (UTC) of January 1, 1970.
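
As a rough illustration of the ratings layout, the following sketch converts each row to typed values, checks the half-star scale, and turns the epoch seconds into a timezone-aware UTC datetime. The sample rows are invented, and `parse_rating` is a hypothetical helper, not something shipped with the dataset.

```python
import csv
import io
from datetime import datetime, timezone

# Invented rows in the ratings.csv layout; not copied from the real file.
SAMPLE = (
    "userId,movieId,rating,timestamp\n"
    "1,1,4.0,964982703\n"
    "1,3,4.5,964981247\n"
)

def parse_rating(row):
    """Convert one CSV row (a dict of strings) into typed values."""
    rating = float(row["rating"])
    # Half-star scale: a valid rating lies in [0.5, 5.0] and doubles to an integer.
    if not (0.5 <= rating <= 5.0) or (rating * 2) != int(rating * 2):
        raise ValueError(f"rating outside the half-star scale: {rating}")
    # Timestamps are seconds since 1970-01-01 00:00 UTC.
    when = datetime.fromtimestamp(int(row["timestamp"]), tz=timezone.utc)
    return int(row["userId"]), int(row["movieId"]), rating, when

ratings = [parse_rating(r) for r in csv.DictReader(io.StringIO(SAMPLE))]
for user_id, movie_id, rating, when in ratings:
    print(user_id, movie_id, rating, when.isoformat())
```

Using `tz=timezone.utc` keeps the result independent of the machine's local time zone.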

    Tags Data File Structure (tags.csv)

    All tags are contained in the file tags.csv. Each line of this file after the header row represents one tag applied to one movie by one user, and has the following format:

    userId,movieId,tag,timestamp

    The lines within this file are ordered first by userId, then, within user, by movieId.

    Tags are user-generated metadata about movies. Each tag is typically a single word or short phrase. The meaning, value, and purpose of a particular tag is determined by each user.

    Timestamps represent seconds since midnight Coordinated Universal Time (UTC) of January 1, 1970.
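
Since tags are free text, some normalization is usually needed before counting them. The sketch below groups tags per movie and folds case and surrounding whitespace; that folding is a choice made for this example, not something the dataset prescribes, and the rows are invented.

```python
from collections import Counter, defaultdict

# Invented rows in the tags.csv column order: userId, movieId, tag, timestamp.
TAG_ROWS = [
    (2, 60756, "funny", 1445714994),
    (2, 60756, "Highly quotable", 1445714996),
    (7, 60756, " Funny", 1445715000),
]

def tag_counts_by_movie(rows):
    """Count normalized tags per movie id."""
    counts = defaultdict(Counter)
    for _user_id, movie_id, tag, _timestamp in rows:
        counts[movie_id][tag.strip().lower()] += 1
    return counts

counts = tag_counts_by_movie(TAG_ROWS)
print(counts[60756].most_common())
```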

    Movies Data File Structure (movies.csv)

    Movie information is contained in the file movies.csv. Each line of this file after the header row represents one movie, and has the following format:

    movieId,title,genres

    Movie titles are entered manually or imported from https://www.themoviedb.org/, and include the year of release in parentheses. Errors and inconsistencies may exist in these titles.

    Genres are a pipe-separated list, and are selected from the following:

    • Action
    • Adventure
    • Animation
    • Children's
    • Comedy
    • Crime
    • Documentary
    • Drama
    • Fantasy
    • Film-Noir
    • Horror
    • Musical
    • Mystery
    • Romance
    • Sci-Fi
    • Thriller
    • War
    • Western
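
The title and genres conventions described above can be unpacked with a small amount of parsing. This is a sketch under the assumption that the year, when present, is a four-digit number in trailing parentheses; because the text warns that titles may be inconsistent, the helper returns `None` for the year instead of failing. Both function names are hypothetical.

```python
import re

# Matches a title that ends with a four-digit year in parentheses.
YEAR_SUFFIX = re.compile(r"^(?P<name>.*)\s\((?P<year>\d{4})\)\s*$")

def split_title(title):
    """Return (name, year); year is None when no trailing '(YYYY)' is found."""
    match = YEAR_SUFFIX.match(title)
    if match is None:
        return title.strip(), None
    return match.group("name"), int(match.group("year"))

def split_genres(genres):
    """Split the pipe-separated genres field into a list of genre names."""
    return [genre for genre in genres.split("|") if genre]

print(split_title("Misérables, Les (1995)"))
print(split_genres("Comedy|Drama|Romance"))
```
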
  2. MovieLens Latest Small

    • kaggle.com
    zip
    Updated Oct 12, 2018
    Cite
    GroupLens (2018). MovieLens Latest Small [Dataset]. https://www.kaggle.com/datasets/grouplens/movielens-latest-small
    Explore at:
    Available download formats: zip (993937 bytes)
    Dataset updated
    Oct 12, 2018
    Dataset authored and provided by
    GroupLens
    Description

    Dataset

    This dataset was created by Max Harper

    Released under Other (specified in description)

    Contents

  3. MovieLens 10M Dataset (Latest Version)

    • kaggle.com
    zip
    Updated Feb 9, 2023
    Cite
    Amir Motefaker (2023). MovieLens 10M Dataset (Latest Version) [Dataset]. https://www.kaggle.com/datasets/amirmotefaker/movielens-10m-dataset-latest-version
    Explore at:
    Available download formats: zip (67393808 bytes)
    Dataset updated
    Feb 9, 2023
    Authors
    Amir Motefaker
    License

    https://cdla.io/sharing-1-0/

    Description

    This data set contains 10000054 ratings and 95580 tags applied to 10681 movies by 71567 users of the online movie recommender service MovieLens.

    Users were selected at random for inclusion. All users selected had rated at least 20 movies. Unlike previous MovieLens data sets, no demographic information is included. Each user is represented by an id, and no other information is provided.

    The data are contained in three files, movies.dat, ratings.dat, and tags.dat. Also included are scripts for generating subsets of the data to support the five-fold cross-validation of rating predictions. More details about the contents and use of all these files follow.
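
For readers unfamiliar with five-fold cross-validation, the idea can be sketched in a few lines. This is a generic illustration and not the scripts bundled with the dataset; `five_fold_splits`, the seed, and the item count are all assumptions of the example.

```python
import random

def five_fold_splits(n_items, seed=0):
    """Yield (train_indices, test_indices) for each of five folds.

    Indices are shuffled once, dealt into five folds, and each fold
    serves as the test set exactly once.
    """
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    folds = [indices[start::5] for start in range(5)]
    for held_out in range(5):
        test = folds[held_out]
        train = [i for k, fold in enumerate(folds) if k != held_out for i in fold]
        yield train, test

splits = list(five_fold_splits(23))
for train, test in splits:
    print(len(train), len(test))
```

Each rating index lands in exactly one test fold, so every rating is predicted once across the five runs.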

    This and other GroupLens data sets are publicly available for download at GroupLens Data Sets.

  4. MovieLens 100K

    • grouplens.org
    • kaggle.com
    Updated Oct 12, 2015
    Cite
    (2015). MovieLens 100K [Dataset]. https://grouplens.org/datasets/movielens/100k/
    Explore at:
    Dataset updated
    Oct 12, 2015
    Description

    Stable benchmark dataset. 100,000 ratings from 1000 users on 1700 movies. Released 4/1998.

  5. MovieLens 1M

    • grouplens.org
    • kaggle.com
    Updated Mar 19, 2016
    Cite
    (2016). MovieLens 1M [Dataset]. https://grouplens.org/datasets/movielens/1m/
    Explore at:
    Dataset updated
    Mar 19, 2016
    Description

    Stable benchmark dataset. 1 million ratings from 6000 users on 4000 movies. Released 2/2003.

  6. 🎥 MovieLens Small: Ratings (1995-2019)

    • kaggle.com
    zip
    Updated Apr 22, 2024
    Cite
    Paweł Kauf (2024). 🎥 MovieLens Small: Ratings (1995-2019) [Dataset]. https://www.kaggle.com/datasets/pawelkauf/movielens-25m-ratings-1995-2019
    Explore at:
    Available download formats: zip (696323 bytes)
    Dataset updated
    Apr 22, 2024
    Authors
    Paweł Kauf
    Description

    🔍 Overview: This dataset is part of the MovieLens Latest Datasets. It includes 100,000 ratings on 9,000 movies by 600 users, last updated in September 2018. It is designed for dynamic exploration and testing of machine learning models, particularly suitable for those interested in developing or testing recommender systems. This dataset provides a snapshot of user interactions with movies, ideal for academic purposes and casual experimentation in data science projects.

    ✨ Conditions of Use:

    • Research Use Only: The dataset may be used for any research purposes under the condition that it is not used for commercial or revenue-bearing purposes without explicit permission from a faculty member of the GroupLens Research Project at the University of Minnesota.
    • No Endorsement: Users may not state or imply any endorsement from the University of Minnesota or the GroupLens Research Group.
    • Mandatory Citation: Users must acknowledge the use of the dataset in any publications that result from the use of the data set, by citing: F. Maxwell Harper and Joseph A. Konstan. 2015. The MovieLens Datasets: History and Context. ACM Transactions on Interactive Intelligent Systems (TiiS), 5, 4: 19:1–19:19. DOI
    • Redistribution: The dataset can be redistributed, including transformations, as long as it is distributed under these same license conditions.
    • Disclaimer of Liability: Neither the University of Minnesota, its affiliates, nor employees are liable for any damages arising out of the use or inability to use the dataset (including but not limited to loss of data or data being rendered inaccurate).

  7. Movies and Ratings

    • zenodo.org
    zip
    Updated Aug 23, 2023
    Cite
    Michael Kaufmann (2023). Movies and Ratings [Dataset]. http://doi.org/10.5281/zenodo.7665868
    Explore at:
    Available download formats: zip
    Dataset updated
    Aug 23, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Michael Kaufmann
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Transformed, cleaned dataset with reduced number of columns for all 45,000 movies listed in the full MovieLens dataset of movies released in July 2017 or earlier. Data points include movie ID, title, budget, languages, and genres. This dataset also includes 26 million ratings from 270,000 users for all 45,000 movies. Ratings are given on a scale of 1 to 5 and include user ID, movie ID, rating, and timestamp.

    This dataset consists of the following files:

    * movies.csv: The main movie metadata file. Contains information on 45,000 movies included in the full MovieLens dataset.

    * ratings.csv: The full MovieLens dataset with 26 million ratings and 750,000 tag applications from 270,000 users on all 45,000 movies in this dataset.

    This dataset is a further development of the following public domain dataset published on Kaggle:

    https://www.kaggle.com/datasets/rounakbanik/the-movies-dataset

    This data was obtained from the official GroupLens website. The data was originally obtained from The Movies DataBase (TMDB) via the TMDB AP

  8. MovieLens 1M Dataset

    • kaggle.com
    zip
    Updated Jan 23, 2021
    Cite
    Oded Golden (2021). MovieLens 1M Dataset [Dataset]. https://www.kaggle.com/odedgolden/movielens-1m-dataset
    Explore at:
    Available download formats: zip (6111600 bytes)
    Dataset updated
    Jan 23, 2021
    Authors
    Oded Golden
    Description

    Dataset

    This dataset was created by Oded Golden

    Contents

  9. Movielens 100k dataset

    • kaggle.com
    zip
    Updated Dec 1, 2020
    Cite
    Fakhre Alam (2020). Movielens 100k dataset [Dataset]. https://www.kaggle.com/fakhrealam0786/movielens-100k-dataset
    Explore at:
    Available download formats: zip (534814 bytes)
    Dataset updated
    Dec 1, 2020
    Authors
    Fakhre Alam
    Description

    Context

    This dataset is a subset of MovieLens 100k data which were collected by the GroupLens Research Project at the University of Minnesota. You can find full dataset from here👍

    Content

    This data set consists of 6 columns:

    • movie_id -- unique id for each movie
    • title -- title of the movie
    • year -- year in which the movie was released
    • directors -- director of the movie
    • actors -- actors of the movie
    • genres -- genres of the movie (ex: comedy, action, horror, etc...)

    Acknowledgements

    Thanks to GroupLens for providing this data.

  10. MovieLens 100K Dataset

    • kaggle.com
    zip
    Updated Mar 7, 2021
    Cite
    Upendra Kumar (2021). MovieLens 100K Dataset [Dataset]. https://www.kaggle.com/datasets/imkushwaha/movielens-100k-dataset
    Explore at:
    Available download formats: zip (1487438 bytes)
    Dataset updated
    Mar 7, 2021
    Authors
    Upendra Kumar
    Description

    Dataset

    This dataset was created by Upendra Kumar

    Contents

  11. Datasets to Evaluate Accuracy, Miscalibration and Popularity Lift in...

    • data.niaid.nih.gov
    Updated Sep 12, 2023
    Cite
    Kowald, Dominik (2023). Datasets to Evaluate Accuracy, Miscalibration and Popularity Lift in Recommendations [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7428434
    Explore at:
    Dataset updated
    Sep 12, 2023
    Dataset provided by
    Know-Center (http://know-center.at/)
    Authors
    Kowald, Dominik
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This repository contains three datasets for evaluating accuracy, miscalibration and popularity lift in recommender systems. All datasets contain genre/category information in addition to different user group splits:

    Last.fm (lfm.zip), based on the LFM-1b dataset of JKU Linz (http://www.cp.jku.at/datasets/LFM-1b/)

    MovieLens (ml.zip), based on MovieLens-1M dataset (https://grouplens.org/datasets/movielens/1m/)

    MyAnimeList (anime.zip), based on the MyAnimeList dataset of Kaggle (https://www.kaggle.com/CooperUnion/anime-recommendations-database)

    'user_events_cats.txt' contains the users' rating/interaction data along with a list of genres/categories assigned to the rated items. The list of categories is given in 'categories.txt'. Additionally, assignments to three user groups that differ in their inclination to popular/mainstream items are provided: LowPop in 'low_main_users.txt', MedPop in 'med_main_users.txt', and HighPop in 'high_main_users.txt'.

    The format of the three user files is "user,mainstreaminess".

    The format of the user-events files is "user,item,preference,cats", where different categories are separated by '|'.

    The format of the categories files is "category-name,index", where index refers to the category-id in the user-events files.
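
The two line formats just described could be parsed roughly as follows. This is a sketch under assumptions the description does not confirm: that the files have no header row, that the preference value is numeric, and that category ids are kept as strings. The sample lines are invented and both function names are hypothetical.

```python
def parse_user_event(line):
    """Parse one "user,item,preference,cats" line; cats are '|'-separated ids."""
    user, item, preference, cats = line.rstrip("\n").split(",", 3)
    return user, item, float(preference), [c for c in cats.split("|") if c]

def parse_category(line):
    """Parse one "category-name,index" line; split on the last comma only."""
    name, index = line.rstrip("\n").rsplit(",", 1)
    return name, index

event = parse_user_event("42,1193,5.0,3|7\n")          # invented sample line
category = parse_category("Sci-Fi,3\n")                 # invented sample line
print(event, category)
```

Splitting the category line on its last comma keeps a category name intact even if it contains a comma itself.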

    Example Python-code for analyzing the datasets as well as empirical results on calibration, popularity lift and accuracy can be found on GitHub: https://github.com/domkowald/FairRecSys

  12. MovieLens 25M Dataset

    • kaggle.com
    zip
    Updated Sep 13, 2021
    Cite
    GaryMK (2021). MovieLens 25M Dataset [Dataset]. https://www.kaggle.com/garymk/movielens-25m-dataset
    Explore at:
    Available download formats: zip (270043812 bytes)
    Dataset updated
    Sep 13, 2021
    Authors
    GaryMK
    Description

    Summary

    This dataset (ml-25m) describes 5-star rating and free-text tagging activity from MovieLens, a movie recommendation service. It contains 25000095 ratings and 1093360 tag applications across 62423 movies. These data were created by 162541 users between January 09, 1995 and November 21, 2019. This dataset was generated on November 21, 2019.

    Users were selected at random for inclusion. All selected users had rated at least 20 movies. No demographic information is included. Each user is represented by an id, and no other information is provided.

    The data are contained in the files genome-scores.csv, genome-tags.csv, links.csv, movies.csv, ratings.csv and tags.csv. More details about the contents and use of all these files follows.


  13. MovieLens

    • opendatalab.com
    • tensorflow.org
    • +1more
    zip
    Updated Mar 10, 2023
    Cite
    University of Minnesota (2023). MovieLens [Dataset]. https://opendatalab.com/OpenDataLab/MovieLens
    Explore at:
    Available download formats: zip
    Dataset updated
    Mar 10, 2023
    Dataset provided by
    University of Minnesota
    License

    https://grouplens.org/datasets/movielens/

    Description

    GroupLens Research has collected and made available rating data sets from the MovieLens web site (https://movielens.org). The data sets were collected over various periods of time, depending on the size of the set. Before using these data sets, please review their README files for the usage licenses and other details.

  14. MovieLens-ml_latest-27M

    • kaggle.com
    zip
    Updated Dec 26, 2022
    Cite
    Ganesh Bajaj (2022). MovieLens-ml_latest-27M [Dataset]. https://www.kaggle.com/datasets/bajajganesh/movielensml-latest27m
    Explore at:
    Available download formats: zip (657835356 bytes)
    Dataset updated
    Dec 26, 2022
    Authors
    Ganesh Bajaj
    Description

    Problem Statement:

    Create a movie recommendation system

    Summary

    This dataset (ml-latest) describes 5-star rating and free-text tagging activity from MovieLens, a movie recommendation service. It contains 27753444 ratings and 1108997 tag applications across 58098 movies. These data were created by 283228 users between January 09, 1995 and September 26, 2018. This dataset was generated on September 26, 2018.

    Users were selected at random for inclusion. All selected users had rated at least 1 movie. No demographic information is included. Each user is represented by an id, and no other information is provided.

    The data are contained in the files genome-scores.csv, genome-tags.csv, links.csv, movies.csv, ratings.csv and tags.csv. More details about the contents and use of all these files follow.

    This is a development dataset. As such, it may change over time and is not an appropriate dataset for shared research results. See available benchmark datasets if that is your intent.

    This and other GroupLens data sets are publicly available for download at http://grouplens.org/datasets/.

    Citation

    To acknowledge use of the dataset in publications, please cite the following paper: F. Maxwell Harper and Joseph A. Konstan. 2015. The MovieLens Datasets: History and Context. ACM Transactions on Interactive Intelligent Systems (TiiS) 5, 4: 19:1–19:19. https://doi.org/10.1145/2827872

    Further Information About GroupLens

    GroupLens is a research group in the Department of Computer Science and Engineering at the University of Minnesota. Since its inception in 1992, GroupLens's research projects have explored a variety of fields including:

    1. recommender systems
    2. online communities
    3. mobile and ubiquitous technologies
    4. digital libraries
    5. local geographic information systems

    GroupLens Research operates a movie recommender based on collaborative filtering, MovieLens, which is the source of these data. We encourage you to visit http://movielens.org to try it out!

    If you have exciting ideas for experimental work to conduct on MovieLens, send us an email at grouplens-info@cs.umn.edu - we are always interested in working with external collaborators.

    Content and Use of Files

    Formatting and Encoding

    The dataset files are written as comma-separated values files with a single header row. Columns that contain commas (,) are escaped using double-quotes ("). These files are encoded as UTF-8. If accented characters in movie titles or tag values (e.g. Misérables, Les (1995)) display incorrectly, make sure that any program reading the data, such as a text editor, terminal, or script, is configured for UTF-8.

    User Ids

    MovieLens users were selected at random for inclusion. Their ids have been anonymized. User ids are consistent between ratings.csv and tags.csv (i.e., the same id refers to the same user across the two files).

    Movie Ids

    Only movies with at least one rating or tag are included in the dataset. These movie ids are consistent with those used on the MovieLens web site (e.g., id 1 corresponds to the URL https://movielens.org/movies/1). Movie ids are consistent between ratings.csv, tags.csv, movies.csv, and links.csv (i.e., the same id refers to the same movie across these four data files).

    Ratings Data File Structure (ratings.csv)

    All ratings are contained in the file ratings.csv. Each line of this file after the header row represents one rating of one movie by one user, and has the following format:

    userId,movieId,rating,timestamp

    The lines within this file are ordered first by userId, then, within user, by movieId.

    Ratings are made on a 5-star scale, with half-star increments (0.5 stars - 5.0 stars).

    Timestamps represent seconds since midnight Coordinated Universal Time (UTC) of January 1, 1970.

    Tags Data File Structure (tags.csv)

    All tags are contained in the file tags.csv. Each line of this file after the header row represents one tag applied to one movie by one user, and has the following format:

    userId,movieId,tag,timestamp

    The lines within this file are ordered first by userId, then, within user, by movieId.

    Tags are user-generated metadata about movies. Each tag is typically a single word or short phrase. The meaning, value, and purpose of a particular tag is determined by each user.

    Timestamps represent seconds since midnight Coordinated Universal Time (UTC) of January 1, 1970.
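
Because the rows are ordered by userId first, all rows for one user are contiguous, which matters for a file with tens of millions of lines: per-user statistics can be computed in one streaming pass without loading everything into memory. The sketch below shows the idea with `itertools.groupby`; the rows are invented and `mean_rating_per_user` is a hypothetical helper.

```python
from itertools import groupby
from operator import itemgetter

# Invented (userId, movieId, rating) rows, already sorted by userId then movieId.
ROWS = [
    (1, 1, 4.0),
    (1, 3, 4.0),
    (2, 318, 3.0),
    (2, 333, 4.0),
    (2, 1704, 4.5),
]

def mean_rating_per_user(rows):
    """Yield (userId, mean rating); relies on rows being grouped by userId."""
    for user_id, group in groupby(rows, key=itemgetter(0)):
        values = [rating for _user, _movie, rating in group]
        yield user_id, sum(values) / len(values)

means = dict(mean_rating_per_user(ROWS))
print(means)
```

`groupby` only merges adjacent rows, so this approach would give wrong results on a file that has been shuffled or re-sorted.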

    Movies Data File Structure (movies.csv)

    Movie information is contained in the file movies.csv. Each line of this file after the header row represents one movie, and has the following format: movieId,title,genres

    Movie titles are entered manually or imported from https://www.themoviedb.org/, and include the year of release in parentheses. Errors and inconsistencies may exist in these titles. Genres are a pipe-separated list, and are selected from the following: 1.Action 2.Adventure 3.Animation 4.Children's 5.Comedy 6.Crim...

  15. MovieLens 100k Dataset

    • kaggle.com
    zip
    Updated Jul 28, 2020
    Cite
    vikas bhat (2020). MovieLens 100k Dataset [Dataset]. https://www.kaggle.com/bhatvikas/movielens-100k-dataset
    Explore at:
    Available download formats: zip (4998818 bytes)
    Dataset updated
    Jul 28, 2020
    Authors
    vikas bhat
    Description

    Dataset

    This dataset was created by vikas bhat

    Contents

  16. E-learning Recommender System Dataset

    • search.dataone.org
    • kaggle.com
    Updated Nov 8, 2023
    Cite
    Hafsa, Mounir (2023). E-learning Recommender System Dataset [Dataset]. http://doi.org/10.7910/DVN/BMY3UD
    Explore at:
    Dataset updated
    Nov 8, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Hafsa, Mounir
    Description

    Mandarine Academy Recommender System (MARS) Dataset is captured from a real-world open MOOC (https://mooc.office365-training.com/). The dataset offers both explicit and implicit ratings, for both French and English versions of the MOOC. Compared with classical recommendation datasets like Movielens, this is a rather small dataset due to the nature of available content (educational). However, the dataset offers insights into real-world ratings and provides testing grounds away from common datasets. All items are available online for viewing in both French and English versions. All selected users had rated at least 1 item. No demographic information is included. Each user is represented by an id and job (if available).

    For both French and English, the same kind of files is available in .csv format. We provide the following files:

    • Users: contains information about user ids and their jobs.
    • Items: contains information about items (resources) in the selected language. Contains a mix of feature types.
    • Ratings: Both explicit (Watch time) and implicit (page views of items).

    Formatting and Encoding

    The dataset files are written as comma-separated values files with a single header row. Columns that contain commas (,) are escaped using double quotes ("). These files are encoded as UTF-8.

    User Ids

    User ids are consistent between explicit_ratings.csv and implicit_ratings.csv and users.csv (i.e., the same id refers to the same user across the dataset).

    Item Ids

    Item ids are consistent between explicit_ratings.csv, implicit_ratings.csv, and items.csv (i.e., the same id refers to the same item across the dataset).

    Ratings Data File Structure

    All ratings are contained in the files explicit_ratings.csv and implicit_ratings.csv. Each line of these files after the header row represents one rating of one item by one user, and has the following format:

    item_id,user_id,created_at (implicit_ratings.csv)
    user_id,item_id,watch_percentage,created_at,rating (explicit_ratings.csv)

    Item Data File Structure

    Item information is contained in the file items.csv. Each line of this file after the header row represents one item, and has the following format:

    item_id,language,name,nb_views,description,created_at,Difficulty,Job,Software,Theme,duration,type
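
To illustrate the explicit_ratings.csv layout, the sketch below reads rows by header name and keeps the items each user watched past a threshold. Only the column names come from the description; the id formats, the date format, the value ranges, and the 50 percent threshold are assumptions of the example, and `items_watched_at_least` is a hypothetical helper.

```python
import csv
import io

# Invented rows using the explicit_ratings.csv header; the value formats are guesses.
EXPLICIT = (
    "user_id,item_id,watch_percentage,created_at,rating\n"
    "u1,i10,100,2021-01-05,10\n"
    "u1,i11,20,2021-01-06,2\n"
    "u2,i10,75,2021-02-01,8\n"
)

def items_watched_at_least(stream, threshold):
    """Map each user id to the items whose watch_percentage meets the threshold."""
    watched = {}
    for row in csv.DictReader(stream):
        if float(row["watch_percentage"]) >= threshold:
            watched.setdefault(row["user_id"], []).append(row["item_id"])
    return watched

result = items_watched_at_least(io.StringIO(EXPLICIT), threshold=50)
print(result)
```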

  17. MovieLens Metadata Datasets

    • kaggle.com
    zip
    Updated Dec 23, 2023
    Cite
    thanhnghth (2023). MovieLens Metadata Datasets [Dataset]. https://www.kaggle.com/datasets/thanhnghth/movielens-metadata-datasets
    Explore at:
    Available download formats: zip (210826217 bytes)
    Dataset updated
    Dec 23, 2023
    Authors
    thanhnghth
    Description

    MovieLens has a publicly available full dataset containing approximately 33,000,000 ratings and 2,000,000 tag applications applied to 86,000 movies by 330,975 users between January 09, 1995 and July 20, 2023. It includes tag genome data with 14 million relevance scores across 1,100 tags. This dataset was generated on July 20, 2023.

    A small subset of the dataset contains 100,000 ratings and 3,600 tag applications applied to 9,000 movies by 600 users between March 29, 1996 and September 24, 2018. The subset was generated on September 26, 2018.

    The Metadata Datasets include:

    1. movies_metadata.csv: The file containing metadata collected from TMDB for over 86,000 movies. Data includes budget, revenue, date released, genres, etc.
    2. credits.csv: Complete information on credits for a particular movie. Data includes Director, Producer, Actors, Characters, etc.
    3. keywords.csv: Contains plot keywords associated with a movie.

  18. MovieLens 25M Dataset

    • kaggle.com
    zip
    Updated Sep 3, 2024
    Cite
    Joseph Fernandez (2024). MovieLens 25M Dataset [Dataset]. https://www.kaggle.com/datasets/joeferndz/movielens-25m-dataset
    Explore at:
    Available download formats: zip (270043752 bytes)
    Dataset updated
    Sep 3, 2024
    Authors
    Joseph Fernandez
    Description

    This dataset is about movies and is for educational purposes only. Please read the "Movies - Ratings - README.txt" file for the usage license. I am copying this dataset here only to help me and fellow Kagglers learn about the SURPRISE package for recommendation systems.

    To learn more about the datasets from MovieLens, please visit : https://grouplens.org/datasets/movielens/

    This specific dataset was used by me to learn about the SURPRISE recommendation system module.

    I selected the Movies 25M dataset.

    MovieLens 25M movie ratings. Stable benchmark dataset. 25 million ratings and one million tag applications applied to 62,000 movies by 162,000 users. Includes tag genome data with 15 million relevance scores across 1,129 tags. Released 12/2019

  19. MOVIELENS DATASET

    • kaggle.com
    zip
    Updated Apr 26, 2022
    Cite
    ParryGarg (2022). MOVIELENS DATASET [Dataset]. https://www.kaggle.com/datasets/parrygarg/movielens-dataset
    Explore at:
    Available download formats: zip (152128074 bytes)
    Dataset updated
    Apr 26, 2022
    Authors
    ParryGarg
    Description

    Dataset

    This dataset was created by ParryGarg

    Contents

  20. MovieLens Dataset (Movies & Reviews)

    • kaggle.com
    zip
    Updated Jun 9, 2024
    Cite
    Ebru İşeri Sobay (2024). MovieLens Dataset (Movies & Reviews) [Dataset]. https://www.kaggle.com/datasets/ebruiserisobay/movie-and-rating
    Explore at:
    Available download formats: zip (146846672 bytes)
    Dataset updated
    Jun 9, 2024
    Authors
    Ebru İşeri Sobay
    Description

    The dataset is provided by MovieLens, a movie recommendation service. It contains movies along with their rating scores. It includes 20,000,263 ratings for 27,278 movies. This dataset was created on October 17, 2016. It contains data from 138,493 users and covers the period between January 9, 1995, and March 31, 2015. Users were randomly selected, and all selected users had rated at least 20 movies.

    movie file:

    • movieId: Unique movie identifier.
    • title: Movie title.
    • genres: Genre.

    rating file:

    • userid: Unique user identifier. (UniqueID)
    • movieId: Unique movie identifier. (UniqueID)
    • rating: Rating given to the movie by the user.
    • timestamp: Rating date.
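
Since movieId appears in both files, the rating file can be joined to the movie file to report scores by title. The sketch below computes a mean rating per title; the in-memory sample data is invented and `mean_rating_by_title` is a hypothetical helper.

```python
from collections import defaultdict

# Invented sample: movieId -> title, and (user id, movieId, rating) tuples.
MOVIES = {1: "Toy Story (1995)", 2: "Jumanji (1995)"}
RATINGS = [(1, 1, 4.0), (2, 1, 5.0), (2, 2, 3.0)]

def mean_rating_by_title(ratings, movies):
    """Join ratings to titles on movieId and average the ratings per movie."""
    totals = defaultdict(lambda: [0.0, 0])  # movieId -> [sum of ratings, count]
    for _user_id, movie_id, rating in ratings:
        totals[movie_id][0] += rating
        totals[movie_id][1] += 1
    return {movies[m]: total / count for m, (total, count) in totals.items()}

averages = mean_rating_by_title(RATINGS, MOVIES)
print(averages)
```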
