This dataset contains a set of movie ratings from the MovieLens website, a movie recommendation service. This dataset was collected and maintained by GroupLens, a research group at the University of Minnesota. There are 5 versions included: "25m", "latest-small", "100k", "1m", "20m". In all datasets, the movies data and ratings data are joined on "movieId". The 25m dataset, latest-small dataset, and 20m dataset contain only movie data and rating data. The 1m dataset and 100k dataset contain demographic data in addition to movie and rating data.
For each version, users can view either only the movies data by adding the "-movies" suffix (e.g. "25m-movies") or the ratings data joined with the movies data (and users data in the 1m and 100k datasets) by adding the "-ratings" suffix (e.g. "25m-ratings").
The features below are included in all versions with the "-ratings" suffix.
The "100k-ratings" and "1m-ratings" versions in addition include the following demographic features.
The "100k-ratings" dataset also has a "raw_user_age" feature, which is the exact age of the user who made the rating.
Datasets with the "-movies" suffix contain only "movie_id", "movie_title", and "movie_genres" features.
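The version and suffix rules above can be sketched as a small helper. This is only an illustration: the function names (config_name, has_demographics, load_config) are hypothetical and not part of tensorflow_datasets, and the sketch assumes the usual TFDS convention of addressing a config as "dataset/config" (for example "movielens/25m-ratings").

```python
# Hypothetical helpers; only the version and suffix strings come from the description above.
VERSIONS = ("25m", "latest-small", "100k", "1m", "20m")
SUFFIXES = ("movies", "ratings")
DEMOGRAPHIC_VERSIONS = ("100k", "1m")  # versions that ship user demographic data


def config_name(version: str, suffix: str) -> str:
    """Build a name such as 'movielens/25m-ratings' to pass to tfds.load."""
    if version not in VERSIONS:
        raise ValueError(f"unknown version: {version!r}")
    if suffix not in SUFFIXES:
        raise ValueError(f"unknown suffix: {suffix!r}")
    return f"movielens/{version}-{suffix}"


def has_demographics(version: str, suffix: str) -> bool:
    """True when the config carries demographic features (ratings configs of 100k and 1m)."""
    return suffix == "ratings" and version in DEMOGRAPHIC_VERSIONS


def load_config(version: str, suffix: str):
    """Load the 'train' split of one config; downloads the data on first use."""
    # Imported lazily so the helpers above work without tensorflow_datasets installed.
    import tensorflow_datasets as tfds
    return tfds.load(config_name(version, suffix), split="train")
```

For instance, config_name("100k", "ratings") gives "movielens/100k-ratings", and has_demographics("100k", "ratings") reports that this config includes the demographic features.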
To use this dataset:
import tensorflow_datasets as tfds
ds = tfds.load('movielens', split='train')
for ex in ds.take(4):
    print(ex)
See the guide for more information on tensorflow_datasets.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Movie Lens Small Latest Dataset’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/shubhammehta21/movie-lens-small-latest-dataset on 30 September 2021.
--- Dataset description provided by original source is as follows ---
This dataset (ml-latest-small) describes 5-star rating and free-text tagging activity from MovieLens, a movie recommendation service. It contains 100836 ratings and 3683 tag applications across 9742 movies. These data were created by 610 users between March 29, 1996 and September 24, 2018. This dataset was generated on September 26, 2018.
Users were selected at random for inclusion. All selected users had rated at least 20 movies. No demographic information is included. Each user is represented by an id, and no other information is provided.
The data are contained in the files links.csv, movies.csv, ratings.csv and tags.csv. More details about the contents and use of all these files follow.
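As a rough sketch of how ratings.csv and movies.csv might be combined, the example below joins the two on "movieId" using only the standard library. The column names (userId, movieId, rating, timestamp; movieId, title, genres) and the pipe-separated genres are assumptions based on the GroupLens README, not something stated in this description. The inline rows are made up for illustration, and join_ratings_with_movies is a hypothetical helper name.

```python
import csv
import io

# Illustrative rows only; with the real files, pass open("ratings.csv", newline="") etc.
RATINGS_CSV = """userId,movieId,rating,timestamp
1,1,4.0,1000000000
1,3,4.5,1000000100
2,1,3.5,1000000200
"""

MOVIES_CSV = """movieId,title,genres
1,Toy Story (1995),Adventure|Animation|Children|Comedy|Fantasy
3,Grumpier Old Men (1995),Comedy|Romance
"""


def join_ratings_with_movies(ratings_file, movies_file):
    """Attach title and genres to each rating by looking up its movieId."""
    movies = {row["movieId"]: row for row in csv.DictReader(movies_file)}
    joined = []
    for rating in csv.DictReader(ratings_file):
        movie = movies.get(rating["movieId"])
        if movie is None:
            continue  # skip ratings whose movie is not listed
        joined.append({
            "userId": rating["userId"],
            "movieId": rating["movieId"],
            "rating": float(rating["rating"]),
            "title": movie["title"],
            "genres": movie["genres"].split("|"),
        })
    return joined


rows = join_ratings_with_movies(io.StringIO(RATINGS_CSV), io.StringIO(MOVIES_CSV))
```

Building a dictionary keyed by movieId keeps the join to a single pass over the ratings, which matters because the ratings file is much larger than the movies file.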
This is a development dataset. As such, it may change over time and is not an appropriate dataset for shared research results. See available benchmark datasets if that is your intent.
This and other GroupLens data sets are publicly available for download at
--- Original source retains full ownership of the source dataset ---
This and other GroupLens data sets are publicly available for download at http://grouplens.org/datasets/.
License: This dataset is sourced from the GroupLens Research Group at the University of Minnesota. It is provided for non-commercial research and educational purposes only. License details can be found here under Usage License - https://files.grouplens.org/datasets/movielens/ml-latest-small-README.html
Important:
Citation F. Maxwell Harper and Joseph A. Konstan. 2015. The MovieLens Datasets: History and Context. ACM Transactions on Interactive Intelligent Systems (TiiS) 5, 4: 19:1–19:19. https://doi.org/10.1145/2827872