19 datasets found
  1. MovieLens 100K

    • grouplens.org
    • kaggle.com
    Updated Oct 12, 2015
    Cite
    (2015). MovieLens 100K [Dataset]. https://grouplens.org/datasets/movielens/100k/
    Description

    Stable benchmark dataset. 100,000 ratings from 1000 users on 1700 movies. Released 4/1998.

  2. MovieLens 1M

    • grouplens.org
    • kaggle.com
    Updated Mar 19, 2016
    + more versions
    Cite
    (2016). MovieLens 1M [Dataset]. https://grouplens.org/datasets/movielens/1m/
    Description

    Stable benchmark dataset. 1 million ratings from 6000 users on 4000 movies. Released 2/2003.

  3. MovieLens 20M

    • grouplens.org
    • academictorrents.com
    Updated Mar 19, 2016
    + more versions
    Cite
    (2016). MovieLens 20M [Dataset]. https://grouplens.org/datasets/movielens/20m/
    Description

    Stable benchmark dataset. 20 million ratings and 465,000 tag applications applied to 27,000 movies by 138,000 users. Includes tag genome data with 12 million relevance scores across 1,100 tags. Released 4/2015; updated 10/2016 to update links.csv and add tag genome data.

  4. MovieLens 25M

    • grouplens.org
    Updated Dec 11, 2019
    Cite
    (2019). MovieLens 25M [Dataset]. https://grouplens.org/datasets/movielens/25m/
    Description

    Stable benchmark dataset. 25 million ratings and one million tag applications applied to 62,000 movies by 162,000 users. Includes tag genome data with 15 million relevance scores across 1,129 tags. Released 12/2019.

  5. movielens

    • tensorflow.org
    • opendatalab.com
    • +1 more
    Updated Jul 8, 2020
    Cite
    (2020). movielens [Dataset]. https://www.tensorflow.org/datasets/catalog/movielens
    Description

    This dataset contains a set of movie ratings from the MovieLens website, a movie recommendation service. This dataset was collected and maintained by GroupLens, a research group at the University of Minnesota. There are 5 versions included: "25m", "latest-small", "100k", "1m", "20m". In all datasets, the movies data and ratings data are joined on "movieId". The 25m dataset, latest-small dataset, and 20m dataset contain only movie data and rating data. The 1m dataset and 100k dataset contain demographic data in addition to movie and rating data.

    • "25m": This is the latest stable version of the MovieLens dataset. It is recommended for research purposes.
    • "latest-small": This is a small subset of the latest version of the MovieLens dataset. It is changed and updated over time by GroupLens.
    • "100k": This is the oldest version of the MovieLens datasets. It is a small dataset with demographic data.
    • "1m": This is the largest MovieLens dataset that contains demographic data.
    • "20m": This is one of the most used MovieLens datasets in academic papers along with the 1m dataset.

    For each version, users can view either only the movies data by adding the "-movies" suffix (e.g. "25m-movies") or the ratings data joined with the movies data (and users data in the 1m and 100k datasets) by adding the "-ratings" suffix (e.g. "25m-ratings").
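
    The naming scheme above can be sketched as a small helper. This is a minimal illustration, not part of tensorflow_datasets: `config_name` is a hypothetical function, and the commented-out last line assumes that `tfds.load` accepts the `'movielens/<config>'` form.

```python
# Versions and suffixes as listed in the description above.
VERSIONS = ("25m", "latest-small", "100k", "1m", "20m")
SUFFIXES = ("movies", "ratings")


def config_name(version: str, suffix: str) -> str:
    """Build a configuration name such as '25m-ratings' (hypothetical helper)."""
    if version not in VERSIONS:
        raise ValueError(f"unknown version: {version!r}")
    if suffix not in SUFFIXES:
        raise ValueError(f"unknown suffix: {suffix!r}")
    return f"{version}-{suffix}"


name = config_name("1m", "ratings")
print(name)

# Assumed usage with tensorflow_datasets (not executed here):
# ds = tfds.load(f"movielens/{name}", split="train")
```

    Validating the two parts separately keeps a typo such as "1M" from silently producing a configuration name that does not exist.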

    The features below are included in all versions with the "-ratings" suffix.

    • "movie_id": a unique identifier of the rated movie
    • "movie_title": the title of the rated movie with the release year in parentheses
    • "movie_genres": a sequence of genres to which the rated movie belongs
    • "user_id": a unique identifier of the user who made the rating
    • "user_rating": the score of the rating on a five-star scale
    • "timestamp": the timestamp of the ratings, represented in seconds since midnight Coordinated Universal Time (UTC) of January 1, 1970
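
    Because "timestamp" counts seconds since the Unix epoch, it can be turned into a readable UTC date with the standard library alone. The sketch below assumes the value has already been extracted as a plain Python integer; `rating_time` is a hypothetical helper name and the sample value is illustrative.

```python
from datetime import datetime, timezone


def rating_time(timestamp: int) -> datetime:
    """Convert seconds since the Unix epoch into a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc)


# 881250949 is an illustrative value, not taken from the text above.
print(rating_time(881250949).isoformat())
```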

    The "100k-ratings" and "1m-ratings" versions in addition include the following demographic features.

    • "user_gender": gender of the user who made the rating; a true value corresponds to male
    • "bucketized_user_age": bucketized age values of the user who made the rating; the values and the corresponding ranges are:
      • 1: "Under 18"
      • 18: "18-24"
      • 25: "25-34"
      • 35: "35-44"
      • 45: "45-49"
      • 50: "50-55"
      • 56: "56+"
    • "user_occupation_label": the occupation of the user who made the rating represented by an integer-encoded label; labels are preprocessed to be consistent across different versions
    • "user_occupation_text": the occupation of the user who made the rating in the original string; different versions can have different sets of raw text labels
    • "user_zip_code": the zip code of the user who made the rating

    In addition, the "100k-ratings" dataset has a feature "raw_user_age", which is the exact age of the user who made the rating.
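
    To relate an exact age to the bucket codes listed for "bucketized_user_age", a lookup along the following lines could be used. It is a sketch under one assumption: every code except 1 is read as the lower bound of its range. `bucketize_age` is a hypothetical helper, not a tensorflow_datasets function.

```python
from bisect import bisect_right

# Bucket codes and labels copied from the list above.
AGE_BUCKETS = {
    1: "Under 18", 18: "18-24", 25: "25-34", 35: "35-44",
    45: "45-49", 50: "50-55", 56: "56+",
}
# Assumption: each code other than 1 is the lower bound of its range.
_LOWER_BOUNDS = [18, 25, 35, 45, 50, 56]
_CODES = [1, 18, 25, 35, 45, 50, 56]


def bucketize_age(raw_age: int) -> int:
    """Map an exact age to its bucket code (hypothetical helper)."""
    return _CODES[bisect_right(_LOWER_BOUNDS, raw_age)]


code = bucketize_age(40)
print(code, AGE_BUCKETS[code])
```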

    Datasets with the "-movies" suffix contain only "movie_id", "movie_title", and "movie_genres" features.

    To use this dataset:

    import tensorflow_datasets as tfds

    # Load the default MovieLens configuration and print the first four examples.
    ds = tfds.load('movielens', split='train')
    for ex in ds.take(4):
        print(ex)
    

    See the guide for more information on tensorflow_datasets.

  6. MovieLens 20M Dataset

    • kaggle.com
    Updated Aug 15, 2018
    Cite
    GroupLens (2018). MovieLens 20M Dataset [Dataset]. https://www.kaggle.com/datasets/grouplens/movielens-20m-dataset/code
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    GroupLens
    Description

    Context

    The dataset describes ratings and free-text tagging activities from MovieLens, a movie recommendation service. It contains 20000263 ratings and 465564 tag applications across 27278 movies. These data were created by 138493 users between January 09, 1995 and March 31, 2015. This dataset was generated on October 17, 2016.

    Users were selected at random for inclusion. All selected users had rated at least 20 movies.

    Content

    No demographic information is included. Each user is represented by an id, and no other information is provided.

    The data are contained in six files.

    tag.csv that contains tags applied to movies by users:

    • userId

    • movieId

    • tag

    • timestamp

    rating.csv that contains ratings of movies by users:

    • userId

    • movieId

    • rating

    • timestamp

    movie.csv that contains movie information:

    • movieId

    • title

    • genres

    link.csv that contains identifiers that can be used to link to other sources:

    • movieId

    • imdbId

    • tmdbId

    genome_scores.csv that contains movie-tag relevance data:

    • movieId

    • tagId

    • relevance

    genome_tags.csv that contains tag descriptions:

    • tagId

    • tag
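
    Since rating.csv and movie.csv share the movieId column, ratings can be enriched with titles and genres by a simple join. The sketch below uses only the standard library and tiny inline stand-ins for the two files; the sample rows, the integer timestamp format, and the helper name `join_ratings_with_movies` are assumptions made for illustration.

```python
import csv
import io

# Tiny inline stand-ins for movie.csv and rating.csv; the rows are illustrative only.
MOVIE_CSV = """movieId,title,genres
1,Toy Story (1995),Adventure|Animation|Children|Comedy|Fantasy
2,Jumanji (1995),Adventure|Children|Fantasy
"""
RATING_CSV = """userId,movieId,rating,timestamp
1,2,3.5,1112486027
2,1,4.0,1112484676
"""


def join_ratings_with_movies(rating_file, movie_file):
    """Attach title and genres to each rating row via the shared movieId column."""
    movies = {row["movieId"]: row for row in csv.DictReader(movie_file)}
    joined = []
    for rating in csv.DictReader(rating_file):
        movie = movies.get(rating["movieId"])
        if movie is None:
            continue  # skip ratings whose movie is missing from movie.csv
        joined.append({**rating, "title": movie["title"], "genres": movie["genres"]})
    return joined


rows = join_ratings_with_movies(io.StringIO(RATING_CSV), io.StringIO(MOVIE_CSV))
for row in rows:
    print(row["userId"], row["title"], row["rating"])
```

    With the real files, the two `io.StringIO` objects would be replaced by opened file handles; for 20 million ratings a dataframe library would likely be the more practical choice.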

    Acknowledgements

    The original datasets can be found here. To acknowledge use of the dataset in publications, please cite the following paper:

    F. Maxwell Harper and Joseph A. Konstan. 2015. The MovieLens Datasets: History and Context. ACM Transactions on Interactive Intelligent Systems (TiiS) 5, 4, Article 19 (December 2015), 19 pages. DOI=http://dx.doi.org/10.1145/2827872

    Inspiration

    Some ideas worth exploring:

    • Which genres receive the highest ratings? How does this change over time?

    • Determine the temporal trends in the genres/tagging activity of the movies released
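
    The first question can be approached by averaging ratings per genre. The following is a minimal sketch on made-up data; it assumes that the genres field is a pipe-separated string, which the description above does not state, and `mean_rating_by_genre` is a hypothetical helper.

```python
from collections import defaultdict

# Illustrative (rating, genres) pairs; the pipe-separated genre format is an assumption.
SAMPLE = [
    (4.0, "Adventure|Comedy"),
    (3.0, "Comedy"),
    (5.0, "Adventure"),
]


def mean_rating_by_genre(pairs):
    """Return {genre: mean rating}, counting a rating once for each of its genres."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for rating, genres in pairs:
        for genre in genres.split("|"):
            totals[genre] += rating
            counts[genre] += 1
    return {genre: totals[genre] / counts[genre] for genre in totals}


averages = mean_rating_by_genre(SAMPLE)
for genre, avg in sorted(averages.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{genre}: {avg:.2f}")
```

    Grouping the same pairs by year before averaging would give a starting point for the "over time" part of the question.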

  7. Standardized Hudup dataset based on Movielens 100k

    • dataverse.harvard.edu
    • data.mendeley.com
    Updated Feb 16, 2021
    + more versions
    Cite
    Loc Nguyen (2021). Standardized Hudup dataset based on Movielens 100k [Dataset]. http://doi.org/10.7910/DVN/ZF3GWF
    Dataset provided by
    Harvard Dataverse
    Authors
    Loc Nguyen
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    The standardized Hudup dataset receives its information from raw data and is composed of ten units: “hdp_config”, “hdp_account”, “hdp_attribute_map”, “hdp_nominal”, “hdp_user”, “hdp_item”, “hdp_rating”, “hdp_context_template”, “hdp_context”, and “hdp_sample”. Each unit has a particular function, which is described in the data description section. The Hudup dataset is metadata that models any raw data at an abstract level. The default raw data source here is the Movielens 100K dataset (GroupLens, 1998), which has 100,000 ratings from 943 users on 1682 movies (items) and is available at https://files.grouplens.org/datasets/movielens/ml-100k.zip.
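
    The figures quoted above (100,000 ratings, 943 users, 1682 movies) are enough to work out how sparse the user-item rating matrix is. The short calculation below is only a sketch of that arithmetic; `rating_density` is a hypothetical helper and it assumes each user rates a given movie at most once.

```python
def rating_density(n_ratings: int, n_users: int, n_items: int) -> float:
    """Fraction of the user-item matrix that holds a rating."""
    return n_ratings / (n_users * n_items)


# Counts taken from the description above.
density = rating_density(100_000, 943, 1682)
print(f"density: {density:.4f}, sparsity: {1 - density:.4f}")
```

    Roughly 6.3% of the possible user-movie pairs carry a rating, so about 94% of the matrix is empty.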

  8. MovieLens 32M

    • grouplens.org
    Updated May 19, 2024
    + more versions
    Cite
    (2024). MovieLens 32M [Dataset]. https://grouplens.org/datasets/movielens/32m/
    Description

    Stable benchmark dataset. 32 million ratings and two million tag applications applied to 87,585 movies by 200,948 users. Collected 10/2023. Released 05/2024.

  9. Book Genome Dataset

    • kaggle.com
    Updated May 30, 2023
    Cite
    Daniel Young (2023). Book Genome Dataset [Dataset]. https://www.kaggle.com/datasets/youngdaniel/book-genome-dataset
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Daniel Young
    License

    Attribution-NonCommercial-ShareAlike 3.0 (CC BY-NC-SA 3.0): https://creativecommons.org/licenses/by-nc-sa/3.0/
    License information was derived automatically

    Description

    I uploaded GroupLens' Book Genome dataset on Kaggle. It doesn't seem like they're active here any more and I want to use this here for some exploratory learning work I did.

    Official link here: https://grouplens.org/datasets/book-genome/

    Tag Genome is a data structure containing scores indicating the degree to which tags apply to items, such as movies or books. This dataset contains a Tag Genome generated for a set of books along with the data used for its generation (raw data). Raw data consists of a subset of the Goodreads dataset [Wan and McAuley, 2018, Wan et al., 2019] and book-tag ratings. The Goodreads subset includes information on popular books, such as titles, authors, release years, user ratings, reviews and shelves. Shelves are lists that users use to organize books in Goodreads (https://www.goodreads.com/). In these instructions, we refer to adding books to shelves as attaching tags (shelf names) to books. To collect book-tag ratings, we conducted a survey on Amazon Mechanical Turk, where we asked users to indicate the degree to which tags apply to books from this subset. To generate book-tag scores, we used two state-of-the-art algorithms: Glmer [Vig et al., 2012] and TagDL [Kotkov et al., 2021]. The code is available in the following GitHub repository: https://github.com/Bionic1251/Revisiting-the-Tag-Relevance-Prediction-Problem

  10. the_movies_dataset

    • kaggle.com
    zip
    Updated Jun 19, 2021
    + more versions
    Cite
    sezgin ildes (2021). the_movies_dataset [Dataset]. https://www.kaggle.com/sezginildes/the-movies-dataset
    Available download formats: zip (15456686 bytes)
    Authors
    sezgin ildes
    Description

    Context

    These files contain metadata for all 45,000 movies listed in the Full MovieLens Dataset. The dataset consists of movies released on or before July 2017. Data points include cast, crew, plot keywords, budget, revenue, posters, release dates, languages, production companies, countries, TMDB vote counts and vote averages.

    This dataset also has files containing 26 million ratings from 270,000 users for all 45,000 movies. Ratings are on a scale of 1-5 and have been obtained from the official GroupLens website.

    Content

    This dataset consists of the following files:

    movies_metadata.csv: The main Movies Metadata file. Contains information on 45,000 movies featured in the Full MovieLens dataset. Features include posters, backdrops, budget, revenue, release dates, languages, production countries and companies.

    keywords.csv: Contains the movie plot keywords for our MovieLens movies. Available in the form of a stringified JSON Object.

    credits.csv: Consists of Cast and Crew Information for all our movies. Available in the form of a stringified JSON Object.

    links.csv: The file that contains the TMDB and IMDB IDs of all the movies featured in the Full MovieLens dataset.

    links_small.csv: Contains the TMDB and IMDB IDs of a small subset of 9,000 movies of the Full Dataset.

    ratings_small.csv: The subset of 100,000 ratings from 700 users on 9,000 movies.
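
    keywords.csv and credits.csv are described above as holding stringified JSON objects, so each cell needs to be parsed before use. The sketch below tries JSON first and, as a fallback assumption, Python-literal syntax (single-quoted strings), in case a file was written that way. `parse_stringified` is a hypothetical helper and the sample cell and its "id"/"name" keys are made up.

```python
import ast
import json


def parse_stringified(value: str):
    """Parse a stringified object, trying JSON first and Python literals second."""
    try:
        return json.loads(value)
    except json.JSONDecodeError:
        # Fallback assumption: the cell uses Python-literal syntax (single quotes).
        return ast.literal_eval(value)


# Made-up sample cell; the "id" and "name" keys are assumptions for illustration.
sample = '[{"id": 931, "name": "jealousy"}, {"id": 4290, "name": "toy"}]'
keywords = [item["name"] for item in parse_stringified(sample)]
print(keywords)
```

    `ast.literal_eval` only evaluates literals, so the fallback does not execute arbitrary code from the file.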

    The Full MovieLens Dataset consisting of 26 million ratings and 750,000 tag applications from 270,000 users on all the 45,000 movies in this dataset can be accessed here

    Acknowledgements

    This dataset is an ensemble of data collected from TMDB and GroupLens. The Movie Details, Credits and Keywords have been collected from the TMDB Open API. This product uses the TMDb API but is not endorsed or certified by TMDb. Their API also provides access to data on many additional movies, actors and actresses, crew members, and TV shows. You can try it for yourself here.

    The Movie Links and Ratings have been obtained from the Official GroupLens website. The files are a part of the dataset available here

    Inspiration

    This dataset was assembled as part of my second Capstone Project for Springboard's Data Science Career Track. I wanted to perform an extensive EDA on Movie Data to narrate the history and the story of Cinema and use this metadata in combination with MovieLens ratings to build various types of Recommender Systems.

    Both my notebooks are available as kernels with this dataset: The Story of Film and Movie Recommender Systems

    Some of the things you can do with this dataset:

    • Predicting movie revenue and/or movie success based on a certain metric.
    • What movies tend to get higher vote counts and vote averages on TMDB?
    • Building Content Based and Collaborative Filtering Based Recommendation Engines.

  11. MovieLens Dataset - 100K Ratings - Dataset - 海数据

    • haidatas.com
    Updated Mar 9, 2025
    Cite
    (2025). MovieLens Dataset - 100K Ratings - Dataset - 海数据 [Dataset]. https://haidatas.com/dataset/movielens-shujuji-100k-pingji
    Description

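    About the dataset

    This dataset (ml-latest-small) describes 5-star rating and free-text tagging activity from MovieLens, a movie recommendation service. It contains 100836 ratings and 3683 tag applications across 9742 movies. These data were created by 610 users between March 29, 1996 and September 24, 2018. This dataset was generated on September 26, 2018.

    Users were selected at random. All selected users had rated at least 20 movies. No demographic information is included. Each user is represented by an ID, and no other information is provided.

    The data are contained in the following files: links.csv, movies.csv, ratings.csv, and tags.csv.

    This and other GroupLens datasets are publicly available for download at http://grouplens.org/datasets/.

    License: This dataset comes from the GroupLens research group at the University of Minnesota. It is for non-commercial research and educational purposes only. License details can be found under the usage license at https://files.grouplens.org/datasets/movielens/ml-latest-small-README.html

    Important: This dataset is provided "as is", without any warranty. For commercial use, please contact grouplens-info@umn.edu.

    Citation: F. Maxwell Harper and Joseph A. Konstan. 2015. The MovieLens Datasets: History and Context. ACM Transactions on Interactive Intelligent Systems (TiiS) 5, 4: 19:1–19:19. https://doi.org/10.1145/2827872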

  12. movie_rating_data

    • kaggle.com
    Updated Nov 9, 2017
    Cite
    pooh (2017). movie_rating_data [Dataset]. https://www.kaggle.com/ashukr/movie-rating-data/activity
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    pooh
    Description

    Context

    Stable benchmark dataset. 20 million ratings and 465,000 tag applications applied to 27,000 movies by 138,000 users.

    Content

    • This dataset has three files: ratings.csv, movies.csv, and tags.csv.

      • movies.csv: three columns holding the values of movieId, title, and genre. The title includes the movie's release year in parentheses. The list ranges from Dickson Greeting (1891) to movies from 2015, for a total of 27278 movies.
      • ratings.csv: ratings given by 138493 users on a scale of 1 to 5, stored in the columns 'userId', 'movieId', 'rating', and 'timestamp'.
      • tags.csv: data stored in the columns 'userId', 'movieId', and 'tag'.
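
    Because the release year is embedded in the title in parentheses, it can be separated with a regular expression. The sketch below assumes the year always appears as four digits at the very end of the title; `split_title_year` is a hypothetical helper, and titles that do not match are returned unchanged with `None` as the year.

```python
import re

# Assumption: the release year is a four-digit number in parentheses at the end.
_YEAR_RE = re.compile(r"^(?P<title>.*)\s\((?P<year>\d{4})\)\s*$")


def split_title_year(raw_title: str):
    """Split 'Name (YYYY)' into ('Name', YYYY); return (title, None) if no year is found."""
    match = _YEAR_RE.match(raw_title)
    if match is None:
        return raw_title.strip(), None
    return match.group("title").strip(), int(match.group("year"))


print(split_title_year("Dickson Greeting (1891)"))
```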

    Acknowledgements

    I got this data from MovieLens for a mini project. The original data set is available at http://grouplens.org/datasets/movielens/20m/

    Inspiration

    You have a ton of data. You can use it to make fun decisions, like which is the best movie series of all time, or to create a completely new story out of the data that you have.

  13. data1.tar.gz

    • figshare.com
    application/x-gzip
    Updated May 20, 2020
    + more versions
    Cite
    Douglas Bates (2020). data1.tar.gz [Dataset]. http://doi.org/10.6084/m9.figshare.12343910.v1
    Available download formats: application/x-gzip
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Douglas Bates
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    A collection of several data sets used to illustrate fitting (generalized) linear mixed-effects models. Individual data sets are in Feather format (https://github.com/wesm/feather). They include Dyestuff, Dyestuff2, Penicillin, Pastes, InstEval, sleepstudy, cbpp, Contraception, grouseticks and VerbAgg from the lme4 package for R. The kb07 data is from github.com/dalejbarr/kronmueller-barr-2007 and ml1m is from https://grouplens.org/datasets/movielens/1m/

  14. Fair RecSys Datasets

    • data.niaid.nih.gov
    Updated Feb 22, 2023
    Cite
    Kowald Dominik (2023). Fair RecSys Datasets [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6123878
    Dataset authored and provided by
    Kowald Dominik
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Four multimedia recommender systems datasets to study popularity bias and fairness:

    Last.fm (lfm.zip), based on the LFM-1b dataset of JKU Linz (http://www.cp.jku.at/datasets/LFM-1b/)

    MovieLens (ml.zip), based on MovieLens-1M dataset (https://grouplens.org/datasets/movielens/1m/)

    BookCrossing (book.zip), based on the BookCrossing dataset of Uni Freiburg (http://www2.informatik.uni-freiburg.de/~cziegler/BX/)

    MyAnimeList (anime.zip), based on the MyAnimeList dataset of Kaggle (https://www.kaggle.com/CooperUnion/anime-recommendations-database)

    Each dataset consists of user interactions (user_events.txt) and three user groups that differ in their inclination to popular/mainstream items: LowPop (low_main_users.txt), MedPop (med_main_users.txt), and HighPop (high_main_users.txt).

    The format of the three user files is "user,mainstreaminess"

    The format of the user-events files is "user,item,preference"
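
    The two line formats above can be read with a few lines of standard-library Python. This is a sketch only: it assumes comma-separated lines without a header row and numeric mainstreaminess and preference values, and the sample lines and function names are made up rather than taken from the repository mentioned below.

```python
import csv
import io


def read_user_groups(handle):
    """Parse 'user,mainstreaminess' lines into a {user: score} dict."""
    groups = {}
    for row in csv.reader(handle):
        if row:  # ignore blank lines
            user, score = row
            groups[user] = float(score)
    return groups


def read_user_events(handle):
    """Parse 'user,item,preference' lines into (user, item, preference) tuples."""
    events = []
    for row in csv.reader(handle):
        if row:  # ignore blank lines
            user, item, preference = row
            events.append((user, item, float(preference)))
    return events


# Made-up sample lines standing in for low_main_users.txt and user_events.txt.
low_pop = read_user_groups(io.StringIO("u1,0.12\nu2,0.08\n"))
events = read_user_events(io.StringIO("u1,i9,4.0\nu2,i3,1.0\nu7,i9,5.0\n"))

# Keep only the interactions of users in the LowPop group.
low_pop_events = [event for event in events if event[0] in low_pop]
print(len(low_pop_events))
```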

    Example Python-code for analyzing the datasets as well as more information on the user groups can be found on Github (https://github.com/domkowald/FairRecSys) and on Arxiv (https://arxiv.org/abs/2203.00376)

  15. FEDORA-Recsys Test Traces

    • zenodo.org
    zip
    Updated Mar 28, 2025
    Cite
    Jinyu Liu; Jinyu Liu (2025). FEDORA-Recsys Test Traces [Dataset]. http://doi.org/10.5281/zenodo.14818428
    Available download formats: zip
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Jinyu Liu; Jinyu Liu
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Synthetic trace generated using techniques from DLRM from the data distributions of the Taobao Ad Display/Click Dataset and the Movielens 20M Dataset. Intended for testing of FEDORA-OramSim simulator.

  16. Grouplens Datasets (ml-1m, ml-100K, and hetrec2011-movielens-2k-v2)

    • figshare.com
    zip
    Updated Jun 6, 2023
    Cite
    F. Maxwell Harper; Joseph A. Konstan (2023). Grouplens Datasets (ml-1m, ml-100K, and hetrec2011-movielens-2k-v2) [Dataset]. http://doi.org/10.6084/m9.figshare.7093595.v1
    Available download formats: zip
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    F. Maxwell Harper; Joseph A. Konstan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    GroupLens is a research lab in the Department of Computer Science and Engineering at the University of Minnesota, Twin Cities specializing in recommender systems, online communities, mobile and ubiquitous technologies, digital libraries, and local geographic information systems.

  17. Movielens 100k dataset

    • kaggle.com
    Updated Dec 1, 2020
    Cite
    Fakhre Alam (2020). Movielens 100k dataset [Dataset]. https://www.kaggle.com/datasets/fakhrealam0786/movielens-100k-dataset
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Fakhre Alam
    Description

    Context

    This dataset is a subset of the MovieLens 100k data, which were collected by the GroupLens Research Project at the University of Minnesota. You can find the full dataset here.

    Content

    This data set consists of 6 columns:

    • movie_id -- unique id for each movie
    • title -- title of the movie
    • year -- year in which the movie was released
    • directors -- director of the movie
    • actors -- actors of the movie
    • genres -- genres of the movie (e.g. comedy, action, horror)

    Acknowledgements

    Thanks to GroupLens for providing this data.

  18. MovieLens Latest Small

    • kaggle.com
    zip
    Updated Oct 12, 2018
    + more versions
    Cite
    GroupLens (2018). MovieLens Latest Small [Dataset]. https://www.kaggle.com/grouplens/movielens-latest-small
    Available download formats: zip (993937 bytes)
    Dataset authored and provided by
    GroupLens
    Description

    Dataset

    This dataset was created by Max Harper

    Released under Other (specified in description)

    Contents

    It contains the following files:

  19. Movielens - Case Study

    • kaggle.com
    Updated Mar 23, 2020
    Cite
    Khushboo Nagdewani (2020). Movielens - Case Study [Dataset]. https://www.kaggle.com/khushboon/movielens-case-study/discussion
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Khushboo Nagdewani
    Description

    Background of Problem Statement

    The GroupLens Research Project is a research group in the Department of Computer Science and Engineering at the University of Minnesota. Members of the GroupLens Research Project are involved in many research projects related to the fields of information filtering, collaborative filtering, and recommender systems. The project is led by professors John Riedl and Joseph Konstan. The project began to explore automated collaborative filtering in 1992 but is best known for its worldwide trial of an automated collaborative filtering system for Usenet news in 1996. Since then the project has expanded its scope to research overall information filtering solutions, integrating content-based methods as well as improving current collaborative filtering technology.

    Problem Objective:

    Here, we ask you to perform the analysis using the Exploratory Data Analysis technique. You need to find features affecting the ratings of any particular movie and build a model to predict the movie ratings.
