7 datasets found
  1. MovieLens 100K

    • grouplens.org
    Updated Oct 12, 2015
    Cite
    (2015). MovieLens 100K [Dataset]. https://grouplens.org/datasets/movielens/100k/
    Description

    Stable benchmark dataset. 100,000 ratings from 1000 users on 1700 movies. Released 4/1998.

  2. MovieLens 1M

    • grouplens.org
    • kaggle.com
    Updated Mar 19, 2016
    Cite
    (2016). MovieLens 1M [Dataset]. https://grouplens.org/datasets/movielens/1m/
    Description

    Stable benchmark dataset. 1 million ratings from 6000 users on 4000 movies. Released 2/2003.

  3. Raw rating data Movielens (GroupLens, 1998) 100K

    • dataverse.harvard.edu
    Updated Feb 16, 2021
    Cite
    Loc Nguyen (2021). Raw rating data Movielens (GroupLens, 1998) 100K [Dataset]. http://doi.org/10.7910/DVN/H2WHJL
    Dataset provided by
    Harvard Dataverse
    Authors
    Loc Nguyen
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    The raw Movielens 100K rating data (GroupLens, 1998) contains 100,000 ratings from 943 users on 1682 movies (items). It is available at https://files.grouplens.org/datasets/movielens/ml-100k.zip.
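
As a quick arithmetic check on the figures quoted above, the following minimal sketch computes how densely the user-item rating matrix is filled. The function name `matrix_density` is hypothetical and chosen only for illustration; the three counts come from the description.

```python
# Figures quoted in the dataset description.
n_ratings = 100_000
n_users = 943
n_items = 1682

def matrix_density(ratings: int, users: int, items: int) -> float:
    """Fraction of user-item cells that carry a rating."""
    return ratings / (users * items)

density = matrix_density(n_ratings, n_users, n_items)
print(f"user-item cells: {n_users * n_items}")
print(f"density: {density:.4f}")
print(f"average ratings per user: {n_ratings / n_users:.1f}")
```

Roughly 6% of the possible user-item pairs have a rating, so the matrix is sparse.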

  4. Standardized Hudup dataset based on Movielens 100k

    • dataverse.harvard.edu
    • data.mendeley.com
    Updated Feb 16, 2021
    Cite
    Loc Nguyen (2021). Standardized Hudup dataset based on Movielens 100k [Dataset]. http://doi.org/10.7910/DVN/ZF3GWF
    Dataset provided by
    Harvard Dataverse
    Authors
    Loc Nguyen
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    The standardized Hudup dataset is derived from raw data and is composed of ten units: “hdp_config”, “hdp_account”, “hdp_attribute_map”, “hdp_nominal”, “hdp_user”, “hdp_item”, “hdp_rating”, “hdp_context_template”, “hdp_context”, and “hdp_sample”. Each unit has a particular function, which is described in the data description section. A Hudup dataset is metadata that models any raw data at an abstract level. The default raw data source here is the Movielens (GroupLens, 1998) 100K dataset, which has 100,000 ratings from 943 users on 1682 movies (items) and is available at https://files.grouplens.org/datasets/movielens/ml-100k.zip.

  5. The Movies Dataset

    • kaggle.com
    zip
    Updated Nov 10, 2017
    Cite
    Rounak Banik (2017). The Movies Dataset [Dataset]. https://www.kaggle.com/datasets/rounakbanik/the-movies-dataset/discussion/168175
    Available download formats: zip (238862293 bytes)
    Authors
    Rounak Banik
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    These files contain metadata for all 45,000 movies listed in the Full MovieLens Dataset. The dataset consists of movies released on or before July 2017. Data points include cast, crew, plot keywords, budget, revenue, posters, release dates, languages, production companies, countries, TMDB vote counts and vote averages.

    This dataset also has files containing 26 million ratings from 270,000 users for all 45,000 movies. Ratings are on a scale of 1-5 and have been obtained from the official GroupLens website.

    Content

    This dataset consists of the following files:

    movies_metadata.csv: The main Movies Metadata file. Contains information on 45,000 movies featured in the Full MovieLens dataset. Features include posters, backdrops, budget, revenue, release dates, languages, production countries and companies.

    keywords.csv: Contains the movie plot keywords for our MovieLens movies. Available in the form of a stringified JSON Object.

    credits.csv: Consists of Cast and Crew Information for all our movies. Available in the form of a stringified JSON Object.

    links.csv: The file that contains the TMDB and IMDB IDs of all the movies featured in the Full MovieLens dataset.

    links_small.csv: Contains the TMDB and IMDB IDs of a small subset of 9,000 movies of the Full Dataset.

    ratings_small.csv: The subset of 100,000 ratings from 700 users on 9,000 movies.
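
The keywords.csv and credits.csv entries above are described as holding stringified JSON objects. A minimal sketch of decoding one such cell follows. The helper name `parse_cell` and both sample strings are hypothetical, and the fallback to `ast.literal_eval` is an assumption that covers cells written with Python-style single quotes rather than strict JSON.

```python
import ast
import json
from typing import Any

def parse_cell(cell: str) -> Any:
    """Decode a stringified list or dict stored in a single CSV cell."""
    try:
        return json.loads(cell)
    except json.JSONDecodeError:
        # Assumption: cells that are not strict JSON are Python literals
        # (for example, single-quoted keys and values).
        return ast.literal_eval(cell)

# Hypothetical cell contents, in strict JSON and in Python-literal form.
sample_json = '[{"id": 931, "name": "jealousy"}, {"id": 4290, "name": "toy"}]'
sample_py = "[{'id': 931, 'name': 'jealousy'}, {'id': 4290, 'name': 'toy'}]"

names_json = [entry["name"] for entry in parse_cell(sample_json)]
names_py = [entry["name"] for entry in parse_cell(sample_py)]
print(names_json, names_py)
```

`ast.literal_eval` only evaluates literals, which makes it a safer choice than `eval` for untrusted file contents.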

    The Full MovieLens Dataset consisting of 26 million ratings and 750,000 tag applications from 270,000 users on all the 45,000 movies in this dataset can be accessed here

    Acknowledgements

    This dataset is an ensemble of data collected from TMDB and GroupLens. The Movie Details, Credits and Keywords have been collected from the TMDB Open API. This product uses the TMDb API but is not endorsed or certified by TMDb. Their API also provides access to data on many additional movies, actors and actresses, crew members, and TV shows. You can try it for yourself here.

    The Movie Links and Ratings have been obtained from the Official GroupLens website. The files are a part of the dataset available here

    Inspiration

    This dataset was assembled as part of my second Capstone Project for Springboard's Data Science Career Track. I wanted to perform an extensive EDA on Movie Data to narrate the history and the story of Cinema and use this metadata in combination with MovieLens ratings to build various types of Recommender Systems.

    Both my notebooks are available as kernels with this dataset: The Story of Film and Movie Recommender Systems

    Some of the things you can do with this dataset: Predicting movie revenue and/or movie success based on a certain metric. What movies tend to get higher vote counts and vote averages on TMDB? Building Content Based and Collaborative Filtering Based Recommendation Engines.

  6. ‘last.fm Music Artist Scrobbles’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Feb 14, 2022
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘last.fm Music Artist Scrobbles’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-last-fm-music-artist-scrobbles-b1d2/0776ba62/?iid=000-706&v=presentation
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Analysis of ‘last.fm Music Artist Scrobbles’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/pcbreviglieri/lastfm-music-artist-scrobbles on 14 February 2022.

    --- Dataset description provided by original source is as follows ---

    This dataset is a summarized, sanitized subset of the one released at The 2nd International Workshop on Information Heterogeneity and Fusion in Recommender Systems (HetRec 2011), currently hosted at the GroupLens website (here).

    Sanitization included: (a) artist name misspelling correction and standardization; (b) reassignment of artists referenced with two or more artist IDs; (c) removal of artists listed as 'unknown' or through their website addresses.
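
The three sanitization steps can be illustrated with a small sketch. This is not the original author's procedure: the rows, the helper names, and the choice to sum scrobble counts when IDs are merged are all assumptions, and the sketch only standardizes case and whitespace rather than correcting real misspellings.

```python
import re
from collections import defaultdict

# Hypothetical rows: (artist_id, artist_name, scrobble_count).
rows = [
    (1, "The Beatles", 120),
    (2, "the  beatles ", 30),
    (3, "unknown", 15),
    (4, "www.example.com", 8),
    (5, "Radiohead", 50),
]

def normalize(name: str) -> str:
    """Collapse whitespace and lowercase the name to build a comparison key."""
    return re.sub(r"\s+", " ", name).strip().lower()

def is_placeholder(name: str) -> bool:
    """True for 'unknown' artists or names that look like website addresses."""
    key = normalize(name)
    return key == "unknown" or key.startswith(("http://", "https://", "www."))

def sanitize(records):
    canonical_id = {}          # normalized name -> first artist id seen
    display_name = {}          # kept artist id -> name used for display
    totals = defaultdict(int)  # kept artist id -> summed scrobble count
    for artist_id, name, count in records:
        if is_placeholder(name):
            continue  # step (c): drop 'unknown' and website-address artists
        key = normalize(name)  # step (a): standardize name variants
        kept = canonical_id.setdefault(key, artist_id)  # step (b): one id per artist
        display_name.setdefault(kept, name.strip())
        totals[kept] += count  # assumption: merged ids pool their counts
    return [(i, display_name[i], totals[i]) for i in sorted(totals)]

clean = sanitize(rows)
print(clean)
```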

    The original dataset contains a larger number of files, including tag-related information, in addition to users, artists and scrobble counts. The author contacted last.fm to ask for a more recent version of this content in a similar format, but had received no reply as of June 15th, 2020.

    --- Original source retains full ownership of the source dataset ---

  7. MovieLens Latest Small

    • kaggle.com
    zip
    Updated Oct 12, 2018
    Cite
    GroupLens (2018). MovieLens Latest Small [Dataset]. https://www.kaggle.com/grouplens/movielens-latest-small
    Available download formats: zip (993937 bytes)
    Dataset authored and provided by
    GroupLens
    Description

    Dataset

    This dataset was created by Max Harper

    Released under Other (specified in description)
