100+ datasets found
  1. Kaggle Upload

    • kaggle.com
    zip
    Updated Oct 30, 2024
    Cite
    Miracle Smith (2024). Kaggle Upload [Dataset]. https://www.kaggle.com/datasets/miraclesmith/kaggle-upload
    Explore at:
    Available download formats: zip (720434 bytes)
    Dataset updated
    Oct 30, 2024
    Authors
    Miracle Smith
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    Dataset

    This dataset was created by Miracle Smith

    Released under Database: Open Database, Contents: Database Contents

    Contents

  2. File Upload

    • kaggle.com
    zip
    Updated Nov 11, 2025
    Cite
    Krishna_raj@84 (2025). File Upload [Dataset]. https://www.kaggle.com/datasets/krishnaraj84/file-upload
    Explore at:
    Available download formats: zip (50526 bytes)
    Dataset updated
    Nov 11, 2025
    Authors
    Krishna_raj@84
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset

    This dataset was created by Krishna_raj@84

    Released under MIT

    Contents

  3. Kaggle: User Uploaded Dataset

    • kaggle.com
    zip
    Updated Oct 5, 2023
    Cite
    Bryan Weather Chung (2023). Kaggle: User Uploaded Dataset [Dataset]. https://www.kaggle.com/datasets/bryanchungweather/kaggle-user-uploaded-dataset
    Explore at:
    Available download formats: zip (100393 bytes)
    Dataset updated
    Oct 5, 2023
    Authors
    Bryan Weather Chung
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Description

    This comprehensive collection serves as a valuable resource for data enthusiasts, researchers, and analysts seeking to explore a wide range of topics and uncover unique insights.

    Context and Sources

    Context: Our dataset is curated from user contributions on the Kaggle platform

    Sources: https://www.kaggle.com/datasets?topic=musicDataset

    | Column Name | Definition |
    |:--------------------|:---------------------------------------------------|
    | Dataset Title | The title of the dataset. |
    | URL | The web address. |
    | Author | The individual or organization responsible for uploading the dataset. |
    | Last Updated | The date when the dataset was last modified or updated. |
    | Usability Score | An indicator of the dataset's quality, usefulness, and ease of use, as rated by Kaggle. |
    | File Size | The size of the dataset file, helping users estimate the storage requirements. |
    | Upvote Count | The number of upvotes received by the dataset, reflecting its popularity and relevance among users. |
    | Medal Type | Kaggle's progression type. |

  4. Kaggle Top Datasets🚀📊

    • kaggle.com
    zip
    Updated Apr 10, 2024
    Cite
    Aaron Frias (2024). Kaggle Top Datasets🚀📊 [Dataset]. https://www.kaggle.com/datasets/aaronfriasr/kaggle-top-datasets
    Explore at:
    Available download formats: zip (1572305 bytes)
    Dataset updated
    Apr 10, 2024
    Authors
    Aaron Frias
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    Context

    Kaggle is one of the largest communities of data scientists and machine learning practitioners in the world, and its platform hosts thousands of datasets covering a wide range of topics and industries. With so many options to choose from, it can be difficult to know where to start or what datasets are worth exploring. That's where this dataset comes in. By scraping information about the top 10,000 datasets on Kaggle, we have created a single source of truth for the most popular and useful datasets on the platform. This dataset is not just a list of names and numbers, but a valuable tool for data enthusiasts and professionals alike, providing insights into the latest trends and techniques in data science and machine learning

    Column description

    • Dataset_name - Name of the dataset
    • Author_name - Name of the author
    • Author_id - Kaggle id of the author
    • No_of_files - Number of files the author has uploaded
    • size - Size of all the files
    • Type_of_file - Type of the files such as csv, json etc.
    • Upvotes - Total upvotes of the dataset
    • Medals - Medal of the dataset
    • Usability - Usability of the dataset
    • Date - Date in which the dataset is uploaded
    • Day - Day in which the dataset is uploaded
    • Time - Time in which the dataset is uploaded
    • Dataset_link - Kaggle link of the dataset

    Acknowledgements

    The data has been scraped from the official Kaggle website and is available under the Creative Commons license.

    Enjoy & Keep Learning !!!

  5. fxcking kaggle let me upload only zip file

    • kaggle.com
    zip
    Updated Oct 28, 2025
    Cite
    chilli_sawze (2025). fxcking kaggle let me upload only zip file [Dataset]. https://www.kaggle.com/datasets/chillisawze/fxcking-kaggle-let-me-upload-only-zip-file
    Explore at:
    Available download formats: zip (5094369181 bytes)
    Dataset updated
    Oct 28, 2025
    Authors
    chilli_sawze
    Description

    Dataset

    This dataset was created by chilli_sawze

    Contents

  6. my_upload_file

    • kaggle.com
    zip
    Updated Sep 1, 2023
    + more versions
    Cite
    Haidy Ashraf21 (2023). my_upload_file [Dataset]. https://www.kaggle.com/datasets/haidyashraf21/my-upload-file
    Explore at:
    Available download formats: zip (264 bytes)
    Dataset updated
    Sep 1, 2023
    Authors
    Haidy Ashraf21
    Description

    Dataset

    This dataset was created by Haidy Ashraf21

    Contents

  7. (Sunset)📢 Meta Kaggle ported to MS SQL SERVER

    • kaggle.com
    zip
    Updated Mar 20, 2024
    Cite
    BwandoWando (2024). (Sunset)📢 Meta Kaggle ported to MS SQL SERVER [Dataset]. https://www.kaggle.com/datasets/bwandowando/meta-kaggle-ported-to-sql-server-2022-database
    Explore at:
    Available download formats: zip (8635902534 bytes)
    Dataset updated
    Mar 20, 2024
    Authors
    BwandoWando
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    I've always wanted to explore Kaggle's Meta Kaggle dataset, but I am more comfortable using T-SQL when it comes to writing (very) complex queries. I also tend to write queries faster in SQL Server Management Studio, like 100x faster. So I ported Kaggle's Meta Kaggle dataset into MS SQL Server 2022 database format, created a backup file, then uploaded it here.

    • MSSQL VERSION: SQL Server 2022
    • Collation: SQL_Latin1_General_CP1_CI_AS
    • Recovery model: simple

    Requirements

    • Download and install the SQL SERVER 2022 Developer edition here
    • Download the backup file
    • Restore the backup file into your local instance. If you haven't done this before, it's easy and straightforward. Here is a guide.

    (QUOTED FROM THE ORIGINAL DATASET)

    Meta Kaggle

    Explore Kaggle's public data on competitions, datasets, kernels (code/notebooks) and more.

    Meta Kaggle may not be the Rosetta Stone of data science, but they think there's a lot to learn (and plenty of fun to be had) from this collection of rich data about Kaggle’s community and activity.


    Notes

  8. testing file upload

    • kaggle.com
    zip
    Updated Nov 24, 2024
    + more versions
    Cite
    Ahmad Basher (2024). testing file upload [Dataset]. https://www.kaggle.com/datasets/ahmadbasher/testing-file-upload
    Explore at:
    Available download formats: zip (6336 bytes)
    Dataset updated
    Nov 24, 2024
    Authors
    Ahmad Basher
    Description

    Dataset

    This dataset was created by Ahmad Basher

    Contents

  9. Metadata of Kaggle dataset _Include MedalVoteCount

    • kaggle.com
    zip
    Updated Dec 20, 2021
    Cite
    kukuroo3 (2021). Metadata of Kaggle dataset _Include MedalVoteCount [Dataset]. https://www.kaggle.com/datasets/kukuroo3/dataset-of-kaggle-dataset-include-medalvotecount
    Explore at:
    Available download formats: zip (11216728 bytes)
    Dataset updated
    Dec 20, 2021
    Authors
    kukuroo3
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context


    Under Kaggle's dataset medal rule, a dataset earns a bronze medal with 5 or more upvotes from users ranked above Novice, a silver medal with 20 or more, and a gold medal with 50 or more. I recently uploaded many datasets to Kaggle. However, although I have won many bronze medals, I have never won more than a silver medal. So I created this dataset to examine the characteristics of datasets that receive a silver medal. It records the metadata of every Kaggle dataset that received at least one upvote, together with each dataset's MedalVoteCount.

    This dataset can be used to create strategies for receiving silver and gold medals.
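
    The thresholds quoted above (5, 20, and 50 qualifying upvotes) can be written as a small lookup. This is only a sketch: the function name `dataset_medal` and the lowercase medal labels are illustrative assumptions, not values taken from the dataset.

```python
from typing import Optional

# Thresholds from the description above: 5 / 20 / 50 qualifying upvotes.
# The lowercase labels are an assumption made for this sketch.
MEDAL_THRESHOLDS = [(50, "gold"), (20, "silver"), (5, "bronze")]


def dataset_medal(medal_vote_count: int) -> Optional[str]:
    """Return the medal implied by a MedalVoteCount value, or None."""
    # Thresholds are ordered from highest to lowest, so the first match wins.
    for minimum, medal in MEDAL_THRESHOLDS:
        if medal_vote_count >= minimum:
            return medal
    return None


for votes in (3, 5, 19, 20, 50):
    print(votes, dataset_medal(votes))
```

    Applied to the medalvotecount column, a function like this would let you check how far each dataset is from the next medal tier.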

    Content

    Metadata for 42,955 datasets, from 2015-12 to 2021-11

    • DataSetMedals : medal color
    • ct : creation time
    • dataUrl : data URL (following https://www.kaggle.com/)
    • totalviews
    • votecount : total vote count
    • medalvotecount : upvote count from users ranked above Novice
    • totaldownloads : download count
    • totalkernel : kernel count
    • title
    • description
    • key : dataset tags
    • license

    Source

    https://www.kaggle.com/kaggle/meta-kaggle, with the "MedalVoteCount" value obtained by scraping

  10. Kaggle Datasets - Summary, Topics, Classification

    • kaggle.com
    zip
    Updated Nov 16, 2020
    Cite
    Katherine Marsh (2020). Kaggle Datasets - Summary, Topics, Classification [Dataset]. https://www.kaggle.com/datasets/katherinemarsh/kaggle-datasets-summary-topics-classification
    Explore at:
    Available download formats: zip (273449 bytes)
    Dataset updated
    Nov 16, 2020
    Authors
    Katherine Marsh
    Description

    Context

    Companies and individuals are storing increasingly more data digitally; however, much of the data is unused because it is unclassified. How many times have you opened your downloads folder, found a file you downloaded a year ago and you have no idea what the contents are? You can read through those files individually but imagine doing that for thousands of files. All that raw data in storage facilities create data lakes. As the amount of data grows and the complexity rises, data lakes become data swamps. The potentially valuable and interesting datasets will likely remain unused. Our tool addresses the need to classify these large pools of data in a visually effective and succinct manner by identifying keywords in datasets, and classifying datasets into a consistent taxonomy.

    The files listed within kaggleDatasetSummaryTopicsClassification.csv have been processed with our tool to generate the keywords and taxonomic classification as seen below. The summaries are not generated from our system. Instead they were retrieved from user input as they uploaded the files on Kaggle. We planned to utilize these summaries to create an NLG model to generate summaries from any input file. Unfortunately we were not able to collect enough data to build a good model. Hopefully the data within this set might help future users achieve that goal.

    Acknowledgements

    Developed with Senior Design Center at NC State in collaboration with SAS.
    Senior Design Team: Tanya Chu, Katherine Marsh, Nikhil Milind, Anna Owens
    SAS Representatives: Nancy Rausch, Marty Warner, Brant Kay, Tyler Wendell, JP Trawinski

  11. Kaggle Dataset Metadata Repository

    • kaggle.com
    zip
    Updated Nov 16, 2024
    Cite
    Ijaj Ahmed (2024). Kaggle Dataset Metadata Repository [Dataset]. https://www.kaggle.com/datasets/ijajdatanerd/kaggle-dataset-metadata-repository
    Explore at:
    Available download formats: zip (5122110 bytes)
    Dataset updated
    Nov 16, 2024
    Authors
    Ijaj Ahmed
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description


    Kaggle Dataset Metadata Collection 📊

    This dataset provides comprehensive metadata on various Kaggle datasets, offering detailed information about the dataset owners, creators, usage statistics, licensing, and more. It can help researchers, data scientists, and Kaggle enthusiasts quickly analyze the key attributes of different datasets on Kaggle. 📚

    Dataset Overview:

    • Purpose: To provide detailed insights into Kaggle dataset metadata.
    • Content: Information related to the dataset's owner, creator, usage metrics, licensing, and more.
    • Target Audience: Data scientists, Kaggle competitors, and dataset curators.

    Columns Description 📋

    • datasetUrl 🌐: The URL of the Kaggle dataset page. This directs you to the specific dataset's page on Kaggle.
    • ownerAvatarUrl 🖼️: The URL of the dataset owner's profile avatar on Kaggle.
    • ownerName 👤: The name of the dataset owner. This can be the individual or organization that created and maintains the dataset.
    • ownerUrl 🌍: A link to the Kaggle profile page of the dataset owner.
    • ownerUserId 💼: The unique user ID of the dataset owner on Kaggle.
    • ownerTier 🎖️: The ownership tier, such as "Tier 1" or "Tier 2," indicating the owner's status or level on Kaggle.
    • creatorName 👩‍💻: The name of the dataset creator, which could be different from the owner.
    • creatorUrl 🌍: A link to the Kaggle profile page of the dataset creator.
    • creatorUserId 💼: The unique user ID of the dataset creator.
    • scriptCount 📜: The number of scripts (kernels) associated with this dataset.
    • scriptsUrl 🔗: A link to the scripts (kernels) page for the dataset, where you can explore related code.
    • forumUrl 💬: The URL to the discussion forum for this dataset, where users can ask questions and share insights.
    • viewCount 👀: The number of views the dataset page has received on Kaggle.
    • downloadCount ⬇️: The number of times the dataset has been downloaded by users.
    • dateCreated 📅: The date when the dataset was first created and uploaded to Kaggle.
    • dateUpdated 🔄: The date when the dataset was last updated or modified.
    • voteButton 👍: The metadata for the dataset's vote button, showing how users interact with the dataset's quality ratings.
    • categories 🏷️: The categories or tags associated with the dataset, helping users filter datasets based on topics of interest (e.g., "Healthcare," "Finance").
    • licenseName 🛡️: The name of the license under which the dataset is shared (e.g., "CC0," "MIT License").
    • licenseShortName 🔑: A short form or abbreviation of the dataset's license name (e.g., "CC0" for Creative Commons Zero).
    • datasetSize 📦: The size of the dataset in terms of storage, typically measured in MB or GB.
    • commonFileTypes 📂: A list of common file types included in the dataset (e.g., .csv, .json, .xlsx).
    • downloadUrl ⬇️: A direct link to download the dataset files.
    • newKernelNotebookUrl 📝: A link to a new kernel or notebook related to this dataset, for those who wish to explore it programmatically.
    • newKernelScriptUrl 💻: A link to a new script for running computations or processing data related to the dataset.
    • usabilityRating 🌟: A rating or score representing how usable the dataset is, based on user feedback.
    • firestorePath 🔍: A reference to the path in Firestore where this dataset’s metadata is stored.
    • datasetSlug 🏷️: A URL-friendly version of the dataset name, typically used for URLs.
    • rank 📈: The dataset's rank based on certain metrics (e.g., downloads, votes, views).
    • datasource 🌐: The source or origin of the dataset (e.g., government data, private organizations).
    • medalUrl 🏅: A URL pointing to the dataset's medal or badge, indicating the dataset's quality or relevance.
    • hasHashLink 🔗: Indicates whether the dataset has a hash link for verifying data integrity.
    • ownerOrganizationId 🏢: The unique organization ID of the dataset's owner if the owner is an organization rather than an individual.
    • totalVotes 🗳️: The total number of votes the dataset has received from users, reflecting its popularity or quality.
    • category_names 📑: A comma-separated string of category names that represent the dataset’s classification.

    This dataset is a valuable resource for those who want to analyze Kaggle's ecosystem, discover high-quality datasets, and explore metadata in a structured way. 🌐📊

  12. Kaggle Dataset

    • kaggle.com
    zip
    Updated Feb 9, 2023
    Cite
    Chidambara Raju G (2023). Kaggle Dataset [Dataset]. https://www.kaggle.com/datasets/rajugc/kaggle-dataset/discussion
    Explore at:
    Available download formats: zip (1572305 bytes)
    Dataset updated
    Feb 9, 2023
    Authors
    Chidambara Raju G
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    Context

    Kaggle is one of the largest communities of data scientists and machine learning practitioners in the world, and its platform hosts thousands of datasets covering a wide range of topics and industries. With so many options to choose from, it can be difficult to know where to start or what datasets are worth exploring. That's where this dataset comes in. By scraping information about the top 10,000 datasets on Kaggle, we have created a single source of truth for the most popular and useful datasets on the platform. This dataset is not just a list of names and numbers, but a valuable tool for data enthusiasts and professionals alike, providing insights into the latest trends and techniques in data science and machine learning

    Column description

    • Dataset_name - Name of the dataset
    • Author_name - Name of the author
    • Author_id - Kaggle id of the author
    • No_of_files - Number of files the author has uploaded
    • size - Size of all the files
    • Type_of_file - Type of the files such as csv, json etc.
    • Upvotes - Total upvotes of the dataset
    • Medals - Medal of the dataset
    • Usability - Usability of the dataset
    • Date - Date in which the dataset is uploaded
    • Day - Day in which the dataset is uploaded
    • Time - Time in which the dataset is uploaded
    • Dataset_link - Kaggle link of the dataset
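
    As a rough illustration of how the columns listed above might be used, the sketch below counts datasets per medal and picks the most upvoted one with Python's standard `csv` module. The sample rows, the medal labels, and the assumption that Upvotes holds plain integers are all hypothetical; only the column names come from the description.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows; only the column names come from the description above.
SAMPLE_CSV = """Dataset_name,Author_name,Upvotes,Medals,Usability
Example A,alice,120,Gold,10.0
Example B,bob,35,Silver,8.8
Example C,carol,7,Bronze,7.1
Example D,dave,64,Gold,9.4
"""


def summarize(csv_text):
    """Count datasets per medal and return the name of the most upvoted dataset."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    medal_counts = Counter(row["Medals"] for row in rows)
    # Assumes Upvotes is stored as a plain integer string.
    top = max(rows, key=lambda row: int(row["Upvotes"]))
    return medal_counts, top["Dataset_name"]


medal_counts, top_name = summarize(SAMPLE_CSV)
print(dict(medal_counts), top_name)
```

    With the real file, `io.StringIO(csv_text)` would be replaced by an open file handle; the actual formatting of the Upvotes and Medals values should be checked first.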

    Acknowledgements

    The data has been scraped from the official Kaggle website and is available under the Creative Commons license.

    Keep Learning !!!

  13. Daily Energy Production in India

    • kaggle.com
    zip
    Updated Jul 20, 2020
    Cite
    vaibhav panvalkar (2020). Daily Energy Production in India [Dataset]. https://www.kaggle.com/datasets/vpanvalkar/daily-energy-production-in-india/data
    Explore at:
    Available download formats: zip (64718 bytes)
    Dataset updated
    Jul 20, 2020
    Authors
    vaibhav panvalkar
    Area covered
    India
    Description

    Dataset

    This dataset was created by vaibhav panvalkar

    Contents

  14. 20BN_jester_V1_videos

    • kaggle.com
    zip
    Updated May 15, 2021
    + more versions
    Cite
    kyle-cloud (2021). 20BN_jester_V1_videos [Dataset]. https://www.kaggle.com/datasets/kylecloud/20bn-jester-v1-videos
    Explore at:
    Available download formats: zip (27378087293 bytes)
    Dataset updated
    May 15, 2021
    Authors
    kyle-cloud
    Description

    Dataset

    This dataset was created by kyle-cloud

    Contents

  15. Uploaded files

    • kaggle.com
    zip
    Updated Apr 15, 2023
    Cite
    jinmuyan7 (2023). Uploaded files [Dataset]. https://www.kaggle.com/datasets/jinmuyan7/uploaded-files
    Explore at:
    Available download formats: zip (4143198403 bytes)
    Dataset updated
    Apr 15, 2023
    Authors
    jinmuyan7
    Description

    Dataset

    This dataset was created by jinmuyan7

    Contents

  16. Reddit /r/datasets Dataset

    • kaggle.com
    zip
    Updated Nov 28, 2022
    Cite
    The Devastator (2022). Reddit /r/datasets Dataset [Dataset]. https://www.kaggle.com/datasets/thedevastator/the-meta-corpus-of-datasets-the-reddit-dataset
    Explore at:
    Available download formats: zip (9619636 bytes)
    Dataset updated
    Nov 28, 2022
    Authors
    The Devastator
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    The Meta-Corpus of Datasets: The Reddit Dataset

    The Complete Collection of Datasets Posted on Reddit

    By SocialGrep [source]

    About this dataset

    A subreddit dataset is a collection of posts and comments made on Reddit's /r/datasets board. This dataset contains all the posts and comments made on the /r/datasets subreddit from its inception to March 1, 2022. The dataset was procured using SocialGrep. The data does not include usernames to preserve users' anonymity and to prevent targeted harassment

    More Datasets

    For more datasets, click here.

    Featured Notebooks

    • 🚨 Your notebook can be here! 🚨!

    How to use the dataset

    In order to use this dataset, you will need to have a text editor such as Microsoft Word or LibreOffice installed on your computer. You will also need a web browser such as Google Chrome or Mozilla Firefox.

    Once you have the necessary software installed, open The Reddit Dataset folder and double-click the the-reddit-dataset-dataset-posts.csv file to open it in your preferred text editor.

    In the document, you will see a list of posts with the following information for each one: title, sentiment, score, URL, created UTC, permalink, subreddit NSFW status, and subreddit name.

    You can use this information to analyze trends in datasets posted on /r/datasets over time. For example, you could calculate the average score for all posts and compare it to the average score for posts in specific subreddits. Additionally, sentiment analysis could be performed on the titles of posts to see if there is a correlation between positive/negative sentiment and upvotes/downvotes.
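
    The average-score comparison suggested above could look roughly like this. It is a sketch using only the standard library; the sample rows are hypothetical, and the function name `average_scores` is an assumption for illustration. The real input would be the-reddit-dataset-dataset-posts.csv.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Hypothetical rows; the column names follow the posts file described in this entry.
SAMPLE_POSTS = """type,subreddit.name,score,title
post,datasets,12,Example title one
post,datasets,4,Example title two
post,datasets,8,Example title three
"""


def average_scores(csv_text, group_column="subreddit.name"):
    """Return (overall mean score, {group value: mean score})."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    scores = [int(row["score"]) for row in rows]
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[group_column]].append(int(row["score"]))
    per_group = {name: mean(values) for name, values in grouped.items()}
    return mean(scores), per_group


overall, per_group = average_scores(SAMPLE_POSTS)
print(overall, per_group)
```

    Passing a different `group_column`, such as domain, gives the same comparison along another column of the posts file.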

    Research Ideas

    • Finding correlations between different types of datasets
    • Determining which datasets are most popular on Reddit
    • Analyzing the sentiments of posts and comments on Reddit's /r/datasets board

    Acknowledgements

    If you use this dataset in your research, please credit the original authors.

    Data Source

    License

    License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.

    Columns

    File: the-reddit-dataset-dataset-comments.csv

    | Column name | Description |
    |:-------------------|:---------------------------------------------------|
    | type | The type of post. (String) |
    | subreddit.name | The name of the subreddit. (String) |
    | subreddit.nsfw | Whether or not the subreddit is NSFW. (Boolean) |
    | created_utc | The time the post was created, in UTC. (Timestamp) |
    | permalink | The permalink for the post. (String) |
    | body | The body of the post. (String) |
    | sentiment | The sentiment of the post. (String) |
    | score | The score of the post. (Integer) |

    File: the-reddit-dataset-dataset-posts.csv

    | Column name | Description |
    |:-------------------|:---------------------------------------------------|
    | type | The type of post. (String) |
    | subreddit.name | The name of the subreddit. (String) |
    | subreddit.nsfw | Whether or not the subreddit is NSFW. (Boolean) |
    | created_utc | The time the post was created, in UTC. (Timestamp) |
    | permalink | The permalink for the post. (String) |
    | score | The score of the post. (Integer) |
    | domain | The domain of the post. (String) |
    | url | The URL of the post. (String) |
    | selftext | The self-text of the post. (String) |
    | title | The title of the post. (String) |

    Acknowledgements

    If you use this dataset in your research, please credit the original authors and SocialGrep.

  17. upload_coco_files

    • kaggle.com
    zip
    Updated Oct 27, 2025
    + more versions
    Cite
    Robin (2025). upload_coco_files [Dataset]. https://www.kaggle.com/datasets/robinluy/upload-coco-files
    Explore at:
    Available download formats: zip (578326 bytes)
    Dataset updated
    Oct 27, 2025
    Authors
    Robin
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Dataset

    This dataset was created by Robin

    Released under Apache 2.0

    Contents

  18. Stylish Product Image Dataset

    • kaggle.com
    zip
    Updated May 21, 2022
    Cite
    Santosh Kumar (2022). Stylish Product Image Dataset [Dataset]. https://www.kaggle.com/datasets/kuchhbhi/stylish-product-image-dataset
    Explore at:
    Available download formats: zip (9509715613 bytes)
    Dataset updated
    May 21, 2022
    Authors
    Santosh Kumar
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context:

    The idea to scrape this data came to me while I was working on an e-commerce project, Fashion Product Recommendation (an end-to-end project). In this project, you upload any fashion image and it shows the 10 closest recommendations.


    I completed my project on this image dataset. The problem I faced came while deploying on the Heroku server: because of the large project file size, I was unable to deploy, as Heroku offers limited memory for a free account.

    Currently I am only familiar with Heroku, and I am learning AWS for bigger projects. So I decided to scrape my own image dataset with much more information, which can help me take this project to the next level. I scraped this data from flipkart.com (an e-commerce website) in two formats: images and textual data in tabular format.

    About this Dataset:

    This dataset contains 65k images (400x450 pixels) of fashion/style products such as clothing, footwear, accessories, and many more. A CSV file is also included, mapped to the images through the id column. Each image name is a unique number, such as 1.png or 62299.png, and matches the value in the id column. So to find the details of any image, take the number from the image name and look up the row with that id in the CSV file. You can find the notebook I used to scrape this data in the code section.

    Columns of CSV Dataset:

    1. id: Unique id, same as the image name
    2. brand: Brand name of the product
    3. title: Title of the product
    4. sold_price: Selling price of the product
    5. actual_price: Actual price of the product
    6. url: Unique URL of every product
    7. img: Image URL
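
    The image-to-row lookup described above can be sketched with the standard library. The sample rows and the helper name `details_for_image` are hypothetical; only the column names and the numeric file-name convention (for example 62299.png) come from the description.

```python
import csv
import io
from pathlib import Path

# Hypothetical rows; only the column names follow the list above.
SAMPLE_CSV = """id,brand,title,sold_price,actual_price,url,img
1,BrandA,Example shirt,499,999,https://example.com/p/1,https://example.com/i/1.png
62299,BrandB,Example shoe,1299,1999,https://example.com/p/62299,https://example.com/i/62299.png
"""


def details_for_image(image_name, csv_text):
    """Look up the CSV row whose id matches the image file name (e.g. '1.png')."""
    image_id = Path(image_name).stem  # '62299.png' -> '62299'
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["id"] == image_id:
            return row
    return None


row = details_for_image("62299.png", SAMPLE_CSV)
print(row["brand"], row["title"])
```

    For the full 65k-row file, building a dictionary keyed by id once would avoid rescanning the CSV on every lookup.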

    How this dataset helped me:

    1. I trained my CNN model using the image data; that's the only use of the image dataset.
    2. On the front-end page of the project, I display results by loading each image from its image URL. This meant I did not have to upload the image dataset with the project to the server, which saved a huge amount of memory.
    3. Using the URL, I display the live price and ratings from the Flipkart website.
    4. There is a Buy button mapped to the URL, so you are redirected to the original product page and can buy the product there.

    After using this dataset, I changed my project name from Fashion Product Recommender to Flipkart Fashion Product Recommender. 😄😄😄

    Still, the memory problem was not resolved, as the trained model file was above 500MB on the complete dataset. So I tried multiple subsets, and finally I deployed after training on 1000 images only. In the future, I will try another platform to deploy the complete project. I learned many new things while working on this dataset.

    Your Job:

    1. You can use this dataset in your deep learning projects, go and try to create interesting projects.
    2. You can use the CSV data in your machine learning projects; first you need to do feature construction from the title column, as it holds much hidden information, and some data cleaning is required.
    3. There are two complete records missing in the CSV data; your job is to find the missing data with the help of the image dataset and fill it in as best you can.

    This is a huge dataset in terms of records as well as memory size. To download this dataset you need high internet speed.

    To download the same dataset in a smaller size (less than 500 MB), you can find it here; everything is the same as in this dataset, except that I reduced the image size from 400x450 pixels to 65x80 pixels.

    Pls, Rate this work

    Support with Upvote... that encourages me to research more.

    Share your feedback, reviews, and suggestions if any.

    Thanks!!

  19. HR Analytics Dataset

    • kaggle.com
    Updated Jan 18, 2025
    Cite
    Shodolamu Opeyemi (2025). HR Analytics Dataset [Dataset]. https://www.kaggle.com/datasets/hopesb/hr-analytics-dataset
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jan 18, 2025
    Dataset provided by
    Kaggle: http://kaggle.com/
    Authors
    Shodolamu Opeyemi
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    The uploaded dataset contains detailed information about employees, training programs, and other HR-related metrics. Here's an overview:

    General Details:

    Rows: 3,150

    Columns: 39

    Column Names:

    1. Unnamed: 0

    2. FirstName

    3. LastName

    4. StartDate

    5. ExitDate

    6. Title

    7. Supervisor

    8. ADEmail

    9. BusinessUnit

    10. EmployeeStatus

    11. EmployeeType

    12. PayZone

    13. EmployeeClassificationType

    14. TerminationType

    15. TerminationDescription

    16. DepartmentType

    17. Division

    18. DOB

    19. State

    20. JobFunctionDescription

    21. GenderCode

    22. LocationCode

    23. RaceDesc

    24. MaritalDesc

    25. Performance Score

    26. Current Employee Rating

    27. Employee ID

    28. Survey Date

    29. Engagement Score

    30. Satisfaction Score

    31. Work-Life Balance Score

    32. Training Date

    33. Training Program Name

    34. Training Type

    35. Training Outcome

    36. Location

    37. Trainer

    38. Training Duration (Days)

    39. Training Cost

    Summary:

    Employee Data: Contains details such as names, start and exit dates, job titles, and supervisors.

    Performance and Survey Metrics: Includes engagement, satisfaction, and work-life balance scores.

    Training Information: Covers program names, training types, outcomes, durations, costs, and trainer details.

    Diversity Details: Includes gender, race, and marital status.

    Status & Classification: Indicates employee status (active/terminated), type, and termination reasons.
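
    The first listed column, "Unnamed: 0", looks like a leftover row index rather than an HR attribute; that reading is an assumption, not something the description states. Under that assumption, a minimal sketch for loading the file without it might be as follows. The sample rows and the helper name `load_without_index` are hypothetical.

```python
import csv
import io

# Hypothetical two-row sample using a few of the 39 column names listed above.
SAMPLE_CSV = """Unnamed: 0,FirstName,LastName,EmployeeStatus,Engagement Score
0,Ada,Example,Active,4
1,Bo,Sample,Terminated,2
"""


def load_without_index(csv_text, index_column="Unnamed: 0"):
    """Parse the CSV and drop the assumed index column if it is present."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        row.pop(index_column, None)
        rows.append(row)
    return rows


rows = load_without_index(SAMPLE_CSV)
print(list(rows[0].keys()))
```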

  20. kaggle trending datasets August 2022

    • kaggle.com
    zip
    Updated Aug 3, 2022
    Cite
    Yasir Raza (2022). kaggle trending datasets August 2022 [Dataset]. https://www.kaggle.com/datasets/yasirabdaali/kaggle-trending-datasets-august-2022
    Explore at:
    Available download formats: zip (31770 bytes)
    Dataset updated
    Aug 3, 2022
    Authors
    Yasir Raza
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset contains information about the datasets trending on Kaggle, such as dataset author, dataset title, file size, number of files, upload date, upvotes, medals, and usability score.

Cite
Miracle Smith (2024). Kaggle Upload [Dataset]. https://www.kaggle.com/datasets/miraclesmith/kaggle-upload

Kaggle Upload

Explore at:
22 scholarly articles cite this dataset (View in Google Scholar)
Available download formats: zip (720434 bytes)
Dataset updated
Oct 30, 2024
Authors
Miracle Smith
License

http://opendatacommons.org/licenses/dbcl/1.0/

Description

Dataset

This dataset was created by Miracle Smith

Released under Database: Open Database, Contents: Database Contents

Contents
