100+ datasets found

Kaggle Upload
kaggle.com
zip
Updated Oct 30, 2024
Cite
Miracle Smith (2024). Kaggle Upload [Dataset]. https://www.kaggle.com/datasets/miraclesmith/kaggle-upload
Explore at:
zip (720434 bytes)
Dataset updated
Oct 30, 2024
Authors
Miracle Smith
License
http://opendatacommons.org/licenses/dbcl/1.0/
Description
Dataset

This dataset was created by Miracle Smith

Released under Database: Open Database, Contents: Database Contents

Contents
File Upload
kaggle.com
zip
Updated Nov 11, 2025
Cite
Krishna_raj@84 (2025). File Upload [Dataset]. https://www.kaggle.com/datasets/krishnaraj84/file-upload
Explore at:
zip (50526 bytes)
Dataset updated
Nov 11, 2025
Authors
Krishna_raj@84
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
Dataset

This dataset was created by Krishna_raj@84

Released under MIT

Contents

Kaggle: User Uploaded Dataset

kaggle.com

zip

Updated Oct 5, 2023

Cite

Bryan Weather Chung (2023). Kaggle: User Uploaded Dataset [Dataset]. https://www.kaggle.com/datasets/bryanchungweather/kaggle-user-uploaded-dataset

Explore at:

zip (100393 bytes)

Dataset updated

Oct 5, 2023

Authors

Bryan Weather Chung

License

Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

This comprehensive collection serves as a valuable resource for data enthusiasts, researchers, and analysts seeking to explore a wide range of topics and uncover unique insights.

Context and Sources

Context: Our dataset is curated from user contributions on the Kaggle platform.

Sources: https://www.kaggle.com/datasets?topic=musicDataset

Column Name	Definition
Dataset Title	The title of the dataset
URL	The web address
Author	The individual or organization responsible for uploading the dataset.
Last Updated	The date when the dataset was last modified or updated.
Usability Score	An indicator of the dataset's quality, usefulness, and ease of use, as rated by Kaggle.
File Size	The size of the dataset file, helping users estimate the storage requirements.
Upvote Count	The number of upvotes received by the dataset, reflecting its popularity and relevance among users.
Medal Type	Kaggle's progression medal type.

Kaggle Top Datasets🚀📊
kaggle.com
zip
Updated Apr 10, 2024
Cite
Aaron Frias (2024). Kaggle Top Datasets🚀📊 [Dataset]. https://www.kaggle.com/datasets/aaronfriasr/kaggle-top-datasets
Explore at:
zip (1572305 bytes)
Dataset updated
Apr 10, 2024
Authors
Aaron Frias
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0), https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
Context

Kaggle is one of the largest communities of data scientists and machine learning practitioners in the world, and its platform hosts thousands of datasets covering a wide range of topics and industries. With so many options to choose from, it can be difficult to know where to start or what datasets are worth exploring. That's where this dataset comes in. By scraping information about the top 10,000 datasets on Kaggle, we have created a single source of truth for the most popular and useful datasets on the platform. This dataset is not just a list of names and numbers, but a valuable tool for data enthusiasts and professionals alike, providing insights into the latest trends and techniques in data science and machine learning

Column description

Dataset_name - Name of the dataset
Author_name - Name of the author
Author_id - Kaggle id of the author
No_of_files - Number of files the author has uploaded
size - Size of all the files
Type_of_file - Type of the files such as csv, json etc.
Upvotes - Total upvotes of the dataset
Medals - Medal of the dataset
Usability - Usability of the dataset
Date - Date on which the dataset was uploaded
Day - Day on which the dataset was uploaded
Time - Time at which the dataset was uploaded
Dataset_link - Kaggle link of the dataset

Acknowledgements

The data has been scraped from the official Kaggle website and is available under the Creative Commons license.

Enjoy & Keep Learning !!!
fxcking kaggle let me upload only zip file
kaggle.com
zip
Updated Oct 28, 2025
Cite
chilli_sawze (2025). fxcking kaggle let me upload only zip file [Dataset]. https://www.kaggle.com/datasets/chillisawze/fxcking-kaggle-let-me-upload-only-zip-file
Explore at:
zip (5094369181 bytes)
Dataset updated
Oct 28, 2025
Authors
chilli_sawze
Description
Dataset

This dataset was created by chilli_sawze

Contents
my_upload_file
kaggle.com
zip
Updated Sep 1, 2023
+ more versions
Cite
Haidy Ashraf21 (2023). my_upload_file [Dataset]. https://www.kaggle.com/datasets/haidyashraf21/my-upload-file
Explore at:
zip (264 bytes)
Dataset updated
Sep 1, 2023
Authors
Haidy Ashraf21
Description
Dataset

This dataset was created by Haidy Ashraf21

Contents
(Sunset)📒 Meta Kaggle ported to MS SQL SERVER
kaggle.com
zip
Updated Mar 20, 2024
Cite
BwandoWando (2024). (Sunset)📒 Meta Kaggle ported to MS SQL SERVER [Dataset]. https://www.kaggle.com/datasets/bwandowando/meta-kaggle-ported-to-sql-server-2022-database
Explore at:
zip (8635902534 bytes)
Dataset updated
Mar 20, 2024
Authors
BwandoWando
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Context

I've always wanted to explore Kaggle's Meta Kaggle dataset, but I am more comfortable using T-SQL when it comes to writing (very) complex queries. I also tend to write queries much faster in SQL Server Management Studio, like 100x faster. So I ported Kaggle's Meta Kaggle dataset into the MS SQL Server 2022 database format, created a backup file, and uploaded it here.

MSSQL VERSION: SQL Server 2022

Collation: SQL_Latin1_General_CP1_CI_AS

Recovery model: simple

Requirements

Download and install SQL Server 2022 Developer edition.

Download the backup file.

Restore the backup file into your local instance. If you haven't done this before, it's easy and straightforward.

(QUOTED FROM THE ORIGINAL DATASET)

Meta Kaggle

Explore Kaggle's public data on competitions, datasets, kernels (code/notebooks) and more.

Meta Kaggle may not be the Rosetta Stone of data science, but they think there's a lot to learn (and plenty of fun to be had) from this collection of rich data about Kaggle’s community and activity.

Notes

I repeat, I just ported the dataset. All credits to Kaggle for the amazing source dataset.

Cover image from https://picryl.com/media/space-earth-bug-ce3ca6
testing file upload
kaggle.com
zip
Updated Nov 24, 2024
+ more versions
Cite
Ahmad Basher (2024). testing file upload [Dataset]. https://www.kaggle.com/datasets/ahmadbasher/testing-file-upload
Explore at:
zip (6336 bytes)
Dataset updated
Nov 24, 2024
Authors
Ahmad Basher
Description
Dataset

This dataset was created by Ahmad Basher

Contents
Metadata of Kaggle dataset _Include MedalVoteCount
kaggle.com
zip
Updated Dec 20, 2021
Cite
kukuroo3 (2021). Metadata of Kaggle dataset _Include MedalVoteCount [Dataset]. https://www.kaggle.com/datasets/kukuroo3/dataset-of-kaggle-dataset-include-medalvotecount
Explore at:
zip (11216728 bytes)
Dataset updated
Dec 20, 2021
Authors
kukuroo3
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Context

Under the Kaggle dataset medal rule, a dataset earns a bronze medal once it has 5 or more upvotes from users with a rank of novice or higher, a silver medal at 20 or more upvotes, and a gold medal at 50 or more. Recently I uploaded a lot of datasets to Kaggle. However, although I have won many bronze medals, I have never won more than a silver medal. So I created this dataset to examine the characteristics of datasets that receive a silver medal. It records the metadata of every Kaggle dataset that received at least one upvote, together with each dataset's MedalVoteCount.

This dataset can be used to create strategies for receiving silver and gold medals.
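The medal thresholds described above can be written as a small lookup. This is only a sketch of the rule as stated in this description (5, 20, and 50 medal-eligible upvotes); the function name `dataset_medal` is hypothetical, and Kaggle's actual rules may include conditions not mentioned here.

```python
def dataset_medal(medal_vote_count: int) -> str:
    """Map a medal-eligible upvote count to a medal color.

    Thresholds follow the description above: bronze at 5,
    silver at 20, gold at 50. Anything lower earns no medal.
    """
    if medal_vote_count >= 50:
        return "gold"
    if medal_vote_count >= 20:
        return "silver"
    if medal_vote_count >= 5:
        return "bronze"
    return "none"


# Example: classify a few hypothetical medalvotecount values.
for votes in (3, 5, 20, 50):
    print(votes, dataset_medal(votes))
```

Applied to the medalvotecount column, a function like this shows how far each dataset is from the next medal tier.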

Content

Metadata for 42,955 datasets from 2015-12 to 2021-11

DataSetMedals : medal color

ct : create time

dataUrl : data URL (the path following https://www.kaggle.com/)

totalviews

votecount : total vote counts

medalvotecount : upvotes counted toward medals (cast by users above the Novice rank)

totaldownloads : downloads counts

totalkernel : kernel counts

title

description

key : dataset tags

license

Source

Based on https://www.kaggle.com/kaggle/meta-kaggle, with the "MedalVoteCount" value obtained by scraping.
Kaggle Datasets - Summary, Topics, Classification
kaggle.com
zip
Updated Nov 16, 2020
Cite
Katherine Marsh (2020). Kaggle Datasets - Summary, Topics, Classification [Dataset]. https://www.kaggle.com/datasets/katherinemarsh/kaggle-datasets-summary-topics-classification
Explore at:
zip (273449 bytes)
Dataset updated
Nov 16, 2020
Authors
Katherine Marsh
Description
Context

Companies and individuals are storing increasingly more data digitally; however, much of the data is unused because it is unclassified. How many times have you opened your downloads folder, found a file you downloaded a year ago and you have no idea what the contents are? You can read through those files individually but imagine doing that for thousands of files. All that raw data in storage facilities create data lakes. As the amount of data grows and the complexity rises, data lakes become data swamps. The potentially valuable and interesting datasets will likely remain unused. Our tool addresses the need to classify these large pools of data in a visually effective and succinct manner by identifying keywords in datasets, and classifying datasets into a consistent taxonomy.

The files listed within kaggleDatasetSummaryTopicsClassification.csv have been processed with our tool to generate the keywords and taxonomic classification as seen below. The summaries are not generated from our system. Instead they were retrieved from user input as they uploaded the files on Kaggle. We planned to utilize these summaries to create an NLG model to generate summaries from any input file. Unfortunately we were not able to collect enough data to build a good model. Hopefully the data within this set might help future users achieve that goal.

Acknowledgements

Developed with the Senior Design Center at NC State in collaboration with SAS.
Senior Design Team: Tanya Chu, Katherine Marsh, Nikhil Milind, Anna Owens
SAS Representatives: Nancy Rausch, Marty Warner, Brant Kay, Tyler Wendell, JP Trawinski
Kaggle Dataset Metadata Repository
kaggle.com
zip
Updated Nov 16, 2024
Cite
Ijaj Ahmed (2024). Kaggle Dataset Metadata Repository [Dataset]. https://www.kaggle.com/datasets/ijajdatanerd/kaggle-dataset-metadata-repository
Explore at:
zip (5122110 bytes)
Dataset updated
Nov 16, 2024
Authors
Ijaj Ahmed
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
Kaggle Dataset Metadata Collection 📊

This dataset provides comprehensive metadata on various Kaggle datasets, offering detailed information about the dataset owners, creators, usage statistics, licensing, and more. It can help researchers, data scientists, and Kaggle enthusiasts quickly analyze the key attributes of different datasets on Kaggle. 📚

Dataset Overview:

Purpose: To provide detailed insights into Kaggle dataset metadata.

Content: Information related to the dataset's owner, creator, usage metrics, licensing, and more.

Target Audience: Data scientists, Kaggle competitors, and dataset curators.

Columns Description 📋

datasetUrl 🌐: The URL of the Kaggle dataset page. This directs you to the specific dataset's page on Kaggle.

ownerAvatarUrl 🖼️: The URL of the dataset owner's profile avatar on Kaggle.

ownerName 👤: The name of the dataset owner. This can be the individual or organization that created and maintains the dataset.

ownerUrl 🌍: A link to the Kaggle profile page of the dataset owner.

ownerUserId 💼: The unique user ID of the dataset owner on Kaggle.

ownerTier 🎖️: The ownership tier, such as "Tier 1" or "Tier 2," indicating the owner's status or level on Kaggle.

creatorName 👩‍💻: The name of the dataset creator, which could be different from the owner.

creatorUrl 🌍: A link to the Kaggle profile page of the dataset creator.

creatorUserId 💼: The unique user ID of the dataset creator.

scriptCount 📜: The number of scripts (kernels) associated with this dataset.

scriptsUrl 🔗: A link to the scripts (kernels) page for the dataset, where you can explore related code.

forumUrl 💬: The URL to the discussion forum for this dataset, where users can ask questions and share insights.

viewCount 👀: The number of views the dataset page has received on Kaggle.

downloadCount ⬇️: The number of times the dataset has been downloaded by users.

dateCreated 📅: The date when the dataset was first created and uploaded to Kaggle.

dateUpdated 🔄: The date when the dataset was last updated or modified.

voteButton 👍: The metadata for the dataset's vote button, showing how users interact with the dataset's quality ratings.

categories 🏷️: The categories or tags associated with the dataset, helping users filter datasets based on topics of interest (e.g., "Healthcare," "Finance").

licenseName 🛡️: The name of the license under which the dataset is shared (e.g., "CC0," "MIT License").

licenseShortName 🔑: A short form or abbreviation of the dataset's license name (e.g., "CC0" for Creative Commons Zero).

datasetSize 📦: The size of the dataset in terms of storage, typically measured in MB or GB.

commonFileTypes 📂: A list of common file types included in the dataset (e.g., .csv, .json, .xlsx).

downloadUrl ⬇️: A direct link to download the dataset files.

newKernelNotebookUrl 📝: A link to a new kernel or notebook related to this dataset, for those who wish to explore it programmatically.

newKernelScriptUrl 💻: A link to a new script for running computations or processing data related to the dataset.

usabilityRating 🌟: A rating or score representing how usable the dataset is, based on user feedback.

firestorePath 🔍: A reference to the path in Firestore where this dataset’s metadata is stored.

datasetSlug 🏷️: A URL-friendly version of the dataset name, typically used for URLs.

rank 📈: The dataset's rank based on certain metrics (e.g., downloads, votes, views).

datasource 🌐: The source or origin of the dataset (e.g., government data, private organizations).

medalUrl 🏅: A URL pointing to the dataset's medal or badge, indicating the dataset's quality or relevance.

hasHashLink 🔗: Indicates whether the dataset has a hash link for verifying data integrity.

ownerOrganizationId 🏢: The unique organization ID of the dataset's owner if the owner is an organization rather than an individual.

totalVotes 🗳️: The total number of votes the dataset has received from users, reflecting its popularity or quality.

category_names 📑: A comma-separated string of category names that represent the dataset’s classification.

This dataset is a valuable resource for those who want to analyze Kaggle's ecosystem, discover high-quality datasets, and explore metadata in a structured way. 🌍📊
Kaggle Dataset
kaggle.com
zip
Updated Feb 9, 2023
Cite
Chidambara Raju G (2023). Kaggle Dataset [Dataset]. https://www.kaggle.com/datasets/rajugc/kaggle-dataset/discussion
Explore at:
zip (1572305 bytes)
Dataset updated
Feb 9, 2023
Authors
Chidambara Raju G
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0), https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
Context

Kaggle is one of the largest communities of data scientists and machine learning practitioners in the world, and its platform hosts thousands of datasets covering a wide range of topics and industries. With so many options to choose from, it can be difficult to know where to start or what datasets are worth exploring. That's where this dataset comes in. By scraping information about the top 10,000 datasets on Kaggle, we have created a single source of truth for the most popular and useful datasets on the platform. This dataset is not just a list of names and numbers, but a valuable tool for data enthusiasts and professionals alike, providing insights into the latest trends and techniques in data science and machine learning

Column description

Dataset_name - Name of the dataset

Author_name - Name of the author

Author_id - Kaggle id of the author

No_of_files - Number of files the author has uploaded

size - Size of all the files

Type_of_file - Type of the files such as csv, json etc.

Upvotes - Total upvotes of the dataset

Medals - Medal of the dataset

Usability - Usability of the dataset

Date - Date in which the dataset is uploaded

Day - Day in which the dataset is uploaded

Time - Time in which the dataset is uploaded

Dataset_link - Kaggle link of the dataset

Acknowledgements

The data has been scraped from the official Kaggle website and is available under the Creative Commons license.

Keep Learning !!!
Daily Energy Production in India
kaggle.com
zip
Updated Jul 20, 2020
Cite
vaibhav panvalkar (2020). Daily Energy Production in India [Dataset]. https://www.kaggle.com/datasets/vpanvalkar/daily-energy-production-in-india/data
Explore at:
zip (64718 bytes)
Dataset updated
Jul 20, 2020
Authors
vaibhav panvalkar
Area covered
India
Description
Dataset

This dataset was created by vaibhav panvalkar

Contents
20BN_jester_V1_videos
kaggle.com
zip
Updated May 15, 2021
+ more versions
Cite
kyle-cloud (2021). 20BN_jester_V1_videos [Dataset]. https://www.kaggle.com/datasets/kylecloud/20bn-jester-v1-videos
Explore at:
zip (27378087293 bytes)
Dataset updated
May 15, 2021
Authors
kyle-cloud
Description
Dataset

This dataset was created by kyle-cloud

Contents
Uploaded files
kaggle.com
zip
Updated Apr 15, 2023
Cite
jinmuyan7 (2023). Uploaded files [Dataset]. https://www.kaggle.com/datasets/jinmuyan7/uploaded-files
Explore at:
zip (4143198403 bytes)
Dataset updated
Apr 15, 2023
Authors
jinmuyan7
Description
Dataset

This dataset was created by jinmuyan7

Contents
Reddit /r/datasets Dataset
kaggle.com
zip
Updated Nov 28, 2022
Cite
The Devastator (2022). Reddit /r/datasets Dataset [Dataset]. https://www.kaggle.com/datasets/thedevastator/the-meta-corpus-of-datasets-the-reddit-dataset
Explore at:
zip (9619636 bytes)
Dataset updated
Nov 28, 2022
Authors
The Devastator
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
The Meta-Corpus of Datasets: The Reddit Dataset

The Complete Collection of Datasets Posted on Reddit

By SocialGrep [source]

About this dataset

A subreddit dataset is a collection of posts and comments made on Reddit's /r/datasets board. This dataset contains all the posts and comments made on the /r/datasets subreddit from its inception to March 1, 2022. The dataset was procured using SocialGrep. The data does not include usernames to preserve users' anonymity and to prevent targeted harassment

How to use the dataset

In order to use this dataset, you will need to have a text editor such as Microsoft Word or LibreOffice installed on your computer. You will also need a web browser such as Google Chrome or Mozilla Firefox.

Once you have the necessary software installed, open the Reddit Dataset folder and double-click the the-reddit-dataset-dataset-posts.csv file to open it in your preferred text editor.

In the document, you will see a list of posts with the following information for each one: title, sentiment, score, URL, created UTC, permalink, subreddit NSFW status, and subreddit name.

You can use this information to analyze trends in datasets posted on /r/datasets over time. For example, you could calculate the average score for all posts and compare it to the average score for posts in specific subreddits. Additionally, sentiment analysis could be performed on the titles of posts to see if there is a correlation between positive/negative sentiment and upvotes/downvotes.
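The average-score comparison suggested above could be sketched as follows. This is a minimal illustration using only the Python standard library; the inline sample rows are made up, the function name `average_scores` is hypothetical, and only the column names `subreddit.name` and `score` come from the column listing for this dataset.

```python
import csv
import io
from collections import defaultdict

# Made-up sample rows standing in for the-reddit-dataset-dataset-posts.csv.
SAMPLE_POSTS_CSV = """type,subreddit.name,score,title
post,datasets,10,Weather data
post,datasets,20,Census data
post,opendata,3,Transit feeds
"""


def average_scores(csv_file):
    """Return (overall average score, {subreddit name: average score})."""
    totals = defaultdict(lambda: [0, 0])  # name -> [score sum, post count]
    for row in csv.DictReader(csv_file):
        entry = totals[row["subreddit.name"]]
        entry[0] += int(row["score"])
        entry[1] += 1
    score_sum = sum(s for s, _ in totals.values())
    post_count = sum(n for _, n in totals.values())
    overall = score_sum / post_count if post_count else 0.0
    per_subreddit = {name: s / n for name, (s, n) in totals.items()}
    return overall, per_subreddit


overall, per_subreddit = average_scores(io.StringIO(SAMPLE_POSTS_CSV))
print(overall, per_subreddit)
```

To run it on the real file, pass an open file handle (for example `open("the-reddit-dataset-dataset-posts.csv", newline="", encoding="utf-8")`) instead of the in-memory sample.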

Research Ideas

Finding correlations between different types of datasets

Determining which datasets are most popular on Reddit

Analyzing the sentiments of posts and comments on Reddit's /r/datasets board

Acknowledgements

If you use this dataset in your research, please credit the original authors.

Data Source

License

License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.

Columns

File: the-reddit-dataset-dataset-comments.csv

| Column name | Description |
|:---------------|:---------------------------------------------------|
| type | The type of post. (String) |
| subreddit.name | The name of the subreddit. (String) |
| subreddit.nsfw | Whether or not the subreddit is NSFW. (Boolean) |
| created_utc | The time the post was created, in UTC. (Timestamp) |
| permalink | The permalink for the post. (String) |
| body | The body of the post. (String) |
| sentiment | The sentiment of the post. (String) |
| score | The score of the post. (Integer) |

File: the-reddit-dataset-dataset-posts.csv

| Column name | Description |
|:---------------|:---------------------------------------------------|
| type | The type of post. (String) |
| subreddit.name | The name of the subreddit. (String) |
| subreddit.nsfw | Whether or not the subreddit is NSFW. (Boolean) |
| created_utc | The time the post was created, in UTC. (Timestamp) |
| permalink | The permalink for the post. (String) |
| score | The score of the post. (Integer) |
| domain | The domain of the post. (String) |
| url | The URL of the post. (String) |
| selftext | The self-text of the post. (String) |
| title | The title of the post. (String) |

Acknowledgements

If you use this dataset in your research, please credit the original authors and SocialGrep.
upload_coco_files
kaggle.com
zip
Updated Oct 27, 2025
+ more versions
Cite
Robin (2025). upload_coco_files [Dataset]. https://www.kaggle.com/datasets/robinluy/upload-coco-files
Explore at:
zip (578326 bytes)
Dataset updated
Oct 27, 2025
Authors
Robin
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
Dataset

This dataset was created by Robin

Released under Apache 2.0

Contents
Stylish Product Image Dataset
kaggle.com
zip
Updated May 21, 2022
Cite
Santosh Kumar (2022). Stylish Product Image Dataset [Dataset]. https://www.kaggle.com/datasets/kuchhbhi/stylish-product-image-dataset
Explore at:
zip (9509715613 bytes)
Dataset updated
May 21, 2022
Authors
Santosh Kumar
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Context:

The idea of scraping this data came to me while I was working on an e-commerce project, Fashion Product Recommendation (an end-to-end project). In this project, you upload any fashion image and it shows the 10 closest recommendations.

I completed my project on this image dataset, but I ran into a problem while deploying it on the Heroku server: because of the large project file size, I was unable to deploy, as Heroku offers limited memory space for a free account.

Currently I am only familiar with Heroku, and I am still learning AWS for big projects. So I decided to scrape my own image dataset with much more information, which can help me take this project to the next level. I scraped this data from flipkart.com (an e-commerce website) in two formats: images and textual data in tabular format.

About this Dataset:

This dataset contains 65k images (400x450 pixels) of fashion/style products such as clothing, footwear, accessories, and many more. There is also a CSV file whose id column maps each row to an image name. Image names use a unique numerical format such as 1.png or 62299.png, and the image name matches the id column. So, to find the details of any image, take the number from the image name, look it up in the id column of the CSV file, and that row holds the details of the image. The notebook I used to scrape this data is available in the code section.

Columns of CSV Dataset:

1. id: Unique id, same as the image name
2. brand: Brand name of the product
3. title: Title of the product
4. sold_price: Selling price of the product
5. actual_price: Actual price of the product
6. url: Unique URL of every product
7. img: Image URL
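The image-to-row lookup described above could be sketched as follows. This is a minimal illustration with made-up sample rows; only the column names come from the list above, and the helper name `details_for_image` and the example.com URLs are hypothetical.

```python
import csv
import io

# Made-up sample rows using the column names listed above.
SAMPLE_CSV = """id,brand,title,sold_price,actual_price,url,img
1,BrandA,Blue cotton shirt,499,999,https://example.com/p/1,https://example.com/i/1.jpg
62299,BrandB,Running shoes,1299,2499,https://example.com/p/62299,https://example.com/i/62299.jpg
"""


def details_for_image(image_name, csv_file):
    """Return the CSV row whose id matches the image file name, or None."""
    image_id = image_name.rsplit(".", 1)[0]  # "62299.png" -> "62299"
    for row in csv.DictReader(csv_file):
        if row["id"] == image_id:
            return row
    return None


row = details_for_image("62299.png", io.StringIO(SAMPLE_CSV))
print(row["brand"], row["title"], row["sold_price"])
```

For repeated lookups over the full file, loading the rows once into a dictionary keyed by id would avoid rescanning the CSV each time.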

How this dataset helped me:

1. I trained my CNN model using the image data; that is the only use of the image dataset.
2. On the front-end page of the project, I display results using the image URL, fetching each image from the web. This meant I did not have to upload the image dataset with the project to the server, which saved a huge amount of memory space.
3. Using the url, the project displays the live price and ratings from the Flipkart website.
4. There is a Buy button mapped to the url, which redirects you to the original product page so you can buy the product there.

After using this dataset, I changed my project name from Fashion Product Recommender to Flipkart Fashion Product Recommender. 😄😄😄

Still, the memory problem was not resolved as the model trained file was above 500MB on the complete dataset. So I tried on multiple sets and finally, I deployed after training on 1000 images only. In the future, I will try on another platform to deploy the complete project. I learned many new things while working on this dataset.

Your Job:

You can use this dataset in your deep learning projects, go and try to create interesting projects.

You can use the CSV data in your machine learning projects. First you will need to construct features from the title column, as it holds a lot of hidden information, and some data cleaning is required.

There are two complete records missing from the CSV data; your job is to find the missing data with the help of the image dataset and fill it in as best you can.

This is a huge dataset in terms of records as well as memory size. To download this dataset you need high internet speed.

A smaller version of this dataset (less than 500 MB) is also available; everything is the same except that the image size is reduced from 400x450 pixels to 65x80 pixels.

Pls, Rate this work

Support with Upvote... that encourages me to research more.

Share your feedback, reviews, and suggestions if any.

Thanks!!
HR Analytics Dataset
kaggle.com
Updated Jan 18, 2025
Cite
Shodolamu Opeyemi (2025). HR Analytics Dataset [Dataset]. https://www.kaggle.com/datasets/hopesb/hr-analytics-dataset
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Jan 18, 2025
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Shodolamu Opeyemi
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
The uploaded dataset contains detailed information about employees, training programs, and other HR-related metrics. Here's an overview:

General Details:

Rows: 3,150

Columns: 39

Column Names:

Unnamed: 0

FirstName

LastName

StartDate

ExitDate

Title

Supervisor

ADEmail

BusinessUnit

EmployeeStatus

EmployeeType

PayZone

EmployeeClassificationType

TerminationType

TerminationDescription

DepartmentType

Division

DOB

State

JobFunctionDescription

GenderCode

LocationCode

RaceDesc

MaritalDesc

Performance Score

Current Employee Rating

Employee ID

Survey Date

Engagement Score

Satisfaction Score

Work-Life Balance Score

Training Date

Training Program Name

Training Type

Training Outcome

Location

Trainer

Training Duration (Days)

Training Cost

Summary:

Employee Data: Contains details such as names, start and exit dates, job titles, and supervisors.

Performance and Survey Metrics: Includes engagement, satisfaction, and work-life balance scores.

Training Information: Covers program names, training types, outcomes, durations, costs, and trainer details.

Diversity Details: Includes gender, race, and marital status.

Status & Classification: Indicates employee status (active/terminated), type, and termination reasons.
kaggle trending datasets August 2022
kaggle.com
zip
Updated Aug 3, 2022
Cite
Yasir Raza (2022). kaggle trending datasets August 2022 [Dataset]. https://www.kaggle.com/datasets/yasirabdaali/kaggle-trending-datasets-august-2022
Explore at:
zip (31770 bytes)
Dataset updated
Aug 3, 2022
Authors
Yasir Raza
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
This dataset contains information about the datasets trending on Kaggle, such as dataset author, dataset title, file size, number of files, upload date, upvotes, medals, and usability score.

Kaggle Upload: 22 scholarly articles cite this dataset (View in Google Scholar)
