100+ datasets found
  1. Top 1000 Kaggle Datasets

    • kaggle.com
    zip
    Updated Jan 3, 2022
    Cite
    Trrishan (2022). Top 1000 Kaggle Datasets [Dataset]. https://www.kaggle.com/datasets/notkrishna/top-1000-kaggle-datasets
    Explore at:
    zip (34269 bytes)
    Dataset updated
    Jan 3, 2022
    Authors
    Trrishan
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    From wiki

    Kaggle, a subsidiary of Google LLC, is an online community of data scientists and machine learning practitioners. Kaggle allows users to find and publish data sets, explore and build models in a web-based data-science environment, work with other data scientists and machine learning engineers, and enter competitions to solve data science challenges.

    Kaggle got its start in 2010 by offering machine learning competitions and now also offers a public data platform, a cloud-based workbench for data science, and Artificial Intelligence education. Its key personnel were Anthony Goldbloom and Jeremy Howard. Nicholas Gruen was founding chair succeeded by Max Levchin. Equity was raised in 2011 valuing the company at $25 million. On 8 March 2017, Google announced that they were acquiring Kaggle.[1][2]

    Source: Kaggle

  2. Student Performance Data Set

    • kaggle.com
    zip
    Updated Mar 27, 2020
    + more versions
    Cite
    Data-Science Sean (2020). Student Performance Data Set [Dataset]. https://www.kaggle.com/datasets/larsen0966/student-performance-data-set
    Explore at:
    zip (12353 bytes)
    Dataset updated
    Mar 27, 2020
    Authors
    Data-Science Sean
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    If this dataset is useful, an upvote is appreciated. This data covers student achievement in secondary education at two Portuguese schools. The data attributes include student grades and demographic, social, and school-related features, and the data was collected using school reports and questionnaires. Two datasets are provided regarding performance in two distinct subjects: Mathematics (mat) and Portuguese language (por). In [Cortez and Silva, 2008], the two datasets were modeled under binary/five-level classification and regression tasks. Important note: the target attribute G3 has a strong correlation with attributes G2 and G1. This occurs because G3 is the final year grade (issued at the 3rd period), while G1 and G2 correspond to the 1st- and 2nd-period grades. It is more difficult to predict G3 without G2 and G1, but such prediction is much more useful (see paper source for more details).
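
    As a minimal sketch of the note on G1 and G2, the snippet below builds a feature set for predicting G3 with the earlier period grades left out. The rows are made-up illustrations, and the `studytime` and `absences` column names are assumptions rather than names taken from the description.

```python
# Hypothetical rows; only G1, G2, and G3 are named in the description.
rows = [
    {"studytime": 2, "absences": 4, "G1": 10, "G2": 11, "G3": 11},
    {"studytime": 1, "absences": 10, "G1": 7, "G2": 6, "G3": 6},
    {"studytime": 3, "absences": 0, "G1": 15, "G2": 16, "G3": 17},
]


def split_features_target(rows, target="G3", drop=("G1", "G2")):
    """Separate the target column and drop the highly correlated period grades."""
    excluded = set(drop) | {target}
    features = [{k: v for k, v in row.items() if k not in excluded} for row in rows]
    targets = [row[target] for row in rows]
    return features, targets


features, targets = split_features_target(rows)
print(features[0])  # remaining non-grade features for the first student
print(targets)      # final grades, used as the prediction target
```

    Passing `drop=()` keeps G1 and G2 as features, which gives the easier variant of the task that the description mentions.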

  3. Cars4u

    • kaggle.com
    zip
    Updated Apr 12, 2021
    Cite
    Sukhmani Bedi (2021). Cars4u [Dataset]. https://www.kaggle.com/datasets/sukhmanibedi/cars4u
    Explore at:
    zip (173037 bytes)
    Dataset updated
    Apr 12, 2021
    Authors
    Sukhmani Bedi
    Description

    Dataset

    This dataset was created by Sukhmani Bedi

  4. 60k-data-with-context-v2

    • kaggle.com
    Updated Sep 2, 2023
    Cite
    Chris Deotte (2023). 60k-data-with-context-v2 [Dataset]. https://www.kaggle.com/datasets/cdeotte/60k-data-with-context-v2
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Sep 2, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Chris Deotte
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset can be used to train an Open Book model for Kaggle's LLM Science Exam competition. This dataset was generated by searching and concatenating all publicly shared datasets on Sept 1 2023.

    The context column was generated using Mgoksu's notebook here with NUM_TITLES=5 and NUM_SENTENCES=20

    The source column indicates where the dataset originated. Below are the sources:

    source = 1 & 2 * Radek's 6.5k dataset. Discussion here and here, dataset here.

    source = 3 & 4 * Radek's 15k + 5.9k. Discussion here and here, dataset here

    source = 5 & 6 * Radek's 6k + 6k. Discussion here and here, dataset here

    source = 7 * Leonid's 1k. Discussion here, dataset here

    source = 8 * Gigkpeaeums 3k. Discussion here, dataset here

    source = 9 * Anil 3.4k. Discussion here, dataset here

    source = 10, 11, 12 * Mgoksu 13k. Discussion here, dataset here

  5. Fake News Classification

    • kaggle.com
    zip
    Updated Oct 8, 2023
    Cite
    Saurabh Shahane (2023). Fake News Classification [Dataset]. https://www.kaggle.com/datasets/saurabhshahane/fake-news-classification
    Explore at:
    zip (96615040 bytes)
    Dataset updated
    Oct 8, 2023
    Authors
    Saurabh Shahane
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    WELFake is a dataset of 72,134 news articles, with 35,028 real and 37,106 fake news articles. To build it, the authors merged four popular news datasets (i.e. Kaggle, McIntire, Reuters, BuzzFeed Political) to prevent over-fitting of classifiers and to provide more text data for better ML training.

    The dataset contains four columns: Serial number (starting from 0); Title (the news heading); Text (the news content); and Label (0 = fake and 1 = real).

    The CSV file contains 78,098 data entries, of which only 72,134 are accessed as per the data frame.
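
    The label encoding above can be sketched as a small tally over the CSV. The header names (`title`, `text`, `label`, with an unnamed serial-number column) and the sample rows are assumptions for illustration; the real file may use different headers.

```python
import csv
import io
from collections import Counter

# Encoding stated in the description: 0 = fake, 1 = real.
LABEL_NAMES = {"0": "fake", "1": "real"}

# Hypothetical three-row sample standing in for the real CSV file.
SAMPLE_CSV = (
    ",title,text,label\n"
    "0,Headline A,Body A,1\n"
    "1,Headline B,Body B,0\n"
    "2,Headline C,Body C,0\n"
)


def count_labels(csv_file):
    """Count how many rows carry each label, using readable names."""
    reader = csv.DictReader(csv_file)
    return Counter(LABEL_NAMES[row["label"]] for row in reader)


counts = count_labels(io.StringIO(SAMPLE_CSV))
print(counts)

# Sanity check on the figures quoted in the description.
assert 35_028 + 37_106 == 72_134
```

    To run it on the actual download, pass an open file handle (opened with `newline=""`) instead of the `io.StringIO` sample.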

    Published in: IEEE Transactions on Computational Social Systems: pp. 1-13 (doi: 10.1109/TCSS.2021.3068519).

  6. Data from: Global Superstore Dataset

    • kaggle.com
    zip
    Updated Nov 16, 2023
    Cite
    Fatih İlhan (2023). Global Superstore Dataset [Dataset]. https://www.kaggle.com/datasets/fatihilhan/global-superstore-dataset
    Explore at:
    zip (3349507 bytes)
    Dataset updated
    Nov 16, 2023
    Authors
    Fatih İlhan
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    About this file

    The Kaggle Global Superstore dataset is a comprehensive dataset containing information about sales and orders in a global superstore. It is a valuable resource for data analysis and visualization tasks. This dataset has been processed and transformed from its original format (txt) to CSV using the R programming language. The original dataset is available here, and the transformed CSV file used in this analysis can be found here.

    Here is a description of the columns in the dataset:

    category: The category of products sold in the superstore.

    city: The city where the order was placed.

    country: The country in which the superstore is located.

    customer_id: A unique identifier for each customer.

    customer_name: The name of the customer who placed the order.

    discount: The discount applied to the order.

    market: The market or region where the superstore operates.

    ji_lu_shu: An unknown or unspecified column.

    order_date: The date when the order was placed.

    order_id: A unique identifier for each order.

    order_priority: The priority level of the order.

    product_id: A unique identifier for each product.

    product_name: The name of the product.

    profit: The profit generated from the order.

    quantity: The quantity of products ordered.

    region: The region where the order was placed.

    row_id: A unique identifier for each row in the dataset.

    sales: The total sales amount for the order.

    segment: The customer segment (e.g., consumer, corporate, or home office).

    ship_date: The date when the order was shipped.

    ship_mode: The shipping mode used for the order.

    shipping_cost: The cost of shipping for the order.

    state: The state or region within the country.

    sub_category: The sub-category of products within the main category.

    year: The year in which the order was placed.

    market2: Another column related to market information.

    weeknum: The week number when the order was placed.

    This dataset can be used for various data analysis tasks, including understanding sales patterns, customer behavior, and profitability in the context of a global superstore.
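
    As one hedged example of such a profitability analysis, the sketch below aggregates profit margin per category using the `category`, `sales`, and `profit` columns listed above. The rows and their values are invented for illustration.

```python
from collections import defaultdict

# Hypothetical rows; only the column names come from the listing.
rows = [
    {"category": "Furniture", "sales": 200.0, "profit": 50.0},
    {"category": "Furniture", "sales": 100.0, "profit": 25.0},
    {"category": "Technology", "sales": 400.0, "profit": 40.0},
]


def profit_margin_by_category(rows):
    """Return total profit divided by total sales for each category."""
    sales = defaultdict(float)
    profit = defaultdict(float)
    for row in rows:
        sales[row["category"]] += row["sales"]
        profit[row["category"]] += row["profit"]
    # Skip categories with zero sales to avoid dividing by zero.
    return {cat: profit[cat] / sales[cat] for cat in sales if sales[cat]}


margins = profit_margin_by_category(rows)
for category, margin in sorted(margins.items()):
    print(f"{category}: {margin:.1%}")
```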

  7. Video Game Sales

    • kaggle.com
    zip
    Updated Oct 26, 2016
    + more versions
    Cite
    GregorySmith (2016). Video Game Sales [Dataset]. https://www.kaggle.com/datasets/gregorut/videogamesales
    Explore at:
    zip (390286 bytes)
    Dataset updated
    Oct 26, 2016
    Authors
    GregorySmith
    Description

    This dataset contains a list of video games with sales greater than 100,000 copies. It was generated by a scrape of vgchartz.com.

    Fields include

    • Rank - Ranking of overall sales

    • Name - The game's name

    • Platform - Platform of the game's release (e.g. PC, PS4, etc.)

    • Year - Year of the game's release

    • Genre - Genre of the game

    • Publisher - Publisher of the game

    • NA_Sales - Sales in North America (in millions)

    • EU_Sales - Sales in Europe (in millions)

    • JP_Sales - Sales in Japan (in millions)

    • Other_Sales - Sales in the rest of the world (in millions)

    • Global_Sales - Total worldwide sales.

    The script to scrape the data is available at https://github.com/GregorUT/vgchartzScrape. It is based on BeautifulSoup using Python. There are 16,598 records. 2 records were dropped due to incomplete information.
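
    A simple consistency check on the sales fields can be sketched as follows. It assumes that Global_Sales is approximately the sum of the four regional fields, which the description does not state explicitly, so the tolerance for rounding is a guess and the sample record is invented.

```python
REGION_FIELDS = ("NA_Sales", "EU_Sales", "JP_Sales", "Other_Sales")


def regional_total(record):
    """Sum the four regional sales figures (in millions)."""
    return sum(record[field] for field in REGION_FIELDS)


def is_consistent(record, tolerance=0.02):
    """Check Global_Sales against the regional sum, allowing for rounding.

    Assumption: Global_Sales is the sum of the regional columns.
    """
    return abs(regional_total(record) - record["Global_Sales"]) <= tolerance


# Hypothetical record using the field names from the list above.
sample = {
    "Name": "Example Game",
    "NA_Sales": 1.50,
    "EU_Sales": 0.75,
    "JP_Sales": 0.25,
    "Other_Sales": 0.10,
    "Global_Sales": 2.60,
}
print(is_consistent(sample))
```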

  8. ❤️‍🩹 Medical Condition Prediction Dataset

    • kaggle.com
    Updated Sep 13, 2024
    Cite
    Ciobanu Marius (2024). ❤️‍🩹 Medical Condition Prediction Dataset [Dataset]. https://www.kaggle.com/datasets/marius2303/medical-condition-prediction-dataset
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Sep 13, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Ciobanu Marius
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    About Dataset

    This dataset provides information about various medical conditions such as Cancer, Pneumonia, and Diabetic based on demographic, lifestyle, and health-related features. It contains randomly generated user data, including multiple missing values, making it suitable for handling imbalanced classification tasks and missing data problems.

    Features

    • id: Unique identifier for each user.
    • full_name: Randomly generated user name.
    • age: Age of the user (ranging from 18 to 90 years), with some missing values.
    • gender: The gender of the user (categorized as Male, Female, or Non-Binary).
    • smoking_status: Indicates the smoking status of the user (Smoker, Non-Smoker, Former-Smoker).
    • bmi: Body Mass Index (BMI) of the user (ranging from 15 to 40), with some missing values.
    • blood_pressure: Blood pressure levels of the user (ranging from 90 to 180 mmHg), with some missing values.
    • glucose_levels: Blood glucose levels of the user (ranging from 70 to 200 mg/dL), with some missing values.
    • condition: The target label indicating the medical condition of the user (Cancer, Pneumonia, or Diabetic), with imbalanced distribution (15% Cancer, 25% Pneumonia, 60% Diabetic).

    Goal

    The objective of this dataset is to predict the medical condition (Cancer, Pneumonia, Diabetic) of a user based on their demographic, lifestyle, and health-related features. This dataset can be used to explore strategies for dealing with imbalanced classes and missing data in healthcare applications.
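
    One common strategy for the imbalance described above is to weight classes inversely to their frequency. The sketch below uses the heuristic `n_samples / (n_classes * class_count)`, the same formula scikit-learn applies for `class_weight="balanced"`, on a made-up label list that mirrors the stated 15% / 25% / 60% split. It is an illustration, not a method prescribed by the dataset author.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}


# Hypothetical labels reproducing the distribution from the description.
labels = ["Cancer"] * 15 + ["Pneumonia"] * 25 + ["Diabetic"] * 60
weights = balanced_class_weights(labels)
for label, weight in sorted(weights.items()):
    print(f"{label}: {weight:.3f}")
```

    Rarer classes receive larger weights, so a misclassified Cancer example counts for more in the loss than a misclassified Diabetic one.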

  9. Materials and their Mechanical Properties

    • kaggle.com
    zip
    Updated Apr 15, 2023
    Cite
    Purushottam Nawale (2023). Materials and their Mechanical Properties [Dataset]. https://www.kaggle.com/datasets/purushottamnawale/materials
    Explore at:
    zip (145487 bytes)
    Dataset updated
    Apr 15, 2023
    Authors
    Purushottam Nawale
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    We utilized a dataset of Machine Design materials, which includes information on their mechanical properties. The dataset was obtained from the Autodesk Material Library and comprises 15 columns, also referred to as features/attributes. This dataset is a real-world dataset, and it does not contain any random values. However, due to missing values, we only utilized seven of these columns for our ML model. You can access the related GitHub Repository here: https://github.com/purushottamnawale/material-selection-using-machine-learning

    To develop an ML model, we employed several Python libraries, including NumPy, pandas, scikit-learn, and graphviz, in addition to other technologies such as Weka, MS Excel, VS Code, Kaggle, Jupyter Notebook, and GitHub. We employed Weka software to swiftly visualize the data and comprehend the relationships between the features, without requiring any programming expertise.

    My problem statement is material selection for an EV chassis. So, if you have any specific ideas, be sure to implement them and add the code on Kaggle.

    A detailed research paper is available at https://iopscience.iop.org/article/10.1088/1742-6596/2601/1/012014

  10. (🌅 Sunset) Kaggle Users' Country + Regions Info

    • kaggle.com
    zip
    Updated Feb 14, 2024
    Cite
    BwandoWando (2024). (🌅 Sunset) Kaggle Users' Country + Regions Info [Dataset]. https://www.kaggle.com/datasets/bwandowando/kaggle-user-country-regions
    Explore at:
    zip (2376511 bytes)
    Dataset updated
    Feb 14, 2024
    Authors
    BwandoWando
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    [Context]

    The official Meta-Kaggle dataset contains the Users.csv file which contains Username, DisplayName, RegisterDate, and PerformanceTier fields but doesn't contain location data of the Kaggle Users. This dataset augments that data with additional country and region information.

    [Note]

    I haven't included the username and displayname values on purpose, just the userid to be joined back to the Meta-Kaggle official Users.csv file.
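
    Joining the location data back to the Meta-Kaggle Users.csv by user id could look like the sketch below. The field names `Id`, `UserId`, `Country`, and `Region` and all row values are hypothetical, so check the actual headers in both files before relying on them.

```python
# Hypothetical rows standing in for Meta-Kaggle Users.csv.
users = [
    {"Id": 1, "UserName": "alice", "PerformanceTier": 2},
    {"Id": 2, "UserName": "bob", "PerformanceTier": 1},
]

# Hypothetical rows standing in for this dataset (user id plus location only).
locations = [
    {"UserId": 1, "Country": "Philippines", "Region": "Asia"},
]


def left_join_locations(users, locations):
    """Attach country and region to each user; users without a match get None."""
    by_id = {loc["UserId"]: loc for loc in locations}
    joined = []
    for user in users:
        loc = by_id.get(user["Id"], {})
        joined.append({**user, "Country": loc.get("Country"), "Region": loc.get("Region")})
    return joined


for row in left_join_locations(users, locations):
    print(row)
```

    A left join keeps every user from Users.csv, so accounts the scraper could not resolve simply show empty location fields.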

    [Limitations]

    It is possible that some users haven't inputted their details when the scraper went through their accounts and thus have missing data. Another possibility is that users may have updated their info after the scraper went through their accounts, thus resulting in inconsistencies.

    [How I defined active in this dataset]

    • Users that have received an upvote in the forums, datasets, or notebooks
    • Users that have given an upvote in the forums, datasets, or notebooks
    • Users that have created a thread, a forum post, a notebook, or a dataset
    • Users that made a competition submission
    • Users that exist in the Meta-Kaggle Users dataset
    • Date cut-off of Jan 01, 2019

    [Update]

    • 15-Feb-2024: Since the Kaggle member profile page update, the scrapers aren't working anymore, as the UI layout has changed. Will fix this when we get the time.

  11. Social Media and Mental Health

    • kaggle.com
    zip
    Updated Jul 18, 2023
    Cite
    SouvikAhmed071 (2023). Social Media and Mental Health [Dataset]. https://www.kaggle.com/datasets/souvikahmed071/social-media-and-mental-health
    Explore at:
    zip (10944 bytes)
    Dataset updated
    Jul 18, 2023
    Authors
    SouvikAhmed071
    License

    Open Database License (ODbL) v1.0: https://www.opendatacommons.org/licenses/odbl/1.0/
    License information was derived automatically

    Description

    This dataset was originally collected for a data science and machine learning project that aimed at investigating the potential correlation between the amount of time an individual spends on social media and the impact it has on their mental health.

    The project involves conducting a survey to collect data, organizing the data, and using machine learning techniques to create a predictive model that can determine whether a person should seek professional help based on their answers to the survey questions.

    This project was completed as part of a Statistics course at a university, and the team is currently in the process of writing a report and completing a paper that summarizes and discusses the findings in relation to other research on the topic.

    The following is the Google Colab link to the project, done on Jupyter Notebook -

    https://colab.research.google.com/drive/1p7P6lL1QUw1TtyUD1odNR4M6TVJK7IYN

    The following is the GitHub Repository of the project -

    https://github.com/daerkns/social-media-and-mental-health

    Libraries used for the Project -

    Pandas
    Numpy
    Matplotlib
    Seaborn
    scikit-learn
    
  12. SQLi XSS dataset

    • kaggle.com
    zip
    Updated Oct 23, 2023
    Cite
    AlexTrinity (2023). SQLi XSS dataset [Dataset]. https://www.kaggle.com/datasets/alextrinity/sqli-xss-dataset
    Explore at:
    zip (65053040 bytes)
    Dataset updated
    Oct 23, 2023
    Authors
    AlexTrinity
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    This dataset is unfinished and contains some incorrect content; it is intended for testing only. It contains SQL injection, XSS, and some bash script payloads mixed into strings.

  13. NSL-KDD Dataset

    • kaggle.com
    zip
    Updated Mar 27, 2019
    Cite
    Sanket Rai (2019). NSL-KDD Dataset [Dataset]. https://www.kaggle.com/datasets/sanketrai/nslkdd-dataset
    Explore at:
    zip (2971564 bytes)
    Dataset updated
    Mar 27, 2019
    Authors
    Sanket Rai
    Description

    Dataset

    This dataset was created by Sanket Rai

  14. Fake News detection

    • kaggle.com
    zip
    Updated Dec 7, 2017
    Cite
    jruvika (2017). Fake News detection [Dataset]. https://www.kaggle.com/datasets/jruvika/fake-news-detection
    Explore at:
    zip (5123662 bytes)
    Dataset updated
    Dec 7, 2017
    Authors
    jruvika
    License

    Open Database License (ODbL) v1.0: https://www.opendatacommons.org/licenses/odbl/1.0/
    License information was derived automatically

    Description

    Dataset

    This dataset was created by jruvika

    Released under Database: Open Database, Contents: © Original Authors

  15. Receipt Dataset for information extraction

    • kaggle.com
    zip
    Updated Dec 5, 2022
    Cite
    Dhia Znaidi (2022). Receipt Dataset for information extraction [Dataset]. https://www.kaggle.com/datasets/dhiaznaidi/receiptdatasetssd300v2
    Explore at:
    zip (453180544 bytes)
    Dataset updated
    Dec 5, 2022
    Authors
    Dhia Znaidi
    Description

    Dataset

    This dataset was created by Dhia Znaidi

  16. E-Commerce Data

    • kaggle.com
    zip
    Updated Aug 17, 2017
    Cite
    Carrie (2017). E-Commerce Data [Dataset]. https://www.kaggle.com/datasets/carrie1/ecommerce-data
    Explore at:
    zip (7548686 bytes)
    Dataset updated
    Aug 17, 2017
    Authors
    Carrie
    Description

    Context

    Typically e-commerce datasets are proprietary and consequently hard to find among publicly available data. However, the UCI Machine Learning Repository has made this dataset available, containing actual transactions from 2010 and 2011. The dataset is maintained on their site, where it can be found by the title "Online Retail".

    Content

    "This is a transnational data set which contains all the transactions occurring between 01/12/2010 and 09/12/2011 for a UK-based and registered non-store online retail. The company mainly sells unique all-occasion gifts. Many customers of the company are wholesalers."

    Acknowledgements

    Per the UCI Machine Learning Repository, this data was made available by Dr Daqing Chen, Director: Public Analytics group. chend '@' lsbu.ac.uk, School of Engineering, London South Bank University, London SE1 0AA, UK.

    Image from stocksnap.io.

    Inspiration

    Analyses for this dataset could include time series, clustering, classification and more.

  17. UCI-dataset

    • kaggle.com
    zip
    Updated Aug 17, 2022
    Cite
    Md Waquar Azam (2022). UCI-dataset [Dataset]. https://www.kaggle.com/datasets/mdwaquarazam/ucidatasetlist
    Explore at:
    zip (20774 bytes)
    Dataset updated
    Aug 17, 2022
    Authors
    Md Waquar Azam
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset is a list of the datasets provided by UCI ML. If you are a learner and want data filtered by year, category, profession, or some other criterion, you can search for it here.

    There are 8 columns in the dataset, in which all details are given:

    • link
    • Data-Name
    • data type
    • default task
    • attribute-type
    • instances
    • attributes
    • year

    Some missing values are also present.

    You can analyse the data as per your requirements.

    EDA

  18. Intelligent Manufacturing Dataset

    • kaggle.com
    zip
    Updated Feb 27, 2025
    Cite
    Ziya (2025). Intelligent Manufacturing Dataset [Dataset]. https://www.kaggle.com/datasets/ziya07/intelligent-manufacturing-dataset
    Explore at:
    zip (8902944 bytes)
    Dataset updated
    Feb 27, 2025
    Authors
    Ziya
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    The Intelligent Manufacturing Dataset for Predictive Optimization is a dataset designed for research in smart manufacturing, AI-driven process optimization, and predictive maintenance. It simulates real-time sensor data from industrial machines, incorporating 6G network slicing for enhanced communication and resource allocation.

    Key Features:

    • Industrial IoT Sensor Data – Temperature, vibration, power consumption, etc.
    • 6G Network Performance Metrics – Latency, packet loss, and communication efficiency.
    • Production Efficiency Indicators – Defect rate, predictive maintenance score, error rate.
    • Target Column (Efficiency_Status) – Classifies manufacturing efficiency as High, Medium, or Low based on performance metrics.

    Applications:

    • AI-based predictive maintenance
    • Resource allocation optimization in 6G-enabled smart factories
    • Real-time anomaly detection in industrial production
    • Deep learning model training for intelligent manufacturing systems

    This dataset serves as a benchmark for AI and deep learning applications in Industry 4.0 and 6G network-integrated manufacturing systems.

  19. Alzheimer MRI 4 classes dataset

    • kaggle.com
    zip
    Updated Jan 4, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Marco Pinamonti (2022). Alzheimer MRI 4 classes dataset [Dataset]. https://www.kaggle.com/datasets/marcopinamonti/alzheimer-mri-4-classes-dataset
    Explore at:
    zip (35808295 bytes)
    Dataset updated
    Jan 4, 2022
    Authors
    Marco Pinamonti
    Description

    Context

    This dataset is a copy of the images in the dataset at the link: Alzheimer's Dataset (4 class of Images).

    Content

    The original dataset contained MRI images of 32 horizontal slices of the brain divided into 4 classes:

    • Mild Dementia
    • Moderate Dementia
    • Non Dementia
    • Very Mild Dementia

    For each class there was a different number of subjects:

    • 28 subjects for the Mild Dementia class
    • 2 subjects for the Moderate Dementia class
    • 100 subjects for the Non Dementia class
    • 70 subjects for the Very Mild Dementia class

    The problem with the original dataset was that the train and test sets contained different slices of the brain, because the images were ordered by slice position and the train/test division was performed by putting the first percentage of images in the train set and the last ones in the test set.

    In this dataset the original train and test sets have been merged, and the images have been divided randomly between train, test, and validation sets.
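
    The difference between the two splitting strategies can be sketched as follows. The 32 slice names, the split fractions, and the seed are illustrative assumptions; the description does not give the proportions actually used.

```python
import random


def ordered_split(items, train_fraction=0.8):
    """Original approach: first part to train, remainder to test."""
    cut = int(len(items) * train_fraction)
    return items[:cut], items[cut:]


def random_split(items, fractions=(0.7, 0.15, 0.15), seed=0):
    """Shuffle first, then cut into train, validation, and test sets."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * fractions[0])
    n_val = int(len(shuffled) * fractions[1])
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


# Hypothetical image names ordered by slice position.
slices = [f"slice_{i:02d}" for i in range(32)]

train_o, test_o = ordered_split(slices)
print(test_o)  # only the last slice positions end up in the test set

train_r, val_r, test_r = random_split(slices)
print(sorted(test_r))  # slice positions are mixed across the sets
```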

  20. Comprehensive Supply Chain Analysis

    • kaggle.com
    Updated Sep 15, 2023
    Cite
    Dorothy Joel (2023). Comprehensive Supply Chain Analysis [Dataset]. https://www.kaggle.com/datasets/dorothyjoel/us-regional-sales
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Sep 15, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Dorothy Joel
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    This supply chain analysis provides a comprehensive view of the company's order and distribution processes, allowing for in-depth analysis and optimization of various aspects of the supply chain, from procurement and inventory management to sales and customer satisfaction. It empowers the company to make data-driven decisions to improve efficiency, reduce costs, and enhance customer experiences. The provided supply chain analysis dataset contains various columns that capture important information related to the company's order and distribution processes:

    • OrderNumber
    • Sales Channel
    • WarehouseCode
    • ProcuredDate
    • CurrencyCode
    • OrderDate
    • ShipDate
    • DeliveryDate
    • SalesTeamID
    • CustomerID
    • StoreID
    • ProductID
    • Order Quantity
    • Discount Applied
    • Unit Cost
    • Unit Price
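
    Two quantities that these columns make possible are lead times and a discounted unit margin, sketched below. The ISO date format, the reading of Discount Applied as a fraction of the unit price, and every value in the sample order are assumptions, since the description does not specify formats.

```python
from datetime import date


def lead_times(order):
    """Days from order to shipment and from shipment to delivery."""
    ordered = date.fromisoformat(order["OrderDate"])
    shipped = date.fromisoformat(order["ShipDate"])
    delivered = date.fromisoformat(order["DeliveryDate"])
    return {
        "days_to_ship": (shipped - ordered).days,
        "days_in_transit": (delivered - shipped).days,
    }


def unit_margin(order):
    """Discounted unit price minus unit cost (discount assumed to be a fraction)."""
    discounted_price = order["Unit Price"] * (1 - order["Discount Applied"])
    return discounted_price - order["Unit Cost"]


# Hypothetical order using the column names from the list above.
order = {
    "OrderNumber": "SO-EXAMPLE",
    "OrderDate": "2023-01-02",
    "ShipDate": "2023-01-05",
    "DeliveryDate": "2023-01-12",
    "Unit Price": 100.0,
    "Unit Cost": 60.0,
    "Discount Applied": 0.25,
}
print(lead_times(order))
print(unit_margin(order))
```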

Top 1000 Kaggle Datasets

Kaggle's most popular datasets

Explore at:
2 scholarly articles cite this dataset (View in Google Scholar)