21 datasets found
  1. Learn pandas

    • kaggle.com
    Updated Apr 25, 2021
    Cite
    npscul (2021). Learn pandas [Dataset]. https://www.kaggle.com/npscul/learn-pandas/code
    Available in the Croissant format (a format for machine-learning datasets; see mlcommons.org/croissant).
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    npscul
    Description

    Dataset

    This dataset was created by npscul


  2. Learn Data Science Series Part 1

    • kaggle.com
    Updated Dec 30, 2022
    Cite
    Rupesh Kumar (2022). Learn Data Science Series Part 1 [Dataset]. https://www.kaggle.com/datasets/hunter0007/learn-data-science-part-1
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Rupesh Kumar
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Please feel free to share it with others and consider supporting me if you find it helpful ⭐️.

    Overview:

    • Chapter 1: Getting started with pandas
    • Chapter 2: Analysis: Bringing it all together and making decisions
    • Chapter 3: Appending to DataFrame
    • Chapter 4: Boolean indexing of dataframes
    • Chapter 5: Categorical data
    • Chapter 6: Computational Tools
    • Chapter 7: Creating DataFrames
    • Chapter 8: Cross sections of different axes with MultiIndex
    • Chapter 9: Data Types
    • Chapter 10: Dealing with categorical variables
    • Chapter 11: Duplicated data
    • Chapter 12: Getting information about DataFrames
    • Chapter 13: Gotchas of pandas
    • Chapter 14: Graphs and Visualizations
    • Chapter 15: Grouping Data
    • Chapter 16: Grouping Time Series Data
    • Chapter 17: Holiday Calendars
    • Chapter 18: Indexing and selecting data
    • Chapter 19: IO for Google BigQuery
    • Chapter 20: JSON
    • Chapter 21: Making Pandas Play Nice With Native Python Datatypes
    • Chapter 22: Map Values
    • Chapter 23: Merge, join, and concatenate
    • Chapter 24: Meta: Documentation Guidelines
    • Chapter 25: Missing Data
    • Chapter 26: MultiIndex
    • Chapter 27: Pandas Datareader
    • Chapter 28: Pandas IO tools (reading and saving data sets)
    • Chapter 29: pd.DataFrame.apply
    • Chapter 30: Read MySQL to DataFrame
    • Chapter 31: Read SQL Server to Dataframe
    • Chapter 32: Reading files into pandas DataFrame
    • Chapter 33: Resampling
    • Chapter 34: Reshaping and pivoting
    • Chapter 35: Save pandas dataframe to a csv file
    • Chapter 36: Series
    • Chapter 37: Shifting and Lagging Data
    • Chapter 38: Simple manipulation of DataFrames
    • Chapter 39: String manipulation
    • Chapter 40: Using .ix, .iloc, .loc, .at and .iat to access a DataFrame
    • Chapter 41: Working with Time Series
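As a taste of what these chapters cover, here is a minimal, self-contained pandas sketch (illustrative data; not part of the dataset):

```python
import pandas as pd

# Creating a DataFrame (Chapter 7)
df = pd.DataFrame({"name": ["Ana", "Ben", "Cal"], "score": [88, 92, 79]})

# Boolean indexing (Chapter 4)
passed = df[df["score"] >= 80]

# Label-based and position-based access (Chapter 40: .loc / .iloc)
first_name = df.loc[0, "name"]   # value at row label 0, column "name"
first_row = df.iloc[0]           # first row by position

# A simple aggregate (Chapter 38 territory)
mean_score = df["score"].mean()
```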
  3. Data from: car-sales

    • kaggle.com
    Updated Jun 30, 2020
    Cite
    Makar Baderko (2020). car-sales [Dataset]. https://www.kaggle.com/makarbaderko/carsales
    Available download formats: zip (18,661 bytes)
    Authors
    Makar Baderko
    Description

    Dataset

    This dataset was created by Makar Baderko

    Released under Data files © Original Authors


  4. EDGE-IIOTSET Dataset

    • paperswithcode.com
    Updated Oct 16, 2023
    Cite
    (2023). EDGE-IIOTSET Dataset [Dataset]. https://paperswithcode.com/dataset/edge-iiotset
    Description

    ABSTRACT In this project, we propose a new comprehensive realistic cyber security dataset of IoT and IIoT applications, called Edge-IIoTset, which can be used by machine learning-based intrusion detection systems in two different modes, namely, centralized and federated learning. Specifically, the proposed testbed is organized into seven layers, including, Cloud Computing Layer, Network Functions Virtualization Layer, Blockchain Network Layer, Fog Computing Layer, Software-Defined Networking Layer, Edge Computing Layer, and IoT and IIoT Perception Layer. In each layer, we propose new emerging technologies that satisfy the key requirements of IoT and IIoT applications, such as, ThingsBoard IoT platform, OPNFV platform, Hyperledger Sawtooth, Digital twin, ONOS SDN controller, Mosquitto MQTT brokers, Modbus TCP/IP, ...etc. The IoT data are generated from various IoT devices (more than 10 types) such as Low-cost digital sensors for sensing temperature and humidity, Ultrasonic sensor, Water level detection sensor, pH Sensor Meter, Soil Moisture sensor, Heart Rate Sensor, Flame Sensor, ...etc.). However, we identify and analyze fourteen attacks related to IoT and IIoT connectivity protocols, which are categorized into five threats, including, DoS/DDoS attacks, Information gathering, Man in the middle attacks, Injection attacks, and Malware attacks. In addition, we extract features obtained from different sources, including alerts, system resources, logs, network traffic, and propose new 61 features with high correlations from 1176 found features. After processing and analyzing the proposed realistic cyber security dataset, we provide a primary exploratory data analysis and evaluate the performance of machine learning approaches (i.e., traditional machine learning as well as deep learning) in both centralized and federated learning modes.

    Instructions:

    Great news! The Edge-IIoT dataset has been featured as a "Document in the top 1% of Web of Science." This indicates that it is ranked within the top 1% of all publications indexed by the Web of Science (WoS) in terms of citations and impact.

    Please kindly visit kaggle link for the updates: https://www.kaggle.com/datasets/mohamedamineferrag/edgeiiotset-cyber-sec...

    Free use of the Edge-IIoTset dataset for academic research purposes is hereby granted in perpetuity. Use for commercial purposes is allowed after asking the lead author, Dr Mohamed Amine Ferrag, who has asserted his rights under copyright.

    The details of the Edge-IIoT dataset were published in the following paper. For academic/public use of these datasets, the authors ask that you cite the following paper:

    Mohamed Amine Ferrag, Othmane Friha, Djallel Hamouda, Leandros Maglaras, Helge Janicke, "Edge-IIoTset: A New Comprehensive Realistic Cyber Security Dataset of IoT and IIoT Applications for Centralized and Federated Learning", IEEE Access, April 2022 (IF: 3.37), DOI: 10.1109/ACCESS.2022.3165809

    Link to paper : https://ieeexplore.ieee.org/document/9751703

    The directories of the Edge-IIoTset dataset include the following:

    •File 1 (Normal traffic)

    -File 1.1 (Distance): This file includes two documents, namely, Distance.csv and Distance.pcap. The IoT sensor (Ultrasonic sensor) is used to capture the IoT data.

    -File 1.2 (Flame_Sensor): This file includes two documents, namely, Flame_Sensor.csv and Flame_Sensor.pcap. The IoT sensor (Flame Sensor) is used to capture the IoT data.

    -File 1.3 (Heart_Rate): This file includes two documents, namely, Heart_Rate.csv and Heart_Rate.pcap. The IoT sensor (Heart Rate Sensor) is used to capture the IoT data.

    -File 1.4 (IR_Receiver): This file includes two documents, namely, IR_Receiver.csv and IR_Receiver.pcap. The IoT sensor (IR (Infrared) Receiver Sensor) is used to capture the IoT data.

    -File 1.5 (Modbus): This file includes two documents, namely, Modbus.csv and Modbus.pcap. The IoT sensor (Modbus Sensor) is used to capture the IoT data.

    -File 1.6 (phValue): This file includes two documents, namely, phValue.csv and phValue.pcap. The IoT sensor (pH-sensor PH-4502C) is used to capture the IoT data.

    -File 1.7 (Soil_Moisture): This file includes two documents, namely, Soil_Moisture.csv and Soil_Moisture.pcap. The IoT sensor (Soil Moisture Sensor v1.2) is used to capture the IoT data.

    -File 1.8 (Sound_Sensor): This file includes two documents, namely, Sound_Sensor.csv and Sound_Sensor.pcap. The IoT sensor (LM393 Sound Detection Sensor) is used to capture the IoT data.

    -File 1.9 (Temperature_and_Humidity): This file includes two documents, namely, Temperature_and_Humidity.csv and Temperature_and_Humidity.pcap. The IoT sensor (DHT11 Sensor) is used to capture the IoT data.

    -File 1.10 (Water_Level): This file includes two documents, namely, Water_Level.csv and Water_Level.pcap. The IoT sensor (Water sensor) is used to capture the IoT data.

    •File 2 (Attack traffic):

    -File 2.1 (Attack traffic (CSV files)): This file includes 14 documents, namely, Backdoor_attack.csv, DDoS_HTTP_Flood_attack.csv, DDoS_ICMP_Flood_attack.csv, DDoS_TCP_SYN_Flood_attack.csv, DDoS_UDP_Flood_attack.csv, MITM_attack.csv, OS_Fingerprinting_attack.csv, Password_attack.csv, Port_Scanning_attack.csv, Ransomware_attack.csv, SQL_injection_attack.csv, Uploading_attack.csv, Vulnerability_scanner_attack.csv, XSS_attack.csv. Each document is specific to one attack.

    -File 2.2 (Attack traffic (PCAP files)): This file includes 14 documents, namely, Backdoor_attack.pcap, DDoS_HTTP_Flood_attack.pcap, DDoS_ICMP_Flood_attack.pcap, DDoS_TCP_SYN_Flood_attack.pcap, DDoS_UDP_Flood_attack.pcap, MITM_attack.pcap, OS_Fingerprinting_attack.pcap, Password_attack.pcap, Port_Scanning_attack.pcap, Ransomware_attack.pcap, SQL_injection_attack.pcap, Uploading_attack.pcap, Vulnerability_scanner_attack.pcap, XSS_attack.pcap. Each document is specific to one attack.

    •File 3 (Selected dataset for ML and DL):

    -File 3.1 (DNN-EdgeIIoT-dataset): This file contains a selected dataset for the use of evaluating deep learning-based intrusion detection systems.

    -File 3.2 (ML-EdgeIIoT-dataset): This file contains a selected dataset for the use of evaluating traditional machine learning-based intrusion detection systems.

    Step 1: Downloading the Edge-IIoTset dataset from the Kaggle platform:

    from google.colab import files
    !pip install -q kaggle
    files.upload()
    !mkdir ~/.kaggle
    !cp kaggle.json ~/.kaggle/
    !chmod 600 ~/.kaggle/kaggle.json
    !kaggle datasets download -d mohamedamineferrag/edgeiiotset-cyber-security-dataset-of-iot-iiot -f "Edge-IIoTset dataset/Selected dataset for ML and DL/DNN-EdgeIIoT-dataset.csv"
    !unzip DNN-EdgeIIoT-dataset.csv.zip
    !rm DNN-EdgeIIoT-dataset.csv.zip

    Step 2: Reading the dataset's CSV file into a pandas DataFrame:

    import pandas as pd
    import numpy as np

    df = pd.read_csv('DNN-EdgeIIoT-dataset.csv', low_memory=False)

    Step 3: Exploring some of the DataFrame's contents:

    df.head(5)
    print(df['Attack_type'].value_counts())

    Step 4: Dropping data (columns, duplicated rows, NaN, Null):

    from sklearn.utils import shuffle

    drop_columns = ["frame.time", "ip.src_host", "ip.dst_host", "arp.src.proto_ipv4", "arp.dst.proto_ipv4",
                    "http.file_data", "http.request.full_uri", "icmp.transmit_timestamp",
                    "http.request.uri.query", "tcp.options", "tcp.payload", "tcp.srcport",
                    "tcp.dstport", "udp.port", "mqtt.msg"]

    df.drop(drop_columns, axis=1, inplace=True)
    df.dropna(axis=0, how='any', inplace=True)
    df.drop_duplicates(subset=None, keep="first", inplace=True)
    df = shuffle(df)
    df.isna().sum()
    print(df['Attack_type'].value_counts())

    Step 5: Categorical data encoding (dummy encoding):

    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.preprocessing import StandardScaler
    from sklearn import preprocessing

    def encode_text_dummy(df, name):
        dummies = pd.get_dummies(df[name])
        for x in dummies.columns:
            dummy_name = f"{name}-{x}"
            df[dummy_name] = dummies[x]
        df.drop(name, axis=1, inplace=True)

    encode_text_dummy(df, 'http.request.method')
    encode_text_dummy(df, 'http.referer')
    encode_text_dummy(df, "http.request.version")
    encode_text_dummy(df, "dns.qry.name.len")
    encode_text_dummy(df, "mqtt.conack.flags")
    encode_text_dummy(df, "mqtt.protoname")
    encode_text_dummy(df, "mqtt.topic")

    Step 6: Creation of the preprocessed dataset:

    df.to_csv('preprocessed_DNN.csv', encoding='utf-8')

    For more information about the dataset, please contact the lead author of this project, Dr Mohamed Amine Ferrag, on his email: mohamed.amine.ferrag@gmail.com

    More information about Dr. Mohamed Amine Ferrag is available at:

    https://www.linkedin.com/in/Mohamed-Amine-Ferrag

    https://dblp.uni-trier.de/pid/142/9937.html

    https://www.researchgate.net/profile/Mohamed_Amine_Ferrag

    https://scholar.google.fr/citations?user=IkPeqxMAAAAJ&hl=fr&oi=ao

    https://www.scopus.com/authid/detail.uri?authorId=56115001200

    https://publons.com/researcher/1322865/mohamed-amine-ferrag/

    https://orcid.org/0000-0002-0632-3172

    Last Updated: 27 Mar. 2023

  5. PANDA fanconic model weights

    • kaggle.com
    Updated Jul 22, 2020
    Cite
    Claudio Fanconi (2020). PANDA fanconic model weights [Dataset]. https://www.kaggle.com/fanconic/panda-tiles-20x112x112/metadata
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Claudio Fanconi
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    16x112x112 tiles of images in PNG format from the PANDA prostate detection challenge.

    Content

    The dataset contains 16 tiles of size 112x112 for every image of the original competition dataset. The tiles kept are the ones containing the most significant pixel information.

    Acknowledgements

    The data in this dataset was created with the following kernel: https://www.kaggle.com/fanconic/panda-20x112x112-tiles-for-efficientnetb0

    Many thanks to @iafoss for the original kernel: https://www.kaggle.com/iafoss/panda-16x128x128-tiles You da real MVP!

  6. Social Power NBA

    • kaggle.com
    Updated Aug 1, 2017
    Cite
    Noah Gift (2017). Social Power NBA [Dataset]. https://www.kaggle.com/datasets/noahgift/social-power-nba/suggestions
    Dataset provided by
    Kaggle
    Authors
    Noah Gift
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    Context

    This data set contains combined on-court performance data for NBA players in the 2016-2017 season, alongside salary, Twitter engagement, and Wikipedia traffic data.

    Further information can be found in a series of articles for IBM Developerworks: "Explore valuation and attendance using data science and machine learning" and "Exploring the individual NBA players".

    Slides from a March 2018 Strata talk about this dataset are available:

    https://www.slideshare.net/noahgift/social-power-andinfluenceinthenba-89807740?qid=3f9f835a-f3d7-4174-8a8c-c97f9c82e614&v=&b=&from_search=1

    Further reading on this dataset is in Chapter 6 of the book Pragmatic AI: An Introduction to Cloud-Based Machine Learning, and in lesson 9 of Essential Machine Learning and AI with Python and Jupyter Notebook.


    Acknowledgement

    Data sources include ESPN, Basketball-Reference, Twitter, Five-ThirtyEight, and Wikipedia. The source code for this dataset (in Python and R) can be found on GitHub. Links to more writing can be found at noahgift.com.

    Inspiration

    • Do NBA fans know more about who the best players are, or do owners?
    • What is the true worth of the social media presence of athletes in the NBA?
  7. Age and Sex Prediction by Artificial Intelligence

    • kaggle.com
    Updated Jul 5, 2025
    Cite
    EMİRHAN BULUT (2025). Age and Sex Prediction by Artificial Intelligence [Dataset]. https://www.kaggle.com/datasets/emirhanai/age-and-sex-prediction-by-artificial-intelligence
    Dataset provided by
    Kaggle
    Authors
    EMİRHAN BULUT
    License

    http://www.gnu.org/licenses/agpl-3.0.html

    Description

    Age and Sex Prediction from Image - Convolutional Neural Network with Artificial Intelligence

    I developed artificial intelligence software that predicts your age and gender, with a 93% accuracy rate. I'm 21 years old and it predicted my age correctly! I adjusted the algorithm and prepared the code. The system works with neural networks in a deep learning setup; I used convolutional layers from convolutional neural networks. I am pleased to present this software for humanity. Doctoral students can use it in their theses, and companies can use this software. Upload your photo and it will guess your age and gender!

    Kind regards,

    Emirhan BULUT

    Head of AI & AI Inventor

    The coding language used:

    Python 3.9.8

    Libraries Used:

    TensorFlow

    Keras

    OpenCV

    MatPlotlib

    NumPy

    Pandas

    Scikit-learn - (SKLEARN)

    https://raw.githubusercontent.com/emirhanai/Age-and-Sex-Prediction-from-Image---Convolutional-Neural-Network-with-Artificial-Intelligence/main/Age%20and%20Sex%20Prediction%20from%20Image%20-%20Convolutional%20Neural%20Network%20with%20Artificial%20Intelligence.png

    Developer Information:

    Name-Surname: Emirhan BULUT

    Contact (Email) : emirhan@isap.solutions

    LinkedIn : https://www.linkedin.com/in/artificialintelligencebulut/

    Kaggle: https://www.kaggle.com/emirhanai

    Official Website: https://www.emirhanbulut.com.tr

  8. Cryptocurrency Prediction Artificial Intelligence

    • kaggle.com
    Updated Jul 7, 2025
    Cite
    EMİRHAN BULUT (2025). Cryptocurrency Prediction Artificial Intelligence [Dataset]. https://www.kaggle.com/datasets/emirhanai/cryptocurrency-prediction-artificial-intelligence
    Dataset provided by
    Kaggle
    Authors
    EMİRHAN BULUT
    License

    http://www.gnu.org/licenses/agpl-3.0.html

    Description

    Cryptocurrency-Prediction-with-Artificial-Intelligence

    First version: Cryptocurrency Prediction with Artificial Intelligence (deep learning via LSTM neural networks), by Emirhan BULUT. I developed cryptocurrency prediction software using deep learning with LSTM neural networks. I predicted the fall on December 28, 2021 in the XRP/USDT pair with 98.5% accuracy. The completed software achieved an MAE score of 0.009179626158151918, an MSE score of 0.0002120391943355104, and 98.35% accuracy.

    The XRP/USDT pair forecast for December 28, 2021 was correctly forecasted based on data from Binance.

    Software codes and information are shared with you as open source code free of charge on GitHub and My Personal Web Address.

    Happy learning!

    Emirhan BULUT

    Senior Artificial Intelligence Engineer & Inventor

    The coding language used:

    Python 3.9.8

    Libraries Used:

    Tensorflow - Keras

    NumPy

    Matplotlib

    Pandas

    Scikit-learn - (SKLEARN)

    https://raw.githubusercontent.com/emirhanai/Cryptocurrency-Prediction-with-Artificial-Intelligence/main/XRP-1%20-%20PREDICTION.png

    Developer Information:

    Name-Surname: Emirhan BULUT

    Contact (Email) : emirhan@isap.solutions

    LinkedIn : https://www.linkedin.com/in/artificialintelligencebulut/

    Kaggle: https://www.kaggle.com/emirhanai

    Official Website: https://www.emirhanbulut.com.tr

  9. Enhanced Pizza Sales Data (2024–2025)

    • kaggle.com
    Updated May 12, 2025
    Cite
    akshay gaikwad (2025). Enhanced Pizza Sales Data (2024–2025) [Dataset]. https://www.kaggle.com/datasets/akshaygaikwad448/pizza-delivery-data-with-enhanced-features
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    akshay gaikwad
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This is a realistic and structured pizza sales dataset covering the time span from **2024 to 2025**. Whether you're a beginner in data science, a student working on a machine learning project, or an experienced analyst looking to test out time series forecasting and dashboard building, this dataset is for you.

    📁 What’s Inside? The dataset contains rich details from a pizza business including:

    ✅ Order Dates & Times ✅ Pizza Names & Categories (Veg, Non-Veg, Classic, Gourmet, etc.) ✅ Sizes (Small, Medium, Large, XL) ✅ Prices ✅ Order Quantities ✅ Customer Preferences & Trends

    It is neatly organized in Excel format and easy to use with tools like Python (Pandas), Power BI, Excel, or Tableau.
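For example, a first pass with pandas might aggregate revenue per day (the column names below are assumptions for illustration; check the actual headers in the Excel file, which you would load with pd.read_excel):

```python
import pandas as pd

# Hypothetical rows mimicking the described fields: order dates,
# pizza names, sizes, prices, and quantities.
orders = pd.DataFrame({
    "order_date": pd.to_datetime(["2024-01-05", "2024-01-05", "2024-01-06"]),
    "pizza_name": ["Margherita", "Pepperoni", "Margherita"],
    "size": ["M", "L", "S"],
    "price": [8.5, 11.0, 6.5],
    "quantity": [2, 1, 3],
})

# Revenue per order line, then daily totals for time-series work
orders["revenue"] = orders["price"] * orders["quantity"]
daily_revenue = orders.groupby(orders["order_date"].dt.date)["revenue"].sum()
```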

    💡 **Why Use This Dataset?** This dataset is ideal for:

    📈 Sales Analysis & Reporting 🧠 Machine Learning Models (demand forecasting, recommendations) 📅 Time Series Forecasting 📊 Data Visualization Projects 🍽️ Customer Behavior Analysis 🛒 Market Basket Analysis 📦 Inventory Management Simulations

    🧠 Perfect For: Data Science Beginners & Learners, BI Developers & Dashboard Designers, MBA Students (Marketing, Retail, Operations), Hackathons & Case Study Competitions

    Tags: pizza, sales data, excel dataset, retail analysis, data visualization, business intelligence, forecasting, time series, customer insights, machine learning, pandas, beginner friendly

  10. Bank Data Analysis

    • kaggle.com
    Updated Mar 19, 2022
    Cite
    Steve Gallegos (2022). Bank Data Analysis [Dataset]. https://www.kaggle.com/stevegallegos/bank-marketing-data-set/code
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Steve Gallegos
    Description

    Data Set Information

    The bank.csv dataset describes phone calls between customers and customer-care staff working for a Portuguese banking institution. It records whether each customer took up a product such as a bank term deposit; the target is a binary 'yes' or 'no'.

    Goal

    The main goal is to predict if clients will subscribe to a term deposit or not.

    Attribute Information

    -Input Variables -

    Bank Client Data:
    1 - age: (numeric)
    2 - job: type of job (categorical: admin., blue-collar, entrepreneur, housemaid, management, retired, self-employed, services, student, technician, unemployed, unknown)
    3 - marital: marital status (categorical: divorced, married, single, unknown; note: divorced means either divorced or widowed)
    4 - education: (categorical: basic.4y, basic.6y, basic.9y, high.school, illiterate, professional.course, university.degree, unknown)
    5 - default: has credit in default? (categorical: no, yes, unknown)
    6 - housing: has housing loan? (categorical: no, yes, unknown)
    7 - loan: has personal loan? (categorical: no, yes, unknown)

    Related with the Last Contact of the Current Campaign:
    8 - contact: contact communication type (categorical: cellular, telephone)
    9 - month: last contact month of year (categorical: jan, feb, mar, ..., nov, dec)
    10 - day_of_week: last contact day of the week (categorical: mon, tue, wed, thu, fri)
    11 - duration: last contact duration, in seconds (numeric). Important note: this attribute highly affects the output target (e.g., if duration=0 then y='no'). Yet, the duration is not known before a call is performed. Also, after the end of the call y is obviously known. Thus, this input should only be included for benchmark purposes and should be discarded if the intention is to have a realistic predictive model.

    Other Attributes:
    12 - campaign: number of contacts performed during this campaign and for this client (numeric, includes last contact)
    13 - pdays: number of days that passed by after the client was last contacted from a previous campaign (numeric; 999 means client was not previously contacted)
    14 - previous: number of contacts performed before this campaign and for this client (numeric)
    15 - poutcome: outcome of the previous marketing campaign (categorical: failure, nonexistent, success)

    Social and Economic Context Attributes:
    16 - emp.var.rate: employment variation rate - quarterly indicator (numeric)
    17 - cons.price.idx: consumer price index - monthly indicator (numeric)
    18 - cons.conf.idx: consumer confidence index - monthly indicator (numeric)
    19 - euribor3m: euribor 3 month rate - daily indicator (numeric)
    20 - nr.employed: number of employees - quarterly indicator (numeric)

    Output Variable (Desired Target):
    21 - y (deposit): has the client subscribed a term deposit? (binary: yes, no). Note: the column title was changed from '***y***' to '***deposit***'.
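A minimal baseline sketch for that prediction goal, using a tiny synthetic stand-in for bank.csv (values made up for illustration; per the attribute notes, 'duration' is deliberately left out of the features):

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Tiny synthetic stand-in for bank.csv (values made up for illustration)
df = pd.DataFrame({
    "age": [30, 45, 52, 29, 61, 38],
    "job": ["admin.", "technician", "retired", "student", "retired", "services"],
    "campaign": [1, 2, 1, 3, 1, 2],
    "deposit": ["no", "yes", "yes", "no", "yes", "no"],
})

# One-hot encode the categorical column; numeric columns pass through as-is.
X = pd.get_dummies(df[["age", "job", "campaign"]])
y = (df["deposit"] == "yes").astype(int)

model = LogisticRegression(max_iter=1000).fit(X, y)
train_acc = model.score(X, y)
```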

    Source

    [Moro et al., 2014] S. Moro, P. Cortez and P. Rita. A Data-Driven Approach to Predict the Success of Bank Telemarketing. Decision Support Systems, Elsevier, 62:22-31, June 2014

  11. Hospital Management Dataset

    • kaggle.com
    Updated May 30, 2025
    Cite
    Kanak Baghel (2025). Hospital Management Dataset [Dataset]. https://www.kaggle.com/datasets/kanakbaghel/hospital-management-dataset
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Kanak Baghel
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    This is a structured, multi-table dataset designed to simulate a hospital management system. It is ideal for practicing data analysis, SQL, machine learning, and healthcare analytics.

    Dataset Overview

    This dataset includes five CSV files:

    1. patients.csv – Patient demographics, contact details, registration info, and insurance data

    2. doctors.csv – Doctor profiles with specializations, experience, and contact information

    3. appointments.csv – Appointment dates, times, visit reasons, and statuses

    4. treatments.csv – Treatment types, descriptions, dates, and associated costs

    5. billing.csv – Billing amounts, payment methods, and status linked to treatments

    📁 Files & Column Descriptions

    **patients.csv**

    Contains patient demographic and registration details.

    patient_id -> Unique ID for each patient
    first_name -> Patient's first name
    last_name -> Patient's last name
    gender -> Gender (M/F)
    date_of_birth -> Date of birth
    contact_number -> Phone number
    address -> Address of the patient
    registration_date -> Date of first registration at the hospital
    insurance_provider -> Insurance company name
    insurance_number -> Policy number
    email -> Email address

    **doctors.csv**

    Details about the doctors working in the hospital.

    doctor_id -> Unique ID for each doctor
    first_name -> Doctor's first name
    last_name -> Doctor's last name
    specialization -> Medical field of expertise
    phone_number -> Contact number
    years_experience -> Total years of experience
    hospital_branch -> Branch of hospital where the doctor is based
    email -> Official email address

    **appointments.csv**

    Records of scheduled and completed patient appointments.

    appointment_id -> Unique appointment ID
    patient_id -> ID of the patient
    doctor_id -> ID of the attending doctor
    appointment_date -> Date of the appointment
    appointment_time -> Time of the appointment
    reason_for_visit -> Purpose of visit (e.g., checkup)
    status -> Status (Scheduled, Completed, Cancelled)

    **treatments.csv**

    Information about the treatments given during appointments.

    treatment_id -> Unique ID for each treatment
    appointment_id -> Associated appointment ID
    treatment_type -> Type of treatment (e.g., MRI, X-ray)
    description -> Notes or procedure details
    cost -> Cost of treatment
    treatment_date -> Date when treatment was given

    **billing.csv**

    Billing and payment details for treatments.

    bill_id -> Unique billing ID
    patient_id -> ID of the billed patient
    treatment_id -> ID of the related treatment
    bill_date -> Date of billing
    amount -> Total amount billed
    payment_method -> Mode of payment (Cash, Card, Insurance)
    payment_status -> Status of payment (Paid, Pending, Failed)

    Possible Use Cases

    SQL queries and relational database design

    Exploratory data analysis (EDA) and dashboarding

    Machine learning projects (e.g., cost prediction, no-show analysis)

    Feature engineering and data cleaning practice

    End-to-end healthcare analytics workflows

    Recommended Tools & Resources

    SQL (joins, filters, window functions)

    Pandas and Matplotlib/Seaborn for EDA

    Scikit-learn for ML models

    Pandas Profiling for automated EDA

    Plotly for interactive visualizations
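The SQL-style joins mentioned above map directly onto pandas; here is a minimal sketch with made-up rows (column names taken from the file descriptions above):

```python
import pandas as pd

# Made-up stand-ins for patients.csv and billing.csv (a few columns only)
patients = pd.DataFrame({
    "patient_id": [1, 2],
    "first_name": ["Asha", "Ravi"],
    "insurance_provider": ["Acme Health", "CarePlus"],
})
billing = pd.DataFrame({
    "bill_id": [10, 11, 12],
    "patient_id": [1, 1, 2],
    "amount": [250.0, 75.0, 400.0],
    "payment_status": ["Paid", "Pending", "Paid"],
})

# Left join billing onto patients (like SQL LEFT JOIN ... ON patient_id),
# then total billed amount per patient.
merged = billing.merge(patients, on="patient_id", how="left")
totals = merged.groupby("first_name")["amount"].sum()
```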

    Please note:

    All data is synthetically generated for educational and project use. No real patient information is included.

    If you find this dataset helpful, consider upvoting or sharing your insights by creating a Kaggle notebook.

  12. Classified Ads for Cars - unique maker/model/year

    • kaggle.com
    Updated Mar 9, 2019
    Cite
    Volodymyr Sergeyev (2019). Classified Ads for Cars - unique maker/model/year [Dataset]. https://www.kaggle.com/vsergeyev/classified-ads-for-cars-unique-makermodelyear/notebooks
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Volodymyr Sergeyev
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    Thanks to Miroslav Zoricak and https://www.kaggle.com/mirosval/personal-cars-classifieds

    Inspiration

    • How many unique car makers are there?
    • How many models are there?
    • Learn pandas
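    A minimal pandas sketch for the first two questions (the column names maker and model are assumptions about this dataset's schema, and the rows are made up):

```python
import pandas as pd

# Toy stand-in for the classified-ads table; real column names may differ
ads = pd.DataFrame({
    "maker": ["Ford", "Ford", "Skoda", "Skoda", "BMW"],
    "model": ["Focus", "Fiesta", "Octavia", "Octavia", "320i"],
})

n_makers = ads["maker"].nunique()  # number of unique car makers
n_models = ads[["maker", "model"]].drop_duplicates().shape[0]  # unique maker/model pairs
print(n_makers, n_models)  # 3 4
```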
  13. Data from: EHR data

    • kaggle.com
    Updated Apr 30, 2025
    Bipul Shahi (2025). EHR data [Dataset]. https://www.kaggle.com/datasets/vipulshahi/ehr-data/code
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Bipul Shahi
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    🔍 Dataset Overview

    Each patient in the dataset has 30 days of continuous health data. The goal is to predict if a patient will progress to a critical condition based on their vital signs, medication adherence, and symptoms recorded daily.

    There are 10 columns in the dataset:

    Column Name -> Description
    patient_id -> Unique identifier for each patient.
    day -> Day number (from 1 to 30) indicating sequential daily records.
    bp_systolic -> Systolic blood pressure (top number) in mm Hg. Higher values may indicate hypertension.
    bp_diastolic -> Diastolic blood pressure (bottom number) in mm Hg.
    heart_rate -> Heartbeats per minute. Elevated heart rate can signal stress, infection, or deterioration.
    respiratory_rate -> Breaths per minute. Elevated rates can indicate respiratory distress.
    temperature -> Body temperature in °F. Fever or hypothermia are signs of infection or inflammation.
    oxygen_saturation -> Percentage of oxygen in blood. Lower values are concerning (< 94%).
    med_adherence -> Patient’s medication adherence (between 0 and 1). Lower values may contribute to worsening.
    symptom_severity -> Subjective symptom rating (scale of 1–10). Higher means worse condition.
    progressed_to_critical -> Target label: 1 if patient deteriorated to a critical condition, else 0.

    🎯 Final Task (Prediction Objective)

    Problem Type: Binary classification with time-series data.

    Goal: Train deep learning models (RNN, LSTM, GRU) to learn temporal patterns from a patient's 30-day health history and predict whether the patient will progress to a critical condition.

    📈 How the Data is Used for Modeling

    Input: A 3D array shaped as (num_patients, 30, 8), where 30 = number of days (timesteps) and 8 = features per day (excluding ID, day, and target).
    Output: A binary label for each patient (0 or 1).

    🔄 Feature Contribution to Prediction

    Feature -> Why It Matters
    bp_systolic/dia -> Persistently high or rising BP may signal stress, cardiac issues, or deterioration.
    heart_rate -> A rising heart rate can indicate fever, infection, or organ distress.
    respiratory_rate -> Often increases early in critical illnesses like sepsis or COVID.
    temperature -> Fever is a key sign of infection. Chronic low/high temp may indicate underlying pathology.
    oxygen_saturation -> A declining oxygen level is a strong predictor of respiratory failure.
    med_adherence -> Poor medication adherence is often linked to worsening chronic conditions.
    symptom_severity -> Patient-reported worsening symptoms may precede measurable physiological changes.

    🛠 Tools You’ll Use

    Task -> Tool/Technique
    Data processing -> Pandas, NumPy, Scikit-learn
    Time series modeling -> Keras (using SimpleRNN, LSTM, GRU)
    Evaluation -> Accuracy, Loss, ROC Curve (optional)
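    The (num_patients, 30, 8) input shaping described above can be sketched with pandas and NumPy. The frame below is a tiny synthetic stand-in with two patients and three days each (the real data has 30 days per patient):

```python
import numpy as np
import pandas as pd

FEATURES = ["bp_systolic", "bp_diastolic", "heart_rate", "respiratory_rate",
            "temperature", "oxygen_saturation", "med_adherence", "symptom_severity"]

# Tiny synthetic stand-in: 2 patients x 3 days (the real data has 30 days)
rng = np.random.default_rng(0)
records = []
for pid in (1, 2):
    for day in (1, 2, 3):
        row = {"patient_id": pid, "day": day}
        row.update({f: float(rng.uniform()) for f in FEATURES})
        records.append(row)
df = pd.DataFrame(records)

# Sort so each patient's days are contiguous, then stack into (patients, days, features)
df = df.sort_values(["patient_id", "day"])
X = df[FEATURES].to_numpy().reshape(df["patient_id"].nunique(), 3, len(FEATURES))
print(X.shape)  # (2, 3, 8); with the full data this would be (num_patients, 30, 8)
```

    X can then be fed to a Keras SimpleRNN, LSTM, or GRU with input shape (timesteps, features).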

  14. Telecom Consumer Complaints

    • kaggle.com
    Updated May 21, 2020
    Aditya6196 (2020). Telecom Consumer Complaints [Dataset]. https://www.kaggle.com/aditya6196/telecom-consumer-complaints/code
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Aditya6196
    Description

    DESCRIPTION

    Comcast is an American global telecommunication company. The firm has been providing terrible customer service. They continue to fall short despite repeated promises to improve. Only last month (October 2016) the authority fined them $2.3 million after receiving over 1,000 consumer complaints. The existing database will serve as a repository of public customer complaints filed against Comcast and will help pin down what is wrong with Comcast's customer service.

    Data Dictionary

    1. Ticket #: Ticket number assigned to each complaint
    2. Customer Complaint: Description of complaint
    3. Date: Date of complaint
    4. Time: Time of complaint
    5. Received Via: Mode of communication of the complaint
    6. City: Customer city
    7. State: Customer state
    8. Zipcode: Customer zip
    9. Status: Status of complaint
    10. Filing on behalf of someone

    Analysis Task

    To perform these tasks, you can use any of the different Python libraries such as NumPy, SciPy, Pandas, scikit-learn, matplotlib, and BeautifulSoup.

    • Import data into Python environment.
    • Provide the trend chart for the number of complaints at monthly and daily granularity levels.
    • Provide a table with the frequency of complaint types.

    • Which complaint types are maximum, i.e., around internet, network issues, or any other domains.
    • Create a new categorical variable with values Open and Closed: Open & Pending are to be categorized as Open, and Closed & Solved as Closed.
    • Provide state-wise status of complaints in a stacked bar chart. Use the categorized variable from Q3. Provide insights on:
      • Which state has the maximum complaints
      • Which state has the highest percentage of unresolved complaints
    • Provide the percentage of complaints resolved till date which were received through the Internet and customer care calls.
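    The recoding and state-wise tasks above can be sketched in pandas (column names follow the data dictionary; the rows are invented):

```python
import pandas as pd

complaints = pd.DataFrame({
    "State": ["Georgia", "Georgia", "Texas", "Florida"],
    "Status": ["Open", "Pending", "Solved", "Closed"],
})

# Open & Pending -> Open; Closed & Solved -> Closed
complaints["NewStatus"] = complaints["Status"].map(
    {"Open": "Open", "Pending": "Open", "Closed": "Closed", "Solved": "Closed"})

# State-wise counts, ready for a stacked bar chart via by_state.plot(kind="bar", stacked=True)
by_state = pd.crosstab(complaints["State"], complaints["NewStatus"])
unresolved_pct = by_state["Open"] / by_state.sum(axis=1) * 100
print(by_state)
print(unresolved_pct)
```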

  15. Tajweed Dataset

    • kaggle.com
    Updated Apr 6, 2025
    Ala'a Abdu Saleh Alawdi (2025). Tajweed Dataset [Dataset]. https://www.kaggle.com/datasets/alawdisoft/tajweed-dataset
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Ala'a Abdu Saleh Alawdi
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    The provided code processes a Tajweed dataset, which appears to be a collection of audio recordings categorized by different Tajweed rules (Ikhfa, Izhar, Idgham, Iqlab). Let's break down the dataset's structure and the code's functionality:

    Dataset Structure:

    • Organized by Tajweed Rule and Sheikh: The dataset is structured into directories for each Tajweed rule (e.g., 'Ikhfa', 'Izhar'). Within each rule's directory, there are subdirectories representing different reciters (sheikhs). This hierarchical organization is crucial for creating a structured metadata file and for training machine learning models.
    • Audio Files: The audio files (presumably WAV or other supported formats) are stored within the sheikh's subdirectories. The original filenames are not standardized.
    • Multiple Sheikhs per Rule: The dataset includes multiple recitations for each rule from different sheikhs, offering diversity in pronunciation.
    • Google Drive Storage: The dataset is located on Google Drive, which requires mounting the drive to access the data within a Colab environment.

    Code Functionality:

    1. Initialization and Imports: The code begins with necessary imports (pandas, pydub) and mounts Google Drive. Pydub is used for audio file format conversion.

    2. Directory Listing: It initially checks if a specified directory exists (for example, Alaa_alhsri/Ikhfa) and lists its files, demonstrating basic file system access.

    3. Metadata Creation: The core of the script is the generation of metadata, which provides essential information about each audio file. The tajweed_paths dictionary maps each Tajweed rule to a list of paths, associating each path with the reciter's name.

      • Iterating through Paths: The code iterates through each Tajweed rule and its corresponding paths.
      • File Listing: Inside each directory, it iterates through the audio files.
      • Metadata Dictionary: For each audio file, it creates a metadata dictionary that includes:
        • global_id: A unique identifier for each audio file.
        • original_filename: The original filename of the audio file.
        • new_filename: A standardized filename that incorporates the Tajweed rule (label), sheikh's ID, audio number, and a global ID.
        • label: The Tajweed rule.
        • sheikh_id: A numerical identifier for each sheikh.
        • sheikh_name: The name of the reciter.
        • audio_number: A sequential number for the audio files within a specific sheikh and Tajweed rule combination.
        • original_path: Full path to the original audio file.
        • new_path: Full path to the intended location for the renamed and potentially converted audio file.
      • Pandas DataFrame: The metadata is collected in a list of dictionaries and then converted into a Pandas DataFrame for easier viewing and processing. This DataFrame is highly informative.
    4. File Renaming and Conversion:

      • File Renaming: (commented out) The code is able to rename the audio files to the standardized format defined in new_filename and store it in the designated directory.
      • Audio Conversion to WAV: The script then converts any files in the specified directories to .wav format, creating standardized files in a new output_dataset directory. The new filenames are based on rules, sheikh and a counter.
    5. Metadata Export: Finally, the compiled metadata is saved as a CSV file (metadata.csv) in the output directory. This CSV file is crucial for training any machine learning model using this data.
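    The metadata-building loop of step 3 can be sketched as follows. The directory layout and the tajweed_paths structure are assumptions modeled on the description above, the tree is created in a temp folder for illustration, and the pydub conversion of step 4 is omitted:

```python
import tempfile
from pathlib import Path

import pandas as pd

# Stand-in directory tree: <rule>/<sheikh>/<audio files>
root = Path(tempfile.mkdtemp())
for rule in ("Ikhfa", "Izhar"):
    d = root / rule / "Alaa_alhsri"
    d.mkdir(parents=True)
    (d / "rec1.mp3").touch()

# Maps each Tajweed rule to (path, reciter name) pairs, as described
tajweed_paths = {rule: [(root / rule / "Alaa_alhsri", "Alaa_alhsri")]
                 for rule in ("Ikhfa", "Izhar")}

metadata, global_id = [], 0
for label, paths in tajweed_paths.items():
    for sheikh_id, (folder, sheikh_name) in enumerate(paths, start=1):
        for audio_number, f in enumerate(sorted(folder.iterdir()), start=1):
            global_id += 1
            metadata.append({
                "global_id": global_id,
                "original_filename": f.name,
                # Standardized name: rule, sheikh ID, audio number, global ID
                "new_filename": f"{label}_{sheikh_id}_{audio_number}_{global_id}.wav",
                "label": label,
                "sheikh_id": sheikh_id,
                "sheikh_name": sheikh_name,
                "audio_number": audio_number,
                "original_path": str(f),
            })
df = pd.DataFrame(metadata)
print(df[["global_id", "label", "new_filename"]])
```

    In the real script, df.to_csv("metadata.csv") would then produce the exported metadata file of step 5.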

  16. GitHub Commit Messages Dataset

    • kaggle.com
    Updated Apr 21, 2021
    Dhruvil Dave (2021). GitHub Commit Messages Dataset [Dataset]. http://doi.org/10.34740/kaggle/dsv/2143532
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Dhruvil Dave
    License

    Open Data Commons Attribution License (ODC-By) v1.0 (https://www.opendatacommons.org/licenses/by/1.0/)
    License information was derived automatically

    Description


    Image credits: https://github.com

    Introduction

    This is a dataset that contains all commit messages and their related metadata from 34 popular GitHub repositories. These repositories are:

    • tensorflow/tensorflow
    • pytorch/pytorch
    • torvalds/linux
    • python/cpython
    • rust-lang/rust
    • microsoft/TypeScript
    • microsoft/vscode
    • golang/go
    • numpy/numpy
    • scikit-learn/scikit-learn
    • openbsd/src
    • freebsd/freebsd-src
    • pandas-dev/pandas
    • scipy/scipy
    • tidyverse/ggplot2
    • kubernetes/kubernetes
    • postgres/postgres
    • nodejs/node
    • facebook/react
    • angular/angular
    • matplotlib/matplotlib
    • apache/httpd
    • nginx/nginx
    • opencv/opencv
    • ipython/ipython
    • rstudio/rstudio
    • jupyterlab/jupyterlab
    • gcc-mirror/gcc
    • apple/swift
    • denoland/deno
    • apache/spark
    • llvm/llvm-project
    • chromium/chromium
    • v8/v8

    Data as of Wed Apr 21 03:42:44 PM IST 2021

    Credits

    Image credits: Unsplash - plhnk

  17. RAPIDS

    • kaggle.com
    Updated Jun 29, 2021
    Chris Deotte (2021). RAPIDS [Dataset]. https://www.kaggle.com/cdeotte/rapids/tasks
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Chris Deotte
    Description

    Use this dataset to install RAPIDS in Kaggle notebooks. Installation takes 1 minute. Add the following lines of code to your notebook and turn GPU on. Change rapids.21.06 below to the version desired. (Currently v21.06, v0.19, v0.18 and v0.17 are available).

    import sys
    !cp ../input/rapids/rapids.21.06 /opt/conda/envs/rapids.tar.gz
    !cd /opt/conda/envs/ && tar -xzvf rapids.tar.gz > /dev/null
    sys.path = ["/opt/conda/envs/rapids/lib/python3.7/site-packages"] + sys.path
    sys.path = ["/opt/conda/envs/rapids/lib/python3.7"] + sys.path
    sys.path = ["/opt/conda/envs/rapids/lib"] + sys.path 
    !cp /opt/conda/envs/rapids/lib/libxgboost.so /opt/conda/lib/
    

    Read more about RAPIDS here. The RAPIDS libraries allow us to perform all our data science on GPUs including reading data, transforming data, modeling, validation, and prediction. The package cuDF provides Pandas functionality and cuML provides Scikit-learn functionality. Other packages provide additional tools.

    Since GPUs are faster than CPUs, we save time, save money, and can increase model accuracy by performing additional tasks like hyperparameter searches, feature engineering and selection, data augmentation, and ensembling with bagging and boosting.

  18. Coursera AI Global Skills Index 2019 data

    • kaggle.com
    Updated Dec 19, 2019
    Parul Pandey (2019). Coursera AI Global Skills Index 2019 data [Dataset]. https://www.kaggle.com/parulpandey/coursera-ai-global-skills-index-2019-data/kernels
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Parul Pandey
    Description

    Context

    Coursera is an online platform for higher education. The Coursera Global Skills Index (GSI) draws upon this rich data to benchmark 60 countries and 10 industries across Business, Technology, and Data Science skills to reveal skills development trends around the world.

    Content

    Coursera measured the skill proficiency of countries in AI overall and in the related skills of math, machine learning, statistics, statistical programming, and software engineering. These related skills cover the breadth of knowledge needed to build and deploy AI-powered technologies within organizations and society:

    • Math: the theoretical background necessary to conduct and apply AI research
    • Statistics: empirical skills needed to fit and measure the impact of AI models
    • Machine Learning: skills needed to build self-learning models like deep learning and other supervised models that power most AI applications today
    • Statistical Programming: programming skills needed to implement AI models, such as in Python and related packages like scikit-learn and pandas
    • Software Engineering: programming skills needed to design and scale AI-powered applications

    Acknowledgements

  19. Top 100 Canadian Beers

    • kaggle.com
    Updated May 8, 2017
    Sam Wong (2017). Top 100 Canadian Beers [Dataset]. https://www.kaggle.com/shwong/top-100-canadian-beers/code
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Sam Wong
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    Canada
    Description

    This is a dataset created as part of a tutorial on basic web scraping. Please visit One for the Road for the tutorial!

    Introduction

    The top 100 Canadian beers as ranked by visitors to BeerAdvocate.com. This dataset is intended only to help users learn how to scrape web data using BeautifulSoup and turn it into a Pandas dataframe.

    Content

    This dataset lists the top 100 Canadian beers:

    • Rank: rank, from 1 to 100, as rated by BeerAdvocate.com users
    • Name: name of the beer
    • Brewery: the brewery responsible for this delicious creation
    • Style: the style of the beer
    • ABV: Alcohol by Volume (%)
    • Score: Overall score determined by BeerAdvocate.com users
    • Ratings: Number of ratings
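    In the spirit of the tutorial, a minimal BeautifulSoup-to-DataFrame sketch (the HTML fragment below is made up; the real BeerAdvocate page markup differs):

```python
import pandas as pd
from bs4 import BeautifulSoup

# Made-up fragment standing in for the scraped ranking table
html = """
<table>
  <tr><td>1</td><td>Péché Mortel</td><td>Dieu du Ciel!</td></tr>
  <tr><td>2</td><td>Fat Tug IPA</td><td>Driftwood Brewery</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
rows = [[td.get_text() for td in tr.find_all("td")] for tr in soup.find_all("tr")]
df = pd.DataFrame(rows, columns=["Rank", "Name", "Brewery"])
print(df)
```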

    Acknowledgements

    Thanks to all the readers and contributors of BeerAdvocate, selflessly pouring, drinking, and reviewing beers for our benefit.

    Version 2 of this dataset was scraped on 5/08/2017 from https://www.beeradvocate.com/lists/ca/

  20. Kung Fu Panda

    • kaggle.com
    Updated Nov 7, 2017
    Zeeshan-ul-hassan Usmani (2017). Kung Fu Panda [Dataset]. https://www.kaggle.com/datasets/zusmani/kung-fu-panda/code
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Zeeshan-ul-hassan Usmani
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    Do you know what is common among Kung Fu Panda, Alvin and the Chipmunks, Monster Trucks, Trolls, Spongebob Movie and Monster Vs Aliens? They all were scripted by the same authors - Jonathan Aibel and Glenn Berger.

    Kung Fu Panda is a 2008 animated movie by DreamWorks Production. It has made $631 million and is one of DreamWorks' most successful films at the box office.

    There is much talk and discussion about this movie beyond cinema-goers. Some like to learn leadership lessons from it, and a few others try to link it with Christianity, Taoism, Mysticism and Islam.

    I was wondering if we can see the script from a data science perspective and answer some questions with significant implications for the movie and other industries.

    I welcome you all to do Data Science Martial Arts with Kung Fu Panda and see who survives.

    Content

    It’s a complete script of Kung Fu Panda 1 and 2 in CSV format with all background narrations, scene settings and movie dialogues by characters (Po, Master Shifu, Tai Lung, Tigress, Monkey, Viper, Oogway, Mr. Ping, Mantis and Crane).

    Acknowledgements

    Kung Fu Panda is a production by DreamWorks Studios. All scripts were gathered from online public sources like this and this.

    Inspiration

    Some ideas worth exploring:

    • Can we train a neural network to recognize the character by dialogue? For example, given any line from the script, the algorithm should be able to tell who is more likely to say it in the movie.

    • Can we make a word cloud for each character (and perhaps compare it with other movie characters by the same authors and see who is similar to whom)?

    • Can we train a chatbot for Oogway or Po so kids can talk to it and it would respond the same way as Oogway or Po would?

    • Can we calculate the average length of a dialogue?

    • Can we estimate the difficulty level of the vocabulary being used and perhaps compare it with movies of other genres?

    • Can we compare the script with some religious texts and find similarities?
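The first idea can be sketched with scikit-learn; the dialogue lines below are invented placeholders, not quotes from the script:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Invented placeholder lines standing in for the script's dialogue column
lines = ["noodles are my destiny", "I love noodles and dumplings",
         "inner peace", "there are no accidents",
         "noodles noodles", "yesterday is history inner peace"]
speakers = ["Po", "Po", "Oogway", "Oogway", "Po", "Oogway"]

# Bag-of-words (TF-IDF) + naive Bayes: a simple who-said-it baseline
model = make_pipeline(TfidfVectorizer(), MultinomialNB())
model.fit(lines, speakers)
pred = model.predict(["give me more noodles"])[0]
print(pred)
```

With the real script, the dialogue column would replace lines and the character column would replace speakers.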
