20 datasets found

SQL Project
data.cityofnewyork.us
application/rdfxml +5
Updated May 29, 2025
Link copied
Cite
Department of Finance (DOF) (2025). SQL Project [Dataset]. https://data.cityofnewyork.us/City-Government/SQL-Project/hek5-e7qj
Explore at:
Available download formats: json, csv, application/rdfxml, xml, application/rssxml, tsv
Dataset updated
May 29, 2025
Authors
Department of Finance (DOF)
Description
Check out our data lens page for additional data filtering and sorting options: https://data.cityofnewyork.us/view/i4p3-pe6a

This dataset contains Open Parking and Camera Violations issued by the City of New York. Updates will be applied to this data set on the following schedule:

New or open tickets will be updated weekly (Sunday). Tickets satisfied will be updated daily (Tuesday through Sunday). NOTE: Summonses that have been written-off are indicated by blank financials.

Summons images will not be available during scheduled downtime on Sunday - Monday from 1:00 am to 2:30 am and on Sundays from 5:00 am to 10:00 am.

Initial dataset loaded 05/14/2016.
Library Management System SQL Project
kaggle.com
Updated Aug 22, 2024
Cite
Najir 0123 (2024). Library Management System SQL Project [Dataset]. https://www.kaggle.com/datasets/najir0123/library-management-system-sql-project
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Aug 22, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Najir 0123
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
Library Dataset for SQL Project

Watch Full Video -- https://www.youtube.com/watch?v=6X2-P9fNVvw

Project Files -- https://github.com/najirh/Library-System-Management---P2?tab=readme-ov-file
MY SQL DATA CLEANING PROJECT
kaggle.com
Updated Jun 20, 2024
Cite
George M122 (2024). MY SQL DATA CLEANING PROJECT [Dataset]. https://www.kaggle.com/datasets/georgem122/my-sql-data-cleaning-project/data
Explore at:
Croissant
Dataset updated
Jun 20, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
George M122
Description
Dataset

This dataset was created by George M122

Contents
sql-project-img
kaggle.com
Updated Sep 26, 2023
Cite
Luis Lira (2023). sql-project-img [Dataset]. https://www.kaggle.com/datasets/luisliraportfolio/sql-project-img/data
Explore at:
Croissant
Dataset updated
Sep 26, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Luis Lira
Description
Dataset

This dataset was created by Luis Lira

Contents
Student's mental health
kaggle.com
Updated Apr 7, 2024
Cite
Abdallah Nasser (2024). Student's mental health [Dataset]. https://www.kaggle.com/abdallahprogrammer/students-mental-health/discussion
Explore at:
Croissant
Dataset updated
Apr 7, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Abdallah Nasser
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
Dataset

This dataset was created by Abdallah Nasser

Released under Apache 2.0

Contents
Hospital Database Management System SQL Project
kaggle.com
Updated May 9, 2024
Cite
Andrew Dolcimascolo-Garrett (2024). Hospital Database Management System SQL Project [Dataset]. https://www.kaggle.com/datasets/andrewdolcigarrett/hospital-database-management-system-sql-project/versions/1
Explore at:
Croissant
Dataset updated
May 9, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Andrew Dolcimascolo-Garrett
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
Dataset

This dataset was created by Andrew Dolcimascolo-Garrett

Released under MIT

Contents
SQL PROJECT-1 BY JITENDRA KUMAR
kaggle.com
Updated Nov 19, 2023
Cite
Jitendra Kumar (2023). SQL PROJECT-1 BY JITENDRA KUMAR [Dataset]. https://www.kaggle.com/datasets/jktdatascientist/sql-project-1-by-jitendra-kumar/code
Explore at:
Croissant
Dataset updated
Nov 19, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Jitendra Kumar
Description
Dataset

This dataset was created by Jitendra Kumar

Released under Other (specified in description)

Contents
Nashville Housing Data : SQL project
kaggle.com
Updated Oct 9, 2023
Cite
Paragi Jain11 (2023). Nashville Housing Data : SQL project [Dataset]. https://www.kaggle.com/paragijain11/nashville-housing-data-sql-project/discussion
Explore at:
Croissant
Dataset updated
Oct 9, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Paragi Jain11
License
https://creativecommons.org/publicdomain/zero/1.0/
Area covered
Nashville
Description
Dataset

This dataset was created by Paragi Jain11

Released under CC0: Public Domain

Contents
Covid-SQL-project
kaggle.com
Updated Jul 16, 2023
Cite
Emmanuel Chude (2023). Covid-SQL-project [Dataset]. https://www.kaggle.com/datasets/emmanuelchude/sql-project
Explore at:
Croissant
Dataset updated
Jul 16, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Emmanuel Chude
Description
Dataset

This dataset was created by Emmanuel Chude

Contents
SQL-Project: Dataset - Delitos BA 2021
kaggle.com
Updated Sep 26, 2023
Cite
Luis Lira (2023). SQL-Project: Dataset - Delitos BA 2021 [Dataset]. https://www.kaggle.com/datasets/luisliraportfolio/delitos-ba-2021
Explore at:
Croissant
Dataset updated
Sep 26, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Luis Lira
Description
Dataset

This dataset was created by Luis Lira

Contents
S&P 500 Companies Analysis Project
kaggle.com
Updated Apr 6, 2025
Cite
anshadkaggle (2025). S&P 500 Companies Analysis Project [Dataset]. https://www.kaggle.com/datasets/anshadkaggle/s-and-p-500-companies-analysis-project
Explore at:
Croissant
Dataset updated
Apr 6, 2025
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
anshadkaggle
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
This project focuses on analyzing the S&P 500 companies using data analysis tools like Python (Pandas), SQL, and Power BI. The goal is to extract insights related to sectors, industries, locations, and more, and visualize them using dashboards.

Included Files:

sp500_cleaned.csv – Cleaned dataset used for analysis

sp500_analysis.ipynb – Jupyter Notebook (Python + SQL code)

dashboard_screenshot.png – Screenshot of Power BI dashboard

README.md – Summary of the project and key takeaways

This project demonstrates practical data cleaning, querying, and visualization skills.
Cupcake Business - Sales Data Analysis
kaggle.com
Updated Mar 23, 2024
Cite
MITHRA CHANDRAN (2024). Cupcake Business - Sales Data Analysis [Dataset]. http://doi.org/10.34740/kaggle/dsv/7922498
Explore at:
Croissant
Unique identifier
https://doi.org/10.34740/kaggle/dsv/7922498
Dataset updated
Mar 23, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
MITHRA CHANDRAN
Description
This project answers business questions for a cupcake company by analyzing its sales data with SQL. The business wants to know:

Find the unique flavors.

Find the revenue per flavor

Total Revenue for the year 2023

Which month has the highest sales?

Which flavor sells the most during this month?

Which is the most popular flavor?

Which flavor has the most rating?

Is there any relation between rating 5 and revenue?

Top 3 loyal customers

From which city are we getting the most orders?

The database used is PostgreSQL.
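Two of the questions above (revenue per flavor and the month with the highest sales) can be sketched as grouped aggregates. This is only an illustration, not the project's own queries: the listing does not show the dataset's columns, so the `sales` table, its column names, and the sample rows below are all assumptions. It runs on Python's built-in sqlite3 module rather than PostgreSQL, so `strftime` would need to be swapped for a PostgreSQL date function such as `to_char`.

```python
import sqlite3

# Hypothetical rows: (order_date, flavor, quantity, unit_price, city).
# The real dataset's columns are not shown in the listing.
SAMPLE_ROWS = [
    ("2023-01-05", "Vanilla", 2, 3.0, "City A"),
    ("2023-01-20", "Chocolate", 4, 3.5, "City B"),
    ("2023-02-11", "Chocolate", 1, 3.5, "City A"),
    ("2023-02-14", "Red Velvet", 5, 4.0, "City A"),
]


def build_db(rows=SAMPLE_ROWS):
    """Load the sample rows into an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales (order_date TEXT, flavor TEXT, "
        "quantity INTEGER, unit_price REAL, city TEXT)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?, ?)", rows)
    return conn


def revenue_per_flavor(conn):
    """Return (flavor, revenue) pairs, highest revenue first."""
    return conn.execute(
        "SELECT flavor, SUM(quantity * unit_price) AS revenue "
        "FROM sales GROUP BY flavor ORDER BY revenue DESC"
    ).fetchall()


def best_month(conn):
    """Return the (YYYY-MM, revenue) pair with the highest total revenue."""
    return conn.execute(
        "SELECT strftime('%Y-%m', order_date) AS month, "
        "SUM(quantity * unit_price) AS revenue "
        "FROM sales GROUP BY month ORDER BY revenue DESC LIMIT 1"
    ).fetchone()


if __name__ == "__main__":
    conn = build_db()
    print(revenue_per_flavor(conn))
    print(best_month(conn))
```

The "which flavor sells most during this month" question could follow the same pattern by filtering on the month returned by `best_month` before grouping by flavor.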
Sales Data Analysis Using MySQL, Excel & Power BI
kaggle.com
Updated Mar 15, 2025
Cite
pooja Career (2025). Sales Data Analysis Using MySQL, Excel & Power BI [Dataset]. https://www.kaggle.com/datasets/poojacareer/sales-data-analysis-using-mysql-excel-and-power-bi
Explore at:
Croissant
Dataset updated
Mar 15, 2025
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
pooja Career
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
📊 Sales Data Analysis Using MySQL, Excel & Power BI

🔍 Project Overview
This project focuses on analyzing sales data to extract valuable insights, identify trends, and support business decision-making. Using MySQL for querying, Excel for data manipulation, and Power BI for visualization, we explore key sales performance metrics.

🛠 Tools Used
✅ MySQL – Data storage, cleaning, and analysis using SQL queries.
✅ Excel – Data preprocessing, pivot tables, and basic visualization.
✅ Power BI – Interactive dashboards for advanced data visualization.

📂 Dataset Information
Source: Kaggle Superstore Sales Dataset
Data Size: 10,000+ records
Key Features: Sales, Customer Details, Ship Mode, Product Category, Region

📌 Key Business Questions Answered

1️⃣ What are the top-performing sales regions?
✅ Used Power BI Map Visualization to analyze sales distribution by region.
✅ Key Insight: The highest sales were recorded in the West & East regions, while some regions showed potential for improvement.

2️⃣ Which product categories drive the highest revenue?
✅ Used Excel Pivot Tables to aggregate Sales by Category.
✅ Observation: "Technology" products had the highest sales, followed by "Furniture" and "Office Supplies."

3️⃣ Who are the top 10 customers by sales volume?
✅ Extracted top customers using SQL Queries & Power BI Ranking Functions.
✅ Business Insight: Retaining these customers can significantly boost revenue.

4️⃣ Which are the top 5 best-selling products?
✅ Aggregated product sales using MySQL SUM() function.
✅ Result: High-demand products identified, helping in inventory planning.

5️⃣ How does shipping mode affect sales?
✅ Created Power BI Slicer & Bar Chart for Ship Mode Analysis.
✅ Finding: Standard Class was the most used, while Same-Day shipping had lower but high-value orders.
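The "top 5 best-selling products" step, which the description says used SUM(), might look roughly like the sketch below. It is not the author's query: the `orders` table and its `product_name` and `sales` columns are hypothetical names, and the code runs on Python's built-in sqlite3 module instead of MySQL, although this SELECT uses only syntax both engines accept.

```python
import sqlite3


def top_products(rows, limit=5):
    """Return the `limit` products with the highest summed sales.

    `rows` is an iterable of (product_name, sales) tuples. The table and
    column names are illustrative assumptions, not the dataset's schema.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (product_name TEXT, sales REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    query = """
        SELECT product_name, SUM(sales) AS total_sales
        FROM orders
        GROUP BY product_name
        ORDER BY total_sales DESC
        LIMIT ?
    """
    result = conn.execute(query, (limit,)).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    sample = [("Chair", 100.0), ("Desk", 250.0), ("Chair", 200.0), ("Lamp", 50.0)]
    print(top_products(sample, limit=2))
```

The same grouping, keyed on a customer column instead of the product name, would answer the top 10 customers question.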

📊 Power BI Dashboard Overview
🔹 Sales by Region – Geographical performance map
🔹 Top 10 Customers – Key customers contributing to revenue
🔹 Category & Sales – Identifying best-performing categories
🔹 Top 5 Products – Sales contribution by product
🔹 Shipping Mode Impact – Analyzing customer shipping preferences

📈 Business Insights & Recommendations
📌 Optimize Marketing Efforts – Focus more on high-performing regions.
📌 Inventory Management – Maintain high stock levels for top-selling products.
📌 Customer Retention Strategies – Prioritize personalized marketing for top customers.
📌 Improve Shipping Efficiency – Explore cost-effective shipping options for increased profitability.

📢 Why This Project?
This project helped me strengthen my SQL querying skills, enhance Excel data manipulation, and build Power BI dashboards for professional data storytelling.

💡 Next Steps: Expanding analysis with predictive analytics & machine learning.

📎 Project Files & Resources
📂 Dataset – Available on Kaggle
📊 Power BI Dashboard – Shared in project files
📜 SQL Queries & Excel Reports – Available for reference

🚀 Let's Connect!
👨‍💻 LinkedIn – www.linkedin.com/in/pooja-akash-lohkare-62a6a5b6

📧 Contact – poojacareer789@gmail.com

If you found this useful, upvote & comment with your feedback! 🙌
moved_project_sql_result_01
kaggle.com
Updated Oct 18, 2023
Cite
José Francisco Lara Cárdemas (2023). moved_project_sql_result_01 [Dataset]. https://www.kaggle.com/datasets/josephfaster/moved-project-sql-result-01-csv
Explore at:
Croissant
Dataset updated
Oct 18, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
José Francisco Lara Cárdemas
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Dataset

This dataset was created by José Francisco Lara Cardenas

Released under CC0: Public Domain

Contents
Bellabeat Case Study using SQL and Tableau
kaggle.com
Updated Oct 22, 2023
Cite
Ragini1 (2023). Bellabeat Case Study using SQL and Tableau [Dataset]. https://www.kaggle.com/ragini1/bellabeat-case-study-using-sql-and-tableau/discussion
Explore at:
Croissant
Dataset updated
Oct 22, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Ragini1
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Dataset

This dataset was created by Ragini1

Released under CC0: Public Domain

Contents
AW2019 Sales Overview
kaggle.com
Updated Feb 5, 2025
Cite
Xavier Berge (2025). AW2019 Sales Overview [Dataset]. https://www.kaggle.com/datasets/xavierberge/aw2019-sales-overview
Explore at:
Croissant
Dataset updated
Feb 5, 2025
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Xavier Berge
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Dataset extracted from the 2019 Adventure Works database. 4 files:

- Dimension Calendar
- Dimension Customer
- Dimension Product
- Fact Internet Sales

All tables are used in the SQL project attached to the dataset.
Healthcare Management System
kaggle.com
Updated Dec 23, 2023
Cite
Anouska Abhisikta (2023). Healthcare Management System [Dataset]. https://www.kaggle.com/datasets/anouskaabhisikta/healthcare-management-system
Explore at:
Croissant
Dataset updated
Dec 23, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Anouska Abhisikta
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
Patients Table:

PatientID: Unique identifier for each patient.

firstname: First name of the patient.

lastname: Last name of the patient.

email: Email address of the patient.

This table stores information about individual patients, including their names and contact details.

Doctors Table:

DoctorID: Unique identifier for each doctor.

DoctorName: Full name of the doctor.

Specialization: Area of medical specialization.

DoctorContact: Contact details of the doctor.

This table contains details about healthcare providers, including their names, specializations, and contact information.

Appointments Table:

AppointmentID: Unique identifier for each appointment.

Date: Date of the appointment.

Time: Time of the appointment.

PatientID: Foreign key referencing the Patients table, indicating the patient for the appointment.

DoctorID: Foreign key referencing the Doctors table, indicating the doctor for the appointment.

This table records scheduled appointments, linking patients to doctors.

MedicalProcedure Table:

ProcedureID: Unique identifier for each medical procedure.

ProcedureName: Name or description of the medical procedure.

AppointmentID: Foreign key referencing the Appointments table, indicating the appointment associated with the procedure.

This table stores details about medical procedures associated with specific appointments.

Billing Table:

InvoiceID: Unique identifier for each billing transaction.

PatientID: Foreign key referencing the Patients table, indicating the patient for the billing transaction.

Items: Description of items or services billed.

Amount: Amount charged for the billing transaction.

This table maintains records of billing transactions, associating them with specific patients.

demo Table:

ID: Primary key, serves as a unique identifier for each record.

Name: Name of the entity.

Hint: Additional information or hint about the entity.

This table appears to be a demonstration or testing table, possibly unrelated to the healthcare management system.

This dataset schema is designed to capture comprehensive information about patients, doctors, appointments, medical procedures, and billing transactions in a healthcare management system. Adjustments can be made based on specific requirements, and additional attributes can be included as needed.
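The table descriptions above could be expressed as DDL along the following lines. This is a sketch under assumptions: the listing gives column names and key relationships but no data types, so every type here is a guess, the demo table is omitted, and SQLite (through Python's built-in sqlite3 module) stands in for whatever database the author used. The helper names `build_db` and `procedures_by_doctor` are hypothetical.

```python
import sqlite3

# Column names follow the description; the data types are assumptions.
SCHEMA = """
CREATE TABLE Patients (
    PatientID INTEGER PRIMARY KEY,
    firstname TEXT,
    lastname  TEXT,
    email     TEXT
);
CREATE TABLE Doctors (
    DoctorID       INTEGER PRIMARY KEY,
    DoctorName     TEXT,
    Specialization TEXT,
    DoctorContact  TEXT
);
CREATE TABLE Appointments (
    AppointmentID INTEGER PRIMARY KEY,
    "Date"        TEXT,
    "Time"        TEXT,
    PatientID     INTEGER REFERENCES Patients(PatientID),
    DoctorID      INTEGER REFERENCES Doctors(DoctorID)
);
CREATE TABLE MedicalProcedure (
    ProcedureID   INTEGER PRIMARY KEY,
    ProcedureName TEXT,
    AppointmentID INTEGER REFERENCES Appointments(AppointmentID)
);
CREATE TABLE Billing (
    InvoiceID INTEGER PRIMARY KEY,
    PatientID INTEGER REFERENCES Patients(PatientID),
    Items     TEXT,
    Amount    REAL
);
"""


def build_db():
    """Create the schema in memory with foreign-key enforcement switched on."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
    conn.executescript(SCHEMA)
    return conn


def procedures_by_doctor(conn):
    """List (doctor, patient first name, procedure) by following the foreign keys."""
    return conn.execute(
        """
        SELECT d.DoctorName, p.firstname, mp.ProcedureName
        FROM MedicalProcedure AS mp
        JOIN Appointments AS a ON a.AppointmentID = mp.AppointmentID
        JOIN Doctors AS d ON d.DoctorID = a.DoctorID
        JOIN Patients AS p ON p.PatientID = a.PatientID
        ORDER BY mp.ProcedureID
        """
    ).fetchall()
```

Because MedicalProcedure references only the appointment, reaching the doctor or patient for a procedure takes the two extra joins shown in `procedures_by_doctor`.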
Super store
kaggle.com
Updated Feb 13, 2024
Cite
Somayeh Sahebi (2024). Super store [Dataset]. https://www.kaggle.com/datasets/somayehsahebi/super-store/code
Explore at:
Croissant
Dataset updated
Feb 13, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Somayeh Sahebi
Description
Super Store Analytics with SQL and Looker Studio

I am excited to share a project that I recently completed, focusing on comprehensive analytics for a superstore using SQL and visualizations crafted in Looker Studio. This project aimed to enhance decision-making processes by leveraging robust data analysis and interactive visualizations. All the data presented is expressed in thousands.

Key Components:

Query Optimization: Leveraging the power of SQL (DBeaver, Postgres), I implemented optimized queries to extract meaningful insights from the vast dataset. This involved employing aggregate functions, joins, and subqueries to retrieve specific information such as sales trends, customer behaviors, and inventory management.

Looker Studio Visualizations: To provide a user-friendly interface for data exploration, Looker Studio was employed to create interactive and insightful visualizations. Dashboards were crafted to offer a holistic view of superstore performance, enabling stakeholders to identify patterns, trends, and areas for improvement.

Project Achievements:

Enhanced data-driven decision-making processes for the superstore management team.

Improved operational efficiency through insights into product performance, customer preferences, and inventory management.

Provided a scalable and adaptable solution for ongoing analytics and reporting needs.

Data Description:

Order_ID (integer): Unique identifier for each order.
Order_Date (date): Date when the order was placed.
Ship_Date (date): Date when the order was shipped.
Interval (day) (integer): Number of days between order placement and shipment.
Ship_Mode (string): Shipping method chosen for the order.
Customer_ID (integer): Unique identifier for each customer.
Customer_Name (string): Name of the customer.
Segment (string): Customer segmentation.
Country (string): Country where the order was placed.
City (string): City where the order was placed.
State (string): State where the order was placed.
Postal_Code (string): Postal code of the order location.
Region (string): Geographical region of the order.
Product_ID (integer): Unique identifier for each product.
Category (string): Product category.
Sub_Category (string): Product sub-category.
Product_Name (string): Name of the product.
Sales (float): Sales amount for the order.
Quantity (integer): Quantity of products in the order.
Discount (float): Discount applied to the order.
Profit (float): Profit generated from the order.
Returned (string): Indicates whether the order was returned (Yes/No).
Person (string): Customer categorization.
Region (string): Geographic region associated with the customer.

SELECT o."Order_ID", o."Order_Date", o."Ship_Date", o."Ship_Mode", o."Segment",
       o."Country", o."City", o."State", o."Postal_Code", o."Region",
       o."Product_ID", o."Category", o."Sub_Category", o."Product_Name",
       o."Sales", o."Quantity", o."Discount", o."Profit", p."Person",
       CASE WHEN r."Returned" = 'yes' THEN 'yes' ELSE 'No' END AS "Returned",
       min("Sales" - "Profit") AS Total_cost,
       min("Sales" / ("Quantity" * (1 - "Discount"))) AS price_per_unit
FROM orders o
JOIN "Return" r ON r."Order_ID" = o."Order_ID"
JOIN people p ON p."Region" = o."Region"
GROUP BY o."Order_ID", o."Order_Date", o."Ship_Date", o."Ship_Mode", o."Segment",
         o."Country", o."City", o."State", o."Postal_Code", o."Region",
         o."Product_ID", o."Category", o."Sub_Category", o."Product_Name",
         o."Sales", o."Quantity", o."Discount", "Returned", o."Profit", p."Person"
ORDER BY Total_cost, price_per_unit DESC;
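One thing to check in the query above: an inner JOIN against the returns table keeps only orders that have a row there, so the ELSE 'No' branch of the CASE would rarely apply. If the intent is to flag every order, a LEFT JOIN may be closer. The sketch below illustrates that variant along with the two derived columns, total cost (Sales - Profit) and price per unit (Sales / (Quantity * (1 - Discount))). It is an assumption-laden illustration rather than a correction of the author's query: the table and column names are simplified and hypothetical, and it runs on Python's built-in sqlite3 module, not Postgres.

```python
import sqlite3

# Simplified, hypothetical tables standing in for the orders and returns data.
QUERY = """
SELECT o.order_id,
       CASE WHEN r.returned = 'Yes' THEN 'Yes' ELSE 'No' END AS returned,
       o.sales - o.profit AS total_cost,
       o.sales / (o.quantity * (1 - o.discount)) AS price_per_unit
FROM orders AS o
LEFT JOIN returns AS r ON r.order_id = o.order_id
ORDER BY o.order_id
"""


def order_metrics(orders, returns):
    """Return (order_id, returned, total_cost, price_per_unit) for every order.

    orders:  iterable of (order_id, sales, quantity, discount, profit)
    returns: iterable of (order_id, returned)
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (order_id INTEGER, sales REAL, "
        "quantity INTEGER, discount REAL, profit REAL)"
    )
    conn.execute("CREATE TABLE returns (order_id INTEGER, returned TEXT)")
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?, ?)", orders)
    conn.executemany("INSERT INTO returns VALUES (?, ?)", returns)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    sample_orders = [(1, 100.0, 2, 0.0, 20.0), (2, 90.0, 3, 0.25, 30.0)]
    sample_returns = [(1, "Yes")]
    for row in order_metrics(sample_orders, sample_returns):
        print(row)
```

With one row per order and no aggregation, the GROUP BY and min() wrappers are not needed in this variant.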

Questions:

What specific factors contributed to the fluctuations in average total cost between 2014 and 2017?

Can you identify any outliers or anomalies in the total cost data during this period?

What were the key drivers behind the increase in average profit from 2014 to 2017?

Are there any specific categories or sub-categories that experienced a significant change in profit?

Which categories or products contributed the most to the growth in average sales from 2014 to 2017?

Did any specific periods within this timeframe witness a notable spike or decline in sales?

Can you provide more detailed insights into the technology category's profitability, especially regarding its near-zero profit in 2016?

What other categories exhibited distinct patterns in the histogram analysis?

What factors contributed to the cost discrepancy between average sales and average total cost for furniture...
Healthcare Fraud Detection Dataset
kaggle.com
Updated Mar 6, 2025
Cite
Vishal Jaiswal (2025). Healthcare Fraud Detection Dataset [Dataset]. https://www.kaggle.com/datasets/jaiswalmagic1/healthcare-fraud-detection-dataset/code
Explore at:
Croissant
Dataset updated
Mar 6, 2025
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Vishal Jaiswal
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
This dataset contains comprehensive synthetic healthcare data designed for fraud detection analysis. It includes information on patients, healthcare providers, insurance claims, and payments. The dataset is structured to mimic real-world healthcare transactions, where fraudulent activities such as false claims, overbilling, and duplicate charges can be identified through advanced analytics.

The dataset is suitable for practicing SQL queries, exploratory data analysis (EDA), machine learning for fraud detection, and visualization techniques. It is designed to help data analysts and data scientists develop and refine their analytical skills in the healthcare insurance domain.

Dataset Overview

The dataset consists of four CSV files:

Patients Data (patients.csv)

Contains demographic details of patients, such as age, gender, insurance type, and location. Can be used to analyze patient demographics and healthcare usage patterns.

Providers Data (providers.csv)

Contains information about healthcare providers, including provider ID, specialty, location, and associated hospital. Useful for identifying fraudulent claims linked to specific providers or hospitals.

Claims Data (claims.csv)

Contains records of insurance claims made by patients, including diagnosis codes, treatment details, provider ID, and claim amount. Can be analyzed for suspicious patterns, such as excessive claims from a single provider or duplicate claims for the same patient.

Payments Data (payments.csv)

Contains details of claim payments made by insurance companies, including payment amount, claim ID, and reimbursement status. Helps in detecting discrepancies between claims and actual reimbursements.

Possible Analysis Ideas

This dataset allows for multiple analysis approaches, including but not limited to:

🔹 Fraud Detection: Identify patterns in claims data to detect fraudulent activities (e.g., excessive billing, duplicate claims).
🔹 Provider Behavior Analysis: Analyze providers who have an unusually high claim volume or high rejection rates.
🔹 Payment Trends: Compare claims vs. payments to find irregularities in reimbursement patterns.
🔹 Patient Demographics & Utilization: Explore which patient groups are more likely to file claims and receive reimbursements.
🔹 SQL Query Practice: Perform advanced SQL queries, including joins, aggregations, window functions, and subqueries, to extract insights from the data.

Use Cases

Practicing SQL queries for job interviews and real-world projects.
Learning data cleaning, data wrangling, and feature engineering for healthcare analytics.
Applying machine learning techniques for fraud detection.
Gaining insights into the healthcare insurance domain and its challenges.

License & Usage

License: CC0 Public Domain (Free to use for any purpose).
Attribution: Not required but appreciated.
Intended Use: This dataset is for educational and research purposes only.

This dataset is an excellent resource for aspiring data analysts, data scientists, and SQL learners who want to gain hands-on experience in healthcare fraud detection.
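As an illustration of two of the analysis ideas described for this dataset (duplicate claims, and mismatches between claims and payments), here is a minimal sketch. The column names (`claim_id`, `patient_id`, `provider_id`, `diagnosis_code`, `claim_amount`, `payment_amount`) are assumptions based on the file descriptions, not the actual CSV headers, and the rows are invented. It uses Python's built-in sqlite3 module.

```python
import sqlite3


def load(claims, payments):
    """Load hypothetical claims and payments rows into an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE claims (claim_id INTEGER, patient_id TEXT, "
        "provider_id TEXT, diagnosis_code TEXT, claim_amount REAL)"
    )
    conn.execute("CREATE TABLE payments (claim_id INTEGER, payment_amount REAL)")
    conn.executemany("INSERT INTO claims VALUES (?, ?, ?, ?, ?)", claims)
    conn.executemany("INSERT INTO payments VALUES (?, ?)", payments)
    return conn


def duplicate_claims(conn):
    """Groups of claims sharing patient, provider, diagnosis, and amount."""
    return conn.execute(
        """
        SELECT patient_id, provider_id, diagnosis_code, claim_amount,
               COUNT(*) AS n_claims
        FROM claims
        GROUP BY patient_id, provider_id, diagnosis_code, claim_amount
        HAVING COUNT(*) > 1
        """
    ).fetchall()


def overpaid_claims(conn):
    """Claims whose payment exceeds the amount claimed."""
    return conn.execute(
        """
        SELECT c.claim_id, c.claim_amount, p.payment_amount
        FROM claims AS c
        JOIN payments AS p ON p.claim_id = c.claim_id
        WHERE p.payment_amount > c.claim_amount
        ORDER BY c.claim_id
        """
    ).fetchall()
```

Treating identical patient, provider, diagnosis, and amount as a duplicate is one possible definition. A real analysis would likely also compare claim dates, which this sketch leaves out.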
Computer Science Students Career Prediction
kaggle.com
Updated Jul 16, 2024
Cite
Rugved Patil (2024). Computer Science Students Career Prediction [Dataset]. https://www.kaggle.com/datasets/devildyno/computer-science-students-career-prediction/discussion
Explore at:
Croissant
Dataset updated
Jul 16, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Rugved Patil
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
Computer Science Students Dataset

This dataset contains information about computer science students from a fictional university. It includes attributes such as Student ID, Name, Gender, Age, GPA, Major, Interested Domain, Projects undertaken, and skills in Python, SQL, and Java. The dataset aims to provide insights into the academic performance, career aspirations, and technical skills of students in the field of computer science.

Columns:

Student ID: Unique identifier for each student.
Name: Name of the student.
Gender: Gender of the student.
Age: Age of the student.
GPA: Grade Point Average of the student.
Major: Field of study within computer science.
Interested Domain: Area of interest within the field of computer science.
Projects: Noteworthy projects completed by the student.
Python: Proficiency level in Python programming.
SQL: Proficiency level in SQL querying.
Java: Proficiency level in Java programming.
Future Career: Intended career path or job aspiration (target variable).

Purpose: This dataset is suitable for tasks such as predictive modeling to understand factors influencing career choices in computer science students. The "Future Career" column serves as the target variable for classification tasks. Researchers, educators, and data enthusiasts can utilize this dataset for various educational and analytical purposes in the realm of computer science education and career planning.