Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This is a simulated dataset exploring how lifestyle habits affect academic performance in students. With 1,000 synthetic student records and 15+ features including study hours, sleep patterns, social media usage, diet quality, mental health, and final exam scores, it’s perfect for ML projects, regression analysis, clustering, and data viz. Created using realistic patterns for educational practice.
Ever wondered how much Netflix, sleep, or TikTok scrolling affects your grades? 👀 This dataset simulates 1,000 students' daily habits—from study time to mental health—and compares them to final exam scores. It's like spying on your GPA through the lens of lifestyle. Perfect for EDA, ML practice, or just vibing with data while pretending to be productive.
https://creativecommons.org/publicdomain/zero/1.0/
This synthetic dataset simulates the academic and lifestyle behaviors of 80,000 students, including diverse features like study habits, mental health, family background, motivation, and environmental factors. The goal is to explore how different variables affect student performance in terms of GPA and exam scores.
- student_id: Unique student identifier.
- age: Age of the student (16–28).
- gender: Male, Female, or Other.
- major: Field of study (e.g., Computer Science, Engineering, Arts).
- study_hours_per_day: Average hours studied daily.
- social_media_hours, netflix_hours, screen_time: Time spent on various screens.
- part_time_job: Whether the student has a job (Yes/No).
- attendance_percentage: Academic attendance in percentage.
- sleep_hours, exercise_frequency, diet_quality: Lifestyle factors.
- mental_health_rating, stress_level, exam_anxiety_score: Psychological indicators (1–10).
- extracurricular_participation, access_to_tutoring: Support and engagement.
- family_income_range, parental_support_level, parental_education_level: Background and support.
- motivation_level, time_management_score: Self-management skills (1–10).
- learning_style: Preferred method of learning.
- study_environment: Common location for studying.
- dropout_risk: Yes/No, derived from stress and motivation levels.
- previous_gpa, exam_score: Target performance indicators.

The dataset was synthetically generated using Python with realistic statistical modeling, Gaussian distributions, conditional logic, and heuristics to simulate actual student behavior and academic outcomes.

Key points:
- Realistic distributions for study hours, stress, and motivation.
- Exam score derived from GPA + noise.
- GPA computed based on study hours, sleep, stress, motivation, support, and tutoring.
- Diversity in majors, income, and support levels.
For full generation logic, see the associated code [below or in GitHub].
This is a synthetic dataset intended for research and educational purposes only. It does not contain any real student data.
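The author's generation code is not reproduced in this listing. Purely as an illustration of the kind of pipeline the key points describe (GPA from habits and support, exam score from GPA plus noise, dropout risk from stress and motivation), a minimal sketch might look like the following. The function name `simulate_student`, every coefficient, every range, and the dropout rule are hypothetical placeholders, not the author's actual logic.

```python
import random

def clamp(value, low, high):
    """Keep a value inside the closed interval [low, high]."""
    return max(low, min(high, value))

def simulate_student(rng):
    """Generate one synthetic record. All numbers below are illustrative assumptions."""
    study_hours = clamp(rng.gauss(3.5, 1.5), 0.0, 12.0)   # Gaussian, then clipped
    sleep_hours = clamp(rng.gauss(7.0, 1.2), 3.0, 12.0)
    stress = rng.randint(1, 10)
    motivation = rng.randint(1, 10)
    support = rng.randint(1, 10)
    tutoring = rng.choice(["Yes", "No"])

    # GPA from study hours, sleep, stress, motivation, support, and tutoring
    # (hypothetical weights chosen only to give plausible signs).
    gpa = (
        1.8
        + 0.15 * study_hours
        + 0.05 * (sleep_hours - 7.0)
        - 0.04 * stress
        + 0.05 * motivation
        + 0.02 * support
        + (0.1 if tutoring == "Yes" else 0.0)
    )
    gpa = clamp(gpa, 0.0, 4.0)

    # Exam score derived from GPA plus Gaussian noise.
    exam_score = clamp(gpa * 25.0 + rng.gauss(0.0, 5.0), 0.0, 100.0)

    # Conditional logic for dropout risk (hypothetical thresholds).
    dropout_risk = "Yes" if stress >= 8 and motivation <= 3 else "No"

    return {
        "study_hours_per_day": round(study_hours, 1),
        "sleep_hours": round(sleep_hours, 1),
        "stress_level": stress,
        "motivation_level": motivation,
        "parental_support_level": support,
        "access_to_tutoring": tutoring,
        "previous_gpa": round(gpa, 2),
        "exam_score": round(exam_score, 1),
        "dropout_risk": dropout_risk,
    }

rng = random.Random(42)  # fixed seed so the sketch is reproducible
students = [simulate_student(rng) for _ in range(1000)]
```

Clipping after sampling keeps values in a believable range at the cost of small spikes at the bounds; a real generator may handle this differently.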
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains information about students' academic performance, study habits, and external factors affecting their final exam scores. It is designed for predictive modeling, data visualization, and educational analytics.
This dataset is useful for:
- Predicting student final exam scores 📈
- Identifying key factors that impact academic performance 🎯
- Exploring feature importance in education-related datasets 📊
- Building machine learning models for regression and classification 🤖
| Column Name | Description |
|---|---|
| Student_ID | Unique identifier for each student. |
| Gender | Gender of the student (Male/Female). |
| Study_Hours_per_Week | Average number of study hours per week. |
| Attendance_Rate | Attendance percentage (50% - 100%). |
| Past_Exam_Scores | Average score of previous exams (50 - 100). |
| Parental_Education_Level | Education level of parents (High School, Bachelors, Masters, PhD). |
| Internet_Access_at_Home | Whether the student has internet access at home (Yes/No). |
| Extracurricular_Activities | Whether the student participates in extracurricular activities (Yes/No). |
| Final_Exam_Score (Target) | The final exam score of the student (50 - 100, integer values). |
| Pass_Fail (Target) | The student status (Pass/Fail). |
This dataset is open for public use. Feel free to use it for learning, research, and model-building! 🚀
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset provides a comprehensive view of student performance and learning behavior, integrating academic, demographic, behavioral, and psychological factors.
It was created by merging two publicly available Kaggle datasets, resulting in a unified dataset of 14,003 student records with 16 attributes. All entries are anonymized, with no personally identifiable information.
- StudyHours, Attendance, Extracurricular, AssignmentCompletion, OnlineCourses, Discussions
- Resources, Internet, EduTech
- Motivation, StressLevel
- Gender, Age (18–30 years)
- LearningStyle
- ExamScore, FinalGrade

The dataset can be used for:
- … ExamScore, FinalGrade)

The dataset was analyzed in Python using:
- … LearningStyle categories & extracting insights for adaptive learning

merged_dataset.csv → 14,003 rows × 16 columns
Includes student demographics, behaviors, engagement, learning styles, and performance indicators.

This dataset is an excellent playground for educational data mining, from clustering and behavioral analytics to predictive modeling and personalized learning applications.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
📝 Description: This dataset consists of 100 student records generated using a custom Python script. Each entry captures various academic and demographic features such as:
Student Name
Gender
Age
Study Hours
Attendance Percentage
Grades (Midterm, Final, and Average)
Parental Education
Internet Access
Extra-Curricular Participation
📌 Use Cases:
Machine Learning model training (e.g., grade prediction)
Data preprocessing and cleaning exercises
Exploratory data analysis and visualization
Educational dashboards and Power BI demonstrations
💡 Ideal for students and educators aiming to practice regression, classification, data cleaning, and insight generation using a compact, readable dataset.
📂 Note: The dataset is balanced for gender and contains a variety of typical academic performance patterns.
https://creativecommons.org/publicdomain/zero/1.0/
The dataset provides valuable insights into the demographics and gaming habits of students, capturing various attributes that could be analyzed to uncover meaningful correlations. Each entry in the dataset includes the student's gender, identified as either "Female" or "Male," along with a unique school code that serves as an identifier for each institution. This allows for the categorization and grouping of students based on their respective schools, enabling comparative analyses across different educational institutions.
One of the key aspects covered in the dataset is the student's gaming experience, which includes the number of years they have been playing games. This attribute can indicate whether gaming is a long-term habit or a relatively new activity for the student. Additionally, the dataset records how frequently students engage in gaming, likely measured on a scale from 1 to 5, providing a quantitative representation of their gaming intensity. To further elaborate on gaming engagement, the dataset also tracks the average number of hours a student spends playing games daily. This metric can be crucial in understanding whether extended gaming sessions have an impact on academic performance. Moreover, the dataset distinguishes whether a student actively plays games or not, which can be particularly useful in comparative studies assessing the behaviors of gaming versus non-gaming students.
Beyond gaming habits, the dataset delves into socioeconomic factors by including the annual income of the student's family. This "Parent Revenue" variable can help researchers examine the potential influence of economic background on a student's gaming behavior and academic performance. Additionally, the education levels of both the student's father and mother are recorded, offering insights into whether parental education has any correlation with the student's gaming frequency, academic performance, or gaming choices.
Academic performance is another critical component of this dataset, represented by the "Grade" variable, which provides a measure of the student's academic standing. This information can be instrumental in investigating how gaming habits, parental background, and socioeconomic status contribute to or hinder academic success.
This dataset presents an excellent opportunity for analysis on platforms like Kaggle. Potential research directions could include exploring the relationship between gaming frequency and academic performance, investigating whether students from higher-income families spend more or fewer hours gaming, or analyzing if parental education has any impact on the types of games students play or their duration of play. By leveraging this dataset, researchers can identify trends and generate insights that may inform policies on gaming habits, parental involvement, and educational strategies.
https://creativecommons.org/publicdomain/zero/1.0/
This dataset contains academic and behavioral information of high-school students from Grade 9 to Grade 12. It focuses on understanding the relationship between study hours, attendance, exam performance, and technology-supported learning. The dataset also highlights students’ learning preferences and the influence of smartphone usage on their education.
Researchers, data analysts, machine learning practitioners, and education strategists can explore:
Factors that drive academic success
Predictive modeling of student performance
Impact of online vs offline learning
Correlation between digital habits and grades
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
For what purpose was the dataset created?
The purpose is to predict students' end-of-term performances using ML techniques.
Additional Information
Questions 1-10 are personal questions, questions 11-16 are family questions, and the remaining questions cover education habits.
Class Labels
Student ID
1- Student Age (1: 18-21, 2: 22-25, 3: above 26)
2- Sex (1: female, 2: male)
3- Graduated high-school type: (1: private, 2: state, 3: other)
4- Scholarship type: (1: None, 2: 25%, 3: 50%, 4: 75%, 5: Full)
5- Additional work: (1: Yes, 2: No)
6- Regular artistic or sports activity: (1: Yes, 2: No)
7- Do you have a partner: (1: Yes, 2: No)
8- Total salary if available (1: USD 135-200, 2: USD 201-270, 3: USD 271-340, 4: USD 341-410, 5: above 410)
9- Transportation to the university: (1: Bus, 2: Private car/taxi, 3: bicycle, 4: Other)
10- Accommodation type in Cyprus: (1: rental, 2: dormitory, 3: with family, 4: Other)
11- Mothers’ education: (1: primary school, 2: secondary school, 3: high school, 4: university, 5: MSc., 6: Ph.D.)
12- Fathers’ education: (1: primary school, 2: secondary school, 3: high school, 4: university, 5: MSc., 6: Ph.D.)
13- Number of sisters/brothers (if available): (1: 1, 2: 2, 3: 3, 4: 4, 5: 5 or above)
14- Parental status: (1: married, 2: divorced, 3: died - one of them or both)
15- Mothers’ occupation: (1: retired, 2: housewife, 3: government officer, 4: private sector employee, 5: self-employment, 6: other)
16- Fathers’ occupation: (1: retired, 2: government officer, 3: private sector employee, 4: self-employment, 5: other)
17- Weekly study hours: (1: None, 2: <5 hours, 3: 6-10 hours, 4: 11-20 hours, 5: more than 20 hours)
18- Reading frequency (non-scientific books/journals): (1: None, 2: Sometimes, 3: Often)
19- Reading frequency (scientific books/journals): (1: None, 2: Sometimes, 3: Often)
20- Attendance to the seminars/conferences related to the department: (1: Yes, 2: No)
21- Impact of your projects/activities on your success: (1: positive, 2: negative, 3: neutral)
22- Attendance to classes (1: always, 2: sometimes, 3: never)
23- Preparation to midterm exams 1: (1: alone, 2: with friends, 3: not applicable)
24- Preparation to midterm exams 2: (1: closest date to the exam, 2: regularly during the semester, 3: never)
25- Taking notes in classes: (1: never, 2: sometimes, 3: always)
26- Listening in classes: (1: never, 2: sometimes, 3: always)
27- Discussion improves my interest and success in the course: (1: never, 2: sometimes, 3: always)
28- Flip-classroom: (1: not useful, 2: useful, 3: not applicable)
29- Cumulative grade point average in the last semester (/4.00): (1: <2.00, 2: 2.00-2.49, 3: 2.50-2.99, 4: 3.00-3.49, 5: above 3.49)
30- Expected Cumulative grade point average in the graduation (/4.00): (1: <2.00, 2: 2.00-2.49, 3: 2.50-2.99, 4: 3.00-3.49, 5: above 3.49)
31- Course ID
32- OUTPUT Grade (0: Fail, 1: DD, 2: DC, 3: CC, 4: CB, 5: BB, 6: BA, 7: AA)
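Because every answer is stored as an integer code, a small lookup step makes the data readable before analysis. The sketch below copies two of the code tables from the list above; the names `GRADE_LABELS`, `STUDY_HOURS_LABELS`, `decode`, and `passed` are illustrative, and `passed` assumes that only code 0 counts as failing, since that is the only code labeled Fail.

```python
# Code tables copied from the attribute list (items 17 and 32).
STUDY_HOURS_LABELS = {
    1: "None",
    2: "<5 hours",
    3: "6-10 hours",
    4: "11-20 hours",
    5: "more than 20 hours",
}
GRADE_LABELS = {0: "Fail", 1: "DD", 2: "DC", 3: "CC", 4: "CB", 5: "BB", 6: "BA", 7: "AA"}

def decode(code, labels):
    """Return the text label for a coded answer, or None for an unknown code."""
    return labels.get(int(code))

def passed(grade_code):
    """Assumption: any grade other than code 0 (Fail) is a pass."""
    return int(grade_code) != 0

print(decode(7, GRADE_LABELS))          # AA
print(decode("3", STUDY_HOURS_LABELS))  # 6-10 hours
```

Accepting strings as well as integers is a convenience for values read straight from a CSV file.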
Citation Requests/Acknowledgements
Yılmaz N., Sekeroglu B. (2020) Student Performance Classification Using Artificial Intelligence Techniques. In: Aliev R., Kacprzyk J., Pedrycz W., Jamshidi M., Babanli M., Sadikoglu F. (eds) 10th International Conference on Theory and Application of Soft Computing, Computing with Words and Perceptions - ICSCCW-2019. ICSCCW 2019. Advances in Intelligent Systems and Computing, vol 1095. Springer, Cham
https://creativecommons.org/publicdomain/zero/1.0/
Dataset Overview: This dataset contains survey responses collected from students in a college located in Satara, Maharashtra, India. The survey was conducted to gather information about students' library usage, reading habits, learning preferences, and other related factors.
Columns: The dataset consists of 29 columns representing different survey questions and responses. The columns include information such as gender, faculty, location, preferred study materials, library visit frequency, average time spent in college, preferred learning language, reading preferences, COVID-19 pandemic impact, book purchasing behavior, parents' occupation and education, and more.
Data Collection: The survey was shared with students in the college library, and their responses were collected using a Google Form. Approximately 10-15k students studying in various courses, ranging from 11th grade to master's degree, participated in the survey.
Data Format: The dataset is provided in CSV format, with each row representing a student's survey response and each column representing a specific survey question.
Data Usage: This dataset can be used to gain insights into students' library usage patterns, reading habits, and learning preferences. It can be used for exploratory data analysis, statistical analysis, and building predictive models related to student behavior, library services, or educational interventions.
Data Quality: The dataset has been cleaned and preprocessed to remove any identifiable personal information and ensure data privacy. However, it is always advisable to handle the data responsibly and in accordance with applicable data protection regulations.
Here's a column-wise description of the dataset:
- gender: Gender of the student.
- faculty: Faculty or department of the student.
- Enter Your Location: Location of the student.
- kind of books preferred for study: Preferred type of books for studying.
- How Frequently do you visit library: Frequency of visiting the library.
- For what Purposes do you visit library: Purposes for visiting the library.
- Average Time spent in college: Average time spent in college.
- What is general Purposes: General purposes of the student.
- Which one is your Preferred location: Preferred location.
- What is your preferred time?: Preferred time for activities.
- Preferred language for Learning: Preferred language for learning.
- Preferred type for reading: Preferred type of reading material.
- Do you enjoy the Reading: Enjoyment of reading.
- Which mode of learning: Preferred mode of learning.
- Dose Covid Pandemic Ch: Impact of the Covid pandemic on learning.
- How do you study before collage: Study habits before college.
- How do you study after Collage: Study habits after college.
- Do you aware about Nati: Awareness about National Digital Library.
- Do you Using National di: Usage of National Digital Library.
- Dose Covid 19 Pandemic Affected Your Reading Habits: Impact of the Covid-19 pandemic on reading habits.
- Do you purchase Books from store: Book purchasing behavior from physical stores.
- Average Expenditure on books: Average expenditure on books.
- Occupation Of Father: Occupation of the student's father.
- Parents Education: Education level of the student's parents.
- Select your Faculty: Select faculty or department.
- Enter your Location: Enter location.
- Preferred Language for Learning: Preferred language for learning.
- Do you Using National dig: Usage of National Digital Library.
- Occupation of Father: Occupation of the student's father.
https://creativecommons.org/publicdomain/zero/1.0/
📘 Description
The Student Academic Performance Dataset contains detailed academic and lifestyle information of 250 students, created to analyze how various factors — such as study hours, sleep, attendance, stress, and social media usage — influence their overall academic outcomes and GPA.
This dataset is synthetic but realistic, carefully generated to reflect believable academic patterns and relationships. It’s perfect for learning data analysis, statistics, and visualization using Excel, Python, or R.
The data includes 12 attributes, primarily numerical, ensuring that it’s suitable for a wide range of analytical tasks — from basic descriptive statistics (mean, median, SD) to correlation and regression analysis.
📊 Key Features
🧮 250 rows and 12 columns
💡 Mostly numerical — great for Excel-based statistical functions
🔍 No missing values — ready for direct use
📈 Balanced and realistic — ideal for clear visualizations and trend analysis
🎯 Suitable for:
Descriptive statistics
Correlation & regression
Data visualization projects
Dashboard creation (Excel, Tableau, Power BI)
💡 Possible Insights to Explore
How do study hours impact GPA?
Is there a relationship between stress levels and performance?
Does social media usage reduce study efficiency?
Do students with higher attendance achieve better grades?
⚙️ Data Generation Details
Each record represents a unique student.
GPA is calculated using a weighted formula based on midterm and final scores.
Relationships are designed to be realistic — for example:
Higher study hours → higher scores and GPA
Higher stress → slightly lower sleep hours
Excessive social media time → reduced academic performance
⚠️ Disclaimer
This dataset is synthetically generated using statistical modeling techniques and does not contain any real student data. It is intended purely for educational, analytical, and research purposes.
https://creativecommons.org/publicdomain/zero/1.0/
This dataset was generated in the style of a survey and contains 1,000 entries of simulated student data. It includes attributes such as age, gender, daily study hours, revision frequency, preferred study time, usage of online learning platforms, social media usage, sleep hours, exam stress level, and last exam score percentage. The dataset aims to assist in exploring the relationship between study habits and academic performance.
https://creativecommons.org/publicdomain/zero/1.0/
This dataset contains academic-related information of 200 students. It includes details about their study habits, sleep duration, attendance, past academic performance, and final exam scores. The data can be used to analyze factors that influence exam performance and to identify patterns in student learning outcomes.
student_id – A unique code given to each student for identification.
hours_studied – The number of hours a student studied before the exam.
sleep_hours – The average number of hours the student slept daily.
attendance_percent – The percentage of classes attended by the student.
previous_scores – The marks a student obtained in previous tests or assessments.
exam_score – The final exam score of the student, used as the main performance measure.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
This dataset contains detailed numerical records of 10,000 students, capturing their academic performance and study behavior across multiple dimensions. The dataset includes student scores in core subjects (Math, Science, English, and History), learning activity metrics (study hours per week, quiz attempts, consecutive logins, material access frequency), and additional attributes such as improvement rates, time per question, class size, faculty ratio, and predicted performance. Derived variables like science-math score difference and English-history score ratio provide insights into individual subject trends. The data can be used for academic performance analysis, predictive modeling, learning behavior studies, and educational resource optimization.
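The two derived variables named above are simple arithmetic on the subject scores. As a hedged sketch of how they could be recomputed, assuming the difference is science minus math and the ratio is English divided by history (the listing does not state the direction, and the function and key names are made up for illustration):

```python
def derived_features(science, math, english, history):
    """Recompute the two derived variables from raw subject scores.

    Assumptions: difference = science - math, ratio = english / history.
    A history score of 0 yields None instead of raising ZeroDivisionError.
    """
    diff = science - math
    ratio = english / history if history != 0 else None
    return {"science_math_diff": diff, "english_history_ratio": ratio}

print(derived_features(80, 70, 90, 60))
# {'science_math_diff': 10, 'english_history_ratio': 1.5}
```

Comparing recomputed values with the columns shipped in the file is a quick way to confirm which direction the dataset actually uses.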
https://creativecommons.org/publicdomain/zero/1.0/
This dataset was created by Hamza
Released under CC0: Public Domain
This dataset is real data of 5,000 records collected from a private learning provider. The dataset includes key attributes necessary for exploring patterns, correlations, and insights related to academic performance.
Columns:
01. Student_ID: Unique identifier for each student.
02. First_Name: Student’s first name.
03. Last_Name: Student’s last name.
04. Email: Contact email (can be anonymized).
05. Gender: Male, Female, Other.
06. Age: The age of the student.
07. Department: Student's department (e.g., CS, Engineering, Business).
08. Attendance (%): Attendance percentage (0-100%).
09. Midterm_Score: Midterm exam score (out of 100).
10. Final_Score: Final exam score (out of 100).
11. Assignments_Avg: Average score of all assignments (out of 100).
12. Quizzes_Avg: Average quiz scores (out of 100).
13. Participation_Score: Score based on class participation (0-10).
14. Projects_Score: Project evaluation score (out of 100).
15. Total_Score: Weighted sum of all grades.
16. Grade: Letter grade (A, B, C, D, F).
17. Study_Hours_per_Week: Average study hours per week.
18. Extracurricular_Activities: Whether the student participates in extracurriculars (Yes/No).
19. Internet_Access_at_Home: Does the student have access to the internet at home? (Yes/No).
20. Parent_Education_Level: Highest education level of parents (None, High School, Bachelor's, Master's, PhD).
21. Family_Income_Level: Low, Medium, High.
22. Stress_Level (1-10): Self-reported stress level (1: Low, 10: High).
23. Sleep_Hours_per_Night: Average hours of sleep per night.
The Attendance is not part of the Total_Score or has very minimal weight.
Calculating the weighted sum: Total_Score = a · Midterm + b · Final + c · Assignments + d · Quizzes + e · Participation + f · Projects
| Component | Weight (%) |
|---|---|
| Midterm | 15% |
| Final | 25% |
| Assignments Avg | 15% |
| Quizzes Avg | 10% |
| Participation | 5% |
| Projects Score | 30% |
| Total | 100% |
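As a worked illustration of the weighted sum with the weights from the table, the sketch below recomputes a total from one record. One point is an assumption: Participation_Score is documented as 0-10 while the other components are out of 100, so the sketch rescales it to 0-100 before weighting. The listing does not say whether the original Total_Score does this, and the function name `total_score` is hypothetical.

```python
# Weights from the table above, expressed as fractions of 1.
WEIGHTS = {
    "Midterm_Score": 0.15,
    "Final_Score": 0.25,
    "Assignments_Avg": 0.15,
    "Quizzes_Avg": 0.10,
    "Participation_Score": 0.05,
    "Projects_Score": 0.30,
}

def total_score(record, participation_max=10.0):
    """Weighted sum of the six graded components.

    Assumption: participation (0-10) is rescaled to 0-100 so that every
    component is on the same scale before the weights are applied.
    """
    total = 0.0
    for column, weight in WEIGHTS.items():
        value = float(record[column])
        if column == "Participation_Score":
            value = value / participation_max * 100.0
        total += weight * value
    return total

example = {
    "Midterm_Score": 80, "Final_Score": 80, "Assignments_Avg": 80,
    "Quizzes_Avg": 80, "Participation_Score": 8, "Projects_Score": 80,
}
print(round(total_score(example), 2))  # 80.0
```

If recomputed totals disagree with the Total_Score column, the rescaling assumption or the injected grading bias described for this dataset are the first things to check.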
The dataset contains:
- Missing values (nulls) in some records (e.g., Attendance, Assignments, or Parent Education Level).
- Bias in some data (e.g., in grading: students with high attendance get slightly better grades).
- Imbalanced distributions (e.g., some departments have more students).

Note:
- The dataset is real, but I included some bias to create a greater challenge for my students.
- Some columns have been masked, as the data owner requested.
- "Students_Grading_Dataset_Biased.csv" contains the biased dataset; "Students Performance Dataset" contains the masked dataset.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains comprehensive information on 2,392 high school students, detailing their demographics, study habits, parental involvement, extracurricular activities, and academic performance. The target variable, GradeClass, classifies students' grades into distinct categories, providing a robust dataset for educational research, predictive modeling, and statistical analysis.
This dataset offers a comprehensive view of the factors influencing students' academic performance, making it ideal for educational research, development of predictive models, and statistical analysis.
This dataset, shared by Rabie El Kharoua, is original and has never been shared before. It is made available under the CC BY 4.0 license, allowing anyone to use the dataset in any form as long as proper citation is given to the author. A DOI is provided for proper referencing. Please note that duplication of this work within Kaggle is not permitted.
This dataset is synthetic and was generated for educational purposes, making it ideal for data science and machine learning projects. It is an original dataset, owned by Mr. Rabie El Kharoua, and has not been previously shared. You are free to use it under the license outlined on the data card. The dataset is offered without any guarantees. Details about the data provider will be shared soon.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
This dataset is designed for practicing classification tasks, specifically predicting whether a student will pass or fail a course based on various academic and demographic factors. It contains 40,000 records of students, with attributes including study habits, attendance rates, previous grades, and more. The dataset also introduces challenges such as missing values, incorrect data, and noise, making it ideal for practicing data cleaning, exploratory data analysis (EDA), and feature engineering.
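Since the description highlights missing values, incorrect data, and noise, a first cleaning pass usually validates each field before modeling. The sketch below shows one such check for a percentage field such as an attendance rate; the function name `clean_percentage` and the 0-100 validity range are assumptions, because the listing does not give column names or valid ranges.

```python
def clean_percentage(raw):
    """Parse a percentage value; return None when it is missing or invalid.

    Handles None, empty strings, a trailing '%' sign, non-numeric text,
    NaN, and values outside the assumed valid range 0-100.
    """
    if raw is None:
        return None
    text = str(raw).strip().rstrip("%").strip()
    if text == "":
        return None
    try:
        value = float(text)
    except ValueError:
        return None
    # NaN fails every comparison, so it is rejected by this range check too.
    if not 0.0 <= value <= 100.0:
        return None
    return value

print([clean_percentage(v) for v in ["85%", "", "abc", 140, "92.5"]])
# [85.0, None, None, None, 92.5]
```

Returning None rather than dropping the row keeps the decision about imputation or removal for a later step.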
https://creativecommons.org/publicdomain/zero/1.0/
📄 Dataset Description The Student Study Habits dataset captures the relationship between students' daily routines, study behaviors, and academic performance. It includes features such as study time, sleep hours, screen time, and participation habits, allowing for comprehensive analysis of how lifestyle and learning strategies affect grades or exam results.
This dataset is ideal for:
Educational data mining
Predictive modeling of academic outcomes
Identifying effective study patterns
Building classifiers or regressors for performance prediction
🧾 Column Descriptors
| Column Name | Description |
| ---------------------- | -------------------------------------------------------------------------- |
| Study_Hours | Number of hours spent studying daily |
| Sleep_Hours | Average number of sleep hours per night |
| Screen_Time | Daily screen usage in hours (excluding study-related use) |
| Attendance_Rate | Percentage of classes attended |
| Group_Study | Whether the student engages in group study sessions (Yes/No or 1/0) |
| Extra_Curricular | Participation in extracurricular activities (Yes/No or 1/0) |
| Stress_Level | Self-reported stress level (scale 1–10 or categorical: Low/Medium/High) |
| Parental_Involvement | Level of parental support in academics (Low/Medium/High) |
| Academic_Performance | Target variable – performance label (e.g., GPA, Score, or Pass/Fail class) |
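The table notes that Group_Study and Extra_Curricular may appear either as Yes/No or as 1/0, so it helps to normalize them to one representation before analysis. A minimal sketch, with the hypothetical helper name `to_bool`:

```python
def to_bool(value):
    """Map Yes/No or 1/0 (as text or number) to True/False.

    Returns None for anything else so unexpected values stay visible
    instead of being silently coerced.
    """
    text = str(value).strip().lower()
    if text in {"yes", "1"}:
        return True
    if text in {"no", "0"}:
        return False
    return None

print([to_bool(v) for v in ["Yes", "no", 1, 0, "maybe"]])
# [True, False, True, False, None]
```

Values stored as floats such as 1.0 would come back as None here and would need an extra rule if the file contains them.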
https://creativecommons.org/publicdomain/zero/1.0/
This dataset contains the academic performance of 1,000 students, including math, reading, and writing scores, along with each student's gender, ethnicity, parental education, lunch type, and test preparation course.
http://opendatacommons.org/licenses/dbcl/1.0/
This dataset was created by tias_zzz
Released under Database: Open Database, Contents: Database Contents