100+ datasets found

Student Performance Data Set
kaggle.com
Updated Mar 27, 2020
Cite
Data-Science Sean (2020). Student Performance Data Set [Dataset]. https://www.kaggle.com/datasets/larsen0966/student-performance-data-set
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Mar 27, 2020
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Data-Science Sean
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
If this data set is useful, an upvote is appreciated. This data approaches student achievement in secondary education at two Portuguese schools. The data attributes include student grades and demographic, social and school-related features, and the data was collected by using school reports and questionnaires. Two datasets are provided regarding the performance in two distinct subjects: Mathematics (mat) and Portuguese language (por). In [Cortez and Silva, 2008], the two datasets were modeled under binary/five-level classification and regression tasks. Important note: the target attribute G3 has a strong correlation with attributes G2 and G1. This occurs because G3 is the final year grade (issued at the 3rd period), while G1 and G2 correspond to the 1st and 2nd-period grades. It is more difficult to predict G3 without G2 and G1, but such prediction is much more useful (see paper source for more details).
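The note about G1 and G2 can be illustrated with a minimal sketch, assuming the records have already been loaded as a list of dictionaries keyed by column name. The helper name `split_features_target`, the columns `age` and `studytime`, and the sample values are hypothetical; only the names G1, G2 and G3 come from the description above.

```python
from typing import Any


def split_features_target(
    rows: list[dict[str, Any]],
    target: str = "G3",
    exclude: tuple[str, ...] = ("G1", "G2"),
) -> tuple[list[dict[str, Any]], list[Any]]:
    """Separate the target column from the features.

    Columns listed in `exclude` (by default the earlier period grades)
    are dropped from the features so a model cannot lean on them.
    """
    dropped = set(exclude) | {target}
    features = [{k: v for k, v in row.items() if k not in dropped} for row in rows]
    targets = [row[target] for row in rows]
    return features, targets


# Made-up rows for illustration only; they are not taken from the dataset.
sample = [
    {"age": 17, "studytime": 2, "G1": 11, "G2": 12, "G3": 12},
    {"age": 16, "studytime": 3, "G1": 14, "G2": 15, "G3": 15},
]
X, y = split_features_target(sample)
print(X)  # features without G1, G2 and G3
print(y)  # the G3 values
```

Passing `exclude=()` keeps G1 and G2 as features, which gives the easier but less useful prediction setting mentioned in the note.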
student data analysis
kaggle.com
Updated Nov 17, 2023
Cite
maira javeed (2023). student data analysis [Dataset]. https://www.kaggle.com/datasets/mairajaveed/student-data-analysis
Explore at:
Croissant
Dataset updated
Nov 17, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
maira javeed
Description
In this project, we aim to analyze and gain insights into the performance of students based on various factors that influence their academic achievements. We have collected data related to students' demographic information, family background, and their exam scores in different subjects.

Key Objectives:

Performance Evaluation: Evaluate and understand the academic performance of students by analyzing their scores in various subjects.

Identifying Underlying Factors: Investigate factors that might contribute to variations in student performance, such as parental education, family size, and student attendance.

Visualizing Insights: Create data visualizations to present the findings effectively and intuitively.

Dataset Details:

The dataset used in this analysis contains information about students, including their age, gender, parental education, lunch type, and test scores in subjects like mathematics, reading, and writing.

Analysis Highlights:

We will perform a comprehensive analysis of the dataset, including data cleaning, exploration, and visualization to gain insights into various aspects of student performance.

By employing statistical methods and machine learning techniques, we will determine the significant factors that affect student performance.

Why This Matters:

Understanding the factors that influence student performance is crucial for educators, policymakers, and parents. This analysis can help in making informed decisions to improve educational outcomes and provide support where it is most needed.

Acknowledgments:

We would like to express our gratitude to [mention any data sources or collaborators] for making this dataset available.

Please Note:

This project is meant for educational and analytical purposes. The dataset used is fictitious and does not represent any specific educational institution or individuals.
School Attendance by Student Group and District, 2021-2022
catalog.data.gov
data.ct.gov
+1more
Updated Jun 21, 2025
+ more versions
Cite
data.ct.gov (2025). School Attendance by Student Group and District, 2021-2022 [Dataset]. https://catalog.data.gov/dataset/school-attendance-by-student-group-and-district-2021-2022
Explore at:
Dataset updated
Jun 21, 2025
Dataset provided by
data.ct.gov
Description
This dataset includes the attendance rate for public school students PK-12 by student group and by district during the 2021-2022 school year.

Student groups include:

Students experiencing homelessness
Students with disabilities
Students who qualify for free/reduced lunch
English learners
All high needs students
Non-high needs students
Students by race/ethnicity (Hispanic/Latino of any race, Black or African American, White, All other races)

Attendance rates are provided for each student group by district and for the state. Students who are considered high needs include students who are English language learners, who receive special education, or who qualify for free and reduced lunch.

When no attendance data is displayed in a cell, data have been suppressed to safeguard student confidentiality, or to ensure that statistics based on a very small sample size are not interpreted as equally representative as those based on a sufficiently larger sample size. For more information on CSDE data suppression policies, please visit http://edsight.ct.gov/relatedreports/BDCRE%20Data%20Suppression%20Rules.pdf.
School information and student demographics
open.canada.ca
datasets.ai
+1more
html, xlsx
Updated Jun 18, 2025
Cite
Government of Ontario (2025). School information and student demographics [Dataset]. https://open.canada.ca/data/en/dataset/d85f68c5-fcb0-4b4d-aec5-3047db47dcd5
Explore at:
xlsx, html (available download formats)
Dataset updated
Jun 18, 2025
Dataset provided by
Government of Ontario (https://www.ontario.ca/)
License
Open Government Licence - Canada 2.0: https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
Time period covered
Sep 1, 2017 - Jun 30, 2024
Description
Data includes: board and school information, grade 3 and 6 EQAO student achievements for reading, writing and mathematics, and grade 9 mathematics EQAO and OSSLT. Data excludes private schools, Education and Community Partnership Programs (ECPP), summer, night and continuing education schools.

How Are We Protecting Privacy?

Results for OnSIS and Statistics Canada variables are suppressed based on school population size to better protect student privacy. In order to achieve this additional level of protection, the Ministry has used a methodology that randomly rounds a percentage either up or down depending on school enrolment. In order to protect privacy, the ministry does not publicly report on data when there are fewer than 10 individuals represented.

* Percentages depicted as 0 may not always be 0 values as in certain situations the values have been randomly rounded down or there are no reported results at a school for the respective indicator.
* Percentages depicted as 100 are not always 100, in certain situations the values have been randomly rounded up.

The school enrolment totals have been rounded to the nearest 5 in order to better protect and maintain student privacy.

The information in the School Information Finder is the most current available to the Ministry of Education at this time, as reported by schools, school boards, EQAO and Statistics Canada. The information is updated as frequently as possible. This information is also available on the Ministry of Education's School Information Finder website by individual school. Descriptions for some of the data types can be found in our glossary.

School/school board and school authority contact information are updated and maintained by school boards and may not be the most current version. For the most recent information please visit: https://data.ontario.ca/dataset/ontario-public-school-contact-information.
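Two of the privacy rules described above are simple enough to sketch: rounding enrolment totals to the nearest 5, and not reporting values that represent fewer than 10 individuals. The function names below are hypothetical, and returning `None` for a suppressed value is an assumption, since the description does not say how suppressed cells are encoded. The random rounding of percentages is not sketched because its exact method is not given.

```python
from typing import Optional

# "fewer than 10 individuals" are not publicly reported, per the description.
SUPPRESSION_THRESHOLD = 10


def round_to_nearest_5(enrolment: int) -> int:
    """Round a non-negative whole-number count to the nearest multiple of 5."""
    if enrolment < 0:
        raise ValueError("enrolment must be non-negative")
    return 5 * ((enrolment + 2) // 5)


def publishable_count(count: int) -> Optional[int]:
    """Return the count, or None (an assumed marker) when it must be suppressed."""
    return None if count < SUPPRESSION_THRESHOLD else count


print(round_to_nearest_5(448))  # 450
print(round_to_nearest_5(12))   # 10
print(publishable_count(9))     # None
print(publishable_count(10))    # 10
```

Integer arithmetic is used instead of the built-in `round` so the result does not depend on floating-point behaviour.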
Data from: Quality Time for Students: Learning In and Out of School
catalog.data.gov
Updated Mar 30, 2021
Cite
U.S. Department of State (2021). Quality Time for Students: Learning In and Out of School [Dataset]. https://catalog.data.gov/dataset/quality-time-for-students-learning-in-and-out-of-school
Explore at:
Dataset updated
Mar 30, 2021
Dataset provided by
U.S. Department of State
Description
At a time when OECD and partner countries are trying to figure out how to reduce burgeoning debt and make the most of shrinking public budgets, spending on education is an obvious target for scrutiny. Education officials, teachers, policy makers, parents and students struggle to determine the merits of shorter or longer school days or school years, how much time should be allotted to various subjects, and the usefulness of after-school lessons and independent study. This report focuses on how students use learning time, both in and out of school. What are the ideal conditions to ensure that students use their learning time efficiently? What can schools do to maximise the learning that occurs during the limited amount of time students spend in class? In what kinds of lessons does learning time reap the most benefits? And how can this be determined? The report draws on data from the 2006 cycle of the Programme of International Student Assessment (PISA) to describe differences across and within countries in how much time students spend studying different subjects, how much time they spend in different types of learning activities, how they allocate their learning time and how they perform academically.
Students' Academic Performance Dataset
kaggle.com
zip
Updated Nov 25, 2016
Cite
Ibrahim Aljarah (2016). Students' Academic Performance Dataset [Dataset]. https://www.kaggle.com/aljarah/xAPI-Edu-Data
Explore at:
zip, 6103 bytes (available download formats)
Dataset updated
Nov 25, 2016
Authors
Ibrahim Aljarah
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description

Student's Academic Performance Dataset (xAPI-Edu-Data)

Data Set Characteristics: Multivariate

Number of Instances: 480

Area: E-learning, Education, Predictive models, Educational Data Mining

Attribute Characteristics: Integer/Categorical

Number of Attributes: 16

Date: 2016-11-8

Associated Tasks: Classification

Missing Values? No

File formats: xAPI-Edu-Data.csv

Source:

Elaf Abu Amrieh, Thair Hamtini, and Ibrahim Aljarah, The University of Jordan, Amman, Jordan, http://www.Ibrahimaljarah.com www.ju.edu.jo

Dataset Information:

This is an educational data set collected from a learning management system (LMS) called Kalboard 360. Kalboard 360 is a multi-agent LMS designed to facilitate learning through the use of leading-edge technology. The system provides users with synchronous access to educational resources from any device with an Internet connection.

The data is collected using a learner activity tracker tool called the experience API (xAPI). The xAPI is a component of the training and learning architecture (TLA) that makes it possible to monitor learning progress and learner actions such as reading an article or watching a training video. The experience API helps learning activity providers determine the learner, activity and objects that describe a learning experience. The dataset consists of 480 student records and 16 features. The features are classified into three major categories: (1) Demographic features such as gender and nationality. (2) Academic background features such as educational stage, grade level and section. (3) Behavioral features such as raising a hand in class, opening resources, parents answering surveys, and school satisfaction.

The dataset consists of 305 males and 175 females. The students come from different origins: 179 students are from Kuwait, 172 from Jordan, 28 from Palestine, 22 from Iraq, 17 from Lebanon, 12 from Tunis, 11 from Saudi Arabia, 9 from Egypt, 7 from Syria, 6 each from the USA, Iran and Libya, 4 from Morocco and one from Venezuela.

The dataset was collected over two educational semesters: 245 student records were collected during the first semester and 235 during the second semester.

The data set also includes a school attendance feature: students are classified into two categories based on their absence days. 191 students exceeded 7 absence days and 289 students had under 7 absence days.

The dataset also includes a new category of features: parent participation in the educational process. The parent participation feature has two sub-features: Parent Answering Survey and Parent School Satisfaction. 270 of the parents answered the survey and 210 did not; 292 of the parents are satisfied with the school and 188 are not.

(See the related papers for more details).

Attributes

1 Gender - student's gender (nominal: 'Male' or 'Female')

2 Nationality - student's nationality (nominal: 'Kuwait', 'Lebanon', 'Egypt', 'SaudiArabia', 'USA', 'Jordan', 'Venezuela', 'Iran', 'Tunis', 'Morocco', 'Syria', 'Palestine', 'Iraq', 'Lybia')

3 Place of birth - student's place of birth (nominal: 'Kuwait', 'Lebanon', 'Egypt', 'SaudiArabia', 'USA', 'Jordan', 'Venezuela', 'Iran', 'Tunis', 'Morocco', 'Syria', 'Palestine', 'Iraq', 'Lybia')

4 Educational Stages - educational level the student belongs to (nominal: 'lowerlevel', 'MiddleSchool', 'HighSchool')

5 Grade Levels - grade the student belongs to (nominal: 'G-01', 'G-02', 'G-03', 'G-04', 'G-05', 'G-06', 'G-07', 'G-08', 'G-09', 'G-10', 'G-11', 'G-12')

6 Section ID - classroom the student belongs to (nominal: 'A', 'B', 'C')

7 Topic - course topic (nominal: 'English', 'Spanish', 'French', 'Arabic', 'IT', 'Math', 'Chemistry', 'Biology', 'Science', 'History', 'Quran', 'Geology')

8 Semester - school year semester (nominal: 'First', 'Second')

9 Parent responsible for student (nominal: 'mom', 'father')

10 Raised hand - how many times the student raises his/her hand in the classroom (numeric: 0-100)

11 Visited resources - how many times the student visits course content (numeric: 0-100)

12 Viewing announcements - how many times the student checks the new announcements (numeric: 0-100)

13 Discussion groups - how many times the student participates in discussion groups (numeric: 0-100)

14 Parent Answering Survey - whether the parent answered the surveys provided by the school (nominal: 'Yes', 'No')

15 Parent School Satisfaction - the degree of parent satisfaction with the school (nominal: 'Yes', 'No')

16 Student Absence Days - the number of absence days for each student (nominal: above-7, under-7)

The students are classified into three numerical intervals based on their total grade/mark:

Low-Level: interval includes values from 0 to 69,

Middle-Level: interval includes values from 70 to 89,

High-Level: interval includes values from 90 to 100.
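The three intervals above can be written as a small lookup. This is a sketch that assumes whole-number total marks between 0 and 100; the function name `grade_level` is hypothetical, while the labels and boundaries are the ones listed above.

```python
def grade_level(total_mark: int) -> str:
    """Map a total mark (0-100) to one of the three class labels."""
    if not 0 <= total_mark <= 100:
        raise ValueError(f"mark out of range: {total_mark}")
    if total_mark <= 69:
        return "Low-Level"      # 0 to 69
    if total_mark <= 89:
        return "Middle-Level"   # 70 to 89
    return "High-Level"         # 90 to 100


for mark in (69, 70, 89, 90):
    print(mark, grade_level(mark))
```

The loop prints the label on each side of the two boundaries, which is where an off-by-one mistake would show up.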

Relevant Papers:

-Amrieh, E. A., Hamtini, T., & Aljarah, I. (2016). Mining Educational Data to Predict Student’s academic Performance using Ensemble Methods. International Journal of Database Theory and Application, 9(8), 119-136.

-Amrieh, E. A., Hamtini, T., & Aljarah, I. (2015, November). Preprocessing and analyzing educational data set using X-API for improving student's performance. In Applied Electrical Engineering and Computing Technologies (AEECT), 2015 IEEE Jordan Conference on (pp. 1-5). IEEE.

Citation Request:

Please include these citations if you plan to use this dataset:

-Amrieh, E. A., Hamtini, T., & Aljarah, I. (2016). Mining Educational Data to Predict Student’s academic Performance using Ensemble Methods. International Journal of Database Theory and Application, 9(8), 119-136.

-Amrieh, E. A., Hamtini, T., & Aljarah, I. (2015, November). Preprocessing and analyzing educational data set using X-API for improving student's performance. In Applied Electrical Engineering and Computing Technologies (AEECT), 2015 IEEE Jordan Conference on (pp. 1-5). IEEE.
‘Predicting Student Performance’ analyzed by Analyst-2
analyst-2.ai
Updated Mar 2, 2015
Cite
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2015). ‘Predicting Student Performance’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-predicting-student-performance-ec1b/b7296868/?iid=058-803&v=presentation
Explore at:
Dataset updated
Mar 2, 2015
Dataset authored and provided by
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Analysis of ‘Predicting Student Performance’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/yamqwe/student-performance on 28 January 2022.

--- Dataset description provided by original source is as follows ---

About this dataset

This data approaches student achievement in secondary education at two Portuguese schools. The data attributes include student grades and demographic, social and school-related features, and the data was collected by using school reports and questionnaires. Two datasets are provided regarding the performance in two distinct subjects: Mathematics (mat) and Portuguese language (por). In [Cortez and Silva, 2008], the two datasets were modeled under binary/five-level classification and regression tasks. Important note: the target attribute G3 has a strong correlation with attributes G2 and G1. This occurs because G3 is the final year grade (issued at the 3rd period), while G1 and G2 correspond to the 1st and 2nd period grades. It is more difficult to predict G3 without G2 and G1, but such prediction is much more useful (see paper source for more details).

How to use this dataset

Predict Student's future performance

Understand the root causes for low performance

More datasets

Acknowledgements

If you use this dataset in your research, please credit ewenme

--- Original source retains full ownership of the source dataset ---
Data from: Dataset of Student Level Prediction in UAE
data.mendeley.com
Updated Dec 18, 2020
Cite
shatha Ghareeb (2020). Dataset of Student Level Prediction in UAE [Dataset]. http://doi.org/10.17632/3g8dtwbjjy.1
Explore at:
Unique identifier
https://doi.org/10.17632/3g8dtwbjjy.1
Dataset updated
Dec 18, 2020
Authors
shatha Ghareeb
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
United Arab Emirates
Description
The dataset comprises novel aspects, specifically in terms of student grading in diverse educational cultures within multiple countries. Researchers and other education sectors will be able to see the impact of having varied curriculums in a country. The dataset compares different levelling cases when students transfer from curriculum to curriculum, and the unreliable levelling criteria currently set by schools in an international school. The collected data can be used with intelligent algorithms, specifically machine learning and pattern analysis methods, to develop an intelligent framework applicable in multi-cultural educational systems to aid in a smooth transition (“levelling”, hereafter) of students who relocate from one education curriculum to another, and to minimize the impact of switching on the students’ educational performance. The preliminary variables taken into consideration when deciding which data to collect depended on the variables. The UAE is a multicultural country with many expats relocating from regions such as Asia, Europe and America. In order to meet expats’ needs, the UAE has established many international private schools; the UAE was therefore chosen as the location of study based on the many cases of, and struggles with, levelling declared by the Ministry of Education and schools. For the first time, we present this dataset comprising students’ records for two academic years that included math, English, and science for 3 terms. The selection of subject areas and number of terms was influenced by other researchers working on similar subject matters.
University Students Data
kaggle.com
Updated May 6, 2024
Cite
Tarek Muhammed (2024). University Students Data [Dataset]. https://www.kaggle.com/datasets/tarekmuhammed/university-students-data
Explore at:
Croissant
Dataset updated
May 6, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Tarek Muhammed
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
Data Description

Private: A factor with levels No and Yes indicating private or public university
Apps: Number of applications received
Accept: Number of applications accepted
Enroll: Number of new students enrolled
Top10perc: Pct. new students from top 10% of H.S. class
Top25perc: Pct. new students from top 25% of H.S. class
F.Undergrad: Number of fulltime undergraduates
P.Undergrad: Number of parttime undergraduates
Outstate: Out-of-state tuition
Room.Board: Room and board costs
Books: Estimated book costs
Personal: Estimated personal spending
PhD: Pct. of faculty with Ph.D.’s
Terminal: Pct. of faculty with terminal degree
S.F.Ratio: Student/faculty ratio
perc.alumni: Pct. alumni who donate
Expend: Instructional expenditure per student
Grad.Rate: Graduation rate

You can use it for clustering projects.
AI Tool Usage by Indian College Students 2025
kaggle.com
Updated Jun 9, 2025
Cite
Rakesh Kapilavayi (2025). AI Tool Usage by Indian College Students 2025 [Dataset]. https://www.kaggle.com/datasets/rakeshkapilavai/ai-tool-usage-by-indian-college-students-2025
Explore at:
Croissant
Dataset updated
Jun 9, 2025
Dataset provided by
Kaggle
Authors
Rakesh Kapilavayi
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
AI Tool Usage by Indian College Students 2025

This unique dataset, collected via a May 2025 survey, captures how 496 Indian college students use AI tools (e.g., ChatGPT, Gemini, Copilot) in academics. It includes 16 attributes like AI tool usage, trust, impact on grades, and internet access, ideal for education analytics and machine learning.

Columns

Student_Name: Anonymized student name.

College_Name: College attended.

Stream: Academic discipline (e.g., Engineering, Arts).

Year_of_Study: Year of study (1–4).

AI_Tools_Used: Tools used (e.g., ChatGPT, Gemini).

Daily_Usage_Hours: Hours spent daily on AI tools.

Use_Cases: Purposes (e.g., Assignments, Exam Prep).

Trust_in_AI_Tools: Trust level (1–5).

Impact_on_Grades: Grade impact (-3 to +3).

Do_Professors_Allow_Use: Professor approval (Yes/No).

Preferred_AI_Tool: Preferred tool.

Awareness_Level: AI awareness (1–10).

Willing_to_Pay_for_Access: Willingness to pay (Yes/No).

State: Indian state.

Device_Used: Device (e.g., Laptop, Mobile).

Internet_Access: Access quality (Poor/Medium/High).

Use Cases

Predict academic performance using AI tool usage.

Analyze trust in AI across streams or regions.

Cluster students by usage patterns.

Study digital divide via Internet_Access.

Source: Collected via Google Forms survey in May 2025, ensuring diverse representation across India. Note: First dataset of its kind on Kaggle!
State- and Year-wise Number of Students who have Passed Out in different...
dataful.in
Updated Jun 26, 2025
Cite
Dataful (Factly) (2025). State- and Year-wise Number of Students who have Passed Out in different Disciplines of Study [Dataset]. https://dataful.in/datasets/15703
Explore at:
csv, xlsx, application/x-parquet (available download formats)
Dataset updated
Jun 26, 2025
Dataset authored and provided by
Dataful (Factly)
License
https://dataful.in/terms-and-conditions
Area covered
States of India
Variables measured
Pass Out
Description
The dataset contains academic year-, gender- and state-wise compiled data on the number of students who have passed out in certificate, diploma, integrated, PG diploma, undergraduate, postgraduate, M.Phil. and Ph.D. educational courses from the year 2010-11 to 2020-21. In addition, the dataset also contains separate data on the number of students who have passed out with 60% or more marks.
Student Performace Dataset
kaggle.com
Updated May 10, 2021
Cite
Dhrumil Gohel (2021). Student Performace Dataset [Dataset]. https://www.kaggle.com/dhrumilgohel/student-performace-dataset/tasks
Explore at:
Croissant
Dataset updated
May 10, 2021
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Dhrumil Gohel
Description
Context

Our team was building an analytics Django web application that generated insights about students' current-semester performance and also their historical performance. For analytics and prediction we needed data, which was not available in sufficient quantity.

So I have created this dataset with relevant conditions and corner cases.

Inspiration

How many students pass the exam?

What will be the failure ratio of students?

What will be the marks for the final exam?
School STAR Student Group Scores
catalog.data.gov
opendata.dc.gov
+3more
Updated Feb 5, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
City of Washington, DC (2025). School STAR Student Group Scores [Dataset]. https://catalog.data.gov/dataset/school-star-student-group-scores
Explore at:
Dataset updated
Feb 5, 2025
Dataset provided by
City of Washington, DC
Description
2018 DC School Report Card. STAR Framework student group scores by school and school framework. The STAR Framework measures performance for 10 different student groups with a minimum n size of 10 or more students at the school. The student groups are All Students, Students with Disabilities, Students who are At Risk, English Learners, and students who identify as the following ESSA-defined racial/ethnic groups: American Indian or Alaskan Native, Asian, Black or African American, Hispanic/Latino of any race, Native Hawaiian or Other Pacific Islander, White, and Two or more races. The Alternative School Framework includes an eleventh student group, At-Risk Students with Disabilities.

Some students are included in the school- and LEA-level aggregations that will display on the DC School Report Card but are not included in calculations for the STAR Framework. These students are included in the “All Report Card Students” student group to distinguish from the “All Students” group used for the STAR Framework.

Supplemental:

Metric scores are not reported for n-sizes less than 10; metrics that have an n-size less than 10 are not included in calculation of STAR scores and ratings.

At the state level, teacher data is reported on the DC School Report Card for all schools, high-poverty schools, and low-poverty schools. The definition for high-poverty and low-poverty schools is included in DC's ESSA State Plan. At the school level, teacher data is reported for the entire school, and at the LEA-level, teacher data is reported for all schools only.

On the STAR Framework, 203 schools received STAR scores and ratings based on data from the 2017-18 school year. Of those 203 schools, 2 schools closed after the completion of the 2017-18 school year (Excel Academy PCS and Washington Mathematics Science Technology PCHS). Because those two schools closed, they do not receive a School Report Card and report card metrics were not calculated for those schools.

Schools with non-traditional grade configurations may be assigned multiple school frameworks as part of the STAR Framework. For example, a K-8 school would be assigned the Elementary School Framework and the Middle School Framework. Because a school may have multiple school frameworks, the total number of school framework scores across the city will be greater than the total number of schools that received a STAR score and rating.

Detailed information about the metrics and calculations for the DC School Report Card and STAR Framework can be found in the 2018 DC School Report Card and STAR Framework Technical Guide (https://osse.dc.gov/publication/2018-dc-school-report-card-and-star-framework-technical-guide).
Data from: DIPSEER: A Dataset for In-Person Student Emotion and Engagement...
observatorio-cientifico.ua.es
scidb.cn
Updated 2025
Cite
Márquez-Carpintero, Luis; Suescun-Ferrandiz, Sergio; Álvarez, Carolina Lorenzo; Fernandez-Herrero, Jorge; Viejo, Diego; Rosabel Roig-Vila; Cazorla, Miguel (2025). DIPSEER: A Dataset for In-Person Student Emotion and Engagement Recognition in the Wild [Dataset]. https://observatorio-cientifico.ua.es/documentos/67321d21aea56d4af0484172
Explore at:
Dataset updated
2025
Authors
Márquez-Carpintero, Luis; Suescun-Ferrandiz, Sergio; Álvarez, Carolina Lorenzo; Fernandez-Herrero, Jorge; Viejo, Diego; Rosabel Roig-Vila; Cazorla, Miguel
Description
Data Description

The DIPSER dataset is designed to assess student attention and emotion in in-person classroom settings, consisting of RGB camera data, smartwatch sensor data, and labeled attention and emotion metrics. It includes multiple camera angles per student to capture posture and facial expressions, complemented by smartwatch data for inertial and biometric metrics. Attention and emotion labels are derived from self-reports and expert evaluations. The dataset includes diverse demographic groups, with data collected in real-world classroom environments, facilitating the training of machine learning models for predicting attention and correlating it with emotional states.

Data Collection and Generation Procedures

The dataset was collected in a natural classroom environment at the University of Alicante, Spain. The recording setup consisted of six general cameras positioned to capture the overall classroom context and individual cameras placed at each student’s desk. Additionally, smartwatches were used to collect biometric data, such as heart rate, accelerometer, and gyroscope readings.

Experimental Sessions

Nine distinct educational activities were designed to ensure a comprehensive range of engagement scenarios:

News Reading – Students read projected or device-displayed news.
Brainstorming Session – Idea generation for problem-solving.
Lecture – Passive listening to an instructor-led session.
Information Organization – Synthesizing information from different sources.
Lecture Test – Assessment of lecture content via mobile devices.
Individual Presentations – Students present their projects.
Knowledge Test – Conducted using Kahoot.
Robotics Experimentation – Hands-on session with robotics.
MTINY Activity Design – Development of educational activities with computational thinking.

Technical Specifications

RGB Cameras: Individual cameras recorded at 640×480 pixels, while context cameras captured at 1280×720 pixels.
Frame Rate: 9-10 FPS depending on the setup.
Smartwatch Sensors: Collected heart rate, accelerometer, gyroscope, rotation vector, and light sensor data at a frequency of 1–100 Hz.

Data Organization and Formats

The dataset follows a structured directory format:

/groupX/experimentY/subjectZ.zip

Each subject-specific folder contains:

images/ (individual facial images)
watch_sensors/ (sensor readings in JSON format)
labels/ (engagement & emotion annotations)
metadata/ (subject demographics & session details)

Annotations and Labeling

Each data entry includes engagement levels (1-5) and emotional states (9 categories) based on both self-reported labels and evaluations by four independent experts. A custom annotation tool was developed to ensure consistency across evaluations.

Missing Data and Data Quality

Synchronization: A centralized server ensured time alignment across devices. Brightness changes were used to verify synchronization.
Completeness: No major missing data, except for occasional random frame drops due to embedded device performance.
Data Consistency: Uniform collection methodology across sessions, ensuring high reliability.

Data Processing Methods

To enhance usability, the dataset includes preprocessed bounding boxes for face, body, and hands, along with gaze estimation and head pose annotations. These were generated using YOLO, MediaPipe, and DeepFace.

File Formats and Accessibility

Images: Stored in standard JPEG format.
Sensor Data: Provided as structured JSON files.
Labels: Available as CSV files with timestamps.

The dataset is publicly available under the CC-BY license and can be accessed along with the necessary processing scripts via the DIPSER GitHub repository.

Potential Errors and Limitations

Due to camera angles, some student movements may be out of frame in collaborative sessions.
Lighting conditions vary slightly across experiments.
Sensor latency variations are minimal but exist due to embedded device constraints.

Citation

If you find this project helpful for your research, please cite our work using the following bibtex entry:

@misc{marquezcarpintero2025dipserdatasetinpersonstudent1,
  title={DIPSER: A Dataset for In-Person Student Engagement Recognition in the Wild},
  author={Luis Marquez-Carpintero and Sergio Suescun-Ferrandiz and Carolina Lorenzo Álvarez and Jorge Fernandez-Herrero and Diego Viejo and Rosabel Roig-Vila and Miguel Cazorla},
  year={2025},
  eprint={2502.20209},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2502.20209},
}

Usage and Reproducibility

Researchers can utilize standard tools like OpenCV, TensorFlow, and PyTorch for analysis. The dataset supports research in machine learning, affective computing, and education analytics, offering a unique resource for engagement and attention studies in real-world classroom environments.
"ChatGPT vs. Student: A Dataset for Source Classification of Computer...
ieee-dataport.org
Updated Jul 19, 2023
ALI ABDULLAH S ALQAHTANI (2023). "ChatGPT vs. Student: A Dataset for Source Classification of Computer Science Answers [Dataset]. https://ieee-dataport.org/documents/chatgpt-vs-student-dataset-source-classification-computer-science-answers
Explore at:
Dataset updated
Jul 19, 2023
Authors
ALI ABDULLAH S ALQAHTANI
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
along with the corresponding answers from students and ChatGPT.
student-performance
huggingface.co
Updated May 28, 2025
Soumyadip Sarkar (2025). student-performance [Dataset]. http://doi.org/10.57967/hf/5412
Explore at:
Unique identifier
https://doi.org/10.57967/hf/5412
Dataset updated
May 28, 2025
Authors
Soumyadip Sarkar
License
https://choosealicense.com/licenses/cc0-1.0/
Description
Student Performance Dataset

Dataset Description

This dataset contains ten million synthetically generated student performance records, designed to mimic real-world educational data at the high-school level. It includes detailed demographic, socioeconomic, academic, behavioral, and school-context features for each student, suitable for benchmarking, machine learning, educational research, and exploratory data analysis.

File Information

Split File Name… See the full description on the dataset page: https://huggingface.co/datasets/neuralsorcerer/student-performance.
Students Test Data
kaggle.com
Updated Sep 12, 2023
ATHARV BHARASKAR (2023). Students Test Data [Dataset]. https://www.kaggle.com/datasets/atharvbharaskar/students-test-data
Explore at:
Dataset updated
Sep 12, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
ATHARV BHARASKAR
License
ODC Public Domain Dedication and Licence (PDDL) v1.0: http://www.opendatacommons.org/licenses/pddl/1.0/
License information was derived automatically
Description
Dataset Overview: This dataset pertains to the examination results of students who participated in a series of academic assessments at a fictitious educational institution named "University of Exampleville." The assessments were administered across various courses and academic levels, with a focus on evaluating students' performance in general management and domain-specific topics.

Columns: The dataset comprises 12 columns, each representing specific attributes and performance indicators of the students. These columns encompass information such as the students' names (which have been anonymized), their respective universities, academic program names (including BBA and MBA), specializations, the semester of the assessment, the type of examination domain (general management or domain-specific), general management scores (out of 50), domain-specific scores (out of 50), total scores (out of 100), student ranks, and percentiles.

Data Collection: The examination data was collected during a standardized assessment process conducted by the University of Exampleville. The exams were designed to assess students' knowledge and skills in general management and their chosen domain-specific subjects. It involved students from both BBA and MBA programs who were in their final year of study.

Data Format: The dataset is available in a structured format, typically as a CSV file. Each row represents a unique student's performance in the examination, while columns contain specific information about their results and academic details.

Data Usage: This dataset is valuable for analyzing and gaining insights into the academic performance of students pursuing BBA and MBA degrees. It can be used for various purposes, including statistical analysis, performance trend identification, program assessment, and comparison of scores across domains and specializations. Furthermore, it can be employed in predictive modeling or decision-making related to curriculum development and student support.

Data Quality: The dataset has undergone preprocessing and anonymization to protect the privacy of individual students. Nevertheless, it is essential to use the data responsibly and in compliance with relevant data protection regulations when conducting any analysis or research.

Data Format: The exam data is typically provided in a structured format, commonly as a CSV (Comma-Separated Values) file. Each row in the dataset represents a unique student's examination performance, and each column contains specific attributes and scores related to the examination. The CSV format allows for easy import and analysis using various data analysis tools and programming languages like Python, R, or spreadsheet software like Microsoft Excel.

Here's a column-wise description of the dataset:

Name OF THE STUDENT: The full name of the student who took the exam. (Anonymized)

UNIVERSITY: The university where the student is enrolled.

PROGRAM NAME: The name of the academic program in which the student is enrolled (BBA or MBA).

Specialization: If applicable, the specific area of specialization or major that the student has chosen within their program.

Semester: The semester or academic term in which the student took the exam.

Domain: Indicates whether the exam was divided into two parts: general management and domain-specific.

GENERAL MANAGEMENT SCORE (OUT of 50): The score obtained by the student in the general management part of the exam, out of a maximum possible score of 50.

Domain-Specific Score (Out of 50): The score obtained by the student in the domain-specific part of the exam, also out of a maximum possible score of 50.

TOTAL SCORE (OUT of 100): The total score obtained by adding the scores from the general management and domain-specific parts, out of a maximum possible score of 100.
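
The relationship between the three score columns described above can be checked with a short script. This is only a sketch: it assumes the CSV headers match the column names exactly as written in this description, and the sample rows are invented for illustration rather than taken from the dataset.

```python
import csv
import io

# Assumed header names, copied from the column description above.
GENERAL = "GENERAL MANAGEMENT SCORE (OUT of 50)"
DOMAIN = "Domain-Specific Score (Out of 50)"
TOTAL = "TOTAL SCORE (OUT of 100)"

# Hypothetical sample rows (not real data); the third row is deliberately wrong.
SAMPLE = f"""{GENERAL},{DOMAIN},{TOTAL}
40,35,75
22,30,52
45,41,90
"""


def find_inconsistent_totals(csv_text):
    """Return 1-based data-row numbers where total != general + domain."""
    bad_rows = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_number, row in enumerate(reader, start=1):
        expected = float(row[GENERAL]) + float(row[DOMAIN])
        if float(row[TOTAL]) != expected:
            bad_rows.append(row_number)
    return bad_rows


print(find_inconsistent_totals(SAMPLE))
```

To run the same check on the downloaded file, the text of that file would be passed in place of SAMPLE, after confirming its actual header spelling.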
Galatanet dataset
data.niaid.nih.gov
Updated Oct 1, 2024
Labatut, Vincent (2024). Galatanet dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6811541
Explore at:
Dataset updated
Oct 1, 2024
Dataset provided by
Labatut, Vincent
Balasque, Jean-Michel
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Description. This project contains the dataset from the Galatanet survey, conducted in 2009 and 2010 at Galatasaray University in Istanbul (Turkey). The goal of this survey was to retrieve information regarding the social relationships between students, their feelings about the university in general, and their purchase behavior. The survey was conducted in two phases: the first in 2009 and the second in 2010.

The dataset includes two kinds of data. First, the answers to most of the questions are contained in a large table, available in both CSV and MS Excel formats. A description file explains the meaning of each field appearing in the table. Note that the survey form is also included in the archive, for reference (it is in French and Turkish only, though). Second, the social network of students is available in both Pajek and GraphML formats. Having both individual (nodal attributes) and relational (links) information in the same dataset is, to our knowledge, rare in public sources, and this makes the dataset (in our opinion) interesting and valuable.

All data are completely anonymous: students' names have been replaced by random numbers. Note that the survey is not exactly the same between the two phases: some small adjustments were applied thanks to the feedback from the first phase (but the datasets have been normalized since then). Also, the electronic form was very much improved for the second phase, which explains why the answers are much more complete than in the first phase.

The data were used in our following publications:

Labatut, V. & Balasque, J.-M. (2010). Business-oriented Analysis of a Social Network of University Students. In: International Conference on Advances in Social Network Analysis and Mining, 25-32. Odense, DK : IEEE. ⟨hal-00633643⟩ - DOI: 10.1109/ASONAM.2010.15

An extended version of the original article: Labatut, V. & Balasque, J.-M. (2013). Informative Value of Individual and Relational Data Compared Through Business-Oriented Community Detection. Özyer, T.; Rokne, J.; Wagner, G. & Reuser, A. H. (Eds.), The Influence of Technology on Social Network Analysis and Mining, Springer, 2013, chap.6, 303-330. ⟨hal-00633650⟩ - DOI: 10.1007/978-3-7091-1346-2_13

A more didactic article using some of these data just for illustration purposes: Labatut, V. & Balasque, J.-M. (2012). Detection and Interpretation of Communities in Complex Networks: Methods and Practical Application. Abraham, A. & Hassanien, A.-E. (Eds.), Computational Social Networks: Tools, Perspectives and Applications, Springer, chap.4, 81-113. ⟨hal-00633653⟩ - DOI: 10.1007/978-1-4471-4048-1_4

Citation. If you use this data, please cite article [1] above:

@InProceedings{Labatut2010,
  author = {Labatut, Vincent and Balasque, Jean-Michel},
  title = {Business-oriented Analysis of a Social Network of University Students},
  booktitle = {International Conference on Advances in Social Networks Analysis and Mining},
  year = {2010},
  pages = {25-32},
  address = {Odense, DK},
  publisher = {IEEE Publishing},
  doi = {10.1109/ASONAM.2010.15},
}

Contact. 2009-2010 by Jean-Michel Balasque (jmbalasque@gsu.edu.tr) & Vincent Labatut (vlabatut@gsu.edu.tr)

License. This dataset is open data: you can redistribute it and/or use it under the terms of the Creative Commons Zero license (see license.txt).
Master dataset: NSW government school locations and student enrolment...
data.nsw.gov.au
researchdata.edu.au
csv, json
Updated Jul 11, 2025
NSW Department of Education (2025). Master dataset: NSW government school locations and student enrolment numbers [Dataset]. https://data.nsw.gov.au/data/dataset/nsw-education-nsw-public-schools-master-dataset
Explore at:
Available download formats: csv(6537), csv(1283666), json(3526018)
Dataset updated
Jul 11, 2025
Dataset provided by
NSW Department of Education (https://education.nsw.gov.au/)
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
New South Wales, Government of New South Wales
Description
The master dataset contains comprehensive information for all government schools in NSW. Data items include school locations, latitude and longitude coordinates, school type, student enrolment numbers, electorate information, contact details and more.

This dataset is publicly available through the Data NSW website, and is used to support the School Finder tool.

Data Notes:

Data relating to healthy canteens is no longer up to date, as the Department no longer updates it; this data can be sourced through NSW Health.

Student enrolment numbers are based on the census of government school students undertaken on the first Friday of August; and LBOTE numbers are based on data collected in March.

School information, such as addresses and contact details, is updated regularly as required and is the most current source of information.

Indigenous and LBOTE percentages are suppressed where student numbers are equal to or less than five; suppressed values are indicated by "np".

NSSC out of scope schools will not have an enrolment figure.

NSSC and LBOTE figures are updated annually in December.

ICSEA values are updated every February with the previous year's ICSEA values. Small schools, SSPs and Senior Secondary schools do not have their ICSEA values published by ACARA.

Family Occupation and Educational Index (FOEI) is a school-level index of educational disadvantage. Data is extracted in May and values are updated annually in December.

Following the introduction of part-time study in secondary schools in 1993, student enrolments are generally reported in full-time equivalent units (FTE). The FTE for students studying less than 10 units, the minimum workload, is determined by the formula: 0.1 x the number of units studied and represented as a proportion of the full-time enrolment of 1.0 FTE.
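
The FTE rule in the last note can be written as a small function. This is a minimal sketch: the function name is hypothetical, and treating any student at or above the 10-unit minimum workload as exactly 1.0 FTE is an assumption implied by, but not spelled out in, the note.

```python
MIN_FULL_TIME_UNITS = 10  # minimum workload stated in the data note


def fte(units_studied):
    """Full-time equivalent (FTE) for one student's enrolment."""
    if units_studied < 0:
        raise ValueError("units_studied must be non-negative")
    if units_studied >= MIN_FULL_TIME_UNITS:
        # Assumption: a full workload or more counts as one full-time enrolment.
        return 1.0
    # 0.1 x units studied, written as a division to avoid float rounding drift.
    return units_studied / 10


print(fte(6))   # a student studying 6 units
print(fte(12))  # a student studying a full workload
```

Under this reading, a student taking 6 units contributes 0.6 FTE, so summed enrolment figures for a school need not be whole numbers.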

Data Source:

Education Statistics and Measurement. Centre for Education Statistics and Evaluation.
University SET data, with faculty and courses characteristics
openicpsr.org
Updated Sep 12, 2021
Under blind review in refereed journal (2021). University SET data, with faculty and courses characteristics [Dataset]. http://doi.org/10.3886/E149801V1
Explore at:
Unique identifier
https://doi.org/10.3886/E149801V1
Dataset updated
Sep 12, 2021
Authors
Under blind review in refereed journal
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This paper explores a unique dataset of all the SET ratings provided by students of one university in Poland at the end of the winter semester of the 2020/2021 academic year. The SET questionnaire used by this university is presented in Appendix 1.

The dataset is unique for several reasons. It covers all SET surveys filled by students in all fields and levels of study offered by the university. In the period analysed, the university was entirely in the online regime amid the Covid-19 pandemic. While the expected learning outcomes formally have not been changed, the online mode of study could have affected the grading policy and could have implications for some of the studied SET biases. This Covid-19 effect is captured by econometric models and discussed in the paper.

The average SET scores were matched with the characteristics of the teacher for degree, seniority, gender, and SET scores in the past six semesters; the course characteristics for time of day, day of the week, course type, course breadth, class duration, and class size; the attributes of the SET survey responses as the percentage of students providing SET feedback; and the grades of the course for the mean, standard deviation, and percentage failed. Data on course grades are also available for the previous six semesters. This rich dataset allows many of the biases reported in the literature to be tested for and new hypotheses to be formulated, as presented in the introduction section.

The unit of observation, or the single row in the data set, is identified by three parameters: teacher unique id (j), course unique id (k) and the question number in the SET questionnaire (n ϵ {1, 2, 3, 4, 5, 6, 7, 8, 9}). It means that for each pair (j,k), we have nine rows, one for each SET survey question, or sometimes fewer when students did not answer one of the SET questions at all. For example, the dependent variable SET_score_avg(j,k,n) for the triplet (j=John Smith, k=Calculus, n=2) is calculated as the average of all Likert-scale answers to question 2 in the SET survey distributed to all students that took the Calculus course taught by John Smith. The data set has 8,015 such observations or rows.

The full list of variables or columns in the data set included in the analysis is presented in the attached file section. Their description refers to the triplet (teacher id = j, course id = k, question number = n). When the last value of the triplet (n) is dropped, it means that the variable takes the same values for all n ϵ {1, 2, 3, 4, 5, 6, 7, 8, 9}.

Two attachments:
- Word file with variables description
- Rdata file with the data set (for R language)

Appendix 1. The SET questionnaire used for this paper.

Evaluation survey of the teaching staff of [university name]

Please, complete the following evaluation form, which aims to assess the lecturer’s performance. Only one answer should be indicated for each question. The answers are coded in the following way: 5- I strongly agree; 4- I agree; 3- Neutral; 2- I don’t agree; 1- I strongly don’t agree.

Questions (each answered on the 1 to 5 scale above):

I learnt a lot during the course.
I think that the knowledge acquired during the course is very useful.
The professor used activities to make the class more engaging.
If it was possible, I would enroll for the course conducted by this lecturer again.
The classes started on time.
The lecturer always used time efficiently.
The lecturer delivered the class content in an understandable and efficient way.
The lecturer was available when we had doubts.
The lecturer treated all students equally regardless of their race, background and ethnicity.
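
The way SET_score_avg(j,k,n) is defined in the description above can be sketched as a group-and-average step. The raw answers below are invented for illustration; the published data set contains only the averaged rows, and the function name and tuple layout are assumptions rather than anything taken from the authors' code.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical raw answers: (teacher j, course k, question n, Likert answer 1-5).
raw_answers = [
    ("John Smith", "Calculus", 2, 5),
    ("John Smith", "Calculus", 2, 4),
    ("John Smith", "Calculus", 2, 3),
    ("John Smith", "Calculus", 1, 5),
]


def set_score_avg(answers):
    """Average the Likert answers for each (teacher, course, question) triplet."""
    grouped = defaultdict(list)
    for teacher, course, question, score in answers:
        grouped[(teacher, course, question)].append(score)
    # One output entry per triplet, mirroring one row of the data set.
    return {triplet: mean(scores) for triplet, scores in grouped.items()}


averages = set_score_avg(raw_answers)
print(averages[("John Smith", "Calculus", 2)])
```

A question that no student answered never enters the grouping, which matches the note that some (j,k) pairs have fewer than nine rows.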



