https://creativecommons.org/publicdomain/zero/1.0/
If this data set is useful, an upvote is appreciated. These data address student achievement in secondary education at two Portuguese schools. The attributes include student grades and demographic, social, and school-related features, and the data were collected using school reports and questionnaires. Two datasets are provided, covering performance in two distinct subjects: Mathematics (mat) and Portuguese language (por). In [Cortez and Silva, 2008], the two datasets were modeled under binary/five-level classification and regression tasks. Important note: the target attribute G3 has a strong correlation with attributes G2 and G1. This occurs because G3 is the final year grade (issued at the 3rd period), while G1 and G2 correspond to the 1st- and 2nd-period grades. It is more difficult to predict G3 without G2 and G1, but such a prediction is much more useful (see the source paper for more details).
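The modeling setups mentioned above can be sketched as follows. This is a minimal illustration, not the authors' code: the helper names are hypothetical, and the 0-20 grade scale, the pass mark of 10, and the five-level cut-offs are recalled from the Cortez and Silva paper and should be checked against it before use.

```python
from typing import Dict, Tuple

PRIOR_GRADES = ("G1", "G2")  # period grades that correlate strongly with G3


def split_features_target(
    row: Dict[str, int], use_prior_grades: bool
) -> Tuple[Dict[str, int], int]:
    """Return (features, G3) for one student record, optionally hiding G1/G2."""
    features = {name: value for name, value in row.items() if name != "G3"}
    if not use_prior_grades:
        for name in PRIOR_GRADES:
            features.pop(name, None)
    return features, row["G3"]


def binary_label(g3: int, pass_mark: int = 10) -> str:
    """Binary task: pass/fail. The pass mark of 10 is an assumption."""
    return "pass" if g3 >= pass_mark else "fail"


def five_level_label(g3: int) -> str:
    """Five-level task. The cut-offs (16, 14, 12, 10) are assumptions."""
    for level, lower_bound in (("I", 16), ("II", 14), ("III", 12), ("IV", 10)):
        if g3 >= lower_bound:
            return level
    return "V"


# Made-up record for illustration only; real rows have many more attributes.
record = {"age": 17, "absences": 4, "G1": 11, "G2": 12, "G3": 13}
features, target = split_features_target(record, use_prior_grades=False)
print(features, target, binary_label(target), five_level_label(target))
```

Passing `use_prior_grades=False` corresponds to the harder but more useful setting described in the note, where G3 is predicted without G1 and G2.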
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Student Enrollment reports the number of enrolled students per year, per grade.
The data here is from the report entitled Trends in Enrollment, Credit Attainment, and Remediation at Connecticut Public Universities and Community Colleges: Results from P20WIN for the High School Graduating Classes of 2010 through 2016. The report answers three questions:
1. Enrollment: What percentage of the graduating class enrolled in a Connecticut public university or community college (UCONN, the four Connecticut State Universities, and 12 Connecticut community colleges) within 16 months of graduation?
2. Credit Attainment: What percentage of those who enrolled in a Connecticut public university or community college within 16 months of graduation earned at least one year’s worth of credits (24 or more) within two years of enrollment?
3. Remediation: What percentage of those who enrolled in one of the four Connecticut State Universities or one of the 12 community colleges within 16 months of graduation took a remedial course within two years of enrollment?
Notes on the data:
District Credit: % Earning 24 Credits is a subset of the % Earning 16 Credits
District Remed: % Enrolled in Remediation is a subset of the % Enrolled in 16 Months
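The three questions differ mainly in their denominators, which a small sketch can make explicit. All counts below are made up for illustration, and the variable names are hypothetical rather than taken from the report.

```python
def percent(part: int, whole: int) -> float:
    """Share of `part` in `whole`, as a percentage rounded to one decimal."""
    if whole <= 0:
        raise ValueError("denominator must be positive")
    return round(100.0 * part / whole, 1)


# Hypothetical counts for one graduating class in one district.
graduates = 400            # whole graduating class
enrolled_16_months = 160   # enrolled in any CT public institution within 16 months
earned_24_credits = 96     # of those enrolled, earned 24+ credits within two years
enrolled_csu_or_cc = 120   # enrolled in a State University or community college
took_remedial = 54         # of those, took a remedial course within two years

enrollment_pct = percent(enrolled_16_months, graduates)
credit_pct = percent(earned_24_credits, enrolled_16_months)
remediation_pct = percent(took_remedial, enrolled_csu_or_cc)
print(enrollment_pct, credit_pct, remediation_pct)
```

Only the enrollment rate uses the full graduating class as its base; the other two are computed over students who enrolled, which is why the notes describe them as subsets.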
Open Government Licence - Canada 2.0 https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
Data includes: board and school information, grade 3 and 6 EQAO student achievements for reading, writing and mathematics, and grade 9 mathematics EQAO and OSSLT. Data excludes private schools, Education and Community Partnership Programs (ECPP), summer, night and continuing education schools.
How Are We Protecting Privacy?
Results for OnSIS and Statistics Canada variables are suppressed based on school population size to better protect student privacy. In order to achieve this additional level of protection, the Ministry has used a methodology that randomly rounds a percentage either up or down depending on school enrolment. In order to protect privacy, the ministry does not publicly report on data when there are fewer than 10 individuals represented.
* Percentages depicted as 0 may not always be 0 values as in certain situations the values have been randomly rounded down or there are no reported results at a school for the respective indicator.
* Percentages depicted as 100 are not always 100, in certain situations the values have been randomly rounded up.
The school enrolment totals have been rounded to the nearest 5 in order to better protect and maintain student privacy.
The information in the School Information Finder is the most current available to the Ministry of Education at this time, as reported by schools, school boards, EQAO and Statistics Canada. The information is updated as frequently as possible. This information is also available on the Ministry of Education's School Information Finder website by individual school. Descriptions for some of the data types can be found in our glossary.
School/school board and school authority contact information are updated and maintained by school boards and may not be the most current version. For the most recent information please visit: https://data.ontario.ca/dataset/ontario-public-school-contact-information.
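Two of the privacy rules above are simple enough to sketch: rounding enrolment totals to the nearest 5, and withholding data when fewer than 10 individuals are represented. This is only an illustration under stated assumptions. The function names are hypothetical, and the Ministry's random rounding of percentages is not reproduced because its details are not given here.

```python
from typing import Optional


def round_enrolment(total: int, base: int = 5) -> int:
    """Round an enrolment total to the nearest multiple of `base` (5 here)."""
    return base * round(total / base)


def reportable_count(individuals: int, threshold: int = 10) -> Optional[int]:
    """Return the count, or None (suppressed) when below the threshold."""
    return individuals if individuals >= threshold else None


print(round_enrolment(12), round_enrolment(13))
print(reportable_count(9), reportable_count(10))
```

A consequence for analysis is that a published total such as 10 may stand for any true value from 8 to 12, so small differences between schools should not be over-interpreted.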
This file includes enrollment data from the 2012-13 school year. Data are disaggregated by school, district, and state levels and include counts of students by the following groups: grade level, gender, race/ethnicity, student programs, and special characteristics. Please review the notes below for more information.
In this project, we aim to analyze and gain insights into the performance of students based on various factors that influence their academic achievements. We have collected data related to students' demographic information, family background, and their exam scores in different subjects.
Key Objectives:
Performance Evaluation: Evaluate and understand the academic performance of students by analyzing their scores in various subjects.
Identifying Underlying Factors: Investigate factors that might contribute to variations in student performance, such as parental education, family size, and student attendance.
Visualizing Insights: Create data visualizations to present the findings effectively and intuitively.
Dataset Details:
Analysis Highlights:
We will perform a comprehensive analysis of the dataset, including data cleaning, exploration, and visualization to gain insights into various aspects of student performance.
By employing statistical methods and machine learning techniques, we will determine the significant factors that affect student performance.
Why This Matters:
Understanding the factors that influence student performance is crucial for educators, policymakers, and parents. This analysis can help in making informed decisions to improve educational outcomes and provide support where it is most needed.
Acknowledgments:
We would like to express our gratitude to [mention any data sources or collaborators] for making this dataset available.
Please Note:
This project is meant for educational and analytical purposes. The dataset used is fictitious and does not represent any specific educational institution or individuals.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
Data Set Characteristics: Multivariate
Number of Instances: 480
Area: E-learning, Education, Predictive models, Educational Data Mining
Attribute Characteristics: Integer/Categorical
Number of Attributes: 16
Date: 2016-11-8
Associated Tasks: Classification
Missing Values? No
File formats: xAPI-Edu-Data.csv
Elaf Abu Amrieh, Thair Hamtini, and Ibrahim Aljarah, The University of Jordan, Amman, Jordan, http://www.Ibrahimaljarah.com www.ju.edu.jo
This is an educational data set collected from a learning management system (LMS) called Kalboard 360. Kalboard 360 is a multi-agent LMS designed to facilitate learning through the use of leading-edge technology. Such a system provides users with synchronous access to educational resources from any device with an Internet connection.
The data is collected using a learner activity tracker tool called the experience API (xAPI). The xAPI is a component of the training and learning architecture (TLA) that makes it possible to monitor learning progress and learner actions such as reading an article or watching a training video. The experience API helps learning activity providers determine the learner, activity, and objects that describe a learning experience. The dataset consists of 480 student records and 16 features. The features are classified into three major categories: (1) demographic features such as gender and nationality; (2) academic background features such as educational stage, grade level, and section; (3) behavioral features such as raising a hand in class, opening resources, parents answering the survey, and school satisfaction.
The dataset consists of 305 males and 175 females. The students come from different origins: 179 students are from Kuwait, 172 from Jordan, 28 from Palestine, 22 from Iraq, 17 from Lebanon, 12 from Tunis, 11 from Saudi Arabia, 9 from Egypt, 7 from Syria, 6 each from the USA, Iran and Libya, 4 from Morocco, and one from Venezuela.
The dataset was collected over two educational semesters: 245 student records were collected during the first semester and 235 during the second semester.
The data set also includes a school attendance feature: students are classified into two categories based on their absence days. 191 students exceed 7 absence days and 289 students have under 7 absence days.
This dataset also includes a new category of features: parent participation in the educational process. The parent participation feature has two sub-features: Parent Answering Survey and Parent School Satisfaction. 270 of the parents answered the survey and 210 did not; 292 of the parents are satisfied with the school and 188 are not.
(See the related papers for more details).
1 Gender - student's gender (nominal: 'Male' or 'Female')
2 Nationality - student's nationality (nominal: 'Kuwait', 'Lebanon', 'Egypt', 'SaudiArabia', 'USA', 'Jordan', 'Venezuela', 'Iran', 'Tunis', 'Morocco', 'Syria', 'Palestine', 'Iraq', 'Lybia')
3 Place of birth - student's place of birth (nominal: 'Kuwait', 'Lebanon', 'Egypt', 'SaudiArabia', 'USA', 'Jordan', 'Venezuela', 'Iran', 'Tunis', 'Morocco', 'Syria', 'Palestine', 'Iraq', 'Lybia')
4 Educational Stages - educational level the student belongs to (nominal: 'lowerlevel', 'MiddleSchool', 'HighSchool')
5 Grade Levels - grade the student belongs to (nominal: 'G-01', 'G-02', 'G-03', 'G-04', 'G-05', 'G-06', 'G-07', 'G-08', 'G-09', 'G-10', 'G-11', 'G-12')
6 Section ID - classroom the student belongs to (nominal: 'A', 'B', 'C')
7 Topic - course topic (nominal: 'English', 'Spanish', 'French', 'Arabic', 'IT', 'Math', 'Chemistry', 'Biology', 'Science', 'History', 'Quran', 'Geology')
8 Semester - school year semester (nominal: 'First', 'Second')
9 Parent responsible for student (nominal: 'mom', 'father')
10 Raised hand - how many times the student raises his/her hand in the classroom (numeric: 0-100)
11 Visited resources - how many times the student visits course content (numeric: 0-100)
12 Viewing announcements - how many times the student checks the new announcements (numeric: 0-100)
13 Discussion groups - how many times the student participates in discussion groups (numeric: 0-100)
14 Parent Answering Survey - whether the parent answered the surveys provided by the school (nominal: 'Yes', 'No')
15 Parent School Satisfaction - the degree of parent satisfaction with the school (nominal: 'Yes', 'No')
16 Student Absence Days - the number of absence days for each student (nominal: above-7, under-7)
The students are classified into three numerical intervals based on their total grade/mark:
Low-Level: interval includes values from 0 to 69,
Middle-Level: interval includes values from 70 to 89,
High-Level: interval includes values from 90 to 100.
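The three intervals can be expressed as a small mapping function. This is a minimal sketch: the function name is hypothetical, and it assumes the intervals apply to a student's total mark on a 0 to 100 scale.

```python
def performance_level(total_mark: int) -> str:
    """Map a 0-100 total mark to the Low/Middle/High interval listed above."""
    if not 0 <= total_mark <= 100:
        raise ValueError("total mark must be between 0 and 100")
    if total_mark >= 90:
        return "High-Level"
    if total_mark >= 70:
        return "Middle-Level"
    return "Low-Level"


for mark in (69, 70, 89, 90):
    print(mark, performance_level(mark))
```

Checking the boundary values (69, 70, 89, 90) is the quickest way to confirm that an implementation matches the stated intervals.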
-Amrieh, E. A., Hamtini, T., & Aljarah, I. (2016). Mining Educational Data to Predict Student’s academic Performance using Ensemble Methods. International Journal of Database Theory and Application, 9(8), 119-136.
-Amrieh, E. A., Hamtini, T., & Aljarah, I. (2015, November). Preprocessing and analyzing educational data set using X-API for improving student's performance. In Applied Electrical Engineering and Computing Technologies (AEECT), 2015 IEEE Jordan Conference on (pp. 1-5). IEEE.
Please include these citations if you plan to use this dataset:
-Amrieh, E. A., Hamtini, T., & Aljarah, I. (2015, November). Preprocessing and analyzing educational data set using X-API for improving student's performance. In Applied Electrical Engineering and Computing Technologies (AEECT), 2015 IEEE Jordan Conference on (pp. 1-5). IEEE.
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Data Description
* Private: A factor with levels No and Yes indicating private or public university
* Apps: Number of applications received
* Accept: Number of applications accepted
* Enroll: Number of new students enrolled
* Top10perc: Pct. new students from top 10% of H.S. class
* Top25perc: Pct. new students from top 25% of H.S. class
* F.Undergrad: Number of fulltime undergraduates
* P.Undergrad: Number of parttime undergraduates
* Outstate: Out-of-state tuition
* Room.Board: Room and board costs
* Books: Estimated book costs
* Personal: Estimated personal spending
* PhD: Pct. of faculty with Ph.D.’s
* Terminal: Pct. of faculty with terminal degree
* S.F.Ratio: Student/faculty ratio
* perc.alumni: Pct. alumni who donate
* Expend: Instructional expenditure per student
* Grad.Rate: Graduation rate
You can use it for clustering projects.
ODC Public Domain Dedication and Licence (PDDL) v1.0 http://www.opendatacommons.org/licenses/pddl/1.0/
License information was derived automatically
Dataset Overview: This dataset pertains to the examination results of students who participated in a series of academic assessments at a fictitious educational institution named "University of Exampleville." The assessments were administered across various courses and academic levels, with a focus on evaluating students' performance in general management and domain-specific topics.
Columns: The dataset comprises 12 columns, each representing specific attributes and performance indicators of the students. These columns encompass information such as the students' names (which have been anonymized), their respective universities, academic program names (including BBA and MBA), specializations, the semester of the assessment, the type of examination domain (general management or domain-specific), general management scores (out of 50), domain-specific scores (out of 50), total scores (out of 100), student ranks, and percentiles.
Data Collection: The examination data was collected during a standardized assessment process conducted by the University of Exampleville. The exams were designed to assess students' knowledge and skills in general management and their chosen domain-specific subjects. It involved students from both BBA and MBA programs who were in their final year of study.
Data Format: The dataset is available in a structured format, typically as a CSV file. Each row represents a unique student's performance in the examination, while columns contain specific information about their results and academic details.
Data Usage: This dataset is valuable for analyzing and gaining insights into the academic performance of students pursuing BBA and MBA degrees. It can be used for various purposes, including statistical analysis, performance trend identification, program assessment, and comparison of scores across domains and specializations. Furthermore, it can be employed in predictive modeling or decision-making related to curriculum development and student support.
Data Quality: The dataset has undergone preprocessing and anonymization to protect the privacy of individual students. Nevertheless, it is essential to use the data responsibly and in compliance with relevant data protection regulations when conducting any analysis or research.
Data Format: The exam data is typically provided in a structured format, commonly as a CSV (Comma-Separated Values) file. Each row in the dataset represents a unique student's examination performance, and each column contains specific attributes and scores related to the examination. The CSV format allows for easy import and analysis using various data analysis tools and programming languages like Python, R, or spreadsheet software like Microsoft Excel.
Here's a column-wise description of the dataset:
Name OF THE STUDENT: The full name of the student who took the exam. (Anonymized)
UNIVERSITY: The university where the student is enrolled.
PROGRAM NAME: The name of the academic program in which the student is enrolled (BBA or MBA).
Specialization: If applicable, the specific area of specialization or major that the student has chosen within their program.
Semester: The semester or academic term in which the student took the exam.
Domain: Indicates whether the exam was divided into two parts: general management and domain-specific.
GENERAL MANAGEMENT SCORE (OUT of 50): The score obtained by the student in the general management part of the exam, out of a maximum possible score of 50.
Domain-Specific Score (Out of 50): The score obtained by the student in the domain-specific part of the exam, also out of a maximum possible score of 50.
TOTAL SCORE (OUT of 100): The total score obtained by adding the scores from the general management and domain-specific parts, out of a maximum possible score of 100.
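Because the total is defined as the sum of the two part scores, each row can be checked for internal consistency. The sketch below assumes the CSV headers are spelled exactly as in the column list above and that scores are numeric; the function name and the sample row are made up.

```python
GENERAL = "GENERAL MANAGEMENT SCORE (OUT of 50)"
DOMAIN = "Domain-Specific Score (Out of 50)"
TOTAL = "TOTAL SCORE (OUT of 100)"


def total_is_consistent(row: dict) -> bool:
    """True when both parts lie in 0-50 and their sum equals the total."""
    general = float(row[GENERAL])
    domain = float(row[DOMAIN])
    total = float(row[TOTAL])
    parts_in_range = 0 <= general <= 50 and 0 <= domain <= 50
    return parts_in_range and general + domain == total


# Made-up row, shaped like one record read with csv.DictReader.
sample = {GENERAL: "38", DOMAIN: "41", TOTAL: "79"}
print(total_is_consistent(sample))
```

Rows that fail the check are worth inspecting before any ranking or percentile analysis, since those derived columns depend on the total.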
Student enrollment data disaggregated by students from low-income families, students from each racial and ethnic group, gender, English learners, children with disabilities, children experiencing homelessness, children in foster care, and migratory students for each mode of instruction.
This dataset includes the attendance rate for public school students PK-12 by student group and by district during the 2021-2022 school year. Student groups include:
* Students experiencing homelessness
* Students with disabilities
* Students who qualify for free/reduced lunch
* English learners
* All high needs students
* Non-high needs students
* Students by race/ethnicity (Hispanic/Latino of any race, Black or African American, White, All other races)
Attendance rates are provided for each student group by district and for the state. Students who are considered high needs include students who are English language learners, who receive special education, or who qualify for free and reduced lunch. When no attendance data is displayed in a cell, data have been suppressed to safeguard student confidentiality, or to ensure that statistics based on a very small sample size are not interpreted as equally representative as those based on a sufficiently larger sample size. For more information on CSDE data suppression policies, please visit http://edsight.ct.gov/relatedreports/BDCRE%20Data%20Suppression%20Rules.pdf.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset's novelty lies in its coverage of student grading across diverse educational cultures in multiple countries. Researchers and others in the education sector will be able to see the impact of having varied curricula in one country. The dataset compares different levelling cases that arise when students transfer from curriculum to curriculum, along with the unreliable levelling criteria currently set by schools in an international school setting. The collected data can be used with intelligent algorithms, specifically machine learning and pattern analysis methods, to develop an intelligent framework applicable in multi-cultural educational systems that aids the smooth transition (hereafter, "levelling") of students who relocate from one education curriculum to another and minimizes the impact of switching on the students' educational performance. The preliminary variables taken into consideration when deciding which data to collect depended on the variables. The UAE is a multicultural country with many expats relocating from regions such as Asia, Europe and America. To meet expats' needs, the UAE has established many international private schools; it was therefore chosen as the location of the study, based on the many levelling cases and struggles declared by the Ministry of Education and schools. For the first time, we present this dataset comprising students' records for two academic years, covering math, English, and science over 3 terms. The selection of subject areas and number of terms was influenced by other researchers working on similar subject matter.
Patterns of educational attainment vary greatly across countries, and across population groups within countries. In some countries, virtually all children complete basic education whereas in others large groups fall short. The primary purpose of this database, and the associated research program, is to document and analyze these differences using a compilation of a variety of household-based data sets: Demographic and Health Surveys (DHS); Multiple Indicator Cluster Surveys (MICS); Living Standards Measurement Study Surveys (LSMS); as well as country-specific Integrated Household Surveys (IHS) such as Socio-Economic Surveys.
As shown at the website associated with this database, there are dramatic differences in attainment by wealth. When households are ranked according to their wealth status (or more precisely, a proxy based on the assets owned by members of the household) there are striking differences in the attainment patterns of children from the richest 20 percent compared to the poorest 20 percent.
In Mali in 2012 only 34 percent of 15 to 19 year olds in the poorest quintile have completed grade 1 whereas 80 percent of the richest quintile have done so. In many countries, for example Pakistan, Peru and Indonesia, almost all the children from the wealthiest households have completed at least one year of schooling. In some countries, like Mali and Pakistan, wealth gaps are evident from grade 1 on, in other countries, like Peru and Indonesia, wealth gaps emerge later in the school system.
The EdAttain website allows a visual exploration of gaps in attainment and enrollment within and across countries, based on the international database which spans multiple years from over 120 countries and includes indicators disaggregated by wealth, gender and urban/rural location. The database underlying that site can be downloaded from here.
https://dataful.in/terms-and-conditions
The dataset contains academic year-, gender- and state-wise compiled data on the number of students who have passed out in certificate, diploma, integrated, PG diploma, undergraduate, postgraduate, M.Phil. and Ph.D. educational courses from 2010-11 to 2020-21. In addition, the dataset contains separate data on the number of students who have passed out with 60% or more marks.
This dataset contains college enrollment information by Michigan House of Representatives district. College enrollment was defined as the number of public high school students who graduated in 2017 and enrolled in a college or university. This dataset includes enrollment in two-year and four-year institutions of higher education. Click here for metadata (descriptions of the fields).
2018 DC School Report Card. STAR Framework student group scores by school and school framework. The STAR Framework measures performance for 10 different student groups with a minimum n size of 10 or more students at the school. The student groups are All Students, Students with Disabilities, Students who are At Risk, English Learners, and students who identify as the following ESSA-defined racial/ethnic groups: American Indian or Alaskan Native, Asian, Black or African American, Hispanic/Latino of any race, Native Hawaiian or Other Pacific Islander, White, and Two or more races. The Alternative School Framework includes an eleventh student group, At-Risk Students with Disabilities.
Some students are included in the school- and LEA-level aggregations that will display on the DC School Report Card but are not included in calculations for the STAR Framework. These students are included in the “All Report Card Students” student group to distinguish from the “All Students” group used for the STAR Framework.
Supplemental:
Metric scores are not reported for n-sizes less than 10; metrics that have an n-size less than 10 are not included in calculation of STAR scores and ratings.
At the state level, teacher data is reported on the DC School Report Card for all schools, high-poverty schools, and low-poverty schools. The definition for high-poverty and low-poverty schools is included in DC's ESSA State Plan. At the school level, teacher data is reported for the entire school, and at the LEA-level, teacher data is reported for all schools only.
On the STAR Framework, 203 schools received STAR scores and ratings based on data from the 2017-18 school year. Of those 203 schools, 2 schools closed after the completion of the 2017-18 school year (Excel Academy PCS and Washington Mathematics Science Technology PCHS). Because those two schools closed, they do not receive a School Report Card and report card metrics were not calculated for those schools.
Schools with non-traditional grade configurations may be assigned multiple school frameworks as part of the STAR Framework. For example, a K-8 school would be assigned the Elementary School Framework and the Middle School Framework. Because a school may have multiple school frameworks, the total number of school framework scores across the city will be greater than the total number of schools that received a STAR score and rating.
Detailed information about the metrics and calculations for the DC School Report Card and STAR Framework can be found in the 2018 DC School Report Card and STAR Framework Technical Guide (https://osse.dc.gov/publication/2018-dc-school-report-card-and-star-framework-technical-guide).
Student Enrollment Document Retrieval
This dataset is created from the original Kaggle Delaware Student Enrollment dataset. The charts are rendered and queries created using templates. The text_description column contains OCR text extracted from the images using EasyOCR. This particular dataset is a subsample of at maximum 1000 random rows from the full dataset which can be found here.
Disclaimer
This dataset may contain publicly available images or text data. All… See the full description on the dataset page: https://huggingface.co/datasets/jinaai/student-enrollment.
https://creativecommons.org/publicdomain/zero/1.0/
This Dataset comes from Mahmoud Elhemaly.
I've modified this dataset since it had no correlation between the variables. I've used it for data visualization in Tableau. Many columns contain inaccurate data.
Description
Student Performance & Behavior Dataset
This dataset contains 5,000 records of real data collected from a private learning provider. The dataset includes key attributes necessary for exploring patterns, correlations, and insights related to academic performance.
Columns:
Student_ID: Unique identifier for each student.
First_Name: Student’s first name.
Last_Name: Student’s last name.
Email: Contact email (can be anonymized).
Gender: Male, Female, Other.
Age: The age of the student.
Department: Student's department (e.g., CS, Engineering, Business).
Attendance (%): Attendance percentage (0-100%).
Participation_Score: Score based on class participation (0-10).
Projects_Score: Project evaluation score (out of 100).
Total_Score: Weighted sum of all grades.
Grade: Letter grade (A, B, C, D, F).
Study_Hours_per_Week: Average study hours per week.
Extracurricular_Activities: Whether the student participates in extracurriculars (Yes/No).
Internet_Access_at_Home: Does the student have access to the internet at home? (Yes/No).
Parent_Education_Level: Highest education level of parents (None, High School, Bachelor's, Master's, PhD).
Family_Income_Level: Low, Medium, High.
Stress_Level (1-10): Self-reported stress level (1: Low, 10: High).
Sleep_Hours_per_Night: Average hours of sleep per night.
Sleep_Hours_per_Night_Entier: Sleep hours per night, as integers only.
Country: Country of origin.
Dataset contains:
Missing values (nulls): in some records (e.g., Attendance, Assignments, or Parent Education Level).
Bias in some data (e.g., in grading: students with high attendance get slightly better grades).
Imbalanced distributions (e.g., some departments having more students).
Our team was building an analytics Django web application that generated insights about students' current-semester performance and also their historical performance. For the analytics and prediction features we needed data, which was not available in sufficient quantity.
So I have created this dataset with relevant conditions and corner cases.
How many students pass the exam?
What will be the failure ratio of students?
What will be the marks for the final exam?
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Explore all our datasets in raw format