According to a 2023 survey, ** percent of undergraduate students who were studying online in the United States were White, while ** percent were Black or African-American. In comparison, ** percent of graduate students studying online in the United States in that year were White, while ** percent were Black or African American.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Open University is a public British university that has the highest number of undergraduate students in the UK. It is the largest academic institution in the United Kingdom (and one of the largest in Europe), with 2 million students enrolled since it was established in 1969. As its name suggests, the Open University is mainly populated by off-campus students.
This dataset comes from the Open University's online learning platform (also called the "Virtual Learning Environment (VLE)"), which off-campus students use to access course content, take part in forum discussions, submit assessments, and check assignment marks. It covers 7 selected courses (referred to as modules in the dataset). Different presentations of a module are indicated by the letter "B" or "J" after the year, for semester 2 and semester 1 respectively.
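The presentation naming scheme described above can be sketched as a small parser. This is a minimal illustration, assuming codes take the form of a four-digit year followed directly by the letter (for example `2013J`); the function and type names are hypothetical and not part of the dataset.

```python
import re
from typing import NamedTuple


class Presentation(NamedTuple):
    year: int
    letter: str
    semester: int


# Mapping as stated in the description: "B" -> semester 2, "J" -> semester 1.
SEMESTER_BY_LETTER = {"B": 2, "J": 1}


def parse_presentation(code: str) -> Presentation:
    """Split a presentation code such as '2013J' into year, letter and semester.

    Assumes the code is a four-digit year immediately followed by 'B' or 'J'.
    """
    match = re.fullmatch(r"(\d{4})([BJ])", code.strip())
    if match is None:
        raise ValueError(f"unrecognised presentation code: {code!r}")
    year, letter = int(match.group(1)), match.group(2)
    return Presentation(year, letter, SEMESTER_BY_LETTER[letter])
```

Calling `parse_presentation("2014B")` would then give a record whose `semester` field is 2, which makes it easy to group or sort presentations by year and semester.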
Additionally, the dataset includes student demographics such as location, age group, disability, education level, and gender. Student assessment marks and interactions with the Virtual Learning Environment (VLE) are also included.
Kuzilek J., Hlosta M., Zdrahal Z. Open University Learning Analytics dataset. Sci. Data 4:170171 (2017). doi: 10.1038/sdata.2017.171
Personalisation of online education has great potential to enhance the quality and efficiency of education. Any insights into how people differ from each other in the way they learn would be valuable.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset simulates behavioral and demographic data for 5,000 students enrolled in online learning platforms. It is designed for building and evaluating machine learning models that predict student dropout risk.
Features:
- Demographics: age, region
- Engagement: login frequency, forum posts, assignment completion rate
- Activity: last activity days ago, courses enrolled
- Temporal: enrollment date, exam season flag
Target Labels:
- label → Binary: 0 = active/at-risk, 1 = dropped
- label_multiclass → 3-class: 0 = active, 1 = at-risk, 2 = dropped
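The relationship between the two target columns can be sketched as follows. This is a minimal illustration, assuming the binary `label` is simply the 3-class `label_multiclass` with "active" and "at-risk" merged, which the listing implies but does not state outright; the helper names are hypothetical.

```python
from collections import Counter

# Class codes for label_multiclass, as given in the listing.
ACTIVE, AT_RISK, DROPPED = 0, 1, 2


def to_binary_label(label_multiclass: int) -> int:
    """Collapse the 3-class label to the binary one: 1 only for 'dropped'."""
    if label_multiclass not in (ACTIVE, AT_RISK, DROPPED):
        raise ValueError(f"unexpected class code: {label_multiclass!r}")
    return 1 if label_multiclass == DROPPED else 0


def class_shares(labels: list) -> dict:
    """Return the fraction of rows in each class, keyed by class code."""
    counts = Counter(labels)
    total = len(labels)
    return {code: counts[code] / total for code in sorted(counts)}
```

A check such as `class_shares(multiclass_column)` is a quick way to see whether a loaded copy actually shows the roughly even three-way split the description mentions; note that under this mapping the binary label would then be imbalanced, with about one third of rows in the positive class.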
Advanced Patterns Included:
- Low login frequency + incomplete assignments → higher dropout probability
- Seasonal drift: dropout increases during final exam periods
- Balanced class distribution (~33% per class)
Use Cases: binary classification, multi-class classification, feature importance analysis, EdTech analytics, student retention modeling.
This statistic shows the distribution of target populations of online education programs in the United States in 2019. In 2019, ** percent of respondents stated that their online education programs were aimed at adult students returning to school after an absence.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Dataset Description
Online Learning Engagement Dataset
This dataset contains synthetic data representing student engagement and learning behavior in an online education environment. It includes various features related to student demographics, study habits, platform activity, and academic performance.
The dataset is designed to simulate realistic online learning scenarios and can be used for educational data mining, machine learning experiments, and analytics projects.
Key variables include student login frequency, study hours, video watch time, assignment submissions, quiz attempts, and engagement metrics. These features help capture how students interact with digital learning platforms.
Possible Use Cases
This dataset can be used for several machine learning and data analysis tasks, such as:
Student engagement analysis
Dropout prediction models
Academic performance prediction
Behavioral analytics in online education
Educational data mining research
Dataset Structure
The dataset includes features such as:
Student demographics (age, gender, country)
Learning environment variables (device type, internet speed)
Platform activity metrics (login frequency, session duration)
Learning performance indicators (quiz scores, assignments)
Engagement metrics and final grade
Inspiration
Online education platforms generate large amounts of behavioral data. Understanding student engagement patterns can help educators improve learning experiences and identify students who may be at risk of dropping out.
According to a survey conducted in 2022, ** percent of students in higher education agreed that the quality of online instruction in higher education is the same as the quality of in-person instruction in the United States, while ** percent said that the quality was worse.
By Yifan Zhu
Detailed Online Course Enrollment and Student Engagement Data
The dataset provides comprehensive details about students' engagement, behavior, activity, and performance pertaining to the online courses they have registered for. Each record in this data collection corresponds to one student's experience with a specific course.
The dataset comprises detailed metrics and indicators of each student's interaction with their chosen online course. It includes the student's registration status, whether they viewed the course content, whether they explored it in detail or just skimmed through it, and whether they were certified on completion.
One significant aspect captured is user demographics: geographical location ('final_cc_cname_DI'), level of education ('LoE_DI'), year of birth ('YoB'), and gender. These support analysis of the diverse cohort taking these courses.
The data also covers academic performance and timing: each individual's grade ('grade'), the date they started the course ('start_time_DI'), and the date they were last active on it ('last_event_DI').
Human-Computer Interaction (HCI) metrics offer another valuable perspective. These record specific actions performed by students while learning: the number of events logged while interacting with the digital coursework ('nevents'), the number of active days spent in the digital learning environment ('ndays_act'), the number of chapters each student interacted with in each course ('nchapters'), and the number of forum posts written by learners ('nforum_posts').
Another notable feature is 'roles', which categorizes the role each enrollee played within the course: primarily student, but perhaps occasionally instructor or other staff member.
Finally, 'incomplete_flag' indicates whether the collected student information might be incomplete for some reason.
Together, these parameters form a rich collection of behavioral, demographic, and performance-related data. The dataset is well suited to analyzing or building predictive models of online learning environments, student behavior in digital classrooms, and engagement with course material.
This dataset is a gold mine for anyone interested in student behavior, engagement and performance in online courses. Here's a guide on how you could potentially use this dataset.
Educational Research
The dataset provides data that can help researchers understand trends in online education.
Student Demographics: With attributes like Year of Birth (YoB), Level of Education (LoE_DI), Gender, and Country of the Student (final_cc_cname_DI) - researchers can undertake demographic-based analysis.
Student Engagement: Use metrics like the number of forum posts (nforum_posts), the number of days the student was active with the course content (ndays_act), whether they viewed or explored the course content (the viewed and explored flags), and how many chapters they interacted with during their engagement period (nchapters).
Predictive Modelling
Machine learning engineers could use this data to train models for predictive purposes:
Predicting Course Completion: Using attributes such as level of education, registration status, and initial start time, we could predict whether students are likely to finish a course. The 'certified' field indicates whether a student finished a course, so it could be used as a label for a supervised learning model.
Predicting Performance: Features describing user interactions, such as 'nevents', 'nchapters', and 'ndays_act', could indicate user engagement, which can help predict which users are likely to score higher grades ('grade').
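Before training a model, it can help to check how strongly a single engagement flag relates to the 'certified' label. The sketch below computes the certification rate per value of a chosen column. It assumes that 'certified' and 'explored' are encoded as 0/1, which the description does not confirm, and the sample rows are hypothetical values made up for illustration, not records from the dataset.

```python
from collections import defaultdict

# Hypothetical rows for illustration only; not taken from the dataset.
# Assumption: 'explored' and 'certified' are 0/1 flags.
sample_rows = [
    {"explored": 1, "ndays_act": 12, "nchapters": 9, "certified": 1},
    {"explored": 1, "ndays_act": 7, "nchapters": 6, "certified": 0},
    {"explored": 0, "ndays_act": 1, "nchapters": 1, "certified": 0},
    {"explored": 0, "ndays_act": 2, "nchapters": 2, "certified": 0},
    {"explored": 1, "ndays_act": 20, "nchapters": 12, "certified": 1},
]


def certification_rate_by(rows: list, key: str) -> dict:
    """Return the share of certified students for each value of `key`."""
    totals = defaultdict(int)
    certified = defaultdict(int)
    for row in rows:
        totals[row[key]] += 1
        certified[row[key]] += row["certified"]
    return {value: certified[value] / totals[value] for value in totals}


rates = certification_rate_by(sample_rows, "explored")
```

A large gap between the groups suggests the flag is a useful feature, but it says nothing about causation, and on real data the 'incomplete_flag' rows may need to be handled first.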
Data Visualisation Projects
For those interested mainly in visualisation projects:
- You can visualise demographic distribution
- Create engaging visuals showing correlation between different fields such as Level Of Education vs Grade obtained
- Show what factors contribute most towards completing a course ...
In 2022, **** percent of higher education students in the United States were taking exclusively distance learning courses. A further **** percent of students were taking at least some distance learning courses. For both of these groups, this is a decrease from the previous year, demonstrating the declining impact of the COVID-19 pandemic.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains information on secondary school student performance collected from two Portuguese schools. It was originally introduced by Cortez & Silva in the paper “Using Data Mining to Predict Secondary School Student Performance.”
The data was gathered through school reports and student questionnaires, covering demographic, social, and academic-related variables. Two separate datasets are provided:
- student-mat.csv → Math course performance
- student-por.csv → Portuguese language course performance
Number of instances: 649 (Mathematics) + 649 (Portuguese)
Number of features: 30 input variables + 3 grade outputs (G1, G2, G3)
Target variable: G3 (final grade, 0–20 scale)
Missing values: None
The main goal is to predict student academic success, especially the final grade G3.
Since G1 (first period grade) and G2 (second period grade) are highly correlated with G3, experiments can be designed with or without these features:
- G3 using G1 and G2
- G3 without G1 and G2
This dataset is suitable for:
The dataset includes 30 attributes from multiple categories:
- sex, age, address, famsize, Pstatus
- Medu, Fedu, Mjob, Fjob, guardian
- school, reason, traveltime, studytime, failures
- schoolsup, famsup, paid, activities, nursery, higher, internet
- romantic, freetime, goout, Dalc, Walc, health, absences
- G1, G2, G3
This dataset is a playground for classification & regression tasks, ideal for experimenting with feature selection, ensemble methods, and interpretable ML approaches.
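The two experimental setups (predicting G3 with or without G1 and G2) come down to choosing which columns are passed to a model. The following is a minimal sketch, with the attribute names copied as they appear in the listing above; the function names are hypothetical and rows are assumed to be plain dictionaries keyed by column name.

```python
# Input attributes as they appear in the listing above.
INPUT_ATTRIBUTES = [
    "sex", "age", "address", "famsize", "Pstatus",
    "Medu", "Fedu", "Mjob", "Fjob", "guardian",
    "school", "reason", "traveltime", "studytime", "failures",
    "schoolsup", "famsup", "paid", "activities", "nursery", "higher", "internet",
    "romantic", "freetime", "goout", "Dalc", "Walc", "health", "absences",
]
PERIOD_GRADES = ["G1", "G2"]
TARGET = "G3"


def feature_columns(include_period_grades: bool) -> list:
    """Return the feature columns for one of the two experimental setups."""
    columns = list(INPUT_ATTRIBUTES)
    if include_period_grades:
        columns += PERIOD_GRADES
    return columns


def split_row(row: dict, include_period_grades: bool) -> tuple:
    """Split one record into (feature values, target value)."""
    columns = feature_columns(include_period_grades)
    return [row[name] for name in columns], row[TARGET]
```

Keeping the target out of both feature sets avoids leaking G3 into the inputs, and comparing results from `feature_columns(True)` and `feature_columns(False)` shows how much of the predictive power comes from the earlier period grades.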
https://creativecommons.org/publicdomain/zero/1.0/
This is an online survey of Bangladeshi students. It has various columns, such as level of education, age, access to the internet, impact of online classes on study hours, issues faced in attending classes, and the device preferred for attending online classes.
The dataset is in tabular format and contains 18 columns and 8784 rows.
@data{DVN/PLN7GM_2021,
  author = {Ferdows, Jannatul},
  publisher = {Harvard Dataverse},
  title = {{Online Survey Data of Bangladeshi Students}},
  UNF = {UNF:6:mEhft2rgYEkMCtUf6rtYog==},
  year = {2021},
  version = {V1},
  doi = {10.7910/DVN/PLN7GM},
  url = {https://doi.org/10.7910/DVN/PLN7GM}
}
Attribution 3.0 (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
The Online Learning Behavior Dataset (Worldwide) is a synthetic dataset designed to represent global online education trends and student engagement patterns across multiple countries. This dataset simulates realistic learner activity data collected from various digital learning platforms and is ideal for data analysis, visualization, and machine learning practice.
It includes demographic information, academic background details, device and platform usage behavior, and performance indicators such as completion rates and satisfaction scores. The dataset enables exploration of worldwide e-learning habits and supports predictive modeling tasks such as student satisfaction prediction, engagement analysis, and completion rate forecasting.
🌍 Key Features:
- Global Coverage: Multiple countries across different regions
- Demographics: Age, Gender, Education Level
- Academic Fields: Computer Science, Business, Engineering, Arts, Medicine, Data Science
- Platform Usage: Coursera, Udemy, edX, Khan Academy, Udacity, YouTube Learning
- Device Usage: Mobile, Laptop, Tablet, Desktop
- Learning Modes: Self-Paced, Instructor-Led, Hybrid
Engagement Metrics:
- Daily Learning Hours
- Quizzes Attempted
- Assignments Submitted
Performance Metrics:
- Course Completion Rate (%)
- Satisfaction Score (1–5)
🎯 Potential Use Cases:
- Exploratory Data Analysis (EDA)
- Student engagement behavior modeling
- Completion rate prediction
- Satisfaction score classification
- Global online education trend analysis
- Data visualization projects
- Machine learning practice datasets
⚠️ Note:
This is a synthetic dataset generated for educational and research purposes. It does not represent real student data and contains no personally identifiable information (PII).
During a survey conducted in Spring 2023 in the United States, the most popular factor for choosing online education was the affordability of the program, with ** percent of respondents reporting this as one of their top three reasons. The second most popular factor was the reputation of the school or program.
https://scoop.market.us/privacy-policy
eLearning Statistics: In 1999, the term "eLearning", short for "electronic learning", was coined. eLearning is also known as online learning and refers to learning that occurs at a distance.
eLearning contains a broad range of courses from online college courses for K-12 students to employer training courses.
These digital resources and web applications, such as course management systems, facilitate this technological form of education.
They allow students to communicate with their professors and classmates through email or chat sessions while taking online classes, downloading course materials, and doing other related activities.
Students usually need only a device like a laptop, tablet, or smartphone with Wi-Fi to participate in eLearning. The low entry barrier makes eLearning accessible and provides other advantages.
In the digital age, many trends have taken off quickly. The growth of eLearning has been exceptional and shows no signs of slowing.
Distance learning, or E-learning as it is commonly known, doesn’t take place in the traditional classroom setting where a teacher regulates and moderates information.
https://www.marketreportanalytics.com/privacy-policy
Discover the booming online higher education market! This comprehensive analysis reveals a 19.82% CAGR, key drivers, trends, and regional breakdowns, highlighting leading companies and competitive strategies. Explore the future of online learning and its impact on education.
According to a survey conducted in 2023, ** percent of college students in the United States said that they preferred lab or interactive work to be conducted in person, while ** percent preferred online. Taking exams and researching were the only activities that college students were more likely to say that they preferred online rather than in person in that year.
https://www.verifiedmarketresearch.com/privacy-policy/
U.S. Higher Education Market size was valued at USD 101165.92 Million in 2024 and is projected to reach USD 176174.98 Million by 2032, growing at a CAGR of 7.18% from 2026 to 2032.
Increasing Demand for Skilled Professionals in a Knowledge-Driven Economy: The relentless evolution of the global economy into a knowledge-driven powerhouse is a primary engine of the U.S. higher education market. As industries become increasingly sophisticated and automated, the demand for highly skilled professionals with specialized knowledge and critical thinking abilities has skyrocketed.
Rising Enrollment in Online and Hybrid Learning Programs Offering Flexibility: The profound shift towards online and hybrid learning models has fundamentally reshaped the accessibility and delivery of higher education, acting as a powerful growth driver.
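The compound annual growth rate (CAGR) quoted above can be checked with the standard formula, CAGR = (end / start)^(1 / years) - 1. The sketch below is a rough check rather than the report's own method. It assumes the compounding window is the eight years between the 2024 valuation and the 2032 projection, even though the text gives the forecast period as 2026 to 2032.

```python
def cagr(start_value: float, end_value: float, years: float) -> float:
    """Compound annual growth rate as a fraction (0.05 means 5% per year)."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# Figures quoted in the text, in USD million.
value_2024 = 101165.92
value_2032 = 176174.98

# Assumption: growth compounds over 2024-2032 (8 years).
rate = cagr(value_2024, value_2032, 2032 - 2024)
print(f"Implied CAGR: {rate:.2%}")
```

Under that eight-year assumption, the implied rate comes out close to the quoted 7.18%, which suggests the figure was derived from the 2024 base value rather than from a 2026 value.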
https://creativecommons.org/publicdomain/zero/1.0/
If this dataset is useful, an upvote is appreciated. This data addresses student achievement in secondary education at two Portuguese schools. The data attributes include student grades and demographic, social, and school-related features, and the data was collected using school reports and questionnaires. Two datasets are provided regarding performance in two distinct subjects: Mathematics (mat) and Portuguese language (por). In [Cortez and Silva, 2008], the two datasets were modeled under binary/five-level classification and regression tasks. Important note: the target attribute G3 has a strong correlation with attributes G2 and G1. This occurs because G3 is the final year grade (issued at the 3rd period), while G1 and G2 correspond to the 1st and 2nd period grades. It is more difficult to predict G3 without G2 and G1, but such prediction is much more useful (see paper source for more details).
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset provides detailed records of student dropout events from online courses, including coded reasons, timestamps, and anonymized demographic information. It enables comprehensive analysis of dropout patterns, supports targeted retention strategies, and facilitates research into factors affecting digital education completion rates.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description: This dataset supports a research study evaluating the challenges faced by undergraduate nursing students in Iraq regarding online nursing education. The study was conducted from September 26, 2020, to April 10, 2021, across eight nursing colleges in Iraq. It investigates various challenges related to learning, technology, instructor competency, communication, course design, and psychosocial factors.
Data Collection: The dataset was collected through a self-reported questionnaire distributed to 320 undergraduate nursing students from eight universities. Data was gathered through Google Forms and physical questionnaires between January 8, 2021, and February 27, 2021. The study instrument consists of demographic data and six domains assessing online education challenges using a three-point Likert scale.
Data Contents:
Demographic Information: Includes students' age, gender, educational status, family income, residency, and internet access.
Online Nursing Education Challenges: Covers six domains:
- Learning, understanding, and comprehension
- Software and e-learning tools
- Instructors' skills and experience
- Class discussions and student-teacher communication
- Course design and content
- Psycho-social circumstances
Analysis Methods: The dataset was analyzed using descriptive statistics (frequency, percentage, mean scores) and inferential statistical methods, including factor analysis, ANOVA, and multiple linear regression, to assess relationships between challenges and student demographics.
Ethical Considerations: Approval for the study was obtained from the Scientific Research Ethical Committee at the University of Baghdad. All participants provided informed consent before participation.
Limitations: The findings are limited to the study sample and may not be generalizable. Additionally, there is a lack of comparable national and international research on this topic.
In 2024, about **** percent of all students who chose online degree programs in the United States said they did so because COVID-19 made it the only option available to them, a slight decrease from ** percent in the previous year. In both 2023 and 2024, however, the most commonly cited reason for students to choose online degree programs was due to existing commitments, such as work and family, preventing their attendance in campus-based courses.