100+ datasets found

Massively Open Online Course for Educators (MOOC-Ed) network dataset
search.dataone.org
Updated Nov 21, 2023
Cite
Kellogg, Shaun; Edelmann, Achim (2023). Massively Open Online Course for Educators (MOOC-Ed) network dataset [Dataset]. http://doi.org/10.7910/DVN/ZZH3UB
Unique identifier
https://doi.org/10.7910/DVN/ZZH3UB
Dataset updated
Nov 21, 2023
Dataset provided by
Harvard Dataverse
Authors
Kellogg, Shaun; Edelmann, Achim
Description
The dataset provides detailed information on the communications taking place between learners in two offerings of the Massively Open Online Course for Educators (MOOC-Eds) titled The Digital Learning Transition in K-12 Schools. The courses were offered to educators from the USA and abroad during the spring and fall of 2013. Though based on the same course, minor controlled variations were made to both MOOCs in terms of the course length, discussion prompts, and group size. The primary use of this dataset is to enable social network analyses (SNAs) of these communications. In particular, it allows modeling network mechanisms to better understand factors that facilitate or impede the exchange of information among educators, and includes relevant characteristics of the participants, such as their professional roles and their experience in education.
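As a minimal sketch of the kind of social network analysis mentioned above, the snippet below counts how many messages each learner sent and received from a directed edge list. The (sender, receiver) pairs are made up for illustration; the actual file layout and column names are not given in this listing and would need to be checked against the Dataverse record.

```python
from collections import Counter

# Hypothetical (sender, receiver) pairs; the real dataset's columns may differ.
edges = [(1, 2), (1, 3), (2, 3), (3, 1), (1, 2)]


def degree_counts(edge_list):
    """Return (out_degree, in_degree) Counters for a directed edge list."""
    out_degree = Counter(sender for sender, _ in edge_list)
    in_degree = Counter(receiver for _, receiver in edge_list)
    return out_degree, in_degree


out_degree, in_degree = degree_counts(edges)
print(out_degree.most_common(1))
```

Degree counts are only a starting point; participant attributes such as professional role could then be compared against these counts.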
A Dataset on Online Learning-based Web Behavior from Different Countries...
ieee-dataport.org
Updated Jul 29, 2025
Cite
Saumick Pradhan (2025). A Dataset on Online Learning-based Web Behavior from Different Countries Before and After COVID-19 [Dataset]. https://ieee-dataport.org/open-access/dataset-online-learning-based-web-behavior-different-countries-and-after-covid-19
Dataset updated
Jul 29, 2025
Authors
Saumick Pradhan
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
2022
Open-Source GIScience Online Course
ckan.americaview.org
Updated Nov 2, 2021
Cite
ckan.americaview.org (2021). Open-Source GIScience Online Course [Dataset]. https://ckan.americaview.org/dataset/open-source-giscience-online-course
Dataset updated
Nov 2, 2021
Dataset provided by
CKAN https://ckan.org/
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
In this course, you will explore a variety of open-source technologies for working with geospatial data, performing spatial analysis, and undertaking general data science. The first component of the class focuses on the use of QGIS and associated technologies (GDAL, PROJ, GRASS, SAGA, and Orfeo Toolbox). The second component of the class introduces Python and associated open-source libraries and modules (NumPy, Pandas, Matplotlib, Seaborn, GeoPandas, Rasterio, WhiteboxTools, and Scikit-Learn) used by geospatial scientists and data scientists. We also provide an introduction to Structured Query Language (SQL) for performing table and spatial queries. This course is designed for individuals who have a background in GIS, such as working in the ArcGIS environment, but no prior experience using open-source software and/or coding. You will be asked to work through a series of lecture modules and videos broken into several topic areas, as outlined below. Fourteen assignments and the required data have been provided as hands-on opportunities to work with data and the discussed technologies and methods. If you have any questions or suggestions, feel free to contact us. We hope to continue to update and improve this course. This course was produced by West Virginia View (http://www.wvview.org/) with support from AmericaView (https://americaview.org/). This material is based upon work supported by the U.S. Geological Survey under Grant/Cooperative Agreement No. G18AP00077. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the opinions or policies of the U.S. Geological Survey. Mention of trade names or commercial products does not constitute their endorsement by the U.S. Geological Survey.
After completing this course you will be able to:
* apply QGIS to visualize, query, and analyze vector and raster spatial data.
* use available resources to further expand your knowledge of open-source technologies.
* describe and use a variety of open data formats.
* code in Python at an intermediate level.
* read, summarize, visualize, and analyze data using open Python libraries.
* create spatial predictive models using Python and associated libraries.
* use SQL to perform table and spatial queries at an intermediate level.
Canvas Network Courses, Activities, and Users (4/2014 - 9/2015) Restricted...
search.dataone.org
dataverse.harvard.edu
Updated Nov 8, 2023
Cite
Canvas Network (2023). Canvas Network Courses, Activities, and Users (4/2014 - 9/2015) Restricted Dataset [Dataset]. http://doi.org/10.7910/DVN/GVLFXO
Unique identifier
https://doi.org/10.7910/DVN/GVLFXO
Dataset updated
Nov 8, 2023
Dataset provided by
Harvard Dataverse
Authors
Canvas Network
Description
This dataset release is comprised of de-identified data from March 2014 - September 2015 of Canvas Network open courses, along with related documentation. In balancing data utility with thorough de-identification, this dataset favors utility; therefore, access and usage of this dataset is restricted as described in the Canvas Network Data Usage Agreement. These data use a star schema to organize various course, activity, and person records using dimensions and facts. The structure of this dataset is based on the Canvas Data star schema as described in https://portal.inshosteddata.com/docs. The first release of this dataset is the Canvas Network Courses, Activities, and Users (4/2014 - 9/2015) Dataset, version 1.0, created on March 3, 2016. The data set is split into multiple files for convenience:
CNCAU_1403-1509_R_v1_03-03-2016.tgz contains the facts and dimensions representing the breadth of the dataset
CNCAU_1403-1509_R_v1_03-03-2016_requests-01.gz - ...08.gz contain user page view requests
The resulting files are plain text, with tab-separated values.
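To illustrate how a star schema in tab-separated files is typically used, the sketch below joins a small fact table to a dimension table and sums a measure per course. The table contents, column names (course_id, title, page_views), and the presence of a header row are all assumptions made for illustration; the real names are defined in the Canvas Data schema documentation.

```python
import csv
import io

# Hypothetical tab-separated extracts; real table and column names are
# documented in the Canvas Data schema, not in this listing.
course_dim_tsv = "course_id\ttitle\n10\tIntro to Stats\n11\tWriting 101\n"
activity_fact_tsv = (
    "course_id\tuser_id\tpage_views\n10\t1\t5\n10\t2\t7\n11\t1\t3\n"
)


def read_tsv(text):
    """Parse tab-separated text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def views_per_course(dim_rows, fact_rows):
    """Sum a fact measure per course, labelled via the dimension table."""
    titles = {row["course_id"]: row["title"] for row in dim_rows}
    totals = {}
    for row in fact_rows:
        title = titles[row["course_id"]]
        totals[title] = totals.get(title, 0) + int(row["page_views"])
    return totals


totals = views_per_course(read_tsv(course_dim_tsv), read_tsv(activity_fact_tsv))
print(totals)
```

For the actual release, the same pattern would read from the extracted files rather than in-memory strings.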
OULAD-Dataset
kaggle.com
Updated Mar 2, 2019
Cite
Vibhor Jain (2019). OULAD-Dataset [Dataset]. https://www.kaggle.com/datasets/vjcalling/ouladdata/versions/1
Dataset updated
Mar 2, 2019
Dataset provided by
Kaggle http://kaggle.com/
Authors
Vibhor Jain
Description
Context

MOOC dataset to study behavior of students for online courses.

Content

It contains data about courses, students and their interactions with Virtual Learning Environment (VLE) for seven selected courses (called modules). Presentations of courses start in February and October - they are marked by “B” and “J” respectively. The dataset consists of tables connected using unique identifiers. All tables are stored in the csv format.
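The "B" and "J" markers can be decoded mechanically. The sketch below assumes a presentation code is a four-digit year followed by one of those letters (for example "2013J"); that format is an assumption, not something stated in this description, so it should be checked against the dataset documentation.

```python
# Mapping taken from the description: "B" starts in February, "J" in October.
START_MONTH = {"B": "February", "J": "October"}


def presentation_start(code):
    """Split a presentation code such as '2013J' into (year, start month).

    Assumes a four-digit year followed by one letter (hypothetical format).
    """
    year, letter = int(code[:-1]), code[-1].upper()
    if letter not in START_MONTH:
        raise ValueError(f"unknown presentation letter: {letter!r}")
    return year, START_MONTH[letter]


print(presentation_start("2014B"))
```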

Acknowledgements

Kuzilek J., Hlosta M., Zdrahal Z. Open University Learning Analytics dataset Sci. Data 4:170171 doi: 10.1038/sdata.2017.171 (2017).
online-courses-usage-and-history-dataset
huggingface.co
Cite
Mitul Das, online-courses-usage-and-history-dataset [Dataset]. https://huggingface.co/datasets/Mitul1999/online-courses-usage-and-history-dataset
Authors
Mitul Das
Description
Online Courses Dataset

This repository provides a comprehensive dataset of online courses, including details about course categories, duration, platforms, enrollment numbers, completion rates, and ratings. The dataset can be used for trend analysis, platform comparisons, and market insights.

Key Features

Course Categories: Analyze trends across AI, Business, Data Science, Design, Finance, and more. Enrollment Metrics: Understand popularity with student enrollment… See the full description on the dataset page: https://huggingface.co/datasets/Mitul1999/online-courses-usage-and-history-dataset.
Moodle Course Logs of a Brazilian Higher Education Institution
figshare.com
zip
Updated Nov 13, 2018
Cite
Bernardo Pereira Nunes (2018). Moodle Course Logs of a Brazilian Higher Education Institution [Dataset]. http://doi.org/10.6084/m9.figshare.7335860.v1
Available download formats
zip
Unique identifier
https://doi.org/10.6084/m9.figshare.7335860.v1
Dataset updated
Nov 13, 2018
Dataset provided by
figshare
Authors
Bernardo Pereira Nunes
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset contains the records of anonymised user interactions in seven online courses at a Higher Education institution in Brazil. For each course, the dataset covers a period spanning from 2017.1 to 2018.1 equivalent to three Brazilian academic periods. All online courses used the Moodle learning platform.
The dataset covers the following courses:
F - An introductory course in Philosophy - mandatory for all students
C - An introductory course in Religion - mandatory for all students
S - An introductory course in Political Theory - mandatory for students of the School of Humanities and Social Sciences
M1 - Differential and Difference Equations course - mandatory for students of the School of Engineering and Exact Sciences
M2 - Single Variable Calculus course - mandatory for students of the School of Engineering and Exact Sciences
E9 - An introductory course in the Design of Control Systems - mandatory for students of the School of Industrial Engineering
E0 - Foundations of Engineering course - mandatory for all students of the School of Engineering
The data is compressed in .zip format and can be uncompressed by standard compression utilities. Each course has three separate files grouped by user interactions from different academic periods. For example, the records for the course 'F' are split into F1, F2 and F3. F1 covers the records of the first academic period whereas F2 and F3 contain the records for the second and third academic periods respectively. Note that each instance of a course is independent and that the same student (identified by the same id) may only occur in the same course but in different academic periods iff s/he failed and opted to retake that course in one of the following courses covered by the data available here. The student id is preserved among the courses and academic periods.
A description of the log fields contained in this dataset can be found at: https://docs.moodle.org/dev/Event_2#Information_contained_in_events
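Because student ids are preserved across academic periods, retakes can be spotted by looking for ids that occur in more than one file of the same course. A minimal sketch, using made-up ids for the F1, F2 and F3 files; in the real logs the ids would first have to be extracted from the event rows, and the relevant field name is not given here.

```python
from collections import defaultdict

# Hypothetical student ids per file; real logs have one row per event.
students_by_file = {
    "F1": {101, 102, 103},
    "F2": {102, 104},
    "F3": {102, 103, 105},
}


def find_retakes(ids_by_file):
    """Map each student id to the files it appears in, keeping only ids
    seen in more than one academic period of the same course."""
    seen = defaultdict(list)
    for file_name in sorted(ids_by_file):
        for student_id in ids_by_file[file_name]:
            seen[student_id].append(file_name)
    return {sid: files for sid, files in seen.items() if len(files) > 1}


retakes = find_retakes(students_by_file)
print(retakes)
```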
Udemy Dataset
brightdata.com
.json, .csv, .xlsx
Updated May 7, 2024
Cite
Bright Data (2024). Udemy Dataset [Dataset]. https://brightdata.com/products/datasets/udemy
Available download formats
.json, .csv, .xlsx
Dataset updated
May 7, 2024
Dataset authored and provided by
Bright Data https://brightdata.com/
License
https://brightdata.com/license
Area covered
Worldwide
Description
We'll tailor a Udemy dataset to meet your unique needs, encompassing course titles, user engagement metrics, completion rates, demographic data of learners, enrollment numbers, review scores, and other pertinent metrics.

Leverage our Udemy datasets for diverse applications to bolster strategic planning and market analysis. Scrutinizing these datasets enables organizations to grasp learner preferences and online education trends, facilitating nuanced educational program development and learning initiatives. Customize your access to the entire dataset or specific subsets as per your business requisites.

Popular use cases involve optimizing educational content based on engagement insights, enhancing learning strategies through targeted learner segmentation, and identifying and forecasting trends to stay ahead in the online education landscape.
Geospatial Deep Learning Seminar Online Course
ckan.americaview.org
Updated Nov 2, 2021
+ more versions
Cite
ckan.americaview.org (2021). Geospatial Deep Learning Seminar Online Course [Dataset]. https://ckan.americaview.org/dataset/geospatial-deep-learning-seminar-online-course
Dataset updated
Nov 2, 2021
Dataset provided by
CKAN https://ckan.org/
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This seminar is an applied study of deep learning methods for extracting information from geospatial data, such as aerial imagery, multispectral imagery, digital terrain data, and other digital cartographic representations. We first provide an introduction and conceptualization of artificial neural networks (ANNs). Next, we explore appropriate loss and assessment metrics for different use cases followed by the tensor data model, which is central to applying deep learning methods. Convolutional neural networks (CNNs) are then conceptualized with scene classification use cases. Lastly, we explore semantic segmentation, object detection, and instance segmentation. The primary focus of this course is semantic segmentation for pixel-level classification. The associated GitHub repo provides a series of applied examples. We hope to continue to add examples as methods and technologies further develop. These examples make use of a variety of datasets (e.g., SAT-6, topoDL, Inria, LandCover.ai, vfillDL, and wvlcDL). Please see the repo for links to the data and associated papers. All examples have associated videos that walk through the process, which are also linked to the repo. A variety of deep learning architectures are explored including UNet, UNet++, DeepLabv3+, and Mask R-CNN. Currently, two examples use ArcGIS Pro and require no coding. The remaining five examples require coding and make use of PyTorch, Python, and R within the RStudio IDE. It is assumed that you have prior knowledge of coding in the Python and R environments. If you do not have experience coding, please take a look at our Open-Source GIScience and Open-Source Spatial Analytics (R) courses, which explore coding in Python and R, respectively.
After completing this seminar you will be able to:
* explain how ANNs work including weights, bias, activation, and optimization.
* describe and explain different loss and assessment metrics and determine appropriate use cases.
* use the tensor data model to represent data as input for deep learning.
* explain how CNNs work including convolutional operations/layers, kernel size, stride, padding, max pooling, activation, and batch normalization.
* use PyTorch, Python, and R to prepare data, produce and assess scene classification models, and infer to new data.
* explain common semantic segmentation architectures and how these methods allow for pixel-level classification and how they are different from traditional CNNs.
* use PyTorch, Python, and R (or ArcGIS Pro) to prepare data, produce and assess semantic segmentation models, and infer to new data.
Data_Sheet_1_Learners’ satisfaction of courses on Coursera as a massive open...
frontiersin.figshare.com
pdf
Updated May 31, 2023
+ more versions
Cite
Long Quoc Nguyen (2023). Data_Sheet_1_Learners’ satisfaction of courses on Coursera as a massive open online course platform: A case study.pdf [Dataset]. http://doi.org/10.3389/feduc.2022.1086170.s001
Available download formats
pdf
Unique identifier
https://doi.org/10.3389/feduc.2022.1086170.s001
Dataset updated
May 31, 2023
Dataset provided by
Frontiers
Authors
Long Quoc Nguyen
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Online education has become more prevalent in the 21st century, especially after the COVID-19 pandemic. One of the major trends is the learning via Massive Open Online Courses (MOOCs), which is increasingly present at many universities around the world these days. In these courses, learners interact with the pre-designed materials and study everything mostly by themselves. Therefore, gaining insights into their satisfaction of such courses is vitally important to improve their learning experiences and performances. However, previous studies primarily focused on factors that affected learners’ satisfaction, not on how and what the satisfaction was. Moreover, past research mainly employed the narrative reviews posted on MOOC platforms; very few utilized survey and interview data obtained directly from MOOC users. The present study aims to fill in such gaps by employing a mixed-methods approach including a survey design and semi-structured interviews with the participation of 120 students, who were taking academic writing courses on Coursera (one of the world-leading MOOC platforms), at a private university in Vietnam. Results from both quantitative and qualitative data showed that the overall satisfaction of courses on Coursera was relatively low. Furthermore, most learners were not satisfied with their learning experience on the platform, primarily due to inappropriate assessment, lack of support, and interaction with teachers as well as improper plagiarism check. In addition, there were moderate correlations between students’ satisfaction and their perceived usefulness of Coursera courses. Pedagogically, teachers’ feedback and grading, faster support from course designers as well as easier-to-use plagiarism checking tools are needed to secure learners’ satisfaction of MOOCs.
Udemy Courses
kaggle.com
Updated Nov 21, 2022
Cite
Hossain (2022). Udemy Courses [Dataset]. https://www.kaggle.com/datasets/hossaingh/udemy-courses
Dataset updated
Nov 21, 2022
Dataset provided by
Kaggle
Authors
Hossain
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
This dataset contains detailed information on all available Udemy courses on Oct 10, 2022. This data was provided in the "Course_info.csv" file. Also, over 9 million comments were collected and provided in the "Comments.csv" file. The information of over 209k courses was collected by web scraping the Udemy website. Udemy holds 209,734 courses and 73,514 instructors teaching courses in 79 languages in 13 different categories.

The related notebook was uploaded here. If you are interested in analytical data about online learning platforms, I recommend reading the below article to find attractive insight. https://lnkd.in/gjCBhP_P
USAID University Online Course Catalog
s.cnmilf.com
res1catalogd-o-tdatad-o-tgov.vcapture.xyz
+2 more
Updated Jun 25, 2024
Cite
data.usaid.gov (2024). USAID University Online Course Catalog [Dataset]. https://s.cnmilf.com/user74170196/https/catalog.data.gov/dataset/usaid-university-online-course-catalog
Dataset updated
Jun 25, 2024
Dataset provided by
United States Agency for International Development http://usaid.gov/
Description
Learning Management System online courses for USAID staff to access.
Online Learning Course Enrolment Totals by Course
open.canada.ca
data.ontario.ca
html, txt, xlsx
Updated Aug 6, 2025
+ more versions
Cite
Government of Ontario (2025). Online Learning Course Enrolment Totals by Course [Dataset]. https://open.canada.ca/data/en/dataset/04084397-b8a3-4f42-af04-f062a62b0d6c
Available download formats
html, xlsx, txt
Dataset updated
Aug 6, 2025
Dataset provided by
Government of Ontario
License
Open Government Licence - Canada 2.0 https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
Time period covered
Sep 1, 2014 - Aug 31, 2023
Description
Online learning (e-learning) course enrolment totals by course and year for public and Catholic schools. School boards report this data using the Ontario School Information System (OnSIS).
Includes:
* course code
* course name
* online learning course enrolment totals by year
Enrolment totals include withdrawn or dropped courses. A student enrolled in more than one course is counted for each course. Data excludes private schools and Education and Community Partnership Program (ECPP) facilities. Not all courses offered by school boards are available to students via online learning. Cells are suppressed in categories with less than 10 students. Enrolment totals are rounded to the nearest five. Final as of October 4, 2024
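The suppression and rounding rules above can be sketched as a small function. This is an illustration only: whether suppression is applied before or after rounding is not stated, so the order used here (suppress first, then round) is an assumption, and the function name is hypothetical.

```python
def published_total(raw_count, threshold=10, base=5):
    """Return the value as it might be published, or None if suppressed.

    Counts below `threshold` are suppressed; all others are rounded to the
    nearest multiple of `base`. Suppressing before rounding is an assumption.
    """
    if raw_count < threshold:
        return None
    # Integer arithmetic keeps the result exact for whole-number counts.
    return base * ((raw_count + base // 2) // base)


for count in (9, 12, 13, 18):
    print(count, published_total(count))
```

One practical consequence is that published totals cannot be summed to recover exact enrolment, since each cell carries a rounding error of up to two students.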
Engaging with Massive Online Courses - Dataset - LDM
service.tib.eu
Updated Dec 16, 2024
Cite
(2024). Engaging with Massive Online Courses - Dataset - LDM [Dataset]. https://service.tib.eu/ldmservice/dataset/engaging-with-massive-online-courses
Dataset updated
Dec 16, 2024
Description
The dataset is a collection of student activity traces from six Stanford University courses offered on Coursera.
Golf Courses
hub.arcgis.com
data.seattle.gov
+3 more
Updated Oct 2, 2023
Cite
City of Seattle ArcGIS Online (2023). Golf Courses [Dataset]. https://hub.arcgis.com/maps/SeattleCityGIS::golf-courses/explore
Dataset updated
Oct 2, 2023
Dataset provided by
https://arcgis.com/
Authors
City of Seattle ArcGIS Online
Description
Seattle Parks and Recreation Golf Course locations. SPR Golf Courses are managed by contractors.
Refresh Cycle: Weekly
Feature Class: DPR.GolfCourse
Dataset: An Open Combinatorial Diffraction Dataset Including Consensus Human...
catalog.data.gov
data.nist.gov
+2 more
Updated Jul 29, 2022
+ more versions
Cite
National Institute of Standards and Technology (2022). Dataset: An Open Combinatorial Diffraction Dataset Including Consensus Human and Machine Learning Labels with Quantified Uncertainty for Training New Machine Learning Models [Dataset]. https://catalog.data.gov/dataset/dataset-an-open-combinatorial-diffraction-dataset-including-consensus-human-and-machine-le-0de06
Dataset updated
Jul 29, 2022
Dataset provided by
National Institute of Standards and Technology http://www.nist.gov/
Description
The open dataset, software, and other files accompanying the manuscript "An Open Combinatorial Diffraction Dataset Including Consensus Human and Machine Learning Labels with Quantified Uncertainty for Training New Machine Learning Models," submitted for publication to Integrated Materials and Manufacturing Innovations.
Machine learning and autonomy are increasingly prevalent in materials science, but existing models are often trained or tuned using idealized data as absolute ground truths. In actual materials science, "ground truth" is often a matter of interpretation and is more readily determined by consensus. Here we present the data, software, and other files for a study using as-obtained diffraction data as a test case for evaluating the performance of machine learning models in the presence of differing expert opinions. We demonstrate that experts with similar backgrounds can disagree greatly even for something as intuitive as using diffraction to identify the start and end of a phase transformation. We then use a logarithmic likelihood method to evaluate the performance of machine learning models in relation to the consensus expert labels and their variance. We further illustrate this method's efficacy in ranking a number of state-of-the-art phase mapping algorithms. We propose a materials data challenge centered around the problem of evaluating models based on consensus with uncertainty. The data, labels, and code used in this study are all available online at data.gov, and the interested reader is encouraged to replicate and improve the existing models or to propose alternative methods for evaluating algorithmic performance.
HarvardX Person-Course Academic Year 2013 De-Identified dataset, version 3.0...
dataverse.harvard.edu
search.dataone.org
Updated Dec 18, 2019
Cite
HarvardX (2019). HarvardX Person-Course Academic Year 2013 De-Identified dataset, version 3.0 [Dataset]. http://doi.org/10.7910/DVN/26147
Unique identifier
https://doi.org/10.7910/DVN/26147
Dataset updated
Dec 18, 2019
Dataset provided by
Harvard Dataverse
Authors
HarvardX
License
https://dataverse.harvard.edu/api/datasets/:persistentId/versions/11.2/customlicense?persistentId=doi:10.7910/DVN/26147
Description
This release is comprised of de-identified data from the first year (Academic Year 2013: Fall 2012, Spring 2013, and Summer 2013) of HarvardX courses on the edX platform along with related documentation. These data are aggregate records, and each record represents one individual's activity in one edX course. For more information about the existing analyses of these data and the first year of HarvardX courses, please see the HarvardX and MITx working paper "HarvardX and MITx: The first year of open online courses" by Andrew Ho, Justin Reich, Sergiy Nesterko, Daniel Seaton, Tommy Mullaney, Jim Waldo, and Isaac Chuang (http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2381263). The first release of this dataset is the HarvardX Person-Course Academic Year 2013 De-Identified dataset, version 3.0, created on November 12, 2019. File name: HXPC13_DI_v3_11-13-2019.csv The md5sum for this release (HXPC13_DI_v3_11-13-2019.csv) is: 53419b486c3b19c14d2f06612980f630
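The published md5sum can be used to confirm that a downloaded copy is intact. A minimal sketch using Python's standard hashlib module; the function names are illustrative, and the path passed in is wherever the file was saved locally.

```python
import hashlib

# Checksum quoted in the description above.
EXPECTED_MD5 = "53419b486c3b19c14d2f06612980f630"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_release(path, expected=EXPECTED_MD5):
    """Return True when the file's MD5 equals the published checksum."""
    return md5_of_file(path) == expected.lower()
```

For example, `matches_release("HXPC13_DI_v3_11-13-2019.csv")` would return True for an unmodified download of the release file.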
CAMEO Dataset: Detection and Prevention of "Multiple Account" Cheating in...
dataverse.harvard.edu
Updated Jun 21, 2015
Cite
Curtis Northcutt; Andrew Ho; Isaac Chuang (2015). CAMEO Dataset: Detection and Prevention of "Multiple Account" Cheating in Massively Open Online Courses [Dataset]. http://doi.org/10.7910/DVN/3UKVOR
Unique identifier
https://doi.org/10.7910/DVN/3UKVOR
Dataset updated
Jun 21, 2015
Dataset provided by
Harvard Dataverse
Authors
Curtis Northcutt; Andrew Ho; Isaac Chuang
License
https://dataverse.harvard.edu/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.7910/DVN/3UKVOR
Description
[NOTE: Data are currently only accessible to qualified reviewers. For reviewers, detailed dataset descriptions are provided as text files associated with each dataset.] This dataset includes statistics about student actions in MITx and HarvardX courses, used in an analysis of Copying Answers using Multiple Existences Online (CAMEO) behavior. The data are partially anonymized, but insufficiently so for open release.
Student Performance & Learning Style
kaggle.com
Updated Feb 12, 2025
Cite
Adil Shamim (2025). Student Performance & Learning Style [Dataset]. https://www.kaggle.com/datasets/adilshamim8/student-performance-and-learning-style/data
Dataset updated
Feb 12, 2025
Dataset provided by
Kaggle
Authors
Adil Shamim
Description
You should not take this dataset seriously, as it is a synthetic representation based on true trends in education and career outcomes.

About the Dataset

This dataset provides insights into how different study habits, learning styles, and external factors influence student performance. It includes 10,000 records, covering details about students' study hours, online learning participation, exam scores, and other factors impacting academic success.

Dataset Features

Student_ID – Unique identifier for each student

Age – Student's age (18-30 years)

Gender – Male, Female, or Other

Study_Hours_per_Week – Hours spent studying per week (5-50 hours)

Preferred_Learning_Style – Visual, Auditory, Reading/Writing, Kinesthetic

Online_Courses_Completed – Number of online courses completed (0-20)

Participation_in_Discussions – Whether the student actively participates in discussions (Yes/No)

Assignment_Completion_Rate (%) – Percentage of assignments completed (50%-100%)

Exam_Score (%) – Student’s final exam score (40%-100%)

Attendance_Rate (%) – Percentage of classes attended (50%-100%)

Use_of_Educational_Tech – Whether the student uses educational technology (Yes/No)

Self_Reported_Stress_Level – Student’s stress level (Low, Medium, High)

Time_Spent_on_Social_Media (hours/week) – Weekly hours spent on social media (0-30 hours)

Sleep_Hours_per_Night – Average sleep duration (4-10 hours)

Final_Grade – Assigned grade based on exam score (A, B, C, D, F)

Use Cases

Predicting Student Performance – Analyze how different factors influence exam scores.

Educational Insights – Understand the impact of study habits, learning styles, and external activities.

Machine Learning Applications – Train predictive models for student success.
A dataset of for cross-course learning path planning with 7 types of learner...
scidb.cn
Updated May 14, 2024
Cite
Yong-Wei Zhang (2024). A dataset of for cross-course learning path planning with 7 types of learner and 7 types of course materials [Dataset]. http://doi.org/10.57760/sciencedb.18420
Unique identifier
https://doi.org/10.57760/sciencedb.18420
Dataset updated
May 14, 2024
Dataset provided by
Science Data Bank
Authors
Yong-Wei Zhang
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset accompanies the research paper titled "Enhancing Personalized Learning in Online Education through Integrated Cross-Course Learning Path Planning." The dataset consists of MATLAB data files (.mat format).
The dataset includes data on seven types of learner attributes, named from LearnerA.mat to LearnerG.mat. Each learner dataset contains two variables: L and LP. L is a 10x16 matrix that stores learner attributes, where each row represents a learner. The first column indicates the learner's ability level, the second column indicates the expected learning time, columns 3 to 6 represent normalized learning styles, and columns 7 to 16 represent learning objectives. LP is a structure that stores statistical information about this matrix.
The dataset also includes data on seven types of learning resource attributes, named DatasetA.mat, DatasetB.mat, DatasetC.mat, DatasetAB.mat, DatasetAC.mat, DatasetBC.mat, and DatasetABC.mat. Each resource dataset contains two variables: M and MP. M is a matrix that stores the attributes of learning materials, where each row represents a material. The first column indicates the material's difficulty level, the second column represents the learning time required for the material, columns 3 to 6 describe the type of material, columns 7 to 16 cover the knowledge points addressed by the material, and columns 17 to 26 list the prerequisite knowledge points required for the material. MP is a structure that stores statistical information about this matrix.
The dataset encompasses results from learning path planning involving seven types of learners across seven datasets, totaling 49 datasets, named in the format PathCost4_LSHADE_cnEpSin_D_X_L_Y.mat. Here, X represents the type of learning resource dataset (A, B, C, AB, AC, BC, ABC) and Y represents the type of learner (A to G). Each data file contains three variables: Gbest, Gtime, and S. Gbest is a 30x10 matrix, where each column stores the best cost function obtained from 30 runs of path planning for a learner on the corresponding dataset. Gtime is a 30x10 matrix, where each column stores the time spent on each run for a learner on the corresponding dataset. S is a 30x10 cell array storing the status information from each run.
Finally, the dataset includes a compilation of the best cost functions for all runs for all learners across all learning material datasets, named learnerBest.mat. The file contains a variable, learnerBest, which is a 7x7x10x30 four-dimensional array. The first dimension represents the type of learner, the second dimension represents the type of learning material, the third dimension represents the learner index, and the fourth dimension represents the run index.
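The column layout of L and the axis order of learnerBest described above can be sketched in Python. This is a minimal illustration, not code shipped with the dataset. The function names, the dictionary keys, and the ordering of the type labels along the first two axes of learnerBest are assumptions made here. The optional loader relies on SciPy's `scipy.io.loadmat`, which may not read files saved in MATLAB's v7.3 format. MATLAB columns are 1-based, so "columns 3 to 6" becomes the slice `2:6`.

```python
import numpy as np

# Label order follows the description; that the arrays in the files are
# indexed in this same order is an assumption.
LEARNER_TYPES = ["A", "B", "C", "D", "E", "F", "G"]
RESOURCE_SETS = ["A", "B", "C", "AB", "AC", "BC", "ABC"]


def split_learner_matrix(L):
    """Split a learner matrix (one row per learner, 16 columns) into named blocks."""
    L = np.asarray(L, dtype=float)
    if L.ndim != 2 or L.shape[1] != 16:
        raise ValueError(f"expected an (n, 16) matrix, got shape {L.shape}")
    return {
        "ability": L[:, 0],            # column 1: ability level
        "expected_time": L[:, 1],      # column 2: expected learning time
        "learning_style": L[:, 2:6],   # columns 3-6: normalized learning styles
        "objectives": L[:, 6:16],      # columns 7-16: learning objectives
    }


def mean_best_cost(learner_best, learner_type, resource_set):
    """Average the best cost over the 30 runs for each of the 10 learners."""
    learner_best = np.asarray(learner_best, dtype=float)
    if learner_best.shape != (7, 7, 10, 30):
        raise ValueError(f"expected shape (7, 7, 10, 30), got {learner_best.shape}")
    i = LEARNER_TYPES.index(learner_type)   # axis 0: learner type
    j = RESOURCE_SETS.index(resource_set)   # axis 1: learning material dataset
    # The remaining axes are (learner index, run index); average over runs.
    return learner_best[i, j].mean(axis=1)


def load_mat_variable(path, name):
    """Hypothetical helper: read one variable from a .mat file (needs SciPy)."""
    from scipy.io import loadmat  # lazy import so the functions above need only NumPy
    return loadmat(path)[name]
```

With the files downloaded, `split_learner_matrix(load_mat_variable("LearnerA.mat", "L"))` would return the four attribute blocks for the ten type-A learners, and `mean_best_cost(load_mat_variable("learnerBest.mat", "learnerBest"), "G", "AB")` would return ten per-learner averages, provided the axis ordering assumed above holds.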

Cite

Kellogg, Shaun; Edelmann, Achim (2023). Massively Open Online Course for Educators (MOOC-Ed) network dataset [Dataset]. http://doi.org/10.7910/DVN/ZZH3UB

Massively Open Online Course for Educators (MOOC-Ed) network dataset

Explore at:

34 scholarly articles cite this dataset (View in Google Scholar)


Massively Open Online Course for Educators (MOOC-Ed) network dataset

A Dataset on Online Learning-based Web Behavior from Different Countries...

Open-Source GIScience Online Course

Canvas Network Courses, Activities, and Users (4/2014 - 9/2015) Restricted...

OULAD-Dataset

online-courses-usage-and-history-dataset

Moodle Course Logs of a Brazilian Higher Education Institution

Udemy Dataset

Geospatial Deep Learning Seminar Online Course

Data_Sheet_1_Learners’ satisfaction of courses on Coursera as a massive open...

Udemy Courses

USAID University Online Course Catalog

Online Learning Course Enrolment Totals by Course

Engaging with Massive Online Courses - Dataset - LDM

Golf Courses

Dataset: An Open Combinatorial Diffraction Dataset Including Consensus Human...

HarvardX Person-Course Academic Year 2013 De-Identified dataset, version 3.0...

CAMEO Dataset: Detection and Prevention of "Multiple Account" Cheating in...

Student Performance & Learning Style

A dataset of for cross-course learning path planning with 7 types of learner...

Massively Open Online Course for Educators (MOOC-Ed) network dataset