The dataset provides detailed information on the communications taking place between learners in two offerings of the Massively Open Online Course for Educators (MOOC-Eds) titled The Digital Learning Transition in K-12 Schools. The courses were offered to educators from the USA and abroad during the spring and fall of 2013. Though based on the same course, minor controlled variations were made to both MOOCs in terms of the course length, discussion prompts, and group size. The primary use of this dataset is to enable social network analyses (SNAs) of these communications. In particular, it allows modeling network mechanisms to better understand factors that facilitate or impede the exchange of information among educators, and includes relevant characteristics of the participants, such as their professional roles and their experience in education.
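As a rough illustration of the social network analyses mentioned above, the sketch below computes in-degree, out-degree, and reciprocity from a directed edge list. The edge list and participant ids are hypothetical; the actual file layout of the dataset is not described here, so treat this as a minimal example under those assumptions.

```python
from collections import Counter

# Hypothetical (sender, receiver) pairs, e.g. one pair per discussion reply.
# The real dataset's column names and ids are not specified in this description.
edges = [
    ("t1", "t2"),
    ("t1", "t3"),
    ("t2", "t1"),
    ("t4", "t1"),
]


def degree_counts(edge_list):
    """Return (out_degree, in_degree) counters for a directed edge list."""
    out_deg = Counter(sender for sender, _ in edge_list)
    in_deg = Counter(receiver for _, receiver in edge_list)
    return out_deg, in_deg


def reciprocity(edge_list):
    """Fraction of distinct directed ties whose reverse tie also exists."""
    ties = set(edge_list)
    if not ties:
        return 0.0
    mutual = sum(1 for a, b in ties if (b, a) in ties)
    return mutual / len(ties)


out_deg, in_deg = degree_counts(edges)
```

Participant attributes such as professional role could then be compared against these counts to explore who sends and who receives information.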
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
2022
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
In this course, you will explore a variety of open-source technologies for working with geospatial data, performing spatial analysis, and undertaking general data science. The first component of the class focuses on the use of QGIS and associated technologies (GDAL, PROJ, GRASS, SAGA, and Orfeo Toolbox). The second component of the class introduces Python and associated open-source libraries and modules (NumPy, Pandas, Matplotlib, Seaborn, GeoPandas, Rasterio, WhiteboxTools, and Scikit-Learn) used by geospatial scientists and data scientists. We also provide an introduction to Structured Query Language (SQL) for performing table and spatial queries. This course is designed for individuals who have a background in GIS, such as working in the ArcGIS environment, but no prior experience using open-source software and/or coding. You will be asked to work through a series of lecture modules and videos broken into several topic areas, as outlined below. Fourteen assignments and the required data have been provided as hands-on opportunities to work with data and the discussed technologies and methods. If you have any questions or suggestions, feel free to contact us. We hope to continue to update and improve this course. This course was produced by West Virginia View (http://www.wvview.org/) with support from AmericaView (https://americaview.org/). This material is based upon work supported by the U.S. Geological Survey under Grant/Cooperative Agreement No. G18AP00077. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the opinions or policies of the U.S. Geological Survey. Mention of trade names or commercial products does not constitute their endorsement by the U.S. Geological Survey. After completing this course you will be able to: apply QGIS to visualize, query, and analyze vector and raster spatial data; use available resources to further expand your knowledge of open-source technologies; describe and use a variety of open data formats; code in Python at an intermediate level; read, summarize, visualize, and analyze data using open Python libraries; create spatial predictive models using Python and associated libraries; and use SQL to perform table and spatial queries at an intermediate level.
This dataset release comprises de-identified data from March 2014 - September 2015 of Canvas Network open courses, along with related documentation. In balancing data utility with thorough de-identification, this dataset favors utility; therefore, access to and use of this dataset are restricted as described in the Canvas Network Data Usage Agreement. These data use a star schema to organize various course, activity, and person records using dimensions and facts. The structure of this dataset is based on the Canvas Data star schema as described in https://portal.inshosteddata.com/docs. The first release of this dataset is the Canvas Network Courses, Activities, and Users (4/2014 - 9/2015) Dataset, version 1.0, created on March 3, 2016. The data set is split into multiple files for convenience:
CNCAU_1403-1509_R_v1_03-03-2016.tgz contains the facts and dimensions representing the breadth of the dataset.
CNCAU_1403-1509_R_v1_03-03-2016_requests-01.gz - ...08.gz contain user page view requests.
The resulting files are plain text, with tab-separated values.
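Since the extracted files are described as plain text with tab-separated values, they can likely be read with a standard delimited-text parser. The sketch below uses an in-memory sample with made-up column names and assumes a header row is present; the real column names come from the Canvas Data schema documentation and may differ.

```python
import csv
import io

# Hypothetical two-row extract. Column names are illustrative only, and the
# presence of a header row is an assumption, not something the release states.
sample = "course_id\tuser_id\trequests\nc1\tu1\t12\nc1\tu2\t3\n"


def read_tsv(handle):
    """Parse tab-separated text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(handle, delimiter="\t"))


rows = read_tsv(io.StringIO(sample))
total_requests = sum(int(r["requests"]) for r in rows)
```

For the compressed request files, a text-mode handle from `gzip.open(path, "rt")` could be passed to `read_tsv` in place of the `io.StringIO` object.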
MOOC dataset to study behavior of students for online courses.
It contains data about courses, students and their interactions with Virtual Learning Environment (VLE) for seven selected courses (called modules). Presentations of courses start in February and October - they are marked by “B” and “J” respectively. The dataset consists of tables connected using unique identifiers. All tables are stored in the csv format.
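Because the tables are linked through shared identifiers, a typical first step is to join them on those keys. The sketch below is a minimal standard-library example; the column names, the miniature tables, and the presentation code format (a year followed by "B" or "J") are assumptions made for illustration.

```python
import csv
import io

# Hypothetical miniature tables; real column names may differ.
students_csv = (
    "id_student,code_module,code_presentation\n"
    "1,AAA,2013J\n"
    "2,BBB,2014B\n"
)
clicks_csv = (
    "id_student,code_module,code_presentation,sum_click\n"
    "1,AAA,2013J,5\n"
    "1,AAA,2013J,7\n"
    "2,BBB,2014B,1\n"
)

# Composite key assumed to identify one student in one course presentation.
KEY = ("id_student", "code_module", "code_presentation")


def load(text):
    """Read CSV text into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def total_clicks_per_student(students, clicks):
    """Sum VLE clicks for each student/module/presentation key."""
    totals = {tuple(s[k] for k in KEY): 0 for s in students}
    for row in clicks:
        key = tuple(row[k] for k in KEY)
        if key in totals:
            totals[key] += int(row["sum_click"])
    return totals


def start_month(code_presentation):
    """Map the trailing presentation letter to its start month."""
    return {"B": "February", "J": "October"}[code_presentation[-1]]


totals = total_clicks_per_student(load(students_csv), load(clicks_csv))
```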
Kuzilek J., Hlosta M., Zdrahal Z. Open University Learning Analytics dataset. Sci. Data 4:170171 (2017). doi: 10.1038/sdata.2017.171
Online Courses Dataset
This repository provides a comprehensive dataset of online courses, including details about course categories, duration, platforms, enrollment numbers, completion rates, and ratings. The dataset can be used for trend analysis, platform comparisons, and market insights.
Key Features
Course Categories: Analyze trends across AI, Business, Data Science, Design, Finance, and more. Enrollment Metrics: Understand popularity with student enrollment… See the full description on the dataset page: https://huggingface.co/datasets/Mitul1999/online-courses-usage-and-history-dataset.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains the records of anonymised user interactions in seven online courses at a Higher Education institution in Brazil. For each course, the dataset covers a period spanning from 2017.1 to 2018.1, equivalent to three Brazilian academic periods. All online courses used the Moodle learning platform.
The dataset covers the following courses:
F - An introductory course in Philosophy - mandatory for all students
C - An introductory course in Religion - mandatory for all students
S - An introductory course in Political Theory - mandatory for students of the School of Humanities and Social Sciences
M1 - Differential and Difference Equations course - mandatory for students of the School of Engineering and Exact Sciences
M2 - Single Variable Calculus course - mandatory for students of the School of Engineering and Exact Sciences
E9 - An introductory course in the Design of Control Systems - mandatory for students of the School of Industrial Engineering
E0 - Foundations of Engineering course - mandatory for all students of the School of Engineering
The data is compressed in .zip format and can be uncompressed by standard compression utilities. Each course has three separate files grouped by user interactions from different academic periods. For example, the records for the course 'F' are split into F1, F2 and F3. F1 covers the records of the first academic period, whereas F2 and F3 contain the records for the second and third academic periods respectively. Note that each instance of a course is independent, and that the same student (identified by the same id) may occur in the same course in different academic periods only if they failed and opted to retake that course in one of the subsequent course instances covered by the data available here. The student id is preserved among the courses and academic periods.
A description of the log fields contained in this dataset can be found at: https://docs.moodle.org/dev/Event_2#Information_contained_in_events
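Because student ids are preserved across academic periods, students who retook a course can be found by checking which ids appear in more than one period file of that course. The sketch below assumes the ids have already been extracted from each file into sets; the ids shown are invented.

```python
# Hypothetical student ids per academic period for course 'F'.
# In practice these sets would be built from the F1, F2 and F3 log files.
periods = {
    "F1": {"s01", "s02", "s03"},
    "F2": {"s02", "s04"},
    "F3": {"s02", "s03", "s05"},
}


def retakers(ids_by_period):
    """Map each student id seen in more than one period to those periods."""
    seen = {}
    for period in sorted(ids_by_period):
        for student in ids_by_period[period]:
            seen.setdefault(student, []).append(period)
    return {s: p for s, p in seen.items() if len(p) > 1}


repeat = retakers(periods)
```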
https://brightdata.com/license
We'll tailor a Udemy dataset to meet your unique needs, encompassing course titles, user engagement metrics, completion rates, demographic data of learners, enrollment numbers, review scores, and other pertinent metrics.
Leverage our Udemy datasets for diverse applications to bolster strategic planning and market analysis. Scrutinizing these datasets enables organizations to grasp learner preferences and online education trends, facilitating nuanced educational program development and learning initiatives. Customize your access to the entire dataset or specific subsets as per your business requisites.
Popular use cases involve optimizing educational content based on engagement insights, enhancing learning strategies through targeted learner segmentation, and identifying and forecasting trends to stay ahead in the online education landscape.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This seminar is an applied study of deep learning methods for extracting information from geospatial data, such as aerial imagery, multispectral imagery, digital terrain data, and other digital cartographic representations. We first provide an introduction and conceptualization of artificial neural networks (ANNs). Next, we explore appropriate loss and assessment metrics for different use cases followed by the tensor data model, which is central to applying deep learning methods. Convolutional neural networks (CNNs) are then conceptualized with scene classification use cases. Lastly, we explore semantic segmentation, object detection, and instance segmentation. The primary focus of this course is semantic segmentation for pixel-level classification. The associated GitHub repo provides a series of applied examples. We hope to continue to add examples as methods and technologies further develop. These examples make use of a variety of datasets (e.g., SAT-6, topoDL, Inria, LandCover.ai, vfillDL, and wvlcDL). Please see the repo for links to the data and associated papers. All examples have associated videos that walk through the process, which are also linked to the repo. A variety of deep learning architectures are explored including UNet, UNet++, DeepLabv3+, and Mask R-CNN. Currently, two examples use ArcGIS Pro and require no coding. The remaining five examples require coding and make use of PyTorch, Python, and R within the RStudio IDE. It is assumed that you have prior knowledge of coding in the Python and R environments. If you do not have experience coding, please take a look at our Open-Source GIScience and Open-Source Spatial Analytics (R) courses, which explore coding in Python and R, respectively. After completing this seminar you will be able to: explain how ANNs work including weights, bias, activation, and optimization; describe and explain different loss and assessment metrics and determine appropriate use cases; use the tensor data model to represent data as input for deep learning; explain how CNNs work including convolutional operations/layers, kernel size, stride, padding, max pooling, activation, and batch normalization; use PyTorch, Python, and R to prepare data, produce and assess scene classification models, and infer to new data; explain common semantic segmentation architectures, how these methods allow for pixel-level classification, and how they differ from traditional CNNs; and use PyTorch, Python, and R (or ArcGIS Pro) to prepare data, produce and assess semantic segmentation models, and infer to new data.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Online education has become more prevalent in the 21st century, especially after the COVID-19 pandemic. One of the major trends is learning via Massive Open Online Courses (MOOCs), which are increasingly present at many universities around the world. In these courses, learners interact with pre-designed materials and study mostly by themselves. Therefore, gaining insights into their satisfaction with such courses is vitally important to improve their learning experiences and performance. However, previous studies primarily focused on factors that affected learners' satisfaction, not on the nature and level of that satisfaction. Moreover, past research mainly employed the narrative reviews posted on MOOC platforms; very few studies utilized survey and interview data obtained directly from MOOC users. The present study aims to fill these gaps by employing a mixed-methods approach including a survey design and semi-structured interviews with the participation of 120 students who were taking academic writing courses on Coursera (one of the world-leading MOOC platforms) at a private university in Vietnam. Results from both quantitative and qualitative data showed that overall satisfaction with courses on Coursera was relatively low. Furthermore, most learners were not satisfied with their learning experience on the platform, primarily due to inappropriate assessment, lack of support and of interaction with teachers, and improper plagiarism checks. In addition, there were moderate correlations between students' satisfaction and their perceived usefulness of Coursera courses. Pedagogically, teachers' feedback and grading, faster support from course designers, and easier-to-use plagiarism checking tools are needed to secure learners' satisfaction with MOOCs.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
This dataset contains detailed information on all available Udemy courses on Oct 10, 2022. This data was provided in the "Course_info.csv" file. Also, over 9 million comments were collected and provided in the "Comments.csv" file. The information of over 209k courses was collected by web scraping the Udemy website. Udemy holds 209,734 courses and 73,514 instructors teaching courses in 79 languages in 13 different categories.
The related notebook was uploaded here. If you are interested in analytical data about online learning platforms, I recommend reading the article below for further insights: https://lnkd.in/gjCBhP_P
Learning Management System online courses for USAID staff to access.
Open Government Licence - Canada 2.0 https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
Online learning (e-learning) course enrolment totals by course and year for public and Catholic schools. School boards report this data using the Ontario School Information System (OnSIS).
Includes:
* course code
* course name
* online learning course enrolment totals by year
Enrolment totals include withdrawn or dropped courses. A student enrolled in more than one course is counted for each course. Data excludes private schools and Education and Community Partnership Program (ECPP) facilities. Not all courses offered by school boards are available to students via online learning. Cells are suppressed in categories with fewer than 10 students. Enrolment totals are rounded to the nearest five. Final as of October 4, 2024
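The suppression and rounding rules described above can be illustrated with a short sketch. The order of operations (suppress on the raw count first, then round) and the function name are assumptions; the publisher does not state how the rules are applied internally.

```python
def publish(count, threshold=10, base=5):
    """Apply small-cell suppression, then round to the nearest multiple of base.

    Returns None for suppressed cells. Assumes suppression is decided on the
    raw count before rounding, which the source does not confirm.
    """
    if count < threshold:
        return None
    # Integer counts never sit exactly halfway between multiples of five,
    # so adding base // 2 before floor division gives nearest-five rounding.
    return base * ((count + base // 2) // base)


published = [publish(n) for n in (9, 10, 12, 13, 18)]
```

When interpreting the published totals, note that sums across courses count a student once per course, so they cannot be read as unique student counts.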
The dataset is a collection of student activity traces from six Stanford University courses offered on Coursera.
Seattle Parks and Recreation Golf Course locations. SPR Golf Courses are managed by contractors.
Refresh Cycle: Weekly
Feature Class: DPR.GolfCourse
The open dataset, software, and other files accompanying the manuscript "An Open Combinatorial Diffraction Dataset Including Consensus Human and Machine Learning Labels with Quantified Uncertainty for Training New Machine Learning Models," submitted for publication to Integrated Materials and Manufacturing Innovations.
Machine learning and autonomy are increasingly prevalent in materials science, but existing models are often trained or tuned using idealized data as absolute ground truths. In actual materials science, "ground truth" is often a matter of interpretation and is more readily determined by consensus. Here we present the data, software, and other files for a study using as-obtained diffraction data as a test case for evaluating the performance of machine learning models in the presence of differing expert opinions. We demonstrate that experts with similar backgrounds can disagree greatly even for something as intuitive as using diffraction to identify the start and end of a phase transformation. We then use a logarithmic likelihood method to evaluate the performance of machine learning models in relation to the consensus expert labels and their variance. We further illustrate this method's efficacy in ranking a number of state-of-the-art phase mapping algorithms. We propose a materials data challenge centered around the problem of evaluating models based on consensus with uncertainty. The data, labels, and code used in this study are all available online at data.gov, and the interested reader is encouraged to replicate and improve the existing models or to propose alternative methods for evaluating algorithmic performance.
https://dataverse.harvard.edu/api/datasets/:persistentId/versions/11.2/customlicense?persistentId=doi:10.7910/DVN/26147
This release is comprised of de-identified data from the first year (Academic Year 2013: Fall 2012, Spring 2013, and Summer 2013) of HarvardX courses on the edX platform along with related documentation. These data are aggregate records, and each record represents one individual's activity in one edX course. For more information about the existing analyses of these data and the first year of HarvardX courses, please see the HarvardX and MITx working paper "HarvardX and MITx: The first year of open online courses" by Andrew Ho, Justin Reich, Sergiy Nesterko, Daniel Seaton, Tommy Mullaney, Jim Waldo, and Isaac Chuang (http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2381263). The first release of this dataset is the HarvardX Person-Course Academic Year 2013 De-Identified dataset, version 3.0, created on November 12, 2019. File name: HXPC13_DI_v3_11-13-2019.csv The md5sum for this release (HXPC13_DI_v3_11-13-2019.csv) is: 53419b486c3b19c14d2f06612980f630
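The published md5sum can be used to confirm that a downloaded copy of the file is intact. The following is a minimal sketch using Python's standard `hashlib`; the helper names are hypothetical, and the local path passed to them is whatever location the file was saved to.

```python
import hashlib

# Checksum quoted in the release description for HXPC13_DI_v3_11-13-2019.csv.
EXPECTED_MD5 = "53419b486c3b19c14d2f06612980f630"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected=EXPECTED_MD5):
    """Return True when the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()
```

For example, `verify("HXPC13_DI_v3_11-13-2019.csv")` should return `True` for an unmodified download placed in the working directory.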
https://dataverse.harvard.edu/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.7910/DVN/3UKVOR
[NOTE: Data are currently only accessible to qualified reviewers. For reviewers, detailed dataset descriptions are provided as text files associated with each dataset.] This dataset includes statistics about student actions in MITx and HarvardX courses, used in an analysis of Copying Answers using Multiple Existences Online (CAMEO) behavior. The data are partially anonymized, but insufficiently so for open release.
You should not take this dataset seriously, as it is a synthetic representation based on true trends in education and career outcomes.
This dataset provides insights into how different study habits, learning styles, and external factors influence student performance. It includes 10,000 records, covering details about students' study hours, online learning participation, exam scores, and other factors impacting academic success.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset accompanies the research paper titled "Enhancing Personalized Learning in Online Education through Integrated Cross-Course Learning Path Planning." The dataset consists of MATLAB data files (.mat format).
The dataset includes data on seven types of learner attributes, named from LearnerA.mat to LearnerG.mat. Each learner dataset contains two variables: L and LP. L is a 10x16 matrix that stores learner attributes, where each row represents a learner. The first column indicates the learner's ability level, the second column indicates the expected learning time, columns 3 to 6 represent normalized learning styles, and columns 7 to 16 represent learning objectives. LP is a structure that stores statistical information about this matrix.
The dataset also includes data on seven types of learning resource attributes, named DatasetA.mat, DatasetB.mat, DatasetC.mat, DatasetAB.mat, DatasetAC.mat, DatasetBC.mat, and DatasetABC.mat. Each resource dataset contains two variables: M and MP. M is a matrix that stores the attributes of learning materials, where each row represents a material. The first column indicates the material's difficulty level, the second column represents the learning time required for the material, columns 3 to 6 describe the type of material, columns 7 to 16 cover the knowledge points addressed by the material, and columns 17 to 26 list the prerequisite knowledge points required for the material. MP is a structure that stores statistical information about this matrix.
The dataset encompasses results from learning path planning involving seven types of learners across seven datasets, totaling 49 datasets, named in the format PathCost4_LSHADE_cnEpSin_D_X_L_Y.mat. Here, X represents the type of learning resource dataset (A, B, C, AB, AC, BC, ABC) and Y represents the type of learner (A to G). Each data file contains three variables: Gbest, Gtime, and S. Gbest is a 30x10 matrix, where each column stores the best cost function obtained from 30 runs of path planning for a learner on the corresponding dataset. Gtime is a 30x10 matrix, where each column stores the time spent on each run for a learner on the corresponding dataset. S is a 30x10 cell array storing the status information from each run.
Finally, the dataset includes a compilation of the best cost functions for all runs for all learners across all learning material datasets, named learnerBest.mat. The file contains a variable, learnerBest, which is a 7x7x10x30 four-dimensional array. The first dimension represents the type of learner, the second dimension represents the type of learning material, the third dimension represents the learner index, and the fourth dimension represents the run index.
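The column layout and file naming scheme described above can be made concrete with a small sketch. It works on plain Python lists rather than loaded .mat files, converts the 1-based MATLAB column numbers to 0-based slices, and uses hypothetical helper names; loading the actual files would need a .mat reader, which is outside this example.

```python
RESOURCE_TYPES = ("A", "B", "C", "AB", "AC", "BC", "ABC")
LEARNER_TYPES = ("A", "B", "C", "D", "E", "F", "G")


def split_learner(row):
    """Split one 16-column row of L into its named attribute groups."""
    if len(row) != 16:
        raise ValueError("a learner row is expected to have 16 columns")
    return {
        "ability": row[0],            # column 1
        "expected_time": row[1],      # column 2
        "styles": row[2:6],           # columns 3 to 6
        "objectives": row[6:16],      # columns 7 to 16
    }


def split_material(row):
    """Split one row of M (at least 26 columns) into attribute groups."""
    if len(row) < 26:
        raise ValueError("a material row is expected to have 26 columns")
    return {
        "difficulty": row[0],         # column 1
        "learning_time": row[1],      # column 2
        "material_type": row[2:6],    # columns 3 to 6
        "knowledge_points": row[6:16],    # columns 7 to 16
        "prerequisites": row[16:26],      # columns 17 to 26
    }


def result_filename(resource, learner):
    """Build a result file name following the documented pattern."""
    if resource not in RESOURCE_TYPES or learner not in LEARNER_TYPES:
        raise ValueError("unknown resource or learner type")
    return f"PathCost4_LSHADE_cnEpSin_D_{resource}_L_{learner}.mat"


all_results = [result_filename(x, y) for x in RESOURCE_TYPES for y in LEARNER_TYPES]
```

The 49 generated names match the stated count of result files, which is a quick sanity check on the naming pattern.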