The Backbencher Dataset is a unique and fun dataset designed to analyze and explore student behaviors and trends in classrooms. This dataset focuses on the attendance patterns, assignment completion rates, and other factors that influence a student’s academic performance, with a quirky twist: it includes a column identifying whether the student wears glasses!
This dataset is ideal for Machine Learning practitioners and Data Science enthusiasts who want to work on real-world datasets with an engaging context. It can be used for various ML problems such as:
Predictive Analytics: Predicting student performance based on attendance and assignments.
Clustering Analysis: Grouping students based on shared characteristics.
Classification Tasks: Classifying students as "active" or "inactive" based on participation metrics.
Key Features:
USN: Unique Student Number for identification.
Name: Student names (for reference).
Attendance (%): Percentage of classes attended.
Assignments Completed: Number of assignments completed.
Exam Scores: Performance in exams.
Participation in Activities: Measures involvement in extracurricular activities.
Glasses (Yes/No): Whether the student wears glasses (interesting feature for pattern recognition).
Use Cases:
Educational data analysis and predictive modeling.
Creating engaging ML projects for students and beginners.
Developing dashboards for visualizing student performance trends.
http://opendatacommons.org/licenses/dbcl/1.0/
This dataset compiles the top 2500 datasets from Kaggle, encompassing a diverse range of topics and contributors. It provides insights into dataset creation, usability, popularity, and more, offering valuable information for researchers, analysts, and data enthusiasts.
Research Analysis: Researchers can utilize this dataset to analyze trends in dataset creation, popularity, and usability scores across various categories.
Contributor Insights: Kaggle contributors can explore the dataset to gain insights into factors influencing the success and engagement of their datasets, aiding in optimizing future submissions.
Machine Learning Training: Data scientists and machine learning enthusiasts can use this dataset to train models for predicting dataset popularity or usability based on features such as creator, category, and file types.
Market Analysis: Analysts can leverage the dataset to conduct market analysis, identifying emerging trends and popular topics within the data science community on Kaggle.
Educational Purposes: Educators and students can use this dataset to teach and learn about data analysis, visualization, and interpretation within the context of real-world datasets and community-driven platforms like Kaggle.
Column Definitions:
Dataset Name: Name of the dataset.
Created By: Creator(s) of the dataset.
Last Updated in number of days: Time elapsed since last update.
Usability Score: Score indicating the ease of use.
Number of File: Quantity of files included.
Type of file: Format of files (e.g., CSV, JSON).
Size: Size of the dataset.
Total Votes: Number of votes received.
Category: Categorization of the dataset's subject matter.
Attribution-ShareAlike 2.0 (CC BY-SA 2.0) https://creativecommons.org/licenses/by-sa/2.0/
License information was derived automatically
Subject: Education
Specific: Online Learning and Fun
Type: Questionnaire survey data (csv / excel)
Date: February - March 2020
Content: Students' views about online learning and fun
Data Source: Project OLAF
Value: These data provide students' beliefs about how learning occurs and correlations with fun. Participants were 206 students from the OU.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is about books. It has 1 row and is filtered where the book is Paper scissors glue : 45 fun and creative papercraft projects for kids. It features 7 columns including author, publication date, language, and book publisher.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is about book subjects. It has 4 rows and is filtered where the book is Summer reading program fun : 10 thrilling, inspiring, wacky board games for kids. It features 10 columns including number of authors, number of books, earliest publication date, and latest publication date.
You have some experience with ANN, but you’re new to computer vision. This is the perfect introduction to working with bigger images than MNIST and also working with raw images.
Your goal is to correctly identify the type of CSGO gun from a dataset of about 980 labeled images. We encourage you to experiment with different algorithms to learn first-hand what works well and how techniques compare.
This dataset was created by Yash Khatri by extracting images from third-party CSGO software.
It was inspired by my interest in deep learning and CSGO.
https://cdla.io/sharing-1-0/
The dataset includes:
Roll Number: Represents the roll number of the student.
Gender: Useful for analyzing performance differences between male and female students.
Race/Ethnicity: Allows analysis of academic performance trends across different racial or ethnic groups.
Parental Level of Education: Indicates the educational background of the student's family.
Lunch: Shows whether students receive a free or reduced lunch, which is often a socioeconomic indicator.
Test Preparation Course: This tells whether students completed a test prep course, which could impact their performance.
Math Score: Provides a measure of each student’s performance in math, used to calculate averages or trends across various demographics.
Science Score: Evaluates students' science knowledge, which can be analyzed to assess the student's overall scientific knowledge.
Reading Score: Measures performance in reading, allowing for insights into literacy and comprehension levels among students.
Writing Score: Evaluates students' writing skills, which can be analyzed to assess overall literacy and expression.
Total Score: Shows the total score achieved by the student, out of 400.
Grade: Grade achieved by the student. "A" grade if Total marks >= 320, "B" grade if Total marks >= 250, "C" grade if Total marks >= 200, "D" grade if Total marks >= 150, and Fail if Total marks < 150.
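The grading thresholds listed for this dataset can be expressed as a small function. This is a minimal sketch, not the author's code: the function name `assign_grade`, the range check, and the exact "Fail" label string are assumptions made for illustration.

```python
def assign_grade(total_score: float) -> str:
    """Map a total score (out of 400) to a grade using the listed thresholds."""
    # Assumption: scores outside 0..400 are treated as invalid input.
    if not 0 <= total_score <= 400:
        raise ValueError("total score must be between 0 and 400")
    if total_score >= 320:
        return "A"
    if total_score >= 250:
        return "B"
    if total_score >= 200:
        return "C"
    if total_score >= 150:
        return "D"
    # Anything below 150 is a fail (label string is an assumption).
    return "Fail"
```

Checking the thresholds from highest to lowest means each branch only needs a lower bound, which mirrors how the rule is stated in the description.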
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset provides a collection of 160 instances belonging to two classes ('pass' = 136 and 'fail' = 24). The data is an anonymised, statistically sound and reliable representation of the original data collected from students studying computer science modules at a UK university. Each instance is made up of 19 features plus the class label. Eight of the features represent students' online behaviour, including bio information retrieved from the Virtual Learning Environment. Eleven of the features represent students' neighbourhood influence, retrieved from the Office for Students database. The data has been compiled and made available in de-facto/de-jure standard open formats (CSV and JSON).
This data was collected and used in a research study undertaken by academics and researchers at the Computer Science Department, Edge Hill University, United Kingdom. To encourage reproducibility of the experiments and results reported, the data is provided in the exact training-validation-testing splits used in the experiments.
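Because the two classes are imbalanced (136 pass versus 24 fail), it can help to know the accuracy obtained by always predicting the majority class before judging any model trained on this data. The sketch below uses only the class counts stated in the description; the variable names are illustrative and the check is not part of the original study.

```python
# Class counts as stated in the dataset description.
class_counts = {"pass": 136, "fail": 24}

total = sum(class_counts.values())
majority_label = max(class_counts, key=class_counts.get)

# Accuracy of a trivial classifier that always predicts the majority class.
baseline_accuracy = class_counts[majority_label] / total

print(f"{majority_label}: {baseline_accuracy:.2%}")  # pass: 85.00%
```

A classifier evaluated on data with this class balance needs to exceed roughly 85% accuracy to improve on the trivial baseline, so metrics such as per-class recall may be more informative than accuracy alone.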
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description. This project contains the dataset for the Galatanet survey, conducted in 2009 and 2010 at the Galatasaray University in Istanbul (Turkey). The goal of this survey was to retrieve information regarding the social relationships between students, their feelings about the university in general, and their purchase behavior. The survey was conducted in two phases: the first in 2009 and the second in 2010.
The dataset includes two kinds of data. First, the answers to most of the questions are contained in a large table, available in both CSV and MS Excel formats. A description file explains the meaning of each field appearing in the table. Note that the survey form is also contained in the archive, for reference (it is in French and Turkish only, though). Second, the social network of students is available in both Pajek and Graphml formats. Having both individual (nodal attributes) and relational (links) information in the same dataset is, to our knowledge, rare and difficult to find in public sources, and this makes (in our opinion) this dataset interesting and valuable.
All data are completely anonymous: students' names have been replaced by random numbers. Note that the survey is not exactly the same between the two phases: some small adjustments were applied thanks to the feedback from the first phase (but the datasets have been normalized since then). Also, the electronic form was very much improved for the second phase, which explains why the answers are much more complete than in the first phase.
The data were used in our following publications:
Citation. If you use this data, please cite article [1] above:
@InProceedings{Labatut2010,
  author = {Labatut, Vincent and Balasque, Jean-Michel},
  title = {Business-oriented Analysis of a Social Network of University Students},
  booktitle = {International Conference on Advances in Social Networks Analysis and Mining},
  year = {2010},
  pages = {25-32},
  address = {Odense, DK},
  publisher = {IEEE Publishing},
  doi = {10.1109/ASONAM.2010.15},
}
Contact. 2009-2010 by Jean-Michel Balasque (jmbalasque@gsu.edu.tr) & Vincent Labatut (vlabatut@gsu.edu.tr)
License. This dataset is open data: you can redistribute it and/or use it under the terms of the Creative Commons Zero license (see `license.txt`).
Updated 30 January 2023
There has been some confusion around licensing for this data set. Dr. Carla Patalano and Dr. Rich Huebner are the original authors of this dataset.
We provide a license to anyone who wishes to use this dataset for learning or teaching. For the purposes of sharing, please follow this license:
CC-BY-NC-ND This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
https://rpubs.com/rhuebner/hrd_cb_v14
PLEASE NOTE -- I recently updated the codebook - please use the above link. A few minor discrepancies were identified between the codebook and the dataset. Please feel free to contact me through LinkedIn (www.linkedin.com/in/RichHuebner) to report discrepancies and make requests.
HR data can be hard to come by, and HR professionals generally lag behind with respect to analytics and data visualization competency. Thus, Dr. Carla Patalano and I set out to create our own HR-related dataset, which is used in one of our graduate MSHRM courses called HR Metrics and Analytics, at New England College of Business. We created this data set ourselves. We use the data set to teach HR students how to use and analyze the data in Tableau Desktop - a data visualization tool that's easy to learn.
This version provides a variety of features that are useful for both data visualization AND creating machine learning / predictive analytics models. We are working on expanding the data set even further by generating even more records and a few additional features. We will be keeping this as one file/one data set for now. There is a possibility of creating a second file perhaps down the road where you can join the files together to practice SQL/joins, etc.
Note that this dataset isn't perfect. By design, there are some issues that are present. It is primarily designed as a teaching data set - to teach human resources professionals how to work with data and analytics.
We have reduced the complexity of the dataset down to a single data file (v14). The CSV revolves around a fictitious company and the core data set contains names, DOBs, age, gender, marital status, date of hire, reasons for termination, department, whether they are active or terminated, position title, pay rate, manager name, and performance score.
Recent additions to the data include:
- Absences
- Most Recent Performance Review Date
- Employee Engagement Score
Dr. Carla Patalano provided the baseline idea for creating this synthetic data set, which has been used now by over 200 Human Resource Management students at the college. Students in the course learn data visualization techniques with Tableau Desktop and use this data set to complete a series of assignments.
We've included some open-ended questions that you can explore and try to address through creating Tableau visualizations, or R or Python analyses. Good luck and enjoy the learning!
Many other interesting questions could be addressed with this data set. Dr. Patalano and I look forward to seeing what we can come up with.
If you have any questions or comments about the dataset, please do not hesitate to reach out to me on LinkedIn: http://www.linkedin.com/in/RichHuebner
You can also reach me via email at: Richard.Huebner@go.cambridgecollege.edu
5 Fun and Educational Video Games for Kids
It's difficult to strike a balance between screen time for children. Parents are concerned about gaming, but they are hoping it will be instructive. Fortunately, several games combine enjoyment and learning, allowing children to learn while having fun. These games go beyond amusement, involving puzzles, math, critical thinking, and creativity. Some even increase motor abilities, such as mouse sensitivity, without the kids' knowledge. The correct… See the full description on the dataset page: https://huggingface.co/datasets/mouse-sensitivity/video-games-for-kids.
http://www.gnu.org/licenses/fdl-1.3.html
No descriptions are provided as this is EDA. Your own understanding of the data is important. Try to Have Fun with the data.
A series of fun and engaging activities have been created to get the public outside to learn about MPAs while also enjoying them in person.
This dataset was created as a result of students who used a gamified learning application with interactive flashcards and badges to engage them in learning about statistics and multidimensional statistical analysis.
The dataset includes data on various variables, including the practice exam grades before using the educational platform, the final exam grades after using the platform, whether or not the student is a user of the platform, the average grade for each of the six quizzes, and the number of times each quiz was submitted.
❗️This dataset is a modified version of the original data collected. Its analysis is available in the paper DOES GAMIFICATION LEAD TO BETTER RESULTS IN EDUCATION?
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains information on Brisbane City Council's GOLD (Growing Older, Living Dangerously) n' kids events. These events usually take place during Queensland school holidays. Outside of these times, the feed may appear empty. The dataset includes locations, dates and times.
Brisbane City Council's events data containing dates, costs, booking requirements, venue and location for GOLD n' kids events in Brisbane.
The dataset was created using data from an external service called Trumba. The data is a transformed extract created using the Trumba Calendar API XML feed, that is limited to the next 1,000 events. The transformed extract is converted to a CSV file and uploaded into this dataset daily.
To access and view the data using the Source API (Trumba), use the information below and your preferred link in the Data and Resources section. The Source API is available for this dataset in:
Trumba Calendar - API - XML feed is limited to the next 1,000 events
Trumba Calendar - API - RSS feed is limited to the next 1,000 events
Trumba Calendar - API - CSV feed is limited to the next 2,000 events
Trumba Calendar - API - JSON feed is limited to the next 2,000 events.
The Data and resources section of this dataset contains further information for this dataset.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
On Wednesday, September 17, my research team and I observed a line-up of students waiting for free hot dogs to the West of Stedman Lecture Halls. Their gender was observed by eye and noted, as well as whether they had a two strap backpack or not. Any other type of bag was grouped together with students not carrying a bag. This was continued for a total of 25 students.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The aim of this paper is to investigate the re-use of research data deposited in a digital data archive in the social sciences. The study examines the quantity, type, and purpose of data downloads by analyzing enriched user log data collected from a Swiss data archive. The findings show that quantitative datasets are downloaded increasingly from the digital archive and that downloads focus heavily on a small share of the datasets. The most frequently downloaded datasets are survey datasets collected by research organizations offering possibilities for longitudinal studies. Users typically download only one dataset, but a group of heavy downloaders accounts for a remarkable share of all downloads. The main user group downloading data from the archive is students who use the data in their studies. Furthermore, datasets downloaded for research purposes are often, but not always, used in scholarly publications. Enriched log data from data archives offer an interesting macro-level perspective on the use and users of the services and help in understanding the increasing role of repositories in the social sciences. The study provides insights into the potential of collecting and using log data for studying and evaluating data archive use.
https://search.gesis.org/research_data/datasearch-httpwww-da-ra-deoaip--oaioai-da-ra-de441791
Abstract (en): This instructional package includes a student manual containing six exercises, an instructor's guide, and four subsets of data required for use in conjunction with the manual's exercises. The package's major purpose is to enable students to examine certain substantive questions about electoral behavior through analysis of political survey data. The manual avoids instruction in methodology, per se, hence the student is taken no further than the analysis of straightforward variables in percentagized tables with and without controls, and is introduced to epsilon, the percentage difference measures based on 2 X 2 tables, but offered no elaborate discussion of measures of association. The six structured exercises introduce the basic analytic techniques necessary for coping with survey data in the expectation that the students will then move on to their own topics. The datasets were designed to be both substantively and analytically interesting, as students are forced continually to make choices and judgments about which variables to use and how to combine code categories. Beyond this, the exercises serve a more complex purpose: to help the student gain a better understanding of the existing scholarly literature by going through steps similar to those of the original analysts. In some instances, the students can readily appreciate how close their work is to the analysis in assigned reading. The instructor's guide has two purposes: first, to help instructors use the student manual effectively, and second, to suggest various ways to depart from the six exercises and to develop essentially new manuals. The subsets (Parts 1-4) contain data from every presidential election survey that was conducted by the Survey Research Center (SRC) and Center for Political Studies (CPS) (at the University of Michigan) from 1952 to 1980. Part 4 contains an extensive set of variables drawn exclusively from the CPS's AMERICAN NATIONAL ELECTION STUDY, 1980 (ICPSR 7763). 
This is the only deck needed to complete Exercises 1-5. Part 1 includes small sets of comparable variables from each SRC/CPS presidential election study from 1952-1972. The variables in these decks were selected with the intention of providing students with a range of interesting possibilities for original research topics for term papers. Part 2 includes variables and respondents from panel surveys contained in AMERICAN NATIONAL ELECTION SERIES: 1972, 1974, 1976 (ICPSR 7607). This dataset may be used with Exercise 6. Supplementing the panel file is the data in Part 3, based on the cross-section survey, AMERICAN NATIONAL ELECTION STUDY, 1976 (7381). It repeats the variables from the 1976 component of the panel, with a much larger N. The AMERICAN NATIONAL ELECTION STUDY, 1976 (7381) may be used independently, as with the AMERICAN NATIONAL ELECTION STUDY, 1980 (ICPSR 7763), or it may be used in exercises comparing cross-section with panel data. Data used for the exercises were made available by ICPSR. The major analyses of these data have appeared in two publications: (1) University of Michigan. Survey Research Center. THE AMERICAN VOTER. New York, NY: Wiley, 1960, and (2) Campbell, Angus, Philip Converse, Warren Miller, and Donald Stokes. ELECTIONS AND THE POLITICAL ORDER. New York, NY: Wiley, 1966.
2006-01-12 All files were removed from dataset 6 and flagged as study-level files, so that they will accompany all downloads.
2006-01-12 All files were removed from dataset 5 and flagged as study-level files, so that they will accompany all downloads.
The codebooks, Student Manual for All Parts and the Guide to Instruction for All Parts, are provided by ICPSR as a Portable Document Format (PDF) file. The PDF file format was developed by Adobe Systems Incorporated and can be accessed using PDF reader software, such as the Adobe Acrobat Reader. Information on how to obtain a copy of the Acrobat Reader is provided on the ICPSR Web site.
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Student Learning Methods: A Survey - Dataset Description
The Student Learning Methods: A Survey dataset comprises responses from 100 university students, with 10 participants surveyed from each of 10 different universities. This dataset explores students' preferences and evaluations of various learning methods based on effectiveness and engagement.
Key Features of the Dataset:
Survey Scope:
Responses collected from 100 students.
Participants evenly distributed across 10 different universities (10 students per university).
Learning Methods Evaluated:
The dataset includes ratings for various learning techniques, such as:
Lectures – Traditional classroom-based teaching.
Case Studies – Analyzing real-world scenarios to understand concepts.
Group Projects – Collaborative assignments involving multiple students.
Experiments – Hands-on practical work in labs or controlled settings.
Online Tutorials – Digital or video-based instructional materials.
Evaluation Criteria:
Each learning method is rated on a numerical scale based on:
Effectiveness – How well students believe the method helps in learning.
Engagement – How interesting or interactive the method is perceived to be.
Secondary Evaluations:
The dataset includes repeated columns for learning methods, potentially representing:
Post-survey reflections where students reassessed their initial responses.
Comparative evaluations of different methods after exposure to multiple approaches.
Overall Effectiveness and Engagement Scores:
Each student provides aggregate scores summarizing how useful and engaging they found different learning methods overall.
Potential Use Cases:
Educational Research – Understanding which teaching techniques are most effective across universities.
Curriculum Development – Helping educators refine teaching strategies.
Student-Centric Learning Models – Identifying preferred methods to enhance student engagement.
Comparative Analysis – Examining how student preferences vary across universities.
The surveyed universities include:
Delhi University (DU) – A large central university in Delhi.
Jawaharlal Nehru University (JNU) – A well-known research-focused university in Delhi.
Banaras Hindu University (BHU) – A prestigious university in Varanasi, Uttar Pradesh.
Aligarh Muslim University (AMU) – A renowned university in Aligarh, Uttar Pradesh.
Chandigarh University – A fast-growing private university in Punjab.
Kurukshetra University – A public university in Haryana.
Himachal Pradesh University (HPU) – A state university in Shimla, Himachal Pradesh.
Guru Gobind Singh Indraprastha University (GGSIPU) – A Delhi-based state university.
Dr. B. R. Ambedkar University, Agra – A public university in Uttar Pradesh.
Uttarakhand Technical University (UTU) – A state technical university in Uttarakhand.
This dataset offers valuable insights into student learning preferences, enabling researchers and educators to tailor teaching methods for maximum impact.
Recently Updated Version