Data Description

The DIPSER dataset is designed to assess student attention and emotion in in-person classroom settings, consisting of RGB camera data, smartwatch sensor data, and labeled attention and emotion metrics. It includes multiple camera angles per student to capture posture and facial expressions, complemented by smartwatch data for inertial and biometric metrics. Attention and emotion labels are derived from self-reports and expert evaluations. The dataset includes diverse demographic groups, with data collected in real-world classroom environments, facilitating the training of machine learning models for predicting attention and correlating it with emotional states.

Data Collection and Generation Procedures

The dataset was collected in a natural classroom environment at the University of Alicante, Spain. The recording setup consisted of six general cameras positioned to capture the overall classroom context and individual cameras placed at each student’s desk. Additionally, smartwatches were used to collect biometric data, such as heart rate, accelerometer, and gyroscope readings.

Experimental Sessions

Nine distinct educational activities were designed to ensure a comprehensive range of engagement scenarios:

- News Reading: Students read projected or device-displayed news.
- Brainstorming Session: Idea generation for problem-solving.
- Lecture: Passive listening to an instructor-led session.
- Information Organization: Synthesizing information from different sources.
- Lecture Test: Assessment of lecture content via mobile devices.
- Individual Presentations: Students present their projects.
- Knowledge Test: Conducted using Kahoot.
- Robotics Experimentation: Hands-on session with robotics.
- MTINY Activity Design: Development of educational activities with computational thinking.

Technical Specifications

- RGB Cameras: Individual cameras recorded at 640×480 pixels, while context cameras captured at 1280×720 pixels.
- Frame Rate: 9-10 FPS depending on the setup.
- Smartwatch Sensors: Collected heart rate, accelerometer, gyroscope, rotation vector, and light sensor data at a frequency of 1-100 Hz.

Data Organization and Formats

The dataset follows a structured directory format:

/groupX/experimentY/subjectZ.zip

Each subject-specific folder contains:

- images/ (individual facial images)
- watch_sensors/ (sensor readings in JSON format)
- labels/ (engagement & emotion annotations)
- metadata/ (subject demographics & session details)

Annotations and Labeling

Each data entry includes engagement levels (1-5) and emotional states (9 categories) based on both self-reported labels and evaluations by four independent experts. A custom annotation tool was developed to ensure consistency across evaluations.

Missing Data and Data Quality

- Synchronization: A centralized server ensured time alignment across devices. Brightness changes were used to verify synchronization.
- Completeness: No major missing data, except for occasional random frame drops due to embedded device performance.
- Data Consistency: Uniform collection methodology across sessions, ensuring high reliability.

Data Processing Methods

To enhance usability, the dataset includes preprocessed bounding boxes for face, body, and hands, along with gaze estimation and head pose annotations. These were generated using YOLO, MediaPipe, and DeepFace.

File Formats and Accessibility

- Images: Stored in standard JPEG format.
- Sensor Data: Provided as structured JSON files.
- Labels: Available as CSV files with timestamps.

The dataset is publicly available under the CC-BY license and can be accessed along with the necessary processing scripts via the DIPSER GitHub repository.

Potential Errors and Limitations

- Due to camera angles, some student movements may be out of frame in collaborative sessions.
- Lighting conditions vary slightly across experiments.
- Sensor latency variations are minimal but exist due to embedded device constraints.

Citation

If you find this project helpful for your research, please cite our work using the following bibtex entry:

@misc{marquezcarpintero2025dipserdatasetinpersonstudent1,
  title={DIPSER: A Dataset for In-Person Student1 Engagement Recognition in the Wild},
  author={Luis Marquez-Carpintero and Sergio Suescun-Ferrandiz and Carolina Lorenzo Álvarez and Jorge Fernandez-Herrero and Diego Viejo and Rosabel Roig-Vila and Miguel Cazorla},
  year={2025},
  eprint={2502.20209},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2502.20209},
}

Usage and Reproducibility

Researchers can utilize standard tools like OpenCV, TensorFlow, and PyTorch for analysis. The dataset supports research in machine learning, affective computing, and education analytics, offering a unique resource for engagement and attention studies in real-world classroom environments.
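
As a rough illustration of the DIPSER directory layout, the sketch below parses a path of the form /groupX/experimentY/subjectZ.zip and checks which of the four documented folders appear among an archive's member names. The function names, the regular expression, and the assumption that the four folders sit at the top level of each archive are all hypothetical; the actual layout should be checked against the DIPSER repository.

```python
import re
from pathlib import PurePosixPath

# Hypothetical pattern for the documented /groupX/experimentY/subjectZ.zip layout.
SUBJECT_PATTERN = re.compile(
    r"group(?P<group>[^/]+)/experiment(?P<experiment>[^/]+)/subject(?P<subject>[^/]+)\.zip$"
)

# Folder names taken from the dataset description.
EXPECTED_FOLDERS = ("images", "watch_sensors", "labels", "metadata")


def parse_subject_path(path):
    """Split an archive path into group, experiment and subject identifiers.

    Returns None when the path does not follow the assumed layout.
    """
    match = SUBJECT_PATTERN.search(PurePosixPath(path).as_posix())
    return match.groupdict() if match else None


def missing_folders(member_names):
    """Report which documented folders are absent from a list of archive members.

    Assumes (unverified) that the folders are top-level entries in the archive.
    """
    present = {name.split("/")[0] for name in member_names if "/" in name}
    return [folder for folder in EXPECTED_FOLDERS if folder not in present]


info = parse_subject_path("/group1/experiment2/subject3.zip")
gaps = missing_folders(["images/frame_0001.jpg", "labels/engagement.csv"])
```

The member names passed to `missing_folders` could come from `zipfile.ZipFile(...).namelist()`; the file names used here are invented for the example.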
https://www.etalab.gouv.fr/licence-ouverte-open-licence
The "Euroscol" label recognizes the commitment of public schools and state-contracted private schools that are engaged in a European dynamic, by leading and taking part in projects and by building European pathways, with a view to creating a European Education Area. More information at https://eduscol.education.fr/1098/euroscol-le-label-des-ecoles-et-des-etablissements-scolaires
Unified school districts provide education to children of all school ages in their service areas. In general, where there is a unified school district, no elementary or secondary school district exists, and where there is an elementary school district, the secondary school district may or may not exist.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Between 1947 and 1956 a study led by Mick Olsen resulted in 6502 school sharks and 587 gummy sharks being tagged in south-east Australia. Most of the school shark were tagged in inshore bays and estuaries, notably Port Phillip Bay, Port Sorell, Georges Bay and Pittwater. Most of the gummy shark were tagged in inshore areas around Flinders Island and the north coast of Tasmania. A total of 594 school shark and 60 gummy shark were recaptured. This data set includes field sheets and the tags returned to CSIRO.
https://www.etalab.gouv.fr/licence-ouverte-open-licence
The "Génération 2024" label for schools aims to build bridges between schools and the sports movement in order to encourage physical activity and sport among young people. This dataset contains the schools that were awarded the label no later than 28 March 2024. More information at https://eduscol.education.fr/cid131907/le-label-generation-2024.html
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
As education increasingly relies on data-driven methodologies, accurately predicting student performance is essential for implementing timely and effective interventions. The California Student Performance Dataset offers a distinctive basis for analyzing complex elements that affect educational results, such as student demographics, academic behaviours, and emotional health. This study presents the GNN-Transformer-InceptionNet (GNN-TINet) model to overcome the constraints of prior models that fail to effectively capture intricate interactions in multi-label contexts, where students may display numerous performance categories concurrently. The GNN-TINet utilizes InceptionNet, transformer architectures, and graph neural networks (GNN) to improve precision in multi-label student performance forecasting. Advanced preprocessing approaches, such as Contextual Frequency Encoding (CFI) and Contextual Adaptive Imputation (CAI), were used on a dataset of 97,000 occurrences. The model achieved exceptional outcomes, exceeding current standards with a Predictive Consistency Score (PCS) of 0.92 and an accuracy of 98.5%. Exploratory data analysis revealed significant relationships between GPA, homework completion, and parental involvement, emphasizing the complex nature of academic achievement. The results illustrate the GNN-TINet’s potential to identify at-risk pupils, providing a robust resource for educators and policymakers to improve learning outcomes. This study enhances educational data mining by enabling focused interventions that promote educational equality, tackling significant challenges in the domain.
The study was designed to determine whether a city-mandated policy requiring calorie labeling at fast food restaurants was associated with consumer awareness of labels, calories purchased, and number of fast food restaurant visits. Point-of-purchase receipts, in-person interviews, and telephone surveys via random-digit dialing were collected as part of this study. Data were collected in Philadelphia before and after calorie labeling was implemented and in Baltimore, where calorie labeling was not implemented. Baseline data collection took place in December 2009 in both Baltimore and Philadelphia. Data were collected again after calorie labeling took effect in Philadelphia in February 2010. Further follow-up data collection occurred in June 2010.
Researchers collected data on whether or not consumers reported seeing calorie labeling in the restaurant, whether they bought fewer or more calories as a result of the labeling, and how frequently they went to fast food restaurants. They also collected data on consumer age, gender, race, education, income, and BMI category. A total of 2,083 usable observations across both cities and data collection periods are included in the dataset.
The elementary school districts provide education to the lower grade/age levels.
The secondary school districts provide education to the upper grade/age levels.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
List of files
Ontology
For more details, please refer to http://w3id.org/ckgg
During 1997-98, 46 school shark were tagged with archival tags in South Australian and Tasmanian waters, and 19 tags were recovered. The tags yielded 15.3 years of data on light level, depth and temperature collected at 4-minute intervals. The basic release-recapture data has been entered into the CSIRO pelagic tag database, but the actual electronic data has not. The electronic data for the Lotek tags is in a different format to that of the Wildlife Computers tags, and may require dedicated geolocation software to process. Wildlife Computers provides geolocation software for their tags free of charge.

While longitudinal movements have been described, there was no analysis of corresponding latitudes, as light-based latitude estimation was unreliable. There is scope for additional research into latitudinal movements based on the depth data. The depth pattern shown by a shark can be used to examine whether the fish was close to the bottom and, combined with a longitude estimate for a particular day, can give an estimate of latitude, because across much of southern Australia depth increases with latitude. However, there is a software development challenge associated with this, as there may be more than one depth fit for a particular longitude, especially towards eastern Australia. In this eastern region the restricted depth of Bass Strait can provide additional information on the latitude, as a depth reading greater than 86 m indicates that the fish was in water too deep for Bass Strait.

An additional factor that was not examined was the temperature data from the tags. In pelagic species surface water temperature is used to estimate latitude, and at times school shark do come close to the surface. Some of the tags were set up to record internal as well as external temperatures, but this data was not examined. There have been 2 recaptures of Wildlife Computers tags since West & Stevens (1996) published the results. There have also been two Lotek tags returned since this publication, but the data for these tags was corrupted.
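
To give a sense of scale for the archival tag data, the sketch below works out roughly how many 4-minute samples 15.3 years of recording represents, and encodes the Bass Strait depth rule described above as a simple check. The 365.25-day year and the function name are assumptions made for illustration; this is not the processing used in the original study.

```python
MINUTES_PER_DAY = 24 * 60
SAMPLE_INTERVAL_MIN = 4        # sampling interval stated in the description
DAYS_PER_YEAR = 365.25         # assumption: average year length
RECORDED_YEARS = 15.3          # total across all recovered tags
BASS_STRAIT_MAX_DEPTH_M = 86   # depth threshold stated in the description

# One sample every 4 minutes gives 360 samples per day.
samples_per_day = MINUTES_PER_DAY // SAMPLE_INTERVAL_MIN

# Approximate number of samples per recorded variable across all tags.
total_samples = round(RECORDED_YEARS * DAYS_PER_YEAR * samples_per_day)


def rules_out_bass_strait(depth_m):
    """Return True when a depth reading is too deep to come from Bass Strait."""
    return depth_m > BASS_STRAIT_MAX_DEPTH_M


print(samples_per_day, total_samples)
print(rules_out_bass_strait(120.0), rules_out_bass_strait(60.0))
```

Under these assumptions the record holds on the order of two million samples per variable, which is why dedicated software is a sensible route for any depth-based latitude analysis.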
https://www.etalab.gouv.fr/licence-ouverte-open-licence
Launched in 2021 by the Haut Conseil de l’éducation artistique et culturelle (HCEAC, the High Council for Arts and Cultural Education), the 100% EAC label recognizes a territory's commitment to making arts and cultural education (EAC) available to all. Awarded for a renewable five-year period, the label highlights local authorities and intermunicipal bodies that offer arts and cultural education to all young people in their territory, from early childhood to adulthood. It comes with methodological tools for drawing up an assessment of the current situation and a strategy, and it helps to strengthen the coherence of action, bring stakeholders together, mobilize other partners, and sustain and develop existing programs. The Ministers of Culture and of National Education, who co-chair the HCEAC, have entrusted prefects and rectors with awarding the label, after consulting the regional services of both ministries. In the first round in 2022, 79 territories spread across all regions were awarded the 100% EAC label; 78 (including two overseas) were awarded it in 2023. Following the first two calls for applications, 157 territories hold the 100% EAC label.
Notes:
- The map shows the areas of the départements; a département may contain one or more labeled local authorities, in which case the arrows let you view all of them.
- The data on partnerships and programs are valid only for the year in which the label was awarded, since they may change over time.
To find out more: https://www.culture.gouv.fr/catalogue-des-demarches-et-subventions/appels-a-projets-candidatures/Label-100-EAC
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Overview: This is a lab-based dataset with videos recording volunteers (medical students) washing their hands as part of a hand-washing monitoring and feedback experiment. The dataset is collected in the Medical Education Technology Center (METC) of Riga Stradins University, Riga, Latvia. In total, 72 participants took part in the experiments, each washing their hands three times, in a randomized order, going through three different hand-washing feedback approaches (user interfaces of a mobile app). The data was annotated in real time by a human operator, in order to give the experiment participants real-time feedback on their performance. There are 212 hand washing episodes in total, each of which is annotated by a single person. The annotations classify the washing movements according to the World Health Organization's (WHO) guidelines by marking each frame in each video with a certain movement code.
This dataset is part of a series of three datasets, all following the same format:
Note #1: we recommend that when using this dataset for machine learning, allowances are made for the reaction speed of the human operator labeling the data. For example, the annotations can be expected to be incorrect a short while after the person in the video switches their washing movements.
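
One simple way to make such an allowance is to exclude the frames just before each annotated label change, since the operator's reaction delay means the real movement probably switched before the annotation did. The sketch below is a minimal illustration of that idea; the function name, the example label sequence, and the choice of `lag_frames` are assumptions, not values from the dataset.

```python
def reliable_frame_mask(labels, lag_frames):
    """Return one boolean per frame; False marks frames whose label may be stale.

    A frame is flagged when the annotated label changes within the next
    `lag_frames` frames, because the annotation is assumed to trail the
    actual change of movement by up to that many frames.
    """
    mask = [True] * len(labels)
    for i in range(1, len(labels)):
        if labels[i] != labels[i - 1]:
            # Frames i-lag_frames .. i-1 still carry the old label but may
            # already show the new movement.
            for j in range(max(0, i - lag_frames), i):
                mask[j] = False
    return mask


# Illustrative per-frame movement codes (not taken from the dataset).
frame_labels = [1, 1, 1, 1, 2, 2, 2, 0, 0]
mask = reliable_frame_mask(frame_labels, lag_frames=2)
kept = [label for label, ok in zip(frame_labels, mask) if ok]
```

A suitable `lag_frames` depends on the video frame rate and on how quickly the operator reacted, so it is best treated as a tunable parameter.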
Application: The intention of this dataset is to serve as a basis for training machine learning classifiers for automated hand washing movement recognition and quality control.
Statistics:
Movement codes (in JSON files):
Note #2: The original dataset of JPG images is available upon request. There are 13 annotation classes in the original dataset: for each of the six washing movements defined by the WHO, "correct" and "incorrect" execution is marked with two different labels. In this published dataset, all incorrect executions are marked with code 0, as "other" washing movement.
Acknowledgments: The dataset collection was funded by the Latvian Council of Science project: "Automated hand washing quality control and quality evaluation system with real-time feedback", No: lzp - Nr. 2020/2-0309.
References: For more detailed information, see this article, describing a similar dataset collected in a different project:
M. Lulla, A. Rutkovskis, A. Slavinska, A. Vilde, A. Gromova, M. Ivanovs, A. Skadins, R. Kadikis, A. Elsts. Hand-Washing Video Dataset Annotated According to the World Health Organization’s Hand-Washing Guidelines. Data. 2021; 6(4):38. https://doi.org/10.3390/data6040038
Contact information: atis.elsts@edi.lv
Between 1947 and 1956 a study led by Mick Olsen resulted in 6502 school sharks and 587 gummy sharks being tagged in south-east Australia. Most of the school shark were tagged in inshore bays and estuaries, notably Port Phillip Bay, Port Sorell, Georges Bay and Pittwater. Most of the gummy shark were tagged in inshore areas around Flinders Island and the north coast of Tasmania. A total of 594 school shark and 60 gummy shark were recaptured. This data set includes field sheets and the tags returned to CSIRO.
These records are cataloged in the TRIM Records database, as follows:
AB2008/1038: CMAR - School and Gummy Shark Tagging by CSIRO in Southern Australia 1947-1956 - Mick Olsen and Grant West - MarLIN record 8218
This Archive Box number incorporates 2 containers:
- "C2008/6921-01: CMAR - School and Gummy Shark Tagging by CSIRO in Southern Australia 1947-1956 - Mick Olsen and Grant West - MarLIN record 8218 - Part 1 - Tag Data Field Sheets" [associated files lodged within as separate objects]
- "C2008/6921-02: CMAR - School and Gummy Shark Tagging by CSIRO in Southern Australia 1947-1956 - Mick Olsen and Grant West - MarLIN record 8218 - Part 2 - Tags and Olsen Card Index [in metal filing cabinet]"
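
For orientation, the tagging and recapture counts quoted above imply recapture rates of roughly 9% for school shark and 10% for gummy shark. The short sketch below shows the arithmetic; the variable and function names are illustrative only.

```python
# Counts quoted in the dataset description.
tagged = {"school shark": 6502, "gummy shark": 587}
recaptured = {"school shark": 594, "gummy shark": 60}


def recapture_rate(species):
    """Return the fraction of tagged animals of a species that were recaptured."""
    return recaptured[species] / tagged[species]


rates = {species: recapture_rate(species) for species in tagged}
for species, rate in rates.items():
    print(f"{species}: {rate:.1%}")
```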
The pupil to teacher ratio data includes figures for both elementary and high schools in Champaign County. This indicator includes the following school districts: Champaign Community Unit School District #4, Fisher Community Unit School District #1, Gifford Community Consolidated Grade School District #188, Ludlow Community Consolidated School District #142, Mahomet-Seymour Community Unit School District #3, Rantoul City School District #137, Rantoul Township High School District #193, St. Joseph Community Consolidated School District #169, St. Joseph-Ogden Community High School District #305, Tolono Community Unit School District #7, and Urbana School District #116. How many pupils per teacher there are in a district can reflect a number of other conditions. We included this indicator to provide some information on classroom size and instruction.
The pupil to teacher ratio shifts slightly from year to year in most districts, but the changes are often relatively small. Most districts’ ratios hover between 15:1 and 25:1 for most or all of the measured time period, with a few districts consistently below 15:1. The average ratio for all Champaign County schools was 16:1 every year from 2008 through 2020, reaching a new low of 15:1 for three of the four years between 2021 and 2024. There is no county-wide unifying trend.
This data, along with a variety of other school district data, is available on the Illinois Report Card, an Illinois State Board of Education and Northern Illinois University website.
Sources:
- Illinois Report Card. (2023-2024). Champaign CUSD 4. Illinois State Board of Education. (Accessed 6 December 2024).
- Illinois Report Card. (2023-2024). Fisher CUSD 1. Illinois State Board of Education. (Accessed 6 December 2024).
- Illinois Report Card. (2023-2024). Gifford CCSD 188. Illinois State Board of Education. (Accessed 6 December 2024).
- Illinois Report Card. (2023-2024). Ludlow CCSD 142. Illinois State Board of Education. (Accessed 6 December 2024).
- Illinois Report Card. (2023-2024). Mahomet-Seymour CUSD 3. Illinois State Board of Education. (Accessed 6 December 2024).
- Illinois Report Card. (2023-2024). Prairieview-Ogden CCSD 197. Illinois State Board of Education. (Accessed 6 December 2024).
- Illinois Report Card. (2023-2024). Rantoul City SD 137. Illinois State Board of Education. (Accessed 6 December 2024).
- Illinois Report Card. (2023-2024). Rantoul Township HSD 193. Illinois State Board of Education. (Accessed 6 December 2024).
- Illinois Report Card. (2023-2024). St. Joseph CCSD 169. Illinois State Board of Education. (Accessed 6 December 2024).
- Illinois Report Card. (2023-2024). St. Joseph Ogden CHSD 305. Illinois State Board of Education. (Accessed 6 December 2024).
- Illinois Report Card. (2023-2024). Thomasboro CCSD 130. Illinois State Board of Education. (Accessed 6 December 2024).
- Illinois Report Card. (2023-2024). Tolono CUSD 7. Illinois State Board of Education. (Accessed 6 December 2024).
- Illinois Report Card. (2023-2024). Urbana SD 116. Illinois State Board of Education. (Accessed 6 December 2024).
This dataset lists the primary schools of the Montpellier académie (regional education authority) that hold a digital label. Since 2018, the "label numérique école" (school digital label) has been a process for structuring the dialogue between schools in the Montpellier académie and local authorities.
The label is awarded to elementary and primary schools for 2 school years. It makes it possible to:
- attest to digital practices and to the equipment levels that enable them;
- structure and strengthen the dialogue between the school and the municipality;
- share common reference frameworks for developing relevant pedagogical uses.
The label has 3 levels and positions the school against 5 criteria:
- Identification of digital contact persons at school and municipal level
- Levels of use of the académie's primary-level digital workspace (ENT), ENT-école
- Presence of digital projects in classrooms
- The school's participation in the support offered by the circonscription (local school district)
- Equipment provided by the local authority
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This paper aims to provide a first systematic research overview of student learning outcomes in programs teaching school subjects through languages other than English (LOTE) which are not the mother tongue of the students, according to school- or researcher-administered assessments and stakeholder perspectives, following the PRISMA statement. For brevity, we shall refer to these types of programs as CLIL in LOTE, though we have also included programs which use other labels, such as bilingual education or immersion, due to their similarities with those labelled “content and language integrated learning” (CLIL). The selected studies, published between November 1994 and December 2023, were identified through the search of SCOPUS and EBSCO. In determining which studies to include in the review, we employed the following selection criteria: (1) articles focusing on children and youth (ages 5–17 years), (2) articles focusing on CLIL programs in LOTE, (3) articles focusing on student achievement, (4) articles focusing on studies that have collected primary data, and (5) studies that used school-/researcher-administered assessments (objective) or self/hetero-reported measures (subjective). The present paper will present the data following narrative synthesis procedures, describing outcomes and identifying patterns across the reviewed studies. Theoretically, the study contributes to establishing more general theories about the specific role of CLIL in LOTE in students’ learning. Empirically, the study outlines pathways for future research on CLIL in LOTE. In practice, the results are discussed in light of implications for implementing effective educational practices while teaching through a CLIL approach in LOTE.
This dataset lists the collèges (lower secondary schools) of the Occitanie region that have been awarded the digital label since 2017. From 2017 to 2023, only collèges in the Montpellier académie were concerned. In 2024, 4 départements of the Toulouse académie joined the digital labeling scheme: Ariège, Haute-Garonne, Gers and Hautes-Pyrénées. Labeled collèges are ranked on 3 levels, which gauge their involvement and also serve as the basis for selecting schools for differentiated resource allocation.
The digital label award committee reaches its decision by scoring a set of criteria grouped under 4 areas:
- Digital steering within the school
- The school's digital infrastructure and equipment
- The school's digital services and pedagogical uses
- Support and training of teaching staff in digital technology
The départements and the academic authorities undertake to provide the award winners with additional resources to facilitate the use of digital technology for teaching purposes.
Commitments of the départements to the award winners:
- Development of infrastructure: network, Wi-Fi, servers,
- Additional equipment: video projectors, desktop computers, specialized furniture, etc.,
- Support for use of the digital workspace (ENT),
- Support for digital projects: educational activities and specific projects.
Commitments of the académies to the award winners:
- Staff training,
- Provision of suitable teaching resources,
- Support for digital practices from the inspectorate,
- Support for pedagogical innovation.
Licence Ouverte / Open Licence 2.0: https://www.etalab.gouv.fr/wp-content/uploads/2018/11/open-licence.pdf
License information was derived automatically
This dataset contains the list of centers holding the "Label qualité FLE". The result of a quality assurance process undertaken by the Ministry of Higher Education and Research, the Ministry of Culture and the Ministry for Europe and Foreign Affairs, the Qualité FLE label aims to identify, recognize and promote centers for French as a foreign language (FLE) whose language courses and services offer guarantees of quality.
To find out more about the label: https://www.qualitefle.fr/une-garantie-officielle
To view the map of centers: https://www.qualitefle.fr/carte-des-centres-labellises
This dataset lists the lycées (upper secondary schools) of the Occitanie academic region that have been awarded the digital label since 2017, except for the year 2020. Labeled lycées are ranked on 3 levels, which gauge their involvement and also serve as the basis for selecting schools for differentiated resource allocation.
The digital label award committee reaches its decision by scoring a set of criteria grouped under 4 areas:
- Digital steering within the school
- The school's digital infrastructure and equipment
- The school's digital services and pedagogical uses
- Support and training of teaching staff in digital technology
Award-winning schools undertake to use digital tools in specific projects.
The Region and the academic authorities undertake to provide the award winners with additional resources to facilitate the use of digital technology for teaching purposes.
Commitments of the Region to the award winners:
- Free laptops for all students in seconde (the first year of lycée),
- Development of infrastructure: network, Wi-Fi, servers,
- Additional equipment: video projectors, desktop computers, specialized furniture, etc.,
- Support for use of the digital workspace (ENT),
- Support for digital projects: educational activities and specific projects.
Commitments of the académies to the award winners:
- Staff training,
- Provision of suitable teaching resources,
- Support for digital practices from the inspectorate,
- Support for pedagogical innovation.