22 datasets found
  1. Data from: DIPSER: A Dataset for In-Person Student Engagement Recognition in...

    • observatorio-cientifico.ua.es
    • scidb.cn
    Updated 2025
    Cite
    Márquez-Carpintero, Luis; Suescun-Ferrandiz, Sergio; Álvarez, Carolina Lorenzo; Fernandez-Herrero, Jorge; Viejo, Diego; Roig-Vila, Rosabel; Cazorla, Miguel (2025). DIPSER: A Dataset for In-Person Student Engagement Recognition in the Wild [Dataset]. https://observatorio-cientifico.ua.es/documentos/67321d21aea56d4af0484172
    Explore at:
    Dataset updated
    2025
    Authors
    Márquez-Carpintero, Luis; Suescun-Ferrandiz, Sergio; Álvarez, Carolina Lorenzo; Fernandez-Herrero, Jorge; Viejo, Diego; Roig-Vila, Rosabel; Cazorla, Miguel
    Description

    Data Description

    The DIPSER dataset is designed to assess student attention and emotion in in-person classroom settings, consisting of RGB camera data, smartwatch sensor data, and labeled attention and emotion metrics. It includes multiple camera angles per student to capture posture and facial expressions, complemented by smartwatch data for inertial and biometric metrics. Attention and emotion labels are derived from self-reports and expert evaluations. The dataset includes diverse demographic groups, with data collected in real-world classroom environments, facilitating the training of machine learning models for predicting attention and correlating it with emotional states.

    Data Collection and Generation Procedures

    The dataset was collected in a natural classroom environment at the University of Alicante, Spain. The recording setup consisted of six general cameras positioned to capture the overall classroom context and individual cameras placed at each student’s desk. Additionally, smartwatches were used to collect biometric data, such as heart rate, accelerometer, and gyroscope readings.

    Experimental Sessions

    Nine distinct educational activities were designed to ensure a comprehensive range of engagement scenarios:

    • News Reading – Students read projected or device-displayed news.
    • Brainstorming Session – Idea generation for problem-solving.
    • Lecture – Passive listening to an instructor-led session.
    • Information Organization – Synthesizing information from different sources.
    • Lecture Test – Assessment of lecture content via mobile devices.
    • Individual Presentations – Students present their projects.
    • Knowledge Test – Conducted using Kahoot.
    • Robotics Experimentation – Hands-on session with robotics.
    • MTINY Activity Design – Development of educational activities with computational thinking.

    Technical Specifications

    • RGB Cameras: Individual cameras recorded at 640×480 pixels, while context cameras captured at 1280×720 pixels.
    • Frame Rate: 9-10 FPS depending on the setup.
    • Smartwatch Sensors: Collected heart rate, accelerometer, gyroscope, rotation vector, and light sensor data at a frequency of 1–100 Hz.

    Data Organization and Formats

    The dataset follows a structured directory format:

    /groupX/experimentY/subjectZ.zip

    Each subject-specific folder contains:

    • images/ (individual facial images)
    • watch_sensors/ (sensor readings in JSON format)
    • labels/ (engagement & emotion annotations)
    • metadata/ (subject demographics & session details)

    Annotations and Labeling

    Each data entry includes engagement levels (1-5) and emotional states (9 categories) based on both self-reported labels and evaluations by four independent experts. A custom annotation tool was developed to ensure consistency across evaluations.

    Missing Data and Data Quality

    • Synchronization: A centralized server ensured time alignment across devices. Brightness changes were used to verify synchronization.
    • Completeness: No major missing data, except for occasional random frame drops due to embedded device performance.
    • Data Consistency: Uniform collection methodology across sessions, ensuring high reliability.

    Data Processing Methods

    To enhance usability, the dataset includes preprocessed bounding boxes for face, body, and hands, along with gaze estimation and head pose annotations. These were generated using YOLO, MediaPipe, and DeepFace.

    File Formats and Accessibility

    • Images: Stored in standard JPEG format.
    • Sensor Data: Provided as structured JSON files.
    • Labels: Available as CSV files with timestamps.

    The dataset is publicly available under the CC-BY license and can be accessed along with the necessary processing scripts via the DIPSER GitHub repository.

    Potential Errors and Limitations

    • Due to camera angles, some student movements may be out of frame in collaborative sessions.
    • Lighting conditions vary slightly across experiments.
    • Sensor latency variations are minimal but exist due to embedded device constraints.

    Citation

    If you find this project helpful for your research, please cite our work using the following bibtex entry:

    @misc{marquezcarpintero2025dipserdatasetinpersonstudent1,
      title={DIPSER: A Dataset for In-Person Student Engagement Recognition in the Wild},
      author={Luis Marquez-Carpintero and Sergio Suescun-Ferrandiz and Carolina Lorenzo Álvarez and Jorge Fernandez-Herrero and Diego Viejo and Rosabel Roig-Vila and Miguel Cazorla},
      year={2025},
      eprint={2502.20209},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2502.20209},
    }

    Usage and Reproducibility

    Researchers can utilize standard tools like OpenCV, TensorFlow, and PyTorch for analysis. The dataset supports research in machine learning, affective computing, and education analytics, offering a unique resource for engagement and attention studies in real-world classroom environments.
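
    The per-subject archive layout described above can be explored with a short script. The following is a minimal sketch, not the authors' processing code: it assumes the four folders (images/, watch_sensors/, labels/, metadata/) sit at the top level of each subjectZ.zip, and the file names used in the demo archive are hypothetical.

```python
import io
import zipfile
from collections import defaultdict

# Folder names come from the dataset description; their position at the
# top level of the archive is an assumption.
EXPECTED_FOLDERS = ("images", "watch_sensors", "labels", "metadata")


def index_subject_archive(archive):
    """Group the file names inside one subject archive by top-level folder."""
    grouped = defaultdict(list)
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            if name.endswith("/"):  # skip directory entries
                continue
            top = name.split("/", 1)[0]
            if top in EXPECTED_FOLDERS:
                grouped[top].append(name)
    return {folder: sorted(grouped.get(folder, [])) for folder in EXPECTED_FOLDERS}


def build_demo_archive():
    """Create a tiny in-memory archive with hypothetical file names."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as zf:
        zf.writestr("images/frame_0001.jpg", b"")
        zf.writestr("watch_sensors/heart_rate.json", "[]")
        zf.writestr("labels/engagement.csv", "timestamp,engagement\n")
        zf.writestr("metadata/subject.json", "{}")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for folder, files in index_subject_archive(build_demo_archive()).items():
        print(folder, files)
```

    For a real download, a path such as the one to a subjectZ.zip file can be passed to index_subject_archive in place of the in-memory demo archive. If the folders turn out to be nested under a subject directory, the top-level split would need adjusting.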

  2. CADDI: An in-Class Activity Detection Dataset using IMU data from low-cost...

    • scidb.cn
    • observatorio-cientifico.ua.es
    Updated May 28, 2024
    Cite
    Luis Marquez-Carpintero; Sergio Suescun-Ferrandiz; Monica Pina-Navarro; Francisco Gomez-Donoso; Miguel Cazorla (2024). CADDI: An in-Class Activity Detection Dataset using IMU data from low-cost sensors [Dataset]. http://doi.org/10.57760/sciencedb.08377
    Explore at:
    Dataset updated
    May 28, 2024
    Dataset provided by
    Science Data Bank
    Authors
    Luis Marquez-Carpintero; Sergio Suescun-Ferrandiz; Monica Pina-Navarro; Francisco Gomez-Donoso; Miguel Cazorla
    Description

    Data Description

    The CADDI dataset is designed to support research in in-class activity recognition using IMU data from low-cost sensors. It provides multimodal data capturing 19 different activities performed by 12 participants in a classroom environment, utilizing both IMU sensors from a Samsung Galaxy Watch 5 and synchronized stereo camera images. This dataset enables the development and validation of activity recognition models using sensor fusion techniques.

    Data Generation Procedures

    The data collection process involved recording both continuous and instantaneous activities that typically occur in a classroom setting. The activities were captured using a custom setup, which included:

    • A Samsung Galaxy Watch 5 to collect accelerometer, gyroscope, and rotation vector data at 100 Hz.
    • A ZED stereo camera capturing 1080p images at 25-30 fps.
    • A synchronized computer acting as a data hub, receiving IMU data and storing images in real-time.
    • A D-Link DSR-1000AC router for wireless communication between the smartwatch and the computer.

    Participants were instructed to arrange their workspace as they would in a real classroom, including a laptop, notebook, pens, and a backpack. Data collection was performed under realistic conditions, ensuring that activities were captured naturally.

    Temporal and Spatial Scope

    • The dataset contains a total of 472.03 minutes of recorded data.
    • The IMU sensors operate at 100 Hz, while the stereo camera captures images at 25-30 Hz.
    • Data was collected from 12 participants, each performing all 19 activities multiple times.
    • The geographical scope of data collection was Alicante, Spain, under controlled indoor conditions.

    Dataset Components

    The dataset is organized into JSON and PNG files, structured hierarchically:

    • IMU Data: Stored in JSON files, containing:
      • Samsung Linear Acceleration Sensor (X, Y, Z values, 100 Hz)
      • LSM6DSO Gyroscope (X, Y, Z values, 100 Hz)
      • Samsung Rotation Vector (X, Y, Z, W quaternion values, 100 Hz)
      • Samsung HR Sensor (heart rate, 1 Hz)
      • OPT3007 Light Sensor (ambient light levels, 5 Hz)
    • Stereo Camera Images: High-resolution 1920×1080 PNG files from left and right cameras.
    • Synchronization: Each IMU data record and image is timestamped for precise alignment.

    Data Structure

    The dataset is divided into continuous and instantaneous activities:

    • Continuous Activities (e.g., typing, writing, drawing) were recorded for 210 seconds, with the central 200 seconds retained.
    • Instantaneous Activities (e.g., raising a hand, drinking) were repeated 20 times per participant, with data captured only during execution.

    The dataset is structured as:

    /continuous/subject_id/activity_name/
        /camera_a/ → Left camera images
        /camera_b/ → Right camera images
        /sensors/ → JSON files with IMU data

    /instantaneous/subject_id/activity_name/repetition_id/
        /camera_a/
        /camera_b/
        /sensors/

    Data Quality & Missing Data

    • The smartwatch buffers 100 readings per second before sending them, ensuring minimal data loss.
    • Synchronization latency between the smartwatch and the computer is negligible.
    • Not all IMU samples have corresponding images due to different recording rates.
    • Outliers and anomalies were handled by discarding incomplete sequences at the start and end of continuous activities.

    Error Ranges & Limitations

    • Sensor data may contain noise due to minor hand movements.
    • The heart rate sensor operates at 1 Hz, limiting its temporal resolution.
    • Camera exposure settings were automatically adjusted, which may introduce slight variations in lighting.

    File Formats & Software Compatibility

    • IMU data is stored in JSON format, readable with Python’s json library.
    • Images are in PNG format, compatible with all standard image processing tools.
    • Recommended libraries for data analysis:
      • Python: numpy, pandas, scikit-learn, tensorflow, pytorch
      • Visualization: matplotlib, seaborn
      • Deep Learning: Keras, PyTorch

    Potential Applications

    • Development of activity recognition models in educational settings.
    • Study of student engagement based on movement patterns.
    • Investigation of sensor fusion techniques combining visual and IMU data.

    This dataset represents a unique contribution to activity recognition research, providing rich multimodal data for developing robust models in real-world educational environments.

    Citation

    If you find this project helpful for your research, please cite our work using the following bibtex entry:

    @misc{marquezcarpintero2025caddiinclassactivitydetection,
      title={CADDI: An in-Class Activity Detection Dataset using IMU data from low-cost sensors},
      author={Luis Marquez-Carpintero and Sergio Suescun-Ferrandiz and Monica Pina-Navarro and Miguel Cazorla and Francisco Gomez-Donoso},
      year={2025},
      eprint={2503.02853},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2503.02853},
    }
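
    Because the IMU stream (100 Hz) is sampled faster than the camera (25-30 Hz), pairing the two modalities usually means matching each IMU sample to its nearest image by timestamp. The sketch below illustrates that idea together with the "central 200 seconds" trimming mentioned in the description. It is an illustration under assumptions, not the authors' pipeline: timestamps are taken to be plain sorted numbers in a shared unit, and the function names are hypothetical.

```python
from bisect import bisect_left


def trim_central(timestamps, keep=200.0):
    """Keep the samples that fall in the central `keep` time units of a recording.

    `timestamps` must be sorted in ascending order.
    """
    if not timestamps:
        return []
    start, end = timestamps[0], timestamps[-1]
    margin = max(0.0, (end - start - keep) / 2.0)
    return [t for t in timestamps if start + margin <= t <= end - margin]


def nearest_frame(frame_times, t):
    """Return the index of the frame whose timestamp is closest to `t`.

    `frame_times` must be sorted in ascending order and non-empty.
    Ties are resolved in favour of the earlier frame.
    """
    i = bisect_left(frame_times, t)
    if i == 0:
        return 0
    if i == len(frame_times):
        return len(frame_times) - 1
    before, after = frame_times[i - 1], frame_times[i]
    return i if (after - t) < (t - before) else i - 1


if __name__ == "__main__":
    # Synthetic timestamps in milliseconds: IMU every 10 ms, frames every 40 ms.
    imu_ms = list(range(0, 1000, 10))
    frame_ms = list(range(0, 1000, 40))
    pairs = [(t, frame_ms[nearest_frame(frame_ms, t)]) for t in imu_ms[:6]]
    print(pairs)
```

    With this approach several IMU samples map to the same image, which is consistent with the note above that not every IMU sample has its own corresponding frame.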

  3. Data from: An integrated approach for scheduling health care activities in a...

    • researchdatafinder.qut.edu.au
    • researchdata.edu.au
    Updated Jul 17, 2017
    Cite
    Dr Robert Burdett (2017). An integrated approach for scheduling health care activities in a hospital [Dataset]. https://researchdatafinder.qut.edu.au/individual/n7946
    Explore at:
    Dataset updated
    Jul 17, 2017
    Dataset provided by
    Queensland University of Technology (QUT)
    Authors
    Dr Robert Burdett
    Description

    To effectively utilise hospital beds, operating rooms (OR) and other treatment spaces, it is necessary to precisely plan patient admissions and treatments in advance. As patient treatment and recovery times are unequal and uncertain, this is not easy. In response a sophisticated flexible job-shop scheduling (FJSS) model is introduced, whereby patients, beds, hospital wards and health care activities are respectively treated as jobs, single machines, parallel machines and operations. Our approach is novel because an entire hospital is describable and schedulable in one integrated approach. The scheduling model can be used to recompute timings after deviations, delays, postponements and cancellations. It also includes advanced conditions such as activity and machine setup times, transfer times between activities, blocking limitations and no wait conditions, timing and occupancy restrictions, buffering for robustness, fixed activities and sequences, release times and strict deadlines. To solve the FJSS problem, constructive algorithms and hybrid meta-heuristics have been developed. Our numerical testing shows that the proposed solution techniques are capable of solving problems of real world size. This outcome further highlights the value of the scheduling model and its potential for integration into actual hospital information systems.

  4. Guidelines to Minimize Impacts of Data Gathering Activities on Pinyon Jays

    • pinyon-jay-community-science-gbbo.hub.arcgis.com
    Updated Feb 19, 2022
    Cite
    Great Basin Bird Observatory (2022). Guidelines to Minimize Impacts of Data Gathering Activities on Pinyon Jays [Dataset]. https://pinyon-jay-community-science-gbbo.hub.arcgis.com/documents/0b1ee2a5e6c34e50afd0548010a6f2fe
    Explore at:
    Dataset updated
    Feb 19, 2022
    Dataset authored and provided by
    Great Basin Bird Observatory
    Description

    This document from the Pinyon Jay Working Group presents recommendations and guidelines to prevent researchers or surveyors from inadvertently disturbing or negatively affecting Pinyon Jays during their work.

  5. National Household Education Survey, 2001 - Version 1

    • search.gesis.org
    Updated Feb 26, 2021
    Cite
    United States Department of Education. National Center for Education Statistics (2021). National Household Education Survey, 2001 - Version 1 [Dataset]. http://doi.org/10.3886/ICPSR03198.v1
    Explore at:
    Dataset updated
    Feb 26, 2021
    Dataset provided by
    GESIS search
    ICPSR - Interuniversity Consortium for Political and Social Research
    Authors
    United States Department of Education. National Center for Education Statistics
    License

    https://search.gesis.org/research_data/datasearch-httpwww-da-ra-deoaip--oaioai-da-ra-de436161

    Description

    Abstract (en): The National Household Education Survey (NHES) reports on the condition of education in the United States by collecting data at the household level rather than using a traditional, school-based data collection system. The surveys attempt to address many current issues in education, such as preprimary education, school safety and discipline, adult education, and activities related to citizenship. This survey included three topical survey components. The Early Childhood Program Participation (ECPP) Survey (Part 1) gathered information on the nonparental care arrangements and educational programs of preschool children, such as care by relatives, care by persons to whom they were not related, and participation in day care centers and preschool programs including Head Start. The Before- and After-School Programs and Activities (ASPA) Survey (Part 2) addressed relative and nonrelative care for school-age children during the out-of-school hours, including home schooling as well as participation in before- and/or after-school programs, activities, and self-care. The Adult Education and Lifelong Learning (AELL) Survey (Part 3) collected data such as type of program, employer support, and credential sought for participation in the following types of adult educational activities: English as a second language, adult basic education, credential programs, apprenticeships, work-related courses, and personal interest courses. Some information on work-related informal learning activities was gathered as well.

    ICPSR data undergo a confidentiality review and are altered when necessary to limit the risk of disclosure. ICPSR also routinely creates ready-to-go data files along with setups in the major statistical software formats as well as standard codebooks to accompany the data. In addition to these procedures, ICPSR performed the following processing steps for this data collection: Checked for undocumented or out-of-range codes.

    National sample of household members in the United States. National sample of households.

    2006-01-18 File UG3198.ALL.PDF was removed from any previous datasets and flagged as a study-level file, so that it will accompany all downloads.
    2006-01-18 File QU3198.ALL.PDF was removed from any previous datasets and flagged as a study-level file, so that it will accompany all downloads.

    The codebooks, user guide, and data collection instrument are provided by the ICPSR as Portable Document Format (PDF) files. The PDF file format was developed by Adobe Systems Incorporated and can be accessed using PDF reader software, such as the Adobe Acrobat Reader. Information on how to obtain a copy of the Acrobat Reader is provided on the ICPSR Web site.

  6. User data collection in select mobile iOS apps for kids worldwide 2021, by...

    • statista.com
    Updated Jul 7, 2022
    Cite
    User data collection in select mobile iOS apps for kids worldwide 2021, by type [Dataset]. https://www.statista.com/statistics/1302472/data-points-collected-kids-apps-ios-by-type/
    Explore at:
    Dataset updated
    Jul 7, 2022
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Mar 2021
    Area covered
    Worldwide
    Description

    As of March 2021, YouTube Kids and Facebook Messenger Kids were the mobile apps for children found to collect the largest amount of data from global iOS users. The apps collected a total of 15 data points from each of the examined data types. Language learning app Lingokids and educational app ABCmouse followed with 10 data points. The types of data that the examined children's apps collected most often were contact information and diagnostics.

    Children mobile privacy

    From online education to gaming and social media, children and young users are increasingly active in online environments via mobile devices. In 2021, playing online games and watching YouTube videos figured among the most popular mobile activities for kids worldwide, while less than five in 10 reported using their phones to complete assignments for school. As vulnerable users, children are entitled to institutional protection and lower interference from tech companies. However, mobile apps designed for children still collect data from their young users. As of the beginning of 2022, money management and gaming apps were the app categories found to track the largest number of data segments from children, with 10.1 and 9.3 data points tracked, respectively.

    Child proof social media?

    While the impact of social media on younger users’ development is yet to be fully understood, parents and educators were quick to realize that social media expands the range of dangers children can encounter while being online. In 2021, children in the United States and in the United Kingdom spent an average of 98 minutes per day on TikTok, as well as 83 minutes daily on Snapchat. In the U.S., both Snapchat and TikTok agreed to respect the age limit restrictions set by the Children's Online Privacy Protection Act (COPPA), and while Snapchat discontinued its children-specific Snapkidz app in 2016, TikTok relies on its TikTok Younger Users platform for users younger than 13. Despite the majority of social media services requiring users to be at least 13 years old, a survey conducted in 2021 in the United Kingdom has found that 60 percent of all surveyed kids aged between eight and 11 had their own social media profile.

  7. Volunteer Activities Survey 2018 - South Africa

    • microdata.worldbank.org
    • catalog.ihsn.org
    Updated Nov 1, 2021
    + more versions
    Cite
    Statistics South Africa (2021). Volunteer Activities Survey 2018 - South Africa [Dataset]. https://microdata.worldbank.org/index.php/catalog/4186
    Explore at:
    Dataset updated
    Nov 1, 2021
    Dataset authored and provided by
    Statistics South Africa (http://www.statssa.gov.za/)
    Time period covered
    2018
    Area covered
    South Africa
    Description

    Abstract

    The Volunteer Activities Survey (VAS) is a household-based survey conducted by Statistics South Africa (Stats SA). The VAS collects information on the volunteer activities of individuals aged 15 years and older in South Africa. The respondents were selected from households who took part in the second quarter Quarterly Labour Force Survey (QLFS). Volunteer activities cover unpaid non-compulsory work; that is, the time individuals give without pay to activities performed either through an organization or directly for others outside their own household.

    Data on volunteering provides important information on skills application, social network development, social capital and quality of life outcomes. The main aim of the survey is to provide information on the scale of volunteer work and bring into view the sizeable part of the labour force that is invisible in existing labour statistics. The objectives of the VAS are:

    • To collect reliable data about people who are involved in volunteer activities.
    • To identify organization-based and direct volunteering.
    • To give a profile of those engaged in volunteer activities.
    • To estimate the economic value of volunteer work.

    Geographic coverage

    National coverage

    Analysis unit

    Households and individuals

    Universe

    The target population of the survey consists of individuals aged 15 years and older who live in South Africa and who are members of households living in dwellings that have been selected to take part in the second quarter Quarterly Labour Force Survey (QLFS).

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    The Quarterly Labour Force Survey (QLFS) sample frame was used for data collection in the VAS. The sample for the QLFS is based on a stratified two-stage design with probability proportional to size (PPS) sampling of primary sampling units (PSUs) in the first stage, and sampling of dwelling units (DUs) with systematic sampling in the second stage. The frame was developed as a general-purpose household survey frame that can be used by all other household surveys irrespective of the sample size requirement of the survey. The sample is based on information collected by Statistics SA during the 2001 Population Census and is designed to be representative at the provincial level and within provinces at the metro/non-metro level. Within the metros, the sample is further distributed by geography type. The four geography types are: urban formal, urban informal, farms and tribal land.
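
    The two-stage design described above (PPS selection of PSUs, then systematic sampling of dwelling units) can be illustrated with a small sketch. This is a generic textbook version under assumptions, not Statistics South Africa's actual procedure: it uses systematic PPS for the first stage, and the PSU sizes, dwelling identifiers, and function names are all hypothetical.

```python
import random


def pps_systematic(sizes, n, rng):
    """Select n units with probability proportional to size (systematic PPS).

    Returns the indices of the selected units. A unit larger than the
    sampling interval can be selected more than once.
    """
    total = sum(sizes)
    interval = total / n
    start = rng.random() * interval  # random start in [0, interval)
    selected = []
    cumulative = 0.0
    idx = 0
    for k in range(n):
        target = start + k * interval
        # Advance until the target falls inside the current unit's size range.
        while idx < len(sizes) - 1 and cumulative + sizes[idx] <= target:
            cumulative += sizes[idx]
            idx += 1
        selected.append(idx)
    return selected


def systematic_sample(units, n, rng):
    """Select n units at a fixed step after a random start (requires n <= len(units))."""
    step = len(units) / n
    start = rng.random() * step
    last = len(units) - 1
    return [units[min(int(start + k * step), last)] for k in range(n)]


if __name__ == "__main__":
    rng = random.Random(2018)
    psu_sizes = [120, 80, 200, 50, 150]          # hypothetical household counts
    chosen_psus = pps_systematic(psu_sizes, 2, rng)
    dwellings = [f"DU-{i:03d}" for i in range(10)]  # hypothetical dwelling IDs
    print(chosen_psus, systematic_sample(dwellings, 3, rng))
```

    Larger PSUs occupy a longer stretch of the cumulative size scale, so they are more likely to contain one of the evenly spaced selection points.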

    Mode of data collection

    Face-to-face [f2f]

    Research instrument

    The 2018 VAS questionnaire consists of the following sections:

    • Particulars of the dwelling
    • Households at selected dwelling unit
    • Response details
    • Main activities

  8. Survey of Activities of Young People 1999 - South Africa

    • datafirst.uct.ac.za
    Updated Jul 12, 2020
    + more versions
    Cite
    Statistics South Africa (2020). Survey of Activities of Young People 1999 - South Africa [Dataset]. https://www.datafirst.uct.ac.za/dataportal/index.php/catalog/313
    Explore at:
    Dataset updated
    Jul 12, 2020
    Dataset authored and provided by
    Statistics South Africa (http://www.statssa.gov.za/)
    Time period covered
    1999
    Description

    Abstract

    The Survey of Activities of Young People was conducted by Statistics South Africa and commissioned by the Department of Labour, primarily to gather information necessary for formulating an effective programme of action to address the issue of harmful work done by children in South Africa. Technical assistance for the survey was provided by the International Labour Organisation (ILO) and a consultant appointed by the Department of Labour. Stats SA also worked with an advisory committee, consisting of representatives from national government departments most directly concerned with child labour (the Departments of Labour, Welfare, Education and Health), non-governmental organisations, and the United Nations Children's Fund (Unicef).

    Geographic coverage

    The survey has national coverage

    Analysis unit

    Households and individuals

    Universe

    The sampled population was household members in South Africa. The survey excluded all people in prison, patients in hospitals, people residing in boarding houses and hotels, and boarding schools. Any single person households were screened out in all areas before the sample was drawn. Families living in hostels were treated as households.

    Kind of data

    Sample survey data

    Sampling procedure

    The sample frame was based on the 1996 Population Census Enumerator Areas (EA) and the number of households counted in 1996 Population Census. The sampled population excluded all prisoners in prison, patients in hospitals, people residing in boarding houses and hotels (whether temporary or semi-permanent), and boarding schools. Any single person households were screened out in all areas before the sample was drawn. Families living in hostels were treated as households. Coverage rules for the survey were that all children of usual residents were to be included even if they were not present. This means that most boarding school pupils were included in their parents’ household. The 16 EA types from the 1996 Population Census were condensed into four area types. The four area types were Formal Urban, Informal Urban, Tribal, and Commercial Farms. A decision was made to drop the Institution type EAs.

    The EAs were stratified by province, and within a province by the four area types defined above. The sample size (6110 households) was disproportionately allocated to strata by using the square root method. Within the strata the EAs were ordered by magisterial district and the EA-types included in the area type (implicit stratification). PSUs consisted of ONE or more EAs of size 100 households to ensure sufficient numbers for screening. Statistics SA was advised by child labour experts that there was a likelihood of high rates of child labour in the Urban Informal and Rural Farm areas. The sample allocation to Rural Commercial Farms was therefore increased to a minimum of 20 PSUs.
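
    The "square root method" mentioned above allocates the sample in proportion to the square root of each stratum's size, which gives small strata a larger share than proportional allocation would. A minimal sketch follows. The stratum names and household counts are hypothetical, and the largest-remainder rounding step is an assumption added so that the allocations sum exactly to the total; the survey documentation does not describe its rounding.

```python
import math


def sqrt_allocation(stratum_sizes, total_sample):
    """Allocate total_sample across strata in proportion to sqrt(stratum size)."""
    roots = {name: math.sqrt(size) for name, size in stratum_sizes.items()}
    root_sum = sum(roots.values())
    exact = {name: total_sample * r / root_sum for name, r in roots.items()}

    # Round down, then hand out the remaining units to the strata with
    # the largest fractional parts (largest-remainder rounding).
    alloc = {name: math.floor(x) for name, x in exact.items()}
    leftover = total_sample - sum(alloc.values())
    by_remainder = sorted(exact, key=lambda name: exact[name] - alloc[name], reverse=True)
    for name in by_remainder[:leftover]:
        alloc[name] += 1
    return alloc


if __name__ == "__main__":
    # Hypothetical household counts per stratum.
    strata = {"Formal Urban": 900_000, "Informal Urban": 100_000}
    print(sqrt_allocation(strata, 6110))
```

    In this hypothetical two-stratum case the smaller stratum holds 10 percent of households but receives about 25 percent of the sample, which shows the disproportionate allocation the text refers to.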

    Mode of data collection

    Face-to-face [f2f]

    Research instrument

    The Phase one questionnaire covered the following topics: Living conditions of the household, including the type of dwelling, fuels used for cooking, lighting and heating, water source for domestic use, land ownership, tenure and cultivation; demographic information on members of the household, both adults and children. Questions covered the age, gender and population group of each household member, their marital status, their relationships to each other, and their levels of education; migration details; household income; school attendance of children aged 5-17 years; information on economic and non-economic activities of children aged 5-17 years in the 12 months prior to the survey.

    Phase two questionnaire

    The second phase questionnaire was administered to the sampled sub-set of households in which at least one child was involved in some form of work in the year prior to the interview. It covered activities of children in much more detail than in phase one, and the work situation of related adults in the household. Both adults and children were asked to respond.

    The data files contain data from sections of the questionnaires as follows:

    • PERSON: Data from Section 1, 2 and 3 of the questionnaire
    • HHOLD: Data from Section 4
    • ADULT: Data from Section 5
    • YOUNGP: Data from Section 6, 7, 8 and 9

  9. Data usage in consumer products and retail industry 2020

    • statista.com
    Updated Dec 13, 2023
    Cite
    Statista (2023). Data usage in consumer products and retail industry 2020 [Dataset]. https://www.statista.com/statistics/1262066/data-usage-in-consumer-products-and-retail-industry/
    Explore at:
    Dataset updated
    Dec 13, 2023
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Aug 2020
    Area covered
    Worldwide
    Description

    A global survey from Capgemini showed that retail companies were lagging behind consumer products enterprises in the use of data. The gap was significant in the automation of processes and in data collecting: only 34 percent of retailers automated data collection, against 45 percent of consumer goods companies. However, one in four organizations in both categories reported to have implemented practices involving data engineering, machine learning, and DevOps.

  10. Ireland No of Enterprises: Industry: WS: Waste Collection, Treatment and...

    • ceicdata.com
    Updated Dec 15, 2024
    Cite
    CEICdata.com (2024). Ireland No of Enterprises: Industry: WS: Waste Collection, Treatment and Disposal Activities, Materials Recovery [Dataset]. https://www.ceicdata.com/en/ireland/number-of-enterprises/no-of-enterprises-industry-ws-waste-collection-treatment-and-disposal-activities-materials-recovery
    Explore at:
    Dataset updated
    Dec 15, 2024
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Dec 1, 2008 - Dec 1, 2016
    Area covered
    Ireland
    Variables measured
    Enterprises Statistics
    Description

    Ireland Number of Enterprises: Industry: WS: Waste Collection, Treatment and Disposal Activities, Materials Recovery data was reported at 578.000 Unit in 2016, a decrease from the previous figure of 579.000 Unit for 2014. The series is updated yearly, with a median of 582.000 Unit across 8 observations from Dec 2008 to 2016. It reached an all-time high of 611.000 Unit in 2012 and a record low of 544.000 Unit in 2008. The series remains in active status in CEIC and is reported by the Central Statistics Office of Ireland. It is categorized under Global Database’s Ireland – Table IE.O005: Number of Enterprises.

  11. 2019 Farm to School Census v2

    • agdatacommons.nal.usda.gov
    xlsx
    Updated Jan 22, 2025
    Cite
    USDA Food and Nutrition Service, Office of Policy Support (2025). 2019 Farm to School Census v2 [Dataset]. http://doi.org/10.15482/USDA.ADC/1523106
    Explore at:
    xlsx (available download formats)
    Dataset updated
    Jan 22, 2025
    Dataset provided by
    United States Department of Agriculture http://usda.gov/
    Authors
    USDA Food and Nutrition Service, Office of Policy Support
    License

    U.S. Government Works https://www.usa.gov/government-works
    License information was derived automatically

    Description

    Note: This version supersedes version 1: https://doi.org/10.15482/USDA.ADC/1522654.

    In Fall of 2019 the USDA Food and Nutrition Service (FNS) conducted the third Farm to School Census. The 2019 Census was sent via email to 18,832 school food authorities (SFAs) including all public, private, and charter SFAs, as well as residential care institutions, participating in the National School Lunch Program. The questionnaire collected data on local food purchasing, edible school gardens, other farm to school activities and policies, and evidence of economic and nutritional impacts of participating in farm to school activities. A total of 12,634 SFAs completed usable responses to the 2019 Census. Version 2 adds the weight variable, “nrweight”, which is the Non-response weight.

    Processing methods and equipment used
    The 2019 Census was administered solely via the web. The study team cleaned the raw data to ensure the data were as correct, complete, and consistent as possible. This process involved examining the data for logical errors, contacting SFAs and consulting official records to update some implausible values, and setting the remaining implausible values to missing. The study team linked the 2019 Census data to information from the National Center of Education Statistics (NCES) Common Core of Data (CCD). Records from the CCD were used to construct a measure of urbanicity, which classifies the area in which schools are located.

    Study date(s) and duration
    Data collection occurred from September 9 to December 31, 2019. Questions asked about activities prior to, during and after SY 2018-19. The 2019 Census asked SFAs whether they currently participated in, had ever participated in or planned to participate in any of 30 farm to school activities. An SFA that participated in any of the defined activities in the 2018-19 school year received further questions.

    Study spatial scale (size of replicates and spatial scale of study area)
    Respondents to the survey included SFAs from all 50 States as well as American Samoa, Guam, the Northern Mariana Islands, Puerto Rico, the U.S. Virgin Islands, and Washington, DC.

    Level of true replication
    Unknown

    Sampling precision (within-replicate sampling or pseudoreplication)
    No sampling was involved in the collection of this data.

    Level of subsampling (number and repeat or within-replicate sampling)
    No sampling was involved in the collection of this data.

    Study design (before–after, control–impacts, time series, before–after-control–impacts)
    None – Non-experimental

    Description of any data manipulation, modeling, or statistical analysis undertaken
    Each entry in the dataset contains SFA-level responses to the Census questionnaire for SFAs that responded. This file includes information from only SFAs that clicked “Submit” on the questionnaire. (The dataset used to create the 2019 Farm to School Census Report includes additional SFAs that answered enough questions for their response to be considered usable.) In addition, the file contains constructed variables used for analytic purposes. The file does not include weights created to produce national estimates for the 2019 Farm to School Census Report. The dataset identified SFAs, but to protect individual privacy the file does not include any information for the individual who completed the questionnaire.

    Description of any gaps in the data or other limiting factors
    See the full 2019 Farm to School Census Report [https://www.fns.usda.gov/cfs/farm-school-census-and-comprehensive-review] for a detailed explanation of the study’s limitations.

    Outcome measurement methods and equipment used
    None

    Resources in this dataset:
    • Resource Title: 2019 Farm to School Codebook with Weights. File Name: Codebook_Update_02SEP21.xlsx. Resource Description: 2019 Farm to School Codebook with Weights
    • Resource Title: 2019 Farm to School Data with Weights CSV. File Name: census2019_public_use_with_weight.csv. Resource Description: 2019 Farm to School Data with Weights CSV
    • Resource Title: 2019 Farm to School Data with Weights SAS R Stata and SPSS Datasets. File Name: Farm_to_School_Data_AgDataCommons_SAS_SPSS_R_STATA_with_weight.zip. Resource Description: 2019 Farm to School Data with Weights SAS R Stata and SPSS Datasets
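
    The counts quoted in this description imply a response rate, and a non-response weight such as “nrweight” is typically applied as a weighted share. The sketch below is only an illustration under stated assumptions: the toy records and the `participates` field are hypothetical and are not taken from the actual file, and the helper names are invented for this example. Only the two counts and the column name `nrweight` come from the description.

```python
# Counts quoted in the dataset description.
SURVEYED_SFAS = 18_832
USABLE_RESPONSES = 12_634


def response_rate(responses: int, surveyed: int) -> float:
    """Fraction of surveyed units that returned a usable response."""
    return responses / surveyed


def weighted_share(records, flag_key, weight_key="nrweight"):
    """Weighted share of records whose flag is true.

    Each record contributes its weight rather than a count of 1.
    """
    total = sum(r[weight_key] for r in records)
    if total == 0:
        raise ValueError("weights sum to zero")
    hits = sum(r[weight_key] for r in records if r[flag_key])
    return hits / total


# Hypothetical toy records; 'participates' is an invented field name.
toy_records = [
    {"participates": True, "nrweight": 1.2},
    {"participates": False, "nrweight": 1.8},
    {"participates": True, "nrweight": 1.0},
]

rate = response_rate(USABLE_RESPONSES, SURVEYED_SFAS)
share = weighted_share(toy_records, "participates")
print(f"response rate: {rate:.1%}")
print(f"weighted share (toy data): {share:.2f}")
```

    With the real CSV, the same weighted share would be computed over the file's rows; the codebook listed under the resources is the place to check which variables are appropriate to weight.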

  12. Exposure Activities Number of Activities

    • catalog.data.gov
    • opendata.dc.gov
    Updated Feb 4, 2025
    Cite
    D.C. Office of the Chief Technology Officer (2025). Exposure Activities Number of Activities [Dataset]. https://catalog.data.gov/dataset/exposure-activities-number-of-activities
    Explore at:
    Dataset updated
    Feb 4, 2025
    Dataset provided by
    D.C. Office of the Chief Technology Officer
    Description

    The “Number of Activities” chart indicates the number of activity types reported by interviewed positive cases during their exposure period. Activity types were considered moderate to high exposure activity types, and included personal care, dining out, social-, work-, travel-, faith-, gym/fitness-, and sports-related activities. Data is reported on a weekly basis from Friday to the following Thursday.

    Note: Data subject to change on a daily basis. Data are restricted to positive cases with a completed contact tracing interview. Possible exposure data are collected during the contact tracing interview as self-reported activities occurring within the 2-week period before the date of symptom onset for symptomatic individuals or the date of test sample collection for asymptomatic individuals. Data collection methods were altered starting the week of Dec 11 for gym/fitness and sports, so should not be compared to previous values.

    * High to Moderate Exposure Activity Types are not exhaustive and include travel, personal care, faith events, work, dining out, social events, gym/fitness, and sports.

    Data is updated on a weekly basis.

  13. Multiview Extended Video with Activities (MEVA)

    • registry.opendata.aws
    Updated Sep 19, 2019
    Cite
    Kitware (2019). Multiview Extended Video with Activities (MEVA) [Dataset]. https://registry.opendata.aws/mevadata/
    Explore at:
    Dataset updated
    Sep 19, 2019
    Dataset provided by
    Kitware https://www.kitware.com/
    Description

    The Multiview Extended Video with Activities (MEVA) dataset consists of video data of human activity, both scripted and unscripted, collected with roughly 100 actors over several weeks. The data was collected with 29 cameras with overlapping and non-overlapping fields of view. The current release consists of about 328 hours (516GB, 4259 clips) of video data, as well as 4.6 hours (26GB) of UAV data. Other data includes GPS tracks of actors, camera models, and a site map. We have also released annotations for roughly 184 hours of data. Further updates are planned.

  14. Renaissance Data Collection Hub Results, 2004-2005

    • datacatalogue.cessda.eu
    • beta.ukdataservice.ac.uk
    Updated Nov 28, 2024
    Cite
    Museums (2024). Renaissance Data Collection Hub Results, 2004-2005 [Dataset]. http://doi.org/10.5255/UKDA-SN-6816-1
    Explore at:
    Dataset updated
    Nov 28, 2024
    Dataset provided by
    Libraries and Archives Council
    Authors
    Museums
    Time period covered
    May 1, 2004 - Apr 1, 2005
    Area covered
    England
    Variables measured
    Individuals, Institutions/organisations, National
    Measurement technique
    Count visitors
    Description

    Abstract copyright UK Data Service and data collection copyright owner.

    Renaissance was the Museums, Libraries and Archives Council's (MLA) programme to transform England's regional museums. The programme has received over £300 million since 2002, which has been allocated across nine regional museum hubs. Regional museum hubs are clusters of four to five museums which receive government investment in order to develop as centres of excellence and as leaders of their regional museum communities.

    MLA has been gathering data from the nine regional museum hubs from 2002-2003 to 2007-2008. The Renaissance Data Collection is a quarterly return of data from each site participating in the Renaissance in the Regions Programme. The data returns contain information on numbers of: visits; priority group visits; child visits; website visits; school visits; Higher Education visits; adult and child on-site participation; and outreach activity. The data returns support Programme management and monitoring and form the basis of the Renaissance Museums Performance Indicator statistical series.

    From the 30th June 2011, the regional Renaissance hub structure ceases to exist. 2011-12 is a transitional year for Renaissance, in which £37.6 million of grant funding, previously known as museum hub funding, has been made available instead directly to 45 museum services.

    Further information about Renaissance can be found on the MLA's Renaissance Data Collection web page.

    Main Topics:
    Data were recorded by hubs and submitted to the MLA on a quarterly basis, following a financial year cycle e.g. Q1 (April to June).

    Data were submitted in an Excel workbook (Data Collection Template) that consists of six worksheets, covering six different areas of museums activity:
    • Template 1: number of self-directed visits by children and young people in formal education
    • Template 2: number of facilitated visits by children and young people in formal education
    • Template 3: number of instances of children, young people and adults participating in museums’ outreach activities
    • Template 4: number of instances of teachers in contact with museums
    • Template 5: number of instances of children, young people and adults participating in organised activities at museums
    • Template 6: visits, child visits, web visits and loan venues
    Within each template there are a number of different measures, with hubs reporting on a total of 67 performance indicators. A breakdown of each template measure and accompanying guidelines can be found in the Audience Data Collection Manual 2008.
    Data were not collected for template 2 and parts of templates 3 and 5 in 2004-2005. In addition, some template sections were not introduced until January 2005. See individual data files for details.

  15. Webinar: Indicators on firm level innovation activities from web scraped...

    • dataverse.nl
    mp4, pptx
    Updated Dec 14, 2021
    Cite
    Sajad Ashouri; Arho Suominen; Ad Notten (2021). Webinar: Indicators on firm level innovation activities from web scraped data [Dataset]. http://doi.org/10.34894/OJMSMZ
    Explore at:
    pptx (2691228), mp4 (231744179) (available download formats)
    Dataset updated
    Dec 14, 2021
    Dataset provided by
    DataverseNL
    Authors
    Sajad Ashouri; Arho Suominen; Ad Notten
    License

    CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This webinar on "Indicators on firm level innovation activities from web scraped data" is a companion to the SSRN paper, and visualizes and explains the data collection and manipulation process, as well as the link with the FOS codes etc. The PPT file is also included for reference.

  16. Estimates.

    • plos.figshare.com
    xls
    Updated Jan 31, 2025
    Cite
    Abebaw Andargie; Dawit Amogne; Ebabu Tefera (2025). Estimates. [Dataset]. http://doi.org/10.1371/journal.pone.0317518.t004
    Explore at:
    xls (available download formats)
    Dataset updated
    Jan 31, 2025
    Dataset provided by
    PLOS ONE
    Authors
    Abebaw Andargie; Dawit Amogne; Ebabu Tefera
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Connecting language classrooms with 21st-century skills could be the potential framework for enhancing EFL learners’ performance in writing classes. However, investigating whether project-based learning, as a new field within ELT with unique pedagogical affordances, can enhance learners’ writing skills still needs to be improved in the literature. Accordingly, this study aimed to investigate the impact of project-based learning on EFL learners’ writing performance. It sought to determine whether and to what extent project-based learning could enhance writing skills in an EFL context. The study employed a quasi-experimental design with an interrupted pre-test-post-test time series design with single group participants. Twenty-three third-year EFL undergraduate students enrolled in the Advanced Writing Skills I course were selected using a comprehensive sampling method. An essay writing test and interview were used to gather data. The participants of the study were given a series of three problem-solving essay writing tests before and after the intervention, which employed project-based essay writing instruction. In addition, to discover their attitudes toward the impacts of project-based learning and its applications on the ground, three randomly selected students were interviewed at the end of the intervention. The data collected through the tests were analyzed through a one-way repeated measure ANOVA; narration was also used to analyze the qualitative data gathered through interviews. Accordingly, the quantitative data suggested that project-based learning significantly enhances EFL learners’ writing performance. Moreover, interview data showed that students felt optimistic about the impact of project-based learning on their writing performance, idea generation, and cooperation among themselves. 
Therefore, project-based learning is suggested as another method in ELT writing classes because it enhances learners’ writing via idea generation, data collection, organization, cooperation, and general communication skills. As students work on worthwhile projects, its emphasis on real-world applicability and realistic activities can help them become better writers. Hence, teachers can reinforce the relationship between form and purpose by incorporating a variety of genres and collaborative writing to reflect real-world or professional situations.

  17. Activities and Preferences of Visitors (Tourists) to the GBRWHA - 2012 to...

    • researchdata.edu.au
    Updated 2014
    Cite
    Stoeckl, Natalie, Prof.; Sakata, Hana, Ms; Farr, Marina, Dr; Esparon, Michelle, Dr; Larson, Silva, Dr (2014). Activities and Preferences of Visitors (Tourists) to the GBRWHA - 2012 to 2013 survey period (NERP TE 10.2, JCU and The Cairns Institute) [Dataset]. https://researchdata.edu.au/activities-preferences-visitors-cairns-institute/675289
    Explore at:
    Dataset updated
    2014
    Dataset provided by
    eAtlas
    Authors
    Stoeckl, Natalie, Prof.; Sakata, Hana, Ms; Farr, Marina, Dr; Esparon, Michelle, Dr; Larson, Silva, Dr
    License

    Attribution 2.5 (CC BY 2.5) https://creativecommons.org/licenses/by/2.5/
    License information was derived automatically

    Time period covered
    Jun 1, 2012 - Jun 30, 2013
    Area covered
    Description

    This dataset represents the aggregate of face-to-face surveys of 2743 visitors to the Great Barrier Reef World Heritage Area (GBRWHA) conducted in quarterly periods from June 2012 to June 2013. The survey explored how tourists feel towards and perceive the Great Barrier Reef World Heritage Area, as well as their willingness to pay to protect the reef and their satisfaction with current and future developments in and around the GBRWHA. Due to privacy constraints this dataset does not correspond to the raw survey results, but instead is an aggregate of views of tourists from different origins. The format of the data is an Excel spreadsheet.

    The core segments of the tourist questionnaire included questions about:
    • The socio-demographic background of respondents PLUS background about travel party and origin
    • How often visitors had been to the GBRWHA in the past and what they did (or planned to do) while on this particular trip
    • The importance of various ‘goods and services’ to their overall decision to come to the region (in contrast to the resident survey which asked about importance to overall quality of life)
    • Their satisfaction with the trip overall (in contrast to the resident survey which asked about satisfaction with life overall)
    • The way in which their decision to come to the region would have been affected by changes in various environmental and market factors (in contrast to the resident survey which asked about the way these things would affect overall quality of life)
    • Expenditure while in the area
    • Willingness to pay for improvements in various environmental attributes
    This tourist survey was developed in combination with a matching resident questionnaire.

    Data was collected from visitors to the GBRWHA region in 59 locations along the coast from airports, ferry/boat operators, caravan-park owners and beach / lagoon areas in Cairns, Port Douglas, Townsville, Bowen, Airlie Beach, Rockhampton and Yeppoon. Enumerators collected data using a questionnaire specifically developed for the purpose of the project. Questionnaires were also available in Japanese and Chinese, with the presence of Japanese and Chinese speaking enumerators. Data collection occurred in 4 time periods over the June 2012 to June 2013 year, to account for seasonality of the tourist visitations.

    In addition, data was collected from a stratified random selection of tourism operators between Cooktown and Gladstone. An initial list of 673 tourism operators was divided into the accommodation sector, tour operators and tourism 'attractions'. In total, 36 operators agreed to participate, resulting in 203 completed surveys.

    In line with the national Ethical Conduct in Human Research guidelines and JCU Ethics Committee Approval for this research, raw data collected from individuals cannot be made publicly available. Rather, data is collated into meaningful units and presented as such.

    This dataset is accompanied by the following factsheets in pdf format:
    1. Activities and Preferences of Visitors to the Great Barrier Reef World Heritage Area
    2. Activities and Preferences of Visitors to Townsville and Airlie Beach
    3. Activities and Preferences of Visitors to Gladstone to Mackay region
    4. Activities and Preferences of Visitors to Tropical NQ
    5. Activities and Preferences of Chinese Visitors to Tropical NQ
    6. Activities and Preferences of Japanese Visitors to Tropical NQ
    7. Activities and Preferences of Queensland Visitors to Tropical NQ
    8. Activities and Preferences of Domestic (non-QLD) Visitors to Tropical NQ

    The data associated with this metadata record corresponds to an Excel data sheet containing the aggregate data used in the accompanying summary PDF fact sheets.

    Further details of the project, including data collection and analysis methods, can be found in: Stoeckl, N., Farr, M. and Sakata H. (2013) What do residents and tourists ‘value’ most in the GBRWHA? Project 10.2 interim report on residential and tourist data collection activities including descriptive data summaries. Report to the National Environmental Research Program. Reef and Rainforest Research Centre Limited, Cairns (112 pp.). Available online from: http://www.nerptropical.edu.au/research

  18. Leading countries by number of data centers 2024

    • statista.com
    Updated Mar 19, 2024
    Cite
    Petroc Taylor (2024). Leading countries by number of data centers 2024 [Dataset]. https://www.statista.com/topics/1464/big-data/
    Explore at:
    Dataset updated
    Mar 19, 2024
    Dataset provided by
    Statista http://statista.com/
    Authors
    Petroc Taylor
    Description

    As of March 2024, there were a reported 5,381 data centers in the United States, the most of any country worldwide. A further 521 were located in Germany, while 514 were located in the United Kingdom.

    What is a data center?
    A data center is a network of computing and storage resources that enables the delivery of shared software applications and data. These centers can house large amounts of critical and important data, and therefore are vital to the daily functions of companies and consumers alike. As a result, whether it is a cloud, colocation, or managed service, data center real estate will have increasing importance worldwide.

    Hyperscale data centers
    In the past, data centers were highly controlled physical infrastructures, but the cloud has since changed that model. A cloud data service is a remote version of a data center – located somewhere away from a company's physical premises. Cloud IT infrastructure spending has grown and is forecast to rise further in the coming years. The evolution of technology, along with the rapid growth in demand for data across the globe, is largely driven by the leading hyperscale data center providers.

  19. 2014 DSITIA Open Data Competition example ideas

    • researchdata.edu.au
    • data.wu.ac.at
    Updated Feb 26, 2014
    Cite
    data.qld.gov.au (2014). 2014 DSITIA Open Data Competition example ideas [Dataset]. https://researchdata.edu.au/2014-dsitia-open-example-ideas/657978
    Explore at:
    Dataset updated
    Feb 26, 2014
    Dataset provided by
    data.qld.gov.au
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    In support of the 2014 DSITIA Open Data competition, this dataset was developed by departmental workshop participants, and others, to provide ideas for potential uses of science data.

  20. Enterprise Survey 2012 - Cambodia

    • microdata.worldbank.org
    Updated Feb 11, 2015
    Cite
    World Bank (2015). Enterprise Survey 2012 - Cambodia [Dataset]. https://microdata.worldbank.org/index.php/catalog/2223
    Explore at:
    Dataset updated
    Feb 11, 2015
    Dataset provided by
    World Bank http://worldbank.org/
    Asian Development Bank http://www.adb.org/
    Time period covered
    2012 - 2013
    Area covered
    Cambodia
    Description

    Abstract

    Cambodia Enterprise Survey 2012 (also known as Investment Climate Survey 2012) was conducted by the World Bank Cambodia country office and Asian Development Bank between February 2012 and February 2013. The survey formed analytical background for the Investment Climate Assessment (ICA) prepared by the World Bank in partnership with the government of Cambodia. The assessment was completed in August 2014.

    The objectives of the 2014 Cambodia ICA are to provide up-to-date and fact-based analysis of the business environment for development partners, government policymakers, the private sector, and civil society; to outline priorities for improving the business environment; and to suggest possible policy options for achieving them.

    Cambodia Enterprise Survey 2012 was not conducted under the supervision of World Bank's Enterprise Analysis Unit, as other Enterprise Surveys, and therefore small variations in methodology are present.

    Data from 472 registered establishments was analyzed. Stratified random sampling was used to select the surveyed businesses. Data was collected using face-to-face interviews.

    The topics covered include firm characteristics, access to finance, sales, costs of inputs/labor, workforce composition, bribery, licensing, infrastructure, trade, crime, competition, capacity utilization, land and permits, taxation, informality, business-government relations, innovation and technology, and performance measures.

    Geographic coverage

    National

    Analysis unit

    The primary sampling unit of the study is an establishment. An establishment is a physical location where business is carried out and where industrial operations take place or services are provided. A firm may be composed of one or more establishments. For example, a brewery may have several bottling plants and several establishments for distribution. For the purposes of this survey an establishment must make its own financial decisions and have its own financial statements separate from those of the firm. An establishment must also have its own management and control over its payroll.

    Universe

    The universe of the study is manufacturing, trade, tourism, and selected services. In terms of the International Standard Industrial Classification (Rev. 4) the following groups are included: manufacturing (group C), construction (group F), wholesale and retail trade (group G), transportation and storage (group H), accommodation and food services activities (group I), travel agency, tour operator, reservation service and related activity (79) and computer programming, consultancy and related activities (62). Note that this definition excludes agriculture (group A), mining and quarrying (group B), energy and water supply (groups D and E), and all other services (groups J to U) except for IT (62) and travel agency, tour operator, reservation service and related activity (79) which were included in the population under study.

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    Four levels of stratification were used in this country: sector, establishment size, location and formal status.

    Sector stratification was designed in the following way: the universe was stratified into 5 sectors: (1) agroprocessing consisting of manufacture of food, beverages and tobacco, manufacture of wood and wood products and manufacture of rubber products (ISIC Rev. 4 codes 10-12 and 16), (2) manufacturing except agroprocessing (ISIC Rev. 4 group C except 10-12 and 16), (3) trade (ISIC Rev. 4 group G), (4) tourism (ISIC Rev. 4 group I and 79), and (5) other (ISIC Rev. 4 groups F and H and 62).

    Size stratification was defined following the standardized definition for the rollout: small (5 to 19 employees), medium (20 to 99 employees), and large (more than 99 employees). For stratification purposes, the number of employees was defined on the basis of reported number of persons engaged daily in the last week as this was the only information available in the sampling frame.
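
    The size bands above can be sketched as a small classifier. This is only an illustration: the function name `size_stratum` is hypothetical, and returning `None` for fewer than 5 employees is an assumption, since the description does not say how such establishments were treated.

```python
from typing import Optional


def size_stratum(employees: int) -> Optional[str]:
    """Map an employee count to the size band quoted in the text.

    small: 5 to 19, medium: 20 to 99, large: more than 99.
    Counts below 5 fall outside the stated bands, so None is
    returned for them (an assumption made for this sketch).
    """
    if employees < 5:
        return None
    if employees <= 19:
        return "small"
    if employees <= 99:
        return "medium"
    return "large"


for n in (4, 5, 19, 20, 99, 100):
    print(n, size_stratum(n))
```

    The boundary values 19/20 and 99/100 are the ones worth checking, because they are where the bands meet.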

    Location stratification was defined in the five major urban economic centers: Phnom Penh, Siem Reap, Kampong Cham, Sihanouk Ville, and Battambang.

    Stratification by formal status is done by distinguishing between firms that have the required registration with the Ministry of Commerce (formal firms) and those that lack the registration (informal firms).

    The Establishment Listing 2009 (EL 2009), which was conducted during February-March 2009 by the National Institute of Statistics and the Ministry of Planning of Cambodia, was used as the sampling frame. The EL 2009 aimed at compiling basic statistics on establishments and constructing a comprehensive list of establishments. The establishment list was later used as a frame for the 2011 Economic Census.

    The sample of firms that were interviewed for Cambodia Enterprise Survey 2007 was also used in the survey.

    In order to have a sufficient number of firms outside Phnom Penh in the sample, firms in Battambang, Siem Reap, Kampong Cham, Sihanouk Ville were oversampled proportionally in each stratum defined by sector, size, and formality, such that the total number of sampled firms from Battambang, Siem Reap, Kampong Cham, Sihanouk Ville was approximately 50% in each of the strata (less if not enough firms outside Phnom Penh are available).

    Mode of data collection

    Face-to-face [f2f]

    Research instrument

    The questionnaire included most questions from the traditional Enterprise Survey Core Module. But there were some differences.

    First, the survey collected more detailed information on some elements of the investment climate, such as firm registration (question 113), interest in the stock market (questions 102-106), and assessment of different investment locations (questions 107-108).

    Second, detailed questions on revenues from supplying products/services and trade and the costs of inputs were asked (questions 132-135). It was found that some firms had difficulty providing this information for the whole year, but they were able to provide this information for subperiods. Also, given poor bookkeeping in many Cambodian businesses, firms were asked for the revenues and raw material costs of their three main products and of other (remaining) products, rather than for total revenues and raw material costs directly.

    Third, detailed questions were asked on investment in and replacement values of machinery and equipment (questions 138 and 140). Firms were asked to provide information on components rather than total values, as firms had otherwise even more difficulty answering this question.

    Response rate

    Survey non-response was addressed by maximizing efforts to contact establishments that were initially selected for interview. Attempts were made to contact the establishment for interview at different times/days of the week before a replacement establishment (within the same stratum) was selected for interview. Survey non-response did occur but substitutions were made in order to potentially achieve strata-specific quota.

    The number of contacted establishments per realized interview was 2.56. This number is the result of two factors: explicit refusals to participate in the survey, as reflected by the rate of rejection (which includes rejections of the screener and the main survey), as well as difficulties in locating firms and changes in sector activity. The number of refusals per contact actually made was 0.32.
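
    As a rough worked example of what the 2.56 ratio implies, the sketch below assumes that the 472 establishments mentioned in the abstract equal the number of realized interviews. That equivalence is an assumption for illustration only and is not stated in the description; the variable names are likewise invented here.

```python
# Figures quoted in the description.
CONTACTS_PER_INTERVIEW = 2.56
ANALYZED_ESTABLISHMENTS = 472  # assumed equal to realized interviews

# Share of contacted establishments that ended in an interview.
interviews_per_contact = 1 / CONTACTS_PER_INTERVIEW

# Implied number of establishments contacted under the assumption above.
implied_contacts = ANALYZED_ESTABLISHMENTS * CONTACTS_PER_INTERVIEW

print(f"interviews per contact: {interviews_per_contact:.3f}")
print(f"implied contacts: {implied_contacts:.0f}")
```

    Under that assumption, roughly 39 percent of contacted establishments yielded an interview.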

Cite
Márquez-Carpintero, Luis; Suescun-Ferrandiz, Sergio; Álvarez, Carolina Lorenzo; Fernandez-Herrero, Jorge; Viejo, Diego; Rosabel Roig-Vila; Cazorla, Miguel (2025). DIPSER: A Dataset for In-Person Student Engagement Recognition in the Wild [Dataset]. https://observatorio-cientifico.ua.es/documentos/67321d21aea56d4af0484172

Data from: DIPSER: A Dataset for In-Person Student Engagement Recognition in the Wild

Dataset updated
2025
Authors
Márquez-Carpintero, Luis; Suescun-Ferrandiz, Sergio; Álvarez, Carolina Lorenzo; Fernandez-Herrero, Jorge; Viejo, Diego; Rosabel Roig-Vila; Cazorla, Miguel
Description

Data Description

The DIPSER dataset is designed to assess student attention and emotion in in-person classroom settings. It consists of RGB camera data, smartwatch sensor data, and labeled attention and emotion metrics. It includes multiple camera angles per student to capture posture and facial expressions, complemented by smartwatch data for inertial and biometric metrics. Attention and emotion labels are derived from self-reports and expert evaluations. The dataset includes diverse demographic groups, with data collected in real-world classroom environments, facilitating the training of machine learning models for predicting attention and correlating it with emotional states.

Data Collection and Generation Procedures

The dataset was collected in a natural classroom environment at the University of Alicante, Spain. The recording setup consisted of six general cameras positioned to capture the overall classroom context and individual cameras placed at each student’s desk. Additionally, smartwatches were used to collect biometric data, such as heart rate, accelerometer, and gyroscope readings.

Experimental Sessions

Nine distinct educational activities were designed to ensure a comprehensive range of engagement scenarios:

• News Reading – Students read projected or device-displayed news.
• Brainstorming Session – Idea generation for problem-solving.
• Lecture – Passive listening to an instructor-led session.
• Information Organization – Synthesizing information from different sources.
• Lecture Test – Assessment of lecture content via mobile devices.
• Individual Presentations – Students present their projects.
• Knowledge Test – Conducted using Kahoot.
• Robotics Experimentation – Hands-on session with robotics.
• MTINY Activity Design – Development of educational activities with computational thinking.

Technical Specifications

• RGB Cameras: Individual cameras recorded at 640×480 pixels, while context cameras captured at 1280×720 pixels.
• Frame Rate: 9-10 FPS depending on the setup.
• Smartwatch Sensors: Collected heart rate, accelerometer, gyroscope, rotation vector, and light sensor data at a frequency of 1–100 Hz.

Data Organization and Formats

The dataset follows a structured directory format:

/groupX/experimentY/subjectZ.zip

Each subject-specific folder contains:

• images/ (individual facial images)
• watch_sensors/ (sensor readings in JSON format)
• labels/ (engagement & emotion annotations)
• metadata/ (subject demographics & session details)

Annotations and Labeling

Each data entry includes engagement levels (1-5) and emotional states (9 categories) based on both self-reported labels and evaluations by four independent experts. A custom annotation tool was developed to ensure consistency across evaluations.

Missing Data and Data Quality

• Synchronization: A centralized server ensured time alignment across devices. Brightness changes were used to verify synchronization.
• Completeness: No major missing data, except for occasional random frame drops due to embedded device performance.
• Data Consistency: Uniform collection methodology across sessions, ensuring high reliability.

Data Processing Methods

To enhance usability, the dataset includes preprocessed bounding boxes for face, body, and hands, along with gaze estimation and head pose annotations. These were generated using YOLO, MediaPipe, and DeepFace.

File Formats and Accessibility

• Images: Stored in standard JPEG format.
• Sensor Data: Provided as structured JSON files.
• Labels: Available as CSV files with timestamps.

The dataset is publicly available under the CC-BY license and can be accessed along with the necessary processing scripts via the DIPSER GitHub repository.

Potential Errors and Limitations

• Due to camera angles, some student movements may be out of frame in collaborative sessions.
• Lighting conditions vary slightly across experiments.
• Sensor latency variations are minimal but exist due to embedded device constraints.

Citation

If you find this project helpful for your research, please cite our work using the following bibtex entry:

@misc{marquezcarpintero2025dipserdatasetinpersonstudent1,
  title={DIPSER: A Dataset for In-Person Student Engagement Recognition in the Wild},
  author={Luis Marquez-Carpintero and Sergio Suescun-Ferrandiz and Carolina Lorenzo Álvarez and Jorge Fernandez-Herrero and Diego Viejo and Rosabel Roig-Vila and Miguel Cazorla},
  year={2025},
  eprint={2502.20209},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2502.20209},
}

Usage and Reproducibility

Researchers can utilize standard tools like OpenCV, TensorFlow, and PyTorch for analysis. The dataset supports research in machine learning, affective computing, and education analytics, offering a unique resource for engagement and attention studies in real-world classroom environments.
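The directory layout described above (/groupX/experimentY/subjectZ.zip with images/, watch_sensors/, labels/ and metadata/ folders) lends itself to a small indexing helper. The following is a minimal sketch using only the Python standard library, not the processing scripts from the DIPSER repository. It assumes that X, Y and Z are integers and that the four folders sit at the top level of each archive; the function names are hypothetical, and the file names inside each folder are not specified in the description, so none are assumed here.

```python
import re
import zipfile
from collections import Counter
from pathlib import PurePosixPath

# Assumption: X, Y and Z in groupX/experimentY/subjectZ.zip are integers.
_PATH_RE = re.compile(r"group(\d+)/experiment(\d+)/subject(\d+)\.zip$")


def parse_subject_path(path):
    """Return (group, experiment, subject) parsed from an archive path."""
    normalized = str(path).replace("\\", "/")
    match = _PATH_RE.search(normalized)
    if match is None:
        raise ValueError(f"unexpected archive path: {path!r}")
    return tuple(int(part) for part in match.groups())


def count_by_folder(names):
    """Count archive members per top-level folder (images, labels, ...)."""
    counts = Counter()
    for name in names:
        parts = PurePosixPath(name).parts
        if len(parts) > 1:  # skip entries that sit at the archive root
            counts[parts[0]] += 1
    return dict(counts)


def summarize_archive(zip_path):
    """Open one subject archive and summarize its contents by folder."""
    with zipfile.ZipFile(zip_path) as archive:
        # Directory entries end with "/" and carry no data, so drop them.
        names = [n for n in archive.namelist() if not n.endswith("/")]
    return parse_subject_path(zip_path), count_by_folder(names)
```

Calling `summarize_archive` on a downloaded subject archive returns the parsed identifiers together with a per-folder file count, which gives a quick check for the occasional frame drops mentioned under data quality before any images are decoded.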
