Do you know how much time you spend on an app? Do you know your total usage time per day, or your average usage time per app?
This dataset records:
- how many times a person unlocks their phone;
- how much time they spend on each app each day;
- how much time they spend on their phone overall.
It lists the usage time of apps for each day.
Use the test data to find the total minutes spent on a given app in a day and to get clear statistics on app usage. The dataset also shows the person's sleeping behavior and which app they spend most of their time on, which can help improve their productivity.
The dataset was collected from an app-usage tracking app.
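The daily total and the per-app average mentioned above can be computed with a few lines of Python. This is a minimal sketch: the description does not give the file layout, so the `(date, app, minutes)` row shape and the sample values are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical rows: (date, app, minutes). The real column names and
# values are not given in the dataset description.
usage_rows = [
    ("2022-01-01", "Chat", 30),
    ("2022-01-01", "Video", 45),
    ("2022-01-02", "Chat", 50),
    ("2022-01-02", "Video", 15),
]


def total_minutes_per_day(rows):
    """Sum the minutes of all apps for each date."""
    totals = defaultdict(int)
    for date, _app, minutes in rows:
        totals[date] += minutes
    return dict(totals)


def average_minutes_per_app(rows):
    """Average daily minutes per app over the days on which it was logged."""
    sums = defaultdict(int)
    days = defaultdict(set)
    for date, app, minutes in rows:
        sums[app] += minutes
        days[app].add(date)
    return {app: sums[app] / len(days[app]) for app in sums}


daily_totals = total_minutes_per_day(usage_rows)
app_averages = average_minutes_per_app(usage_rows)
```

Averaging over the days on which an app was logged is one possible choice; dividing by the full number of days in the study period would count unused days as zero.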
The average time spent daily on a phone, not counting talking on the phone, has increased in recent years, reaching a total of * hours and ** minutes as of April 2022. This figure was expected to reach around * hours and ** minutes by 2024.
How much time do people spend on social media? As of 2025, the average daily social media usage of internet users worldwide amounted to 141 minutes per day, down from 143 minutes in the previous year. Currently, the country with the most time spent on social media per day is Brazil, with online users spending an average of 3 hours and 49 minutes on social media each day. In comparison, the daily time spent with social media in the U.S. was just 2 hours and 16 minutes.
Global social media usage
Currently, the global social network penetration rate is 62.3 percent. Northern Europe had an 81.7 percent social media penetration rate, topping the ranking of global social media usage by region. Eastern and Middle Africa closed the ranking with 10.1 and 9.6 percent usage reach, respectively. People access social media for a variety of reasons. Users like to find funny or entertaining content and enjoy sharing photos and videos with friends, but mainly use social media to stay in touch with current events and friends.
Global impact of social media
Social media has a wide-reaching and significant impact on not only online activities but also offline behavior and life in general. During a global online user survey in February 2019, a significant share of respondents stated that social media had increased their access to information, ease of communication, and freedom of expression. On the flip side, respondents also felt that social media had worsened their personal privacy, increased polarization in politics, and heightened everyday distractions.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Related article: Bergroth, C., Järv, O., Tenkanen, H., Manninen, M., Toivonen, T., 2022. A 24-hour population distribution dataset based on mobile phone data from Helsinki Metropolitan Area, Finland. Scientific Data 9, 39.
In this dataset:
We present temporally dynamic population distribution data from the Helsinki Metropolitan Area, Finland, at the level of 250 m by 250 m statistical grid cells. Three hourly population distribution datasets are provided for regular workdays (Mon – Thu), Saturdays and Sundays. The data are based on aggregated mobile phone data collected by the biggest mobile network operator in Finland. Mobile phone data are assigned to statistical grid cells using an advanced dasymetric interpolation method based on ancillary data about land cover, buildings and a time use survey. The data were validated by comparing population register data from Statistics Finland for night-time hours and a daytime workplace registry. The resulting 24-hour population data can be used to reveal the temporal dynamics of the city and examine population variations relevant to for instance spatial accessibility analyses, crisis management and planning.
Please cite this dataset as:
Bergroth, C., Järv, O., Tenkanen, H., Manninen, M., Toivonen, T., 2022. A 24-hour population distribution dataset based on mobile phone data from Helsinki Metropolitan Area, Finland. Scientific Data 9, 39. https://doi.org/10.1038/s41597-021-01113-4
Organization of data
The dataset is packaged into a single zip file, Helsinki_dynpop_matrix.zip, which contains the following files:
HMA_Dynamic_population_24H_workdays.csv represents the dynamic population for an average workday in the study area.
HMA_Dynamic_population_24H_sat.csv represents the dynamic population for an average Saturday in the study area.
HMA_Dynamic_population_24H_sun.csv represents the dynamic population for an average Sunday in the study area.
target_zones_grid250m_EPSG3067.geojson represents the statistical grid in ETRS89/ETRS-TM35FIN projection that can be used to visualize the data on a map using e.g. QGIS.
Column names
YKR_ID : a unique identifier for each statistical grid cell (n=13,231). The identifier is compatible with the statistical YKR grid cell data by Statistics Finland and Finnish Environment Institute.
H0, H1 ... H23 : Each field represents the proportional distribution of the total population in the study area between grid cells during a one-hour period. In total, 24 fields are formatted as “Hx”, where x stands for the hour of the day (values ranging from 0 to 23). For example, H0 stands for the first hour of the day: 00:00 - 00:59. The sum of all cell values for each field equals 100 (i.e. 100% of the total population for each one-hour period).
In order to visualize the data on a map, the result tables can be joined with the target_zones_grid250m_EPSG3067.geojson data. The data can be joined by using the field YKR_ID as a common key between the datasets.
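The join on YKR_ID can be sketched in plain Python as follows. This is an illustration under assumptions: the tiny CSV text and GeoJSON object stand in for the real files, the helper name `join_population` is hypothetical, and the identifier is assumed to be stored as an integer property named YKR_ID in the grid file.

```python
import csv
import io
import json

# Made-up miniature inputs that mimic the documented layout
# (YKR_ID plus hourly share columns); not real values from the dataset.
csv_text = "YKR_ID,H0,H1\n1001,60.0,55.0\n1002,40.0,45.0\n"
grid = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"YKR_ID": 1001}, "geometry": None},
        {"type": "Feature", "properties": {"YKR_ID": 1002}, "geometry": None},
    ],
}


def join_population(grid_geojson, population_csv_text):
    """Attach the hourly shares to each grid cell, keyed by YKR_ID."""
    shares = {}
    for row in csv.DictReader(io.StringIO(population_csv_text)):
        cell_id = int(row.pop("YKR_ID"))
        shares[cell_id] = {hour: float(value) for hour, value in row.items()}
    joined = json.loads(json.dumps(grid_geojson))  # deep copy of the grid
    for feature in joined["features"]:
        cell_id = int(feature["properties"]["YKR_ID"])
        feature["properties"].update(shares.get(cell_id, {}))
    return joined


joined = join_population(grid, csv_text)
# Each hourly column should sum to 100 over all grid cells.
h0_total = sum(f["properties"]["H0"] for f in joined["features"])
```

With the real files, the same join is a one-step table join in QGIS or a keyed merge in any dataframe library; the check that each Hx column sums to 100 is a quick way to confirm that no grid cells were lost in the join.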
License Creative Commons Attribution 4.0 International.
Related datasets
Järv, Olle; Tenkanen, Henrikki & Toivonen, Tuuli. (2017). Multi-temporal function-based dasymetric interpolation tool for mobile phone data. Zenodo. https://doi.org/10.5281/zenodo.252612
Tenkanen, Henrikki, & Toivonen, Tuuli. (2019). Helsinki Region Travel Time Matrix [Data set]. Zenodo. http://doi.org/10.5281/zenodo.3247564
https://creativecommons.org/publicdomain/zero/1.0/
This dataset explores the relationship between digital behavior and mental well-being among 100,000 individuals. It records how much time people spend on screens, use of social media (including TikTok), and how these habits may influence their sleep, stress, and mood levels.
It includes six numerical features, all clean and ready for analysis, making it ideal for machine learning tasks like regression or classification. The data enables researchers and analysts to investigate how modern digital lifestyles may impact mental health indicators in measurable ways.
Open Government Licence - Canada 2.0 https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
This table contains 2376 series, with data for years 2015 - 2015 (not all combinations necessarily have data for all years). This table contains data described by the following dimensions (Not all combinations are available): Geography (11 items: Canada; Newfoundland and Labrador; Prince Edward Island; Nova Scotia; ...); Age group (3 items: Total, 6 to 17 years; 6 to 11 years; 12 to 17 years); Sex (3 items: Both sexes; Males; Females); Children's screen time (3 items: Total population for the variable children's screen time; 2 hours or less of screen time per day; More than 2 hours of screen time per day); Characteristics (8 items: Number of persons; Low 95% confidence interval, number of persons; High 95% confidence interval, number of persons; Coefficient of variation for number of persons; ...).
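As a quick consistency check, the series count stated above appears to equal the product of the listed dimension sizes. The sketch below only restates the item counts from the description; treating the series count as a full cross-product of dimensions is an assumption about how the table is organised.

```python
from math import prod

# Item counts per dimension, as listed in the table description.
dimension_sizes = {
    "Geography": 11,
    "Age group": 3,
    "Sex": 3,
    "Children's screen time": 3,
    "Characteristics": 8,
}

# One series per combination of dimension items.
series_count = prod(dimension_sizes.values())
```

The result, 2376, matches the stated number of series, although the description notes that not every combination necessarily has data.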
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
This dataset reflects real-world trends in children's screen time usage. It includes data on educational, recreational, and total screen time for children aged 5 to 15 years, with breakdowns by gender (Male, Female, Other/Prefer not to say) and day type (Weekday, Weekend). The dataset follows expected behavioral patterns:
Screen time increases with age (~1.5 hours/day at age 5 to 6+ hours/day at age 15).
Recreational screen time dominates, making up 65–80% of total screen time.
Weekend screen time is 20–30% higher than weekdays, with a larger increase for teenagers.
Slight gender-based variations in recreational screen time.
The dataset contains natural variability, ensuring realism, and the sample size decreases slightly with age (e.g., 500 respondents at age 5, 300 at age 15).
This dataset is ideal for data analysis, visualization, and machine learning experiments related to children's digital habits. 🚀
Percentage of smartphone users by selected smartphone use habits in a typical day.
The population share with mobile internet access in North America was forecast to increase by a total of 2.9 percentage points between 2024 and 2029. This overall increase does not happen continuously, notably not in 2028 and 2029. Mobile internet penetration is estimated to amount to 84.21 percent in 2029. Notably, the population share with mobile internet access had been increasing continuously over the past years. The penetration rate refers to the share of the total population having access to the internet via a mobile broadband connection. The shown data are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information). Find more key insights for the population share with mobile internet access in regions like the Caribbean and Europe.
In 2023, Android users in Singapore spent an average of **** hours per day using their mobile devices. This represents an increase from the **** hours that users in the country spent on their devices in 2020.
The S3 dataset contains the behavior (sensors, application statistics, and voice) of 21 volunteers interacting with their smartphones for more than 60 days. The users are diverse: males and females aged 18 to 70 were included when generating the dataset. The wide age range is a key aspect because of the impact of age on smartphone usage. To generate the dataset, the volunteers installed a prototype of the smartphone application on their Android mobile phones.
All attributes of the different kinds of data are written into a vector. The dataset contains the following vectors:
Sensors:
This type of vector contains data from the smartphone sensors (accelerometer and gyroscope) acquired in a given time window. A vector is obtained every 20 seconds, and the monitored features are:
- Average of accelerometer and gyroscope values.
- Maximum and minimum of accelerometer and gyroscope values.
- Variance of accelerometer and gyroscope values.
- Peak-to-peak (max-min) of X, Y, Z coordinates.
- Magnitude for gyroscope and accelerometer.
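The listed window features can be illustrated with a short Python sketch. It is not the authors' extraction code: the function name `window_features`, the per-axis layout, the use of population variance, and the averaging of the magnitude over the window are all assumptions, and the two samples are made up.

```python
import math
from statistics import mean, pvariance


def window_features(samples):
    """Summarise one sensor window given as a list of (x, y, z) readings."""
    features = {}
    for axis_index, axis_name in enumerate("xyz"):
        values = [sample[axis_index] for sample in samples]
        features[f"mean_{axis_name}"] = mean(values)
        features[f"max_{axis_name}"] = max(values)
        features[f"min_{axis_name}"] = min(values)
        features[f"var_{axis_name}"] = pvariance(values)  # population variance
        features[f"p2p_{axis_name}"] = max(values) - min(values)
    # Euclidean magnitude of each reading, averaged over the window
    # (the aggregation used in the dataset is not stated).
    features["mean_magnitude"] = mean(
        [math.sqrt(x * x + y * y + z * z) for x, y, z in samples]
    )
    return features


# Hypothetical accelerometer window with two readings.
accel_window = [(0.0, 3.0, 4.0), (2.0, 3.0, 4.0)]
accel_features = window_features(accel_window)
```

The same function would be applied separately to the accelerometer and gyroscope readings of each 20-second window, and the two results concatenated into one vector.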
Statistics:
These vectors contain data about the applications recently used by the user. Each statistics vector is calculated every 60 seconds and contains:
- Foreground application counters (number of different and total apps) for the last minute and the last day.
- Most common app ID and the number of usages in the last minute and the last day.
- ID of the currently active app.
- ID of the last active app prior to the current one.
- ID of the application most frequently utilized prior to the current application.
- Bytes transmitted and received through the network interfaces.
Voice:
This kind of vector is generated when the microphone is active in a call or voice note. The speaker vector is an embedding extracted from the audio, and it contains information about the user's identity. This vector is usually called an "x-vector" in the speaker recognition field, and it is calculated following the steps detailed in "egs/sitw/v2" for the Kaldi library, with the models available for extracting the embedding.
A summary of the collected database:
- Users: 21
- Sensors vectors: 417.128
- Statistics app's usage vectors: 151.034
- Speaker vectors: 2.720
- Call recordings: 629
- Voice messages: 2.091
The dataset contains the major results of the article “Improving information extraction from model data using sensitivity-weighted performance criteria” by Guse et al. (2020). The article analyses how a sensitivity-weighted performance criterion improves parameter identifiability and model performance. More details are given in the article. The files of this dataset are described as follows.
Parameter sampling: FAST parameter sampling.xlsx: To estimate the sensitivity, the Fourier Amplitude Sensitivity Test (FAST) was used (R-routine FAST, Reusser, 2013). Each column shows the values of one model parameter of the SWAT model (Arnold et al., 1998). All parameters are explained in detail in Neitsch et al. (2011). The FAST parameter sampling defines the number of model runs. For twelve model parameters, as in this case, 579 model runs are required. The same parameter sets were used for all catchments.
Daily sensitivity time series: Sensitivity_2000_2005.xlsx: Daily time series of parameter sensitivity for the period 2000-2005 for three catchments in Germany (Treene, Saale, Kinzig). Each column shows the sensitivity of one parameter of the SWAT model. The methodological approach of the temporal dynamics of parameter sensitivity (TEDPAS) was developed by Reusser et al. (2011) and first applied to the SWAT model in Guse et al. (2014). The sensitivity index is the first-order partial variance, i.e. the ratio of the partial variance of one parameter to the total variance. The sensitivity is thus always between 0 and 1. The sum in one row, i.e. the sensitivity of all model parameters on one day, cannot exceed 1.
Parameter sampling: LH parameter sampling.xlsx: To calculate parameter identifiability, Latin Hypercube sampling was used to generate 2000 parameter sets (R-package FME, Soetaert and Petzoldt, 2010). Each column shows the values of one model parameter of the SWAT model (Arnold et al., 1998). All parameters are explained in detail in Neitsch et al. (2011). The same parameter sets were used for all catchments.
Performance criteria with and without sensitivity weights: RSR_RSRw_cal.xlsx:
• Calculation of the RSR once and of the RSR_w separately for each model parameter.
• RSR: Typical RSR (RMSE divided by standard deviation).
• RSR_w: RSR with weights according to the daily sensitivity time series. The calculation was carried out in all three catchments.
• The column RSR shows the results of the RSR (RMSE divided by standard deviation) for the different model runs.
• The column RSR[_parameter name] shows the calculation of the RSR_w for the specific model parameter.
• RSR_w weights each day according to the daily parameter sensitivity (as shown in Sensitivity_2000_2005.xlsx). This means that days with a higher parameter sensitivity receive a higher weight.
In the methodological approach, the best 25% of the model runs were identified (the best 500 model runs) and the model parameters were constrained to the most appropriate parameter values (see the methodological description in the article).
Performance criteria for the three catchments: GOFrun_[catchment name]_RSR.xlsx: These three tables are organised identically and are available for the three catchments in Germany (Treene, Saale, Kinzig). Using the parameter ranges for the catchments as defined in the previous steps, 2000 model simulations were carried out. For this, Latin Hypercube sampling was used (R-package FME, Soetaert and Petzoldt, 2010). The three tables show the results of 2000 model simulations for ten different performance criteria for the two methodological approaches (RSR and swRSR) and two periods (calibration: 2000-2005 and validation: 2006-2010).
Performance criteria for the three catchments: GOFrun_[catchment name]_MAE.xlsx: The three tables show the results of 2000 model simulations for ten different performance criteria for the two methodological approaches (MAE and swMAE) and two periods (calibration: 2000-2005 and validation: 2006-2010).
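The RSR described above (RMSE divided by the standard deviation of the observations) can be sketched as follows. The weighted variant is only an illustration of the idea of weighting days by parameter sensitivity: the exact weighting formula of Guse et al. (2020) is not given in this description, so the form used in `weighted_rsr` is an assumption, as are the function names and the sample values.

```python
import math


def rsr(observed, simulated):
    """RMSE divided by the standard deviation of the observations."""
    mean_obs = sum(observed) / len(observed)
    squared_error = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    squared_dev = sum((o - mean_obs) ** 2 for o in observed)
    return math.sqrt(squared_error / squared_dev)


def weighted_rsr(observed, simulated, weights):
    """Assumed weighted form: each day's terms are scaled by its sensitivity."""
    mean_obs = sum(observed) / len(observed)
    weighted_error = sum(
        w * (o - s) ** 2 for o, s, w in zip(observed, simulated, weights)
    )
    weighted_dev = sum(w * (o - mean_obs) ** 2 for o, w in zip(observed, weights))
    return math.sqrt(weighted_error / weighted_dev)


# Hypothetical daily values; the last day is simulated poorly.
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [1.0, 2.0, 3.0, 6.0]

plain = rsr(observed, simulated)
uniform = weighted_rsr(observed, simulated, [1.0, 1.0, 1.0, 1.0])
# A parameter that is insensitive on the last day ignores that day's error.
insensitive_last_day = weighted_rsr(observed, simulated, [1.0, 1.0, 1.0, 0.0])
```

With uniform weights the assumed weighted form reduces to the plain RSR, which is a useful sanity check for any weighting scheme of this kind.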
Open Government Licence - Canada 2.0 https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
This table contains 1044 series, with data for years 1990 - 1998 (not all combinations necessarily have data for all years), and was last released on 2007-01-29. This table contains data described by the following dimensions (Not all combinations are available): Geography (29 items: Austria; Belgium (Flemish speaking); Belgium; Belgium (French speaking) ...), Sex (2 items: Males; Females ...), Age group (3 items: 11 years;15 years;13 years ...), Time spent (6 items: Not at all; Less than 1/2 hour;2 to 3 hours;1/2 hour to 1 hour ...).
https://cdla.io/sharing-1-0/
Context: This dataset offers insights into the usage patterns of social media apps for 1,000 users across seven popular platforms: Facebook, Instagram, Twitter, Snapchat, TikTok, LinkedIn, and Pinterest. It tracks various metrics such as daily time spent on the app, number of posts made, likes received, and new followers gained.
Dataset Features:
- User_ID: Unique identifier for each user.
- App: The social media platform being used.
- Daily_Minutes_Spent: Total time a user spends on the app each day, ranging from 5 to 500 minutes.
- Posts_Per_Day: Number of posts a user creates per day, ranging from 0 to 20.
- Likes_Per_Day: Total number of likes a user receives on their posts each day, ranging from 0 to 200.
- Follows_Per_Day: The number of new followers a user gains daily, ranging from 0 to 50.
Context & Use Cases: This dataset could be particularly useful for social media analysts, digital marketers, or researchers interested in understanding user engagement trends across different platforms. It provides insights into how much time users spend, how actively they post, and the level of engagement they receive (in terms of likes and followers).
Conclusion & Outcome: Analyzing this dataset could yield several outcomes:
- Engagement Patterns: Identifying which platforms have higher engagement in terms of time spent or likes received.
- Active Users: Determining which users are the most active across various platforms based on the number of posts and followers gained.
- User Retention: Studying the correlation between time spent and follower growth, providing insight into user retention strategies for different platforms.
Overall, the dataset allows for exploration of social media usage trends and helps drive decision-making for marketing strategies, content creation, and platform engagement.
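An engagement comparison across platforms could start from a per-app average, as in the sketch below. The column names follow the feature list above; the three records and the helper name `mean_by_app` are made up for illustration.

```python
from collections import defaultdict

# Hypothetical records using the documented column names.
records = [
    {"User_ID": 1, "App": "Instagram", "Daily_Minutes_Spent": 120, "Likes_Per_Day": 40},
    {"User_ID": 2, "App": "Instagram", "Daily_Minutes_Spent": 60, "Likes_Per_Day": 20},
    {"User_ID": 3, "App": "LinkedIn", "Daily_Minutes_Spent": 30, "Likes_Per_Day": 5},
]


def mean_by_app(rows, column):
    """Average one numeric column for each platform."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        sums[row["App"]] += row[column]
        counts[row["App"]] += 1
    return {app: sums[app] / counts[app] for app in sums}


mean_minutes = mean_by_app(records, "Daily_Minutes_Spent")
mean_likes = mean_by_app(records, "Likes_Per_Day")
# Platform with the highest average time spent.
top_platform = max(mean_minutes, key=mean_minutes.get)
```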
Abstract: Building health management is an important part of running an efficient and cost-effective building. Many problems in a building’s system can go undetected for long periods of time, leading to expensive repairs or wasted resources. This project aims to help detect and diagnose the building’s health with data-driven methods throughout the day. Orca and IMS are two state-of-the-art algorithms that observe an array of building health sensors and provide feedback on the overall system’s health as well as localize the problem to one, or possibly two, components. With this level of feedback the hope is to quickly identify problems and provide appropriate maintenance while reducing the number of complaints and service calls.
Introduction: To prepare these technologies for the new installation, the proposed methods are being tested on a current system that behaves similarly to the future green building. Building 241 was determined to best resemble the proposed building 232 and therefore was chosen for this study. Building 241 is currently outfitted with 34 sensors that monitor the heating and cooling temperatures for the air and water systems as well as various other subsystem states. The daily sensor recordings were logged and sent to the IDU group for analysis. The analysis focused on the period from July 1st through August 10th, 2009.
Methodology: The two algorithms used for analysis were Orca and IMS. Both methods look for anomalies using a distance-based scoring approach. Orca can take a single data set and find outliers within it. This tactic was applied to each day. After scoring each time sample throughout a given day, the Orca score profiles were compared by computing the correlation against all other days. Days with high overall correlations were considered normal, whereas days with lower overall correlations were considered more anomalous. IMS, on the other hand, needs a normal set of data to build a model, which can then be applied to a set of test data to assess how anomalous that data set is. The typical days identified by Orca were used as the reference/training set for IMS, while all the other days were passed through IMS, resulting in an anomaly score profile for each day. The mean of the IMS score profile was then calculated for each day to produce a summary IMS score. These summary scores were ranked and the top outliers were identified (see Figure 1). Once the anomalies were identified, the contributing parameters were ranked by the algorithm.
Analysis: The contributing parameters identified by IMS were localized to the return air temperature duct system.
- 7/03/09 (Figure 2 & 3): AHU-1 Return Air Temperature (RAT); Calculated Average Return Air Temperature
- 7/19/09 (Figure 3 & 4): AHU-2 Return Air Temperature (RAT); Calculated Average Return Air Temperature
IMS identified significantly higher temperatures compared to other days during the months of July and August.
Conclusion: The proposed algorithms Orca and IMS were able to pick up significant anomalies in the building system and to diagnose each anomaly by identifying the sensor values that were anomalous. In the future these methods can be used on live streaming data to produce a real-time anomaly score that helps building maintenance detect and diagnose problems.
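The day-to-day comparison step of the methodology can be sketched as follows: each day's score profile is correlated against every other day's, and the day with the lowest average correlation is flagged. This does not reproduce Orca or IMS themselves; the score profiles, the function names, and the use of the Pearson correlation are assumptions for illustration.

```python
import math


def pearson(a, b):
    """Pearson correlation between two equally long score profiles."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    covariance = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    spread_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    spread_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    return covariance / (spread_a * spread_b)


def mean_correlation_per_day(profiles):
    """Average correlation of each day's profile against all other days."""
    result = {}
    for day, profile in profiles.items():
        others = [
            pearson(profile, other)
            for other_day, other in profiles.items()
            if other_day != day
        ]
        result[day] = sum(others) / len(others)
    return result


# Made-up anomaly score profiles (one score per time sample).
score_profiles = {
    "day1": [1.0, 2.0, 3.0, 4.0],
    "day2": [2.0, 4.0, 6.0, 8.0],
    "day3": [1.5, 2.5, 3.5, 4.5],
    "day4": [4.0, 3.0, 2.0, 1.0],
}

overall = mean_correlation_per_day(score_profiles)
# The day whose profile agrees least with the others is the most anomalous.
most_anomalous_day = min(overall, key=overall.get)
```

In the workflow described above, the days with high overall correlation would then form the reference set for the second, model-based scoring stage.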
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Abstract
This dataset was collected from first-generation immigrants between 2022 and 2023. Over a 28-day period, 39 participants aged 18 to 65, fluent in English and experiencing loneliness (UCLA Loneliness Scale score ≥ 28), contributed to the study. Data collection used the Samsung Watch Active 2, the Oura Ring, AWARE, and the Centralive smartphone application. This dataset contains raw data from photoplethysmogram (PPG), inertial measurement unit (IMU) readings, air pressure, and processed data on heart rate, heart rate variability, sleep metrics (bedtime, stages, quality), physical activity (steps, active calories, activity types), and smartphone usage patterns (screen time, notifications, call and message logs). Participants also completed ecological momentary assessments (EMA) and weekly surveys, including instruments like the Beck Depression Inventory (BDI), Patient Health Questionnaire-9 (PHQ-9), Perceived Stress Scale, Sense of Coherence Scale, Social Connectedness Scale, Twente Engagement with E-Health Technologies questionnaire, and the UCLA Loneliness Scale. This dataset can be used to study the interplay between loneliness, mental well-being, and daily behaviors of immigrants in a real-world context.
Methods
Design and set up
This study was designed to create a longitudinal dataset capturing physiological, behavioral, and psychological data from first-generation immigrants living in Finland. The dataset aims to support research on the relationship between mental health and daily lifestyle factors, providing a foundation for further detection algorithm development. To achieve this, the study collected multimodal data over a 28-day period from every participant. Objective data were gathered from wearable devices, which recorded sleep patterns, physical activity, cardiovascular health metrics, and raw PPG signals. Passive smartphone data, such as screen usage, notifications, calls, and messages, were also collected to capture digital behavior patterns. Subjective data were collected through EMAs delivered via push notifications and weekly self-report surveys. These assessments measured daily emotional states: loneliness, stress, depression, and social connectedness. By integrating multiple data sources, this dataset allows researchers to explore the complex interactions between mental health and lifestyle behaviors under free-living conditions.
Data collection
To facilitate continuous data collection and remote monitoring, the Centralive platform was used. Centralive is a digital health platform designed for continuous data collection, data storage, real-time monitoring, and remote management of participant engagement throughout the study. Data were collected using different applications and wearable devices, all centralized in the Centralive system. The collected data were then transferred to and stored on Centralive’s cloud server. Centralive’s dashboard was used to monitor the collected data and participants’ engagement during data collection. To collect the subjective daily EMAs and weekly surveys, Centralive prompted the daily EMAs at 8 a.m., 2 p.m., 5 p.m., 8 p.m., and 10 p.m. for every participant. The daily EMA contains questions focusing on participants’ current emotions, including feelings of loneliness, social connectedness, and affect. The weekly EMA was open from 12 a.m. to 11:59 p.m. and prompted participants every Sunday. The Samsung Watch Active 2, equipped with the Tizen open-source operating system (TizenOS), was used to collect objective physiological signals. The device recorded photoplethysmography (PPG), accelerometer, and gyroscope data at a sampling rate of 20 Hz, while air pressure measurements were captured at 10 Hz. Data collection was scheduled at two-hour intervals, with each recording session lasting 12 minutes. The Oura Ring was used to track participants' sleep and activity patterns throughout the study. Data collected by the Oura Ring included sleep metrics, activity metrics, and cardiac metrics such as heart rate and heart rate variability sensed during sleep. Centralive utilized Open Authentication to securely access and retrieve these data, making them available to researchers on a daily basis for further analysis. The AWARE framework was used to collect passive phone activity data. The AWARE app ran in the background on participants’ smartphones, continuously logging data without requiring active user input. The collected data included battery usage patterns, recording charging events and power consumption to monitor device usage trends. Call logs were also recorded, tracking incoming and outgoing calls with metadata such as timestamps and call duration, but without capturing conversation content. Similarly, message logs documented sent and received text messages, preserving metadata while ensuring privacy. Notifications data provided insights into participants’ digital engagement by logging received notifications, including app source and timestamps. Screen usage patterns were...
This case surveillance public use dataset has 19 elements for all COVID-19 cases shared with CDC and includes demographics, geography (county and state of residence), any exposure history, disease severity indicators and outcomes, and presence of any underlying medical conditions and risk behaviors. Currently, CDC provides the public with three versions of COVID-19 case surveillance line-listed data: this 19 data element dataset with geography, a 12 data element public use dataset, and a 32 data element restricted access dataset. The following apply to the public use datasets and the restricted access dataset: - Data elements can be found on the COVID-19 case report form located at www.cdc.gov/coronavirus/2019-ncov/downloads/pui-form.pdf. - Data are considered provisional by CDC and are subject to change until the data are reconciled and verified with the state and territorial data providers. - Some data are suppressed to protect individual privacy. - Datasets will include all cases with the earliest date available in each record (date received by CDC or date related to illness/specimen collection) at least 14 days prior to the creation of the previously updated datasets. This 14-day lag allows case reporting to be stabilized and ensure that time-dependent outcome data are accurately captured. - Datasets are updated monthly. - Datasets are created using CDC’s Policy on Public Health Research and Nonresearch Data Management and Access and include protections designed to protect individual privacy. - For more information about data collection and reporting, please see wwwn.cdc.gov/nndss/data-collection.html. - For more information about the COVID-19 case surveillance data, please see www.cdc.gov/coronavirus/2019-ncov/covid-data/faq-surveillance.html. Overview The COVID-19 case surveillance database includes patient-level data reported by U.S. states and autonomous reporting entities, including New York City and the District of Columbia (D.C.), as well as U.S. 
territories and affiliates. On April 5, 2020, COVID-19 was added to the Nationally Notifiable Condition List and classified as "immediately notifiable, urgent (within 24 hours)" by a Council of State and Territorial Epidemiologists (CSTE) Interim Position Statement (Interim-20-ID-01). CSTE updated the position statement on August 5, 2020 to clarify the interpretation of antigen detection tests and serologic test results within the case classification (Interim-20-ID-02). The statement also recommended that all states and territories enact laws to make COVID-19 reportable in their jurisdiction, and that jurisdictions conducting surveillance should submit case notifications to CDC. COVID-19 case surveillance data collected by jurisdictions are shared voluntarily with CDC. For more information, visit: wwwn.cdc.gov/nndss/conditions/coronavirus-disease-2019-covid-19/case-definition/2020/08/05/. COVID-19 Case Reports COVID-19 case reports are routinely submitted to CDC by pu
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset is provided in the form of an excel files with 5 tabs. The first three excel tabs constitute demonstration data on the set up of consumer wearable device for exposure and health monitoring in population studies while the two last excel tabs include the full dataset with actual data collected using the consumer wearable devices in Cyprus and Greece respectively during the Spring of 2020. The data from the last two tabs were used to assess the compliance of asthmatic schoolchildren (n=108) from both countries to public health intervention levels in response to COVID-19 pandemic (lockdown and social distancing measures), using wearable sensors to continuously track personal location and physical activity. Asthmatic children were recruited from primary schools in Cyprus and Greece (Heraklion district, Crete) and were enrolled in the LIFE-MEDEA public health intervention project (Clinical.Trials.gov Identifier: NCT03503812). The LIFE-MEDEA project aimed to evaluate the efficacy of behavioral recommendations to reduce exposure to particulate matter during desert dust storm (DDS) events and thus mitigate disease-specific adverse health effects in vulnerable groups of patients. However, during the COVID-19 pandemic, the collected data were analysed using a mixed effect model adjusted for confounders to estimate the changes in 'fraction time spent at home' and 'total steps/day' during the enforcement of gradually more stringent lockdown measures. Results of this analysis were first presented in the manuscript titled “Use of wearable sensors to assess compliance of asthmatic children in response to lockdown measures for the COVID-19 epidemic” published by Scientific Reports (https://doi.org/10.1038/s41598-021-85358-4). 
The dataset from LIFE-MEDEA participants (asthmatic children) from Cyprus and Greece includes the following variables: Study ID, gender, age, study year, ambient temperature, ambient humidity, recording day, percentage of time staying at home, steps per day, calendar day, calendar week, date, lockdown status (phase 1, 2, or 3) due to the COVID-19 pandemic, and whether the date fell on a weekend (binary variable). All data were collected following approvals from the relevant authorities in both Cyprus and Greece, according to national legislation. In Cyprus, approvals were obtained from the Cyprus National Bioethics Committee (EEBK EΠ 2017.01.141), the Data Protection Commissioner (No. 3.28.223) and the Ministry of Education (No 7.15.01.23.5). In Greece, approvals were obtained from the Scientific Committee (25/04/2018, No: 1748) and the Governing Board of the University General Hospital of Heraklion (25/22/08/2018).
Overall, wearable sensors, often embedded in commercial smartwatches, allow for continuous and non-invasive health measurements and exposure assessment in clinical studies. Nevertheless, the real-life application of these technologies in studies involving many participants for a significant observation period may be hindered by several practical challenges. In the first Excel tab of the dataset, we provide demonstration data from a small subset of asthmatic children (n=17) who participated in the LIFE-MEDEA study and were equipped with a smartwatch for the assessment of physical activity (heart rate, pedometer, accelerometer) and location (exposure to indoor or outdoor microenvironment using GPS signal). Participants were required to wear the smartwatch, equipped with a data collection application, daily, and data were transmitted via a wireless network to a centrally administered data collection platform. The main technical challenges identified ranged from restricting access to standard smartwatch features such as gaming, internet browser, camera, and audio recording applications, to loss of GPS signal, especially in indoor environments, and internal smartwatch settings interfering with the data collection application. The dataset includes information on the percentage of time with collected data before and after the implementation of a protocol that relied on setting up the smartwatch device using publicly available Application Lockers and Device Automation applications to address most of these challenges. In addition, the dataset includes example single-day observations that demonstrate how the inclusion of a Wi-Fi received signal strength indicator significantly improved indoor localization and largely minimised GPS signal misclassification (Excel tab 2).
Finally, Excel tab 3 shows the tasks. Overall, the implementation of these protocols during the roll-out of the LIFE-MEDEA study in the spring of 2020 led to significantly improved results in terms of data completeness and data quality. The protocol and the representative results have been submitted for publication to the Journal of Visualized Experiments (submission: JoVE63275). The variables included in the first three Excel tabs are the following: Participant ID (unique serial number for a patient participating in the study), % Time Before (percentage of time with data before protocol implementation), % Time After (percentage of time with data after protocol implementation), Timestamp (date and time of event occurrence), Indoor/Outdoor (categorical; classification of GPS signals as Indoor, Outdoor, or null (missing value) based on distance from the participant's home), Filling algorithm (imputation algorithm), SSID (name of the wireless network connected to the smartwatch), Wi-Fi Signal Strength (connection strength via Wi-Fi between the smartwatch and the home’s wireless network; 0 is maximum strength), IMEI (International Mobile Equipment Identity; device serial number), GPS_LAT (latitude), GPS_LONG (longitude), Accuracy of GPS coordinates (accuracy of the GPS coordinates in meters), Timestamp of GPS coordinates (date and time at which the GPS coordinates were obtained), Battery Percentage (battery life), Charger (whether the device was connected to the charger).
Important notes on data collection methodology: Global positioning system (GPS) and physical activity data were recorded using the LEMFO-LM25 smartwatch, which was equipped with the embrace™ data collection application. The smartwatch worked as a stand-alone device that was able to transmit data in 5-minute intervals to a cloud-based database via Wi-Fi data transfer. The software synchronized the data collected from the different sensors, so that the data were transferred to the cloud with the same timestamp. Data synchronization with the cloud-based database was performed automatically when the smartwatch connected to the Wi-Fi network inside the participants’ homes. According to the study aims, GPS coordinates were used to estimate the fraction of time spent in or out of the participants' residences. The time spent outside was defined as the duration of time with a GPS signal outside a 100-meter radius around the participant’s residence, to account for the signal accuracy of commercially available GPS receivers. Additionally, to address the limitation that signal accuracy is diminished in urban and especially indoor environments, 5-minute intervals with missing GPS signals were classified as either “indoor classification” or “outdoor classification” based on the most recent available GPS recording. This GPS data filling algorithm allowed the missing 5-minute intervals to be replaced with estimated values. Via the described protocol, and through the use of a Device Automation application, information on WiFi connectivity, WiFi signal strength, battery capacity, and whether the device was charging or not was also made available. Data on these additional variables were not automatically synchronised with the cloud-based database but had to be manually downloaded from each smartwatch via Bluetooth after the end of the study period.
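The indoor/outdoor classification and the filling of missing 5-minute intervals described above can be illustrated with a minimal sketch. It is not the study's actual code: the function names, the use of the haversine formula for distance, and the choice to leave intervals before the first GPS fix unclassified are all assumptions made for illustration. Only the 100-meter radius and the "most recent available recording" rule come from the description.

```python
import math
from typing import List, Optional, Tuple

HOME_RADIUS_M = 100.0  # radius around the residence, as stated in the description

Fix = Optional[Tuple[float, float]]  # (latitude, longitude) or None if GPS is missing


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters (assumed distance measure)."""
    r = 6371000.0  # mean Earth radius in meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def classify_intervals(home: Tuple[float, float], fixes: List[Fix]) -> List[Optional[str]]:
    """Label each 5-minute interval 'indoor' or 'outdoor'.

    Intervals with a missing fix inherit the most recent available label.
    Intervals before the first fix stay None (an assumption of this sketch).
    """
    labels: List[Optional[str]] = []
    last: Optional[str] = None
    for fix in fixes:
        if fix is not None:
            dist = haversine_m(home[0], home[1], fix[0], fix[1])
            last = "outdoor" if dist > HOME_RADIUS_M else "indoor"
        labels.append(last)
    return labels


def fraction_time_at_home(labels: List[Optional[str]]) -> Optional[float]:
    """Share of classified intervals labelled 'indoor'; None if nothing is classified."""
    known = [label for label in labels if label is not None]
    if not known:
        return None
    return sum(label == "indoor" for label in known) / len(known)
```

For example, a day with fixes `[near, None, far, None, None, near]`, where `near` is within 100 m of the home and `far` is not, would be labelled indoor, indoor, outdoor, outdoor, outdoor, indoor.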
Abstract: Building health management is an important part of running an efficient and cost-effective building. Many problems in a building’s system can go undetected for long periods of time, leading to expensive repairs or wasted resources. This project aims to help detect and diagnose the building’s health with data-driven methods throughout the day. Orca and IMS are two state-of-the-art algorithms that observe an array of building health sensors and provide feedback on the overall system’s health as well as localize the problem to one, or possibly two, components. With this level of feedback, the hope is to quickly identify problems and provide appropriate maintenance while reducing the number of complaints and service calls. Introduction: To prepare these technologies for the new installation, the proposed methods are being tested on a current system that behaves similarly to the future green building. Building 241 was determined to best resemble the proposed building 232 and was therefore chosen for this study. Building 241 is currently outfitted with 34 sensors that monitor the heating and cooling temperatures for the air and water systems as well as various other subsystem states. The daily sensor recordings were logged and sent to the IDU group for analysis. The analysis covered the period from July 1st through August 10th, 2009. Methodology: The two algorithms used for analysis were Orca and IMS. Both methods look for anomalies using a distance-based scoring approach. Orca can take a single data set and find outliers within that data set. This tactic was applied to each day. After scoring each time sample throughout a given day, the Orca score profiles were compared by computing the correlation against all other days. Days with high overall correlations were considered normal, whereas days with lower overall correlations were considered more anomalous.
IMS, on the other hand, needs a normal set of data to build a model, which can then be applied to a set of test data to assess how anomalous that particular data set is. The typical days identified by Orca were used as the reference/training set for IMS, while all the other days were passed through IMS, resulting in an anomaly score profile for each day. The mean of the IMS score profile was then calculated for each day to produce a summary IMS score. These summary scores were ranked and the top outliers were identified (see Figure 1). Once the anomalies were identified, the contributing parameters were ranked by the algorithm. Analysis: The contributing parameters identified by IMS were localized to the return air temperature duct system: on 7/03/09 (Figures 2 & 3), AHU-1 Return Air Temperature (RAT) and Calculated Average Return Air Temperature; on 7/19/09 (Figures 3 & 4), AHU-2 Return Air Temperature (RAT) and Calculated Average Return Air Temperature. IMS identified significantly higher temperatures compared to other days during the months of July and August. Conclusion: The proposed algorithms Orca and IMS were able to pick up significant anomalies in the building system and to diagnose each anomaly by identifying the sensor values that were anomalous. In the future, these methods can be used on live streaming data to produce a real-time anomaly score that helps building maintenance detect and diagnose problems.
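The two-stage workflow described above (score each day on its own, correlate the daily score profiles, then score the remaining days against the typical ones and average per day) can be sketched as follows. This is a simplified stand-in and not the actual Orca or IMS implementation: both are approximated here by a plain k-nearest-neighbour distance score, and the function names, the value of `k`, and the assumption that every day has the same number of time samples are all hypothetical.

```python
import numpy as np


def knn_scores(X, ref=None, k=3):
    """Mean Euclidean distance from each row of X to its k nearest rows of ref.

    If ref is None, X is scored against itself (self-matches excluded),
    which stands in for the single-data-set outlier search.
    """
    same = ref is None
    ref = X if same else ref
    d = np.linalg.norm(X[:, None, :] - ref[None, :, :], axis=2)
    if same:
        np.fill_diagonal(d, np.inf)  # a sample is not its own neighbour
    return np.sort(d, axis=1)[:, :k].mean(axis=1)


def mean_profile_correlation(days, k=3):
    """Average correlation of each day's score profile with all other days.

    Assumes every day has the same number of time samples.
    Higher values suggest a more typical day.
    """
    profiles = np.stack([knn_scores(day, k=k) for day in days])
    corr = np.corrcoef(profiles)
    n = len(days)
    return (corr.sum(axis=1) - 1.0) / (n - 1)  # drop the self-correlation of 1


def summary_scores(train_days, test_days, k=3):
    """Mean distance-based score per test day relative to the typical days."""
    ref = np.vstack(train_days)
    return [float(knn_scores(day, ref, k).mean()) for day in test_days]
```

In this sketch, days with the highest values from `mean_profile_correlation` would be passed as `train_days`, and ranking the output of `summary_scores` would surface the most anomalous of the remaining days.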
Open Government Licence 3.0 http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Average daily time spent by adults on activities including paid work, unpaid household work, unpaid care, travel and entertainment. These are official statistics in development.