The average time spent daily on a phone, not counting talking on the phone, has increased in recent years, reaching a total of * hours and ** minutes as of April 2022. This figure was expected to reach around * hours and ** minutes by 2024.
How much time do people spend on social media? As of 2025, the average daily social media usage of internet users worldwide amounted to 141 minutes per day, down from 143 minutes in the previous year. Currently, the country with the most time spent on social media per day is Brazil, with online users spending an average of 3 hours and 49 minutes on social media each day. In comparison, the daily time spent with social media in the U.S. was just 2 hours and 16 minutes.
Global social media usage
Currently, the global social network penetration rate is 62.3 percent. Northern Europe had an 81.7 percent social media penetration rate, topping the ranking of global social media usage by region. Eastern and Middle Africa closed the ranking with 10.1 and 9.6 percent usage reach, respectively. People access social media for a variety of reasons. Users like to find funny or entertaining content and enjoy sharing photos and videos with friends, but mainly use social media to stay in touch with current events and friends.
Global impact of social media
Social media has a wide-reaching and significant impact on not only online activities but also offline behavior and life in general. During a global online user survey in February 2019, a significant share of respondents stated that social media had increased their access to information, ease of communication, and freedom of expression. On the flip side, respondents also felt that social media had worsened their personal privacy, increased polarization in politics, and heightened everyday distractions.
https://creativecommons.org/publicdomain/zero/1.0/
This dataset explores the relationship between digital behavior and mental well-being among 100,000 individuals. It records how much time people spend on screens, use of social media (including TikTok), and how these habits may influence their sleep, stress, and mood levels.
It includes six numerical features, all clean and ready for analysis, making it ideal for machine learning tasks like regression or classification. The data enables researchers and analysts to investigate how modern digital lifestyles may impact mental health indicators in measurable ways.
This Mobility & Foot traffic dataset includes enriched mobility data and visitation at POIs to answer questions such as:
- How often do people visit a location? (daily, monthly, absolute, and averages).
- What type of places do they visit? (parks, schools, hospitals, etc.)
- Which social characteristics do people have in a certain POI? Breakdown by type: residents, workers, visitors.
- What's their mobility like during night hours and day hours?
- What's the frequency of the visits by day of the week and hour of the day?
Extra insights
- Visitors' relative income level.
- Visitors' preferences as derived from their visits to shopping, parks, sports facilities, and churches, among others.
- Footfall measurement in all types of establishments (shopping malls, stand-alone stores, etc.).
- Origin/destination matrix.
- Vehicular traffic, measurement of speed, types of vehicles, among other insights.
Overview & Key Concepts
Each record corresponds to a ping from a mobile device, at a particular moment in time and at a particular latitude and longitude. We procure this data from reliable technology partners, which obtain it through partnerships with location-aware apps. The entire process complies with applicable privacy laws.
We clean, process and enrich these massive datasets with a number of complex, computer-intensive calculations to make them easier to use in different tailor-made solutions for companies and also data science and machine learning applications, especially those related to understanding customer behavior.
Featured attributes of the data
Device speed: based on the distance between each observation and the previous one, we estimate the speed at which the device is moving. This is particularly useful to differentiate between vehicles, pedestrians, and stationary observations.
Night base of the device: we calculate the approximate location of where the device spends the night, which is usually its home neighborhood.
Day base of the device: we calculate the most common daylight location during weekdays, which is usually its work location.
Income level: we use the night neighborhood of the device, and intersect it with available socioeconomic data, to infer the device’s income level. Depending on the country, and the availability of good census data, this figure ranges from a relative wealth index to a currency-calculated income.
POI visited: we intersect each observation with a number of POI databases, to estimate check-ins to different locations. POI databases can vary significantly, in scope and depth, between countries.
Category of visited POI: for each observation that can be attributable to a POI, we also include a standardized location category (park, hospital, among others).
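The device-speed attribute described above can be illustrated with a minimal sketch, assuming pings for a single device arrive as (unix_seconds, lat, lon) tuples sorted by time. The haversine distance, the function names, and the stationary/pedestrian thresholds are illustrative assumptions, not the provider's actual method.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/long points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def ping_speeds_kmh(pings):
    """Speed of each ping relative to the previous one, in km/h.

    pings: list of (unix_seconds, lat, lon) sorted by time for one device.
    The first ping has no predecessor, so its speed is None.
    """
    speeds = [None]
    for (t0, lat0, lon0), (t1, lat1, lon1) in zip(pings, pings[1:]):
        dt = t1 - t0
        if dt <= 0:
            speeds.append(None)  # duplicate or out-of-order timestamp
            continue
        speeds.append(haversine_m(lat0, lon0, lat1, lon1) / dt * 3.6)
    return speeds


def classify(speed_kmh, still_max=0.5, walk_max=7.0):
    """Label an observation; the thresholds are hypothetical and should be tuned."""
    if speed_kmh is None:
        return "unknown"
    if speed_kmh <= still_max:
        return "stationary"
    if speed_kmh <= walk_max:
        return "pedestrian"
    return "vehicle"
```

In practice GPS noise makes single-step speeds jittery, so a real pipeline would likely smooth over several pings before classifying.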
Delivery schemas
We can deliver the data in three different formats:
Full dataset: one record per mobile ping. These datasets are very large, and should only be consumed by experienced teams with large computing budgets.
Visitation stream: one record per attributable visit. This dataset is considerably smaller than the full one but retains most of the more valuable elements in the dataset. This helps understand who visited a specific POI, and characterize and understand the consumer's behavior.
Audience profiles: one record per mobile device in a given period of time (usually monthly). All the visitation stream is aggregated by category. This is the most condensed version of the dataset and is very useful to quickly understand the types of consumers in a particular area and to create cohorts of users.
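The relationship between the visitation stream and the audience profiles can be sketched as a simple aggregation. This is only an illustration: the record fields `device_id`, `month`, and `category` are hypothetical names, since the actual delivery schema is not specified here.

```python
from collections import Counter, defaultdict


def audience_profiles(visits):
    """Aggregate a visitation stream into one profile per device and period.

    visits: iterable of dicts with hypothetical keys
            'device_id', 'month' and 'category'.
    Returns {(device_id, month): Counter({category: visit_count})}.
    """
    profiles = defaultdict(Counter)
    for visit in visits:
        key = (visit["device_id"], visit["month"])
        profiles[key][visit["category"]] += 1
    return dict(profiles)
```

Each resulting profile is a per-category visit count, which is the kind of compact record that can be used to group devices into cohorts.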
This Location Data & Foot traffic dataset, available for all countries, includes enriched raw mobility data and visitation at POIs to answer questions such as:
- How often do people visit a location? (daily, monthly, absolute, and averages).
- What type of places do they visit? (parks, schools, hospitals, etc.)
- Which social characteristics do people have in a certain POI? Breakdown by type: residents, workers, visitors.
- What's their mobility like during night hours and day hours?
- What's the frequency of the visits by day of the week and hour of the day?
Extra insights
- Visitors' relative income level.
- Visitors' preferences as derived from their visits to shopping, parks, sports facilities, and churches, among others.
Overview & Key Concepts
Each record corresponds to a ping from a mobile device, at a particular moment in time and at a particular latitude and longitude. We procure this data from reliable technology partners, which obtain it through partnerships with location-aware apps. The entire process complies with applicable privacy laws.
We clean and process these massive datasets with a number of complex, computer-intensive calculations to make them easier to use in different data science and machine learning applications, especially those related to understanding customer behavior.
Featured attributes of the data
Device speed: based on the distance between each observation and the previous one, we estimate the speed at which the device is moving. This is particularly useful to differentiate between vehicles, pedestrians, and stationary observations.
Night base of the device: we calculate the approximate location of where the device spends the night, which is usually its home neighborhood.
Day base of the device: we calculate the most common daylight location during weekdays, which is usually its work location.
Income level: we use the night neighborhood of the device, and intersect it with available socioeconomic data, to infer the device’s income level. Depending on the country, and the availability of good census data, this figure ranges from a relative wealth index to a currency-calculated income.
POI visited: we intersect each observation with a number of POI databases, to estimate check-ins to different locations. POI databases can vary significantly, in scope and depth, between countries.
Category of visited POI: for each observation that can be attributable to a POI, we also include a standardized location category (park, hospital, among others).
Coverage: Worldwide.
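As a rough illustration of the night-base attribute, the sketch below picks the most frequent coarse location among a device's night-time pings. The night window (22:00 to 06:00), the rounding of coordinates to three decimals as a stand-in for a neighborhood, and the function name are all assumptions made for the example.

```python
from collections import Counter
from datetime import datetime


def night_base(pings, night_start=22, night_end=6, precision=3):
    """Most frequent rounded (lat, lon) cell among night-time pings.

    pings: iterable of (datetime, lat, lon) in the device's local time.
    Returns None when the device has no pings in the night window.
    """
    cells = Counter()
    for ts, lat, lon in pings:
        if ts.hour >= night_start or ts.hour < night_end:
            cells[(round(lat, precision), round(lon, precision))] += 1
    if not cells:
        return None
    return cells.most_common(1)[0][0]
```

The same approach, restricted to weekday daylight hours, would give a comparable estimate of the day base.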
Delivery schemas
We can deliver the data in three different formats:
Full dataset: one record per mobile ping. These datasets are very large, and should only be consumed by experienced teams with large computing budgets.
Visitation stream: one record per attributable visit. This dataset is considerably smaller than the full one but retains most of the more valuable elements in the dataset. This helps understand who visited a specific POI, and characterize and understand the consumer's behavior.
Audience profiles: one record per mobile device in a given period of time (usually monthly). All the visitation stream is aggregated by category. This is the most condensed version of the dataset and is very useful to quickly understand the types of consumers in a particular area and to create cohorts of users.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The Sleep Time, Predictive Behavioural Modelling Dataset (Synthetic) is designed for educational and research purposes to analyze the relationship between various lifestyle factors and sleep time. The dataset includes anonymized, synthetic data on factors such as physical activity, reading, work hours, caffeine intake, and relaxation time to explore their impact on the duration of sleep.
This dataset can be used for the following applications:
This synthetic dataset is fully anonymized and complies with data privacy standards. It covers a broad spectrum of factors that contribute to sleep time, allowing for comprehensive research and analysis in the health and wellness domain.
CC0 (Public Domain)
Data-driven models help mobile app designers understand best practices and trends, and can be used to make predictions about design performance and support the creation of adaptive UIs. This paper presents Rico, the largest repository of mobile app designs to date, created to support five classes of data-driven applications: design search, UI layout generation, UI code generation, user interaction modeling, and user perception prediction. To create Rico, we built a system that combines crowdsourcing and automation to scalably mine design and interaction data from Android apps at runtime. The Rico dataset contains design data from more than 9.3k Android apps spanning 27 categories. It exposes visual, textual, structural, and interactive design properties of more than 66k unique UI screens. To demonstrate the kinds of applications that Rico enables, we present results from training an autoencoder for UI layout similarity, which supports query-by-example search over UIs.
Rico was built by mining Android apps at runtime via human-powered and programmatic exploration. Like its predecessor ERICA, Rico’s app mining infrastructure requires no access to — or modification of — an app’s source code. Apps are downloaded from the Google Play Store and served to crowd workers through a web interface. When crowd workers use an app, the system records a user interaction trace that captures the UIs visited and the interactions performed on them. Then, an automated agent replays the trace to warm up a new copy of the app and continues the exploration programmatically, leveraging a content-agnostic similarity heuristic to efficiently discover new UI states. By combining crowdsourcing and automation, Rico can achieve higher coverage over an app’s UI states than either crawling strategy alone. In total, 13 workers recruited on UpWork spent 2,450 hours using apps on the platform over five months, producing 10,811 user interaction traces. After collecting a user trace for an app, we ran the automated crawler on the app for one hour.
UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN https://interactionmining.org/rico
The Rico dataset is large enough to support deep learning applications. We trained an autoencoder to learn an embedding for UI layouts, and used it to annotate each UI with a 64-dimensional vector representation encoding visual layout. This vector representation can be used to compute structurally — and often semantically — similar UIs, supporting example-based search over the dataset. To create training inputs for the autoencoder that embed layout information, we constructed a new image for each UI capturing the bounding box regions of all leaf elements in its view hierarchy, differentiating between text and non-text elements. Rico’s view hierarchies obviate the need for noisy image processing or OCR techniques to create these inputs.
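The construction of autoencoder inputs from leaf bounding boxes can be sketched as follows. This is not the authors' pipeline: it assumes a view hierarchy is a nested dict whose nodes carry integer `bounds` as [x1, y1, x2, y2], an optional `text`, and a `children` list, and the tiny output resolution is chosen only to keep the example readable.

```python
def leaf_elements(node):
    """Yield every node of the hierarchy that has no children."""
    children = node.get("children") or []
    if not children:
        yield node
    for child in children:
        yield from leaf_elements(child)


def layout_image(root, screen_w, screen_h, out_w=9, out_h=16):
    """Rasterize leaf bounding boxes into a 2-channel binary image.

    Channel 0 marks text elements and channel 1 marks non-text elements.
    Returns nested lists indexed as img[channel][row][col].
    """
    img = [[[0] * out_w for _ in range(out_h)] for _ in range(2)]
    for leaf in leaf_elements(root):
        x1, y1, x2, y2 = leaf["bounds"]
        channel = 0 if leaf.get("text") else 1
        # Floor the top-left corner and ceil the bottom-right corner so
        # that small elements still cover at least part of a cell.
        c0 = x1 * out_w // screen_w
        c1 = -(-x2 * out_w // screen_w)
        r0 = y1 * out_h // screen_h
        r1 = -(-y2 * out_h // screen_h)
        for r in range(max(r0, 0), min(r1, out_h)):
            for c in range(max(c0, 0), min(c1, out_w)):
                img[channel][r][c] = 1
    return img
```

An image like this, flattened into a vector, is the kind of input an autoencoder could compress into a fixed-size layout embedding.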
In 2023, Android users in Singapore spent an average of 4.51 hours per day using their mobile devices. This represents an increase from the 4.17 hours that users in the country spent on their devices in 2020.
This Mobility & Foot traffic dataset includes enriched mobility data and visitation at POIs to answer questions such as:
- How often do people visit a location? (daily, monthly, absolute, and averages).
- What type of places do they visit? (parks, schools, hospitals, etc.)
- Which social characteristics do people have in a certain POI? Breakdown by type: residents, workers, visitors.
- What's their mobility like during night hours and day hours?
- What's the frequency of the visits by day of the week and hour of the day?
Extra insights
- Visitors' relative income level.
- Footfall measurement in all types of establishments (shopping malls, stand-alone stores, etc.).
- Visitors' preferences as derived from their visits to shopping, parks, sports facilities, and churches, among others.
- Origin/destination matrix.
- Vehicular traffic, measurement of speed, types of vehicles, among other insights.
Overview & Key Concepts
Each record corresponds to a ping from a mobile device, at a particular moment in time and at a particular latitude and longitude. We procure this data from reliable technology partners, which obtain it through partnerships with location-aware apps. The entire process complies with GDPR and all applicable privacy laws.
We clean, process, and enrich these massive datasets with a number of complex, computer-intensive calculations to make them easier to use in different tailor-made solutions for companies and also data science and machine learning applications, especially those related to understanding customer behavior.
Featured attributes of the data
Device speed: based on the distance between each observation and the previous one, we estimate the speed at which the device is moving. This is particularly useful to differentiate between vehicles, pedestrians, and stationary observations.
Night base of the device: we calculate the approximate location of where the device spends the night, which is usually its home neighborhood.
Day base of the device: we calculate the most common daylight location during weekdays, which is usually its work location.
Income level: we use the night neighborhood of the device, and intersect it with available socioeconomic data, to infer the device’s income level. Depending on the country, and the availability of good census data, this figure ranges from a relative wealth index to a currency-calculated income.
POI visited: we intersect each observation with a number of POI databases, to estimate check-ins to different locations. POI databases can vary significantly, in scope and depth, between countries.
Category of visited POI: for each observation that can be attributable to a POI, we also include a standardized location category (park, hospital, among others).
Delivery schemas
We can deliver the data in three different formats:
Full dataset: one record per mobile ping. These datasets are very large, and should only be consumed by experienced teams with large computing budgets.
Visitation stream: one record per attributable visit. This dataset is considerably smaller than the full one but retains most of the more valuable elements in the dataset. This helps understand who visited a specific POI, and characterize and understand the consumer's behavior.
Audience profiles: one record per mobile device in a given period of time (usually monthly). All the visitation stream is aggregated by category. This is the most condensed version of the dataset and is very useful to quickly understand the types of consumers in a particular area and to create cohorts of users.
This dataset is intended for developers who want to build and train models on employee login, logout, and total time spent at work. It covers 1 January 2020 to 26 June 2021 and has four columns: Date, In, Out, and Total_Hours.
This is genuine data from one employee whose login and logout were managed in an external application. The employee is expected to log in and log out every day in the external application (due to mandatory work from home during the pandemic); when the employee goes to the office, the data is instead captured automatically through a flap barrier and stored in the external system.
The file contains one employee's login and logout times along with the total time spent (logout time minus login time), captured over roughly 1.5 years (2020-2021).
For this employee, Saturday and Sunday are days off, and Indian public holidays apply. On Saturdays, Sundays, public holidays, and days when the employee is on leave (sick/privilege), the In and Out columns have the value "Public holiday/weekend" and the total hours are 0.
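The total-hours rule described above (logout minus login, with 0 for non-working days) can be sketched like this. The "HH:MM" time format is an assumption, as the file's actual time format is not stated; only the "Public holiday/weekend" marker comes from the description.

```python
from datetime import datetime

OFF_MARKER = "Public holiday/weekend"  # value stored in In/Out on non-working days


def total_hours(time_in, time_out, fmt="%H:%M"):
    """Hours between login and logout; 0.0 for holidays, weekends and leave.

    fmt is an assumed time format and may need adjusting to the real file.
    """
    if time_in == OFF_MARKER or time_out == OFF_MARKER:
        return 0.0
    start = datetime.strptime(time_in, fmt)
    end = datetime.strptime(time_out, fmt)
    return (end - start).total_seconds() / 3600
```

For example, `total_hours("09:30", "18:00")` gives 8.5, while a row marked as a public holiday or weekend gives 0.0.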
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Average Weekly Hours in the United States decreased to 34.20 Hours in June from 34.30 Hours in May of 2025. This dataset provides - United States Average Weekly Hours - actual values, historical data, forecast, chart, statistics, economic calendar and news.
This dataset was created from the online retail dataset found here https://www.kaggle.com/roshansharma/online-retail. This has had some processing for customer segmentation so it can be used for nice visualisation of the data.
The following variables are used:

| Variable | Description |
| --- | --- |
| **CustomerID** | This is the same CustomerID field as in the online retail dataset found in the link above and can be linked to this dataset. |
| **Frequency** | This is how many times a customer purchased. |
| **Recency** | This is how many days ago a customer made a purchase. This is adjusted to reference a point in time. |
| **Monetary** | This is how much a customer spent in total. Their total Lifetime monetary value. |
| **rankF** | This is the Frequency value divided into different ranges from 1 to 5 using the cut function in R. (5 = lots of visits, 1 = very low visits) |
| **rankR** | This is the Recency value divided into different ranges from 1 to 5 using the cut function in R and then flipped. (5 = very Recent, 1 = ages ago) |
| **rankM** | This is the Monetary value divided into different ranges from 1 to 5 using the cut function in R. (5 = High spender, 1 = low spender) |
| **groupRFM** | The group RFM is a value combining the rankR, rankF and rankM. This uses 1 digit per rank (ie 1 rankR, 2 rankF, 5 rankM would be 125 Group) |
| **Country** | This is the customer delivery country from the original online retail dataset. |
| **Customer_Segment** | A customer segment is added to give a more human description of the customer and therefore can be treated differently. These segments are listed below. |

The customer segments below describe the customers based on their details as processed in the RFM analysis.

| Customer Segment | Segment Description |
| --- | --- |
| **Champions** | Bought recently, buy often, and spend the most |
| **Loyal Customers** | Spend good money; responsive to promotions |
| **Potential Loyalist** | Recent customers, spent a good amount, bought more than once |
| **Recent High Spender** | Recent customers, not frequent but spend some |
| **New Customers** | Bought more recently but not often |
| **Promising** | Recent shoppers but haven’t spent much |
| **Need Attention** | Above average recency, frequency & monetary values |
| **About To Sleep** | Below average recency, frequency & monetary values |
| **At Risk** | Spent big money, purchased often, but long time ago |
| **Can’t Lose Them** | Made big purchases and often, but long time ago |
| **Hibernating** | Low spenders, low frequency, purchased long time ago |
| **Lost** | Lowest recency, frequency & monetary scores |

Thank you to the owners of the online retail dataset. https://www.kaggle.com/roshansharma
The online retail dataset is a great set for finding anomalies and doing some interesting reports; however, RFM analysis allows you to treat clusters of data in the same way, which is suitable for marketing teams etc.
RFM analysis is a straightforward analytical process that can be achieved by clustering, but a more manual process is good as you can adjust these figures to get more even groups. I will post my R code for this and link shortly.
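Until that code is available, the rankR, rankF, rankM, and groupRFM variables described above can be approximated with a minimal Python sketch. It assumes equal-width ranges, which is what R's cut produces when given a number of breaks; the exact breaks used for this dataset are not stated, and R's cut closes intervals on the right by default whereas this sketch closes them on the left. The function names and the `ranges` argument are hypothetical.

```python
def equal_width_rank(value, lo, hi, bins=5):
    """Rank from 1 to bins using equal-width intervals over [lo, hi]."""
    if hi == lo:
        return 1
    idx = int((value - lo) / (hi - lo) * bins)
    return min(idx, bins - 1) + 1  # keep the maximum value in the top bin


def rfm_group(recency, frequency, monetary, ranges):
    """Return (rankR, rankF, rankM, groupRFM) for one customer.

    ranges: dict mapping 'recency', 'frequency' and 'monetary' to the
            (min, max) observed across all customers.
    """
    # Recency is flipped so that 5 means "very recent".
    rank_r = 6 - equal_width_rank(recency, *ranges["recency"])
    rank_f = equal_width_rank(frequency, *ranges["frequency"])
    rank_m = equal_width_rank(monetary, *ranges["monetary"])
    return rank_r, rank_f, rank_m, f"{rank_r}{rank_f}{rank_m}"
```

Adjusting the bin edges by hand, rather than using equal widths, is one way to get the more even groups mentioned above.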
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
The NBA and WNBA dataset is a large-scale play-by-play and shot-detail dataset covering both NBA and WNBA games, collected from multiple public sources (e.g., official league APIs and stats sites). It provides every in-game event, from period starts, jump balls, fouls, turnovers, rebounds, and field-goal attempts through free throws, along with detailed shot metadata (shot location, distance, result, assisting player, etc.).
You can also download the dataset from GitHub or Google Drive.
Tutorials
I will be grateful for ratings and stars on GitHub, but the best thanks is using the dataset in your projects.
Useful links:
I made this dataset because I want to simplify and speed up work with play-by-play data so that researchers spend their time studying data, not collecting it. Because of request limits on the NBA and WNBA websites, and because each request returns the play-by-play of only one game, collecting this data is a very long process.
Using this dataset, you can reduce the time to get information about one season from a few hours to a couple of seconds and spend more time analyzing data or building models.
I also added play-by-play information from other sources: pbpstats.com, data.nba.com, cdnnba.com. This data will enrich information about the progress of each game and hopefully add opportunities to do interesting things.
If you have any questions or suggestions about the dataset, you can write to me through whichever channel is convenient for you:
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is a public release of Beiwe-generated data. The Beiwe Research Platform collects high-density data from a variety of smartphone sensors such as GPS, WiFi, Bluetooth, gyroscope, and accelerometer in addition to metadata from active surveys. A description of passive and active data streams, and documentation on the use of Beiwe, can be found here. This data was collected from an internal test study and is made available solely for educational purposes. It contains no identifying information; subject locations are de-identified using the noise GPS feature of Beiwe.
As part of the internal test study, data from 6 participants were collected from the start of March 21, 2022 to the end of March 28, 2022. The local time zone of this study is Eastern Standard Time. Each participant was notified to complete a survey at 9am EST on Monday, Thursday, and Saturday of the study week. An additional survey was administered on Tuesday at 5:15pm EST. For each survey, subjects were asked to respond to the prompt "How much time (in hours) do you think you spent at home?".
This dataset is from the 2013 California Dietary Practices Survey of Adults. This survey has been discontinued. Adults were asked a series of eight questions about their physical activity practices in the last month. These questions were borrowed from the Behavioral Risk Factor Surveillance System. Data displayed in this table represent California adults who met the aerobic recommendation for physical activity, as defined by the 2008 U.S. Department of Health and Human Services Physical Activity Guidelines for Americans and Objectives 2.1 and 2.2 of Healthy People 2020. The California Dietary Practices Survey (CDPS), now discontinued, was the most extensive dietary and physical activity assessment of adults 18 years and older in the state of California. CDPS was designed in 1989 and was administered biennially in odd years up through 2013. The CDPS was designed to monitor dietary trends, especially fruit and vegetable consumption, among California adults for evaluating their progress toward meeting the 2010 Dietary Guidelines for Americans and the Healthy People 2020 Objectives. For the data in this table, adults were asked a series of eight questions about their physical activity practices in the last month. Questions included: 1) During the past month, other than your regular job, did you participate in any physical activities or exercise such as running, calisthenics, golf, gardening or walking for exercise? 2) What type of physical activity or exercise did you spend the most time doing during the past month? 3) How many times per week or per month did you take part in this activity during the past month? 4) And when you took part in this activity, for how many minutes or hours did you usually keep at it? 5) During the past month, how many times per week or per month did you do physical activities or exercises to strengthen your muscles? Questions 2, 3, and 4 were repeated to collect a second activity.
Data were collected using a list of participating CalFresh households and random digit dial; approximately 1,400-1,500 adults (ages 18 and over) were interviewed via phone survey between the months of June and October. Demographic data included gender, age, ethnicity, education level, income, physical activity level, overweight status, and food stamp eligibility status. Data were oversampled for low-income adults to provide greater sensitivity for analyzing trends among our target population.
At a time when OECD and partner countries are trying to figure out how to reduce burgeoning debt and make the most of shrinking public budgets, spending on education is an obvious target for scrutiny. Education officials, teachers, policy makers, parents and students struggle to determine the merits of shorter or longer school days or school years, how much time should be allotted to various subjects, and the usefulness of after-school lessons and independent study. This report focuses on how students use learning time, both in and out of school. What are the ideal conditions to ensure that students use their learning time efficiently? What can schools do to maximise the learning that occurs during the limited amount of time students spend in class? In what kinds of lessons does learning time reap the most benefits? And how can this be determined? The report draws on data from the 2006 cycle of the Programme for International Student Assessment (PISA) to describe differences across and within countries in how much time students spend studying different subjects, how much time they spend in different types of learning activities, how they allocate their learning time and how they perform academically.
Daily average time in hours and proportion of day spent on various activities by age group and sex, 15 years and over, Canada and provinces.
https://creativecommons.org/publicdomain/zero/1.0/
Attrition analysis: Identify factors correlated with attrition like department, role, salary, etc. Segment high-risk employees. Predict future attrition.
Performance management: Analyze the relationship between metrics like ratings and salary increments. Recommend performance improvement programs.
Workforce planning: Forecast staffing needs based on historical hiring/turnover trends. Determine optimal recruitment strategies.
Compensation analysis: Benchmark salaries against performance and experience. Identify pay inequities. Inform compensation policies.
Diversity monitoring: Assess diversity metrics like gender ratio across roles and departments. Identify underrepresented groups.
Succession planning: Identify high-potential candidates and critical roles. Predict internal promotions/replacements in advance.
Given its longitudinal employee data and multiple variables, this dataset provides rich opportunities for exploration, predictive modeling, and actionable insights. With a large sample size, it can uncover subtle patterns. Cleaning it and joining it with other contextual data sources can yield even deeper insights. This makes it a valuable starting point for many organizational studies and evidence-based decision-making.
This dataset contains information about different attributes of employees from a company. It includes 1000 employee records and 12 feature columns.
satisfaction_level: Employee satisfaction score (1-5 scale)
last_evaluation: Score on last evaluation (1-5 scale)
number_project: Number of projects employee worked on
average_monthly_hours: Average hours worked in a month
time_spend_company: Number of years spent with the company
work_accident: If an employee had a workplace accident (yes/no)
left: If an employee has left the company (yes/no)
promotion_last_5years: Number of promotions in last 5 years
Department: Department of the employee
Salary: Annual salary of employee
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
# Title
Interview-Based Stress Assessment Dataset
# Overview
The dataset includes stress evaluations (6 grades) assessed through interviews with 50 Japanese workers (49 completed follow-up), as well as self-reported stress, attribute information, and personality information measured at baseline and at the one-month follow-up.
# Data Source
Interviews were conducted between December 2022 and January 2023. The average follow-up period was 34.2 days.
The main variable was the interview-based stress evaluation. Self-reported stress (stress load, mental symptoms and physical symptoms from the Brief Job Stress Questionnaire), well-being (life satisfaction and happiness), and burnout were measured at baseline and one month later. Interview-based stress evaluations were conducted by two occupational health professionals in addition to an evaluation by the interviewer, a psychologist.
# Data Description
The main variables are total (time 1 self-reported stress), burnout, wellbeing, meanStressEv (mean overall stress ratings of interviewer and two evaluators), T2_loadAll, T2_mental, T2_physical, T2_burnout, and T2_wellbeing.
no: Record number or identifier.
age: Age of the individual in years.
gender: Gender of the individual. Possible values include 'male', 'female', etc.
height_cm: Height of the individual in centimeters.
weight_kg: Weight of the individual in kilograms.
BMI: Body Mass Index, calculated based on height and weight.
drinking_freq: Frequency of alcohol consumption. Example values might be 'daily', 'weekly', 'monthly', etc.
smoking_habits: Smoking habits of the individual. Possible values include 'smoker', 'non-smoker', etc.
money_spending_hobby: Attitude towards spending money on hobbies. Describes how much an individual spends on their hobbies.
employment_status: Current employment status. Possible values include 'employed', 'unemployed', 'self-employed', etc.
full_time: employment_status
part_time: employment_status
discretionary: employment_status
side_job: This variable likely indicates whether the individual has a side job in addition to their primary employment. The values could be binary (yes/no) or provide more detail about the nature of the side job.
work_type: This variable probably categorizes the type of work the individual is engaged in. It could include categories such as 'full-time', 'part-time', 'contract', 'freelance', etc.
fixedHours: This variable might indicate whether the individual's work schedule has fixed hours. It could be a binary variable (yes/no) indicating the presence or absence of a fixed work schedule.
rotationalShifts: This variable likely denotes whether the individual works in rotational shifts. It could be a binary (yes/no) variable or provide details on the shift rotation pattern.
flexibleShifts: This variable possibly reflects if the individual has flexible shift options in their work. This could involve varying start and end times or the ability to switch shifts.
flexTime: This variable might indicate the presence of 'flextime' in the individual's work arrangement, allowing them to choose their working hours within certain limits.
adjustableWorkHours: This variable probably denotes whether the individual has the ability to adjust their work hours, suggesting a degree of flexibility in their work schedule.
discretionaryWork: This variable could indicate whether the individual's work involves a degree of discretion or autonomy in decision-making or task execution.
nightShift: This variable likely indicates if the individual works night shifts. It could be a simple binary (yes/no) or provide details about the frequency or regularity of night shifts.
remote_work_freq: This variable probably measures the frequency of remote work. It could include categories like 'never', 'sometimes', 'often', or 'always'.
primary_job_industry: This variable likely categorizes the industry sector of the individual's primary job. It could include sectors like 'technology', 'healthcare', 'education', 'finance', etc.
ind: industry
ind.manu–ind.gove: binary coding of industry
primary_job_role: This variable likely represents the specific role or position held by the individual in their primary job. It could include titles like 'manager', 'engineer', 'teacher', etc.
job: job
job.admi–job.carClPa: binary coding of job
job_duration_years: This variable probably indicates the duration of the individual's current job in years. It typically measures the length of time an individual has been in their current job role.
years: Without additional context, this variable could represent various time-related aspects, such as years of experience in a particular field, age in years, or years in a specific role. It generally signifies a duration or period in years.
months: Similar to 'years', this variable could refer to a duration in months. It might represent age in months (for younger individuals), months of experience, or months spent in a current role or activity.
job_duration_months: This variable is likely to indicate the total duration of the individual's current job in months. It's a more precise measure compared to 'job_duration_years', especially for shorter employment periods.
working_days_per_week: This variable probably denotes the number of days the individual works in a typical week. It helps to understand the work pattern, whether it's a standard five-day workweek or otherwise.
work_hours_per_day: This variable likely measures the average number of hours the individual works each day. It can be used to assess work-life balance and overall workload.
job_workload: This variable might represent the overall workload associated with the individual's job. This could be subjective (based on the individual's perception) or objective (based on quantifiable measures like hours worked or tasks completed).
job_qualitative_load: This variable likely assesses the qualitative aspects of the job's workload, such as the level of mental or emotional stress, complexity of tasks, or level of responsibility.
job_control: This variable probably measures the degree of control or autonomy the individual has in their job. It could assess how much freedom they have in making decisions, planning their work, or the flexibility in how they perform their duties.
hirou_1–hirou_7: Working Conditions of Fatigue Accumulation Checklist
hirou_kinmu: Sum of Working Conditions of Fatigue Accumulation Checklist
WH_1–WH_2: Items related to workaholism
workaholic: Sum of items related to workaholism
WE_1–WE_3: Items related to work engagement
engagement: Sum of items related to work engagement
relationship_stress: This variable likely measures stress stemming from personal relationships, possibly including family, romantic partners, or friends.
future_uncertainty_stress: This variable probably captures stress related to uncertainties about the future, such as career prospects, financial stability, or life goals.
discrimination_stress: This variable indicates stress experienced due to discrimination, possibly based on factors like race, gender, age, or other personal characteristics.
financial_stress: This variable measures stress related to financial matters, such as income, expenses, debt, or overall financial security.
health_stress: This variable likely assesses stress concerning personal health or the health of loved ones.
commuting_stress: This variable measures stress associated with daily commuting, such as traffic, travel time, or transportation issues.
irregular_lifestyle: This variable probably indicates the presence of an irregular lifestyle, potentially including erratic sleep patterns, eating habits, or work schedules.
living_env_stress: This variable likely measures stress related to the living environment, which could include housing conditions, neighborhood safety, or noise levels.
unrewarded_efforts: This variable probably assesses feelings of stress or dissatisfaction due to efforts that are perceived as unrewarded or unacknowledged.
other_stressors: This variable might capture additional stress factors not covered by other specific variables.
coping: This variable likely assesses the individual's coping mechanisms or strategies in response to stress.
support: This variable measures the level of support the individual perceives or receives, possibly from friends, family, or professional services.
weekday_bedtime: This variable likely indicates the typical bedtime of the individual on weekdays.
weekday_wakeup: This variable represents the typical time the individual wakes up on weekdays.
holiday_bedtime: This variable indicates the typical bedtime of the individual on holidays or non-workdays.
holiday_wakeup: This variable measures the typical wake-up time of the individual on holidays or non-workdays.
avg_sleep_duration: This variable likely represents the average duration of sleep the individual gets, possibly averaged over a certain period.
weekday_bedtime_posix: This variable might represent the weekday bedtime in POSIX time format.
weekday_wakeup_posix: Similar to bedtime, this represents the weekday wakeup time in POSIX time format.
holiday_bedtime_posix: This variable likely indicates the holiday bedtime in POSIX time format.
holiday_wakeup_posix: This represents the holiday wakeup time in POSIX time format.
weekday_bedtime_posix_hms: This variable could be the weekday bedtime in POSIX time format, specifically in hours, minutes, and seconds.
weekday_wakeup_posix_hms: This variable might represent the weekday wakeup time in POSIX time format in hours, minutes, and seconds.
holiday_bedtime_posix_hms: The holiday bedtime in POSIX time format, detailed to hours, minutes, and seconds.
holiday_wakeup_posix_hms: The holiday wakeup time in POSIX time format, in hours, minutes, and seconds.
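A few of the variables listed above are derived from others, and a short Python sketch may make the relationships easier to check. The BMI formula is the standard one (weight in kilograms divided by height in metres squared). The other two helpers are assumptions for illustration only: that job_duration_months combines the years and months columns, and that meanStressEv is a plain arithmetic mean of the three ratings. The function and parameter names are hypothetical and are not column names confirmed by the dataset.

```python
from statistics import mean


def bmi(height_cm: float, weight_kg: float) -> float:
    """Body Mass Index: weight (kg) divided by height (m) squared."""
    height_m = height_cm / 100.0
    return weight_kg / (height_m ** 2)


def duration_in_months(years: int, months: int) -> int:
    """Assumed relation: total months = 12 * years + remaining months."""
    return 12 * years + months


def mean_stress_rating(interviewer: float, evaluator_1: float, evaluator_2: float) -> float:
    """Assumed relation: unweighted mean of the interviewer's rating
    and the two occupational health professionals' ratings."""
    return mean([interviewer, evaluator_1, evaluator_2])
```

Recomputing these values and comparing them with the stored columns is one way to confirm, or rule out, the assumed relations before relying on them.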
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Person-to-Object Contact Dataset’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/teruakihayashi/person-to-object-contact-dataset on 28 January 2022.
--- Dataset description provided by original source is as follows ---
This dataset has been designed and obtained for discussing control measures during the COVID-19 pandemic. In this study, 1,260 people living in Tokyo and Kanagawa prefectures in Japan participated in the survey. This survey was used to collect participants’ behaviors and the objects that they touched on the days that they went out at 15 types of locations and vehicles.
This dataset is expected to improve our understanding of actual human behavior and contact with objects. Although it is impossible to disinfect all objects and spaces, this dataset is expected to contribute to the prioritization of disinfection during periods of widespread infection.
The participants living in Tokyo and Kanagawa prefectures in Japan were asked to respond, in detail, to a survey regarding the locations they stayed at for an extended period between December 3 (Thursday) and December 7 (Monday), 2020, and all the items that they touched during this time. Using the locations where clusters of infections were found during April 2020, 12 locations were selected (e.g., medical facilities, including hospitals; restaurants; stores whose main objective was to sell alcohol, such as bars; companies, including the participants’ own companies and the offices of others; and sports facilities such as gyms) and investigated. Similarly, three means of transport, namely trains, buses, and taxis, were selected as spaces where people often crowd together.
The main survey was conducted with 1,536 subjects during December 3–8. Data from 1,260 subjects who gave valid responses were used for the dataset. To ensure that the respondents could respond while their memories were still fresh, the survey was distributed to each subject on the day of their corresponding behavior. Participants were asked to respond about the locations where they spent most of their time during the corresponding period. They were also asked to detail all the objects they touched (excluding personal objects) during this time. The objects in this study were evaluated using a free-writing description. Typographical errors and differences in expressions were frequently observed (e.g., water closet, toilet, and bathroom). A categorization rule was thus developed to better ascertain the actual status of locations and object contact. The participants’ expressions were modified through visual inspection.
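The kind of categorization rule described above can be pictured as a lookup that maps variant spellings to one category label. The Python sketch below is only an illustration under stated assumptions: the rule table, the choice of "toilet" as the canonical label, and the function name are hypothetical, and the source states that the actual corrections were made through visual inspection rather than by a script.

```python
import re

# Hypothetical rule table: canonical category -> known variant expressions.
# Only the three variants named in the description are included.
CATEGORY_RULES = {
    "toilet": {"toilet", "water closet", "bathroom"},
}


def normalize_object(raw: str, rules: dict = CATEGORY_RULES) -> str:
    """Map a free-text object description to a canonical category.

    The text is lowercased and its whitespace collapsed before lookup.
    Descriptions that match no rule are returned in cleaned form so
    they can be reviewed by hand.
    """
    cleaned = re.sub(r"\s+", " ", raw.strip().lower())
    for category, variants in rules.items():
        if cleaned in variants:
            return category
    return cleaned
```

Returning unmatched descriptions unchanged, apart from case and whitespace cleanup, keeps them available for manual review instead of silently forcing them into a category.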
This survey was conducted after appropriate review by the Ethics Committee of the Graduate School of Engineering, University of Tokyo (examination number: 20-61, approval number: KE20-72).
Teruaki Hayashi, Daisuke Hase, Hikaru Suenaga, Yukio Ohsawa, "The Actual Conditions of Person-to-Object Contact and a Proposal for Prevention Measures During the COVID-19 Pandemic," medRxiv, 2021. DOI: https://doi.org/10.1101/2021.04.11.21255290
This research project was supported by the “Startup Research Program for Post-Corona Society” of the Academic Strategy Office, School of Engineering, the University of Tokyo, and the “COVID-19 AI and Simulation Project” of the Office for Novel Coronavirus Disease Control, Cabinet Secretariat, Government of Japan. The authors would like to thank PLUG-Inc. for survey design and implementation.
--- Original source retains full ownership of the source dataset ---