Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
This dataset explores how daily digital habits, including social media usage, screen time, and notification exposure, relate to individual productivity, stress, and well-being.
The dataset contains 30,000 real-world-style records simulating behavioral patterns of people with various jobs, social habits, and lifestyle choices. The goal is to understand how different digital behaviors correlate with perceived and actual productivity.
Designed for real-world ML workflows
Includes missing values, noise, and outliers, which makes it well suited to practicing data cleaning and preprocessing.
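As a rough illustration of the cleaning steps mentioned above, the sketch below fills missing values with the median and clips outliers using Tukey's IQR fences. The sample values and the helper names `impute_median` and `clip_iqr` are hypothetical and are not part of the dataset; other imputation or outlier strategies may suit the real data better.

```python
from statistics import median, quantiles

def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]

def clip_iqr(values, k=1.5):
    """Clip values to the range [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, low), high) for v in values]

# Hypothetical sleep_hours sample with one missing value and one outlier.
sleep_hours = [7.0, 6.5, None, 8.0, 7.5, 30.0, 6.0, 7.0]

filled = impute_median(sleep_hours)   # the None becomes the median
cleaned = clip_iqr(filled)            # the 30.0 is pulled back to the upper fence
print(cleaned)
```

Clipping keeps every row, which is convenient for practice; dropping or flagging outlier rows is an equally reasonable choice.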
High correlation between target features
The perceived_productivity_score and actual_productivity_score columns are strongly correlated, making this dataset suitable for experiments in feature selection and multicollinearity.
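One way to check this relationship is to compute the Pearson correlation coefficient between the two score columns. The sketch below does so by hand on a handful of hypothetical values; the numbers are made up for illustration and say nothing about the actual strength of the correlation in the dataset.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical scores, not taken from the dataset.
perceived_productivity_score = [3.0, 5.5, 7.0, 8.5, 4.0, 6.0]
actual_productivity_score = [2.5, 5.0, 6.5, 8.0, 4.5, 5.5]

r = pearson(perceived_productivity_score, actual_productivity_score)
print(f"Pearson r = {r:.3f}")
```

A value of r close to 1 suggests that using one score as a feature when predicting the other would leak most of the target, which is worth keeping in mind during feature selection.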
Feature Engineering playground
Use this dataset to practice feature scaling, encoding, binning, interaction terms, and more.
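The techniques listed above can be sketched in plain Python as follows. The rows, the age cut points, and the helper names are assumptions chosen for illustration; only the column names come from the dataset description.

```python
def min_max_scale(values):
    """Rescale numbers linearly to the [0, 1] range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def one_hot(values):
    """Return (categories, rows) with one 0/1 indicator per category."""
    categories = sorted(set(values))
    return categories, [[int(v == c) for c in categories] for v in values]

def age_bin(age):
    """Bucket age into three illustrative bands (cut points are arbitrary)."""
    if age < 30:
        return "18-29"
    if age < 45:
        return "30-44"
    return "45-65"

# Hypothetical rows, not taken from the dataset.
rows = [
    {"age": 22, "job_type": "Student", "daily_social_media_time": 4.0, "number_of_notifications": 80},
    {"age": 37, "job_type": "IT", "daily_social_media_time": 2.0, "number_of_notifications": 50},
    {"age": 58, "job_type": "Education", "daily_social_media_time": 1.0, "number_of_notifications": 20},
]

scaled_time = min_max_scale([r["daily_social_media_time"] for r in rows])
job_categories, job_encoded = one_hot([r["job_type"] for r in rows])
age_bins = [age_bin(r["age"]) for r in rows]
# Interaction term: social media hours multiplied by notification count.
interaction = [r["daily_social_media_time"] * r["number_of_notifications"] for r in rows]
```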
Perfect for EDA, regression & classification
You can model productivity, stress, or satisfaction based on behavior patterns and digital exposure.
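As a minimal sketch of the regression use case, the example below fits a one-feature least-squares line that predicts actual_productivity_score from daily_social_media_time. The data points are hypothetical and deliberately lie on a perfect line so the fit is easy to follow; real records will be noisy and will usually call for more features and a proper train/test split.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical values, not taken from the dataset.
daily_social_media_time = [1.0, 2.0, 3.0, 4.0, 5.0]
actual_productivity_score = [7.5, 7.0, 6.5, 6.0, 5.5]

slope, intercept = fit_line(daily_social_media_time, actual_productivity_score)
# Predicted score for someone spending 6 hours a day on social media.
predicted = slope * 6.0 + intercept
print(slope, intercept, predicted)
```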
Column Name | Description |
---|---|
age | Age of the individual (18-65 years) |
gender | Gender identity: Male, Female, or Other |
job_type | Employment sector or status (IT, Education, Student, etc.) |
daily_social_media_time | Average daily time spent on social media (hours) |
social_platform_preference | Most-used social platform (Instagram, TikTok, Telegram, etc.) |
number_of_notifications | Number of mobile/social notifications per day |
work_hours_per_day | Average hours worked each day |
perceived_productivity_score | Self-rated productivity score (scale: 0-10) |
actual_productivity_score | Simulated ground-truth productivity score (scale: 0-10) |
stress_level | Current stress level (scale: 1-10) |
sleep_hours | Average hours of sleep per night |
screen_time_before_sleep | Time spent on screens before sleeping (hours) |
breaks_during_work | Number of breaks taken during work hours |
uses_focus_apps | Whether the user uses digital focus apps (True/False) |
has_digital_wellbeing_enabled | Whether Digital Wellbeing is activated (True/False) |
coffee_consumption_per_day | Number of coffee cups consumed per day |
days_feeling_burnout_per_month | Number of burnout days reported per month |
weekly_offline_hours | Total hours spent offline each week (excluding sleep) |
job_satisfaction_score | Satisfaction with job/life responsibilities (scale: 0-10) |
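The ranges documented in the table can be turned into a simple row check, which is handy given that the dataset includes missing values and outliers. The sketch below is one possible approach; the `validate_row` helper and the sample row are hypothetical, and only the columns with an explicitly documented range or value set are checked.

```python
# Ranges and allowed values as stated in the column table.
RANGES = {
    "age": (18, 65),
    "perceived_productivity_score": (0, 10),
    "actual_productivity_score": (0, 10),
    "stress_level": (1, 10),
    "job_satisfaction_score": (0, 10),
}
GENDERS = {"Male", "Female", "Other"}

def validate_row(row):
    """Return a list of human-readable problems found in one record."""
    problems = []
    for column, (low, high) in RANGES.items():
        value = row.get(column)
        if value is None:
            problems.append(f"{column}: missing")
        elif not low <= value <= high:
            problems.append(f"{column}: {value} outside [{low}, {high}]")
    if row.get("gender") not in GENDERS:
        problems.append(f"gender: unexpected value {row.get('gender')!r}")
    return problems

# Hypothetical record with an out-of-range stress level and a missing score.
sample = {
    "age": 34,
    "gender": "Female",
    "perceived_productivity_score": 6.5,
    "actual_productivity_score": None,
    "stress_level": 12,
    "job_satisfaction_score": 7.0,
}
for problem in validate_row(sample):
    print(problem)
```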
Sample notebook coming soon with data cleaning, visualization, and productivity prediction!
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
This data set contains combined on-court performance data for NBA players in the 2016-2017 season, alongside salary, Twitter engagement, and Wikipedia traffic data.
Further information can be found in a series of articles for IBM developerWorks: "Explore valuation and attendance using data science and machine learning" and "Exploring the individual NBA players".
Slides are available from a talk about this dataset given at Strata in March 2018.
Further reading on this dataset is in Chapter 6 of the book Pragmatic AI: An Introduction to Cloud-Based Machine Learning, and in lesson 9 of the video course Essential Machine Learning and AI with Python and Jupyter Notebook.
You can watch a breakdown of cluster analysis on this data on the Pragmatic AI YouTube channel.
To learn how to deploy a Kaggle project as a production machine learning service (sklearn + Flask + container), read Python for DevOps: Learn Ruthlessly Effective Automation, Chapter 14: MLOps and Machine Learning Engineering.
Use social media to predict a winning season with this notebook: https://github.com/noahgift/core-stats-datascience/blob/master/Lesson2_7_Trends_Supervized_Learning.ipynb
Learn to use the cloud for data analysis.
Data sources include ESPN, Basketball-Reference, Twitter, FiveThirtyEight, and Wikipedia. The source code for this dataset (in Python and R) can be found on GitHub. Links to more writing can be found at noahgift.com.
Back in 2016, while I was unemployed, I decided to build a mobile app to pass the time. It isn't much, but you can find the app here; I use it myself, mainly to check interesting upcoming schedules.
The data consists of schedules from 2016 to June 2018. I'm still learning data science, but I plan to produce a simple analysis of this scraped data.
I may post more data later, covering things like the most frequent artists, one-time deals, or social media opinions, since some customers have probably already tweeted about the events.