3 datasets found

Social Media vs Productivity

kaggle.com

Updated May 15, 2025

Cite

Mahdi Mashayekhi (2025). Social Media vs Productivity [Dataset]. https://www.kaggle.com/datasets/mahdimashayekhi/social-media-vs-productivity/code

Dataset updated

May 15, 2025

Dataset provided by

Kaggle

Authors

Mahdi Mashayekhi

License

Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically

Description

📊 Social Media vs Productivity — Realistic Behavioral Dataset (30,000 Users)

This dataset explores how daily digital habits — including social media usage, screen time, and notification exposure — relate to individual productivity, stress, and well-being.

🔍 What’s Inside?

The dataset contains 30,000 real-world-style records simulating behavioral patterns of people with various jobs, social habits, and lifestyle choices. The goal is to understand how different digital behaviors correlate with perceived and actual productivity.

🧠 Why This Dataset is Valuable

✅ Designed for real-world ML workflows
Includes missing values, noise, and outliers — ideal for practicing data cleaning and preprocessing.
🔗 High correlation between target features
The perceived_productivity_score and actual_productivity_score are strongly correlated, making this dataset suitable for experiments in feature selection and multicollinearity.
🛠️ Feature Engineering playground
Use this dataset to practice feature scaling, encoding, binning, interaction terms, and more.
🧪 Perfect for EDA, regression & classification
You can model productivity, stress, or satisfaction based on behavior patterns and digital exposure.
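
The strong correlation between the two target columns can be checked before modeling. The following is a minimal sketch that computes Pearson's r with the standard library only; the sample values and the 0.9 threshold are hypothetical and not taken from the dataset.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical scores for illustration only (both on the documented 0-10 scale).
perceived = [3.0, 5.5, 6.0, 7.5, 9.0]
actual = [2.5, 5.0, 6.5, 7.0, 8.5]

r = pearson(perceived, actual)

# Assumed rule of thumb: above this threshold, treat the pair as near-duplicates
# and avoid using one score as a feature when predicting the other.
NEAR_DUPLICATE_THRESHOLD = 0.9
is_near_duplicate = abs(r) > NEAR_DUPLICATE_THRESHOLD

print(round(r, 3), is_near_duplicate)
```

On the real data, rows with a missing value in either column would need to be dropped or imputed before calling `pearson`.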

🧾 Columns & Feature Info

Column Name	Description
`age`	Age of the individual (18–65 years)
`gender`	Gender identity: Male, Female, or Other
`job_type`	Employment sector or status (IT, Education, Student, etc.)
`daily_social_media_time`	Average daily time spent on social media (hours)
`social_platform_preference`	Most-used social platform (Instagram, TikTok, Telegram, etc.)
`number_of_notifications`	Number of mobile/social notifications per day
`work_hours_per_day`	Average hours worked each day
`perceived_productivity_score`	Self-rated productivity score (scale: 0–10)
`actual_productivity_score`	Simulated ground-truth productivity score (scale: 0–10)
`stress_level`	Current stress level (scale: 1–10)
`sleep_hours`	Average hours of sleep per night
`screen_time_before_sleep`	Time spent on screens before sleeping (hours)
`breaks_during_work`	Number of breaks taken during work hours
`uses_focus_apps`	Whether the user uses digital focus apps (True/False)
`has_digital_wellbeing_enabled`	Whether Digital Wellbeing is activated (True/False)
`coffee_consumption_per_day`	Number of coffee cups consumed per day
`days_feeling_burnout_per_month`	Number of burnout days reported per month
`weekly_offline_hours`	Total hours spent offline each week (excluding sleep)
`job_satisfaction_score`	Satisfaction with job/life responsibilities (scale: 0–10)
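
The documented ranges in the table above can be turned into a simple sanity check. This is a sketch under stated assumptions: the helper names `is_missing` and `range_violations` and the sample rows are hypothetical, and only columns with an explicit range in the table are checked.

```python
import math

# Ranges copied from the column table; other columns have no documented bounds.
DOCUMENTED_RANGES = {
    "age": (18, 65),
    "perceived_productivity_score": (0, 10),
    "actual_productivity_score": (0, 10),
    "stress_level": (1, 10),
    "job_satisfaction_score": (0, 10),
}

def is_missing(value):
    """Treat None, empty strings, and NaN as missing."""
    if value is None or value == "":
        return True
    return isinstance(value, float) and math.isnan(value)

def range_violations(row):
    """Return the columns in `row` whose values fall outside the documented range."""
    bad = []
    for column, (low, high) in DOCUMENTED_RANGES.items():
        value = row.get(column)
        if is_missing(value):
            continue  # missing values are expected in this dataset
        if not low <= float(value) <= high:
            bad.append(column)
    return bad

# Hypothetical rows for illustration only; these are not taken from the dataset.
ok_row = {"age": 34, "stress_level": 4, "actual_productivity_score": 6.2}
bad_row = {"age": 17, "stress_level": 0, "job_satisfaction_score": float("nan")}

print(range_violations(ok_row))
print(range_violations(bad_row))
```

Because the dataset is described as containing noise and outliers, a violation here is a prompt to inspect the row rather than proof that it is invalid.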

📌 Notes

Contains NaN values in critical columns (productivity, sleep, stress) for data imputation tasks
Includes outliers in media usage, coffee intake, and notification count
Target columns are strongly correlated for multicollinearity testing
Multi-purpose: regression, classification, clustering, visualization
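
The missing values and outliers mentioned in these notes are usually handled before modeling. Below is a minimal sketch of two common choices, median imputation and the 1.5 × IQR outlier rule, using only the standard library. The sample values are hypothetical, and neither technique is prescribed by the dataset author.

```python
import math
import statistics

def impute_median(values):
    """Replace None/NaN entries with the median of the observed values."""
    observed = [v for v in values if v is not None and not math.isnan(v)]
    median = statistics.median(observed)
    return [median if v is None or math.isnan(v) else v for v in values]

def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]

# Hypothetical column samples for illustration only.
sleep_hours = [7.0, None, 6.5, float("nan"), 8.0, 5.5]
coffee_cups = [1, 2, 2, 3, 1, 2, 3, 2, 15]

filled_sleep = impute_median(sleep_hours)
flagged_coffee = iqr_outliers(coffee_cups)

print(filled_sleep)
print(flagged_coffee)
```

The median is used instead of the mean because it is less sensitive to the outliers the notes describe. Whether flagged values should be clipped, dropped, or kept depends on the task.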

💡 Use Cases

Exploratory Data Analysis (EDA)
Feature engineering pipelines
Machine learning model benchmarking
Statistical hypothesis testing
Burnout and mental health prediction projects

📥 Bonus

👉 Sample notebook coming soon with data cleaning, visualization, and productivity prediction!

Social Power NBA
kaggle.com
Updated Aug 1, 2017
Cite
Noah Gift (2017). Social Power NBA [Dataset]. https://www.kaggle.com/datasets/noahgift/social-power-nba/suggestions
Dataset updated
Aug 1, 2017
Dataset provided by
Kaggle
Authors
Noah Gift
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
Context

This data set contains combined on-court performance data for NBA players in the 2016-2017 season, alongside salary, Twitter engagement, and Wikipedia traffic data.

Further information can be found in a series of articles for IBM developerWorks: "Explore valuation and attendance using data science and machine learning" and "Exploring the individual NBA players".

Slides from a talk about this dataset, given at Strata in March 2018:

https://www.slideshare.net/noahgift/social-power-andinfluenceinthenba-89807740?qid=3f9f835a-f3d7-4174-8a8c-c97f9c82e614&v=&b=&from_search=1

Further reading on this dataset is in Chapter 6 of the book Pragmatic AI: An Introduction to Cloud-Based Machine Learning. You can also watch lesson 9 of Essential Machine Learning and AI with Python and Jupyter Notebook.

Followup Items

You can watch a breakdown of using cluster analysis on the Pragmatic AI YouTube channel

To learn how to deploy a Kaggle project to production as a containerized machine learning service built with sklearn and Flask, read Python for DevOps: Learn Ruthlessly Effective Automation, Chapter 14: MLOps and Machine Learning Engineering.

Use social media to predict a winning season with this notebook: https://github.com/noahgift/core-stats-datascience/blob/master/Lesson2_7_Trends_Supervized_Learning.ipynb

Learn to use the cloud for data analysis.

Acknowledgement

Data sources include ESPN, Basketball-Reference, Twitter, FiveThirtyEight, and Wikipedia. The source code for this dataset (in Python and R) can be found on GitHub. Links to more writing can be found at noahgift.com.

Inspiration

Do NBA fans know more about who the best players are, or do owners?

What is the true worth of the social media presence of athletes in the NBA?
19 East Gig Schedules
kaggle.com
Updated Jun 21, 2018
Cite
Jereme (2018). 19 East Gig Schedules [Dataset]. https://www.kaggle.com/jeremejazz/19-east-gig-schedules-2016-2018/metadata
Dataset updated
Jun 21, 2018
Dataset provided by
Kaggle
Authors
Jereme
Description
Context

Back in 2016, while I was unemployed, I decided to build a mobile app to pass the time. It isn't much, but you can find the app here; I use it myself, mainly to check interesting upcoming schedules.

Content

The data consists of schedules from 2016 to June 2018. I'm still learning data science, but I plan to run a simple analysis on this scraped data.

Later I might post more data to support questions such as the most frequent artists or one-time deals, and perhaps social media opinions as well, since some customers have probably already tweeted about the events.

