50 datasets found
  1. Instagram: distribution of global audiences 2025, by age group

    • statista.com
    • ai-chatbox.pro
    Updated Jun 2, 2025
    Cite
    Statista (2025). Instagram: distribution of global audiences 2025, by age group [Dataset]. https://www.statista.com/statistics/325587/instagram-global-age-group/
    Explore at:
    Dataset updated
    Jun 2, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Apr 2025
    Area covered
    Worldwide
    Description

    As of April 2025, almost 32 percent of global Instagram audiences were aged between 18 and 24 years, and 29.5 percent of users were aged between 25 and 34 years. Overall, 16.3 percent of users belonged to the 35 to 44 year age group.

    Instagram users

    With roughly one billion monthly active users, Instagram is one of the most popular social networks worldwide. The social photo-sharing app is especially popular in India and in the United States, which have 413.85 million and 171.7 million Instagram users, respectively.

    Instagram features

    One of the most popular features of Instagram is Stories. Users can post photos and videos to their Stories stream, and the content is live for others to view for 24 hours before it disappears. In January 2019, the company reported that there were 500 million daily active Instagram Stories users. Instagram Stories directly competes with Snapchat, another photo-sharing app that initially became famous due to its “vanishing photos” feature. As of the first quarter of 2025, Snapchat had 460 million daily active users.

  2. Data from: Five Years of COVID-19 Discourse on Instagram: A Labeled...

    • ieee-dataport.org
    Updated Jan 22, 2025
    Cite
    Nirmalya Thakur (2025). Five Years of COVID-19 Discourse on Instagram: A Labeled Instagram Dataset of Over Half a Million Posts for Multilingual Sentiment Analysis [Dataset]. https://ieee-dataport.org/documents/five-years-covid-19-discourse-instagram-labeled-instagram-dataset-over-half-million-posts
    Explore at:
    Dataset updated
    Jan 22, 2025
    Authors
    Nirmalya Thakur
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    To download this dataset without purchasing an IEEE Dataport subscription

  3. Instagram accounts with the most followers worldwide 2024

    • statista.com
    • es.statista.com
    Cite
    Stacy Jo Dixon, Instagram accounts with the most followers worldwide 2024 [Dataset]. https://www.statista.com/topics/1164/social-networks/
    Explore at:
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Stacy Jo Dixon
    Description

    Cristiano Ronaldo has one of the most popular Instagram accounts as of April 2024. The Portuguese footballer is the most-followed person on the photo-sharing platform, with 628 million followers. Instagram's own account was ranked first, with roughly 672 million followers.

    How popular is Instagram?

    Instagram is a photo-sharing social networking service that enables users to take pictures and edit them with filters. The platform allows users to post and share their images online and directly with their friends and followers on the social network. The cross-platform app reached one billion monthly active users in mid-2018. In 2020, there were over 114 million Instagram users in the United States, and experts projected this figure to surpass 127 million users in 2023.

    Who uses Instagram?

    Instagram audiences are predominantly young: recent data states that almost 60 percent of U.S. Instagram users are aged 34 years or younger. Fall 2020 data reveals that Instagram is also one of the most popular social media platforms for teens and one of the social networks with the biggest reach among teens in the United States.

    Celebrity influencers on Instagram

    Many celebrities and athletes are brand spokespeople and generate additional income with social media advertising and sponsored content. Unsurprisingly, Ronaldo ranked first again, as the average media value of one of his Instagram posts was 985,441 U.S. dollars.
    
  4. Instagram Dataset

    • brightdata.com
    .json, .csv, .xlsx
    Updated Apr 26, 2022
    Cite
    Bright Data (2022). Instagram Dataset [Dataset]. https://brightdata.com/products/datasets/instagram
    Explore at:
    Available download formats: .json, .csv, .xlsx
    Dataset updated
    Apr 26, 2022
    Dataset authored and provided by
    Bright Data (https://brightdata.com/)
    License

    https://brightdata.com/license

    Area covered
    Worldwide
    Description

    Access detailed insights with our Instagram datasets, featuring follower counts, verified status, account types, and engagement scores. Explore post information including URLs, descriptions, hashtags, comments, likes, media, posting dates, locations, and reel URLs. Perfect for understanding user engagement and content trends to drive informed decisions and optimize your social media strategies.

    • Over 750M records available
    • Price starts at $250/100K records
    • Data formats are available in JSON, NDJSON, CSV, XLSX and Parquet
    • 100% ethical and compliant data collection

    Included datapoints: Account, Fbid, Id, Followers, Posts Count, Is Business Account, Is Professional Account, Is Verified, Avg Engagement, External Url, Biography, Business Category Name, Category Name, Post Hashtags, Following, Posts, Profile Image Link, Profile URL, Profile Name, Highlights Count, Highlights, Full Name, Is Private, Bio Hashtags, URL, Is Joined Recently, and much more.

  5. Data from: Five Years of COVID-19 Discourse on Instagram: A Labeled...

    • zenodo.org
    • data.niaid.nih.gov
    bin
    Updated Oct 21, 2024
    Cite
    Nirmalya Thakur, Ph.D. (2024). Five Years of COVID-19 Discourse on Instagram: A Labeled Instagram Dataset of Over Half a Million Posts for Multilingual Sentiment Analysis [Dataset]. http://doi.org/10.5281/zenodo.13896353
    Explore at:
    Available download formats: bin
    Dataset updated
    Oct 21, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Nirmalya Thakur, Ph.D.
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Oct 6, 2024
    Description

    Please cite the following paper when using this dataset:

    N. Thakur, “Five Years of COVID-19 Discourse on Instagram: A Labeled Instagram Dataset of Over Half a Million Posts for Multilingual Sentiment Analysis”, Proceedings of the 7th International Conference on Machine Learning and Natural Language Processing (MLNLP 2024), Chengdu, China, October 18-20, 2024 (Paper accepted for publication, Preprint available at: https://arxiv.org/abs/2410.03293)

    Abstract

    The outbreak of COVID-19 served as a catalyst for content creation and dissemination on social media platforms, as such platforms serve as virtual communities where people can connect and communicate with one another seamlessly. While there have been several works related to the mining and analysis of COVID-19-related posts on social media platforms such as Twitter (or X), YouTube, Facebook, and TikTok, there is still limited research that focuses on the public discourse on Instagram in this context. Furthermore, the prior works in this field have only focused on the development and analysis of datasets of Instagram posts published during the first few months of the outbreak. The work presented in this paper aims to address this research gap and presents a novel multilingual dataset of 500,153 Instagram posts about COVID-19 published between January 2020 and September 2024. This dataset contains Instagram posts in 161 different languages. After the development of this dataset, multilingual sentiment analysis was performed using VADER and twitter-xlm-roberta-base-sentiment. This process involved classifying each post as positive, negative, or neutral. The results of sentiment analysis are presented as a separate attribute in this dataset.

    For each of these posts, the Post ID, Post Description, Date of publication, language code, full version of the language, and sentiment label are presented as separate attributes in the dataset.

    The Instagram posts in this dataset are present in 161 different languages, of which the top 10 in terms of frequency are English (343,041 posts), Spanish (30,220 posts), Hindi (15,832 posts), Portuguese (15,779 posts), Indonesian (11,491 posts), Tamil (9,592 posts), Arabic (9,416 posts), German (7,822 posts), Italian (5,162 posts), and Turkish (4,632 posts).

    There are 535,021 distinct hashtags in this dataset, with the top 10 hashtags in terms of frequency being #covid19 (169,865 posts), #covid (132,485 posts), #coronavirus (117,518 posts), #covid_19 (104,069 posts), #covidtesting (95,095 posts), #coronavirusupdates (75,439 posts), #corona (39,416 posts), #healthcare (38,975 posts), #staysafe (36,740 posts), and #coronavirusoutbreak (34,567 posts).

    The following is a description of the attributes present in this dataset

    • Post ID: Unique ID of each Instagram post
    • Post Description: Complete description of each post in the language in which it was originally published
    • Date: Date of publication in MM/DD/YYYY format
    • Language code: Language code (for example: “en”) that represents the language of the post as detected using the Google Translate API
    • Full Language: Full form of the language (for example: “English”) that represents the language of the post as detected using the Google Translate API
    • Sentiment: Results of sentiment analysis (using the preprocessed version of each post) where each post was classified as positive, negative, or neutral
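
    As a rough illustration of how the attributes listed above might be used, the following Python sketch parses the MM/DD/YYYY dates and tallies sentiment labels per language code. The column header strings and the three sample rows are placeholders invented for this example (they are not records from the dataset), and the published file may label its columns differently.

```python
import csv
import io
from collections import Counter, defaultdict
from datetime import datetime

# Placeholder rows invented for illustration only; not real dataset records.
# The header names mirror the attribute list above, but how the actual file
# labels its columns is an assumption.
SAMPLE_CSV = """Post ID,Post Description,Date,Language code,Full Language,Sentiment
id-001,placeholder text a,03/15/2020,en,English,negative
id-002,placeholder text b,07/04/2021,es,Spanish,neutral
id-003,placeholder text c,09/30/2024,en,English,positive
"""


def sentiment_by_language(rows):
    """Count sentiment labels per language code."""
    counts = defaultdict(Counter)
    for row in rows:
        counts[row["Language code"]][row["Sentiment"]] += 1
    return counts


def posts_per_year(rows):
    """Count posts per publication year, parsing MM/DD/YYYY dates."""
    years = Counter()
    for row in rows:
        published = datetime.strptime(row["Date"], "%m/%d/%Y")
        years[published.year] += 1
    return years


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
by_language = sentiment_by_language(rows)
by_year = posts_per_year(rows)
```

    To run this against the published file, the in-memory sample would need to be replaced by a reader suited to the download format, which is not specified in detail here.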

    Open Research Questions

    This dataset is expected to be helpful for the investigation of the following research questions and even beyond:

    1. How does sentiment toward COVID-19 vary across different languages?
    2. How has public sentiment toward COVID-19 evolved from 2020 to the present?
    3. How do cultural differences affect social media discourse about COVID-19 across various languages?
    4. How has COVID-19 impacted mental health, as reflected in social media posts across different languages?
    5. How effective were public health campaigns in shifting public sentiment in different languages?
    6. What patterns of vaccine hesitancy or support are present in different languages?
    7. How did geopolitical events influence public sentiment about COVID-19 in multilingual social media discourse?
    8. What role does social media discourse play in shaping public behavior toward COVID-19 in different linguistic communities?
    9. How does the sentiment of minority or underrepresented languages compare to that of major world languages regarding COVID-19?
    10. What insights can be gained by comparing the sentiment of COVID-19 posts in widely spoken languages (e.g., English, Spanish) to those in less common languages?

    All the Instagram posts that were collected during this data mining process to develop this dataset were publicly available on Instagram and did not require a user to log in to Instagram to view the same (at the time of writing this paper).

  6. Instagram: distribution of global audiences 2024, by age and gender

    • statista.com
    • es.statista.com
    Cite
    Stacy Jo Dixon, Instagram: distribution of global audiences 2024, by age and gender [Dataset]. https://www.statista.com/topics/1164/social-networks/
    Explore at:
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Stacy Jo Dixon
    Description

    As of April 2024, around 16.5 percent of global active Instagram users were men between the ages of 18 and 24 years. More than half of the global Instagram population worldwide was aged 34 years or younger.

    Teens and social media

    As one of the biggest social networks worldwide, Instagram is especially popular with teenagers. As of fall 2020, the photo-sharing app ranked third in terms of preferred social network among teenagers in the United States, behind Snapchat and TikTok. Instagram was one of the most influential advertising channels among female Gen Z users when making purchasing decisions. Teens report feeling more confident, popular, and better about themselves when using social media, and less lonely, depressed, and anxious.

    Social media can also have negative effects on teens, and these are much more pronounced among those with low emotional well-being. It was found that 35 percent of teenagers with low social-emotional well-being reported having experienced cyberbullying when using social media, compared with only five percent of teenagers with high social-emotional well-being. As such, social media can have a big impact on already fragile states of mind.
    
  7. Instagram: distribution of global audiences 2024, by gender

    • statista.com
    • es.statista.com
    Cite
    Stacy Jo Dixon, Instagram: distribution of global audiences 2024, by gender [Dataset]. https://www.statista.com/topics/1164/social-networks/
    Explore at:
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Stacy Jo Dixon
    Description

    As of January 2024, Instagram was slightly more popular with men than women, with men accounting for 50.6 percent of the platform’s global users. Additionally, the social media app was most popular amongst younger audiences, with almost 32 percent of users aged between 18 and 24 years.

    Instagram’s Global Audience

    As of January 2024, Instagram was the fourth most popular social media platform globally, reaching two billion monthly active users (MAU). This number is projected to keep growing with no signs of slowing down, which is not a surprise as the global online social penetration rate across all regions is constantly increasing.

    As of January 2024, the country with the largest Instagram audience was India with 362.9 million users, followed by the United States with 169.7 million users.

    Who is winning over the generations?

    Even though Instagram’s audience is almost twice the size of TikTok’s on a global scale, TikTok has shown itself to be a fierce competitor, particularly amongst younger audiences. TikTok was the most downloaded mobile app globally in 2022, generating 672 million downloads. As of 2022, Generation Z in the United States spent more time on TikTok than on Instagram monthly.
    
  8. Instagram: number of global users 2020-2025

    • statista.com
    Updated May 22, 2024
    Cite
    Statista (2024). Instagram: number of global users 2020-2025 [Dataset]. https://www.statista.com/statistics/183585/instagram-number-of-global-users/
    Explore at:
    Dataset updated
    May 22, 2024
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    Worldwide
    Description

    In 2021, there were 1.21 billion monthly active users of Meta's Instagram, making up over 28 percent of the world's internet users. By 2025, it has been forecast that there will be 1.44 billion monthly active users of the social media platform, which would account for 31.2 percent of global internet users.

    How popular is Instagram?

    Instagram, as of January 2022, was the fourth most popular social media platform in the world in terms of user numbers. YouTube and WhatsApp ranked in second and third place, respectively, whilst Facebook remained the most popular, with almost three billion monthly active users worldwide.

    India had the largest number of Instagram users as of January 2022, with a total of over 230 million users in the country. The second-largest Instagram audience could be found in the United States, with almost 160 million people subscribing to the photo and video sharing app.

    Gen Z and Instagram

    As of September 2021, Gen Z users in the United States spent an average of five hours per week on Instagram. Although Instagram ranked third in terms of hours per week spent on the platform, Gen Z users spent considerably more time on TikTok, amounting to a weekly average of over 10 hours being spent on the mobile-first video app.

    Most followed accounts on Instagram

    As of May 2022, Instagram’s own account had 504.37 million followers. In terms of celebrities, Portuguese footballer Cristiano Ronaldo (@cristiano) had over 440.41 million followers on the social network. Moreover, the average media value of an Instagram post by Ronaldo was over 985,000 U.S. dollars.

    The most liked post on Instagram as of May 2022 was Photo of an Egg, which was posted in 2019 by the account @world_record_egg. Photo of an Egg has not only exceeded 55 million likes on the platform, but it also has nearly 3.5 million comments, and the account itself has over 4.5 million Instagram followers. After mysterious posts published by the account, World Record Egg revealed itself as part of a mental health campaign aimed at the difficulties and demands of using social media.

  9. ‘Instagram fake spammer genuine accounts’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Jan 28, 2022
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘Instagram fake spammer genuine accounts’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-instagram-fake-spammer-genuine-accounts-2889/c3dd896e/?iid=015-810&v=presentation
    Explore at:
    Dataset updated
    Jan 28, 2022
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of ‘Instagram fake spammer genuine accounts’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/free4ever1/instagram-fake-spammer-genuine-accounts on 28 January 2022.

    --- Dataset description provided by original source is as follows ---

    Context

    Fakes and spammers are a major problem on all social media platforms, including Instagram. This is the subject of my final-year project, in which I set out to find ways of detecting them using machine learning. In this dataset, fake and spammer are interchangeable terms.

    Content

    I have personally identified the spammer/fake accounts included in this dataset after carefully examining each instance, and as such the dataset has a high level of accuracy, though there might be a couple of misidentified accounts in the spammers list as well. The dataset was collected using a crawler from 15 to 19 March 2019.

    Inspiration

    This dataset could be further improved in quantity and quality measures, but how much accuracy can it achieve? Possible ways of using the models to tackle the problem?

    --- Original source retains full ownership of the source dataset ---

  10. Data from: Mpox Narrative on Instagram: A Labeled Multilingual Dataset of...

    • figshare.com
    xlsx
    Updated Oct 12, 2024
    Cite
    Nirmalya Thakur (2024). Mpox Narrative on Instagram: A Labeled Multilingual Dataset of Instagram Posts on Mpox for Sentiment, Hate Speech, and Anxiety Analysis [Dataset]. http://doi.org/10.6084/m9.figshare.27072247.v1
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Oct 12, 2024
    Dataset provided by
    figshare
    Authors
    Nirmalya Thakur
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Please cite this paper when using this dataset:

    N. Thakur, “Mpox narrative on Instagram: A labeled multilingual dataset of Instagram posts on mpox for sentiment, hate speech, and anxiety analysis,” arXiv [cs.LG], 2024, URL: https://arxiv.org/abs/2409.05292

    Abstract

    The world is currently experiencing an outbreak of mpox, which has been declared a Public Health Emergency of International Concern by WHO. During recent virus outbreaks, social media platforms have played a crucial role in keeping the global population informed and updated regarding various aspects of the outbreaks. As a result, in the last few years, researchers from different disciplines have focused on the development of social media datasets focusing on different virus outbreaks. No prior work in this field has focused on the development of a dataset of Instagram posts about the mpox outbreak. The work presented in this paper (stated above) aims to address this research gap. It presents this multilingual dataset of 60,127 Instagram posts about mpox, published between July 23, 2022, and September 5, 2024. This dataset contains Instagram posts about mpox in 52 languages.

    For each of these posts, the Post ID, Post Description, Date of publication, language, and translated version of the post (translation to English was performed using the Google Translate API) are presented as separate attributes in the dataset. After developing this dataset, sentiment analysis, hate speech detection, and anxiety or stress detection were also performed. This process included classifying each post into:

    • one of the fine-grain sentiment classes, i.e., fear, surprise, joy, sadness, anger, disgust, or neutral
    • hate or not hate
    • anxiety/stress detected or no anxiety/stress detected

    These results are presented as separate attributes in the dataset for the training and testing of machine learning algorithms for sentiment, hate speech, and anxiety or stress detection, as well as for other applications.

    The 52 distinct languages in which Instagram posts are present in the dataset are English, Portuguese, Indonesian, Spanish, Korean, French, Hindi, Finnish, Turkish, Italian, German, Tamil, Urdu, Thai, Arabic, Persian, Tagalog, Dutch, Catalan, Bengali, Marathi, Malayalam, Swahili, Afrikaans, Panjabi, Gujarati, Somali, Lithuanian, Norwegian, Estonian, Swedish, Telugu, Russian, Danish, Slovak, Japanese, Kannada, Polish, Vietnamese, Hebrew, Romanian, Nepali, Czech, Modern Greek, Albanian, Croatian, Slovenian, Bulgarian, Ukrainian, Welsh, Hungarian, and Latvian.

    The following is a description of the attributes present in this dataset:

    • Post ID: Unique ID of each Instagram post
    • Post Description: Complete description of each post in the language in which it was originally published
    • Date: Date of publication in MM/DD/YYYY format
    • Language: Language of the post as detected using the Google Translate API
    • Translated Post Description: Translated version of the post description. All posts which were not in English were translated into English using the Google Translate API. No language translation was performed for English posts.
    • Sentiment: Results of sentiment analysis (using the preprocessed version of the translated Post Description) where each post was classified into one of the sentiment classes: fear, surprise, joy, sadness, anger, disgust, and neutral
    • Hate: Results of hate speech detection (using the preprocessed version of the translated Post Description) where each post was classified as hate or not hate
    • Anxiety or Stress: Results of anxiety or stress detection (using the preprocessed version of the translated Post Description) where each post was classified as stress/anxiety detected or no stress/anxiety detected

    All the Instagram posts that were collected during this data mining process to develop this dataset were publicly available on Instagram and did not require a user to log in to Instagram to view the same (at the time of writing this paper).

  11. U.S. Army vs. British Army Instagram Engagement Metrics Dataset

    • figshare.com
    xlsx
    Updated Jun 18, 2024
    Cite
    Abby Stover (2024). U.S. Army vs. British Army Instagram Engagement Metrics Dataset [Dataset]. http://doi.org/10.6084/m9.figshare.26060866.v1
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Jun 18, 2024
    Dataset provided by
    figshare
    Authors
    Abby Stover
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    United States, United Kingdom
    Description

    This dataset investigates the Instagram engagement metrics (likes and comments) of the U.S. and British Armies to understand the strengths and weaknesses of their marketing. For the quantitative data collection, a random number generator was used to compile a 20% data sample (73 posts) from a total of 365 posts from each account. For instance, a number 1 in the random generator corresponded to the most recent post as of the start date of data collection (May 23rd, 2024). By picking from 365 posts, the data collection was meant to represent roughly a year of Instagram content, assuming the accounts posted every day. This method was intended to give an unbiased selection of the content included in the 20% data sample.

    However, the U.S. Army posted almost once a day, while the British Army posted only a few days a week. In the end, data was collected across 365 U.S. Army posts dating from May 23rd, 2024, back to October 28th, 2023. For the British Army’s Instagram, the data collection spanned from May 23rd, 2024, back to November 25th, 2021. By focusing on recent posts, the purpose was to understand how effectively these Armies responded to their recruitment crisis (which started in 2022).

    For the data collection, variables for each post included the following:

    • Date of post
    • Number of likes
    • Percentage of likes by follower population
    • Number of comments
    • Percentage of comments by follower population

    To understand which Instagram posts were successful, the content with the highest number of likes and comments was defined as the most engaged. However, to accurately compare the British Army’s Instagram engagement to that of the U.S. Army, the number of likes/comments was divided by the number of followers of each account. As of May 23, 2024, the U.S. Army had 2.9 million followers on Instagram, whereas the British Army had 594,000 followers. While social media users outside of the Armies’ followers engaged with the posts, these ratios provided a basis to fairly compare their engagement metrics.
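
    The sampling and follower normalization described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the description does not say which random number generator was used or whether draws were made without replacement, so `random.sample` is a stand-in, and the like count below is a hypothetical figure chosen to show the arithmetic, not a value from the dataset.

```python
import random

POSTS_PER_ACCOUNT = 365  # posts considered per account (from the description)
SAMPLE_SIZE = 73         # the 20% sample size (from the description)

# Follower counts as of May 23, 2024, as stated in the description.
FOLLOWERS = {"us_army": 2_900_000, "british_army": 594_000}


def draw_sample(seed):
    """Pick 73 distinct post indices out of 365; index 1 is the most recent post.

    Sampling without replacement is an assumption of this sketch.
    """
    rng = random.Random(seed)  # seeded only so the sketch is repeatable
    return sorted(rng.sample(range(1, POSTS_PER_ACCOUNT + 1), SAMPLE_SIZE))


def engagement_percent(count, followers):
    """Likes (or comments) expressed as a percentage of the follower population."""
    return 100.0 * count / followers


sample = draw_sample(seed=0)

# Hypothetical like count, used only to show the normalization.
hypothetical_likes = 10_000
us_pct = engagement_percent(hypothetical_likes, FOLLOWERS["us_army"])
uk_pct = engagement_percent(hypothetical_likes, FOLLOWERS["british_army"])
```

    With the same raw like count, the smaller account ends up with the higher percentage, which is why the description compares ratios rather than raw counts.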

  12. Social Media vs Productivity

    • kaggle.com
    Updated May 15, 2025
    Cite
    Mahdi Mashayekhi (2025). Social Media vs Productivity [Dataset]. https://www.kaggle.com/datasets/mahdimashayekhi/social-media-vs-productivity/code
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    May 15, 2025
    Dataset provided by
    Kaggle
    Authors
    Mahdi Mashayekhi
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    📊 Social Media vs Productivity — Realistic Behavioral Dataset (30,000 Users)

    This dataset explores how daily digital habits — including social media usage, screen time, and notification exposure — relate to individual productivity, stress, and well-being.

    🔍 What’s Inside?

    The dataset contains 30,000 real-world-style records simulating behavioral patterns of people with various jobs, social habits, and lifestyle choices. The goal is to understand how different digital behaviors correlate with perceived and actual productivity.

    🧠 Why This Dataset is Valuable

    • Designed for real-world ML workflows
      Includes missing values, noise, and outliers — ideal for practicing data cleaning and preprocessing.

    • 🔗 High correlation between target features
      The perceived_productivity_score and actual_productivity_score are strongly correlated, making this dataset suitable for experiments in feature selection and multicollinearity.

    • 🛠️ Feature Engineering playground
      Use this dataset to practice feature scaling, encoding, binning, interaction terms, and more.

    • 🧪 Perfect for EDA, regression & classification
      You can model productivity, stress, or satisfaction based on behavior patterns and digital exposure.

    🧾 Columns & Feature Info

    Column Name                     Description
    age                             Age of the individual (18–65 years)
    gender                          Gender identity: Male, Female, or Other
    job_type                        Employment sector or status (IT, Education, Student, etc.)
    daily_social_media_time         Average daily time spent on social media (hours)
    social_platform_preference      Most-used social platform (Instagram, TikTok, Telegram, etc.)
    number_of_notifications         Number of mobile/social notifications per day
    work_hours_per_day              Average hours worked each day
    perceived_productivity_score    Self-rated productivity score (scale: 0–10)
    actual_productivity_score       Simulated ground-truth productivity score (scale: 0–10)
    stress_level                    Current stress level (scale: 1–10)
    sleep_hours                     Average hours of sleep per night
    screen_time_before_sleep        Time spent on screens before sleeping (hours)
    breaks_during_work              Number of breaks taken during work hours
    uses_focus_apps                 Whether the user uses digital focus apps (True/False)
    has_digital_wellbeing_enabled   Whether Digital Wellbeing is activated (True/False)
    coffee_consumption_per_day      Number of coffee cups consumed per day
    days_feeling_burnout_per_month  Number of burnout days reported per month
    weekly_offline_hours            Total hours spent offline each week (excluding sleep)
    job_satisfaction_score          Satisfaction with job/life responsibilities (scale: 0–10)

    📌 Notes

    • Contains NaN values in critical columns (productivity, sleep, stress) for data imputation tasks
    • Includes outliers in media usage, coffee intake, and notification count
    • Target columns are strongly correlated for multicollinearity testing
    • Multi-purpose: regression, classification, clustering, visualization
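
The cleaning tasks listed in the notes can be sketched with the Python standard library alone. This is a minimal illustration, not the dataset author's notebook: the helper names (`impute_median`, `iqr_outliers`, `pearson`) are hypothetical, the toy values are made up rather than taken from the dataset, and median imputation and the 1.5 × IQR rule are just one common choice among several.

```python
from statistics import median, quantiles

def impute_median(values):
    """Replace missing entries (None) with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]

def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Made-up toy values, not rows from the dataset.
perceived = impute_median([5.0, None, 7.0, 3.0, 8.0])
actual = impute_median([4.5, 6.0, 6.5, None, 7.5])
coffee_outliers = iqr_outliers([1, 2, 2, 3, 2, 14])
correlation = pearson(perceived, actual)
```

On the real data, a correlation close to 1 between the two productivity scores would be the signal to keep only one of them as a predictor, or to treat one as the target.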

    💡 Use Cases

    • Exploratory Data Analysis (EDA)
    • Feature engineering pipelines
    • Machine learning model benchmarking
    • Statistical hypothesis testing
    • Burnout and mental health prediction projects

    📥 Bonus

    👉 Sample notebook coming soon with data cleaning, visualization, and productivity prediction!

  13. Instagram: most used hashtags 2024

    • statista.com
    • es.statista.com
    Cite
    Statista Research Department, Instagram: most used hashtags 2024 [Dataset]. https://www.statista.com/topics/1164/social-networks/
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Statista Research Department
    Description

    As of January 2024, #love was the most used hashtag on Instagram, being included in over two billion posts on the social media platform. #Instagood and #instagram were used over one billion times as of early 2024.

  14. Proactive Action of Social Media Companies: Year- and Month-wise Number of...

    • dataful.in
    Updated Jul 23, 2025
    Cite
    Dataful (Factly) (2025). Proactive Action of Social Media Companies: Year- and Month-wise Number of Accounts Blocked, Contents Removed and other Actions Taken by SSMIs [Dataset]. https://dataful.in/datasets/18653
    Available download formats
    xlsx, csv, application/x-parquet
    Dataset updated
    Jul 23, 2025
    Dataset authored and provided by
    Dataful (Factly)
    License

    https://dataful.in/terms-and-conditions

    Area covered
    India
    Variables measured
    Social Media Intermediaries Ban actions
    Description

    High Frequency Indicator: The dataset contains year- and month-wise data from 2021 to date on the different types of actions taken by significant social media intermediaries (SSMIs) such as Twitter, Koo, Facebook, Instagram, ShareChat, Google and WhatsApp. The data is compiled from the monthly transparency reports published by SSMIs in accordance with Rule 4(1)(d) of the Information Technology (Intermediary Guidelines and Digital Media Ethics Code) Rules, 2021 (IT Rules, 2021).

    The different types of action taken include Content actioning (removal), Account Removal Actions as a result of automated detection, Reporting spam, Blocking, Proactive ban, etc., for reasons such as harassment, Child Endangerment with Nudity, Sexual Exploitation and Physical Abuse, Dangerous Organizations and Individuals with Organized Hate and Terrorism and its Propaganda, Hate Speech, Drugs and Firearms, etc.

    Notes:

    1. Twitter: “Proactive Monitoring” refers to content proactively identified by employing internal proprietary tools and industry hash sharing initiatives.
    2. Google: For data related to automated detection processes, Google includes data where the sender or creator of the content is located in India. In order to attribute a location to an individual sender or creator, Google uses data signals such as location of account creation, IP address at the time of video upload and user phone number, as available.
    3. Meta: Proactive Rates: this metric shows the percentage of all content or accounts acted on that Meta found and flagged before users reported them to Meta. These metrics are the best estimates of the content Meta acts on and of proactive rates, based on the creator of the content and predicted country locations for those users.
    4. ShareChat: Accounts are proactively banned on the basis of Copyright violations, Sexually explicit, UGC- violation of community standards, Chatrooms and Comments.
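
The proactive rate described in the Meta note is a simple ratio, sketched below. The function name `proactive_rate`, its parameter names, and the example counts are hypothetical and are not taken from any transparency report.

```python
def proactive_rate(proactively_flagged: int, total_actioned: int) -> float:
    """Share (in percent) of actioned items that were found before any user report."""
    if total_actioned <= 0:
        raise ValueError("total_actioned must be positive")
    if not 0 <= proactively_flagged <= total_actioned:
        raise ValueError("proactively_flagged must lie between 0 and total_actioned")
    return 100.0 * proactively_flagged / total_actioned

# Made-up counts for illustration only.
example_rate = proactive_rate(proactively_flagged=950, total_actioned=1000)
```

A rate computed this way says nothing about content that was never actioned at all, so it should not be read as a detection rate.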

  15. DeepCube: Post-processing dataset of social media data

    • explore.openaire.eu
    Updated Mar 14, 2023
    Cite
    Alexandros Mokas; Eleni Kamateri; Ioannis Tsampoulatidis (2023). DeepCube: Post-processing dataset of social media data [Dataset]. http://doi.org/10.5281/zenodo.7736979
    Dataset updated
    Mar 14, 2023
    Authors
    Alexandros Mokas; Eleni Kamateri; Ioannis Tsampoulatidis
    Description

    This dataset contains the post-processing of the social media data collected for two different use cases during the first two years of the DeepCube project. More specifically, it contains two sub-datasets:

    • The UC2 dataset contains the post-processing of the Twitter data collected for the DeepCube use case (UC2) dealing with climate-induced migration in Africa. It contains 5,695,253 social media posts in total, collected from the Twitter platform based on the initial version of the search criteria relevant to UC2 (defined by Universitat De Valencia), focused on the regions of Ethiopia and Somalia. Collection ran from 26 June 2021 until March 2023.
    • The UC5 dataset contains the post-processing of the Twitter and Instagram data collected for the DeepCube use case (UC5) related to sustainable and environmentally friendly tourism. It contains 58,143 social media posts in total (12,881 collected from Twitter and 45,262 collected from Instagram), based on the initial version of the search criteria relevant to UC5 (defined by MURMURATION SAS), focused on the regions of Brazil. Collection ran from 26 June 2021 until March 2023.

    Additionally, an annotated dataset was created from Twitter historical data for UC2 for the years 2010-20220. The UC2 historical annotated dataset contains the post-processing of the Twitter data collected for the DeepCube use case (UC2) dealing with climate-induced migration in Africa. It contains 1721 annotated social media posts in total (412 relevant and 1309 irrelevant), collected from the Twitter platform and focused on the region of Somalia.

    INFALIA, being a spin-off of the CERTH institute and a partner of a research EU project, releases this dataset containing an unlimited number of Tweet IDs for the sole purpose of enabling the validation of the research conducted within DeepCube. Moreover, Twitter Content provided in this dataset to third parties remains subject to the Twitter Policy, and those third parties must agree to the Twitter Terms of Service, Privacy Policy, Developer Agreement, and Developer Policy (https://developer.twitter.com/en/developer-terms) before receiving this download.

  16. Instagram users in the United Kingdom 2019-2028

    • statista.com
    Updated Nov 22, 2024
    Cite
    Statista Research Department (2024). Instagram users in the United Kingdom 2019-2028 [Dataset]. https://www.statista.com/topics/3236/social-media-usage-in-the-uk/
    Dataset updated
    Nov 22, 2024
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Statista Research Department
    Area covered
    United Kingdom
    Description

    The number of Instagram users in the United Kingdom was forecast to increase continuously between 2024 and 2028 by a total of 2.1 million users (+7.02 percent). After the ninth consecutive year of growth, the Instagram user base is estimated to reach 32 million users, and therefore a new peak, in 2028. Notably, the number of Instagram users has increased continuously over the past years.

    User figures, shown here for the platform Instagram, have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period and count multiple accounts held by one person only once.

    The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information).
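
The forecast figures can be cross-checked with a few lines of arithmetic. This sketch assumes that the 32 million figure is the 2028 value and that the 2.1 million increase is measured from 2024; both inputs are rounded in the description, so the implied 2024 baseline is only approximate.

```python
# Figures quoted in the description (millions of users).
projected_2028 = 32.0
increase_2024_to_2028 = 2.1

# Implied 2024 baseline and relative growth, under the assumptions stated above.
baseline_2024 = projected_2028 - increase_2024_to_2028
growth_pct = 100 * increase_2024_to_2028 / baseline_2024

print(f"implied 2024 baseline: {baseline_2024:.1f} million")
print(f"relative growth: {growth_pct:.2f} percent")
```

The implied growth rounds to the +7.02 percent stated in the description, so the three figures are mutually consistent.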

  17. A 40-year High Arctic climatological dataset of the Polish Polar Station...

    • dataportal.igf.edu.pl
    Updated Jul 29, 2021
    Cite
    (2021). A 40-year High Arctic climatological dataset of the Polish Polar Station Hornsund (SW Spitsbergen, Svalbard) - Dataset - IG PAS Data Portal [Dataset]. https://dataportal.igf.edu.pl/dataset/https-doi-pangaea-de-10-1594-pangaea-909042
    Dataset updated
    Jul 29, 2021
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Area covered
    Arctic, Spitsbergen, Svalbard
    Description

    A consistent long-term (1979-2018) dataset from the Arctic meteorological site at the Polish Polar Station Hornsund (77°00'N 15°33'E), located in the SW part of Spitsbergen. The Station is managed by the Institute of Geophysics, Polish Academy of Sciences. The data series includes daily, monthly and annual air temperature (TA), PDD, NDD, the sum of precipitation (Precip), air humidity (RH), atmospheric pressure (PA), wind speed (WS) and direction (WD), sunshine duration (SD), cloudiness, and visibility (VV). This rich dataset, now available online, is a valuable source for documenting the state of the climate in SW Spitsbergen, which represents the Atlantic sector of the Arctic. Nowhere on the planet is climate warming faster than here. With a positive trend in mean annual temperature of +1.14°C/decade, the climate in Hornsund is warming five times faster than the global average.
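
The two trend figures in the description imply a global rate and a cumulative change, worked out below. This is a rough sketch: it treats 1979-2018 as four full decades and assumes the trend is linear over that period, neither of which the description states explicitly.

```python
station_trend = 1.14   # degC per decade, from the description
ratio_to_global = 5    # "five times faster", from the description
decades = 4            # 1979-2018 treated as four decades (assumption)

# Global trend implied by the stated ratio.
implied_global_trend = station_trend / ratio_to_global

# Cumulative warming at the station if the trend were linear (assumption).
total_warming = station_trend * decades

print(f"implied global trend: {implied_global_trend:.2f} degC/decade")
print(f"cumulative station warming: {total_warming:.1f} degC")
```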

  18. Universidad glacier LANDSAT 8 analysis results - Dataset - IG PAS Data...

    • dataportal.igf.edu.pl
    Updated Mar 20, 2024
    Cite
    (2024). Universidad glacier LANDSAT 8 analysis results - Dataset - IG PAS Data Portal [Dataset]. https://dataportal.igf.edu.pl/dataset/universidad-landsat
    Dataset updated
    Mar 20, 2024
    Description

    The files in this dataset are LANDSAT 8 images and maps of parameters derived from these images. Four sets of maps are included, each covering the time period between 2013 and 2022:

    • LANDSAT 8 multispectral image sourced from the Google Earth Engine API
    • Albedo map computed from this image with a formula of ()
    • GRAI map computed from this image with a formula included in the project's final publication (Podgorski et al 2023)
    • Debris abundance map derived from the GRAI map

    The maps show the state of the glacier's surface at the end of the ablation season in each year (end of March/beginning of April).

  19. Social Media Worldwide Usage Statistics

    • searchlogistics.com
    Updated Apr 1, 2025
    Cite
    (2025). Social Media Worldwide Usage Statistics [Dataset]. https://www.searchlogistics.com/learn/statistics/social-media-addiction-statistics/
    Dataset updated
    Apr 1, 2025
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    56.8% of the world’s total population is active on social media.

  20. Instagram: countries with the highest audience reach 2024

    • statista.com
    • es.statista.com
    Cite
    Stacy Jo Dixon, Instagram: countries with the highest audience reach 2024 [Dataset]. https://www.statista.com/topics/1164/social-networks/
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Stacy Jo Dixon
    Description

    As of April 2024, Bahrain was the country with the highest Instagram audience reach with 95.6 percent. Kazakhstan also had a high Instagram audience penetration rate, with 90.8 percent of the population using the social network. In the United Arab Emirates, Turkey, and Brunei, the photo-sharing platform was used by more than 85 percent of each country's population.

Instagram: distribution of global audiences 2025, by age group

495 scholarly articles cite this dataset (View in Google Scholar)
