100+ datasets found
  1. Average daily time spent on social media worldwide 2012-2025

    • statista.com
    Updated Jun 19, 2025
    Cite
    Statista (2025). Average daily time spent on social media worldwide 2012-2025 [Dataset]. https://www.statista.com/statistics/433871/daily-social-media-usage-worldwide/
    Dataset updated
    Jun 19, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    Worldwide
    Description

    How much time do people spend on social media? As of 2025, the average daily social media usage of internet users worldwide amounted to 141 minutes per day, down from 143 minutes in the previous year. Currently, the country with the most time spent on social media per day is Brazil, with online users spending an average of 3 hours and 49 minutes on social media each day. In comparison, the daily time spent with social media in the U.S. was just 2 hours and 16 minutes.

    Global social media usage

    Currently, the global social network penetration rate is 62.3 percent. Northern Europe had an 81.7 percent social media penetration rate, topping the ranking of global social media usage by region. Eastern and Middle Africa closed the ranking with 10.1 and 9.6 percent usage reach, respectively. People access social media for a variety of reasons. Users like to find funny or entertaining content and enjoy sharing photos and videos with friends, but mainly use social media to stay in touch with current events and friends.

    Global impact of social media

    Social media has a wide-reaching and significant impact on not only online activities but also offline behavior and life in general. During a global online user survey in February 2019, a significant share of respondents stated that social media had increased their access to information, ease of communication, and freedom of expression. On the flip side, respondents also felt that social media had worsened their personal privacy, increased polarization in politics, and heightened everyday distractions.

  2. Social media profile growth, engagement rate, and reach

    • data.sugarlandtx.gov
    xlsx
    Updated Jan 3, 2024
    Cite
    Communications and Community Engagement (2024). Social media profile growth, engagement rate, and reach [Dataset]. https://data.sugarlandtx.gov/dataset/social-media-profile-growth-engagement-rate-and-reach
    Available download formats: xlsx
    Dataset updated
    Jan 3, 2024
    Dataset authored and provided by
    Communications and Community Engagement
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Profile growth - the growth on our social platforms to see where and when we're gaining followers. Engagement rate - a ratio of how many people interacted with our posts based on when users are usually online. Reach - the number of feeds our posts appeared in (doesn't mean people interacted with the post).

  3. Instagram accounts with the most followers worldwide 2024

    • statista.com
    • es.statista.com
    Cite
    Stacy Jo Dixon, Instagram accounts with the most followers worldwide 2024 [Dataset]. https://www.statista.com/topics/1164/social-networks/
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Stacy Jo Dixon
    Description

    Cristiano Ronaldo has one of the most popular Instagram accounts as of April 2024. The Portuguese footballer is the most-followed person on the photo-sharing platform with 628 million followers. Instagram's own account was ranked first with roughly 672 million followers.

    How popular is Instagram?

    Instagram is a photo-sharing social networking service that enables users to take pictures and edit them with filters. The platform allows users to post and share their images online and directly with their friends and followers on the social network. The cross-platform app reached one billion monthly active users in mid-2018. In 2020, there were over 114 million Instagram users in the United States and experts project this figure to surpass 127 million users in 2023.

    Who uses Instagram?

    Instagram audiences are predominantly young – recent data states that almost 60 percent of U.S. Instagram users are aged 34 years or younger. Fall 2020 data reveals that Instagram is also one of the most popular social media for teens and one of the social networks with the biggest reach among teens in the United States.

    Celebrity influencers on Instagram

    Many celebrities and athletes are brand spokespeople and generate additional income with social media advertising and sponsored content. Unsurprisingly, Ronaldo ranked first again, as the average media value of one of his Instagram posts was 985,441 U.S. dollars.
    
  4. Social Media Datasets

    • brightdata.com
    .json, .csv, .xlsx
    Updated Sep 7, 2022
    Cite
    Bright Data (2022). Social Media Datasets [Dataset]. https://brightdata.com/products/datasets/social-media
    Available download formats: .json, .csv, .xlsx
    Dataset updated
    Sep 7, 2022
    Dataset authored and provided by
    Bright Data (https://brightdata.com/)
    License

    https://brightdata.com/license

    Area covered
    Worldwide
    Description

    Gain valuable insights with our comprehensive Social Media Dataset, designed to help businesses, marketers, and analysts track trends, monitor engagement, and optimize strategies. This dataset provides structured and reliable social media data from multiple platforms.

    Dataset Features

    • User Profiles: Access public social media profiles, including usernames, bios, follower counts, engagement metrics, and more. Ideal for audience analysis, influencer marketing, and competitive research.
    • Posts & Content: Extract posts, captions, hashtags, media (images/videos), timestamps, and engagement metrics such as likes, shares, and comments. Useful for trend analysis, sentiment tracking, and content strategy optimization.
    • Comments & Interactions: Analyze user interactions, including replies, mentions, and discussions. This data helps brands understand audience sentiment and engagement patterns.
    • Hashtag & Trend Tracking: Monitor trending hashtags, topics, and viral content across platforms to stay ahead of industry trends and consumer interests.

    Customizable Subsets for Specific Needs

    Our Social Media Dataset is fully customizable, allowing you to filter data based on platform, region, keywords, engagement levels, or specific user profiles. Whether you need a broad dataset for market research or a focused subset for brand monitoring, we tailor the dataset to your needs.

    Popular Use Cases

    • Brand Monitoring & Reputation Management: Track brand mentions, customer feedback, and sentiment analysis to manage online reputation effectively.
    • Influencer Marketing & Audience Analysis: Identify key influencers, analyze engagement metrics, and optimize influencer partnerships.
    • Competitive Intelligence: Monitor competitor activity, content performance, and audience engagement to refine marketing strategies.
    • Market Research & Consumer Insights: Analyze social media trends, customer preferences, and emerging topics to inform business decisions.
    • AI & Predictive Analytics: Leverage structured social media data for AI-driven trend forecasting, sentiment analysis, and automated content recommendations.

    Whether you're tracking brand sentiment, analyzing audience engagement, or monitoring industry trends, our Social Media Dataset provides the structured data you need. Get started today and customize your dataset to fit your business objectives.

  5. Data from: Facebook Users

    • searchlogistics.com
    Updated Apr 1, 2025
    Cite
    (2025). Facebook Users [Dataset]. https://www.searchlogistics.com/learn/statistics/social-media-user-statistics/
    Dataset updated
    Apr 1, 2025
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Facebook is fast approaching 3 billion monthly active users. That’s about 36% of the world’s entire population that log in and use Facebook at least once a month.

  6. Social Media vs Productivity

    • kaggle.com
    Updated May 15, 2025
    Cite
    Mahdi Mashayekhi (2025). Social Media vs Productivity [Dataset]. https://www.kaggle.com/datasets/mahdimashayekhi/social-media-vs-productivity/code
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    May 15, 2025
    Dataset provided by
    Kaggle
    Authors
    Mahdi Mashayekhi
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    📊 Social Media vs Productivity: Realistic Behavioral Dataset (30,000 Users)

    This dataset explores how daily digital habits (including social media usage, screen time, and notification exposure) relate to individual productivity, stress, and well-being.

    🔍 What’s Inside?

    The dataset contains 30,000 real-world-style records simulating behavioral patterns of people with various jobs, social habits, and lifestyle choices. The goal is to understand how different digital behaviors correlate with perceived and actual productivity.

    🧠 Why This Dataset is Valuable

    • ✅ Designed for real-world ML workflows
      Includes missing values, noise, and outliers, which makes it ideal for practicing data cleaning and preprocessing.

    • 🔗 High correlation between target features
      The perceived_productivity_score and actual_productivity_score are strongly correlated, making this dataset suitable for experiments in feature selection and multicollinearity.

    • 🛠️ Feature Engineering playground
      Use this dataset to practice feature scaling, encoding, binning, interaction terms, and more.

    • 🧪 Perfect for EDA, regression & classification
      You can model productivity, stress, or satisfaction based on behavior patterns and digital exposure.

    🧾 Columns & Feature Info

    Column Name                      Description
    age                              Age of the individual (18–65 years)
    gender                           Gender identity: Male, Female, or Other
    job_type                         Employment sector or status (IT, Education, Student, etc.)
    daily_social_media_time          Average daily time spent on social media (hours)
    social_platform_preference       Most-used social platform (Instagram, TikTok, Telegram, etc.)
    number_of_notifications          Number of mobile/social notifications per day
    work_hours_per_day               Average hours worked each day
    perceived_productivity_score     Self-rated productivity score (scale: 0–10)
    actual_productivity_score        Simulated ground-truth productivity score (scale: 0–10)
    stress_level                     Current stress level (scale: 1–10)
    sleep_hours                      Average hours of sleep per night
    screen_time_before_sleep         Time spent on screens before sleeping (hours)
    breaks_during_work               Number of breaks taken during work hours
    uses_focus_apps                  Whether the user uses digital focus apps (True/False)
    has_digital_wellbeing_enabled    Whether Digital Wellbeing is activated (True/False)
    coffee_consumption_per_day       Number of coffee cups consumed per day
    days_feeling_burnout_per_month   Number of burnout days reported per month
    weekly_offline_hours             Total hours spent offline each week (excluding sleep)
    job_satisfaction_score           Satisfaction with job/life responsibilities (scale: 0–10)

    📌 Notes

    • Contains NaN values in critical columns (productivity, sleep, stress) for data imputation tasks
    • Includes outliers in media usage, coffee intake, and notification count
    • Target columns are strongly correlated for multicollinearity testing
    • Multi-purpose: regression, classification, clustering, visualization
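The notes above mention NaN values meant for imputation practice and two strongly correlated target columns. A minimal sketch of both steps follows, using only the Python standard library. The five rows are made up for illustration (they are not taken from the dataset), `None` stands in for NaN, and median imputation is just one possible choice, not a method the dataset author prescribes.

```python
from statistics import median

# Made-up rows that only mimic the described schema; None stands in for NaN.
rows = [
    {"perceived_productivity_score": 6.0, "actual_productivity_score": 5.5},
    {"perceived_productivity_score": None, "actual_productivity_score": 7.0},
    {"perceived_productivity_score": 8.0, "actual_productivity_score": 7.5},
    {"perceived_productivity_score": 3.0, "actual_productivity_score": None},
    {"perceived_productivity_score": 9.0, "actual_productivity_score": 8.5},
]


def impute_median(records, column):
    """Return copies of the records with missing `column` values set to the observed median."""
    observed = [r[column] for r in records if r[column] is not None]
    fill = median(observed)
    return [{**r, column: fill if r[column] is None else r[column]} for r in records]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


cleaned = rows
for col in ("perceived_productivity_score", "actual_productivity_score"):
    cleaned = impute_median(cleaned, col)

r = pearson(
    [row["perceived_productivity_score"] for row in cleaned],
    [row["actual_productivity_score"] for row in cleaned],
)
print(f"Pearson r after median imputation: {r:.3f}")
```

On the real file, the same check would show how much the imputation strategy shifts the correlation between the two targets.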

    💡 Use Cases

    • Exploratory Data Analysis (EDA)
    • Feature engineering pipelines
    • Machine learning model benchmarking
    • Statistical hypothesis testing
    • Burnout and mental health prediction projects

    📥 Bonus

    👉 Sample notebook coming soon with data cleaning, visualization, and productivity prediction!

  7. Instagram: most used hashtags 2024

    • statista.com
    • es.statista.com
    Cite
    Statista Research Department, Instagram: most used hashtags 2024 [Dataset]. https://www.statista.com/topics/1164/social-networks/
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Statista Research Department
    Description

    As of January 2024, #love was the most used hashtag on Instagram, being included in over two billion posts on the social media platform. #Instagood and #instagram were used over one billion times as of early 2024.

  8. News Popularity in Multiple Social Media Platforms

    • kaggle.com
    zip
    Updated Oct 28, 2020
    Cite
    Nikhil John (2020). News Popularity in Multiple Social Media Platforms [Dataset]. https://www.kaggle.com/nikhiljohnk/news-popularity-in-multiple-social-media-platforms
    Available download formats: zip (10881978 bytes)
    Dataset updated
    Oct 28, 2020
    Authors
    Nikhil John
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    Social media has taken over much of the Internet. People use it to get the latest news, find useful resources, meet life partners, and more. In a world where social media plays a big role in delivering news, news that affects our sentiments tends to spread like wildfire. Based on the headline and the title, the given date, and the social media platforms, you have to predict how each news item affected the human sentiment scores. You have to predict the columns “SentimentTitle” and “SentimentHeadline”.

    Content

    This is a subset of the dataset of the same name available in the UCI Machine Learning Repository. The collected data relates to a period of 8 months, between November 2015 and July 2016, accounting for about 100,000 news items on four different topics: economy, microsoft, obama and palestine.

    Dataset Information

    The attributes of the dataset are:

    • IDLink (numeric): Unique identifier of news items
    • Title (string): Title of the news item according to the official media sources
    • Headline (string): Headline of the news item according to the official media sources
    • Source (string): Original news outlet that published the news item
    • Topic (string): Query topic used to obtain the items in the official media sources
    • Publish-Date (timestamp): Date and time of the news items' publication
    • Facebook (numeric): Final value of the news items' popularity according to the social media source Facebook
    • Google-Plus (numeric): Final value of the news items' popularity according to the social media source Google+
    • LinkedIn (numeric): Final value of the news items' popularity according to the social media source LinkedIn
    • SentimentTitle: Sentiment score of the title. The higher the score, the more positive the sentiment, and vice versa. (Target Variable 1)
    • SentimentHeadline: Sentiment score of the text in the news items' headline. The higher the score, the more positive the sentiment. (Target Variable 2)
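Since the task is to predict the two sentiment columns, a useful first step is a trivial baseline to compare later models against. The sketch below computes the in-sample mean absolute error of always predicting the column mean. The four rows are invented for illustration and only reuse the attribute names listed above. The choice of MAE as the metric is an assumption, not something the dataset description specifies.

```python
# Made-up rows that reuse the listed attribute names; values are illustrative only.
news = [
    {"IDLink": 1, "Topic": "economy", "Facebook": 12, "SentimentTitle": 0.1, "SentimentHeadline": 0.0},
    {"IDLink": 2, "Topic": "obama", "Facebook": 40, "SentimentTitle": -0.1, "SentimentHeadline": 0.5},
    {"IDLink": 3, "Topic": "microsoft", "Facebook": 7, "SentimentTitle": 0.2, "SentimentHeadline": -0.5},
    {"IDLink": 4, "Topic": "palestine", "Facebook": 3, "SentimentTitle": -0.2, "SentimentHeadline": 0.0},
]


def mean_baseline_mae(records, target):
    """In-sample mean absolute error of always predicting the mean of `target`."""
    values = [r[target] for r in records]
    mean = sum(values) / len(values)
    return sum(abs(v - mean) for v in values) / len(values)


baselines = {
    target: mean_baseline_mae(news, target)
    for target in ("SentimentTitle", "SentimentHeadline")
}
for target, mae in baselines.items():
    print(f"{target}: baseline MAE = {mae:.3f}")
```

Any model trained on Title, Headline, Topic, or the popularity columns should beat these numbers on held-out data to be worth keeping.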

  9. Social Media Usage By Country

    • searchlogistics.com
    Updated Apr 1, 2025
    Cite
    (2025). Social Media Usage By Country [Dataset]. https://www.searchlogistics.com/learn/statistics/social-media-addiction-statistics/
    Dataset updated
    Apr 1, 2025
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The results might surprise you when looking at internet users that are active on social media in each country.

  10. Impact of Digital Habits on Mental Health

    • kaggle.com
    Updated Jun 14, 2025
    Cite
    Shahzad Aslam (2025). Impact of Digital Habits on Mental Health [Dataset]. https://www.kaggle.com/datasets/zeesolver/mental-health
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jun 14, 2025
    Dataset provided by
    Kaggle
    Authors
    Shahzad Aslam
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    This dataset explores the relationship between digital behavior and mental well-being among 100,000 individuals. It records how much time people spend on screens, use of social media (including TikTok), and how these habits may influence their sleep, stress, and mood levels.

    It includes six numerical features, all clean and ready for analysis, making it ideal for machine learning tasks like regression or classification. The data enables researchers and analysts to investigate how modern digital lifestyles may impact mental health indicators in measurable ways.

    Dataset Applications

    • Quantify how screen‑time, TikTok use, or multi‑platform engagement statistically relate to stress, sleep loss, and mood.
    • Train regression or classification models that forecast stress level or mood score from real‑time digital‑usage metrics.
    • Feed user‑specific data into recommender systems that suggest screen‑time caps or bedtime routines to improve mental health.
    • Provide evidence for guidelines on youth screen‑time limits and platform moderation based on observed stress‑sleep trade‑offs.
    • Serve as a teaching dataset for EDA, feature engineering, and model evaluation in data‑science or psychology curricula.
    • Evaluate app interventions (e.g., screen‑time nudges) by comparing predicted versus actual post‑intervention stress or mood shifts.
    • Cluster individuals into digital‑behavior personas (e.g., “heavy late‑night scrollers”) to tailor mental‑health resources.
    • Generate synthetic time‑series scenarios (what‑if reductions in TikTok hours) to estimate downstream impacts on sleep and stress.
    • Use engineered features (ratio of TikTok hours to total screen‑time, etc.) in broader wellbeing models that include diet or exercise data.
    • Assess whether mental‑health prediction models remain accurate and unbiased across different screen‑time or platform‑use segments.

    Column Descriptions

    • screen_time_hours – Daily total screen usage in hours across all devices.
    • social_media_platforms_used – Number of different social media platforms used per day.
    • hours_on_TikTok – Time spent on TikTok daily, in hours.
    • sleep_hours – Average number of sleep hours per night.
    • stress_level – Stress intensity reported on a scale from 1 (low) to 10 (high).
    • mood_score – Self-rated mood on a scale from 2 (poor) to 10 (excellent).

    Inspiration

    This dataset was inspired by growing concerns about how screen time and social media affect mental health. It enables analysis of the links between digital habits, stress, sleep, and mood, encouraging data-driven solutions for healthier online behavior and emotional well-being.

    Ethically Mined Data

    This dataset has been ethically mined and synthetically generated without collecting any personally identifiable information. All values are artificial but statistically realistic, allowing safe use in academic, research, and public health projects while fully respecting user privacy and data ethics.

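One of the applications listed for this dataset is an engineered feature: the ratio of TikTok hours to total screen time. A minimal sketch of that feature follows. The helper name `tiktok_share` and the sample records are hypothetical; only the column names `screen_time_hours` and `hours_on_TikTok` come from the description. Returning `None` for zero screen time is an assumption about how to handle the undefined case.

```python
def tiktok_share(record):
    """Fraction of daily screen time spent on TikTok, or None when screen time is zero."""
    total = record["screen_time_hours"]
    if total <= 0:
        return None  # ratio is undefined without any screen time
    return record["hours_on_TikTok"] / total


# Made-up records for illustration only.
samples = [
    {"screen_time_hours": 8.0, "hours_on_TikTok": 2.0},
    {"screen_time_hours": 0.0, "hours_on_TikTok": 0.0},
]

features = [{**s, "tiktok_share": tiktok_share(s)} for s in samples]
for row in features:
    print(row)
```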
  11. MultiSocial

    • zenodo.org
    Updated May 21, 2025
    Cite
    Dominik Macko; Jakub Kopal; Robert Moro; Ivan Srba (2025). MultiSocial [Dataset]. http://doi.org/10.5281/zenodo.13846152
    Dataset updated
    May 21, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Dominik Macko; Jakub Kopal; Robert Moro; Ivan Srba
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    MultiSocial is a dataset (described in a paper) for benchmarking multilingual (22 languages) machine-generated text detection in the social-media domain (5 platforms). It contains 472,097 texts, of which about 58k are human-written and approximately the same amount is generated by each of 7 multilingual large language models by using 3 iterations of paraphrasing. The dataset has been anonymized to minimize the amount of sensitive data by hiding email addresses, usernames, and phone numbers.

    If you use this dataset in any publication, project, tool or in any other form, please cite the paper.

    Disclaimer

    Due to the data source (described below), the dataset may contain harmful content, disinformation, or offensive content. Based on a multilingual toxicity detector, about 8% of the text samples are probably toxic (from 5% in WhatsApp to 10% in Twitter). Although we have used older data sources (with a lower probability of including machine-generated texts), the labeling (of human-written text) might not be 100% accurate. The anonymization procedure might not successfully hide all the sensitive/personal content; thus, use the data cautiously (if you feel affected by such content, report the issues found to dpo[at]kinit.sk). The intended use is for non-commercial research purposes only.

    Data Source

    The human-written part consists of a pseudo-randomly selected subset of social media posts from 6 publicly available datasets:

    1. Telegram data originated in Pushshift Telegram, containing 317M messages (Baumgartner et al., 2020). It contains messages from 27k+ channels. The collection started with a set of right-wing extremist and cryptocurrency channels (about 300 in total) and was expanded based on occurrence of forwarded messages from other channels. In the end, it thus contains a wide variety of topics and societal movements reflecting the data collection time.

    2. Twitter data originated in CLEF2022-CheckThat! Task 1, containing 34k tweets on COVID-19 and politics (Nakov et al., 2022), combined with Sentiment140, containing 1.6M tweets on various topics (Go et al., 2009).

    3. Gab data originated in the dataset containing 22M posts from Gab social network. The authors of the dataset (Zannettou et al., 2018) found out that “Gab is predominantly used for the dissemination and discussion of news and world events, and that it attracts alt-right users, conspiracy theorists, and other trolls.” They also found out that hate speech is much more prevalent there compared to Twitter, but lower than 4chan's Politically Incorrect board.

    4. Discord data originated in Discord-Data, containing 51M messages. This is a long-context, anonymized, clean, multi-turn and single-turn conversational dataset based on Discord data scraped from a large variety of servers, big and small. According to the dataset authors, it contains around 0.1% of potentially toxic comments (based on the applied heuristic/classifier).

    5. WhatsApp data originated in whatsapp-public-groups, containing 300k messages (Garimella & Tyson, 2018). The public dataset contains the anonymised data, collected for around 5 months from around 178 groups. Original messages were made available to us on request to dataset authors for research purposes.

    From these datasets, we have pseudo-randomly sampled up to 1300 texts (up to 300 for test split and the remaining up to 1000 for train split if available) for each of the selected 22 languages (using a combination of automated approaches to detect the language) and platform. This process resulted in 61,592 human-written texts, which were further filtered out based on occurrence of some characters or their length, resulting in about 58k human-written texts.
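The capped sampling described above (up to 300 test texts, then up to 1000 train texts, per language and platform) can be sketched as follows. This is an illustration under assumptions, not the authors' actual procedure: the function name `sample_splits`, the fixed seed, and the synthetic records are all hypothetical, and filling the test split before the train split is one reading of "the remaining up to 1000 for train".

```python
import random
from collections import Counter, defaultdict


def sample_splits(records, test_cap=300, train_cap=1000, seed=0):
    """Shuffle each (language, source) group with a seeded RNG, then cap test and train."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for rec in records:
        groups[(rec["language"], rec["source"])].append(rec)

    sampled = []
    for key in sorted(groups):  # sorted keys keep the result reproducible
        pool = groups[key][:]
        rng.shuffle(pool)
        test = pool[:test_cap]
        train = pool[test_cap:test_cap + train_cap]
        sampled += [{**r, "split": "test"} for r in test]
        sampled += [{**r, "split": "train"} for r in train]
    return sampled


# Synthetic stand-ins: one large group and one small group.
records = [{"text": f"en-{i}", "language": "en", "source": "telegram"} for i in range(1500)]
records += [{"text": f"sk-{i}", "language": "sk", "source": "gab"} for i in range(350)]

sampled = sample_splits(records)
counts = Counter((r["language"], r["split"]) for r in sampled)
print(counts)
```

The small group shows the "if available" case: it fills the test split first and leaves only the remainder for training.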

    The machine-generated part contains texts generated by 7 LLMs (Aya-101, Gemini-1.0-pro, GPT-3.5-Turbo-0125, Mistral-7B-Instruct-v0.2, opt-iml-max-30b, v5-Eagle-7B-HF, vicuna-13b). All these models were self-hosted except for GPT and Gemini, where we used the publicly available APIs. We generated the texts using 3 paraphrases of the original human-written data and then preprocessed the generated texts (filtered out cases when the generation obviously failed).

    The dataset has the following fields:

    • 'text' - a text sample,

    • 'label' - 0 for human-written text, 1 for machine-generated text,

    • 'multi_label' - a string representing a large language model that generated the text or the string "human" representing a human-written text,

    • 'split' - a string identifying train or test split of the dataset for the purpose of training and evaluation respectively,

    • 'language' - the ISO 639-1 language code identifying the detected language of the given text,

    • 'length' - word count of the given text,

    • 'source' - a string identifying the source dataset / platform of the given text,

    • 'potential_noise' - 0 for text without identified noise, 1 for text with potential noise.
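A short sketch of how the fields above might be used to select a clean training subset and count texts per generator. The three records are invented, and the exact `multi_label` string for a model (here "gpt-3.5-turbo-0125") is a guess at the format, not a value confirmed by the dataset.

```python
from collections import Counter

# Invented records that follow the field list; values are illustrative only.
records = [
    {"text": "hello there", "label": 0, "multi_label": "human", "split": "train",
     "language": "en", "length": 2, "source": "telegram", "potential_noise": 0},
    {"text": "greetings to you", "label": 1, "multi_label": "gpt-3.5-turbo-0125", "split": "train",
     "language": "en", "length": 3, "source": "telegram", "potential_noise": 0},
    {"text": "@@@ ###", "label": 1, "multi_label": "gpt-3.5-turbo-0125", "split": "test",
     "language": "en", "length": 2, "source": "gab", "potential_noise": 1},
]

# Sanity check: the binary label should agree with the multi-class label.
consistent = all((r["label"] == 0) == (r["multi_label"] == "human") for r in records)

# Keep only train-split texts without flagged noise.
train_clean = [r for r in records if r["split"] == "train" and r["potential_noise"] == 0]
per_generator = Counter(r["multi_label"] for r in train_clean)

print(consistent, per_generator)
```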

    ToDo Statistics (under construction)

  12. Tweets and User Engagement

    • kaggle.com
    Updated Dec 6, 2023
    Cite
    The Devastator (2023). Tweets and User Engagement [Dataset]. https://www.kaggle.com/datasets/thedevastator/tweets-and-user-engagement
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Dec 6, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    The Devastator
    Description

    Tweets and User Engagement

    Twitter Data: Tweet Characteristics and Engagement Metrics

    By Krystal Jensen [source]

    About this dataset

    The dataset Twitter Data: Tweets and User Interactions provides comprehensive information about tweets and user interactions on the popular social media platform Twitter. The dataset includes various attributes that shed light on the characteristics and engagement metrics of tweets, allowing for in-depth analysis of user behavior and content performance.

    One of the key variables in this dataset is the Klout score, which represents the influence and reputation of the Twitter users who posted the tweets. This numeric metric helps assess the impact a user has on their audience and provides insights into their social media presence.

    Another essential attribute is the text content of each tweet. By examining this textual data, analysts can uncover valuable information about trending topics, opinions, sentiments, conversations, or news shared by users. It serves as a primary source for understanding what people share publicly on Twitter.

    The dataset Twitter+data+in+sheets.csv serves as a reliable resource for conducting research or performing analytics that require detailed information about Twitter activity. It covers aspects such as tweet characteristics (including length and language), engagement metrics (such as retweets and favorites), sentiment analysis (revealing positive or negative emotions expressed), as well as individual user details.

    By utilizing this extensive dataset, researchers can gain valuable insights into patterns of online communication within Twitter's vast network. They can identify influential individuals with high Klout scores who have substantial reach among their followers or communities. Additionally, they can analyze various aspects related to tweet content such as sentiment analysis to understand public opinion trends or measure engagement levels through counts like retweets and favorites.

    Overall, this dataset serves as a resource for anyone interested in analyzing tweet characteristics, exploring how users interact with tweets across dimensions such as popularity or sentiment group, or examining correlations between Klout scores and other factors that influence engagement, such as time posted.

    How to use the dataset

    Welcome to the Twitter Data: Tweets and User Interactions dataset! This dataset provides valuable insights into tweet characteristics and user engagement on Twitter. Here is a useful guide on how to make the most out of this dataset:

    • Understanding the Columns: There are two main columns in this dataset:

      • Klout Score (Numeric): The Klout score indicates the influence of the user who posted the tweet. A higher Klout score suggests greater influence and reach.
      • Text Content of Tweet (Text): This column contains the actual text content of each tweet.
    • Analyzing Tweet Characteristics: The text content column will help you understand various aspects of tweets, such as language, sentiment, trending topics, or specific keywords used by users. You can perform text analysis techniques like word frequency analysis or sentiment analysis to gain insights into tweet characteristics.

    • Examining User Engagement: The Klout score provides a measure of user influence on Twitter. By analyzing this column, you can identify highly influential users who generate higher engagement rates with their tweets. You can further explore interactions (likes, retweets, replies) between these influential users and other Twitter users mentioned in their tweets.

    • Identifying Trends and Patterns: With this dataset's rich information about tweet content and user engagement, you can identify popular trends or patterns among highly engaged tweets or influential users over different time periods.
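The word-frequency analysis suggested in the guide above can be sketched with the standard library alone. The dictionary keys `klout` and `text`, the threshold of 50, and the three tweets are all hypothetical; the real CSV's column headers are not given in the description and would need to be checked.

```python
import re
from collections import Counter

# Invented tweets for illustration; keys are placeholders for the real column names.
tweets = [
    {"klout": 60, "text": "Launch day! New data release is live"},
    {"klout": 20, "text": "coffee first, data later"},
    {"klout": 55, "text": "New release notes for the data tools"},
]


def word_counts(texts):
    """Count lowercase word tokens across an iterable of strings."""
    counts = Counter()
    for text in texts:
        counts.update(re.findall(r"[a-z']+", text.lower()))
    return counts


threshold = 50  # arbitrary cut-off between "high" and "low" Klout users
high = word_counts(t["text"] for t in tweets if t["klout"] >= threshold)
low = word_counts(t["text"] for t in tweets if t["klout"] < threshold)

print("high-Klout top words:", high.most_common(3))
print("low-Klout top words:", low.most_common(3))
```

Comparing the two counters gives a first look at whether influential accounts use different vocabulary; a sentiment score per tweet could be aggregated the same way.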

    Remember that dates are not included in this guide since they were not provided in the original request for creating it.

    Please note that it is essential to responsibly use this data for any analysis or research purposes while adhering to ethical considerations related to privacy rights and data usage policies set by both Kaggle platform rules as well as any relevant privacy regulations.


    Research Ideas

    • Analyzing the relationship between Klout score and the content of tweets: This dataset can be used to investigate whether there is a correlation between a user's Klout score (a measure of their social media influence) and the characteristics of their tweets. By examining factors such as tweet length, sentiment, and engagement metrics, researchers can gain...
  13. Instagram: most popular posts as of 2024

    • statista.com
    • es.statista.com
    Cite
    Stacy Jo Dixon, Instagram: most popular posts as of 2024 [Dataset]. https://www.statista.com/topics/1164/social-networks/
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Stacy Jo Dixon
    Description

    Instagram’s most popular post

    As of April 2024, the most popular post on Instagram was Lionel Messi and his teammates after winning the 2022 FIFA World Cup with Argentina, posted by the account @leomessi. Messi's post, which racked up over 61 million likes within a day, knocked off the reigning post, which was 'Photo of an Egg'. Originally posted in January 2019, 'Photo of an Egg' surpassed the world’s most popular Instagram post at that time, which was a photo of Kylie Jenner’s daughter totaling 18 million likes. After several cryptic posts published by the account, World Record Egg revealed itself to be part of a mental health campaign aimed at the pressures of social media use.

    Instagram’s most popular accounts

    As of April 2024, the official Instagram account @instagram had the most followers of any account on the platform, with 672 million followers. Portuguese footballer Cristiano Ronaldo (@cristiano) was the most followed individual with 628 million followers, while Selena Gomez (@selenagomez) was the most followed woman on the platform with 429 million. Additionally, Inter Miami CF striker Lionel Messi (@leomessi) had a total of 502 million. Celebrities such as The Rock, Kylie Jenner, and Ariana Grande all had over 380 million followers each.

    Instagram influencers

    In the United States, the leading content category of Instagram influencers was lifestyle, with 15.25 percent of influencers creating lifestyle content in 2021. Music ranked in second place with 10.96 percent, followed by family with 8.24 percent. Having a large audience can be very lucrative: Instagram influencers in the United States, Canada and the United Kingdom with over 90,000 followers made around 1,221 US dollars per post.

    Instagram around the globe

    Instagram’s worldwide popularity continues to grow, and India is the leading country in terms of number of users, with over 362.9 million users as of January 2024. The United States had 169.65 million Instagram users and Brazil had 134.6 million users. The social media platform was also very popular in Indonesia and Turkey, with 100.9 million and 57.1 million users, respectively. As of January 2024, Instagram was the fourth most popular social network in the world, behind Facebook, YouTube and WhatsApp.
    
  14. Social Media Platforms in the UK - Market Research Report (2015-2030)

    • ibisworld.com
    Updated Aug 25, 2024
    Cite
    IBISWorld (2024). Social Media Platforms in the UK - Market Research Report (2015-2030) [Dataset]. https://www.ibisworld.com/united-kingdom/market-research-reports/social-media-platforms-industry/
    Dataset updated
    Aug 25, 2024
    Dataset authored and provided by
    IBISWorld
    License

    https://www.ibisworld.com/about/termsofuse/

    Time period covered
    2014 - 2029
    Area covered
    United Kingdom
    Description

    Social media platforms are integral to people's lives, offering ways to communicate, create and view content and share information. According to Ofcom, approximately 89% of UK internet users in 2023 used social media apps or sites. Teenagers and young adults are the biggest users, although there is rapid uptake among older age groups. Advertising is the primary revenue source for social media platforms, although subscription-based services are gaining momentum as platforms seek to diversify their incomes. TikTok is the success story of the last few years, becoming the most downloaded app between 2020 and 2022, according to Apptopia. The short-form video platform reported that it averaged revenue growth of over 450% between 2019 and 2022. After Musk's takeover, X, formerly known as Twitter, adjusted its content moderation and allowed previously banned accounts to return. As a result, over 600 advertisers have pulled their ads from the site because of fears their brand may be associated with malcontent. In response to falling ad revenue, X has introduced a subscription-based service which enables users to verify themselves and boosts the number of people who view their tweets. Meta-owned Facebook and Instagram have responded by introducing a similar service.

    Revenue is expected to grow by 14.3% in 2024-25, constrained by a slowdown in user growth for most major social media platforms. Over the five years through 2024-25, revenue is forecast to expand at a compound annual rate of 32.8% to reach £9.8 billion.

    Looking forward, regulations relating to how data is collected, stored, and shared will force advertisers and platforms to rethink how they can target their desired demographics. The rising prominence of AI will require the introduction of adequate regulations. The Online Safety Bill sets out new guidelines for social media platforms to abide by, with hefty fines in store for those who do not. Operating costs will swell as platforms look to meet consumers’ expectations, weighing on profit. Over the five years through 2029-30, social media platforms' revenue is projected to climb at an estimated 9.4% to reach £15.4 billion.

  15. Facebook Metrics Dataset

    • kaggle.com
    Updated Aug 9, 2024
    Cite
    Dileep Naidu (2024). Facebook Metrics Dataset [Dataset]. https://www.kaggle.com/datasets/dileeppatchaone/facebook-metrics-dataset-of-cosmetic-brand
    Dataset updated
    Aug 9, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Dileep Naidu
    Description

    This dataset contains information about posts made on a famous cosmetic brand's Facebook page from the 1st of January to the 31st of December 2014. Each row represents a single post and includes the following attributes:

    1. Page total likes: The total number of likes for the page at the time of the post. Example: 139441
    2. Type: The type of post (e.g., photo, status, video, link). Example: Photo
    3. Category: A categorical variable representing the content category of the post (the specific meaning of the categories is not defined in the provided data). Example: 2
    4. Post Month: The month the post was published (likely represented numerically, e.g., 12 for December). Example: 12
    5. Post Weekday: The day of the week the post was published (likely represented numerically, e.g., 1 for Monday). Example: 4
    6. Post Hour: The hour of the day the post was published (likely in 24-hour format). Example: 3
    7. Paid: A binary variable indicating whether the post was a paid advertisement (1 for yes, 0 for no). Example: 0
    8. Lifetime Post Total Reach: The total number of unique people who saw the post during its lifetime. Example: 2752
    9. Lifetime Post Total Impressions: The total number of times the post was displayed, regardless of whether it was clicked or seen. Example: 5091
    10. Lifetime Engaged Users: The number of unique people who engaged with the post (e.g., liked, commented, shared, clicked). Example: 178
    11. Lifetime Post Consumers: The number of unique people who clicked anywhere in the post. Example: 109
    12. Lifetime Post Consumptions: The total number of clicks anywhere in the post. Example: 159
    13. Lifetime Post Impressions by people who have liked your Page: The number of times the post was shown to people who liked the page. Example: 3078
    14. Lifetime Post reach by people who like your Page: The number of people who like the page that saw the post. Example: 1640
    15. Lifetime People who have liked your Page and engaged with your post: The number of people who liked the page that engaged with the post. Example: 119
    16. comment: Number of comments on the post. Example: 4
    17. like: Number of likes on the post. Example: 79
    18. share: Number of shares of the post. Example: 17
    19. Total Interactions: Total number of interactions with the post (likely sum of comments, likes, and shares). Example: 100
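
    The attribute list notes that Total Interactions is likely the sum of comments, likes, and shares. A quick way to check that against the example values above is sketched below. The short key names are illustrative rather than the dataset's real column headers, the sketch assumes the example values all come from the same row, and the engagement-rate definition (engaged users divided by reach) is an assumption, not something the dataset prescribes.

```python
# Example values copied from the attribute list above; the short keys are
# illustrative names, not the dataset's actual column headers.
post = {
    "reach": 2752,
    "impressions": 5091,
    "engaged_users": 178,
    "comment": 4,
    "like": 79,
    "share": 17,
    "total_interactions": 100,
}

def interactions_consistent(row):
    """Check whether total interactions equals comments + likes + shares."""
    return row["comment"] + row["like"] + row["share"] == row["total_interactions"]

def engagement_rate(row):
    """Engaged users as a fraction of unique people reached (an assumed definition)."""
    return row["engaged_users"] / row["reach"] if row["reach"] else 0.0

print(interactions_consistent(post))
print(round(engagement_rate(post), 4))
```

    Running the same consistency check over every row would show whether the "likely sum" reading holds for the whole file.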

    Citation: (Moro et al., 2016) S. Moro, P. Rita and B. Vala. Predicting social media performance metrics and evaluation of the impact on brand building: A data mining approach. Journal of Business Research, Elsevier, In press. Available at: http://dx.doi.org/10.1016/j.jbusres.2016.02.010

  16. Social Media Dataset of Covid-aware Publics

    • ordo.open.ac.uk
    csv
    Updated Sep 30, 2024
    Cite
    Christian Nold (2024). Social Media Dataset of Covid-aware Publics [Dataset]. http://doi.org/10.21954/ou.rd.27044467.v1
    Available download formats: csv
    Dataset updated
    Sep 30, 2024
    Dataset provided by
    The Open University
    Authors
    Christian Nold
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is a dataset of tweets by and about COVID-aware publics from the 'X' (Twitter) social media platform collected by the author. The dataset consists of 344 textual tweets regarding COVID-related material practices gathered during the research period Jan 2023 - Sep 2024, yet the dataset also includes tweets created before this period. The textual data has been rewritten to fully anonymise the people who made the tweets, and identifiable contexts have been removed. In addition, all date/time metadata and hashtags, as well as any attached images, have been removed. Square brackets have been used for editorial edits to obfuscate entities or add context to tweets. The dataset consists of a structured comma-separated text file that can be read in any spreadsheet software to maximise accessibility. The research dataset was created with Open University HREC approval: HREC/4557/Nold

  17. Data from: A Dataset on 'Social media and India’s Foreign Policy: The Case...

    • data.mendeley.com
    Updated Dec 19, 2024
    Cite
    Mukund Narvenkar (2024). A Dataset on 'Social media and India’s Foreign Policy: The Case Study of ‘X’ Diplomacy during the Covid-19 Pandemic' [Dataset]. http://doi.org/10.17632/xfr9y9ggkm.3
    Dataset updated
    Dec 19, 2024
    Authors
    Mukund Narvenkar
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    India
    Description

    Social media platforms have become integral tools in the conduct of foreign policy for many nations, including India. This dataset serves as a resource for analyzing ‘Social Media and India’s Foreign Policy: The Case Study of ‘X’ Diplomacy during the Covid-19 Pandemic.’ The data were collected through a web-based questionnaire distributed primarily to people aged 18 to 61 and above in India. A total of 171 valid responses were collected from 17 states, offering extensive geographic coverage, and stored in Mendeley. The 15 contributor states are Goa, Maharashtra, Tamil Nadu, Gujarat, Delhi, Assam, Haryana, Jammu and Kashmir, Karnataka, Kerala, Punjab, Rajasthan, Tripura, Uttar Pradesh and West Bengal. The questionnaire encompasses diverse question formats, including single-choice, multiple-choice, quizzes, and open-ended. The study underscores the opportunities and challenges of employing 'X' diplomacy in India's foreign policy. It tests two hypotheses. First, India's effective use of 'X' diplomacy positively impacts public perception of India's foreign policy effectiveness. Second, India's adept use of 'X' diplomacy during the COVID-19 pandemic enhances its ability to manage and respond to the crisis effectively. The data show public perception of the effective use of social media by the Government of India, particularly in a crisis situation. The data also highlight the significant change in India’s narrative through its ‘X’ diplomacy, effectively setting the narratives, public perceptions, and diplomatic strategies. This data can be used to study the significance of social media in India’s foreign policy, the role of social media like ‘X’ in the making of India’s foreign policy, how effective social media like ‘X’ was during the Covid-19 pandemic, and how the Indian government used social media like ‘X’ to deliver messages and set the narrative in international politics.

  18. Social Media Addiction Statistics

    • searchlogistics.com
    Updated Apr 1, 2025
    Cite
    Search Logistics (2025). Social Media Addiction Statistics [Dataset]. https://www.searchlogistics.com/learn/statistics/social-media-addiction-statistics/
    Dataset updated
    Apr 1, 2025
    Dataset authored and provided by
    Search Logistics
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    In this post, I'll give you all the social media addiction statistics you need to be aware of to moderate your social media use.

  19. Relational Data Engineering

    • kaggle.com
    Updated Oct 31, 2020
    Cite
    Md Iqbal Hossain (2020). Relational Data Engineering [Dataset]. https://www.kaggle.com/iqbalrony/relational-data-engineering
    Dataset updated
    Oct 31, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Md Iqbal Hossain
    Description

    Context

    This dataset contains four tables about social media users:

    1. Users.csv contains data about the users of the social net.
    2. Friends.csv contains data about their friendship status.
    3. Posts.csv contains data about the posts they made.
    4. Reactions.csv contains data about the reactions to posts by their friends.

    Task: Combine the tables in a way that lets you answer the following questions:

    1. What is the most common name on the social net? How many people have that name?
    2. List the five people with the most posts and reactions combined.
    3. Create a plot of the friendship graph for all users named „Jean-Luc Picard“ (up to degree 2).
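
    One possible way to approach the three questions is sketched below using only the Python standard library. The in-memory tables are hypothetical stand-ins for the four CSV files, whose real column names are not given here. The sketch also assumes that "reactions" means reactions a user made (not received), and it stops at collecting the nodes for the degree-2 friendship graph instead of drawing it.

```python
from collections import Counter, deque

# Hypothetical stand-ins for Users.csv, Friends.csv, Posts.csv and Reactions.csv.
users = {1: "Jean-Luc Picard", 2: "William Riker", 3: "Jean-Luc Picard", 4: "Data", 5: "Worf"}
friends = [(1, 2), (2, 4), (4, 5), (3, 5)]  # undirected friendship pairs
posts = [1, 1, 2, 4]                        # author id of each post
reactions = [2, 2, 4, 5, 1]                 # id of the user behind each reaction

# Question 1: the most common name and how many users carry it.
name, count = Counter(users.values()).most_common(1)[0]

# Question 2: up to five users with the most posts and reactions combined.
activity = Counter(posts) + Counter(reactions)
top_five = [users[uid] for uid, _ in activity.most_common(5)]

# Question 3: ids within max_degree friendship hops of a user (breadth-first search).
def within_degree(start, edges, max_degree=2):
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    seen = {start: 0}  # node id -> distance from start
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_degree:
            continue  # do not expand beyond the degree limit
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen[nxt] = seen[node] + 1
                queue.append(nxt)
    return set(seen)

picards = [uid for uid, n in users.items() if n == "Jean-Luc Picard"]
subgraph_nodes = set().union(*(within_degree(uid, friends) for uid in picards))
print(name, count, top_five, sorted(subgraph_nodes))
```

    The resulting node set, together with the friendship pairs restricted to it, could then be handed to a graph-plotting library of your choice.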

  20. Social Buzz Popularity index

    • kaggle.com
    Updated Jun 18, 2023
    Cite
    RohanMishra75 (2023). Social Buzz Popularity index [Dataset]. https://www.kaggle.com/datasets/rohanmishra75/social-buzz-popularity-index/data
    Dataset updated
    Jun 18, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    RohanMishra75
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    Social Buzz is a social media platform that helps you interact with other people. On it, you can react to other people's posts in many ways, including liking them. In this dataset, I identified the top 5 categories that are most searched or used by people on the platform.

