100+ datasets found
  1. Average daily time spent on social media worldwide 2012-2024

    • statista.com
    • grusthub.com
    • +4more
    Cite
    Stacy Jo Dixon, Average daily time spent on social media worldwide 2012-2024 [Dataset]. https://www.statista.com/topics/1164/social-networks/
    Explore at:
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Stacy Jo Dixon
    Description

    How much time do people spend on social media?

                  As of 2024, the average daily social media usage of internet users worldwide amounted to 143 minutes per day, down from 151 minutes in the previous year. Currently, the country with the most time spent on social media per day is Brazil, with online users spending an average of three hours and 49 minutes on social media each day. In comparison, the daily time spent with social media in the U.S. was just two hours and 16 minutes.
    
                  Global social media usage
                  Currently, the global social network penetration rate is 62.3 percent. Northern Europe had an 81.7 percent social media penetration rate, topping the ranking of global social media usage by region. Eastern and Middle Africa closed the ranking with 10.1 and 9.6 percent usage reach, respectively. People access social media for a variety of reasons. Users like to find funny or entertaining content and enjoy sharing photos and videos with friends, but mainly use social media to stay in touch with current events and friends.
    
                  Global impact of social media
                  Social media has a wide-reaching and significant impact on not only online activities but also offline behavior and life in general. During a global online user survey in February 2019, a significant share of respondents stated that social media had increased their access to information, ease of communication, and freedom of expression. On the flip side, respondents also felt that social media had worsened their personal privacy, increased polarization in politics, and heightened everyday distractions.
    
  2. Number of global social network users 2017-2028

    • statista.com
    • grusthub.com
    • +4more
    Cite
    Stacy Jo Dixon, Number of global social network users 2017-2028 [Dataset]. https://www.statista.com/topics/1164/social-networks/
    Explore at:
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Stacy Jo Dixon
    Description

    How many people use social media?

                  Social media usage is one of the most popular online activities. In 2024, over five billion people were using social media worldwide, a number projected to increase to over six billion in 2028.
    
                  Who uses social media?
                  Social networking is one of the most popular digital activities worldwide and it is no surprise that social networking penetration across all regions is constantly increasing. As of January 2023, the global social media usage rate stood at 59 percent. This figure is anticipated to grow as lesser developed digital markets catch up with other regions when it comes to infrastructure development and the availability of cheap mobile devices. In fact, most of social media’s global growth is driven by the increasing usage of mobile devices. Mobile-first market Eastern Asia topped the global ranking of mobile social networking penetration, followed by established digital powerhouses such as the Americas and Northern Europe.
    
                  How much time do people spend on social media?
                  Social media is an integral part of daily internet usage. On average, internet users spend 151 minutes per day on social media and messaging apps, an increase of 40 minutes since 2015. On average, internet users in Latin America had the highest average time spent per day on social media.
    
                  What are the most popular social media platforms?
                  Market leader Facebook was the first social network to surpass one billion registered accounts and currently boasts approximately 2.9 billion monthly active users, making it the most popular social network worldwide. In June 2023, the top social media apps in the Apple App Store included mobile messaging apps WhatsApp and Telegram Messenger, as well as the ever-popular app version of Facebook.
    
  3. Social Media Engagement (2025)

    • kaggle.com
    Updated Mar 21, 2025
    Cite
    Damla Ağaça (2025). Social Media Engagement (2025) [Dataset]. https://www.kaggle.com/datasets/dagaca/social-media-engagement-2025
    Explore at:
    Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Mar 21, 2025
    Dataset provided by
    Kaggle
    Authors
    Damla Ağaça
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Social Media Engagement (2025)

    This dataset contains 20,000 synthetic social media posts crafted to mimic realistic user activity on a fictional platform. It simulates various user demographics, post content, hashtags, topics, and detailed engagement metrics such as likes, comments, and shares.

    Overview

    Each record represents a unique social media post made by a user, enriched with features that allow for analysis of trends, behavior, and engagement. The dataset includes:

    • User-level information: age, gender, followers, verified status, etc.
    • Post-level information: topic, hashtags, media, engagement
    • Platform and device data
    • Calculated engagement rate

    Column Descriptions

    | Column | Description |
    |:---|:---|
    | post_id | Unique identifier for each post |
    | user_id | Unique identifier for each user |
    | user_name | Synthetic username |
    | user_gender | Gender of the user (Male, Female, Other) |
    | user_age | Age of the user (16–60) |
    | followers_count | Number of followers the user has |
    | following_count | Number of accounts the user follows |
    | account_creation_date | Account registration date |
    | is_verified | Boolean flag for verified users |
    | location | City or region where the user is located |
    | topic | Main topic of the post (e.g., Travel, Food, Fashion, etc.) |
    | post_content | Actual content of the post |
    | content_length | Number of characters in the post content |
    | hashtags | Relevant hashtags used in the post |
    | has_media | Whether the post includes image or video |
    | post_date | Timestamp of when the post was made |
    | device | Device used to make the post (e.g., iPhone, Android) |
    | language | Language of the post |
    | likes | Number of likes received |
    | comments | Number of comments received |
    | shares | Number of times the post was shared |
    | engagement_rate | Normalized metric: (likes + comments + shares) / followers_count |
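
    The engagement_rate definition in the column list can be illustrated with a minimal Python sketch. The `Post` class and the sample numbers are invented for illustration, and returning 0.0 for an account with no followers is an assumption; the dataset description does not say how that case is handled.

```python
from dataclasses import dataclass


@dataclass
class Post:
    # Field names follow the column list above; the values used below are invented.
    likes: int
    comments: int
    shares: int
    followers_count: int


def engagement_rate(post: Post) -> float:
    """Compute (likes + comments + shares) / followers_count."""
    if post.followers_count == 0:
        # Assumption: the source does not specify the zero-follower case.
        return 0.0
    return (post.likes + post.comments + post.shares) / post.followers_count


example = Post(likes=120, comments=30, shares=50, followers_count=1000)
print(engagement_rate(example))  # 200 interactions / 1000 followers
```

    Because the metric divides by followers_count, a small account with a handful of interactions can score higher than a large account with many more, which is worth keeping in mind when ranking posts by this column.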
  4. Abbreviated FOMO and social media dataset

    • figshare.mq.edu.au
    • researchdata.edu.au
    txt
    Updated May 30, 2023
    Cite
    Danielle Einstein; Carol Dabb; Madeleine Ferrari; Anne McMaugh; Peter McEvoy; Ron Rapee; Eyal Karin; Maree J. Abbott (2023). Abbreviated FOMO and social media dataset [Dataset]. http://doi.org/10.25949/20188298.v1
    Explore at:
    Available download formats: txt
    Dataset updated
    May 30, 2023
    Dataset provided by
    Macquarie University
    Authors
    Danielle Einstein; Carol Dabb; Madeleine Ferrari; Anne McMaugh; Peter McEvoy; Ron Rapee; Eyal Karin; Maree J. Abbott
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    This database comprises 951 participants who provided self-report data online in their school classrooms. The data was collected in 2016 and 2017. The dataset comprises 509 males (54%) and 442 females (46%). Their ages ranged from 12 to 16 years (M = 13.69, SD = 0.72). Seven participants did not report their age. The majority were born in Australia (N = 849, 89%). The next most common countries of birth were China (N = 24, 2.5%), the UK (N = 23, 2.4%), and the USA (N = 9, 0.9%). Data were drawn from students at five Australian independent secondary schools.

    The data contains item responses for the Spence Children’s Anxiety Scale (SCAS; Spence, 1998), which is comprised of 44 items. The social media question asked about frequency of use with the question “How often do you use social media?”. The response options ranged from constantly to once a week or less. Items measuring Fear of Missing Out were included and incorporated the following five questions based on the APS Stress and Wellbeing in Australia Survey (APS, 2015):

    • “When I have a good time it is important for me to share the details online”
    • “I am afraid that I will miss out on something if I don’t stay connected to my online social networks”
    • “I feel worried and uncomfortable when I can’t access my social media accounts”
    • “I find it difficult to relax or sleep after spending time on social networking sites”
    • “I feel my brain burnout with the constant connectivity of social media”

    Internal consistency for this measure was α = .81. Self compassion was measured using the 12-item short-form of the Self-Compassion Scale (SCS-SF; Raes et al., 2011). The data set has the option of downloading an Excel file (composed of two worksheet tabs) or CSV files: 1) Data and 2) Variable labels.

    References:
    Australian Psychological Society. (2015). Stress and wellbeing in Australia survey. https://www.headsup.org.au/docs/default-source/default-document-library/stress-and-wellbeing-in-australia-report.pdf?sfvrsn=7f08274d_4
    Raes, F., Pommier, E., Neff, K. D., & Van Gucht, D. (2011). Construction and factorial validation of a short form of the self-compassion scale. Clinical Psychology and Psychotherapy, 18(3), 250-255. https://doi.org/10.1002/cpp.702
    Spence, S. H. (1998). A measure of anxiety symptoms among children. Behaviour Research and Therapy, 36(5), 545-566. https://doi.org/10.1016/S0005-7967(98)00034-5
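
    The internal consistency figure reported in this description (α = .81) is presumably Cronbach's alpha, the usual coefficient for a multi-item scale; that reading is an assumption, since the description only writes "α". A minimal Python sketch of the standard formula, using invented scores and not the dataset's actual responses, might look like this:

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a scale.

    item_scores: one list per item, each holding one score per respondent
    (all lists the same length). Uses sample variances throughout.
    """
    k = len(item_scores)
    if k < 2:
        raise ValueError("need at least two items")
    # Total score per respondent across all items.
    totals = [sum(responses) for responses in zip(*item_scores)]
    item_variance_sum = sum(variance(item) for item in item_scores)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))


# Invented example: three items answered identically by four respondents,
# so the items are perfectly consistent with one another.
items = [[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]
print(round(cronbach_alpha(items), 6))
```

    With the five Fear of Missing Out items, `item_scores` would hold five lists, one per question.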

  5. Social media as a news outlet worldwide 2024

    • statista.com
    • grusthub.com
    • +4more
    Cite
    Amy Watson, Social media as a news outlet worldwide 2024 [Dataset]. https://www.statista.com/topics/1164/social-networks/
    Explore at:
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Amy Watson
    Description

    During a 2024 survey, 77 percent of respondents from Nigeria stated that they used social media as a source of news. In comparison, just 23 percent of Japanese respondents said the same. Large portions of social media users around the world admit that they do not trust social platforms either as media sources or as a way to get news, and yet they continue to access such networks on a daily basis.

                  Social media: trust and consumption
    
                  Despite the majority of adults surveyed in each country reporting that they used social networks to keep up to date with news and current affairs, a 2018 study showed that social media is the least trusted news source in the world. Less than 35 percent of adults in Europe considered social networks to be trustworthy in this respect, yet more than 50 percent of adults in Portugal, Poland, Romania, Hungary, Bulgaria, Slovakia and Croatia said that they got their news on social media.
    
                  What is clear is that we live in an era where social media is such an enormous part of daily life that consumers will still use it in spite of their doubts or reservations. Concerns about fake news and propaganda on social media have not stopped billions of users accessing their favorite networks on a daily basis.
                  Most Millennials in the United States use social media for news every day, and younger consumers in European countries are much more likely to use social networks for national political news than their older peers.
                  Like it or not, reading news on social is fast becoming the norm for younger generations, and this form of news consumption will likely increase further regardless of whether consumers fully trust their chosen network or not.
    
  6. Social Media and Mental Health - Dataset - BSOS Data Repository

    • bsos-data.umd.edu
    Updated Jul 24, 2024
    Cite
    (2024). Social Media and Mental Health - Dataset - BSOS Data Repository [Dataset]. https://bsos-data.umd.edu/dataset/social-media-and-mental-health
    Explore at:
    Dataset updated
    Jul 24, 2024
    License

    ODC Public Domain Dedication and Licence (PDDL) v1.0 (http://www.opendatacommons.org/licenses/pddl/1.0/)
    License information was derived automatically

    Description

    The dataset encompasses demographic, health, and mental health information of students from 48 different states in the USA, born between 1971 and 2003. It includes data on general health ratings, responses to the PHQ-9 depression screening tool, and the GAD-7 anxiety assessment tool. It details how often students experienced various mental health symptoms over the past two weeks, their depression severity scores, and anxiety severity scores. Also, it covers experiences of feeling overwhelmed, exhausted, and hopeless within the last 12 months, along with diagnoses of depression, therapy, and medication usage. The dataset also includes information on various medical conditions, student status (full-time or international), sex, and race.

  7. Social Media Disaster-Related Discussions

    • kaggle.com
    Updated Dec 14, 2022
    Cite
    The Devastator (2022). Social Media Disaster-Related Discussions [Dataset]. https://www.kaggle.com/datasets/thedevastator/mining-disaster-related-insights-from-social-med
    Explore at:
    Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Dec 14, 2022
    Dataset provided by
    Kaggle
    Authors
    The Devastator
    Description

    Social Media Disaster-Related Discussions

    Detecting Relevant Content with Trusted Judgments

    By CrowdFlower [source]

    About this dataset

    Welcome to the disaster tweets dataset! This collection of tweets holds a wealth of information about global disasters and their effects on people, governments, and organizations all over the world. With over 10,000 tweets collected and carefully annotated with labels of whether they reported an actual disaster or not, this dataset provides unique insight into what these events look like in terms of social media conversations.

    This information is derived from a variety of key terms related to disaster events, such as “ablaze” and “pandemonium”, which were used to gather each individual tweet for analysis. The columns for each tweet include detailed metadata about the user who posted it along with variables such as keyword relevance and location. Alongside all these attributes is the core text belonging to each individual tweet, giving you access to all sorts of stories from natural disasters, contagious disease outbreaks or conflicts between nations that can be found in one place!

    So whatever you're looking for - whether it's observations about first-hand accounts or conducting research on public sentiment during a major event - this dataset offers you an invaluable source full of timely information that could potentially save lives down the line. So take your journey through this data now and embark upon discovering what devastation looks like through social media!

    How to use the dataset

    This dataset contains tweets related to disaster events, including the keyword, location, text, tweetid and userid. It provides insights into how people interact with each other on social media during a disaster. Using this dataset you can gain valuable insight into the dynamics of online communication in disasters and provide an important point of reference for future disaster management initiatives.

    Research Ideas

    • Analyzing the effectiveness of disaster relief and humanitarian aid efforts, by mapping tweets against public data of areas affected by disasters and donations made to help those affected.
    • Developing advanced statistical models to predict the magnitude and impact of an oncoming natural disaster using keyword analysis in social media posts related to past disasters.
    • Creating text-based classifiers to accurately detect disaster-related tweets in real time, giving emergency service providers early warning signs before a potential event occurs.

    Acknowledgements

    If you use this dataset in your research, please credit the original authors.

    License

    Unknown License - Please check the dataset description for more information.

    Columns

    File: socialmedia-disaster-tweets-DFE.csv

    | Column name | Description |
    |:---|:---|
    | _golden | A boolean value indicating whether the tweet is a golden tweet or not. (Boolean) |
    | _unit_state | The state of the tweet (e.g. finalized, judged, etc.). (String) |
    | _trusted_judgments | The number of trusted judgments for the tweet. (Integer) |
    | _last_judgment_at | The date and time of the last judgment for the tweet. (DateTime) |
    | choose_one | The label assigned to the tweet (e.g. relevant, not relevant, etc.). (String) |
    | choose_one_gold | The gold label assigned to the tweet (e.g. relevant, not relevant, etc.). (String) |
    | keyword | The keyword associated with the tweet. (String) |
    | location | The location associated with the tweet. (String) |
    | text | The text content of the tweet. (String) |
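
    As a rough illustration of how the columns listed for socialmedia-disaster-tweets-DFE.csv might be used, the Python sketch below filters rows by the `choose_one` label. The sample rows are invented, and the exact label strings in the real file are an assumption based on the "e.g. relevant, not relevant" wording in the column description; check the actual values before relying on them.

```python
import csv
import io

# Invented sample rows that use a subset of the column names listed above.
SAMPLE = """_unit_state,_trusted_judgments,choose_one,keyword,location,text
finalized,5,relevant,ablaze,,Building ablaze downtown
finalized,3,not relevant,pandemonium,London,Pandemonium at the concert last night
"""


def select_by_label(csv_text: str, label: str) -> list[dict]:
    """Return the rows whose `choose_one` value equals `label`."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if row["choose_one"] == label]


relevant = select_by_label(SAMPLE, "relevant")
print([row["text"] for row in relevant])
```

    For the real file, the same function could be fed the file contents read with `open(..., newline="", encoding=...)`; the file's encoding is not stated in the description.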

    Acknowledgements

    If you use this dataset in your research, please credit CrowdFlower.

  8. Planned changes in use of selected social media for organic marketing...

    • statista.com
    • grusthub.com
    • +4more
    Cite
    Christopher Ross, Planned changes in use of selected social media for organic marketing worldwide 2024 [Dataset]. https://www.statista.com/topics/1164/social-networks/
    Explore at:
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Christopher Ross
    Description

    During a January 2024 global survey among marketers, nearly 60 percent reported plans to increase their organic use of YouTube for marketing purposes in the following 12 months. LinkedIn and Instagram followed, respectively mentioned by 57 and 56 percent of the respondents intending to use them more. According to the same survey, Facebook was the most important social media platform for marketers worldwide.

  9. Dataset for Social Media Activity, Number of Friends, and Relationship...

    • eprints.soton.ac.uk
    Updated Jul 8, 2022
    Cite
    Elder, Lindsay; Brignell, Catherine; Cooke, Tim (2022). Dataset for Social Media Activity, Number of Friends, and Relationship Quality [Dataset]. http://doi.org/10.5258/SOTON/D1955
    Explore at:
    Dataset updated
    Jul 8, 2022
    Dataset provided by
    University of Southampton
    Authors
    Elder, Lindsay; Brignell, Catherine; Cooke, Tim
    Description

    The data from my thesis. This data was collected using the Lifeguide Software and exported onto SPSS following data collection. The data was collected from young people aged 11-18 years old to explore the impact of different types of social media use.

  10. Data set belonging to Beyens et al. (2020). The effect of social media on...

    • uvaauas.figshare.com
    • narcis.nl
    bin
    Updated May 30, 2023
    Cite
    I. Beyens; J.L. Pouwels; I.I. van Driel; Loes Keijsers; P.M. Valkenburg (2023). Data set belonging to Beyens et al. (2020). The effect of social media on well-being differs from adolescent to adolescent [Dataset]. http://doi.org/10.21942/uva.12497990.v2
    Explore at:
    Available download formats: bin
    Dataset updated
    May 30, 2023
    Dataset provided by
    University of Amsterdam / Amsterdam University of Applied Sciences
    Authors
    I. Beyens; J.L. Pouwels; I.I. van Driel; Loes Keijsers; P.M. Valkenburg
    License

    http://rdm.uva.nl/en/support/confidential-data.html

    Description

    This data set belongs to: Beyens, I., Pouwels, J. L., van Driel, I. I., Keijsers, L., & Valkenburg, P. M. (2020). The effect of social media on well-being differs from adolescent to adolescent. Scientific Reports. doi:10.1038/s41598-020-67727-7

    The design, sampling and analysis plan of the study are available on the Open Science Framework (OSF) at https://osf.io/nhks2.

    For more information, please contact the authors at i.beyens@uva.nl or info@project-awesome.nl.

  11. Social Media Channels and Statistics at the National Archives

    • catalog.data.gov
    • data.amerigeoss.org
    • +1more
    Updated Nov 7, 2024
    Cite
    National Archives and Records Administration (2024). Social Media Channels and Statistics at the National Archives [Dataset]. https://catalog.data.gov/dataset/social-media-channels-and-statistics-at-the-national-archives
    Explore at:
    Dataset updated
    Nov 7, 2024
    Dataset provided by
    National Archives and Records Administration (http://www.archives.gov/)
    Description

    More than 100 social media channels and statistics for the National Archives and Records Administration.

  12. Undergraduate Digital Literacies and Social Media Survey

    • borealisdata.ca
    Updated May 11, 2023
    Cite
    Erika Smith; Hannah Storrs (2023). Undergraduate Digital Literacies and Social Media Survey [Dataset]. http://doi.org/10.5683/SP3/YGMS7B
    Explore at:
    Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    May 11, 2023
    Dataset provided by
    Borealis
    Authors
    Erika Smith; Hannah Storrs
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Time period covered
    2019
    Area covered
    Canada, Alberta
    Dataset funded by
    Social Sciences and Humanities Research Council
    Description

    This dataset is from a survey conducted in 2019 on social media and digital literacies in undergraduate learning. Data was collected using Survey Monkey and primary analysis was conducted using SPSS. The dataset accompanies our article published in the International Journal of Educational Technology in Higher Education in May 2023, available at https://doi.org/10.1186/s41239-023-00398-2.

  13. Social Media Aspects

    • kaggle.com
    Updated Apr 3, 2025
    Cite
    Arshad Rahman Ziban (2025). Social Media Aspects [Dataset]. https://www.kaggle.com/datasets/arshadrahmanziban/social-media-aspects/code
    Explore at:
    Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Apr 3, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Arshad Rahman Ziban
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset contains survey responses about social media usage patterns and their perceived effects on relationships and mental health. The data was collected from individuals primarily in the 18-25 age group.

  14. Dataset Political Personalism in Social Media

    • figshare.com
    pdf
    Updated Aug 27, 2024
    Cite
    shahaf zamir (2024). Dataset Political Personalism in Social Media [Dataset]. http://doi.org/10.6084/m9.figshare.14073692.v1
    Explore at:
    Available download formats: pdf
    Dataset updated
    Aug 27, 2024
    Dataset provided by
    figshare
    Authors
    shahaf zamir
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    This dataset covers aspects of online politics in 25 democracies: 15 relatively old established European democracies (Austria, Belgium, Denmark, Finland, France, Germany, Iceland, Ireland, Italy, Luxembourg, Netherlands, Norway, Sweden, Switzerland, United Kingdom); five non-European veteran democracies (Australia, Canada, Israel, Japan, New Zealand); two early (Portugal, Spain) and three late (Czech Republic, Hungary, Poland) third-wave (young) European democracies. The research population includes, in each country, parties that won 4% or more of the votes in two consecutive elections before April 2019 (a total of 141 parties and 145 leaders). The dataset includes external party-level information such as performance in the last national elections, governmental status, party age, populism affiliation and leadership selection method. It also includes information related to the party leaders, such as their term in leadership office and other formal positions. In addition, it includes information about online activity, mainly on the consumption (user-related activities) of the parties and their leaders on Facebook and Twitter, two of the most used social media platforms for political purposes.

  15. MultiSocial

    • zenodo.org
    • data.niaid.nih.gov
    Updated Aug 20, 2025
    Cite
    Dominik Macko; Jakub Kopal; Robert Moro; Ivan Srba (2025). MultiSocial [Dataset]. http://doi.org/10.5281/zenodo.13846152
    Explore at:
    Dataset updated
    Aug 20, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Dominik Macko; Jakub Kopal; Robert Moro; Ivan Srba
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    MultiSocial is a dataset (described in a paper) for benchmarking multilingual (22 languages) machine-generated text detection in the social-media domain (5 platforms). It contains 472,097 texts, of which about 58k are human-written and approximately the same amount is generated by each of 7 multilingual large language models using 3 iterations of paraphrasing. The dataset has been anonymized to minimize the amount of sensitive data by hiding email addresses, usernames, and phone numbers.

    If you use this dataset in any publication, project, tool or in any other form, please, cite the paper.

    Disclaimer

    Due to the data sources (described below), the dataset may contain harmful content, disinformation, or offensive content. Based on a multilingual toxicity detector, about 8% of the text samples are probably toxic (from 5% in WhatsApp to 10% in Twitter). Although we have used older data sources (with a lower probability of including machine-generated texts), the labeling (of human-written text) might not be 100% accurate. The anonymization procedure might not successfully hide all the sensitive/personal content; thus, use the data cautiously (if you feel affected by such content, report the issues found to dpo[at]kinit.sk). The intended use is for non-commercial research purposes only.

    Data Source

    The human-written part consists of a pseudo-randomly selected subset of social media posts from 6 publicly available datasets:

    1. Telegram data originated in Pushshift Telegram, containing 317M messages (Baumgartner et al., 2020). It contains messages from 27k+ channels. The collection started with a set of right-wing extremist and cryptocurrency channels (about 300 in total) and was expanded based on occurrence of forwarded messages from other channels. In the end, it thus contains a wide variety of topics and societal movements reflecting the data collection time.

    2. Twitter data originated in CLEF2022-CheckThat! Task 1, containing 34k tweets on COVID-19 and politics (Nakov et al., 2022), combined with Sentiment140, containing 1.6M tweets on various topics (Go et al., 2009).

    3. Gab data originated in the dataset containing 22M posts from Gab social network. The authors of the dataset (Zannettou et al., 2018) found out that “Gab is predominantly used for the dissemination and discussion of news and world events, and that it attracts alt-right users, conspiracy theorists, and other trolls.” They also found out that hate speech is much more prevalent there compared to Twitter, but lower than 4chan's Politically Incorrect board.

    4. Discord data originated in Discord-Data, containing 51M messages. This is a long-context, anonymized, clean, multi-turn and single-turn conversational dataset based on Discord data scraped from a large variety of servers, big and small. According to the dataset authors, it contains around 0.1% of potentially toxic comments (based on the applied heuristic/classifier).

    5. WhatsApp data originated in whatsapp-public-groups, containing 300k messages (Garimella & Tyson, 2018). The public dataset contains the anonymised data, collected for around 5 months from around 178 groups. Original messages were made available to us on request to dataset authors for research purposes.

    From these datasets, we have pseudo-randomly sampled up to 1300 texts (up to 300 for test split and the remaining up to 1000 for train split if available) for each of the selected 22 languages (using a combination of automated approaches to detect the language) and platform. This process resulted in 61,592 human-written texts, which were further filtered out based on occurrence of some characters or their length, resulting in about 58k human-written texts.
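
    The sampling step described in the paragraph above could be sketched in Python as follows. This is only an illustration of the stated caps (up to 300 test texts, then up to 1000 train texts, per language and platform); the function name, the fixed seed, and the choice to fill the test split first are assumptions, not the authors' actual procedure.

```python
import random
from collections import defaultdict


def sample_splits(records, max_test=300, max_train=1000, seed=0):
    """Sample per (language, source) group: test first, then train from the rest."""
    rng = random.Random(seed)  # fixed seed for a reproducible pseudo-random draw
    groups = defaultdict(list)
    for rec in records:
        groups[(rec["language"], rec["source"])].append(rec)

    sampled = []
    for key in sorted(groups):
        items = list(groups[key])
        rng.shuffle(items)
        test = items[:max_test]
        train = items[max_test:max_test + max_train]
        sampled += [{**rec, "split": "test"} for rec in test]
        sampled += [{**rec, "split": "train"} for rec in train]
    return sampled


# Invented input: one large group and one small group.
records = (
    [{"text": f"en-{i}", "language": "en", "source": "twitter"} for i in range(1500)]
    + [{"text": f"sk-{i}", "language": "sk", "source": "telegram"} for i in range(200)]
)
result = sample_splits(records)
print(len(result))
```

    In this sketch a group with fewer than 300 texts ends up entirely in the test split, which is one reading of "if available" in the description.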

    The machine-generated part contains texts generated by 7 LLMs (Aya-101, Gemini-1.0-pro, GPT-3.5-Turbo-0125, Mistral-7B-Instruct-v0.2, opt-iml-max-30b, v5-Eagle-7B-HF, vicuna-13b). All these models were self-hosted except for GPT and Gemini, where we used the publicly available APIs. We generated the texts using 3 paraphrases of the original human-written data and then preprocessed the generated texts (filtered out cases when the generation obviously failed).

    The dataset has the following fields:

    • 'text' - a text sample,

    • 'label' - 0 for human-written text, 1 for machine-generated text,

    • 'multi_label' - a string representing a large language model that generated the text or the string "human" representing a human-written text,

    • 'split' - a string identifying train or test split of the dataset for the purpose of training and evaluation respectively,

    • 'language' - the ISO 639-1 language code identifying the detected language of the given text,

    • 'length' - word count of the given text,

    • 'source' - a string identifying the source dataset / platform of the given text,

    • 'potential_noise' - 0 for text without identified noise, 1 for text with potential noise.
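
    As a rough illustration of how the fields listed above might be used, the sketch below selects records by split, label, and noise flag. The helper `select` and the toy records are invented for this example; in particular, the exact strings stored in 'multi_label' and 'source' are assumptions, not values confirmed by the dataset description.

```python
def select(records, split, label=None, allow_noise=False):
    """Return records from one split, optionally restricted by label and noise flag."""
    selected = []
    for rec in records:
        if rec["split"] != split:
            continue
        if label is not None and rec["label"] != label:
            continue
        if not allow_noise and rec["potential_noise"] == 1:
            continue
        selected.append(rec)
    return selected


# Toy records invented for illustration; they are not taken from the dataset.
toy = [
    {"text": "see you at the game tonight", "label": 0, "multi_label": "human",
     "split": "train", "language": "en", "length": 6, "source": "discord",
     "potential_noise": 0},
    {"text": "catch you at tonight's match", "label": 1,
     "multi_label": "gpt-3.5-turbo-0125",  # assumed spelling of the model name
     "split": "train", "language": "en", "length": 5, "source": "discord",
     "potential_noise": 0},
    {"text": "@@@ ### ???", "label": 0, "multi_label": "human",
     "split": "test", "language": "en", "length": 3, "source": "gab",
     "potential_noise": 1},
]

clean_train = select(toy, "train")                 # both train records, no noise
machine_train = select(toy, "train", label=1)      # machine-generated train texts
all_test = select(toy, "test", allow_noise=True)   # keeps the noisy test record
```

    Dropping records flagged in 'potential_noise' by default is a design choice of this sketch; whether to exclude them depends on the experiment.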

    ToDo Statistics (under construction)

  16. Social media usage by local government - Dataset - data.gov.uk

    • ckan.publishing.service.gov.uk
    Updated Jun 8, 2010
    Cite
    ckan.publishing.service.gov.uk (2010). Social media usage by local government - Dataset - data.gov.uk [Dataset]. https://ckan.publishing.service.gov.uk/dataset/social-media-usage-by-local-government
    Dataset provided by
    CKAN (https://ckan.org/)
    License

    Open Government Licence 3.0 (http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/)
    License information was derived automatically

    Description

    A list of UK local authorities which are using social media such as Facebook, Twitter, YouTube. Also includes those with RSS feeds, web development blogs and open data.

  17. Instagram: most used hashtags 2024

    • statista.com
    • grusthub.com
    • +4more
    Cite
    Statista Research Department, Instagram: most used hashtags 2024 [Dataset]. https://www.statista.com/topics/1164/social-networks/
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Statista Research Department
    Description

    As of January 2024, #love was the most used hashtag on Instagram, being included in over two billion posts on the social media platform. #Instagood and #instagram were used over one billion times as of early 2024.

  18. Social Media and Online Usage to Improve the Customer Experience

    • catalog.data.gov
    • datasets.ai
    • +1more
    Updated Mar 18, 2023
    Cite
    opendata.maryland.gov (2023). Social Media and Online Usage to Improve the Customer Experience [Dataset]. https://catalog.data.gov/dataset/social-media-and-online-usage-to-improve-the-customer-experience
    Dataset provided by
    opendata.maryland.gov
    Description

    Social Media and Online Usage to Improve the Customer Experience (description updated 3/10/2023)

  19. Data Sheet 1_A content analysis of government-issued social media posts...

    • frontiersin.figshare.com
    docx
    Updated Dec 17, 2024
    Cite
    Vayshali Patel; Lauren E. Grant; Hisba Shereefdeen; Melissa MacKay; Leslie Cheng; Melissa Phypers; Andrew Papadopoulos; Jennifer E. McWhirter (2024). Data Sheet 1_A content analysis of government-issued social media posts during multi-jurisdictional enteric illness outbreaks in Canada.docx [Dataset]. http://doi.org/10.3389/fcomm.2024.1512014.s001
    Dataset provided by
    Frontiers
    Authors
    Vayshali Patel; Lauren E. Grant; Hisba Shereefdeen; Melissa MacKay; Leslie Cheng; Melissa Phypers; Andrew Papadopoulos; Jennifer E. McWhirter
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Area covered
    Canada
    Description

    Introduction: Most Canadians use at least one social media platform regularly, making social media a potentially effective tool for reaching broad audiences. The Public Health Agency of Canada (PHAC) uses social media as one tool for rapidly communicating with the public during multi-jurisdictional enteric illness outbreaks. However, the effectiveness of social media in enhancing public risk communication during these outbreaks remains unexplored. Addressing this gap may help optimise social media use for risk communication to inform the public and prevent additional illness. This study aims to analyse the engagement with and quality of PHAC’s social media content regarding multi-jurisdictional enteric illness outbreaks.

    Methods: Using a search of PHAC’s social media platforms, 482 posts during enteric illness outbreaks (2014–2022) were identified, including 198 posts from Facebook and 284 posts from X (formerly Twitter) in English and French. A codebook was developed using engagement metrics for gauging public interest, the Centers for Disease Control and Prevention’s (CDC) Modified Clear Communication Index (CCI) to assess clarity as a proxy for comprehension, the Health Belief Model (HBM) to evaluate the potential to motivate behaviour change, and measures of consistency. Descriptive statistics were used to analyse post content.

    Results: The average engagement rates for PHAC social media accounts were

  20. Social Media Influencers Dataset

    • figshare.com
    bin
    Updated Jun 25, 2023
    Cite
    Esther Leander (2023). Social Media Influencers Dataset [Dataset]. http://doi.org/10.6084/m9.figshare.23576037.v1
    Dataset provided by
    figshare
    Authors
    Esther Leander
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    This data was used in a study to determine the role of social media influencers in shaping consumer behaviour for beauty products in the US market.
