93 datasets found
  1. Number of global social network users 2017-2028

    • statista.com
    • grusthub.com
    • +3more
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Stacy Jo Dixon, Number of global social network users 2017-2028 [Dataset]. https://www.statista.com/topics/1164/social-networks/
    Explore at:
    Dataset provided by
    Statistahttp://statista.com/
    Authors
    Stacy Jo Dixon
    Description

    How many people use social media?

                  Social media usage is one of the most popular online activities. In 2024, over five billion people were using social media worldwide, a number projected to increase to over six billion in 2028.
    
                  Who uses social media?
                  Social networking is one of the most popular digital activities worldwide and it is no surprise that social networking penetration across all regions is constantly increasing. As of January 2023, the global social media usage rate stood at 59 percent. This figure is anticipated to grow as lesser developed digital markets catch up with other regions
                  when it comes to infrastructure development and the availability of cheap mobile devices. In fact, most of social media’s global growth is driven by the increasing usage of mobile devices. Mobile-first market Eastern Asia topped the global ranking of mobile social networking penetration, followed by established digital powerhouses such as the Americas and Northern Europe.
    
                  How much time do people spend on social media?
                  Social media is an integral part of daily internet usage. On average, internet users spend 151 minutes per day on social media and messaging apps, an increase of 40 minutes since 2015. On average, internet users in Latin America had the highest average time spent per day on social media.
    
                  What are the most popular social media platforms?
                  Market leader Facebook was the first social network to surpass one billion registered accounts and currently boasts approximately 2.9 billion monthly active users, making it the most popular social network worldwide. In June 2023, the top social media apps in the Apple App Store included mobile messaging apps WhatsApp and Telegram Messenger, as well as the ever-popular app version of Facebook.
    
  2. Instagram: most used hashtags 2024

    • statista.com
    • grusthub.com
    • +3more
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Statista Research Department, Instagram: most used hashtags 2024 [Dataset]. https://www.statista.com/topics/1164/social-networks/
    Explore at:
    Dataset provided by
    Statistahttp://statista.com/
    Authors
    Statista Research Department
    Description

    As of January 2024, #love was the most used hashtag on Instagram, being included in over two billion posts on the social media platform. #Instagood and #instagram were used over one billion times as of early 2024.

  3. Leading social media platforms used by marketers worldwide 2025

    • statista.com
    • abripper.com
    Updated Jul 24, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Statista (2025). Leading social media platforms used by marketers worldwide 2025 [Dataset]. https://www.statista.com/statistics/259379/social-media-platforms-used-by-marketers-worldwide/
    Explore at:
    Dataset updated
    Jul 24, 2025
    Dataset authored and provided by
    Statistahttp://statista.com/
    Time period covered
    Jan 2025
    Area covered
    Worldwide
    Description

    During a 2025 survey among marketers worldwide, around 83 percent reported using Facebook for marketing purposes. Instagram and LinkedIn followed, respectively mentioned by 78 and 69 percent of the respondents. The global social media marketing segment According to the same study, 60 percent of responding marketers intended to increase their organic use of YouTube for marketing purposes throughout that year. LinkedIn and Instagram followed with similar shares, rounding up the top three social media platforms attracting a planned growth in organic use among global marketers in 2025. Their main driver is increasing brand exposure and traffic, which led the ranking of benefits of social media marketing worldwide. Social media for B2B marketing Social media platform adoption rates among business-to-consumer (B2C) and business-to-business (B2B) marketers vary according to each subsegment's focus. While B2C professionals prioritize Facebook and Instagram, both run by Meta, Inc., due to their popularity among online audiences, B2B marketers concentrate their endeavors on Microsoft-owned LinkedIn due to its goal to connect people and companies in a corporate context.

  4. Social media as a news outlet worldwide 2024

    • statista.com
    • grusthub.com
    • +4more
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Amy Watson, Social media as a news outlet worldwide 2024 [Dataset]. https://www.statista.com/topics/1164/social-networks/
    Explore at:
    Dataset provided by
    Statistahttp://statista.com/
    Authors
    Amy Watson
    Description

    During a 2024 survey, 77 percent of respondents from Nigeria stated that they used social media as a source of news. In comparison, just 23 percent of Japanese respondents said the same. Large portions of social media users around the world admit that they do not trust social platforms either as media sources or as a way to get news, and yet they continue to access such networks on a daily basis.

                  Social media: trust and consumption
    
                  Despite the majority of adults surveyed in each country reporting that they used social networks to keep up to date with news and current affairs, a 2018 study showed that social media is the least trusted news source in the world. Less than 35 percent of adults in Europe considered social networks to be trustworthy in this respect, yet more than 50 percent of adults in Portugal, Poland, Romania, Hungary, Bulgaria, Slovakia and Croatia said that they got their news on social media.
    
                  What is clear is that we live in an era where social media is such an enormous part of daily life that consumers will still use it in spite of their doubts or reservations. Concerns about fake news and propaganda on social media have not stopped billions of users accessing their favorite networks on a daily basis.
                  Most Millennials in the United States use social media for news every day, and younger consumers in European countries are much more likely to use social networks for national political news than their older peers.
                  Like it or not, reading news on social is fast becoming the norm for younger generations, and this form of news consumption will likely increase further regardless of whether consumers fully trust their chosen network or not.
    
  5. Average daily time spent on social media worldwide 2012-2025

    • statista.com
    Updated Jun 19, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Statista (2025). Average daily time spent on social media worldwide 2012-2025 [Dataset]. https://www.statista.com/statistics/433871/daily-social-media-usage-worldwide/
    Explore at:
    Dataset updated
    Jun 19, 2025
    Dataset authored and provided by
    Statistahttp://statista.com/
    Area covered
    Worldwide
    Description

    How much time do people spend on social media? As of 2025, the average daily social media usage of internet users worldwide amounted to 141 minutes per day, down from 143 minutes in the previous year. Currently, the country with the most time spent on social media per day is Brazil, with online users spending an average of 3 hours and 49 minutes on social media each day. In comparison, the daily time spent with social media in the U.S. was just 2 hours and 16 minutes. Global social media usageCurrently, the global social network penetration rate is 62.3 percent. Northern Europe had an 81.7 percent social media penetration rate, topping the ranking of global social media usage by region. Eastern and Middle Africa closed the ranking with 10.1 and 9.6 percent usage reach, respectively. People access social media for a variety of reasons. Users like to find funny or entertaining content and enjoy sharing photos and videos with friends, but mainly use social media to stay in touch with current events friends. Global impact of social mediaSocial media has a wide-reaching and significant impact on not only online activities but also offline behavior and life in general. During a global online user survey in February 2019, a significant share of respondents stated that social media had increased their access to information, ease of communication, and freedom of expression. On the flip side, respondents also felt that social media had worsened their personal privacy, increased a polarization in politics and heightened everyday distractions.

  6. f

    Dataset Political Personalism in Social Media

    • figshare.com
    pdf
    Updated Aug 27, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    shahaf zamir (2024). Dataset Political Personalism in Social Media [Dataset]. http://doi.org/10.6084/m9.figshare.14073692.v1
    Explore at:
    pdfAvailable download formats
    Dataset updated
    Aug 27, 2024
    Dataset provided by
    figshare
    Authors
    shahaf zamir
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset covers aspects of online politics in 25 democracies: 15 relatively old established European democracies (Austria, Belgium, Denmark, Finland, France, Germany, Iceland, Ireland, Italy, Luxembourg, Netherlands, Norway, Sweden, Switzerland, United Kingdom); five non-European veteran democracies (Australia, Canada, Israel, Japan, New Zealand); two early (Portugal, Spain) and three late (Czech Republic, Hungary, Poland) third-wave (young) European democracies. The research population includes, in each country, parties that won 4% or more of the votes in two consecutive elections before April 2019 (a total of 141 parties and 145 leaders). The dataset includes external party level information such as performance in the last national elections, governmental status, party age, populism affiliation and leadership selection method. It also includes information related to the party leaders such as their term in leadership office and other formal positions. In addition it includes information about online activity mainly on the consumption (user related activities) of the parties and their leaders in Facebook and Twitter two of the most used social media platforms for political purposes.

  7. m

    Abbreviated FOMO and social media dataset

    • figshare.mq.edu.au
    • researchdata.edu.au
    txt
    Updated May 30, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Danielle Einstein; Carol Dabb; Madeleine Ferrari; Anne McMaugh; Peter McEvoy; Ron Rapee; Eyal Karin; Maree J. Abbott (2023). Abbreviated FOMO and social media dataset [Dataset]. http://doi.org/10.25949/20188298.v1
    Explore at:
    txtAvailable download formats
    Dataset updated
    May 30, 2023
    Dataset provided by
    Macquarie University
    Authors
    Danielle Einstein; Carol Dabb; Madeleine Ferrari; Anne McMaugh; Peter McEvoy; Ron Rapee; Eyal Karin; Maree J. Abbott
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This database is comprised of 951 participants who provided self-report data online in their school classrooms. The data was collected in 2016 and 2017. The dataset is comprised of 509 males (54%) and 442 females (46%). Their ages ranged from 12 to 16 years (M = 13.69, SD = 0.72). Seven participants did not report their age. The majority were born in Australia (N = 849, 89%). The next most common countries of birth were China (N = 24, 2.5%), the UK (N = 23, 2.4%), and the USA (N = 9, 0.9%). Data were drawn from students at five Australian independent secondary schools. The data contains item responses for the Spence Children’s Anxiety Scale (SCAS; Spence, 1998) which is comprised of 44 items. The Social media question asked about frequency of use with the question “How often do you use social media?”. The response options ranged from constantly to once a week or less. Items measuring Fear of Missing Out were included and incorporated the following five questions based on the APS Stress and Wellbeing in Australia Survey (APS, 2015). These were “When I have a good time it is important for me to share the details online; I am afraid that I will miss out on something if I don’t stay connected to my online social networks; I feel worried and uncomfortable when I can’t access my social media accounts; I find it difficult to relax or sleep after spending time on social networking sites; I feel my brain burnout with the constant connectivity of social media. Internal consistency for this measure was α = .81. Self compassion was measured using the 12-item short-form of the Self-Compassion Scale (SCS-SF; Raes et al., 2011). The data set has the option of downloading an excel file (composed of two worksheet tabs) or CSV files 1) Data and 2) Variable labels. References: Australian Psychological Society. (2015). Stress and wellbeing in Australia survey. https://www.headsup.org.au/docs/default-source/default-document-library/stress-and-wellbeing-in-australia-report.pdf?sfvrsn=7f08274d_4 Raes, F., Pommier, E., Neff, K. D., & Van Gucht, D. (2011). Construction and factorial validation of a short form of the self-compassion scale. Clinical Psychology and Psychotherapy, 18(3), 250-255. https://doi.org/10.1002/cpp.702 Spence, S. H. (1998). A measure of anxiety symptoms among children. Behaviour Research and Therapy, 36(5), 545-566. https://doi.org/10.1016/S0005-7967(98)00034-5

  8. Planned changes in use of selected social media for organic marketing...

    • statista.com
    • grusthub.com
    • +3more
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Christopher Ross, Planned changes in use of selected social media for organic marketing worldwide 2024 [Dataset]. https://www.statista.com/topics/1164/social-networks/
    Explore at:
    Dataset provided by
    Statistahttp://statista.com/
    Authors
    Christopher Ross
    Description

    During a January 2024 global survey among marketers, nearly 60 percent reported plans to increase their organic use of YouTube for marketing purposes in the following 12 months. LinkedIn and Instagram followed, respectively mentioned by 57 and 56 percent of the respondents intending to use them more. According to the same survey, Facebook was the most important social media platform for marketers worldwide.

  9. DeepCube: Post-processing and annotated datasets of social media data

    • zenodo.org
    • data.niaid.nih.gov
    Updated Mar 15, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Alexandros Mokas; Eleni Kamateri; Giannis Tsampoulatidis; Alexandros Mokas; Eleni Kamateri; Giannis Tsampoulatidis (2024). DeepCube: Post-processing and annotated datasets of social media data [Dataset]. http://doi.org/10.5281/zenodo.10731637
    Explore at:
    Dataset updated
    Mar 15, 2024
    Dataset provided by
    Zenodohttp://zenodo.org/
    Authors
    Alexandros Mokas; Eleni Kamateri; Giannis Tsampoulatidis; Alexandros Mokas; Eleni Kamateri; Giannis Tsampoulatidis
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0)https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    Researcher(s): Alexandros Mokas, Eleni Kamateri

    Supervisor: Ioannis Tsampoulatidis

    This repository contains 3 social media datasets:

    2 Post-processing datasets: These datasets contain post-processing data extracted from the analysis of social media posts collected for two different use cases during the first two years of the Deepcube project. More specifically, these include:

    • The UC2 dataset containing the post-processing analysis of the Twitter data collected for the DeepCube use case (UC2) dealing with the climate induced migration in Africa. This dataset contains in total 5,695,253 social media posts collected from the Twitter platform, based on the initial version of search criteria relevant to UC2 defined by Universitat De Valencia, focused on the regions of Ethiopia and Somalia and started from 26 June, 2021 till March, 2023.
    • The UC5 dataset containing the post-processing analysis of the Twitter and Instagram data collected for the DeepCube use case (UC5) related to the sustainable and environmentally-friendly tourism. This dataset contains in total 58,143 social media posts collected from the Twitter and Instagram platform (12,881 collected from Twitter and 45,262 collected from Instagram), based on the initial version of search criteria relevant to UC5 defined by MURMURATION SAS, focused on the regions of Brasil and started from 26 June, 2021 till March, 2023.

    1 Annotated dataset: An additional anottated dataset was created that contains post-processing data along with annotations of Twitter posts collected for UC2 for the years 2010-2022. More specifically, it includes:

    • The UC2 dataset contain the post-processing of the Twitter data collected for the DeepCube use case (UC2) dealing with the climate induced migration in Africa. This dataset contains in total 1721 annotated (412 relevant and 1309 irrelevant) by social media posts collected from the Twitter platform, focused on the region of Somalia and started from 1 January, 2010 till 31 December, 2022.

    For every social media post retrieved from Twitter and Instagram, a preprocessing step was performed. This involved a three-step analysis of each post using the appropriate web service. First, the location of the post was automatically extracted from the text using a location extraction service. Second, the images included in the post were analyzed using a concept extraction service, which identified and provided the top ten concepts that best described the image. These concepts included items such as "person," "building," "drought," "sun," and so on. Finally, the sentiment expressed in the post's text was determined by using a sentiment analysis service. The sentiment was classified as either positive, negative, or neutral.

    After the social media posts were preprocessed, they were visualized using the Social Media Web Application. This intuitive, user-friendly online application was designed for both expert and non-expert users and offers a web-based user interface for filtering and visualizing the collected social media data. The application provides various filtering options, an interactive map, a timeline, and a collection of graphs to help users analyze the data. Moreover, this application provides users with the option to download aggregated data for specific periods by applying filters and clicking the "Download Posts" button. This feature allows users to easily extract and analyze social media data outside of the web application, providing greater flexibility and control over data analysis.

    The dataset is provided by INFALIA.

    INFALIA, being a spin-off of the CERTH institute and a partner of a research EU project, releases this dataset containing Tweets IDs and post pre-processing data for the sole purpose of enabling the validation of the research conducted within the DeepCube. Moreover, Twitter Content provided in this dataset to third parties remains subject to the Twitter Policy, and those third parties must agree to the Twitter Terms of Service, Privacy Policy, Developer Agreement, and Developer Policy (https://developer.twitter.com/en/developer-terms) before receiving this download.

  10. d

    Dataplex: Reddit Data | Global Social Media Data | 2.1M+ subreddits: trends,...

    • datarade.ai
    .json, .csv
    Updated Aug 12, 2024
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Dataplex (2024). Dataplex: Reddit Data | Global Social Media Data | 2.1M+ subreddits: trends, audience insights + more | Ideal for Interest-Based Segmentation [Dataset]. https://datarade.ai/data-products/dataplex-reddit-data-global-social-media-data-1-1m-mill-dataplex
    Explore at:
    .json, .csvAvailable download formats
    Dataset updated
    Aug 12, 2024
    Dataset authored and provided by
    Dataplex
    Area covered
    Chile, Martinique, Macao, Jersey, Holy See, Botswana, Mexico, Christmas Island, Côte d'Ivoire, Gambia
    Description

    The Reddit Subreddit Dataset by Dataplex offers a comprehensive and detailed view of Reddit’s vast ecosystem, now enhanced with appended AI-generated columns that provide additional insights and categorization. This dataset includes data from over 2.1 million subreddits, making it an invaluable resource for a wide range of analytical applications, from social media analysis to market research.

    Dataset Overview:

    This dataset includes detailed information on subreddit activities, user interactions, post frequency, comment data, and more. The inclusion of AI-generated columns adds an extra layer of analysis, offering sentiment analysis, topic categorization, and predictive insights that help users better understand the dynamics of each subreddit.

    2.1 Million Subreddits with Enhanced AI Insights: The dataset covers over 2.1 million subreddits and now includes AI-enhanced columns that provide: - Sentiment Analysis: AI-driven sentiment scores for posts and comments, allowing users to gauge community mood and reactions. - Topic Categorization: Automated categorization of subreddit content into relevant topics, making it easier to filter and analyze specific types of discussions. - Predictive Insights: AI models that predict trends, content virality, and user engagement, helping users anticipate future developments within subreddits.

    Sourced Directly from Reddit:

    All social media data in this dataset is sourced directly from Reddit, ensuring accuracy and authenticity. The dataset is updated regularly, reflecting the latest trends and user interactions on the platform. This ensures that users have access to the most current and relevant data for their analyses.

    Key Features:

    • Subreddit Metrics: Detailed data on subreddit activity, including the number of posts, comments, votes, and user participation.
    • User Engagement: Insights into how users interact with content, including comment threads, upvotes/downvotes, and participation rates.
    • Trending Topics: Track emerging trends and viral content across the platform, helping you stay ahead of the curve in understanding social media dynamics.
    • AI-Enhanced Analysis: Utilize AI-generated columns for sentiment analysis, topic categorization, and predictive insights, providing a deeper understanding of the data.

    Use Cases:

    • Social Media Analysis: Researchers and analysts can use this dataset to study online behavior, track the spread of information, and understand how content resonates with different audiences.
    • Market Research: Marketers can leverage the dataset to identify target audiences, understand consumer preferences, and tailor campaigns to specific communities.
    • Content Strategy: Content creators and strategists can use insights from the dataset to craft content that aligns with trending topics and user interests, maximizing engagement.
    • Academic Research: Academics can explore the dynamics of online communities, studying everything from the spread of misinformation to the formation of online subcultures.

    Data Quality and Reliability:

    The Reddit Subreddit Dataset emphasizes data quality and reliability. Each record is carefully compiled from Reddit’s vast database, ensuring that the information is both accurate and up-to-date. The AI-generated columns further enhance the dataset's value, providing automated insights that help users quickly identify key trends and sentiments.

    Integration and Usability:

    The dataset is provided in a format that is compatible with most data analysis tools and platforms, making it easy to integrate into existing workflows. Users can quickly import, analyze, and utilize the data for various applications, from market research to academic studies.

    User-Friendly Structure and Metadata:

    The data is organized for easy navigation and analysis, with metadata files included to help users identify relevant subreddits and data points. The AI-enhanced columns are clearly labeled and structured, allowing users to efficiently incorporate these insights into their analyses.

    Ideal For:

    • Data Analysts: Conduct in-depth analyses of subreddit trends, user engagement, and content virality. The dataset’s extensive coverage and AI-enhanced insights make it an invaluable tool for data-driven research.
    • Marketers: Use the dataset to better understand your target audience, tailor campaigns to specific interests, and track the effectiveness of marketing efforts across Reddit.
    • Researchers: Explore the social dynamics of online communities, analyze the spread of ideas and information, and study the impact of digital media on public discourse, all while leveraging AI-generated insights.

    This dataset is an essential resource for anyone looking to understand the intricacies of Reddit's vast ecosystem, offering the data and AI-enhanced insights needed to drive informed decisions and strategies across various fields. Whether you’re tracking emerging trends, analyzing user behavior, or conduc...

  11. f

    Data set belonging to Beyens et al. (2020). The effect of social media on...

    • uvaauas.figshare.com
    • narcis.nl
    bin
    Updated May 30, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    I. Beyens; J.L. Pouwels; I.I. van Driel; Loes Keijsers; P.M. Valkenburg (2023). Data set belonging to Beyens et al. (2020). The effect of social media on well-being differs from adolescent to adolescent [Dataset]. http://doi.org/10.21942/uva.12497990.v2
    Explore at:
    binAvailable download formats
    Dataset updated
    May 30, 2023
    Dataset provided by
    University of Amsterdam / Amsterdam University of Applied Sciences
    Authors
    I. Beyens; J.L. Pouwels; I.I. van Driel; Loes Keijsers; P.M. Valkenburg
    License

    http://rdm.uva.nl/en/support/confidential-data.htmlhttp://rdm.uva.nl/en/support/confidential-data.html

    Description

    This data set belongs to:Beyens, I., Pouwels, J. L., van Driel, I. I., Keijsers, L., & Valkenburg, P. M. (2020). The effect of social media on well-being differs from adolescent to adolescent. Scientific Reports. doi:10.1038/s41598-020-67727-7The design, sampling and analysis plan of the study are available on the Open Science Framework (OSF) at https://osf.io/nhks2.For more information, please contact the authors at i.beyens@uva.nl or info@project-awesome.nl.

  12. MultiSocial

    • zenodo.org
    • data.niaid.nih.gov
    Updated Aug 20, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Dominik Macko; Dominik Macko; Jakub Kopal; Robert Moro; Robert Moro; Ivan Srba; Ivan Srba; Jakub Kopal (2025). MultiSocial [Dataset]. http://doi.org/10.5281/zenodo.13846152
    Explore at:
    Dataset updated
    Aug 20, 2025
    Dataset provided by
    Zenodohttp://zenodo.org/
    Authors
    Dominik Macko; Dominik Macko; Jakub Kopal; Robert Moro; Robert Moro; Ivan Srba; Ivan Srba; Jakub Kopal
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    MultiSocial is a dataset (described in a paper) for multilingual (22 languages) machine-generated text detection benchmark in social-media domain (5 platforms). It contains 472,097 texts, of which about 58k are human-written and approximately the same amount is generated by each of 7 multilingual large language models by using 3 iterations of paraphrasing. The dataset has been anonymized to minimize amount of sensitive data by hiding email addresses, usernames, and phone numbers.

    If you use this dataset in any publication, project, tool or in any other form, please, cite the paper.

    Disclaimer

    Due to data source (described below), the dataset may contain harmful, disinformation, or offensive content. Based on a multilingual toxicity detector, about 8% of the text samples are probably toxic (from 5% in WhatsApp to 10% in Twitter). Although we have used data sources of older date (lower probability to include machine-generated texts), the labeling (of human-written text) might not be 100% accurate. The anonymization procedure might not successfully hiden all the sensitive/personal content; thus, use the data cautiously (if feeling affected by such content, report the found issues in this regard to dpo[at]kinit.sk). The intended use if for non-commercial research purpose only.

    Data Source

    The human-written part consists of a pseudo-randomly selected subset of social media posts from 6 publicly available datasets:

    1. Telegram data originated in Pushshift Telegram, containing 317M messages (Baumgartner et al., 2020). It contains messages from 27k+ channels. The collection started with a set of right-wing extremist and cryptocurrency channels (about 300 in total) and was expanded based on occurrence of forwarded messages from other channels. In the end, it thus contains a wide variety of topics and societal movements reflecting the data collection time.

    2. Twitter data originated in CLEF2022-CheckThat! Task 1, containing 34k tweets on COVID-19 and politics (Nakov et al., 2022, combined with Sentiment140, containing 1.6M tweets on various topics (Go et al., 2009).

    3. Gab data originated in the dataset containing 22M posts from Gab social network. The authors of the dataset (Zannettou et al., 2018) found out that “Gab is predominantly used for the dissemination and discussion of news and world events, and that it attracts alt-right users, conspiracy theorists, and other trolls.” They also found out that hate speech is much more prevalent there compared to Twitter, but lower than 4chan's Politically Incorrect board.

    4. Discord data originated in Discord-Data, containing 51M messages. This is a long-context, anonymized, clean, multi-turn and single-turn conversational dataset based on Discord data scraped from a large variety of servers, big and small. According to the dataset authors, it contains around 0.1% of potentially toxic comments (based on the applied heuristic/classifier).

    5. WhatsApp data originated in whatsapp-public-groups, containing 300k messages (Garimella & Tyson, 2018). The public dataset contains the anonymised data, collected for around 5 months from around 178 groups. Original messages were made available to us on request to dataset authors for research purposes.

    From these datasets, we have pseudo-randomly sampled up to 1300 texts (up to 300 for test split and the remaining up to 1000 for train split if available) for each of the selected 22 languages (using a combination of automated approaches to detect the language) and platform. This process resulted in 61,592 human-written texts, which were further filtered out based on occurrence of some characters or their length, resulting in about 58k human-written texts.

    The machine-generated part contains texts generated by 7 LLMs (Aya-101, Gemini-1.0-pro, GPT-3.5-Turbo-0125, Mistral-7B-Instruct-v0.2, opt-iml-max-30b, v5-Eagle-7B-HF, vicuna-13b). All these models were self-hosted except for GPT and Gemini, where we used the publicly available APIs. We generated the texts using 3 paraphrases of the original human-written data and then preprocessed the generated texts (filtered out cases when the generation obviously failed).

    The dataset has the following fields:

    • 'text' - a text sample,

    • 'label' - 0 for human-written text, 1 for machine-generated text,

    • 'multi_label' - a string representing a large language model that generated the text or the string "human" representing a human-written text,

    • 'split' - a string identifying train or test split of the dataset for the purpose of training and evaluation respectively,

    • 'language' - the ISO 639-1 language code identifying the detected language of the given text,

    • 'length' - word count of the given text,

    • 'source' - a string identifying the source dataset / platform of the given text,

    • 'potential_noise' - 0 for text without identified noise, 1 for text with potential noise.

    ToDo Statistics (under construction)

  13. d

    Replication data and Online Appendix for: \"Introducing the Online Political...

    • dataone.org
    • dataverse.harvard.edu
    Updated Nov 11, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Martin, Diego A; Jacob N. Shapiro; Julia G. Ilhardt (2023). Replication data and Online Appendix for: \"Introducing the Online Political Influence Efforts dataset\" Journal of Peace Research [Dataset]. http://doi.org/10.7910/DVN/8IF59Q
    Explore at:
    Dataset updated
    Nov 11, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Martin, Diego A; Jacob N. Shapiro; Julia G. Ilhardt
    Description

    This dataset covers the use of social media to influence politics by promoting propaganda, advocating controversial viewpoints, and spreading disinformation. Influence efforts are defined as: (i) coordinated campaigns by a state, or the ruling party in an autocracy, to impact one or more specific aspects of politics at home or in another state, (ii) through media channels, including social media, by (iii) producing content designed to appear indigenous to the target state. Our data draw on more than 1000 media reports and 500 research articles/reports to identify IEs, track their progress, and classify their features. The data cover 78 foreign influence efforts (FIEs) and 25 domestic influence efforts (DIEs)—in which governments targeted their own citizens—against 51 different countries from 2011 through early-2021. The Influence Effort dataset measures covert information campaigns by state actors, facilitating research on contemporary statecraft.

  14. Facebook Users by Country Data (Cleaned)

    • kaggle.com
    Updated Apr 6, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Muhammad Faisal Ali (2023). Facebook Users by Country Data (Cleaned) [Dataset]. https://www.kaggle.com/faisaljanjua0555/facebook-users-by-country-data-cleaned/discussion
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Apr 6, 2023
    Dataset provided by
    Kagglehttp://kaggle.com/
    Authors
    Muhammad Faisal Ali
    Description

    The Facebook Users by Country Data (Cleaned) dataset is a collection of information on Facebook users from different countries. The dataset contains five columns of data, which are named as follows:

    • Names: This column contains the names of the countries for which the data is collected.
    • Users: This column provides the number of Facebook users in millions for each respective country.
    • Facebook_Users: This column shows the percentage of the total population of each country that uses Facebook.
    • Date_of_Data: This column indicates the date on which the data was collected and compiled.
    • Population: This column represents the total population of each country.

    How you can use this Dataset?

    The Facebook Users by Country Data (Cleaned) dataset can be used in several ways. Here are some potential use cases:

    • Market Research: Marketers can use this dataset to identify markets with the highest concentration of Facebook users. This information can be used to target Facebook ads to specific regions, optimize social media campaigns, and determine which markets to expand into.

    • Business Strategy: Businesses can use this dataset to identify potential markets for their products or services. By analyzing Facebook usage rates in different countries, businesses can identify countries with high engagement rates and target those markets.

    • Social Media Analysis: Researchers can use this dataset to analyze social media behavior in different countries. By comparing Facebook usage rates across different countries, researchers can identify cultural and social differences that affect social media behavior.

  15. Instagram accounts with the most followers worldwide 2024

    • statista.com
    • de.statista.com
    • +3more
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Stacy Jo Dixon, Instagram accounts with the most followers worldwide 2024 [Dataset]. https://www.statista.com/topics/1164/social-networks/
    Explore at:
    Dataset provided by
    Statistahttp://statista.com/
    Authors
    Stacy Jo Dixon
    Description

    Cristiano Ronaldo has one of the most popular Instagram accounts as of April 2024.

                  The Portuguese footballer is the most-followed person on the photo sharing app platform with 628 million followers. Instagram's own account was ranked first with roughly 672 million followers.
    
                  How popular is Instagram?
    
                  Instagram is a photo-sharing social networking service that enables users to take pictures and edit them with filters. The platform allows users to post and share their images online and directly with their friends and followers on the social network. The cross-platform app reached one billion monthly active users in mid-2018. In 2020, there were over 114 million Instagram users in the United States and experts project this figure to surpass 127 million users in 2023.
    
                  Who uses Instagram?
    
                  Instagram audiences are predominantly young – recent data states that almost 60 percent of U.S. Instagram users are aged 34 years or younger. Fall 2020 data reveals that Instagram is also one of the most popular social media for teens and one of the social networks with the biggest reach among teens in the United States.
    
                  Celebrity influencers on Instagram
                  Many celebrities and athletes are brand spokespeople and generate additional income with social media advertising and sponsored content. Unsurprisingly, Ronaldo ranked first again, as the average media value of one of his Instagram posts was 985,441 U.S. dollars.
    
  16. Dataset Explaining social media attention in Clinical Medicine research...

    • figshare.com
    xlsx
    Updated Jun 3, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Pablo Dorta-gonzález (2023). Dataset Explaining social media attention in Clinical Medicine research through social mentions and bibliometric factors [Dataset]. http://doi.org/10.6084/m9.figshare.20411592.v1
    Explore at:
    xlsxAvailable download formats
    Dataset updated
    Jun 3, 2023
    Dataset provided by
    figshare
    Authors
    Pablo Dorta-gonzález
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The unit of study: articles and reviews in disciplinary journals in the field of Clinical Medicine. The time frame covers 2018-2020 and the country analyzed is Spain.

    The random sample: N=4895. The years 2018 and 2019 correspond to a simple random sample, while the year 2020 is a compendium of a random sample and the complete COVID-19 subgroup.

    Variables: Twitter is nowadays the most used social media platform by the general population (and by researchers in particular) to disseminate and comment on the results of scientific research. There are many and varied factors that may influence social media attention of research. These factors include the mainstream news coverage, the treated topic (COVID-19, for example), and some characteristics of the research such as its proximity to social issues (impact on public policy) and business (impact on patents), their contribution to the consolidation of knowledge (in the format of review or mention on Wikipedia), and the recommendation of experts (Faculty Opinions).

  17. Z

    #IndonesiaHumanRightsSOS Twitter Hashtag Tweets Dataset

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jul 19, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Azmi Nawwar (2024). #IndonesiaHumanRightsSOS Twitter Hashtag Tweets Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4362504
    Explore at:
    Dataset updated
    Jul 19, 2024
    Dataset authored and provided by
    Azmi Nawwar
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dataset ini merupakan hasil dari scraping pada media sosial twitter dengan menggunakan aplikasi twint yang ditujukan pada hashtag #IndonesiaHumanRightsSOS. Scraping data dilakukan untuk cuitan yang dibuat dari tanggal 18 Desember 2020 10:59 AM s/d 19 Desember 2020 23:18 PM.

    Pada dataset mengandung 106.903 Row data dengan informasi terkait: User ID, Username, Twitter Name,Tweets, dsb.

    Selain itu dilampirkan juga contoh data yang telah dianalisis berupa wordcloud,username cloud, 100 most used word & most active username.

    -

    This dataset is the result of scraping on social media twitter using the twint application aimed at the hashtag #IndonesiaHumanRightsSOS. Data scraping is done for tweets made from December 18 2020 10:59 AM to December 19 2020 23:18 PM.

    The dataset contains 106,903 rows of data with related information: User ID, Username, Twitter Name, Tweets, etc.

    Also there is an example of the data that has been analyzed in the form of wordcloud, username cloud, 100 most used words & most active username.

  18. Countries with the most Facebook users 2024

    • statista.com
    • tokrwards.com
    • +4more
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Stacy Jo Dixon, Countries with the most Facebook users 2024 [Dataset]. https://www.statista.com/topics/1164/social-networks/
    Explore at:
    Dataset provided by
    Statistahttp://statista.com/
    Authors
    Stacy Jo Dixon
    Description

    Which county has the most Facebook users?

                  There are more than 378 million Facebook users in India alone, making it the leading country in terms of Facebook audience size. To put this into context, if India’s Facebook audience were a country then it would be ranked third in terms of largest population worldwide. Apart from India, there are several other markets with more than 100 million Facebook users each: The United States, Indonesia, and Brazil with 193.8 million, 119.05 million, and 112.55 million Facebook users respectively.
    
                  Facebook – the most used social media
    
                  Meta, the company that was previously called Facebook, owns four of the most popular social media platforms worldwide, WhatsApp, Facebook Messenger, Facebook, and Instagram. As of the third quarter of 2021, there were around 3,5 billion cumulative monthly users of the company’s products worldwide. With around 2.9 billion monthly active users, Facebook is the most popular social media worldwide. With an audience of this scale, it is no surprise that the vast majority of Facebook’s revenue is generated through advertising.
    
                  Facebook usage by device
                  As of July 2021, it was found that 98.5 percent of active users accessed their Facebook account from mobile devices. In fact, almost 81.8 percent of Facebook audiences worldwide access the platform only via mobile phone. Facebook is not only available through mobile browser as the company has published several mobile apps for users to access their products and services. As of the third quarter 2021, the four core Meta products were leading the ranking of most downloaded mobile apps worldwide, with WhatsApp amassing approximately six billion downloads.
    
  19. f

    Dataset belonging to Meier et al. (2023) Habitual Social Media and...

    • uvaauas.figshare.com
    csv
    Updated May 31, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Adrian Meier; I. Beyens; T. Siebers; J.L. Pouwels; P.M. Valkenburg (2023). Dataset belonging to Meier et al. (2023) Habitual Social Media and Smartphone Use are Linked to Task Delay for Some, but not all, Adolescents [Dataset]. http://doi.org/10.21942/uva.22153844.v1
    Explore at:
    csvAvailable download formats
    Dataset updated
    May 31, 2023
    Dataset provided by
    University of Amsterdam / Amsterdam University of Applied Sciences
    Authors
    Adrian Meier; I. Beyens; T. Siebers; J.L. Pouwels; P.M. Valkenburg
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The codebook, preregistration, and syntaxes can be found on the Open Science Framework (OSF) at https://osf.io/fzp7d/. For more information, please contact the authors via t.siebers@uva.nl or info@project-awesome.nl.

  20. B

    COVID-19 Twitter Dataset

    • borealisdata.ca
    • figshare.com
    Updated Nov 10, 2020
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Anatoliy Gruzd; Philip Mai (2020). COVID-19 Twitter Dataset [Dataset]. http://doi.org/10.5683/SP2/PXF2CU
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Nov 10, 2020
    Dataset provided by
    Borealis
    Authors
    Anatoliy Gruzd; Philip Mai
    License

    CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    The current dataset contains 237M Tweet IDs for Twitter posts that mentioned "COVID" as a keyword or as part of a hashtag (e.g., COVID-19, COVID19) between March and July of 2020. Sampling Method: hourly requests sent to Twitter Search API using Social Feed Manager, an open source software that harvests social media data and related content from Twitter and other platforms. NOTE: 1) In accordance with Twitter API Terms, only Tweet IDs are provided as part of this dataset. 2) To recollect tweets based on the list of Tweet IDs contained in these datasets, you will need to use tweet 'rehydration' programs like Hydrator (https://github.com/DocNow/hydrator) or Python library Twarc (https://github.com/DocNow/twarc). 3) This dataset, like most datasets collected via the Twitter Search API, is a sample of the available tweets on this topic and is not meant to be comprehensive. Some COVID-related tweets might not be included in the dataset either because the tweets were collected using a standardized but intermittent (hourly) sampling protocol or because tweets used hashtags/keywords other than COVID (e.g., Coronavirus or #nCoV). 4) To broaden this sample, consider comparing/merging this dataset with other COVID-19 related public datasets such as: https://github.com/thepanacealab/covid19_twitter https://ieee-dataport.org/open-access/corona-virus-covid-19-tweets-dataset https://github.com/echen102/COVID-19-TweetIDs

Share
FacebookFacebook
TwitterTwitter
Email
Click to copy link
Link copied
Close
Cite
Stacy Jo Dixon, Number of global social network users 2017-2028 [Dataset]. https://www.statista.com/topics/1164/social-networks/
Organization logo

Number of global social network users 2017-2028

Explore at:
Dataset provided by
Statistahttp://statista.com/
Authors
Stacy Jo Dixon
Description

How many people use social media?

              Social media usage is one of the most popular online activities. In 2024, over five billion people were using social media worldwide, a number projected to increase to over six billion in 2028.

              Who uses social media?
              Social networking is one of the most popular digital activities worldwide and it is no surprise that social networking penetration across all regions is constantly increasing. As of January 2023, the global social media usage rate stood at 59 percent. This figure is anticipated to grow as lesser developed digital markets catch up with other regions
              when it comes to infrastructure development and the availability of cheap mobile devices. In fact, most of social media’s global growth is driven by the increasing usage of mobile devices. Mobile-first market Eastern Asia topped the global ranking of mobile social networking penetration, followed by established digital powerhouses such as the Americas and Northern Europe.

              How much time do people spend on social media?
              Social media is an integral part of daily internet usage. On average, internet users spend 151 minutes per day on social media and messaging apps, an increase of 40 minutes since 2015. On average, internet users in Latin America had the highest average time spent per day on social media.

              What are the most popular social media platforms?
              Market leader Facebook was the first social network to surpass one billion registered accounts and currently boasts approximately 2.9 billion monthly active users, making it the most popular social network worldwide. In June 2023, the top social media apps in the Apple App Store included mobile messaging apps WhatsApp and Telegram Messenger, as well as the ever-popular app version of Facebook.
Search
Clear search
Close search
Google apps
Main menu