80 datasets found
  1. Daily Social Media Active Users

    • kaggle.com
    zip
    Updated May 5, 2025
    Cite
    Shaik Barood Mohammed Umar Adnaan Faiz (2025). Daily Social Media Active Users [Dataset]. https://www.kaggle.com/datasets/umeradnaan/daily-social-media-active-users
    Explore at:
    Available download formats: zip (126814 bytes)
    Dataset updated
    May 5, 2025
    Authors
    Shaik Barood Mohammed Umar Adnaan Faiz
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    The "Daily Social Media Active Users" dataset provides a comprehensive and dynamic look into the digital presence and activity of global users across major social media platforms. The data was generated to simulate real-world usage patterns for 13 popular platforms, including Facebook, YouTube, WhatsApp, Instagram, WeChat, TikTok, Telegram, Snapchat, X (formerly Twitter), Pinterest, Reddit, Threads, LinkedIn, and Quora. This dataset contains 10,000 rows and includes several key fields that offer insights into user demographics, engagement, and usage habits.

    Dataset Breakdown:

    • Platform: The name of the social media platform where the user activity is tracked. It includes globally recognized platforms, such as Facebook, YouTube, and TikTok, that are known for their large, active user bases.

    • Owner: The company or entity that owns and operates the platform. Examples include Meta for Facebook, Instagram, and WhatsApp, Google for YouTube, and ByteDance for TikTok.

    • Primary Usage: This category identifies the primary function of each platform. Social media platforms differ in their primary usage, whether it's for social networking, messaging, multimedia sharing, professional networking, or more.

    • Country: The geographical region where the user is located. The dataset simulates global coverage, showcasing users from diverse locations and regions. It helps in understanding how user behavior varies across different countries.

    • Daily Time Spent (min): This field tracks how much time a user spends on a given platform on a daily basis, expressed in minutes. Time spent data is critical for understanding user engagement levels and the popularity of specific platforms.

    • Verified Account: Indicates whether the user has a verified account. This feature mimics real-world patterns where verified users (often public figures, businesses, or influencers) have enhanced status on social media platforms.

    • Date Joined: The date when the user registered or started using the platform. This data simulates user account history and can provide insights into user retention trends or platform growth over time.
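As a minimal sketch of how these fields might be used, the snippet below averages daily time spent per platform. The column headers are assumed to match the field names listed above (the actual CSV headers may differ), and the three sample rows are made up for illustration.

```python
import csv
import io
from collections import defaultdict

# Made-up sample rows; header names are ASSUMED to match the field list above.
SAMPLE_CSV = """Platform,Owner,Primary Usage,Country,Daily Time Spent (min),Verified Account,Date Joined
Facebook,Meta,Social Networking,India,60,Yes,2019-04-01
Facebook,Meta,Social Networking,Brazil,90,No,2020-07-15
TikTok,ByteDance,Multimedia Sharing,India,120,No,2021-01-20
"""


def mean_minutes_by_platform(csv_text):
    """Return {platform: mean daily minutes} for the rows in csv_text."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        platform = row["Platform"]
        totals[platform] += float(row["Daily Time Spent (min)"])
        counts[platform] += 1
    return {p: totals[p] / counts[p] for p in totals}


averages = mean_minutes_by_platform(SAMPLE_CSV)
```

To run this against the downloaded file, the same function could be fed the file's text instead of `SAMPLE_CSV`, after checking the real header names.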

    Context and Use Cases:

    • This synthetic dataset is designed to offer a privacy-friendly alternative for analytics, research, and machine learning purposes. Given the complexities and privacy concerns around using real user data, especially in the context of social media, this dataset offers a clean and secure way to develop, test, and fine-tune applications, models, and algorithms without the risks of handling sensitive or personal information.

    Researchers, data scientists, and developers can use this dataset to:

    • Model User Behavior: By analyzing patterns in daily time spent, verified status, and country of origin, users can model and predict social media engagement behavior.

    • Test Analytics Tools: Social media monitoring and analytics platforms can use this dataset to simulate user activity and optimize their tools for engagement tracking, reporting, and visualization.

    • Train Machine Learning Algorithms: The dataset can be used to train models for various tasks like user segmentation, recommendation systems, or churn prediction based on engagement metrics.

    • Create Dashboards: This dataset can serve as the foundation for creating user-friendly dashboards that visualize user trends, platform comparisons, and engagement patterns across the globe.

    • Conduct Market Research: Business intelligence teams can use the data to understand how various demographics use social media, offering valuable insights into the most engaged regions, platform preferences, and usage behaviors.

    • Sources of Inspiration: This dataset is inspired by public data from industry reports, such as those from Statista, DataReportal, and other market research platforms. These sources provide insights into the global user base and usage statistics of popular social media platforms. The synthetic nature of this dataset allows for the use of realistic engagement metrics without violating any privacy concerns, making it an ideal tool for educational, analytical, and research purposes.

    The structure and design of the dataset are based on real-world usage patterns and aim to represent a variety of users from different backgrounds, countries, and activity levels. This diversity makes it an ideal candidate for testing data-driven solutions and exploring social media trends.

    Future Considerations:

    As the social media landscape continues to evolve, this dataset can be updated or extended to include new platforms, engagement metrics, or user behaviors. Future iterations may incorporate features like post frequency, follower counts, engagement rates (likes, comments, shares), or even sentiment analysis from user-generated content.

    By leveraging this dataset, analysts and data scientists can create better, more effective strategies ...

  2. Social Media Dataset

    • kaggle.com
    zip
    Updated Apr 17, 2025
    Cite
    Nixie6254 (2025). Social Media Dataset [Dataset]. https://www.kaggle.com/datasets/nixie6254/social-media-dataset
    Explore at:
    Available download formats: zip (28057 bytes)
    Dataset updated
    Apr 17, 2025
    Authors
    Nixie6254
    Description

    This dataset consists of 734 entries representing social media activity and performance from a local MSME (Micro, Small, and Medium Enterprise; UMKM in Indonesian) across the TikTok, Instagram, and Twitter platforms. It captures key metrics related to audience interaction and content strategy effectiveness, and is valuable for evaluating and optimizing digital marketing efforts for small businesses.

    • Area: Target location or customer region where the UMKM's content is directed.
    • Category: The business content category (e.g., product promotion, education, seasonal campaign).
    • Day: The day of the week the content was published.
    • Month: The month the post went live.
    • Platform: The social media platform used by the UMKM (TikTok, Instagram, or Twitter).
    • Post Type: The format of the content posted: image, video, carousel, or text.
    • Timestamp: The exact date and time when the content was posted.
    • User: The username or business account that posted the content.
    • Week: Week number within the year for time-based analysis.
    • Year: The year the content was posted.
    • Comments: Total number of comments received on the post.
    • Engagement Rate: A calculated metric showing how engaging the content is (based on likes, comments, and shares relative to reach or impressions).
    • Hour: Hour of the day the post was published.
    • Impressions: Number of times the content appeared on users' feeds.
    • Likes: Number of likes the post received.
    • Reach: Number of unique users who saw the content.
    • Shares: Number of times users shared the content.
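The description does not give the exact formula behind Engagement Rate. One common definition, shown here only as an assumption, divides total interactions (likes, comments, shares) by reach; impressions could be used as the denominator instead.

```python
def engagement_rate(likes, comments, shares, reach):
    """Interactions divided by reach (ASSUMED formula, not confirmed by the dataset).

    Returns 0.0 when reach is zero or negative to avoid division by zero.
    """
    if reach <= 0:
        return 0.0
    return (likes + comments + shares) / reach


# Hypothetical post: 80 likes, 15 comments, 5 shares, seen by 2,000 unique users.
example_rate = engagement_rate(80, 15, 5, 2000)
```

Comparing this value with the dataset's own Engagement Rate column on a few rows would show whether the column follows this definition or another one.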

  3. Number of global social network users 2017-2028

    • statista.com
    • de.statista.com
    + more versions
    Cite
    Stacy Jo Dixon, Number of global social network users 2017-2028 [Dataset]. https://www.statista.com/topics/1164/social-networks/
    Explore at:
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Stacy Jo Dixon
    Description

    How many people use social media?

    Social media usage is one of the most popular online activities. In 2024, over five billion people were using social media worldwide, a number projected to increase to over six billion in 2028.

    Who uses social media?

    Social networking is one of the most popular digital activities worldwide and it is no surprise that social networking penetration across all regions is constantly increasing. As of January 2023, the global social media usage rate stood at 59 percent. This figure is anticipated to grow as lesser developed digital markets catch up with other regions when it comes to infrastructure development and the availability of cheap mobile devices. In fact, most of social media’s global growth is driven by the increasing usage of mobile devices. Mobile-first market Eastern Asia topped the global ranking of mobile social networking penetration, followed by established digital powerhouses such as the Americas and Northern Europe.

    How much time do people spend on social media?

    Social media is an integral part of daily internet usage. On average, internet users spend 151 minutes per day on social media and messaging apps, an increase of 40 minutes since 2015. On average, internet users in Latin America had the highest average time spent per day on social media.

    What are the most popular social media platforms?

    Market leader Facebook was the first social network to surpass one billion registered accounts and currently boasts approximately 2.9 billion monthly active users, making it the most popular social network worldwide. In June 2023, the top social media apps in the Apple App Store included mobile messaging apps WhatsApp and Telegram Messenger, as well as the ever-popular app version of Facebook.
    
  4. Social Media Behavior Dataset

    • kaggle.com
    zip
    Updated Nov 25, 2024
    Cite
    Shibin Shereef (2024). Social Media Behavior Dataset [Dataset]. https://www.kaggle.com/datasets/shibinshereef1/social-media-behavior-dataset
    Explore at:
    Available download formats: zip (7429 bytes)
    Dataset updated
    Nov 25, 2024
    Authors
    Shibin Shereef
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset contains 600 synthetic entries simulating social media activity across three major platforms: Twitter, Reddit, and Instagram. The data was generated to analyze trends, sentiments, and user engagement patterns based on hashtags and posts. It can be useful for researchers, data analysts, and machine learning enthusiasts interested in studying social media behavior.

    Dataset Structure

    The dataset includes the following columns:

    • Date: The date of the post, ranging across a simulated timeline.
    • Platform: The social media platform where the post was made (Twitter, Reddit, or Instagram).
    • Hashtag: The main hashtag associated with the post, such as #AI, #MachineLearning, or #Python.
    • Post Content: The text of the post, crafted to simulate common social media interactions.
    • Sentiment: The sentiment of the post, classified as Positive, Neutral, or Negative.
    • Likes: The number of likes the post received.
    • Shares: The number of shares or retweets the post received.

    Potential Use Cases

    • Sentiment analysis: Train machine learning models to detect sentiment in text.
    • Hashtag popularity analysis: Determine which hashtags are most commonly used or generate the most engagement.
    • Engagement trends: Explore correlations between post sentiment and engagement metrics (likes/shares).
    • Platform comparison: Compare user behavior across different social media platforms.

    Acknowledgments

    This dataset is fully synthetic and was generated using Python. It does not contain any real user data and is intended for educational and research purposes.
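The hashtag popularity use case could be sketched roughly as follows. The rows are made-up stand-ins for the dataset, the dictionary keys are assumed to match the column names in the description, and "engagement" is taken here to mean likes plus shares.

```python
from collections import Counter

# Made-up rows; keys are ASSUMED to match the column names described above.
posts = [
    {"Platform": "Twitter", "Hashtag": "#AI", "Sentiment": "Positive", "Likes": 120, "Shares": 30},
    {"Platform": "Reddit", "Hashtag": "#Python", "Sentiment": "Neutral", "Likes": 40, "Shares": 10},
    {"Platform": "Instagram", "Hashtag": "#AI", "Sentiment": "Negative", "Likes": 60, "Shares": 15},
]


def engagement_by_hashtag(rows):
    """Sum likes + shares per hashtag (one possible notion of engagement)."""
    totals = Counter()
    for row in rows:
        totals[row["Hashtag"]] += row["Likes"] + row["Shares"]
    return totals


# Hashtags ordered from most to least total engagement.
ranking = engagement_by_hashtag(posts).most_common()
```

Grouping by "Sentiment" or "Platform" instead of "Hashtag" would give the engagement-trend and platform-comparison views in the same way.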

  5. Global social network penetration 2019-2028

    • statista.com
    • de.statista.com
    Cite
    Stacy Jo Dixon, Global social network penetration 2019-2028 [Dataset]. https://www.statista.com/topics/1164/social-networks/
    Explore at:
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Stacy Jo Dixon
    Description

    The global social media penetration rate was forecast to increase continuously between 2024 and 2028 by a total of 11.6 percentage points (+18.19 percent). After the ninth consecutive year of increase, the penetration rate is estimated to reach 75.31 percent, a new peak, in 2028. Notably, the social media penetration rate has increased continuously over the past years.
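The two growth figures can be cross-checked with a little arithmetic. The sketch below assumes the 11.6 figure is in percentage points and the 75.31 figure is a percentage; under that reading the implied 2024 rate is about 63.71 percent and the relative increase comes out near the stated 18.19 percent (small differences are expected because the published inputs are rounded).

```python
# Figures quoted in the description; units are ASSUMED as noted in the text above.
rate_2028 = 75.31          # forecast penetration rate in 2028, percent
increase_points = 11.6     # total increase 2024-2028, percentage points

# Implied starting level and relative growth over the period.
rate_2024 = rate_2028 - increase_points
relative_increase_pct = increase_points / rate_2024 * 100

print(f"Implied 2024 rate: {rate_2024:.2f} percent")
print(f"Relative increase: {relative_increase_pct:.1f} percent")
```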

  6. Average daily time spent on social media worldwide 2012-2024

    • statista.com
    • de.statista.com
    + more versions
    Cite
    Stacy Jo Dixon, Average daily time spent on social media worldwide 2012-2024 [Dataset]. https://www.statista.com/topics/1164/social-networks/
    Explore at:
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Stacy Jo Dixon
    Description

    How much time do people spend on social media?

    As of 2024, the average daily social media usage of internet users worldwide amounted to 143 minutes per day, down from 151 minutes in the previous year. Currently, the country with the most time spent on social media per day is Brazil, with online users spending an average of three hours and 49 minutes on social media each day. In comparison, the daily time spent with social media in the U.S. was just two hours and 16 minutes.

    Global social media usage

    Currently, the global social network penetration rate is 62.3 percent. Northern Europe had an 81.7 percent social media penetration rate, topping the ranking of global social media usage by region. Eastern and Middle Africa closed the ranking with 10.1 and 9.6 percent usage reach, respectively. People access social media for a variety of reasons. Users like to find funny or entertaining content and enjoy sharing photos and videos with friends, but mainly use social media to stay in touch with current events and friends.

    Global impact of social media

    Social media has a wide-reaching and significant impact on not only online activities but also offline behavior and life in general. During a global online user survey in February 2019, a significant share of respondents stated that social media had increased their access to information, ease of communication, and freedom of expression. On the flip side, respondents also felt that social media had worsened their personal privacy, increased polarization in politics, and heightened everyday distractions.
    
  7. YouTube Social Network with Communities (SNAP)

    • kaggle.com
    zip
    Updated Dec 16, 2021
    Cite
    Subhajit Sahu (2021). YouTube Social Network with Communities (SNAP) [Dataset]. https://www.kaggle.com/datasets/wolfram77/graphs-snap-com-youtube
    Explore at:
    Available download formats: zip (13777811 bytes)
    Dataset updated
    Dec 16, 2021
    Authors
    Subhajit Sahu
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    YouTube
    Description

    Youtube social network and ground-truth communities

    https://snap.stanford.edu/data/com-Youtube.html

    Dataset information

    Youtube (http://www.youtube.com/) is a video-sharing web site that includes a social network. In the Youtube social network, users form friendships with each other and can create groups which other users can join. We consider such user-defined groups as ground-truth communities. This data is provided by Alan Mislove et al. (http://socialnetworks.mpi-sws.org/data-imc2007.html)

    We regard each connected component in a group as a separate ground-truth
    community. We remove the ground-truth communities which have fewer than 3
    nodes. We also provide the top 5,000 communities with highest quality
    which are described in our paper (http://arxiv.org/abs/1205.6233). As for
    the network, we provide the largest connected component.

    Network statistics
    Nodes 1,134,890
    Edges 2,987,624
    Nodes in largest WCC 1134890 (1.000)
    Edges in largest WCC 2987624 (1.000)
    Nodes in largest SCC 1134890 (1.000)
    Edges in largest SCC 2987624 (1.000)
    Average clustering coefficient 0.0808
    Number of triangles 3056386
    Fraction of closed triangles 0.002081
    Diameter (longest shortest path) 20
    90-percentile effective diameter 6.5
    Community statistics
    Number of communities 8,385
    Average community size 13.50
    Average membership size 0.10

    Source (citation)
    J. Yang and J. Leskovec. Defining and Evaluating Network Communities based on Ground-truth. ICDM, 2012. http://arxiv.org/abs/1205.6233

    Files
    File                               Description
    com-youtube.ungraph.txt.gz         Undirected Youtube network
    com-youtube.all.cmty.txt.gz        Youtube communities
    com-youtube.top5000.cmty.txt.gz    Youtube communities (Top 5,000)

    Notes on inclusion into the SuiteSparse Matrix Collection, July 2018:

    The graph in the SNAP data set is 1-based, with nodes numbered 1 to
    1,157,827.

    In the SuiteSparse Matrix Collection, Problem.A is the undirected Youtube
    network, a matrix of size n-by-n with n=1,134,890, which is the number of
    unique user id's appearing in any edge.

    Problem.aux.nodeid is a list of the node id's that appear in the SNAP data set. A(i,j)=1 if person nodeid(i) is friends with person nodeid(j). The
    node id's are the same as the SNAP data set (1-based).
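The relationship between the raw node ids and the matrix indices can be sketched as below. This is an illustration only: it assumes the edge-list file uses lines starting with "#" for comments and two whitespace-separated node ids per remaining line, it assumes the id list is kept in sorted order, and it uses 0-based Python indices where the notes above use 1-based A(i,j). The sample lines are made up.

```python
def parse_edge_list(lines):
    """Parse (u, v) integer pairs, skipping blank lines and '#' comment lines."""
    edges = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        u, v = line.split()[:2]
        edges.append((int(u), int(v)))
    return edges


def build_nodeid(edges):
    """Return (nodeid, index_of): ids that appear in any edge, and id -> row index."""
    nodeid = sorted({node for edge in edges for node in edge})
    index_of = {node: i for i, node in enumerate(nodeid)}
    return nodeid, index_of


# Made-up sample with gaps in the id range, as in the real data.
sample_lines = ["# made-up sample", "# FromNodeId\tToNodeId", "1\t2", "1\t5", "5\t9"]
edges = parse_edge_list(sample_lines)
nodeid, index_of = build_nodeid(edges)

# Symmetric set of (i, j) positions where the adjacency matrix holds a 1.
adjacency = set()
for u, v in edges:
    adjacency.add((index_of[u], index_of[v]))
    adjacency.add((index_of[v], index_of[u]))
```

The gap between the largest raw id and the number of distinct ids is why the matrix dimension n is smaller than the maximum node id.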

    C = Problem.aux.Communities_all is a sparse matrix of size n by 16,386
    which represents the communities in the com-youtube.all.cmty.txt file.
    The kth line in that file defines the kth community, and is the column
    C(:,k), where C(i,k)=1 if person ...

  8. Social Media Datasets

    • brightdata.com
    .json, .csv, .xlsx
    Updated Sep 7, 2022
    Cite
    Bright Data (2022). Social Media Datasets [Dataset]. https://brightdata.com/products/datasets/social-media
    Explore at:
    Available download formats: .json, .csv, .xlsx
    Dataset updated
    Sep 7, 2022
    Dataset authored and provided by
    Bright Data (https://brightdata.com/)
    License

    https://brightdata.com/license

    Area covered
    Worldwide
    Description

    Gain valuable insights with our comprehensive Social Media Dataset, designed to help businesses, marketers, and analysts track trends, monitor engagement, and optimize strategies. This dataset provides structured and reliable social media data from multiple platforms.

    Dataset Features

    • User Profiles: Access public social media profiles, including usernames, bios, follower counts, engagement metrics, and more. Ideal for audience analysis, influencer marketing, and competitive research.
    • Posts & Content: Extract posts, captions, hashtags, media (images/videos), timestamps, and engagement metrics such as likes, shares, and comments. Useful for trend analysis, sentiment tracking, and content strategy optimization.
    • Comments & Interactions: Analyze user interactions, including replies, mentions, and discussions. This data helps brands understand audience sentiment and engagement patterns.
    • Hashtag & Trend Tracking: Monitor trending hashtags, topics, and viral content across platforms to stay ahead of industry trends and consumer interests.

    Customizable Subsets for Specific Needs

    Our Social Media Dataset is fully customizable, allowing you to filter data based on platform, region, keywords, engagement levels, or specific user profiles. Whether you need a broad dataset for market research or a focused subset for brand monitoring, we tailor the dataset to your needs.

    Popular Use Cases

    • Brand Monitoring & Reputation Management: Track brand mentions, customer feedback, and sentiment analysis to manage online reputation effectively.
    • Influencer Marketing & Audience Analysis: Identify key influencers, analyze engagement metrics, and optimize influencer partnerships.
    • Competitive Intelligence: Monitor competitor activity, content performance, and audience engagement to refine marketing strategies.
    • Market Research & Consumer Insights: Analyze social media trends, customer preferences, and emerging topics to inform business decisions.
    • AI & Predictive Analytics: Leverage structured social media data for AI-driven trend forecasting, sentiment analysis, and automated content recommendations.

    Whether you're tracking brand sentiment, analyzing audience engagement, or monitoring industry trends, our Social Media Dataset provides the structured data you need. Get started today and customize your dataset to fit your business objectives.

  9. Dataset of directed signed networks from social domain

    • figshare.com
    zip
    Updated Sep 4, 2020
    Cite
    Samin Aref; Ly Dinh; Rezvaneh Rezapour (2020). Dataset of directed signed networks from social domain [Dataset]. http://doi.org/10.6084/m9.figshare.12152628.v3
    Explore at:
    Available download formats: zip
    Dataset updated
    Sep 4, 2020
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Samin Aref; Ly Dinh; Rezvaneh Rezapour
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains a range of directed signed networks (signed digraphs) from the social domain. The data come from 9 different sources and in total there are 29 network files. There are two temporal networks and one multilayer network in this dataset. Each network is provided in two formats: edgelist (.csv) and .gml.

    This dataset is provided under a CC BY-NC-SA Creative Commons v 4.0 license (Attribution-NonCommercial-ShareAlike). This means that other individuals may remix, tweak, and build upon these data non-commercially, as long as they provide citations to this data repository (https://doi.org/10.6084/m9.figshare.12152628) and the reference article listed below (https://doi.org/10.1038/s41598-020-71838-6), and license the new creations under the identical terms.

    For more information about the data, one may refer to the article below:

    Samin Aref, Ly Dinh, Rezvaneh Rezapour, and Jana Diesner. "Multilevel Structural Evaluation of Signed Directed Social Networks based on Balance Theory" Scientific Reports (2020) https://doi.org/10.1038/s41598-020-71838-6

  10. U.S. Facebook data requests from government agencies 2013-2023

    • statista.com
    • de.statista.com
    Cite
    Stacy Jo Dixon, U.S. Facebook data requests from government agencies 2013-2023 [Dataset]. https://www.statista.com/topics/1164/social-networks/
    Explore at:
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Stacy Jo Dixon
    Description

    Facebook received 73,390 user data requests from federal agencies and courts in the United States during the second half of 2023. The social network produced some user data in 88.84 percent of requests from U.S. federal authorities. The United States accounts for the largest share of Facebook user data requests worldwide.

  11. Replication data for: Power Positions: International Organizations, Social...

    • dataverse.harvard.edu
    Updated Nov 28, 2007
    Cite
    Emilie M. Hafner-Burton; Alexander Montgomery (2007). Replication data for: Power Positions: International Organizations, Social Networks, and Conflict [Dataset]. http://doi.org/10.7910/DVN/B9YYYH
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Nov 28, 2007
    Dataset provided by
    Harvard Dataverse
    Authors
    Emilie M. Hafner-Burton; Alexander Montgomery
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Time period covered
    1885 - 1992
    Description

    A growing number of international relations scholars argue that intergovernmental organizations (IGOs) promote peace. Existing approaches emphasize IGO membership as an important causal attribute of individual states, much like economic development and regime type. The authors draw upon social network analysis, arguing that conflicts between states are also shaped by relative positions of social power created by IGO memberships and characterized by significant disparity. Membership partitions states into structurally equivalent clusters and establishes hierarchies of prestige in the international system. These relative positions promote common beliefs and alter the distribution of social power, making certain policy strategies more practical or rational. The authors introduce new IGO relational data and explore the empirical merits of their approach during the period from 1885 to 1992. They demonstrate that conflict is increased by the presence of many other states in structurally equivalent clusters, while large prestige disparities and in-group favoritism decrease it.

  12. Social media as a news outlet worldwide 2024

    • statista.com
    • de.statista.com
    + more versions
    Cite
    Amy Watson, Social media as a news outlet worldwide 2024 [Dataset]. https://www.statista.com/topics/1164/social-networks/
    Explore at:
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Amy Watson
    Description

    During a 2024 survey, 77 percent of respondents from Nigeria stated that they used social media as a source of news. In comparison, just 23 percent of Japanese respondents said the same. Large portions of social media users around the world admit that they do not trust social platforms either as media sources or as a way to get news, and yet they continue to access such networks on a daily basis.

    Social media: trust and consumption

    Despite the majority of adults surveyed in each country reporting that they used social networks to keep up to date with news and current affairs, a 2018 study showed that social media is the least trusted news source in the world. Less than 35 percent of adults in Europe considered social networks to be trustworthy in this respect, yet more than 50 percent of adults in Portugal, Poland, Romania, Hungary, Bulgaria, Slovakia and Croatia said that they got their news on social media.

    What is clear is that we live in an era where social media is such an enormous part of daily life that consumers will still use it in spite of their doubts or reservations. Concerns about fake news and propaganda on social media have not stopped billions of users accessing their favorite networks on a daily basis. Most Millennials in the United States use social media for news every day, and younger consumers in European countries are much more likely to use social networks for national political news than their older peers. Like it or not, reading news on social media is fast becoming the norm for younger generations, and this form of news consumption will likely increase further regardless of whether consumers fully trust their chosen network or not.
    
  13. LiveJournal Social Network with Communities (SNAP)

    • kaggle.com
    zip
    Updated Dec 16, 2021
    Cite
    Subhajit Sahu (2021). LiveJournal Social Network with Communities (SNAP) [Dataset]. https://www.kaggle.com/datasets/wolfram77/graphs-snap-com-livejournal
    Explore at:
    Available download formats: zip (162104147 bytes)
    Dataset updated
    Dec 16, 2021
    Authors
    Subhajit Sahu
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    LiveJournal social network and ground-truth communities

    https://snap.stanford.edu/data/com-LiveJournal.html

    Dataset information

    LiveJournal (http://www.livejournal.com/) is a free on-line blogging
    community where users declare friendship with each other. LiveJournal also
    allows users to form a group which other members can then join. We consider
    such user-defined groups as ground-truth communities. We provide the
    LiveJournal friendship social network and ground-truth communities.

    We regard each connected component in a group as a separate ground-truth
    community. We remove the ground-truth communities which have fewer than 3
    nodes. We also provide the top 5,000 communities with highest quality
    which are described in our paper (http://arxiv.org/abs/1205.6233). As for
    the network, we provide the largest connected component.

    Dataset statistics
    Nodes 3,997,962
    Edges 34,681,189
    Nodes in largest WCC 3997962 (1.000)
    Edges in largest WCC 34681189 (1.000)
    Nodes in largest SCC 3997962 (1.000)
    Edges in largest SCC 34681189 (1.000)
    Average clustering coefficient 0.2843
    Number of triangles 177820130
    Fraction of closed triangles 0.04559
    Diameter (longest shortest path) 17
    90-percentile effective diameter 6.5

    Source (citation)
    J. Yang and J. Leskovec. Defining and Evaluating Network Communities based on Ground-truth. ICDM, 2012. http://arxiv.org/abs/1205.6233

    Files
    File                          Description
    com-lj.ungraph.txt.gz         Undirected LiveJournal network
    com-lj.all.cmty.txt.gz        LiveJournal communities
    com-lj.top5000.cmty.txt.gz    LiveJournal communities (Top 5,000)

    Notes on inclusion into the SuiteSparse Matrix Collection, July 2018:

    The graph in the SNAP data set is 0-based, with nodes numbering 0 to
    4,036,537.

    In the SuiteSparse Matrix Collection, Problem.A is the undirected
    LiveJournal network, a matrix of size n-by-n with n=3,997,962, which is
    the number of unique user id's appearing in any edge.

    Problem.aux.nodeid is a list of the node id's that appear in the SNAP data set. A(i,j)=1 if person nodeid(i) is friends with person nodeid(j). The
    node id's are the same as the SNAP data set (0-based).

    C = Problem.aux.Communities_all is a sparse matrix of size n by 664,414
    which represents the communities in the com-lj.all.cmty.txt file. The kth line in that file defines the kth community, and is the column C(:,k),
    where C(i,k)=1 if person nodeid(i) is in the kth community. Row C(i,:)
    and row/column i of the A matrix thus refer to the same person, nodeid(i).

    Ctop = Problem.aux.Communities_top5000 is n-by-5000, with the same
    structure as the C array above, with the content of the
    com-lj.top5000.cmty.txt file.
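The community matrix described above can be illustrated with a small sketch. It assumes each line of the community file lists the node ids of one community separated by whitespace, and it records the nonzero positions (i, k) of C as a set rather than building a real sparse matrix. Indices are 0-based here, and the ids and communities are made up.

```python
def parse_communities(lines):
    """Return one list of integer node ids per non-blank line (one community per line)."""
    return [[int(token) for token in line.split()] for line in lines if line.strip()]


def membership_entries(communities, index_of):
    """Return the set of (i, k) pairs where C(i, k) = 1.

    i is the row index of a node (via index_of), k is the community's line number.
    Node ids missing from index_of are skipped.
    """
    entries = set()
    for k, members in enumerate(communities):
        for node in members:
            i = index_of.get(node)
            if i is not None:
                entries.add((i, k))
    return entries


# Made-up node ids (with gaps) and two made-up communities.
nodeid = [0, 3, 4, 7]
index_of = {node: i for i, node in enumerate(nodeid)}
communities = parse_communities(["0\t3\t4", "4\t7"])
C_entries = membership_entries(communities, index_of)
```

Because row i of C and row/column i of A both go through the same `index_of` mapping, they refer to the same person, which is the property the notes above describe.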

  14. A Blue Start: A large-scale pairwise and higher-order social network dataset...

    • socialmediaarchive.org
    csv, json, pdf, zip
    Updated May 9, 2025
    Cite
    (2025). A Blue Start: A large-scale pairwise and higher-order social network dataset [Dataset]. https://socialmediaarchive.org/record/78
    Available download formats: json(619104394), zip(8502960667), json(132710350), pdf(73964), csv(85077755)
    Dataset updated
    May 9, 2025
    Description

    This dataset consists of all starter packs and all following network data available on Bluesky in January and February 2025. Starter packs can be created by any Bluesky user. They are lists of users and curated feeds with a minimum of 6 and a maximum of 150 users, curated by the starter pack creator. The creator typically names them and provides a description. Other users can use a single click to follow all users in the starter pack, or they can scroll through a specific starter pack to decide who to follow within that starter pack. In our dataset, all DIDs (persistent, unique identifiers) are anonymized with a non-reversible hash function; users in the network, as well as users who created starter packs, or appear in starter packs, are identified by their hashed DIDs. Similarly, starter packs themselves are identified by their hashed identifiers.

    First, we include the Bluesky following network as it appeared in late January/early February 2025. This shows all available directed following relationships on Bluesky. We also include a network dataset of starter packs with information on creators and starter pack members. This is intended for users who wish to undertake a computational analysis of the networks created by starter packs or starter packs’ influences on networks.

  15. Dataset for Social Media Activity, Number of Friends, and Relationship...

    • eprints.soton.ac.uk
    Updated Jul 8, 2022
    Cite
    Elder, Lindsay; Brignell, Catherine; Cooke, Tim (2022). Dataset for Social Media Activity, Number of Friends, and Relationship Quality [Dataset]. http://doi.org/10.5258/SOTON/D1955
    Dataset updated
    Jul 8, 2022
    Dataset provided by
    University of Southampton
    Authors
    Elder, Lindsay; Brignell, Catherine; Cooke, Tim
    Description

    The data from my thesis. The data was collected using the Lifeguide Software and exported to SPSS after collection. Participants were young people aged 11-18 years, and the aim was to explore the impact of different types of social media use.

  16. GitHub Social Network

    • kaggle.com
    Updated Jan 12, 2023
    Cite
    Gitanjali Wadhwa (2023). GitHub Social Network [Dataset]. https://www.kaggle.com/datasets/gitanjali1425/github-social-network-graph-dataset
    Dataset updated
    Jan 12, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Gitanjali Wadhwa
    Description


    A large social network of GitHub developers, collected from the public API in June 2019. Nodes are developers who have starred at least 10 repositories, and edges are mutual follower relationships between them. The vertex features are extracted from the location, repositories starred, employer, and e-mail address. The task related to the graph is binary node classification: one has to predict whether the GitHub user is a web or a machine learning developer. This target feature was derived from the job title of each user.

    Properties

    • Directed: No.
    • Node features: Yes.
    • Edge features: No.
    • Node labels: Yes. Binary-labeled.
    • Temporal: No.
    • Nodes: 37,700
    • Edges: 289,003
    • Density: 0.001
    • Transitivity: 0.013

    Possible Tasks

    • Binary node classification
    • Link prediction
    • Community detection
    • Network visualisation
  17. Social media profile growth, engagement rate, and reach

    • data.sugarlandtx.gov
    xlsx
    Updated Jan 3, 2024
    Cite
    Communications and Community Engagement (2024). Social media profile growth, engagement rate, and reach [Dataset]. https://data.sugarlandtx.gov/dataset/social-media-profile-growth-engagement-rate-and-reach
    Available download formats: xlsx
    Dataset updated
    Jan 3, 2024
    Dataset authored and provided by
    Communications and Community Engagement
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Profile growth - the growth on our social platforms, to see where and when we're gaining followers. Engagement rate - a ratio of how many people interacted with our posts, based on when users are usually online. Reach - the number of feeds our posts appeared in (this doesn't mean people interacted with the post).

  18. Planned changes in use of selected social media for organic marketing...

    • statista.com
    • de.statista.com
    Cite
    Christopher Ross, Planned changes in use of selected social media for organic marketing worldwide 2024 [Dataset]. https://www.statista.com/topics/1164/social-networks/
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Christopher Ross
    Description

    During a January 2024 global survey among marketers, nearly 60 percent reported plans to increase their organic use of YouTube for marketing purposes in the following 12 months. LinkedIn and Instagram followed, respectively mentioned by 57 and 56 percent of the respondents intending to use them more. According to the same survey, Facebook was the most important social media platform for marketers worldwide.

  19. Removal and Enforcement Actions by Social Media Companies: Year and Month...

    • dataful.in
    Updated Nov 5, 2025
    Cite
    Dataful (Factly) (2025). Removal and Enforcement Actions by Social Media Companies: Year and Month wise Number of Content Removed and Accounts Banned/Suspended by SSMIs and Violation Category [Dataset]. https://dataful.in/datasets/18652
    Available download formats: csv, application/x-parquet, xlsx
    Dataset updated
    Nov 5, 2025
    Dataset authored and provided by
    Dataful (Factly)
    License

    https://dataful.in/terms-and-conditions

    Area covered
    India
    Variables measured
    Social Media Intermediaries Ban actions
    Description

    This dataset presents year and month wise enforcement actions taken by Significant Social Media Intermediaries (SSMIs) from 2021 to the present, compiled from the mandatory monthly transparency reports published under Rule 4(1)(d) of the Information Technology (Intermediary Guidelines and Digital Media Ethics Code) Rules, 2021. It includes counts of content removed, accounts suspended or banned, and chatrooms, comments, edit profiles and livestreams restricted, along with the policy or violation category (e.g., child sexual exploitation, terrorism, hate speech, bullying, violence, regulated goods, misinformation, etc.).

    To enable comparability across platforms with different reporting terms, the dataset uses a standardised enforcement classification:

    1. enforcement_type: The type of action taken: a. Content Actioned (any enforcement such as warning, downranking, age-gating), b. Content Removed (content deleted or made inaccessible), c. Account Banned (account suspension or disabling), d. Quality Metric (AI moderation accuracy indicators reported by some platforms).

    2. proactive_flag: Whether the platform identified and enforced before user reports: a. Proactive = Found via automated detection or internal review systems, b. Unknown = Platform did not specify proactive vs reactive.

    Notes: 1. SSMI denotes Significant Social Media Intermediaries: intermediaries with over 50,00,000 registered users in India that primarily or solely enable online interaction between two or more users and allow them to create, upload, share, disseminate, modify or access information using their services.

    1. Facebook & Instagram (Meta) a. Content Actioned counts any enforcement, not only removals (e.g., removals, warning screens/covering, age gates, downranking). b. Proactive Rate = (items found & actioned proactively) ÷ (total content actioned).

    2. X/Twitter a. Child Sexual Exploitation and terrorism suspensions are largely proactive, flagged using proprietary tools and industry hash-sharing systems. b. Data reflects global enforcement, not only India.

    3. Google / YouTube a. Number of removal actions as a result of automated detection captures actions triggered by automated systems (ML + human-trained models).

    4. ShareChat a. Content Removed / Taken Down / UGC discard / Comments/Chatrooms deleted are standardised as Content Removed. b. Also includes rights-holder reporting workflow for copyright/IP and automated proactive monitoring for harmful content.

    5. WhatsApp a. Reports Proactively Banned Accounts, meaning accounts banned before any user reports.

    6. Koo a. Distinguishes between Content Removed, Content Actioned (flagged/downranked), and Account Banned. b. Automation Correct/Wrong reflect AI moderation accuracy, not enforcement outcomes.
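    The Proactive Rate defined in the Meta note can be illustrated with a small sketch. The field names enforcement_type and proactive_flag come from the description above; the platform and count keys, and all the figures, are hypothetical placeholders rather than values from the dataset.

```python
from typing import Dict, List, Union

Row = Dict[str, Union[str, int]]


def proactive_rate(proactive_actioned: int, total_actioned: int) -> float:
    """Proactive Rate = items found and actioned proactively / total content actioned."""
    if total_actioned <= 0:
        raise ValueError("total_actioned must be positive")
    return proactive_actioned / total_actioned


# Hypothetical rows in the standardised classification (not real figures).
rows: List[Row] = [
    {"platform": "Platform A", "enforcement_type": "Content Actioned",
     "proactive_flag": "Proactive", "count": 900},
    {"platform": "Platform A", "enforcement_type": "Content Actioned",
     "proactive_flag": "Unknown", "count": 100},
    {"platform": "Platform A", "enforcement_type": "Content Removed",
     "proactive_flag": "Proactive", "count": 50},
]

# Only "Content Actioned" rows enter the rate; other enforcement types are ignored.
actioned = [r for r in rows if r["enforcement_type"] == "Content Actioned"]
total = sum(int(r["count"]) for r in actioned)
proactive = sum(int(r["count"]) for r in actioned if r["proactive_flag"] == "Proactive")

rate = proactive_rate(proactive, total)
print(rate)
```

    Rows flagged Unknown stay in the denominator here, which is one possible reading of "total content actioned"; a platform's own report may treat them differently.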

  20. Abbreviated FOMO and social media dataset

    • researchdata.edu.au
    • figshare.mq.edu.au
    Updated Jul 7, 2022
    Cite
    Ron Rapee; McEvoy, Peter; Maree J. Abbott; Madeleine Ferrari; Eyal Karin; Danielle Einstein; Carol Dabb; Anne McMaugh (2022). Abbreviated FOMO and social media dataset [Dataset]. http://doi.org/10.25949/20188298.V1
    Dataset updated
    Jul 7, 2022
    Dataset provided by
    Macquarie University
    Authors
    Ron Rapee; McEvoy, Peter; Maree J. Abbott; Madeleine Ferrari; Eyal Karin; Danielle Einstein; Carol Dabb; Anne McMaugh
    Description

    This database is comprised of 951 participants who provided self-report data online in their school classrooms. The data was collected in 2016 and 2017. The dataset is comprised of 509 males (54%) and 442 females (46%). Their ages ranged from 12 to 16 years (M = 13.69, SD = 0.72). Seven participants did not report their age. The majority were born in Australia (N = 849, 89%). The next most common countries of birth were China (N = 24, 2.5%), the UK (N = 23, 2.4%), and the USA (N = 9, 0.9%). Data were drawn from students at five Australian independent secondary schools.

    The data contains item responses for the Spence Children’s Anxiety Scale (SCAS; Spence, 1998), which is comprised of 44 items. The social media question asked about frequency of use: “How often do you use social media?”. The response options ranged from constantly to once a week or less. Items measuring Fear of Missing Out were included and incorporated the following five questions based on the APS Stress and Wellbeing in Australia Survey (APS, 2015). These were “When I have a good time it is important for me to share the details online; I am afraid that I will miss out on something if I don’t stay connected to my online social networks; I feel worried and uncomfortable when I can’t access my social media accounts; I find it difficult to relax or sleep after spending time on social networking sites; I feel my brain burnout with the constant connectivity of social media.” Internal consistency for this measure was α = .81. Self-compassion was measured using the 12-item short form of the Self-Compassion Scale (SCS-SF; Raes et al., 2011).

    The dataset can be downloaded as an Excel file (composed of two worksheet tabs) or as CSV files: 1) Data and 2) Variable labels.

    References:

    Australian Psychological Society. (2015). Stress and wellbeing in Australia survey. https://www.headsup.org.au/docs/default-source/default-document-library/stress-and-wellbeing-in-australia-report.pdf?sfvrsn=7f08274d_4

    Raes, F., Pommier, E., Neff, K. D., & Van Gucht, D. (2011). Construction and factorial validation of a short form of the self-compassion scale. Clinical Psychology and Psychotherapy, 18(3), 250-255. https://doi.org/10.1002/cpp.702

    Spence, S. H. (1998). A measure of anxiety symptoms among children. Behaviour Research and Therapy, 36(5), 545-566. https://doi.org/10.1016/S0005-7967(98)00034-5

Cite
Shaik Barood Mohammed Umar Adnaan Faiz (2025). Daily Social Media Active Users [Dataset]. https://www.kaggle.com/datasets/umeradnaan/daily-social-media-active-users

Daily Social Media Active Users

A thorough dataset that displays user activity on major social media platforms

Available download formats: zip(126814 bytes)
Dataset updated
May 5, 2025
Authors
Shaik Barood Mohammed Umar Adnaan Faiz
License

https://creativecommons.org/publicdomain/zero/1.0/


Description:

The "Daily Social Media Active Users" dataset provides a comprehensive and dynamic look into the digital presence and activity of global users across major social media platforms. The data was generated to simulate real-world usage patterns for 13 popular platforms, including Facebook, YouTube, WhatsApp, Instagram, WeChat, TikTok, Telegram, Snapchat, X (formerly Twitter), Pinterest, Reddit, Threads, LinkedIn, and Quora. This dataset contains 10,000 rows and includes several key fields that offer insights into user demographics, engagement, and usage habits.

Dataset Breakdown:

  • Platform: The name of the social media platform where the user activity is tracked. It includes globally recognized platforms, such as Facebook, YouTube, and TikTok, that are known for their large, active user bases.

  • Owner: The company or entity that owns and operates the platform. Examples include Meta for Facebook, Instagram, and WhatsApp, Google for YouTube, and ByteDance for TikTok.

  • Primary Usage: This category identifies the primary function of each platform. Social media platforms differ in their primary usage, whether it's for social networking, messaging, multimedia sharing, professional networking, or more.

  • Country: The geographical region where the user is located. The dataset simulates global coverage, showcasing users from diverse locations and regions. It helps in understanding how user behavior varies across different countries.

  • Daily Time Spent (min): This field tracks how much time a user spends on a given platform on a daily basis, expressed in minutes. Time spent data is critical for understanding user engagement levels and the popularity of specific platforms.

  • Verified Account: Indicates whether the user has a verified account. This feature mimics real-world patterns where verified users (often public figures, businesses, or influencers) have enhanced status on social media platforms.

  • Date Joined: The date when the user registered or started using the platform. This data simulates user account history and can provide insights into user retention trends or platform growth over time.
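
As a rough illustration of how the fields above might be used, the sketch below averages daily time spent per platform. The column headers are assumed to match the field names in the breakdown exactly, and the three sample rows (including the date format) are hypothetical stand-ins for the real file.

```python
import csv
import io
from collections import defaultdict
from typing import Dict

# Hypothetical sample; the real file's headers and value formats may differ.
SAMPLE = """Platform,Owner,Primary Usage,Country,Daily Time Spent (min),Verified Account,Date Joined
Facebook,Meta,Social Networking,India,30,Yes,2019-04-01
Facebook,Meta,Social Networking,Brazil,50,No,2020-07-15
YouTube,Google,Multimedia Sharing,Japan,90,No,2018-01-20
"""


def mean_minutes_by_platform(csv_text: str) -> Dict[str, float]:
    """Average the 'Daily Time Spent (min)' column for each platform."""
    totals = defaultdict(lambda: [0.0, 0])  # platform -> [sum of minutes, row count]
    for row in csv.DictReader(io.StringIO(csv_text)):
        bucket = totals[row["Platform"]]
        bucket[0] += float(row["Daily Time Spent (min)"])
        bucket[1] += 1
    return {platform: total / count for platform, (total, count) in totals.items()}


means = mean_minutes_by_platform(SAMPLE)
print(means)
```

To run this against the downloaded file, the text of the CSV could be read from disk and passed in place of SAMPLE, after checking the actual header names.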

Context and Use Cases:

  • This synthetic dataset is designed to offer a privacy-friendly alternative for analytics, research, and machine learning purposes. Given the complexities and privacy concerns around using real user data, especially in the context of social media, this dataset offers a clean and secure way to develop, test, and fine-tune applications, models, and algorithms without the risks of handling sensitive or personal information.

Researchers, data scientists, and developers can use this dataset to:

  • Model User Behavior: By analyzing patterns in daily time spent, verified status, and country of origin, users can model and predict social media engagement behavior.

  • Test Analytics Tools: Social media monitoring and analytics platforms can use this dataset to simulate user activity and optimize their tools for engagement tracking, reporting, and visualization.

  • Train Machine Learning Algorithms: The dataset can be used to train models for various tasks like user segmentation, recommendation systems, or churn prediction based on engagement metrics.

  • Create Dashboards: This dataset can serve as the foundation for creating user-friendly dashboards that visualize user trends, platform comparisons, and engagement patterns across the globe.

  • Conduct Market Research: Business intelligence teams can use the data to understand how various demographics use social media, offering valuable insights into the most engaged regions, platform preferences, and usage behaviors.

  • Sources of Inspiration: This dataset is inspired by public data from industry reports, such as those from Statista, DataReportal, and other market research platforms. These sources provide insights into the global user base and usage statistics of popular social media platforms. The synthetic nature of this dataset allows for the use of realistic engagement metrics without violating any privacy concerns, making it an ideal tool for educational, analytical, and research purposes.

The structure and design of the dataset are based on real-world usage patterns and aim to represent a variety of users from different backgrounds, countries, and activity levels. This diversity makes it an ideal candidate for testing data-driven solutions and exploring social media trends.

Future Considerations:

As the social media landscape continues to evolve, this dataset can be updated or extended to include new platforms, engagement metrics, or user behaviors. Future iterations may incorporate features like post frequency, follower counts, engagement rates (likes, comments, shares), or even sentiment analysis from user-generated content.

By leveraging this dataset, analysts and data scientists can create better, more effective strategies ...
