100+ datasets found
  1. Social Media Datasets

    • brightdata.com
    .json, .csv, .xlsx
    Updated Sep 7, 2022
    Cite
    Bright Data (2022). Social Media Datasets [Dataset]. https://brightdata.com/products/datasets/social-media
    Available download formats
    .json, .csv, .xlsx
    Dataset updated
    Sep 7, 2022
    Dataset authored and provided by
    Bright Data: https://brightdata.com/
    License

    https://brightdata.com/license

    Area covered
    Worldwide
    Description

    Gain valuable insights with our comprehensive Social Media Dataset, designed to help businesses, marketers, and analysts track trends, monitor engagement, and optimize strategies. This dataset provides structured and reliable social media data from multiple platforms.

    Dataset Features

    • User Profiles: Access public social media profiles, including usernames, bios, follower counts, engagement metrics, and more. Ideal for audience analysis, influencer marketing, and competitive research.

    • Posts & Content: Extract posts, captions, hashtags, media (images/videos), timestamps, and engagement metrics such as likes, shares, and comments. Useful for trend analysis, sentiment tracking, and content strategy optimization.

    • Comments & Interactions: Analyze user interactions, including replies, mentions, and discussions. This data helps brands understand audience sentiment and engagement patterns.

    • Hashtag & Trend Tracking: Monitor trending hashtags, topics, and viral content across platforms to stay ahead of industry trends and consumer interests.

    Customizable Subsets for Specific Needs

    Our Social Media Dataset is fully customizable, allowing you to filter data based on platform, region, keywords, engagement levels, or specific user profiles. Whether you need a broad dataset for market research or a focused subset for brand monitoring, we tailor the dataset to your needs.

    Popular Use Cases

    • Brand Monitoring & Reputation Management: Track brand mentions, customer feedback, and sentiment analysis to manage online reputation effectively.

    • Influencer Marketing & Audience Analysis: Identify key influencers, analyze engagement metrics, and optimize influencer partnerships.

    • Competitive Intelligence: Monitor competitor activity, content performance, and audience engagement to refine marketing strategies.

    • Market Research & Consumer Insights: Analyze social media trends, customer preferences, and emerging topics to inform business decisions.

    • AI & Predictive Analytics: Leverage structured social media data for AI-driven trend forecasting, sentiment analysis, and automated content recommendations.

    Whether you're tracking brand sentiment, analyzing audience engagement, or monitoring industry trends, our Social Media Dataset provides the structured data you need. Get started today and customize your dataset to fit your business objectives.

  2. Most used social networks 2025, by number of users

    • statista.com
    • abripper.com
    • +2 more
    Updated Oct 16, 2025
    Cite
    Statista (2025). Most used social networks 2025, by number of users [Dataset]. https://www.statista.com/statistics/272014/global-social-networks-ranked-by-number-of-users/
    Dataset updated
    Oct 16, 2025
    Dataset authored and provided by
    Statista: http://statista.com/
    Area covered
    Worldwide
    Description

    Market leader Facebook was the first social network to surpass one billion registered accounts and currently sits at more than three billion monthly active users. Meta Platforms owns four of the biggest social media platforms, all with more than one billion monthly active users each: Facebook (core platform), WhatsApp, Messenger, and Instagram. In the third quarter of 2023, Facebook reported around four billion monthly core Family product users.

    The United States and China account for the most high-profile social platforms

    Most top-ranked social networks with more than 100 million users originated in the United States, but services like Chinese social networks WeChat, QQ, or video-sharing app Douyin have also garnered mainstream appeal in their respective regions due to local context and content. Douyin’s popularity has led to the platform releasing an international version of its network, TikTok.

    How many people use social media?

    The leading social networks are usually available in multiple languages and enable users to connect with friends or people across geographical, political, or economic borders. In 2025, social networking sites are estimated to reach 5.44 billion users, and these figures are still expected to grow as mobile device usage and mobile social networks increasingly gain traction in previously underserved markets.

  3. Daily Social Media Active Users

    • kaggle.com
    zip
    Updated May 5, 2025
    Cite
    Shaik Barood Mohammed Umar Adnaan Faiz (2025). Daily Social Media Active Users [Dataset]. https://www.kaggle.com/datasets/umeradnaan/daily-social-media-active-users
    Available download formats
    zip (126814 bytes)
    Dataset updated
    May 5, 2025
    Authors
    Shaik Barood Mohammed Umar Adnaan Faiz
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    The "Daily Social Media Active Users" dataset provides a comprehensive and dynamic look into the digital presence and activity of global users across major social media platforms. The data was generated to simulate real-world usage patterns for more than a dozen popular platforms, including Facebook, YouTube, WhatsApp, Instagram, WeChat, TikTok, Telegram, Snapchat, X (formerly Twitter), Pinterest, Reddit, Threads, LinkedIn, and Quora. This dataset contains 10,000 rows and includes several key fields that offer insights into user demographics, engagement, and usage habits.

    Dataset Breakdown:

    • Platform: The name of the social media platform where the user activity is tracked. It includes globally recognized platforms, such as Facebook, YouTube, and TikTok, that are known for their large, active user bases.

    • Owner: The company or entity that owns and operates the platform. Examples include Meta for Facebook, Instagram, and WhatsApp, Google for YouTube, and ByteDance for TikTok.

    • Primary Usage: This category identifies the primary function of each platform. Social media platforms differ in their primary usage, whether it's for social networking, messaging, multimedia sharing, professional networking, or more.

    • Country: The geographical region where the user is located. The dataset simulates global coverage, showcasing users from diverse locations and regions. It helps in understanding how user behavior varies across different countries.

    • Daily Time Spent (min): This field tracks how much time a user spends on a given platform on a daily basis, expressed in minutes. Time spent data is critical for understanding user engagement levels and the popularity of specific platforms.

    • Verified Account: Indicates whether the user has a verified account. This feature mimics real-world patterns where verified users (often public figures, businesses, or influencers) have enhanced status on social media platforms.

    • Date Joined: The date when the user registered or started using the platform. This data simulates user account history and can provide insights into user retention trends or platform growth over time.
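
    The fields listed above lend themselves to a simple per-platform summary. The sketch below is illustrative only: it assumes the CSV headers match the field names in the breakdown exactly, and the sample rows (including the Yes/No encoding and the date format) are invented rather than taken from the dataset. `mean_daily_minutes` is a hypothetical helper name.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows. The headers follow the field names described
# above; the values are invented for illustration.
SAMPLE_CSV = """Platform,Owner,Primary Usage,Country,Daily Time Spent (min),Verified Account,Date Joined
Facebook,Meta,Social Networking,India,30,Yes,2019-04-01
Facebook,Meta,Social Networking,Brazil,50,No,2021-08-15
TikTok,ByteDance,Multimedia Sharing,USA,90,No,2022-01-10
"""

def mean_daily_minutes(csv_text):
    """Return {platform: mean daily minutes} for the rows in csv_text."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        platform = row["Platform"]
        totals[platform] += float(row["Daily Time Spent (min)"])
        counts[platform] += 1
    return {p: totals[p] / counts[p] for p in totals}

print(mean_daily_minutes(SAMPLE_CSV))
```

    To run this against the real file, read the downloaded CSV's text in place of `SAMPLE_CSV` and check the actual header spelling first.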

    Context and Use Cases:

    • This synthetic dataset is designed to offer a privacy-friendly alternative for analytics, research, and machine learning purposes. Given the complexities and privacy concerns around using real user data, especially in the context of social media, this dataset offers a clean and secure way to develop, test, and fine-tune applications, models, and algorithms without the risks of handling sensitive or personal information.

    Researchers, data scientists, and developers can use this dataset to:

    • Model User Behavior: By analyzing patterns in daily time spent, verified status, and country of origin, users can model and predict social media engagement behavior.

    • Test Analytics Tools: Social media monitoring and analytics platforms can use this dataset to simulate user activity and optimize their tools for engagement tracking, reporting, and visualization.

    • Train Machine Learning Algorithms: The dataset can be used to train models for various tasks like user segmentation, recommendation systems, or churn prediction based on engagement metrics.

    • Create Dashboards: This dataset can serve as the foundation for creating user-friendly dashboards that visualize user trends, platform comparisons, and engagement patterns across the globe.

    • Conduct Market Research: Business intelligence teams can use the data to understand how various demographics use social media, offering valuable insights into the most engaged regions, platform preferences, and usage behaviors.

    • Sources of Inspiration: This dataset is inspired by public data from industry reports, such as those from Statista, DataReportal, and other market research platforms. These sources provide insights into the global user base and usage statistics of popular social media platforms. The synthetic nature of this dataset allows for the use of realistic engagement metrics without violating any privacy concerns, making it an ideal tool for educational, analytical, and research purposes.

    The structure and design of the dataset are based on real-world usage patterns and aim to represent a variety of users from different backgrounds, countries, and activity levels. This diversity makes it an ideal candidate for testing data-driven solutions and exploring social media trends.

    Future Considerations:

    As the social media landscape continues to evolve, this dataset can be updated or extended to include new platforms, engagement metrics, or user behaviors. Future iterations may incorporate features like post frequency, follower counts, engagement rates (likes, comments, shares), or even sentiment analysis from user-generated content.

    By leveraging this dataset, analysts and data scientists can create better, more effective strategies ...

  4. Number of social network users worldwide 2017-2030

    • statista.com
    Cite
    Statista, Number of social network users worldwide 2017-2030 [Dataset]. https://www.statista.com/statistics/278414/number-of-worldwide-social-network-users/
    Dataset authored and provided by
    Statista: http://statista.com/
    Area covered
    World
    Description

    How many people use social media?

    Social media usage is one of the most popular online activities. In 2025, over *** billion people were estimated to be using social media worldwide, a number projected to increase to over *** billion in 2030.

    Who uses social media?

    Social networking is one of the most popular digital activities worldwide, and it is no surprise that social networking penetration across all regions is constantly increasing. As of January 2023, the global social media usage rate stood at ** percent. This figure is anticipated to grow as less developed digital markets catch up with other regions when it comes to infrastructure development and the availability of cheap mobile devices. In fact, most of social media’s global growth is driven by the increasing usage of mobile devices. The mobile-first market of Eastern Asia topped the global ranking of mobile social networking penetration, followed by established digital powerhouses such as the Americas and Northern Europe.

    How much time do people spend on social media?

    Social media is an integral part of daily internet usage. On average, internet users spend *** minutes per day on social media and messaging apps, an increase of ** minutes since 2015. On average, internet users in Latin America had the highest average time spent per day on social media.

    What are the most popular social media platforms?

    Market leader Facebook was the first social network to surpass *** billion registered accounts and currently boasts approximately *** billion monthly active users, making it the most popular social network worldwide. In June 2023, the top social media apps in the Apple App Store included mobile messaging apps WhatsApp and Telegram Messenger, as well as the ever-popular app version of Facebook.

  5. Removal and Enforcement Actions by Social Media Companies: Year and Month...

    • dataful.in
    Updated Nov 5, 2025
    Cite
    Dataful (Factly) (2025). Removal and Enforcement Actions by Social Media Companies: Year and Month wise Number of Content Removed and Accounts Banned/Suspended by SSMIs and Violation Category [Dataset]. https://dataful.in/datasets/18652
    Available download formats
    csv, application/x-parquet, xlsx
    Dataset updated
    Nov 5, 2025
    Dataset authored and provided by
    Dataful (Factly)
    License

    https://dataful.in/terms-and-conditions

    Area covered
    India
    Variables measured
    Social Media Intermediaries Ban actions
    Description

    This dataset presents year and month wise enforcement actions taken by Significant Social Media Intermediaries (SSMIs) from 2021 to the present, compiled from the mandatory monthly transparency reports published under Rule 4(1)(d) of the Information Technology (Intermediary Guidelines and Digital Media Ethics Code) Rules, 2021. It includes counts of content removed, accounts suspended or banned, and chatrooms, comments, edit profiles and livestreams restricted, along with the policy or violation category (e.g., child sexual exploitation, terrorism, hate speech, bullying, violence, regulated goods, misinformation, etc.).

    To enable comparability across platforms with different reporting terms, the dataset uses a standardised enforcement classification:

    1. enforcement_type: The type of action taken:
       a. Content Actioned (any enforcement such as warning, downranking, age-gating)
       b. Content Removed (content deleted or made inaccessible)
       c. Account Banned (account suspension or disabling)
       d. Quality Metric (AI moderation accuracy indicators reported by some platforms)

    2. proactive_flag: Whether the platform identified and enforced before user reports:
       a. Proactive = Found via automated detection or internal review systems
       b. Unknown = Platform did not specify proactive vs reactive

    Notes: 1. SSMI denotes Significant Social Media Intermediaries: social media intermediaries with over 50,00,000 (5 million) registered users in India. A social media intermediary primarily or solely enables online interaction between two or more users and allows them to create, upload, share, disseminate, modify or access information using its services.

    1. Facebook & Instagram (Meta) a. Content Actioned counts any enforcement, not only removals (e.g., removals, warning screens/covering, age gates, downranking). b. Proactive Rate = (items found & actioned proactively) ÷ (total content actioned).

    2. X/Twitter a. Child Sexual Exploitation and terrorism suspensions are largely proactive, flagged using proprietary tools and industry hash-sharing systems. b. Data reflects global enforcement, not only India.

    3. Google / YouTube a. Number of removal actions as a result of automated detection captures actions triggered by automated systems (ML + human-trained models).

    4. ShareChat a. Content Removed / Taken Down / UGC discard / Comments/Chatrooms deleted are standardised as Content Removed. b. Also includes rights-holder reporting workflow for copyright/IP and automated proactive monitoring for harmful content.

    5. WhatsApp a. Reports Proactively Banned Accounts, meaning accounts banned before any user reports.

    6. Koo a. Distinguishes between Content Removed, Content Actioned (flagged/downranked), and Account Banned. b. Automation Correct/Wrong reflect AI moderation accuracy, not enforcement outcomes.
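
    The standardisation and the Proactive Rate formula described in the notes can be sketched as follows. This is a minimal illustration, not the publisher's actual pipeline: the mapping covers only the platform terms named in the notes, treating "Proactively Banned Accounts" as Account Banned is an assumption, and `standardise` and `proactive_rate` are hypothetical helper names.

```python
# Assumption: this mapping covers only the platform terms named in the notes
# above; real transparency reports may use other wording.
STANDARD_TYPE = {
    "Content Removed": "Content Removed",
    "Taken Down": "Content Removed",
    "UGC discard": "Content Removed",
    "Comments/Chatrooms deleted": "Content Removed",
    "Content Actioned": "Content Actioned",
    "Account Banned": "Account Banned",
    "Proactively Banned Accounts": "Account Banned",  # assumed equivalence
}

def standardise(label):
    """Map a platform-specific label to an enforcement_type, or None if unknown."""
    return STANDARD_TYPE.get(label.strip())

def proactive_rate(proactive_items, total_actioned):
    """Proactive Rate = items found and actioned proactively / total content actioned."""
    if total_actioned == 0:
        return None  # the rate is undefined when nothing was actioned
    return proactive_items / total_actioned

print(standardise("UGC discard"))
print(proactive_rate(950, 1000))
```

    Returning None for unknown labels and for a zero denominator keeps gaps visible instead of silently counting them as zero.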

  6. YouTube Usage

    • searchlogistics.com
    Updated Apr 1, 2025
    Cite
    (2025). YouTube Usage [Dataset]. https://www.searchlogistics.com/learn/statistics/social-media-user-statistics/
    Dataset updated
    Apr 1, 2025
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    YouTube gets an average of 14.3 billion total worldwide visits every month.

  7. Impact of social media on suicide rates

    • kaggle.com
    zip
    Updated Oct 21, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Aadya Singh (2024). Impact of social media on suicide rates [Dataset]. https://www.kaggle.com/datasets/aadyasingh55/impact-of-social-media-on-suicide-rates
    Available download formats
    zip (811 bytes)
    Dataset updated
    Oct 21, 2024
    Authors
    Aadya Singh
    License

    Attribution-NonCommercial-ShareAlike 3.0 (CC BY-NC-SA 3.0): https://creativecommons.org/licenses/by-nc-sa/3.0/
    License information was derived automatically

    Description

    Impact of Social Media on Suicide Rates: Produced Results

    Overview

    This dataset explores the impact of social media usage on suicide rates, presenting an analysis based on social media platform data and WHO suicide rate statistics. It is an insightful resource for researchers, data scientists, and analysts looking to understand the correlation between increased social media activity and suicide rates across different regions and demographics.

    Content

    The dataset includes the following key sources:

    • WHO Suicide Rate Data (SDGSUICIDE): Retrieved from WHO data export, which tracks global suicide rates.

    • Social Media Usage Data: Information from major social media platforms, sourced from Kaggle, supplemented with data from:

    Facebook: Statista

    Twitter: Twitter Investor Relations

    Instagram: Facebook Investor Relations

    Acknowledgements

    We would like to acknowledge:

    • World Health Organization (WHO): For providing global suicide rate data, accessible under their data policy (WHO Data Policy).

    • Kaggle Dataset Contributors: For social media usage data that played a crucial role in the analysis.

    Usage

    This dataset is useful for studying the potential social factors contributing to suicide rates, especially the role of social media. Analysts can explore correlations using time-series analysis, regression models, or other statistical tools to derive meaningful insights. Please ensure compliance with the Creative Commons Attribution Non-Commercial Share Alike 4.0 International License (CC BY-NC-SA 4.0).

    Data Files

    Impact-of-social-media-on-suicide-rates-results-1.1.0.zip (90.9 kB): Contains processed results and supplementary data.

    Citations

    If you use this dataset in your work, please cite:

    Martin Winkler. (2021). Impact of social media on suicide rates: produced results (1.1.0) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.4701587 https://zenodo.org/records/4701587

    License

    This dataset is released under the Creative Commons Attribution Non-Commercial Share Alike 4.0 International (CC BY-NC-SA 4.0) license. You are free to share and adapt the material, provided proper attribution is given, it's not used for commercial purposes, and any derivatives are distributed under the same license.

    Columns

    • Year: The year of the recorded data.

    • Sex: Demographic indicator (e.g., male, female).

    • Suicide Rate % Change Since 2010: Percentage change in suicide rates compared to the year 2010.

    • Twitter User Count % Change Since 2010: Percentage change in Twitter user counts compared to the year 2010.

    • Facebook User Count % Change Since 2010: Percentage change in Facebook user counts compared to the year 2010.
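
    The three "% Change Since 2010" columns are relative changes against a fixed base year. The description does not spell out the formula, so the sketch below assumes the standard definition, (value - base) / base × 100; the function name and the sample figures are invented for illustration.

```python
def pct_change_since_base(series, base_year=2010):
    """Return {year: percent change relative to base_year} for a {year: value} dict."""
    base = series[base_year]
    if base == 0:
        raise ValueError("base-year value must be non-zero")
    return {year: (value - base) / base * 100 for year, value in series.items()}

# Hypothetical user counts in millions, invented for illustration only.
users_millions = {2010: 50.0, 2011: 75.0, 2012: 100.0}
print(pct_change_since_base(users_millions))
```

    By construction the base year itself always maps to 0, which is a quick sanity check when comparing against the published columns.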

    Data Bins

    The dataset includes categorized data ranges, allowing for analysis of trends within specified intervals. For example, ranges for suicide rates, Twitter user counts, and Facebook user counts are represented in bins for better granularity.

    Count Summary

    The dataset summarizes counts for various intervals, enabling researchers to identify trends and patterns over time, highlighting periods of significant change or stability in both suicide rates and social media usage.

    Use Cases

    This dataset can be used for:

    • Statistical analysis to understand correlations between social media usage and mental health outcomes.

    • Academic research focused on public health, psychology, or sociology.

    • Policy-making discussions aimed at addressing mental health concerns linked to social media.

    Cautions

    The dataset contains sensitive information regarding suicide rates. Users should handle this data with care and sensitivity, considering ethical implications when presenting findings.

  8. COVID-19 Social Media Counts & Sentiment

    • covid-hub.gio.georgia.gov
    Updated Apr 6, 2020
    Cite
    foustl32 (2020). COVID-19 Social Media Counts & Sentiment [Dataset]. https://covid-hub.gio.georgia.gov/datasets/feb6280d42de4e91b47cf37344a91eae
    Dataset updated
    Apr 6, 2020
    Dataset authored and provided by
    foustl32
    Description

    Update: As of August 26th, 2020 we are sunsetting updates to this free dataset. Please reach out to lyden@spatial.ai if you have interest in this data, Geosocial data, or other related datasets.

    As part of an effort to provide open source resources and data related to the COVID-19 outbreak, this feature layer includes counts of social media posts that mention COVID-19, aggregated at the county level. This data is provided historically week over week as far back as January 26th, 2020. This feature service will be refreshed regularly to remain up to date. It was most recently updated using data collected through August 24th.

    Data also includes information about the sentiment of posts collected. Posts are classified as negative, neutral, or positive and aggregated at a county level per week. To perform sentiment analysis, the VADER (Valence Aware Dictionary and sEntiment Reasoner) model was used.

    This feature service was developed in collaboration between Datastory & Spatial.ai. There's a powerful story hidden in your data... Datastory can help you see it. Visit www.datastoryconsulting.com to learn more.

    Social media counts and statistics come from Twitter data collected by Spatial.ai for the creation of Geosocial data, which uses machine learning to create geographic social media segmentation. Learn more about the underlying data at https://spatial.ai/esri or reach out to lyden@spatial.ai for more information.
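
    The classify-then-aggregate step described above can be sketched as follows. The description does not say which cut-offs were applied, so this assumes the ±0.05 compound-score thresholds conventionally recommended for VADER, and it starts from posts that already carry a compound score instead of calling the VADER library. The county names, dates, scores, and function names are invented.

```python
from collections import Counter, defaultdict

def label(compound, threshold=0.05):
    """Classify a VADER compound score using the conventional +/-0.05 cut-offs."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"

def weekly_county_counts(posts):
    """Aggregate (county, week, compound) tuples into per-county, per-week label counts."""
    counts = defaultdict(Counter)
    for county, week, compound in posts:
        counts[(county, week)][label(compound)] += 1
    return counts

# Hypothetical scored posts: (county, week start, compound score).
posts = [
    ("Fulton", "2020-03-01", 0.62),
    ("Fulton", "2020-03-01", -0.40),
    ("Fulton", "2020-03-01", 0.01),
    ("Cobb", "2020-03-01", -0.75),
]
result = weekly_county_counts(posts)
print(dict(result[("Fulton", "2020-03-01")]))
```

    Keying the counts by (county, week) mirrors the per-county, per-week aggregation the description mentions.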

  9. Twitter Users

    • searchlogistics.com
    Updated Apr 1, 2025
    Cite
    (2025). Twitter Users [Dataset]. https://www.searchlogistics.com/learn/statistics/social-media-user-statistics/
    Dataset updated
    Apr 1, 2025
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The average Twitter user spends 5.1 hours per month on the platform.

  10. TikTok Users

    • searchlogistics.com
    Updated Apr 1, 2025
    Cite
    (2025). TikTok Users [Dataset]. https://www.searchlogistics.com/learn/statistics/social-media-user-statistics/
    Dataset updated
    Apr 1, 2025
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Users spend an average of 19.6 hours per month on TikTok alone. This works out to be approximately 39 minutes per day.
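
    The hours-per-month to minutes-per-day conversion can be checked with a one-line calculation. The sketch assumes a 30-day month, which the source does not state; `minutes_per_day` is a hypothetical helper name.

```python
def minutes_per_day(hours_per_month, days_per_month=30):
    """Convert a monthly total in hours to an average number of minutes per day."""
    return hours_per_month * 60 / days_per_month

# 19.6 hours per month is the figure quoted above; a 30-day month is assumed.
print(round(minutes_per_day(19.6), 1))
```

    Under that assumption the result rounds to the "approximately 39 minutes" quoted in the description.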

  11. Truth Social vs Other Social Media Platforms

    • searchlogistics.com
    Updated Apr 24, 2023
    Cite
    (2023). Truth Social vs Other Social Media Platforms [Dataset]. https://www.searchlogistics.com/learn/statistics/truth-social-statistics/
    Dataset updated
    Apr 24, 2023
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    How does Truth Social compare to other social media platforms? There are around 2 million active Truth Social users.

  12. Social Media Sponsorship & Engagement Dataset

    • kaggle.com
    zip
    Updated May 28, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    OmenKj (2025). Social Media Sponsorship & Engagement Dataset [Dataset]. https://www.kaggle.com/datasets/omenkj/social-media-sponsorship-and-engagement-dataset/data
    Available download formats
    zip (8047768 bytes)
    Dataset updated
    May 28, 2025
    Authors
    OmenKj
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    This social media content dataset simulates realistic influencer posts across multiple popular platforms, reflecting diverse content types, sponsorship details, audience demographics, and engagement metrics. The dataset contains over 52,000 rows representing individual content posts generated over the past two years. It includes a balanced distribution of sponsored and non-sponsored content, with detailed disclosure information to support transparency studies and analyses. The variety of platforms, languages, content categories, and audience demographics makes this dataset ideal for exploring influencer marketing dynamics, content performance analytics, disclosure practices, and audience segmentation in social media research.

    Dataset Features

    id: Unique identifier for each content post (starting from 1).

    platform: The social media platform where the content was posted. Values: YouTube, TikTok, Instagram, Bilibili, RedNote.

    content_id: Unique ID for each content piece (e.g., content_0, content_1, …).

    creator_id: Unique identifier for the content creator, cycling through 5000 distinct creators.

    creator_name: Username of the content creator.

    content_url: URL pointing to the content.

    content_type: Format of the content. Values: video, image, text, mixed.

    content_category: The main theme or niche of the content. Values: beauty, lifestyle, tech.

    post_date: Timestamp of the post, randomly distributed over the past two years.

    language: Language of the content, with probabilities favoring English. Values: English, Chinese, Spanish, Hindi, Japanese.

    content_length: Length of the content in seconds (for video) or word count (for text), varying by content type.

    content_description: Textual description or caption of the content.

    hashtags: A comma-separated string of hashtags used in the post (0 to 5 tags).

    views: Number of views (simulated via a Poisson distribution).

    likes: Number of likes received.

    shares: Number of shares.

    comments_count: Count of comments on the post.

    comments_text: Aggregated text of comments (0 to 5 comments concatenated).

    follower_count: Number of followers the creator had at the time of posting.

    is_sponsored: Boolean indicating whether the post is sponsored.

    disclosure_type: Disclosure type regarding sponsorship for sponsored posts. Values: explicit, implicit, none (non-sponsored always 'none').

    sponsor_name: Name of the sponsoring company if sponsored, else 'Not sponsors'.

    sponsor_category: Sponsorship industry category. Values: cosmetics, electronics, fashion, food, gaming, travel or 'Not sponsors'.

    disclosure_location: Where sponsorship disclosure appears in the post. Values: video, caption, hashtags, none (non-sponsored always 'none').

    audience_age_distribution: Predominant age group of the audience. Values: 13-18, 19-25, 26-35, 36-50, 50+.

    audience_gender_distribution: Predominant gender of the audience. Values: male, female, non-binary, unknown.

    audience_location: Primary geographic location of the audience. Values: USA, China, India, Japan, Brazil, Germany, UK, Russia.
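
    The value lists and the "non-sponsored always 'none'" rule above can be turned into a small row check. This is a sketch under assumptions: it covers only some of the listed fields, it expects `is_sponsored` to be already parsed into a Python bool, and `validate` is a hypothetical helper name. The sample rows are invented.

```python
# Allowed values copied from the feature list above (a subset of the fields).
ALLOWED = {
    "platform": {"YouTube", "TikTok", "Instagram", "Bilibili", "RedNote"},
    "content_type": {"video", "image", "text", "mixed"},
    "content_category": {"beauty", "lifestyle", "tech"},
    "disclosure_type": {"explicit", "implicit", "none"},
    "disclosure_location": {"video", "caption", "hashtags", "none"},
}

def validate(row):
    """Return a list of problems found in one row (an empty list means it passes)."""
    problems = [
        f"{field}: unexpected value {row.get(field)!r}"
        for field, allowed in ALLOWED.items()
        if row.get(field) not in allowed
    ]
    # Non-sponsored posts should always carry 'none' for both disclosure fields.
    if not row.get("is_sponsored"):
        for field in ("disclosure_type", "disclosure_location"):
            if row.get(field) != "none":
                problems.append(f"{field}: must be 'none' when is_sponsored is false")
    return problems

# Hypothetical rows, invented for illustration.
good_row = {
    "platform": "TikTok", "content_type": "video", "content_category": "tech",
    "is_sponsored": False, "disclosure_type": "none", "disclosure_location": "none",
}
bad_row = dict(good_row, disclosure_type="explicit")
print(validate(good_row))
print(validate(bad_row))
```

    Collecting problems in a list, instead of raising on the first one, makes it easier to profile a whole file in one pass.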

  13. Social Media PII Disclosure Analyses

    • kaggle.com
    zip
    Updated Jul 30, 2024
    Cite
    Eidan Rosado (2024). Social Media PII Disclosure Analyses [Dataset]. https://www.kaggle.com/datasets/edyvision/social-media-pii-disclosure-analyses
    Available download formats
    zip (29813203 bytes)
    Dataset updated
    Jul 30, 2024
    Authors
    Eidan Rosado
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Privacy vs. Social Capital: Social Media PII Disclosure Analyses

    This data was collected and analyzed for a study of PII disclosures in social media conversations, with special attention to influencer characteristics in the interactions. The study is reported in the dissertation titled Privacy vs. Social Capital: Examining Information Disclosure Patterns within Social Media Influencer Networks and the research paper titled Unveiling Influencer-Driven Personal Data Sharing in Social Media Discourse.

    Each study phase is different, with X (Twitter) data used in the pilot analysis and Reddit data used in the main study. Both folders will have the analyzed_posts and cluster summary csv files broken down by collection (either based on trend or collection date).

    Note: Raw data is not made available in these datasets due to the nature of the study and to protect the original authors.

    Notable Data Elements

    Post Data

    Column name | Type | Description
    Node ID | UUID | Unique identifier for post (replaces original platform identifier)
    User ID | UUID | Unique identifier assigned for user (replaces original platform identifier)
    Cluster Name | Str | Composite ID for subgraph using collection name and subgraph index
    Influence Power | Float | Eigenvector centrality
    Influencer Tier | Str | Categorical label calculated by follower count
    Collection Name | Str | Trend collection assigned based on search query
    Hashtags | Set(str) | The set of hashtags included in the node
    PII Disclosed | Bool | Whether or not PII was disclosed
    PII Detected | Set(str) | The detected token types in post
    PII Risk Score | Float | The PII score for all tokens in a post
    Is Comment | Bool | Whether or not the post is a comment or reply
    Is Text Starter | Bool | Whether or not the post has text content
    Community | Str | The group, community, channel, etc. associated with
    Timestamp | Timestamp | Creation timestamp (provided by social media API)
    Time Elapsed | Int | Time elapsed (seconds) from original influencer’s post

    Cluster Data

    Column name                   Type        Description
    Cluster Name                  Str         Composite ID for subgraph using collection name and subgraph index
    Influencer Tiers Frequencies  List[dict]  Frequency of influencer tiers of all users in the cluster
    Top Influence Power Score     Float       Eigenvector centrality of top influencer
    Top Influencer Tier           Str         Size tier of top influencer
    Collection Name               Str         Trend collection assigned based on search query
    Hashtags                      Set(str)    The set of hashtags included in the cluster
    PII Detection Frequencies     List[dict]  The detected token types in post with frequencies
    Node Count                    Int         Count of all nodes in the influencer cluster
    Node Disclosures              Int         Count of all nodes with mean_risk_score > 1*
    Disclosure Ratio              Float       Sum of nodes with confirmed disclosed PII divided by overall cluster size (count of nodes in the cluster)
    Mean Risk Score               Float       The mean risk score for an entire network cluster
    Median Risk Score             Float       The median risk score for an entire network cluster
    Min Risk Score                Float       The min risk score for an entire network cluster
    Max Risk Score                Float       The max risk score for an entire network cluster
    Time Span                     Float       Total Time Elapsed
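
    The cluster-level columns can be read as aggregates over the post-level rows. The following is a minimal sketch of that relationship, not the authors' pipeline. The dictionary keys (pii_disclosed, pii_risk_score) are hypothetical snake_case stand-ins for the PII Disclosed and PII Risk Score columns, and counting disclosures from the PII Disclosed flag is an assumption made for illustration.

```python
from statistics import mean, median


def summarize_cluster(posts):
    """Aggregate post-level rows into one cluster-level summary.

    `posts` is a list of dicts with the hypothetical keys
    "pii_disclosed" (bool) and "pii_risk_score" (float).
    """
    if not posts:
        raise ValueError("a cluster needs at least one post")

    scores = [p["pii_risk_score"] for p in posts]
    # Assumption: a node counts as a disclosure when its flag is True.
    disclosures = sum(1 for p in posts if p["pii_disclosed"])

    return {
        "node_count": len(posts),
        "node_disclosures": disclosures,
        # Disclosing nodes divided by overall cluster size.
        "disclosure_ratio": disclosures / len(posts),
        "mean_risk_score": mean(scores),
        "median_risk_score": median(scores),
        "min_risk_score": min(scores),
        "max_risk_score": max(scores),
    }


# Made-up rows, for illustration only.
example_posts = [
    {"pii_disclosed": False, "pii_risk_score": 0.0},
    {"pii_disclosed": True, "pii_risk_score": 2.0},
    {"pii_disclosed": False, "pii_risk_score": 0.5},
    {"pii_disclosed": True, "pii_risk_score": 1.5},
]
example_summary = summarize_cluster(example_posts)
print(example_summary)
```

    In practice the rows would first be grouped by Cluster Name, with one call per group.
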
  14. Snapchat Users

    • searchlogistics.com
    Updated Apr 1, 2025
    Cite
    (2025). Snapchat Users [Dataset]. https://www.searchlogistics.com/learn/statistics/social-media-user-statistics/
    Dataset updated
    Apr 1, 2025
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Snapchat now boasts over 319 million daily active users. That means it’s one of the most engaging platforms. Snapchat currently has a total user base of 800 million.

  15. Social Media Engagement: A Comprehensive Analysis

    • kaggle.com
    zip
    Updated Jul 12, 2023
    Cite
    Mehmet ISIK (2023). Social Media Engagement: A Comprehensive Analysis [Dataset]. https://www.kaggle.com/datasets/mehmetisik/livedataset
    Explore at:
    zip (101676 bytes), available download formats
    Dataset updated
    Jul 12, 2023
    Authors
    Mehmet ISIK
    Description

    Dataset Overview

    This comprehensive dataset offers an in-depth analysis of social media engagements across various platforms. It captures the dynamics of user interactions by tracking the number of reactions, comments, shares, and types of posts. Ideal for social media analysts, marketers, and researchers, this dataset serves as a critical tool for understanding digital communication trends and enhancing social media strategies. Each entry provides detailed metrics on how posts are received by audiences, enabling data-driven insights into content performance.

    Key Features:

    📌 num_reactions: Total number of reactions a post receives, encapsulating the overall engagement.
    📍 num_comments: Reflects the level of audience interaction through comments.
    📸 num_shares: Indicates the virality of the post by counting how many times it has been shared.
    ❤️ num_likes: Tracks the number of likes, showing general approval of the content.
    🥰 num_loves: Captures more intense affection reactions to posts.
    😮 num_wows: Measures the surprise or awe factor of the post.
    😂 num_hahas: Counts instances of amusement or laughter triggered by the post.
    😢 num_sads: Reflects the number of sad reactions, indicating emotional impact.
    😡 num_angrys: Tracks angry reactions, highlighting content that might be controversial or upsetting.
    🔗 status_type_link: Binary indicator of whether the post includes a link, enhancing its informational value.
    🖼️ status_type_photo: Identifies posts with photos, crucial for visual content analysis.
    📝 status_type_status: Marks textual posts, focusing on written content engagement.
    🎥 status_type_video: Distinguishes posts with videos, important for engagement in dynamic content.

    This dataset not only aids in measuring the effectiveness of social media campaigns but also supports the development of targeted marketing strategies and content optimization efforts to maximize audience engagement.
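
    As an illustration of how the binary status_type_* indicators can be combined with the count columns, the sketch below collapses the four indicators into one label and averages num_shares per post type. It assumes the indicators are stored as the integers 0 and 1 and that exactly one is set per post; the helper names and sample rows are hypothetical and are not taken from the dataset.

```python
from collections import defaultdict

STATUS_COLUMNS = (
    "status_type_link",
    "status_type_photo",
    "status_type_status",
    "status_type_video",
)
PREFIX = "status_type_"


def status_type(row):
    """Return "link", "photo", "status" or "video" for a row.

    Returns None when zero or several indicators are set, so that
    ambiguous rows can be skipped instead of being mislabeled.
    """
    active = [c[len(PREFIX):] for c in STATUS_COLUMNS if row.get(c) == 1]
    return active[0] if len(active) == 1 else None


def mean_shares_by_type(rows):
    """Average num_shares for each post type found in `rows`."""
    totals = defaultdict(lambda: [0, 0])  # type -> [sum of shares, count]
    for row in rows:
        kind = status_type(row)
        if kind is None:
            continue
        totals[kind][0] += row["num_shares"]
        totals[kind][1] += 1
    return {kind: total / count for kind, (total, count) in totals.items()}


# Made-up rows, for illustration only.
example_rows = [
    {"status_type_photo": 1, "status_type_video": 0, "num_shares": 10},
    {"status_type_photo": 1, "status_type_video": 0, "num_shares": 20},
    {"status_type_photo": 0, "status_type_video": 1, "num_shares": 30},
]
example_means = mean_shares_by_type(example_rows)
print(example_means)
```

    Rows read with a CSV reader arrive as strings, so the indicator and count columns would need converting to integers before calling these helpers.
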

  16. Graph-Based Social Media Data on Mental Health Topics

    • data.mendeley.com
    Updated Nov 4, 2024
    Cite
    Samuel Ady Sanjaya (2024). Graph-Based Social Media Data on Mental Health Topics [Dataset]. http://doi.org/10.17632/z45txpdp7f.2
    Dataset updated
    Nov 4, 2024
    Authors
    Samuel Ady Sanjaya
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is structured as a graph, where nodes represent users and edges capture their interactions, including tweets, retweets, replies, and mentions. Each node provides detailed user attributes, such as unique ID, follower and following counts, and verification status, offering insights into each user's identity, role, and influence in the mental health discourse. The edges illustrate user interactions, highlighting engagement patterns and types of content that drive responses, such as tweet impressions. This interconnected structure enables sentiment analysis and public reaction studies, allowing researchers to explore engagement trends and identify the mental health topics that resonate most with users.

    The dataset consists of three files:

    1. Edges Data: Contains graph data essential for social network analysis, including fields for UserID (Source), UserID (Destination), Post/Tweet ID, and Date of Relationship. This file enables analysis of user connections without including tweet content, maintaining compliance with Twitter/X’s data-sharing policies.
    2. Nodes Data: Offers user-specific details relevant to network analysis, including UserID, Account Creation Date, Follower and Following counts, Verified Status, and Date Joined Twitter. This file allows researchers to examine user behavior (e.g., identifying influential users or spam-like accounts) without direct reference to tweet content.
    3. Twitter/X Content Data: This file contains only the raw tweet text as a single-column dataset, without associated user identifiers or metadata. By isolating the text, we ensure alignment with anonymization standards observed in similar published datasets, safeguarding user privacy in compliance with Twitter/X's data guidelines. This content is crucial for addressing the research focus on mental health discourse in social media. (References to prior Data in Brief publications involving Twitter/X data informed the dataset's structure.)
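
    One simple way to look for influential users in the edges file is to count how many interactions point at each user. The sketch below assumes each edge has already been reduced to a (source, destination) pair of user IDs; the function name and the sample IDs are hypothetical, and in-degree is only one of several possible influence measures.

```python
from collections import Counter


def top_targets(edges, n=3):
    """Return the n users that receive the most interactions.

    `edges` is an iterable of (source_user_id, destination_user_id)
    pairs. The result is a list of (user_id, in_degree) tuples,
    highest in-degree first.
    """
    in_degree = Counter(destination for _source, destination in edges)
    return in_degree.most_common(n)


# Made-up edges, for illustration only.
example_edges = [
    ("u1", "u3"),
    ("u2", "u3"),
    ("u3", "u1"),
]
example_top = top_targets(example_edges, n=1)
print(example_top)  # [('u3', 2)]
```

    The resulting user IDs can then be looked up in the nodes file to compare in-degree with follower counts or verification status.
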

  17. Snapchat Demographics

    • searchlogistics.com
    Updated Apr 1, 2025
    Cite
    (2025). Snapchat Demographics [Dataset]. https://www.searchlogistics.com/learn/statistics/social-media-user-statistics/
    Dataset updated
    Apr 1, 2025
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Snapchat reaches 75% of the millennial and Gen Z audience.

  18. Instagram Dataset

    • brightdata.com
    .json, .csv, .xlsx
    Updated Apr 26, 2022
    Cite
    Bright Data (2022). Instagram Dataset [Dataset]. https://brightdata.com/products/datasets/instagram
    Explore at:
    .json, .csv, .xlsx (available download formats)
    Dataset updated
    Apr 26, 2022
    Dataset authored and provided by
    Bright Data (https://brightdata.com/)
    License

    https://brightdata.com/license

    Area covered
    Worldwide
    Description

    Access detailed insights with our Instagram datasets, featuring follower counts, verified status, account types, and engagement scores. Explore post information including URLs, descriptions, hashtags, comments, likes, media, posting dates, locations, and reel URLs. Perfect for understanding user engagement and content trends to drive informed decisions and optimize your social media strategies.

    Over 750M records available
    Price starts at $250/100K records
    Data formats are available in JSON, NDJSON, CSV, XLSX and Parquet.
    100% ethical and compliant data collection

    Included datapoints: Account, Fbid, Id, Followers, Posts Count, Is Business Account, Is Professional Account, Is Verified, Avg Engagement, External Url, Biography, Business Category Name, Category Name, Post Hashtags, Following, Posts, Profile Image Link, Profile URL, Profile Name, Highlights Count, Highlights, Full Name, Is Private, Bio Hashtags, URL, Is Joined Recently, and much more.

  19. Twitter Sentiment Analysis Datasets

    • brightdata.com
    .json, .csv, .xlsx
    Cite
    Bright Data, Twitter Sentiment Analysis Datasets [Dataset]. https://brightdata.com/products/datasets/twitter/sentiment-analysis
    Explore at:
    .json, .csv, .xlsx (available download formats)
    Dataset authored and provided by
    Bright Data (https://brightdata.com/)
    License

    https://brightdata.com/license

    Area covered
    Worldwide
    Description

    Our Twitter Sentiment Analysis Dataset provides a comprehensive collection of tweets, enabling businesses, researchers, and analysts to assess public sentiment, track trends, and monitor brand perception in real time. This dataset includes detailed metadata for each tweet, allowing for in-depth analysis of user engagement, sentiment trends, and social media impact.

    Key Features:
    
      Tweet Content & Metadata: Includes tweet text, hashtags, mentions, media attachments, and engagement metrics such as likes, retweets, and replies.
      Sentiment Classification: Analyze sentiment polarity (positive, negative, neutral) to gauge public opinion on brands, events, and trending topics.
      Author & User Insights: Access user details such as username, profile information, follower count, and account verification status.
      Hashtag & Topic Tracking: Identify trending hashtags and keywords to monitor conversations and sentiment shifts over time.
      Engagement Metrics: Measure tweet performance based on likes, shares, and comments to evaluate audience interaction.
      Historical & Real-Time Data: Choose from historical datasets for trend analysis or real-time data for up-to-date sentiment tracking.
    
    
    Use Cases:
    
      Brand Monitoring & Reputation Management: Track public sentiment around brands, products, and services to manage reputation and customer perception.
      Market Research & Consumer Insights: Analyze consumer opinions on industry trends, competitor performance, and emerging market opportunities.
      Political & Social Sentiment Analysis: Evaluate public opinion on political events, social movements, and global issues.
      AI & Machine Learning Applications: Train sentiment analysis models for natural language processing (NLP) and predictive analytics.
      Advertising & Campaign Performance: Measure the effectiveness of marketing campaigns by analyzing audience engagement and sentiment.
    
    
    
      Our dataset is available in multiple formats (JSON, CSV, Excel) and can be delivered via API, cloud storage (AWS, Google Cloud, Azure), or direct download. 
      Gain valuable insights into social media sentiment and enhance your decision-making with high-quality, structured Twitter data.
    
  20. Top 100+ Social Media Platforms/Sites (2025)

    • kaggle.com
    zip
    Updated Jan 12, 2025
    Cite
    Taimoor Khurshid Chughtai (2025). Top 100+ Social Media Platforms/Sites (2025) [Dataset]. https://www.kaggle.com/datasets/taimoor888/top-100-social-media-platformssites-2025
    Explore at:
    zip (2761 bytes), available download formats
    Dataset updated
    Jan 12, 2025
    Authors
    Taimoor Khurshid Chughtai
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset provides detailed rankings and key metrics for 100+ social media platforms and sites in 2025. It includes information such as user base, popularity trends, and global reach. Ideal for analyzing social media growth, user engagement, and market trends. Whether you're a data scientist, marketer, or researcher, this dataset offers valuable insights into the evolving digital landscape.

Cite
Bright Data (2022). Social Media Datasets [Dataset]. https://brightdata.com/products/datasets/social-media

Social Media Datasets

Explore at:
.json, .csv, .xlsx (available download formats)
Dataset updated
Sep 7, 2022
Dataset authored and provided by
Bright Data (https://brightdata.com/)
License

https://brightdata.com/license

Area covered
Worldwide
Description

Gain valuable insights with our comprehensive Social Media Dataset, designed to help businesses, marketers, and analysts track trends, monitor engagement, and optimize strategies. This dataset provides structured and reliable social media data from multiple platforms.

Dataset Features

User Profiles: Access public social media profiles, including usernames, bios, follower counts, engagement metrics, and more. Ideal for audience analysis, influencer marketing, and competitive research.
Posts & Content: Extract posts, captions, hashtags, media (images/videos), timestamps, and engagement metrics such as likes, shares, and comments. Useful for trend analysis, sentiment tracking, and content strategy optimization.
Comments & Interactions: Analyze user interactions, including replies, mentions, and discussions. This data helps brands understand audience sentiment and engagement patterns.
Hashtag & Trend Tracking: Monitor trending hashtags, topics, and viral content across platforms to stay ahead of industry trends and consumer interests.

Customizable Subsets for Specific Needs Our Social Media Dataset is fully customizable, allowing you to filter data based on platform, region, keywords, engagement levels, or specific user profiles. Whether you need a broad dataset for market research or a focused subset for brand monitoring, we tailor the dataset to your needs.

Popular Use Cases

Brand Monitoring & Reputation Management: Track brand mentions, customer feedback, and sentiment analysis to manage online reputation effectively.
Influencer Marketing & Audience Analysis: Identify key influencers, analyze engagement metrics, and optimize influencer partnerships.
Competitive Intelligence: Monitor competitor activity, content performance, and audience engagement to refine marketing strategies.
Market Research & Consumer Insights: Analyze social media trends, customer preferences, and emerging topics to inform business decisions.
AI & Predictive Analytics: Leverage structured social media data for AI-driven trend forecasting, sentiment analysis, and automated content recommendations.

Whether you're tracking brand sentiment, analyzing audience engagement, or monitoring industry trends, our Social Media Dataset provides the structured data you need. Get started today and customize your dataset to fit your business objectives.
