Market leader Facebook was the first social network to surpass one billion registered accounts and currently sits at more than three billion monthly active users. Meta Platforms owns four of the biggest social media platforms, each with more than one billion monthly active users: Facebook (core platform), WhatsApp, Messenger, and Instagram. In the third quarter of 2023, Facebook reported around four billion monthly core Family product users.
The United States and China account for the most high-profile social platforms
Most top-ranked social networks with more than 100 million users originated in the United States, but services like Chinese social networks WeChat, QQ, or video-sharing app Douyin have also garnered mainstream appeal in their respective regions due to local context and content. Douyin’s popularity has led to the platform releasing an international version of its network, TikTok.
How many people use social media?
The leading social networks are usually available in multiple languages and enable users to connect with friends or people across geographical, political, or economic borders. In 2025, social networking sites are estimated to reach 5.44 billion users, and these figures are still expected to grow as mobile device usage and mobile social networks increasingly gain traction in previously underserved markets.

Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
*****Documentation Process*****
1. Data Preparation:
- Upload the data into Power Query to assess quality and identify duplicate values, if any.
- Verify data quality and types for each column, correcting any misspellings or inconsistencies.
2. Data Management:
- Duplicate the original data sheet for future reference and label the new sheet "Working File" to preserve the integrity of the original dataset.
3. Understanding Metrics:
- Clarify the meaning of column headers, particularly the distinction between Impressions and Reach, and understand how Engagement Rate is calculated.
- Engagement Rate formula: total likes, comments, and shares divided by Reach.
4. Data Integrity Assurance:
- Recognize that Impressions should outnumber Reach, since Impressions count total views while Reach counts the unique audience.
- Investigate discrepancies between Reach and Impressions, and identify and resolve their root causes so that reporting and analysis are accurate.
5. Data Correction:
- Work with the relevant team to understand the root cause of discrepancies between Impressions and Reach.
- Identify instances where Reach surpasses Impressions, potentially attributable to data transformation errors.
- After the rectification process, update the dataset to reflect the corrected Impressions and Reach values.
- Recalculate the Engagement Rate after the correction so that the analysis rests on the corrected values.
6. Data Enhancement:
- Categorize Audience Age into three groups, "Senior Adults" (45+ years), "Mature Adults" (31-45 years), and "Adolescent Adults" (<30 years), in a new column named "Age Group."
- Split date and time into separate columns using the text-to-columns option for improved analysis.
7. Temporal Analysis:
- Introduce a new column for "Weekend and Weekday," renamed as "Weekday Type," to discern patterns and trends in engagement.
- Define time periods by categorizing into "Morning," "Afternoon," "Evening," and "Night" based on time intervals.
8. Sentiment Analysis:
- Populate blank cells in the Sentiment column with "Mixed Sentiment," denoting content containing both positive and negative sentiments or ambiguity.
9. Geographical Analysis:
- Group countries and obtain additional continent data from an online source (e.g., https://statisticstimes.com/geography/countries-by-continents.php).
- Add a new column for "Audience Continent" and use the XLOOKUP function to retrieve the corresponding continent data.
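The enrichment steps above were carried out in Power Query and Excel. As a rough illustration only, the same logic could be sketched in Python. The column names "Post Timestamp", "Audience Location", "Likes", "Comments", "Shares", and "Time Period" are hypothetical, as are the hour boundaries for the time periods, the handling of ages 30 and 45 (which the stated age bands leave ambiguous), and the tiny continent lookup table standing in for the online source.

```python
from datetime import datetime

# Hypothetical lookup table standing in for the continent data obtained online.
CONTINENT_BY_COUNTRY = {"Brazil": "South America", "Japan": "Asia", "Kenya": "Africa"}


def engagement_rate(likes, comments, shares, reach):
    """Total likes, comments, and shares divided by Reach."""
    if reach <= 0:
        return None  # the rate is undefined without a positive Reach
    return (likes + comments + shares) / reach


def age_group(age):
    """Three age bands; placing 30 and 45 in the lower band is an assumption."""
    if age > 45:
        return "Senior Adults"
    if age > 30:
        return "Mature Adults"
    return "Adolescent Adults"


def weekday_type(ts):
    """Saturday and Sunday count as the weekend."""
    return "Weekend" if ts.weekday() >= 5 else "Weekday"


def time_period(ts):
    """Assumed intervals: 05-12 Morning, 12-17 Afternoon, 17-21 Evening, else Night."""
    hour = ts.hour
    if 5 <= hour < 12:
        return "Morning"
    if 12 <= hour < 17:
        return "Afternoon"
    if 17 <= hour < 21:
        return "Evening"
    return "Night"


def enrich(row):
    """Return a copy of one record with the derived columns added."""
    ts = datetime.strptime(row["Post Timestamp"], "%Y-%m-%d %H:%M:%S")
    out = dict(row)
    out["Date"] = ts.date().isoformat()
    out["Time"] = ts.time().isoformat()
    out["Age Group"] = age_group(row["Audience Age"])
    out["Weekday Type"] = weekday_type(ts)
    out["Time Period"] = time_period(ts)
    # Blank sentiment cells are filled with "Mixed Sentiment".
    out["Sentiment"] = row.get("Sentiment") or "Mixed Sentiment"
    out["Audience Continent"] = CONTINENT_BY_COUNTRY.get(row["Audience Location"], "Unknown")
    # Flag rows that violate the expectation that Impressions outnumber Reach.
    out["Reach Exceeds Impressions"] = row["Reach"] > row["Impressions"]
    out["Engagement Rate"] = engagement_rate(
        row["Likes"], row["Comments"], row["Shares"], row["Reach"]
    )
    return out
```

Flagging inconsistent rows rather than overwriting them keeps the correction step with the team that owns the data, as the process describes.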
*****Drawing Conclusions and Providing a Summary*****

How many people use social media?
Social media usage is one of the most popular online activities. In 2025, over 5.4 billion people were estimated to be using social media worldwide, a number projected to increase to over 6.6 billion in 2030.
Who uses social media?
Social networking is one of the most popular digital activities worldwide, and it is no surprise that social networking penetration across all regions is constantly increasing. As of January 2023, the global social media usage rate stood at 59 percent. This figure is anticipated to grow as less developed digital markets catch up with other regions when it comes to infrastructure development and the availability of cheap mobile devices. In fact, most of social media’s global growth is driven by the increasing usage of mobile devices. The mobile-first market of Eastern Asia topped the global ranking of mobile social networking penetration, followed by established digital powerhouses such as the Americas and Northern Europe.
How much time do people spend on social media?
Social media is an integral part of daily internet usage. On average, internet users spend 151 minutes per day on social media and messaging apps, an increase of 40 minutes since 2015. Internet users in Latin America had the highest average time spent per day on social media.
What are the most popular social media platforms?
Market leader Facebook was the first social network to surpass one billion registered accounts and currently boasts approximately 2.9 billion monthly active users, making it the most popular social network worldwide. In June 2023, the top social media apps in the Apple App Store included mobile messaging apps WhatsApp and Telegram Messenger, as well as the ever-popular app version of Facebook.

Dataset Overview
This comprehensive dataset offers an in-depth analysis of social media engagements across various platforms. It captures the dynamics of user interactions by tracking the number of reactions, comments, shares, and types of posts. Ideal for social media analysts, marketers, and researchers, this dataset serves as a critical tool for understanding digital communication trends and enhancing social media strategies. Each entry provides detailed metrics on how posts are received by audiences, enabling data-driven insights into content performance.
Key Features:
📌 num_reactions: Total number of reactions a post receives, encapsulating the overall engagement.
📍 num_comments: Reflects the level of audience interaction through comments.
📸 num_shares: Indicates the virality of the post by counting how many times it has been shared.
❤️ num_likes: Tracks the number of likes, showing general approval of the content.
🥰 num_loves: Captures more intense affection reactions to posts.
😮 num_wows: Measures the surprise or awe factor of the post.
😂 num_hahas: Counts instances of amusement or laughter triggered by the post.
😢 num_sads: Reflects the number of sad reactions, indicating emotional impact.
😡 num_angrys: Tracks angry reactions, highlighting content that might be controversial or upsetting.
🔗 status_type_link: Binary indicator of whether the post includes a link, enhancing its informational value.
🖼️ status_type_photo: Identifies posts with photos, crucial for visual content analysis.
📝 status_type_status: Marks textual posts, focusing on written content engagement.
🎥 status_type_video: Distinguishes posts with videos, important for engagement in dynamic content.
This dataset not only aids in measuring the effectiveness of social media campaigns but also supports the development of targeted marketing strategies and content optimization efforts to maximize audience engagement.

https://creativecommons.org/publicdomain/zero/1.0/
This dataset provides detailed rankings and key metrics for 100+ social media platforms and sites in 2025. It includes information such as user base, popularity trends, and global reach. Ideal for analyzing social media growth, user engagement, and market trends. Whether you're a data scientist, marketer, or researcher, this dataset offers valuable insights into the evolving digital landscape.

Open Database License (ODbL) v1.0: https://www.opendatacommons.org/licenses/odbl/1.0/
License information was derived automatically
This dataset was originally collected for a data science and machine learning project that aimed at investigating the potential correlation between the amount of time an individual spends on social media and the impact it has on their mental health.
The project involves conducting a survey to collect data, organizing the data, and using machine learning techniques to create a predictive model that can determine whether a person should seek professional help based on their answers to the survey questions.
This project was completed as part of a Statistics course at a university, and the team is currently in the process of writing a report and completing a paper that summarizes and discusses the findings in relation to other research on the topic.
The following is the Google Colab link to the project, done on Jupyter Notebook -
https://colab.research.google.com/drive/1p7P6lL1QUw1TtyUD1odNR4M6TVJK7IYN
The following is the GitHub Repository of the project -
https://github.com/daerkns/social-media-and-mental-health
Libraries used for the Project -
Pandas
NumPy
Matplotlib
Seaborn
scikit-learn

https://creativecommons.org/publicdomain/zero/1.0/
A list of the most popular (top 100 by followers) Instagram, Twitter, YouTube, Twitch, and TikTok users. NB! For YouTube the followers are subscribers and the posts are videos.

https://www.ibisworld.com/about/termsofuse/
Over the five years through 2025-26, industry revenue is forecast to expand at a compound annual rate of 20.3% to reach £12.5 billion. Social media platforms are integral to people's lives, offering ways to communicate, create and view content and share information. According to Ofcom, approximately 89% of UK internet users in 2023 used social media apps or sites. Teenagers and young adults are the biggest users. Advertising is the primary revenue source for social media platforms, although subscription-based services are gaining momentum as platforms seek to diversify their incomes.
TikTok is the success story of the past five years, becoming the most downloaded app between 2020 and 2022, according to Apptopia. The short-form video platform has over 30 million monthly users in the UK in 2025. After Musk's takeover, X, formerly known as Twitter, adjusted its content moderation and allowed previously banned accounts to return. As a result, over 600 advertisers pulled their ads from the site because of fears that their brands might be associated with objectionable content. In response to falling ad revenue, X has introduced a subscription-based service which enables users to verify themselves and boosts the number of people who view their posts. Meta-owned Facebook and Instagram have responded by introducing a similar service. In 2025, more social media platforms are using AI to boost user engagement. This improves click-through rates and drives higher advertising revenue. Industry revenue is expected to grow by 6.3% in 2025-26.
Over the five years through 2030-31, social media platforms' revenue is projected to climb at an estimated compound annual rate of 9.2% to reach £19.4 billion. Regulations relating to how data is collected, stored, and shared will force advertisers and platforms to rethink how they can target their desired demographics. The tightening of regulations will raise industry compliance costs, weighing on profit margins.
Older age groups present a new revenue opportunity for social media platforms if they can bridge the gap between passive TV consumption and interactive digital engagement. Augmented Reality (AR) technology will move beyond filters to become standard for immersive product trials, interactive ads, and virtual meetups.
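The revenue figures above can be cross-checked with the standard compound-growth relation, end = start × (1 + rate)^years. The sketch below is illustrative only: it assumes the quoted percentages are compound annual rates over exactly five periods, and the function names are made up for this example.

```python
def project(start_value, annual_rate, years):
    """Value after compounding start_value at annual_rate for the given years."""
    return start_value * (1 + annual_rate) ** years


def implied_start(end_value, annual_rate, years):
    """Starting value implied by an end value and a compound annual rate."""
    return end_value / (1 + annual_rate) ** years


def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values."""
    return (end_value / start_value) ** (1 / years) - 1


# Revenue in GBP billions, using the figures quoted in the text.
forecast_2030_31 = project(12.5, 0.092, 5)   # five years on from 2025-26
base_2020_21 = implied_start(12.5, 0.203, 5)  # five years before 2025-26
implied_rate = cagr(12.5, 19.4, 5)            # rate implied by the two endpoints
```

Under these assumptions the projection lands close to the £19.4 billion quoted for 2030-31, and the implied 2020-21 base comes out at roughly £5 billion.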

https://brightdata.com/license
Gain valuable insights with our comprehensive Social Media Dataset, designed to help businesses, marketers, and analysts track trends, monitor engagement, and optimize strategies. This dataset provides structured and reliable social media data from multiple platforms.
Dataset Features
User Profiles: Access public social media profiles, including usernames, bios, follower counts, engagement metrics, and more. Ideal for audience analysis, influencer marketing, and competitive research.
Posts & Content: Extract posts, captions, hashtags, media (images/videos), timestamps, and engagement metrics such as likes, shares, and comments. Useful for trend analysis, sentiment tracking, and content strategy optimization.
Comments & Interactions: Analyze user interactions, including replies, mentions, and discussions. This data helps brands understand audience sentiment and engagement patterns.
Hashtag & Trend Tracking: Monitor trending hashtags, topics, and viral content across platforms to stay ahead of industry trends and consumer interests.
Customizable Subsets for Specific Needs
Our Social Media Dataset is fully customizable, allowing you to filter data based on platform, region, keywords, engagement levels, or specific user profiles. Whether you need a broad dataset for market research or a focused subset for brand monitoring, we tailor the dataset to your needs.
Popular Use Cases
Brand Monitoring & Reputation Management: Track brand mentions, customer feedback, and sentiment analysis to manage online reputation effectively.
Influencer Marketing & Audience Analysis: Identify key influencers, analyze engagement metrics, and optimize influencer partnerships.
Competitive Intelligence: Monitor competitor activity, content performance, and audience engagement to refine marketing strategies.
Market Research & Consumer Insights: Analyze social media trends, customer preferences, and emerging topics to inform business decisions.
AI & Predictive Analytics: Leverage structured social media data for AI-driven trend forecasting, sentiment analysis, and automated content recommendations.
Whether you're tracking brand sentiment, analyzing audience engagement, or monitoring industry trends, our Social Media Dataset provides the structured data you need. Get started today and customize your dataset to fit your business objectives.

During a 2025 survey among marketers worldwide, around 83 percent reported using Facebook for marketing purposes. Instagram and LinkedIn followed, mentioned by 78 and 69 percent of the respondents, respectively.
The global social media marketing segment
According to the same study, 60 percent of responding marketers intended to increase their organic use of YouTube for marketing purposes throughout that year. LinkedIn and Instagram followed with similar shares, rounding out the top three social media platforms with planned growth in organic use among global marketers in 2025. Their main driver is increasing brand exposure and traffic, which led the ranking of benefits of social media marketing worldwide.
Social media for B2B marketing
Social media platform adoption rates among business-to-consumer (B2C) and business-to-business (B2B) marketers vary according to each subsegment's focus. While B2C professionals prioritize Facebook and Instagram, both run by Meta Platforms, due to their popularity among online audiences, B2B marketers concentrate their efforts on Microsoft-owned LinkedIn because it aims to connect people and companies in a corporate context.

This case study is an analysis of how social media affects mental health and recommendations for healthy social media usage based on the insights gathered from the analysis. You will find here the CSV file of the dataset used for the case study, the case study roadmap, the data cleaning log including its R code, the analysis documentation including its R code, and a related presentation discussing the findings of the case study.
Credit (dataset acquired from): https://www.kaggle.com/datasets/souvikahmed071/social-media-and-mental-health

MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
This data was collected and analyzed as part of a study on PII disclosures in social media conversations, with special attention to influencer characteristics in the interactions. The study is reported in the dissertation titled "Privacy vs. Social Capital: Examining Information Disclosure Patterns within Social Media Influencer Networks" and the research paper titled "Unveiling Influencer-Driven Personal Data Sharing in Social Media Discourse".
Each study phase uses different data, with X (Twitter) data used in the pilot analysis and Reddit data used in the main study. Both folders contain the analyzed_posts and cluster summary CSV files, broken down by collection (based on either trend or collection date).
Note: Raw data is not made available in these datasets due to the nature of the study and to protect the original authors.
Columns in the analyzed_posts files:
| Column name | Type | Description |
|---|---|---|
| Node ID | UUID | Unique identifier for post (replaces original platform identifier) |
| User ID | UUID | Unique identifier assigned for user (replaces original platform identifier) |
| Cluster Name | Str | Composite ID for subgraph using collection name and subgraph index |
| Influence Power | Float | Eigenvector centrality |
| Influencer Tier | Str | Categorical label calculated by follower count |
| Collection Name | Str | Trend collection assigned based on search query |
| Hashtags | Set(str) | The set of hashtags included in the node |
| PII Disclosed | Bool | Whether or not PII was disclosed |
| PII Detected | Set(str) | The detected token types in post |
| PII Risk Score | Float | The PII score for all tokens in a post |
| Is Comment | Bool | Whether or not the post is a comment or reply |
| Is Text Starter | Bool | Whether or not the post has text content |
| Community | Str | The group, community, channel, etc. associated with the post |
| Timestamp | Timestamp | Creation timestamp (provided by social media API) |
| Time Elapsed | Int | Time elapsed (seconds) from original influencer’s post |

Columns in the cluster summary files:
| Column Name | Type | Description |
|---|---|---|
| Cluster Name | Str | Composite ID for subgraph using collection name and subgraph index |
| Influencer Tiers Frequencies | List[dict] | Frequency of influencer tiers of all users in the cluster |
| Top Influence Power Score | Float | Eigenvector centrality of top influencer |
| Top Influencer Tier | Str | Size tier of top influencer |
| Collection Name | Str | Trend collection assigned based on search query |
| Hashtags | Set(str) | The set of hashtags included in the cluster |
| PII Detection Frequencies | List[dict] | The detected token types in post with frequencies |
| Node Count | Int | Count of all nodes in the influencer cluster |
| Node Disclosures | Int | Count of all nodes with mean_risk_score > 1* |
| Disclosure Ratio | Float | Sum of nodes with confirmed disclosed PII divided by overall cluster size (count of nodes in the cluster) |
| Mean Risk Score | Float | The mean risk score for an entire network cluster |
| Median Risk Score | Float | The median risk score for an entire network cluster |
| Min Risk Score | Float | The min risk score for an entire network cluster |
| Max Risk Score | Float | The max risk score for an entire network cluster |
| Time Span | Float | Total Time Elapsed |
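To make the relationship between the two tables concrete, the sketch below shows one way the per-cluster summary fields could be derived from per-post records. It is not the study's code. The dictionary keys are hypothetical, and three choices are assumptions: Node Disclosures counts posts whose risk score exceeds 1 (the footnote marked in the table is not available here), Disclosure Ratio uses the PII Disclosed flag, and Time Span is taken as the largest Time Elapsed value in the cluster.

```python
from statistics import mean, median


def summarize_cluster(cluster_name, nodes, risk_threshold=1.0):
    """Aggregate per-post records (dicts) into one cluster summary row.

    Each node is assumed to carry 'pii_disclosed' (bool),
    'pii_risk_score' (float), and 'time_elapsed' (seconds).
    """
    if not nodes:
        raise ValueError("a cluster must contain at least one node")
    scores = [n["pii_risk_score"] for n in nodes]
    confirmed = sum(1 for n in nodes if n["pii_disclosed"])
    return {
        "Cluster Name": cluster_name,
        "Node Count": len(nodes),
        # Assumption: a node counts as a disclosure when its score exceeds the threshold.
        "Node Disclosures": sum(1 for s in scores if s > risk_threshold),
        # Confirmed disclosures divided by the overall cluster size.
        "Disclosure Ratio": confirmed / len(nodes),
        "Mean Risk Score": mean(scores),
        "Median Risk Score": median(scores),
        "Min Risk Score": min(scores),
        "Max Risk Score": max(scores),
        # Assumption: span is the largest elapsed time since the influencer's post.
        "Time Span": float(max(n["time_elapsed"] for n in nodes)),
    }
```

Because the raw data is not published, a sketch like this can only be exercised on synthetic records.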

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The average Twitter user spends 5.1 hours per month on the platform.

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
YouTube gets an average of 14.3 billion total worldwide visits every month.

https://sqmagazine.co.uk/privacy-policy/
In 2004, a Harvard student launched a platform that would go on to redefine how humans connect. Fast forward to 2025, social media isn't just a way to stay in touch, it’s where people shop, learn, protest, play, and even find love. From early-morning scrolls to late-night reels, platforms have...

Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
The dataset provides structured information about the top 100 influencers from various countries globally. Each entry represents an influencer and includes the following attributes:

Open Database License (ODbL) v1.0: https://www.opendatacommons.org/licenses/odbl/1.0/
License information was derived automatically
In the last two decades, social media usage has surged, reaching nearly five billion users worldwide in 2022. Unfortunately, mental health issues have risen over the same period. Through a two-phase data analysis, this project studies the patterns of mental health influenced by social media. Analyzing data from 479 individuals across various platforms, the study employs K-means clustering to categorize mental health states into three groups, each indicating a different level of need for professional intervention. In the subsequent supervised learning phase, predictive models, including the Naive Bayes model with an under-sampled dataset and the Decision Tree model with an oversampled dataset, were developed to determine mental health categories, achieving an accuracy of 60.42%. These models, developed with comprehensive predictors, offer valuable insights for future research and point to the need for interventions addressing mental health challenges linked to social media use. Table 1 displays the variables, their descriptions, and value types.
https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F13828311%2Fd9e0fb90d862e58aba958a14b3b8dcea%2FScreen%20Shot%202023-12-14%20at%2012.27.20%20PM.png?generation=1702578478575969&alt=media
Phase I: Unsupervised Learning Techniques
K-means Clustering Model
Using the elbow method pictured below in Plot 1, we could visualize the optimal number of clusters (K), and then perform the K-means clustering with the optimal K. Several values for K were considered, and models were created for K = 2, 3, 4, 5, 6, 7, and 8, which were then compared.
https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F13828311%2Fa77706842d108c7fbee363c1192b763a%2FScreen%20Shot%202023-12-14%20at%2012.08.01%20PM.png?generation=1702577407983039&alt=media
In Table 4 we can see the comparison of the BSS/TSS ratios (between-cluster sum of squares divided by total sum of squares). K = 3 is the last model that shows a significant jump in this ratio and is therefore taken as the optimal model.
https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F13828311%2F9a44382d9c08a616bd0248f150b85526%2FScreen%20Shot%202023-12-14%20at%2012.08.20%20PM.png?generation=1702577436944201&alt=media
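The comparison of BSS/TSS ratios across candidate values of K can be illustrated with a small, dependency-free sketch. This is not the project's code: it uses a plain Lloyd's-algorithm K-means with a deterministic farthest-point initialisation (an assumption chosen so the example is reproducible), and the function names and toy data are made up for illustration.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))


def init_centers(points, k):
    """Farthest-point initialisation: start at the first point, then
    repeatedly add the point farthest from all chosen centers."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm; returns the final centers and cluster memberships."""
    centers = init_centers(points, k)
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centers[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous center.
        new_centers = [centroid(c) if c else centers[i] for i, c in enumerate(clusters)]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, clusters


def bss_tss_ratio(points, k):
    """Share of total variance explained by the clustering: (TSS - WSS) / TSS."""
    _, clusters = kmeans(points, k)
    grand = centroid(points)
    tss = sum(sq_dist(p, grand) for p in points)
    wss = sum(sq_dist(p, centroid(c)) for c in clusters if c for p in c)
    return (tss - wss) / tss


# Toy data: three well-separated groups of four points each.
data = [(0, 0), (0, 1), (1, 0), (1, 1),
        (10, 10), (10, 11), (11, 10), (11, 11),
        (20, 0), (20, 1), (21, 0), (21, 1)]
ratios = {k: bss_tss_ratio(data, k) for k in range(2, 9)}
```

Reading the ratios in order of K and looking for the last large increase mirrors the selection rule described above. On real survey data the jump is usually less clear-cut than on this toy set.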
In Table 5, we can observe the cluster centers for each variable within each cluster in the K-means clustering model with K = 3.
https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F13828311%2Fdf92bc28b65f67d88efa3b8a96295dcc%2FScreen%20Shot%202023-12-14%20at%2012.09.13%20PM.png?generation=1702577557552624&alt=media
Based on the above cluster centers, we could interpret the cluster groups as shown in Table 6 below:
https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F13828311%2F1d0624052cfc9ce50e7bc5b404d916d0%2FScreen%20Shot%202023-12-14%20at%2012.08.34%20PM.png?generation=1702577449886328&alt=media
Phase II: Supervised Learning Techniques
Prediction Models
Data Input
https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F13828311%2F51672c4d16a801532a3ac8017cf72958%2FScreen%20Shot%202023-12-14%20at%2012.16.16%20PM.png?generation=1702577897888133&alt=media
Above in Image A, we can see a sneak peek of the dataset with the new variable 'MHScore,' indicating mental health state cluster groups.
The outcome variable (MHScore) is categorical and multi-class (3 Levels: 1,2,3). Therefore, the implemented models include Naïve Bayes (NB), Support Vector Machines (SVM), SVM with parameter changes, Decision Trees, and Pruned Decision Trees.
https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F13828311%2F06827fe209b78ffbddee69b272a8cdfc%2FScreen%20Shot%202023-12-14%20at%2012.20.41%20PM.png?generation=1702578062241650&alt=media
Table 11 summarizes the results of the best model from each predictive machine learning technique, reporting accuracy, balanced accuracy, sensitivity, specificity, and precision for each class. Each model was developed using the same predictors from the dataset, including age, gender, relationship status, occupation, organization of employment, social media usage, the number of social media platforms used, the hours spent on social media, and the frequency of social media use. The higher accuracy observed with both the under-sampled and oversampled datasets indicates the importance of class balance.
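For readers unfamiliar with the per-class measures named above, the sketch below shows how they are commonly derived from a multi-class confusion matrix by treating each class one-vs-rest. It is an illustration under assumptions, not the project's code: the function name and the example matrix are made up, and per-class balanced accuracy is taken as the mean of sensitivity and specificity.

```python
def class_metrics(confusion, labels):
    """Overall accuracy plus one-vs-rest metrics for each class.

    confusion[i][j] is the number of samples whose true class is labels[i]
    and whose predicted class is labels[j].
    """
    total = sum(sum(row) for row in confusion)
    accuracy = sum(confusion[i][i] for i in range(len(labels))) / total
    per_class = {}
    for i, label in enumerate(labels):
        tp = confusion[i][i]
        fn = sum(confusion[i]) - tp                 # true class i, predicted otherwise
        fp = sum(row[i] for row in confusion) - tp  # predicted i, true class differs
        tn = total - tp - fn - fp
        sensitivity = tp / (tp + fn) if tp + fn else 0.0
        specificity = tn / (tn + fp) if tn + fp else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        per_class[label] = {
            "sensitivity": sensitivity,
            "specificity": specificity,
            "precision": precision,
            "balanced_accuracy": (sensitivity + specificity) / 2,
        }
    return accuracy, per_class


# Hypothetical confusion matrix for the three MHScore levels (1, 2, 3).
example = [[8, 1, 1],
           [2, 6, 2],
           [0, 2, 8]]
overall_accuracy, metrics_by_class = class_metrics(example, [1, 2, 3])
```

Looking at sensitivity and specificity per class, rather than accuracy alone, is what reveals whether a model is simply favouring the majority class, which is why class balance matters for the comparison above.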

CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset provides detailed daily records of social media follower growth for multiple brands across major platforms, including engagement rates, campaign activity, ad spend, and top-performing content. It empowers marketing teams to analyze the impact of campaigns, content strategies, and paid promotions on follower growth, enabling data-driven optimization and benchmarking.

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Users spend an average of 19.6 hours per month on TikTok alone. This works out to be approximately 39 minutes per day.
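The conversion from monthly hours to daily minutes can be reproduced with a one-line calculation. The sketch assumes a 30-day month, which the source does not state. With an average calendar month of about 30.44 days the result would be slightly lower.

```python
def minutes_per_day(hours_per_month, days_per_month=30):
    """Convert a monthly total in hours to an average number of minutes per day."""
    return hours_per_month * 60 / days_per_month


tiktok_daily = minutes_per_day(19.6)  # the TikTok figure quoted above
```

The same function can be applied to other monthly figures in this listing, such as the 5.1 hours per month reported for Twitter.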

Salutary Data is a boutique B2B contact and company data provider that's committed to delivering high-quality data for sales intelligence, lead generation, marketing, recruiting/HR, identity resolution, and ML/AI. Our database currently consists of 148MM+ highly curated B2B contacts (US only), along with 4M+ companies, and is updated regularly to ensure we have the most up-to-date information.
We can enrich your in-house data (CRM enrichment, lead enrichment, etc.) and provide you with a custom dataset (such as a lead list) tailored to your target audience specifications and data use case. We also support large-scale data licensing to software providers and agencies that intend to redistribute our data to their customers and end users.
What makes Salutary unique?
- We offer our clients a one-stop aggregation of best-of-breed data sources. Our supplier network consists of numerous established, high-quality suppliers that are rigorously vetted.
- We leverage third-party verification vendors to ensure phone numbers and emails are accurate and connect to the right person. Additionally, we deploy automated and manual verification techniques to ensure we have the latest job information for contacts.
- We're reasonably priced and easy to work with.
Products:
- API Suite
- Web UI
- Full and Custom Data Feeds
Services:
- Data Enrichment: We assess the fill rate gaps and profile your customer file for the purpose of appending fields, updating information, and/or rendering net new “look alike” prospects for your campaigns.
- ABM Match & Append: Send us your domain or other company-related files, and we’ll match your Account Based Marketing targets and provide you with B2B contacts to campaign. Optionally include your suppression file to avoid any redundant records.
- Verification (“Cleaning/Hygiene”) Services: Address the 2% per month aging issue on contact records! We will identify duplicate records and contacts no longer at the company, remove email hard bounces, and update or replace titles and phone numbers. This is right up our alley and leverages our existing internal and external processes and systems.