As of February 2025, around 14.1 percent of TikTok's global audience were women aged 18 to 24, while men of the same age made up approximately 16.6 percent. Women aged 25 to 34 accounted for a further 14.6 percent of the platform's audience, and men in the same age group for 20.7 percent.
https://creativecommons.org/publicdomain/zero/1.0/
This dataset contains survey responses about social media usage patterns and their perceived effects on relationships and mental health. The data was collected from individuals primarily in the 18-25 age group.
As of April 2024, almost 32 percent of global Instagram audiences were aged between 18 and 24 years, and 30.6 percent of users were aged between 25 and 34 years. Overall, 16 percent of users belonged to the 35 to 44 year age group.
Instagram users
With roughly one billion monthly active users, Instagram is one of the most popular social networks worldwide. The social photo sharing app is especially popular in India and the United States, which have 362.9 million and 169.7 million Instagram users, respectively.
Instagram features
One of the most popular features of Instagram is Stories. Users can post photos and videos to their Stories stream, and the content stays live for others to view for 24 hours before it disappears. In January 2019, the company reported that there were 500 million daily active Instagram Stories users. Instagram Stories directly competes with Snapchat, another photo sharing app that initially became famous due to its “vanishing photos” feature.
As of the second quarter of 2021, Snapchat had 293 million daily active users.
https://www.ibisworld.com/about/termsofuse/
Over the five years through 2025-26, industry revenue is forecast to expand at a compound annual rate of 20.3% to reach £12.5 billion. Social media platforms are integral to people's lives, offering ways to communicate, create and view content, and share information. According to Ofcom, approximately 89% of UK internet users used social media apps or sites in 2023. Teenagers and young adults are the biggest users. Advertising is the primary revenue source for social media platforms, although subscription-based services are gaining momentum as platforms seek to diversify their income. TikTok is the success story of the past five years, becoming the most downloaded app between 2020 and 2022, according to Apptopia. The short-form video platform has over 30 million monthly users in the UK in 2025.
After Musk's takeover, X, formerly known as Twitter, adjusted its content moderation and allowed previously banned accounts to return. As a result, over 600 advertisers pulled their ads from the site for fear that their brands would be associated with objectionable content. In response to falling ad revenue, X has introduced a subscription-based service that lets users verify themselves and boosts the number of people who view their tweets. Meta-owned Facebook and Instagram have responded by introducing a similar service. In 2025, more social media platforms are using AI to boost user engagement, which improves click-through rates and drives higher advertising revenue. Industry revenue is expected to grow by 6.3% in 2025-26.
Over the five years through 2030-31, social media platforms' revenue is projected to climb at an estimated compound annual rate of 9.2% to reach £19.4 billion. Regulations relating to how data is collected, stored, and shared will force advertisers and platforms to rethink how they can target their desired demographics. The tightening of regulations will raise industry compliance costs, weighing on profit margins. Older age groups present a new revenue opportunity for social media platforms if they can bridge the gap between passive TV consumption and interactive digital engagement. Augmented Reality (AR) technology will move beyond filters to become standard for immersive product trials, interactive ads, and virtual meetups.
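The growth figures quoted above can be cross-checked with simple compound-growth arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the implied 2020-21 starting value is derived from the quoted rate rather than reported in the source.

```python
def compound_growth(start_value: float, annual_rate: float, years: int) -> float:
    """Value after `years` of growth at a constant compound annual rate."""
    return start_value * (1.0 + annual_rate) ** years


def implied_start(end_value: float, annual_rate: float, years: int) -> float:
    """Starting value implied by an end value and a compound annual rate."""
    return end_value / (1.0 + annual_rate) ** years


# Figures quoted in the text, in GBP billions.
revenue_2025_26 = 12.5

# 9.2% a year for five years from 2025-26 (the text quotes 19.4 for 2030-31).
projected_2030_31 = compound_growth(revenue_2025_26, 0.092, 5)

# Derived, not reported: the base implied by 20.3% a year over five years.
implied_2020_21 = implied_start(revenue_2025_26, 0.203, 5)

print(f"Projected 2030-31 revenue: {projected_2030_31:.1f} bn")
print(f"Implied 2020-21 revenue: {implied_2020_21:.2f} bn")
```

Compounding £12.5 billion at 9.2% for five years reproduces the quoted £19.4 billion, which suggests the 9.2% figure is an annual compound rate.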
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
*****Documentation Process*****
1. Data Preparation:
- Upload the data into Power Query to assess quality and identify any duplicate values.
- Verify data quality and types for each column, correcting misspellings and inconsistencies.
2. Data Management:
- Duplicate the original data sheet for future reference and label the new sheet "Working File" to preserve the integrity of the original dataset.
3. Understanding Metrics:
- Clarify the meaning of the column headers, in particular the difference between Impressions and Reach, and how Engagement Rate is calculated.
- Engagement Rate formula: total likes, comments, and shares divided by Reach.
4. Data Integrity Assurance:
- Impressions should be no smaller than Reach, because Impressions count total views while Reach counts the unique audience.
- Investigate rows where this does not hold, and identify and resolve the root cause so that reporting and analysis stay accurate.
5. Data Correction:
- Work with the relevant team to understand the root cause of discrepancies between Impressions and Reach.
- Identify instances where Reach surpasses Impressions, potentially attributable to data transformation errors.
- After rectification, update the dataset with the corrected Impressions and Reach values.
- Recalculate the Engagement Rate once the corrections are in place.
6. Data Enhancement:
- Categorize Audience Age into three groups, "Senior Adults" (45+ years), "Mature Adults" (31-45 years), and "Adolescent Adults" (<30 years), in a new column named "Age Group."
- Split date and time into separate columns using the text-to-columns option for improved analysis.
7. Temporal Analysis:
- Add a "Weekend and Weekday" column, renamed "Weekday Type," to reveal patterns and trends in engagement.
- Categorize times into "Morning," "Afternoon," "Evening," and "Night" based on time intervals.
8. Sentiment Analysis:
- Populate blank cells in the Sentiment column with "Mixed Sentiment," denoting content that contains both positive and negative sentiment or is ambiguous.
9. Geographical Analysis:
- Group countries and obtain continent data from an online source (e.g., https://statisticstimes.com/geography/countries-by-continents.php).
- Add a new column for "Audience Continent" and use the XLOOKUP function to retrieve the corresponding continent.
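The derived columns in steps 3 to 7 can be sketched in a few lines of Python. This is a minimal illustration and not the spreadsheet workflow the author describes: the function names are hypothetical, the age boundaries at 30 and 45 are an assumption (the source leaves them ambiguous), and the hour cut-offs for the time periods are also assumed because the source does not state them.

```python
from datetime import datetime


def engagement_rate(likes: int, comments: int, shares: int, reach: int) -> float:
    """(likes + comments + shares) / reach, as defined in step 3."""
    if reach <= 0:
        raise ValueError("reach must be positive")
    return (likes + comments + shares) / reach


def reach_exceeds_impressions(impressions: int, reach: int) -> bool:
    """Flag rows where the unique audience is larger than total views."""
    return reach > impressions


def age_group(age: int) -> str:
    # Assumed boundaries: the source does not say where ages 30 and 45 belong.
    if age <= 30:
        return "Adolescent Adults"
    if age <= 45:
        return "Mature Adults"
    return "Senior Adults"


def weekday_type(ts: datetime) -> str:
    # datetime.weekday() returns 5 for Saturday and 6 for Sunday.
    return "Weekend" if ts.weekday() >= 5 else "Weekday"


def time_period(ts: datetime) -> str:
    # Assumed hour cut-offs; adjust to the intervals actually used.
    hour = ts.hour
    if 5 <= hour < 12:
        return "Morning"
    if 12 <= hour < 17:
        return "Afternoon"
    if 17 <= hour < 21:
        return "Evening"
    return "Night"


posted_at = datetime(2024, 7, 6, 21, 30)
print(engagement_rate(50, 30, 20, 1000))
print(age_group(38), weekday_type(posted_at), time_period(posted_at))
```

Keeping each rule in its own small function makes it easy to change one boundary without touching the others.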
*****Drawing Conclusions and Providing a Summary*****
http://opendatacommons.org/licenses/dbcl/1.0/
This dataset provides a comprehensive and diverse snapshot of social media users and their engagements across various popular platforms such as Instagram, Twitter, Facebook, YouTube, Pinterest, TikTok, and Spotify. With 100 rows of anonymized data, it offers valuable insights into the dynamic world of social media usage. 😀
Each row in the dataset represents a unique user with a designated User ID and Username to ensure anonymity. Alongside user-specific details, the dataset captures essential information, including the platform being used, the post's content, timestamp, and media type (text, image, or video). Additionally, it tracks engagement metrics such as likes, comments, shares/retweets, and user interactions, providing an overview of the user's popularity and social impact. 💬
The dataset also includes pertinent user attributes, such as account creation date, privacy settings, number of followers, and following. The users' profiles are further enriched with demographic characteristics, including anonymized representations of their age group and gender. 🗨️
Hashtags, mentions, media URLs, post URLs, and self-reported location contribute to understanding user interests, content themes, and geographic distribution. Moreover, users' bios and language preferences offer insights into their passions, activities, and linguistic communication on the platforms.
How many people use social media?
Social media usage is one of the most popular online activities. In 2024, over five billion people were using social media worldwide, a number projected to increase to over six billion in 2028.
Who uses social media?
Social networking is one of the most popular digital activities worldwide, and it is no surprise that social networking penetration across all regions is constantly increasing. As of January 2023, the global social media usage rate stood at 59 percent. This figure is anticipated to grow as less developed digital markets catch up with other regions in infrastructure development and the availability of cheap mobile devices. In fact, most of social media’s global growth is driven by the increasing usage of mobile devices. Mobile-first market Eastern Asia topped the global ranking of mobile social networking penetration, followed by established digital powerhouses such as the Americas and Northern Europe.
How much time do people spend on social media?
Social media is an integral part of daily internet usage. On average, internet users spend 151 minutes per day on social media and messaging apps, an increase of 40 minutes since 2015. Internet users in Latin America spent the most time per day on social media.
What are the most popular social media platforms?
Market leader Facebook was the first social network to surpass one billion registered accounts and currently boasts approximately 2.9 billion monthly active users, making it the most popular social network worldwide. In June 2023, the top social media apps in the Apple App Store included mobile messaging apps WhatsApp and Telegram Messenger, as well as the ever-popular app version of Facebook.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Health organizations are increasingly using social media, such as Twitter, to disseminate health messages to target audiences. Determining the extent to which the target audience (e.g., age groups) was reached is critical to evaluating the impact of social media education campaigns. The main objective of this study was to examine the separate and joint predictive validity of linguistic and metadata features in predicting the age of Twitter users. We created a labeled dataset of Twitter users across different age groups (youth, young adults, adults) by collecting publicly available birthday announcement tweets using the Twitter Search application programming interface. We manually reviewed results and, for each age-labeled handle, collected the 200 most recent publicly available tweets and the handle's metadata. The labeled data were split into training and test datasets. We created separate models to examine the predictive validity of language features only, metadata features only, language and metadata features, and words/phrases from another age-validated dataset. We estimated accuracy, precision, recall, and F1 metrics for each model. An L1-regularized logistic regression model was fitted for each age group, and predicted probabilities between the training and test sets were compared for each age group. Cohen’s d effect sizes were calculated to examine the relative importance of significant features. Models containing both Tweet language features and metadata features performed the best (74% precision, 74% recall, 74% F1), while the model containing only Twitter metadata features was the least accurate (58% precision, 60% recall, and 57% F1 score). Top predictive features included use of terms such as “school” for youth and “college” for young adults. Overall, it was more challenging to predict older adults accurately.
These results suggest that examining linguistic and Twitter metadata features to predict youth and young adult Twitter users may be helpful for informing public health surveillance and evaluation research.
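The precision, recall, and F1 metrics reported above are computed per class from true and predicted labels. The sketch below shows the standard definitions on made-up labels; the function name and the toy data are hypothetical and are not taken from the study.

```python
def per_class_scores(y_true, y_pred, label):
    """Return (precision, recall, F1) for one class label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up labels for illustration only.
y_true = ["youth", "youth", "young adult", "young adult", "adult", "adult"]
y_pred = ["youth", "young adult", "young adult", "young adult", "adult", "youth"]

for label in ("youth", "young adult", "adult"):
    precision, recall, f1 = per_class_scores(y_true, y_pred, label)
    print(f"{label}: precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Reporting the three numbers per age group, rather than a single accuracy figure, shows which groups a model confuses.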
As of January 2025, ** percent of social media users in the United States aged 40 to 49 years were users of Facebook, as were ** percent of ** to ** year olds in the country. Overall, ** percent of those aged 18 to 29 years were using Instagram in the U.S.
The social media market in the United States
The number of social media users in the United States has shown continuous growth in the past years, and it is forecast to continue increasing to reach *** million users in 2029. As of 2023, the social network user penetration in the United States amounted to an impressive ***** percent, meaning that more than nine in ten people in the country engaged with online platforms. Furthermore, Facebook was by far the most popular social media platform in the United States, accounting for ** percent of all social media visits in 2023, followed by Pinterest with **** percent of visits.
The global social media landscape
As of April 2024, **** billion people were social media users, accounting for **** percent of the world’s population. Northern Europe was the region with the highest social media penetration rate with a reach of **** percent, followed by Western Europe with **** percent and Eastern Asia with **** percent. In contrast, less than one in ten people in Middle Africa used social networks. Facebook’s popularity is not limited to the United States: this network leads the market on a global scale, and it had accumulated more than three billion monthly active users (MAU) as of 2024, far more than any other social media platform. YouTube, Instagram, and WhatsApp followed, all with *** billion or more MAU.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
This social media content dataset simulates realistic influencer posts across multiple popular platforms, reflecting diverse content types, sponsorship details, audience demographics, and engagement metrics. The dataset contains over 52,000 rows representing individual content posts generated over the past two years. It includes a balanced distribution of sponsored and non-sponsored content, with detailed disclosure information to support transparency studies and analyses. The variety of platforms, languages, content categories, and audience demographics makes this dataset ideal for exploring influencer marketing dynamics, content performance analytics, disclosure practices, and audience segmentation in social media research.
Dataset Features
id: Unique identifier for each content post (starting from 1).
platform: The social media platform where the content was posted. Values: YouTube, TikTok, Instagram, Bilibili, RedNote.
content_id: Unique ID for each content piece (e.g., content_0, content_1, …).
creator_id: Unique identifier for the content creator, cycling through 5000 distinct creators.
creator_name: Username of the content creator.
content_url: URL pointing to the content.
content_type: Format of the content. Values: video, image, text, mixed.
content_category: The main theme or niche of the content. Values: beauty, lifestyle, tech.
post_date: Timestamp of the post, randomly distributed over the past two years.
language: Language of the content, with probabilities favoring English. Values: English, Chinese, Spanish, Hindi, Japanese.
content_length: Length of the content in seconds (for video) or word count (for text), varying by content type.
content_description: Textual description or caption of the content.
hashtags: A comma-separated string of hashtags used in the post (0 to 5 tags).
views: Number of views (simulated via a Poisson distribution).
likes: Number of likes received.
shares: Number of shares.
comments_count: Count of comments on the post.
comments_text: Aggregated text of comments (0 to 5 comments concatenated).
follower_count: Number of followers the creator had at the time of posting.
is_sponsored: Boolean indicating whether the post is sponsored.
disclosure_type: Disclosure type regarding sponsorship for sponsored posts. Values: explicit, implicit, none (non-sponsored always 'none').
sponsor_name: Name of the sponsoring company if sponsored, else 'Not sponsors'.
sponsor_category: Sponsorship industry category. Values: cosmetics, electronics, fashion, food, gaming, travel or 'Not sponsors'.
disclosure_location: Where sponsorship disclosure appears in the post. Values: video, caption, hashtags, none (non-sponsored always 'none').
audience_age_distribution: Predominant age group of the audience. Values: 13-18, 19-25, 26-35, 36-50, 50+.
audience_gender_distribution: Predominant gender of the audience. Values: male, female, non-binary, unknown.
audience_location: Primary geographic location of the audience. Values: USA, China, India, Japan, Brazil, Germany, UK, Russia.
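The field descriptions above imply a few consistency rules between the sponsorship columns, for example that non-sponsored rows always carry 'none' and 'Not sponsors' placeholders. A minimal sketch of a row check follows. The function name is hypothetical, and it assumes each row has already been parsed into a dict with is_sponsored as a real boolean rather than a string.

```python
NOT_SPONSORED = "Not sponsors"  # placeholder string quoted in the field list


def sponsorship_issues(row: dict) -> list:
    """Return a list of human-readable problems found in one row."""
    issues = []
    if row["is_sponsored"]:
        # A sponsored post should name its sponsor.
        if row["sponsor_name"] == NOT_SPONSORED:
            issues.append("sponsored post has no sponsor_name")
    else:
        # Non-sponsored posts use fixed placeholder values.
        expected = {
            "disclosure_type": "none",
            "disclosure_location": "none",
            "sponsor_name": NOT_SPONSORED,
            "sponsor_category": NOT_SPONSORED,
        }
        for field, value in expected.items():
            if row[field] != value:
                issues.append(f"{field} should be {value!r} for non-sponsored posts")
    return issues


# Made-up rows for illustration.
clean_row = {
    "is_sponsored": False,
    "disclosure_type": "none",
    "disclosure_location": "none",
    "sponsor_name": NOT_SPONSORED,
    "sponsor_category": NOT_SPONSORED,
}
bad_row = dict(clean_row, disclosure_type="explicit")

print(sponsorship_issues(clean_row))
print(sponsorship_issues(bad_row))
```

Running a check like this before analysis helps catch rows that would otherwise distort disclosure-rate statistics.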
As of September 2024, 46 percent of social media users in the United Kingdom were aged between 30 and 49 years. Overall, one-quarter of social network users in the UK were aged 18 to 29 years. Over 20 percent of social media users in the UK stated that Facebook was their favorite social media platform.
Open Database License (ODbL) v1.0 https://www.opendatacommons.org/licenses/odbl/1.0/
License information was derived automatically
In the last two decades, social media usage has surged, reaching nearly five billion users worldwide in 2022. Unfortunately, mental health issues have also risen over the same period. Through a two-phase data analysis, this project studies the patterns of mental health influenced by social media. Analyzing data from 479 individuals across various platforms, the study employs K-means clustering to categorize mental health states into three groups, each indicating a different level of need for professional help or intervention. In the subsequent supervised learning phase, predictive models, including the Naive Bayes model with an under-sampled dataset and the Decision Tree model with an oversampled dataset, were developed to determine mental health categories, achieving an accuracy of 60.42%. These models, developed with comprehensive predictors, offer valuable insights for future research and highlight the need for interventions addressing mental health challenges linked to social media use. Table 1 displays the variables, their descriptions, and value types.
https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F13828311%2Fd9e0fb90d862e58aba958a14b3b8dcea%2FScreen%20Shot%202023-12-14%20at%2012.27.20%20PM.png?generation=1702578478575969&alt=media
Phase I : Unsupervised Learning Techniques K-means Clustering Model
Using the elbow method pictured below in plot 1, we could visualize the optimal number of clusters (K), and then perform the K-means clustering with the optimal K. Several values for K were considered, and models were created for K = 2, 3, 4, 5, 6, 7, and 8, which were then compared.
https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F13828311%2Fa77706842d108c7fbee363c1192b763a%2FScreen%20Shot%202023-12-14%20at%2012.08.01%20PM.png?generation=1702577407983039&alt=media
In table 4 we can see the comparison of the bss/tss ratios. K = 3 is the last model with a significant jump and therefore is the optimal model.
https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F13828311%2F9a44382d9c08a616bd0248f150b85526%2FScreen%20Shot%202023-12-14%20at%2012.08.20%20PM.png?generation=1702577436944201&alt=media
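The comparison of BSS/TSS ratios across values of K can be illustrated with a small, dependency-free K-means. This is a sketch under stated assumptions, not the project's own code: the function names, the random initialization, and the toy two-dimensional points are all hypothetical, and the project's actual tooling is not given in the source.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))


def assign(points, centers):
    """Group points by their nearest center."""
    clusters = [[] for _ in centers]
    for p in points:
        nearest = min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
        clusters[nearest].append(p)
    return clusters


def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd iterations with random initial centers."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iterations):
        clusters = assign(points, centers)
        # An empty cluster keeps its previous center.
        new_centers = [mean_point(c) if c else centers[i] for i, c in enumerate(clusters)]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, assign(points, centers)


def bss_tss_ratio(points, clusters):
    """Share of total variance explained by the clustering (BSS / TSS)."""
    grand_mean = mean_point(points)
    tss = sum(sq_dist(p, grand_mean) for p in points)
    wss = 0.0
    for cluster in clusters:
        if cluster:
            center = mean_point(cluster)
            wss += sum(sq_dist(p, center) for p in cluster)
    return (tss - wss) / tss


# Toy data for illustration only.
points = [(1.0, 1.2), (0.8, 1.0), (1.2, 0.9), (5.0, 5.1), (5.2, 4.8),
          (4.9, 5.3), (9.0, 1.0), (9.2, 1.3), (8.8, 0.9)]

for k in range(2, 9):
    _, clusters = kmeans(points, k)
    print(k, round(bss_tss_ratio(points, clusters), 3))
```

The ratio always rises toward 1 as K grows, so the elbow is the value of K after which the gains become small. Because the result depends on the random starting centers, several seeds are usually tried in practice.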
In Table 5, we can observe the cluster centers for each variable within each cluster in the K-means clustering model with K = 3.
https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F13828311%2Fdf92bc28b65f67d88efa3b8a96295dcc%2FScreen%20Shot%202023-12-14%20at%2012.09.13%20PM.png?generation=1702577557552624&alt=media
Based on the above cluster centers, we could interpret the cluster groups as shown in Table 6 below:
https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F13828311%2F1d0624052cfc9ce50e7bc5b404d916d0%2FScreen%20Shot%202023-12-14%20at%2012.08.34%20PM.png?generation=1702577449886328&alt=media
Phase II: Supervised Learning Techniques
Prediction Models
Data Input
https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F13828311%2F51672c4d16a801532a3ac8017cf72958%2FScreen%20Shot%202023-12-14%20at%2012.16.16%20PM.png?generation=1702577897888133&alt=media
Above in Image A, we can see a sneak peek of the dataset with the new variable 'MHScore,' indicating mental health state cluster groups.
The outcome variable (MHScore) is categorical and multi-class (3 Levels: 1,2,3). Therefore, the implemented models include Naïve Bayes (NB), Support Vector Machines (SVM), SVM with parameter changes, Decision Trees, and Pruned Decision Trees.
https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F13828311%2F06827fe209b78ffbddee69b272a8cdfc%2FScreen%20Shot%202023-12-14%20at%2012.20.41%20PM.png?generation=1702578062241650&alt=media
Table 11 summarizes the results of the best model from each predictive machine learning technique, reporting accuracy, balanced accuracy, sensitivity, specificity, and precision for each class. Each model was developed using the same predictors from the dataset: age, gender, relationship status, occupation, organization of employment, social media usage, the number of social media platforms used, the hours spent on social media, and the frequency of social media use. The higher accuracy observed with both the under-sampled and oversampled datasets indicates the importance of class balance.
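Under-sampling and oversampling both aim to give each class the same number of training rows. The source does not say which resampling technique was used, so the sketch below assumes plain random resampling; the function name and the toy rows are hypothetical.

```python
import random
from collections import Counter, defaultdict


def resample_balanced(rows, label_of, mode="over", seed=0):
    """Return a class-balanced copy of rows by random over- or under-sampling."""
    if mode not in ("over", "under"):
        raise ValueError("mode must be 'over' or 'under'")
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row in rows:
        by_class[label_of(row)].append(row)
    sizes = [len(group) for group in by_class.values()]
    # Oversampling grows every class to the largest; under-sampling shrinks to the smallest.
    target = max(sizes) if mode == "over" else min(sizes)
    balanced = []
    for group in by_class.values():
        if len(group) >= target:
            # Draw without replacement (keeps every row when sizes already match).
            balanced.extend(rng.sample(group, target))
        else:
            # Keep all originals and add random duplicates to reach the target.
            balanced.extend(group + rng.choices(group, k=target - len(group)))
    rng.shuffle(balanced)
    return balanced


# Toy rows of (respondent_id, MHScore) with an imbalanced class distribution.
rows = [(i, 1) for i in range(6)] + [(i, 2) for i in range(3)] + [(0, 3)]

over = resample_balanced(rows, label_of=lambda r: r[1], mode="over")
under = resample_balanced(rows, label_of=lambda r: r[1], mode="under")
print(Counter(label for _, label in over))
print(Counter(label for _, label in under))
```

Resampling is normally applied to the training split only, so that the test set keeps the original class distribution.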
https://cdla.io/sharing-1-0/
This dataset provides insights into the daily mobile usage patterns of 1,000 users, covering aspects such as screen time, app usage, and user engagement across different app categories.
It includes a diverse range of users based on age, gender, and location.
The data focuses on total app usage, time spent on social media, productivity, and gaming apps, along with overall screen time.
This information is valuable for understanding behavioral trends and app usage preferences, making it useful for app developers, marketers, and UX researchers.
This dataset is useful for analyzing mobile engagement, app usage habits, and the impact of demographic factors on mobile behavior. It can help identify trends for marketing, app development, and user experience optimization.
This dataset enables a deeper understanding of mobile user behavior and app engagement across different demographics.
Key outcomes include insights into app usage preferences, daily screen time habits, and the impact of age, gender, and location on mobile behavior.
This analysis can help identify patterns for improving user experience, tailoring marketing strategies, and optimizing app development for different user segments.
Open Government Licence - Canada 2.0 https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
Percentage of Internet users by selected Internet service and technology, such as home Internet access, use of smart home devices, use of smartphones, use of social networking accounts, use or purchase of streaming services, use of government services online, and online shopping.
Open Database License (ODbL) v1.0 https://www.opendatacommons.org/licenses/odbl/1.0/
License information was derived automatically
This dataset was originally collected for a data science and machine learning project that aimed at investigating the potential correlation between the amount of time an individual spends on social media and the impact it has on their mental health.
The project involves conducting a survey to collect data, organizing the data, and using machine learning techniques to create a predictive model that can determine whether a person should seek professional help based on their answers to the survey questions.
This project was completed as part of a Statistics course at a university, and the team is currently in the process of writing a report and completing a paper that summarizes and discusses the findings in relation to other research on the topic.
The following is the Google Colab link to the project, done on Jupyter Notebook -
https://colab.research.google.com/drive/1p7P6lL1QUw1TtyUD1odNR4M6TVJK7IYN
The following is the GitHub Repository of the project -
https://github.com/daerkns/social-media-and-mental-health
Libraries used for the Project -
Pandas
NumPy
Matplotlib
Seaborn
scikit-learn
According to a survey of internet users in Great Britain (GB) conducted between July 20 and 23, over half of respondents aged 18 to 24 reported having used Facebook in this period. The same share of users in the age cohort reported having used TikTok, while 81 percent used YouTube in the examined time. Overall, Facebook was the most popular social media platform across the older demographics, with 72 percent of users aged between 50 and 64 reporting having engaged with the Meta-owned platform in the examined period. The 2024 general election in the United Kingdom was held on July 4, and saw parties and politicians make ample use of social media channels in the weeks before the country cast its vote.
This paper specifies, designs and critically evaluates two tools for the automated identification of demographic data (age, occupation and social class) from the profile descriptions of Twitter users in the United Kingdom (UK). Metadata routinely collected through the Collaborative Social Media Observatory (COSMOS: http://www.cosmosproject.net/) relating to UK Twitter users is matched with the occupational lookup tables between job and social class provided by the Office for National Statistics (ONS) using SOC2010. Using expert human validation, the validity and reliability of the automated matching process is critically assessed and a prospective class distribution of UK Twitter users is offered with 2011 Census baseline comparisons. The pattern matching rules for identifying age are explained and enacted following a discussion on how to minimise false positives. The age distribution of Twitter users, as identified using the tool, is presented alongside the age distribution of the UK population from the 2011 Census. The automated occupation detection tool reliably identifies certain occupational groups, such as professionals, whose job titles cannot be confused with hobbies and are not used in common parlance within alternative contexts. An alternative explanation for the prevalence of hobbies is that the creative sector is overrepresented on Twitter compared to 2011 Census data. The age detection tool illustrates the youthfulness of Twitter users compared to the general UK population as of the 2011 Census according to proportions, but projections demonstrate that there is still potentially a large number of older platform users. It is possible to detect “signatures” of both occupation and age from Twitter metadata with varying degrees of accuracy (particularly dependent on occupational groups), but further confirmatory work is needed.
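To give a feel for what pattern matching on profile descriptions involves, here is a minimal sketch. The regular expressions, the function name, and the age bounds are hypothetical illustrations and are not the rules used in the paper, which are not reproduced in this summary.

```python
import re

# Hypothetical patterns for illustration; the paper's actual rules are not shown here.
AGE_PATTERNS = [
    # "21 years old", "21 yrs old", "21yo", "21 y/o"
    re.compile(r"\b(\d{1,2})\s*(?:years?\s*old|yrs?\s*old|y/?o)\b", re.IGNORECASE),
    # "age 19", "aged 34", "age: 27"
    re.compile(r"\bage[d:]?\s*(\d{1,2})\b", re.IGNORECASE),
]


def extract_age(description, min_age=13, max_age=99):
    """Return the first plausible age stated in a profile description, else None."""
    for pattern in AGE_PATTERNS:
        match = pattern.search(description)
        if match:
            age = int(match.group(1))
            # Assumed plausibility bounds, used to reduce false positives.
            if min_age <= age <= max_age:
                return age
    return None


print(extract_age("Student, 21 years old, Cardiff"))
print(extract_age("Football fan since 1999"))
```

Simple rules like these still produce false positives, for example a description that mentions a teenager's age rather than the account holder's, which is why validation against human judgement matters.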
https://sqmagazine.co.uk/privacy-policy/
In 2008, the average human attention span was 12 seconds. Fast forward to 2025, and many studies suggest it's now hovering around 8 seconds, shorter than that of a goldfish. It’s no coincidence that during this same period, social media platforms surged to dominate how we consume content. Whether you're...
https://creativecommons.org/publicdomain/zero/1.0/
By Twitter [source]
This dataset offers an exciting opportunity to explore how YouTube tweets and videos influence social media engagement. Through a comprehensive examination of likes, retweets, and quote counts across various topics or categories of videos, researchers can gain valuable insight into users’ reactions to content from popular YouTube channels. By collecting data on the level of conversations, this dataset will enable researchers to measure the success of YouTube's marketing efforts – along with those of its competitors – in driving viewer engagement. Further research could reveal patterns related to the demographic makeup of viewers across age groups and locations that have higher levels of conversation when responding to posted tweets. Investigating these patterns could be crucial in creating more effective strategies for interacting with potential consumers on social media platforms
How to Use this Dataset
Analyze Conversation Patterns – Researchers can utilize the tweets and conversation metrics in this dataset to analyze the different conversation patterns related to each topic or video category discussed on YouTube. This includes topics such as politics, sports, music, and more. Through analyzing these conversations, researchers can gain an understanding of what drives engagement with YouTube videos and how they may be used as a platform for further marketing campaigns.
Measure Engagement Levels – This dataset contains metrics that allow users to measure overall engagement levels for different topics of Twitter conversations in relation to YouTube content. By studying these metrics, researchers are able to compare the level of engagement that different types of content receive on various social media platforms such as Twitter.
Assess Marketing Strategy Effectiveness – Researchers can use this dataset to assess how effective various marketing strategies are by looking at like counts, retweet counts, and quote counts for particular campaigns or categories of YouTube content. In addition, research could be conducted on specific marketing campaigns in areas where companies wish to optimize their success rates, such as increasing clicks or viewer retention rates.
Compare Engagement Across Platforms – The data in this dataset allows users to compare engagement levels across other social media channels, such as Facebook and Instagram, in order to learn which ones drive higher levels of engagement when it comes to promoting content on YouTube or creating conversations around certain topics.
Studying the correlation between YouTube content categories and levels of engagement, in order to identify which types of videos/topics have higher engagement rates.
Assessing patterns in likes, retweets and conversations in response to different topics or categories of videos related to YouTube content, in order to gain an understanding of the type of content that typically receives the most conversation.
Tracking changes in engagement levels over time for different topics or channels on YouTube, such as popular channels or videos with high view counts, in order to measure the overall effectiveness of YouTube's marketing campaigns across social media platforms
If you use this dataset in your research, please credit the original authors.
License
License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.
If you use this dataset in your research, please credit Twitter.
Please cite the following paper when using this dataset: N. Thakur, “Twitter Big Data as a Resource for Exoskeleton Research: A Large-Scale Dataset of about 140,000 Tweets and 100 Research Questions,” Preprints, 2022, DOI: 10.20944/preprints202206.0383.v1
Abstract
The exoskeleton technology has been rapidly advancing in the recent past due to its multitude of applications and use cases in assisted living, military, healthcare, firefighting, and industries. With the projected increase in the diverse uses of exoskeletons in the next few years in these application domains and beyond, it is crucial to study, interpret, and analyze user perspectives, public opinion, reviews, and feedback related to exoskeletons, for which a dataset is necessary. The Internet of Everything era of today's living, characterized by people spending more time on the Internet than ever before, holds the potential for developing such a dataset by mining relevant web behavior data from social media communications, which have increased exponentially in the last few years. Twitter, one such social media platform, is highly popular amongst all age groups, who communicate on diverse topics including but not limited to news, current events, politics, emerging technologies, family, relationships, and career opportunities, via tweets, while sharing their views, opinions, perspectives, and feedback towards the same. Therefore, this work presents a dataset of about 140,000 Tweets related to exoskeletons that were mined for a period of 5 years, from May 21, 2017, to May 21, 2022. The tweets contain diverse forms of communications and conversations which communicate user interests, user perspectives, public opinion, reviews, feedback, suggestions, etc., related to exoskeletons.
Instructions: This dataset contains about 140,000 Tweets related to exoskeletons that were mined for a period of 5 years, from May 21, 2017, to May 21, 2022.
The tweets contain diverse forms of communications and conversations which communicate user interests, user perspectives, public opinion, reviews, feedback, suggestions, etc., related to exoskeletons. The dataset contains only tweet identifiers (Tweet IDs), in compliance with Twitter's terms and conditions, which permit re-distribution of Twitter data only for research purposes. The Tweet IDs need to be hydrated before they can be used. Hydration is the process of retrieving a tweet's complete information (such as the text of the tweet, username, user ID, date and time, etc.) using its ID. The Hydrator application (download: https://github.com/DocNow/hydrator/releases, step-by-step tutorial: https://towardsdatascience.com/learn-how-to-easily-hydrate-tweets-a0f393ed340e#:~:text=Hydrating%20Tweets) or any similar application may be used for hydrating this dataset.
Data Description
This dataset consists of 7 .txt files. The following shows the number of Tweet IDs and the date range (of the associated tweets) in each of these files.
Filename: Exoskeleton_TweetIDs_Set1.txt (Number of Tweet IDs – 22945, Date Range of Tweets – July 20, 2021 – May 21, 2022)
Filename: Exoskeleton_TweetIDs_Set2.txt (Number of Tweet IDs – 19416, Date Range of Tweets – Dec 1, 2020 – July 19, 2021)
Filename: Exoskeleton_TweetIDs_Set3.txt (Number of Tweet IDs – 16673, Date Range of Tweets – April 29, 2020 – Nov 30, 2020)
Filename: Exoskeleton_TweetIDs_Set4.txt (Number of Tweet IDs – 16208, Date Range of Tweets – Oct 5, 2019 – Apr 28, 2020)
Filename: Exoskeleton_TweetIDs_Set5.txt (Number of Tweet IDs – 17983, Date Range of Tweets – Feb 13, 2019 – Oct 4, 2019)
Filename: Exoskeleton_TweetIDs_Set6.txt (Number of Tweet IDs – 34009, Date Range of Tweets – Nov 9, 2017 – Feb 12, 2019)
Filename: Exoskeleton_TweetIDs_Set7.txt (Number of Tweet IDs – 11351, Date Range of Tweets – May 21, 2017 – Nov 8, 2017)
Here, the last date for May is May 21, as it was the most recent date at the time of data collection. The dataset will be updated soon to incorporate more recent tweets.
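Before hydrating, it can help to confirm that the downloaded files match the counts listed above and to split the IDs into request-sized batches. The Python sketch below does both under two assumptions that the description does not confirm: that each .txt file holds one Tweet ID per line, and that the hydration tool or API accepts up to 100 IDs per request (the `BATCH_SIZE` value is illustrative and should be checked against whichever tool is used). The function names are hypothetical helpers, not part of the Hydrator application. The listed counts sum to 138,585, which is consistent with the "about 140,000" figure.

```python
from pathlib import Path

# Tweet ID counts per file, copied from the data description above.
EXPECTED_COUNTS = {
    "Exoskeleton_TweetIDs_Set1.txt": 22945,
    "Exoskeleton_TweetIDs_Set2.txt": 19416,
    "Exoskeleton_TweetIDs_Set3.txt": 16673,
    "Exoskeleton_TweetIDs_Set4.txt": 16208,
    "Exoskeleton_TweetIDs_Set5.txt": 17983,
    "Exoskeleton_TweetIDs_Set6.txt": 34009,
    "Exoskeleton_TweetIDs_Set7.txt": 11351,
}

# Assumption: IDs per lookup request. Verify against the tool or API in use.
BATCH_SIZE = 100


def read_tweet_ids(path):
    """Read Tweet IDs, assuming one ID per line; blank lines are skipped."""
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def check_counts(directory):
    """Return {filename: (expected, found)} for files that are missing or
    whose ID count differs from the documented count (found is None if missing)."""
    mismatches = {}
    for name, expected in EXPECTED_COUNTS.items():
        path = Path(directory) / name
        if not path.exists():
            mismatches[name] = (expected, None)
            continue
        found = len(read_tweet_ids(path))
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


def batches(ids, size=BATCH_SIZE):
    """Yield consecutive slices of at most `size` IDs."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


total_expected = sum(EXPECTED_COUNTS.values())
print(total_expected)
```

An empty result from `check_counts` on the download directory indicates that all seven files are present with the documented number of IDs. Hydrated tweet counts may still be lower, since deleted or protected tweets cannot be retrieved.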