https://brightdata.com/license
Use our TikTok profiles dataset to extract business and non-business information from complete public profiles and filter by account name, followers, create date, or engagement score. You may purchase the entire dataset or a customized subset depending on your needs. Popular use cases include sentiment analysis, brand monitoring, influencer marketing, and more. The TikTok dataset includes all major data points: timestamp, account name, nickname, bio, average engagement score, creation date, is_verified, likes, followers, external link in bio, and more. Get your TikTok dataset today!
In the fourth quarter of 2024, TikTok generated around 186 million downloads from users worldwide. Initially launched in China by ByteDance as Douyin, the short-video format was popularized by TikTok and took over the global social media environment in 2020. In the first quarter of 2020, TikTok downloads peaked at over 313.5 million worldwide, up by 62.3 percent compared to the first quarter of 2019.
TikTok interactions: is there a magic formula for content success?
In 2024, TikTok registered an engagement rate of approximately 4.64 percent on video content hosted on its platform. During the same year, the social video app recorded over 1,100 interactions on average. These interactions were primarily likes, with fewer than 20 comments per piece of content on average in 2024.
The platform has been actively monitoring the issue of fake interactions, and it removed around 236 million fake likes during the first quarter of 2024. Though there is no secret formula for maximizing these metrics, following the recommended video length may contribute to the success of content on TikTok.
As of the first quarter of 2024, it was recommended that tiny TikTok accounts with up to 500 followers post videos around 2.6 minutes long, while the ideal video duration for huge TikTok accounts with over 50,000 followers was 7.28 minutes. The average length of TikTok videos posted by creators in 2024 was around 43 seconds.
What's trending on TikTok Shop?
Since its launch in September 2023, TikTok Shop has become one of the most popular online shopping platforms, offering consumers a wide variety of products. In 2023, TikTok shops featuring beauty and personal care items sold over 370 million products worldwide.
TikTok shops featuring womenswear and underwear, as well as food and beverages, followed with 285 and 138 million products sold, respectively. Similarly, in the United States market, health and beauty products were the best-selling items, accounting for 85 percent of sales made via the TikTok Shop feature during the first month of its launch. In 2023, Indonesia was the market with the largest number of TikTok Shops, hosting over 20 percent of all TikTok Shops. Thailand and Vietnam followed with 18.29 and 17.54 percent of the total shops listed on the famous short video platform, respectively.
During the fourth quarter of 2024, approximately 20.6 million TikTok accounts were removed from the platform due to suspicion of being operated by users under the age of 13. During the last measured period, around 185 million fake accounts were removed from TikTok.
https://creativecommons.org/publicdomain/zero/1.0/
By [source]
This dataset explores various factors associated with the reception of COVID-19 related content on TikTok. It captures overall levels of user engagement such as likes, comments, and views, and it also covers source credibility, including information from healthcare professionals, news sources, patients, and other outlets. It further covers demographic factors such as gender and age range, as well as content type, such as humor or provision of clinical instruction. It also records whether the posts describe risk factors and symptoms, modes of transmission, and prevention. Finally, a discernment component rates each post for its level of misinformation (low/moderate/high). Combined, these measures provide insight into how users engage with COVID-19 related misinformation on TikTok.
This dataset contains user engagement data and measures of source credibility related to COVID-19 misinformation on TikTok. It can be used to examine the factors associated with content reception, such as views, likes, comments, as well as factors relating to credibility, demographics and content type.
Using this dataset:
- Explore the columns available in the dataset. There are a number of columns that measure user engagement (views, likes and comments) as well as source credibility (official source, healthcare professional etc.), demographic factors (gender, age group etc.), and content type (humor etc.). Get familiar with all these columns so that you know what information is available for analysis.
- Decide what kind of analysis you want to perform. You can use this data for exploratory or explanatory work, depending on your aims or research question. For example, if you want to see how source credibility affects user engagement, you would need statistical techniques such as correlation tests or regression analyses, whereas if you just want to gain an overall understanding of patterns in this data, exploratory techniques such as cross tabulations may be more suitable.
- Developing a predictive model to identify which demographic and source characteristics are correlated with high user engagement for COVID-related posts on TikTok (e.g. views, likes, and comments).
- Investigating the difference in user engagement for posts from healthcare professionals vs non-professional sources to compare how different types of content are received by users on TikTok.
- Analyzing the sentiment of words related to masks and tests in order to gain insights into how content about this topic is perceived by users on TikTok (i.e., positive or negative sentiment)
If you use this dataset in your research, please credit the original authors.
License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication. No Copyright: you can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.
File: tiktok_data_open.csv

| Column name | Description |
|:----------------|:------------------------------------------------------------------------|
| views | Number of views for the video. (Integer) |
| likes | Number of likes for the video. (Integer) |
| comments | Number of comments for the video. (Integer) |
| official_source | Whether the source of the video is an official source. (Boolean) |
| pub_hcp | Whether the source of the video is a healthcare professional. (Boolean) |
| pub_news | Whether the source of the video is a news source. (Boolean) |
| pub_patient | Whether the source of the video is a patient. (Boolean) |
| pub_other | Whether the source of the video is another source. (Boolean) |
| female ...
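One of the comparisons suggested above, engagement for healthcare-professional sources versus all other sources, can be sketched with the standard library alone. This is a minimal illustration, not the original authors' analysis: the sample rows are made up, and it assumes that the Boolean columns are stored as 0/1 or true/false strings.

```python
import csv
import io
from statistics import mean


def mean_engagement_by_flag(rows, flag="pub_hcp", metric="likes"):
    """Return {flag_value: mean metric} for rows grouped by a Boolean-like column."""
    groups = {}
    for row in rows:
        # Assumption: Booleans are stored as 0/1 or true/false strings.
        is_set = str(row[flag]).strip().lower() in {"1", "true", "yes"}
        groups.setdefault(is_set, []).append(float(row[metric]))
    return {key: mean(values) for key, values in groups.items()}


# Tiny made-up sample standing in for tiktok_data_open.csv.
SAMPLE = """views,likes,comments,pub_hcp
1000,100,5,1
3000,300,15,1
2000,50,2,0
4000,150,8,0
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
result = mean_engagement_by_flag(rows)
print(result)
```

To run this on the real file, replace the `io.StringIO(SAMPLE)` argument with a file handle opened via `open("tiktok_data_open.csv", newline="")`. A difference in means is only descriptive; a correlation test or regression would be needed before drawing conclusions.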
https://brightdata.com/license
Use our TikTok Shop dataset to extract detailed e-commerce insights, including product names, prices, discounts, seller details, product descriptions, categories, customer ratings, and reviews. You may purchase the entire dataset or a customized subset tailored to your needs. Popular use cases include trend analysis, pricing optimization, customer behavior studies, and marketing strategy refinement. The TikTok Shop dataset includes key data points: product performance metrics, user engagement, customer reviews, and more. Unlock the potential of TikTok's shopping platform today with our comprehensive dataset!
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is a dataset of videos and comments related to the invasion of Ukraine, published on TikTok by a number of users over the year of 2022. It was compiled by Benjamin Steel, Sara Parker and Derek Ruths at the Network Dynamics Lab, McGill University. We created this dataset to facilitate the study of TikTok, and the nature of social interaction on the platform relevant to a major political event.
The dataset has been released here on Zenodo: https://doi.org/10.5281/zenodo.7926959 as well as on Github: https://github.com/networkdynamics/data-and-code/tree/master/ukraine_tiktok
To create the dataset, we identified hashtags and keywords explicitly related to the conflict to collect a core set of videos (or "TikToks"). We then compiled comments associated with these videos. All of the data captured is publicly available information, and contains personally identifiable information. In total we collected approximately 16 thousand videos and 12 million comments, from approximately 6 million users. There are approximately 1.9 comments on average per user captured, and 1.5 videos per user who posted a video. The author personally collected this data using the web scraping PyTok library, developed by the author: https://github.com/networkdynamics/pytok.
Due to scraping duration, this is just a sample of the publicly available discourse concerning the invasion of Ukraine on TikTok. Due to the fuzzy search functionality of TikTok, the dataset contains videos with a range of relatedness to the invasion.
We release here the unique video IDs of the dataset in a CSV format. The data was collected without the specific consent of the content creators, so we have released only the data required to re-create it, to allow users to delete content from TikTok and be removed from the dataset if they wish. Contained in this repository are scripts that will automatically pull the full dataset, which will take the form of JSON files organised into a folder for each video. The JSON files are the entirety of the data returned by the TikTok API. We include a script to parse the JSON files into CSV files with the most commonly used data. We plan to further expand this dataset as collection processes progress and the war continues. We will version the dataset to ensure reproducibility.
To build this dataset from the IDs here:
- Run `pip install -e .` in the pytok directory
- Run `pip install pandas tqdm` to install these libraries if not already installed
- Run `get_videos.py` to get the video data
- Run `video_comments.py` to get the comment data
- Run `user_tiktoks.py` to get the video history of the users
- Run `hashtag_tiktoks.py` or `search_tiktoks.py` to get more videos from other hashtags and search terms
- Run `load_json_to_csv.py` to compile the JSON files into two CSV files, `comments.csv` and `videos.csv`
If you get an error about the wrong chrome version, use the command line argument get_videos.py --chrome-version YOUR_CHROME_VERSION
Please note pulling data from TikTok takes a while! We recommend leaving the scripts running on a server for a while for them to finish downloading everything. Feel free to play around with the delay constants to either speed up the process or avoid TikTok rate limiting.
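Since the download can run for a long time on a server, the steps above can be chained in a small driver script. The sketch below is an illustration only and is not part of the repository. It assumes that each script can be started from the repository directory without further required arguments, and it skips the optional `hashtag_tiktoks.py` and `search_tiktoks.py` steps. By default it performs a dry run that only prints the commands.

```python
import subprocess
import sys

# Order taken from the build steps; the optional hashtag/search scripts are omitted.
PIPELINE = [
    "get_videos.py",
    "video_comments.py",
    "user_tiktoks.py",
    "load_json_to_csv.py",
]


def build_commands(chrome_version=None):
    """Return one argv list per script, in pipeline order."""
    commands = []
    for script in PIPELINE:
        argv = [sys.executable, script]
        # --chrome-version is only documented for get_videos.py.
        if chrome_version is not None and script == "get_videos.py":
            argv += ["--chrome-version", str(chrome_version)]
        commands.append(argv)
    return commands


def run_pipeline(chrome_version=None, dry_run=True):
    """Print each command and, unless dry_run is set, execute it."""
    for argv in build_commands(chrome_version):
        print(" ".join(argv))
        if not dry_run:
            # check=True stops the pipeline at the first failing script.
            subprocess.run(argv, check=True)


run_pipeline(chrome_version=120)  # dry run: prints the commands only
```

Calling `run_pipeline(dry_run=False)` would actually start the scripts; the Chrome version `120` above is a placeholder value.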
Please do not hesitate to make an issue in this repo to get our help with this!
The videos.csv will contain the following columns:
- video_id: Unique video ID
- createtime: UTC datetime of video creation time in YYYY-MM-DD HH:MM:SS format
- author_name: Unique author name
- author_id: Unique author ID
- desc: The full video description from the author
- hashtags: A list of hashtags used in the video description
- share_video_id: If the video is sharing another video, this is the video ID of that original video, else empty
- share_video_user_id: If the video is sharing another video, this is the user ID of the author of that video, else empty
- share_video_user_name: If the video is sharing another video, this is the user name of the author of that video, else empty
- share_type: If the video is sharing another video, this is the type of the share, stitch, duet etc.
- mentions: A list of users mentioned in the video description, if any
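As a quick check on a compiled videos.csv, the createtime and share_type columns described above can be summarized with the standard library. This is a minimal sketch on made-up rows. The literal share_type values ("duet", "stitch") are assumed from the column description, and the list-valued columns (hashtags, mentions) are left out because their serialization in the CSV is not specified here.

```python
import csv
import io
from collections import Counter
from datetime import datetime, timezone

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # matches the documented createtime format


def videos_per_month(rows):
    """Count videos per calendar month (UTC) from the createtime column."""
    counts = Counter()
    for row in rows:
        created = datetime.strptime(row["createtime"], TIME_FORMAT)
        created = created.replace(tzinfo=timezone.utc)  # documented as UTC
        counts[created.strftime("%Y-%m")] += 1
    return counts


def share_type_counts(rows):
    """Count share types, skipping videos that do not share another video."""
    return Counter(row["share_type"] for row in rows if row["share_type"])


# Made-up rows using a subset of the videos.csv columns.
SAMPLE = """video_id,createtime,author_name,share_type
1,2022-02-24 08:00:00,alice,
2,2022-02-25 09:30:00,bob,duet
3,2022-03-01 12:00:00,carol,stitch
4,2022-03-02 12:00:00,alice,duet
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
print(videos_per_month(rows))
print(share_type_counts(rows))
```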
The comments.csv will contain the following columns:
- comment_id: Unique comment ID
- createtime: UTC datetime of comment creation time in YYYY-MM-DD HH:MM:SS format
- author_name: Unique author name
- author_id: Unique author ID
- text: Text of the comment
- mentions: A list of users that are tagged in the comment
- video_id: The ID of the video the comment is on
- comment_language: The language of the comment, as predicted by the TikTok API
- reply_comment_id: If the comment is replying to another comment, this is the ID of that comment
The data can be compiled into a user interaction network to facilitate study of interaction dynamics. There is code to help with that here: https://github.com/networkdynamics/polar-seeds. Additional scripts for further preprocessing of this data can be found there too.
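To make the idea of a user interaction network concrete, the sketch below derives a weighted, directed edge list from the documented columns. It is not the polar-seeds implementation. The edge definition is an assumption made for illustration: a reply points at the author of the parent comment, any other comment points at the author of the video, and self-interactions are dropped.

```python
from collections import Counter


def interaction_edges(videos, comments):
    """Count directed (source_author_id, target_author_id) interactions."""
    video_author = {v["video_id"]: v["author_id"] for v in videos}
    comment_author = {c["comment_id"]: c["author_id"] for c in comments}
    edges = Counter()
    for c in comments:
        parent = c.get("reply_comment_id")
        if parent and parent in comment_author:
            target = comment_author[parent]  # reply to another comment
        else:
            target = video_author.get(c["video_id"])  # comment on the video
        # Skip comments whose target is unknown and self-interactions.
        if target is not None and target != c["author_id"]:
            edges[(c["author_id"], target)] += 1
    return edges


# Made-up rows using a subset of the videos.csv and comments.csv columns.
videos = [{"video_id": "v1", "author_id": "u1"}]
comments = [
    {"comment_id": "c1", "author_id": "u2", "video_id": "v1", "reply_comment_id": ""},
    {"comment_id": "c2", "author_id": "u3", "video_id": "v1", "reply_comment_id": "c1"},
    {"comment_id": "c3", "author_id": "u2", "video_id": "v1", "reply_comment_id": ""},
    {"comment_id": "c4", "author_id": "u1", "video_id": "v1", "reply_comment_id": ""},
]

edges = interaction_edges(videos, comments)
print(edges)
```

The resulting counter maps each (source, target) pair to an edge weight and can be passed to a graph library of your choice.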
In 2023, the number of TikTok users in Malaysia was estimated to reach around ** million. The number was forecast to continuously increase between 2024 and 2029. Based on the forecast, the number of TikTok users in Malaysia will reach **** million by 2029. User figures, shown here with regards to the platform TikTok, have been estimated by considering company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period and count multiple accounts by persons only once. The shown data are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press and they are processed to generate comparable data sets (see supplementary notes under details for more information).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset was initially used in the paper "The use and impact of TikTok in the 2022 Brazilian presidential election". It contains data from the official TikTok accounts of the two main candidates running in the 2022 Brazilian presidential election, Lula (@lulaoficial) and Bolsonaro (@bolsonaromessiasjair). It comprises 576 posts by the candidates and more than 540 million interactions with these posts. The data encompass three periods of 2022: (i) Pre-campaign (Jun 30 to Aug 15); (ii) 1st round campaign (Aug 16 to Oct 1); and (iii) 2nd round campaign (Oct 2 to Oct 29). It contains two files: (i) Accounts: the number of followers each candidate has, on a day-to-day basis, starting on Sept 5; and (ii) Posts and interactions: individual data and metrics for each post, including the date of the post, text, link to the post, and number of plays, likes, comments and shares.
With barely 10 seconds and 61 million likes, Bella Poarch's lip sync to "M to the B" by Millie B was the most engaging video on TikTok as of March 2023. Bella Poarch, who as of the beginning of 2023 was the third-most followed creator on the popular social video platform, rose to popularity as a singer and content creator after opening a TikTok account in January 2020. The second-ranked video, "dancing in front of the bathroom mirror" by user @jamie32bsh, generated almost 52 million likes between its upload in January 2022 and March 2023.
Unlock insights into high-performing content with this curated dataset of TikTok posts, each with over 50,000 plays. This collection surfaces the videos that resonate most with audiences, spanning creators, themes, and formats that drive virality.
- Performance Threshold: Only includes posts that have exceeded 50K views, ensuring a focus on high-engagement, trend-relevant content.
- Detailed Post Data: Captures video captions, play counts, likes, shares, comments, sound IDs, hashtags, and posting timestamps.
- Creator Metadata: Includes usernames, follower counts, bio snippets, and profile metrics to support creator analysis.
- Engagement Benchmarking: Useful for identifying viral content, measuring campaign performance, and refining creative strategies.
- Trend Analysis Ready: Track how themes, hashtags, or sounds perform at scale within and across verticals.
- Structured for Scale: Delivered in clean CSV format, API, or custom format, ready for integration into analytics tools, dashboards, or model training environments.
This dataset is designed for marketers, agencies, analysts, and researchers looking to decode the mechanics of virality, identify top-performing content, and inform influencer strategy on TikTok. Whether you're building recommendation engines or planning your next campaign, this dataset offers a high-signal view into TikTok's most impactful content.
The global number of Facebook users was forecast to continuously increase between 2023 and 2027 by a total of 391 million users (+14.36 percent). After the fourth consecutive increasing year, the Facebook user base is estimated to reach 3.1 billion users and therefore a new peak in 2027. Notably, the number of Facebook users was continuously increasing over the past years. User figures, shown here regarding the platform Facebook, have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period and count multiple accounts by persons only once. The shown data are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press and they are processed to generate comparable data sets (see supplementary notes under details for more information).
How many people use social media?
Social media usage is one of the most popular online activities. In 2024, over five billion people were using social media worldwide, a number projected to increase to over six billion in 2028.
Who uses social media?
Social networking is one of the most popular digital activities worldwide and it is no surprise that social networking penetration across all regions is constantly increasing. As of January 2023, the global social media usage rate stood at 59 percent. This figure is anticipated to grow as lesser developed digital markets catch up with other regions when it comes to infrastructure development and the availability of cheap mobile devices. In fact, most of social media's global growth is driven by the increasing usage of mobile devices. Mobile-first market Eastern Asia topped the global ranking of mobile social networking penetration, followed by established digital powerhouses such as the Americas and Northern Europe.
How much time do people spend on social media?
Social media is an integral part of daily internet usage. On average, internet users spend 151 minutes per day on social media and messaging apps, an increase of 40 minutes since 2015. On average, internet users in Latin America had the highest average time spent per day on social media.
What are the most popular social media platforms?
Market leader Facebook was the first social network to surpass one billion registered accounts and currently boasts approximately 2.9 billion monthly active users, making it the most popular social network worldwide. In June 2023, the top social media apps in the Apple App Store included mobile messaging apps WhatsApp and Telegram Messenger, as well as the ever-popular app version of Facebook.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Please refer to the "README.rtf" and "Data Details for Replication.pdf" files.
Paper Abstract
Social media platforms have become vital channels for businesses to reach consumers through advertising. But in the U.S., the digital advertising market in which these platforms operate is dominated by a few major players, raising concerns for antitrust regulators. In such a concentrated market, the entry or exit of a single platform can reallocate billions in ad spending, affecting businesses and users. TikTok's temporary suspension in the U.S. in January 2025 provides a unique natural experiment to examine how the removal of a major player would shift advertising demand and supply on competitors, specifically Facebook and Instagram, revealing the degree of substitutability across platforms and the intensity of competition. Using a difference-in-differences approach comparing advertising activity in the U.S. to other countries, we find that Meta ad volume and spend rose by 6.3% and 22.4%, as a result of the outage, without a corresponding increase in ad impressions. Consequently, Meta ad prices, as measured by cost per thousand impressions, jumped by 12.1%. Shifts in demand were three times greater for larger advertisers relative to smaller ones, suggesting that Meta platforms and TikTok are closer substitutes for larger firms and that a TikTok ban would therefore impose greater challenges on smaller businesses.
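For readers unfamiliar with the method named in the abstract, the sketch below shows the canonical two-group, two-period difference-in-differences comparison together with the cost-per-thousand-impressions (CPM) formula. It is a textbook illustration with made-up numbers. It is not the paper's specification, which may use a regression with additional controls, and it does not attempt to reproduce the reported estimates.

```python
def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """Change in the treated group minus change in the control group."""
    return (treated_post - treated_pre) - (control_post - control_pre)


def cpm(spend, impressions):
    """Cost per thousand impressions."""
    return spend * 1000 / impressions


# Made-up average ad spend before and after the outage window.
us_pre, us_post = 100.0, 125.0        # treated group: U.S. advertisers
other_pre, other_post = 80.0, 85.0    # control group: other countries

effect = diff_in_diff(us_pre, us_post, other_pre, other_post)
print(effect)

# Made-up spend and impression counts for a single campaign.
print(cpm(50.0, 10_000))
```

The control-group change stands in for what would have happened in the treated group without the event, so the estimate is only credible if both groups would otherwise have followed parallel trends.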
As of January 2024, Instagram was slightly more popular with men than women, with men accounting for 50.6 percent of the platform's global users. Additionally, the social media app was most popular amongst younger audiences, with almost 32 percent of users aged between 18 and 24 years.
Instagram's Global Audience
As of January 2024, Instagram was the fourth most popular social media platform globally, reaching two billion monthly active users (MAU). This number is projected to keep growing with no signs of slowing down, which is not a surprise as the global online social penetration rate across all regions is constantly increasing.
As of January 2024, the country with the largest Instagram audience was India with 362.9 million users, followed by the United States with 169.7 million users.
Who is winning over the generations?
Even though Instagram's audience is almost twice the size of TikTok's on a global scale, TikTok has shown itself to be a fierce competitor, particularly amongst younger audiences. TikTok was the most downloaded mobile app globally in 2022, generating 672 million downloads. As of 2022, Generation Z in the United States spent more time on TikTok than on Instagram monthly.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
The dataset is organized into three distinct folders, each corresponding to a chapter of the research project that explores gendered and algorithmic representations on three popular Chinese social media platforms: Rednote (Chapter 3), Kuaishou (Chapter 4), and Douyin (Chapter 5). This research investigates how platform affordances shape the visibility of users and influence the performance of gendered subjectivities and socio-economic influence. Each folder contains different types of data collected for the respective platform:
- Rednote (Chapter 3): This folder includes a collection of screenshots from 26 users' posts, comments left by viewers, 12 interview fieldnotes from participants who actively engage with the platform, and fieldnotes from 7 key informants. The creators were selected using digital snowball sampling, and their feedback provides insights into the algorithmic visibility of femininities and the governance of content on Rednote.
- Kuaishou (Chapter 4): The Kuaishou folder consists of screenshots of 40 user posts, viewers' comments, and interview fieldnotes from a group of selected Kuaishou influencers, users, and management officials. Kuaishou is a platform known for its emphasis on short video content, and the dataset here reflects the intersection of local cultures, rural narratives, and algorithmic shaping of visibility and influence. Interviews with users provide nuanced perspectives on how they navigate the platform's dynamics and align their content with the platform's affordances.
- Douyin (Chapter 5): This section contains a set of data for Douyin, also known as the Chinese counterpart of TikTok. The data includes 22 creators' screenshots and user comments on Douyin. The study focuses on how Douyin shapes gender performances and the ways users leverage its features to build online communities and visibility.
The goal of the study is to explore how social media users think in moral and ethical terms about their online participation when they talk about TikTok. Relatively little research has focused on moral and ethical reasoning in the use of social media, and no study to date has given users the opportunity to voice their own experience with moral issues as they perceive them through their use of TikTok. A thematic analysis of 40 in-depth interviews is applied to explore how young users define the "good" and what significance they attribute to moral principles.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Research Hypothesis This study hypothesizes that Indonesian consumers' preferences for sunscreen, particularly the local brand Azarine, are shaped by product attributes (active ingredients, SPF, packaging), user experience, and social proof. It also assumes that communication patterns differ across platforms: TikTok (visual, testimonial-driven) versus X/Twitter (textual, information-driven).
Description of the Data The dataset consists of 400 consumer comments (200 from TikTok and 200 from X/Twitter) collected manually between January 2024 and June 2025 using relevant hashtags (#SunscreenLokal, #AzarineSunscreen, #SunscreenSpray, #Azarine). The Raw Data sheet contains the original comments, posting dates, and links. The Preprocessing & Coding Process sheet documents open, axial, and selective coding. The Frequency Table summarizes the distribution of main themes, while the Comparative Analysis sheet highlights platform differences. The Keyword Attribute sheet captures specific categories such as skin type, skin issues, and product variants.
Notable Findings Three main themes emerged: (1) Product Attributes (composition, SPF, packaging), (2) Product Performance (visible results on skin), and (3) Social Proof (peer recommendations). TikTok discussions were dominated by performance and social validation (87% and 85%), while X users focused more on technical details (61% attributes). Keywords such as variant, skin type, and skin issue indicate that consumers increasingly seek personalized solutions rather than one-size-fits-all products.
How to Interpret and Use the Data This dataset reflects how consumers evaluate and discuss sunscreen across different platforms. Wordclouds and frequency tables provide an overview of frequently used terms, but deeper insights come from grounded theory analysis that organizes comments into thematic categories. Researchers and practitioners can use the dataset to understand consumer discourse patterns, design platform-specific marketing strategies, and develop skincare products that are more personalized, effective, and aligned with Indonesian consumers' needs.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Goodwill-and-Other-Intagible-Assets Time Series for Brave Bison Group PLC. Brave Bison Group plc provides digital advertising and technology services in the United Kingdom, Europe, and internationally. The company offers social media advertising, influencer marketing, search engine optimization, e-commerce software integration, paid media services, and AI tools. It also owns and operates a network of social-media channels on platforms, such as YouTube, Meta, and TikTok; and publishes and monetizes video content through the social and digital media platforms, as well as advertises on behalf of customers. In addition, the company offers digital media services in various platforms comprising Google, Meta, and TikTok, as well as provides digital PR services; designs and builds ecommerce websites; and manages the customer experience in a digital environment. Further, it provides organic and paid performance, technology experience, social and influencer, sports marketing, digital media network, and growth consultancy services. The company was formerly known as Rightster Group Plc and changed its name to Brave Bison Group Plc in May 2016. Brave Bison Group plc was incorporated in 2013 and is based in London, the United Kingdom.
As of April 2024, around 16.5 percent of global active Instagram users were men between the ages of 18 and 24 years. More than half of the global Instagram population was aged 34 years or younger.
Teens and social media
As one of the biggest social networks worldwide, Instagram is especially popular with teenagers. As of fall 2020, the photo-sharing app ranked third in terms of preferred social network among teenagers in the United States, second to Snapchat and TikTok. Instagram was one of the most influential advertising channels among female Gen Z users when making purchasing decisions. Teens report feeling more confident, popular, and better about themselves when using social media, and less lonely, depressed and anxious.
Social media can have negative effects on teens, and these effects are much more pronounced for those with low emotional well-being. It was found that 35 percent of teenagers with low social-emotional well-being reported having experienced cyberbullying when using social media, while in comparison only five percent of teenagers with high social-emotional well-being stated the same. As such, social media can have a big impact on already fragile states of mind.
The number of Reddit users in the United States was forecast to continuously increase between 2024 and 2028 by a total of 10.3 million users (+5.21 percent). After the ninth consecutive increasing year, the Reddit user base is estimated to reach 208.12 million users and therefore a new peak in 2028. Notably, the number of Reddit users was continuously increasing over the past years. User figures, shown here with regards to the platform Reddit, have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period and count multiple accounts by persons only once. Reddit users encompass both users that are logged in and those that are not. The shown data are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press and they are processed to generate comparable data sets (see supplementary notes under details for more information). Find more key insights for the number of Reddit users in countries like Mexico and Canada.