During the fourth quarter of 2024, approximately 20.6 million TikTok accounts were removed from the platform on suspicion of being operated by users under the age of 13. During the last measured period, around 185 million fake accounts were also removed from TikTok.
https://brightdata.com/license
Use our TikTok profiles dataset to extract business and non-business information from complete public profiles and filter by account name, followers, create date, or engagement score. You may purchase the entire dataset or a customized subset depending on your needs. Popular use cases include sentiment analysis, brand monitoring, influencer marketing, and more. The TikTok dataset includes all major data points: timestamp, account name, nickname, bio, average engagement score, creation date, is_verified, likes, followers, external link in bio, and more. Get your TikTok dataset today!
https://choosealicense.com/licenses/other/
TikTok-10M Dataset
Dataset Description
TikTok-10M is a large-scale dataset containing 10 million short-form posts from TikTok, designed for video understanding, multimodal learning, and social media content analysis. The dataset was curated to bridge the gap between academic video datasets and actual user-generated content, providing researchers with authentic patterns and characteristics of modern short-form video content that dominates social media platforms.… See the full description on the dataset page: https://huggingface.co/datasets/The-data-company/TikTok-10M.
Background: COVID-related misinformation is prevalent online, including on social media. The purpose of this study was to explore factors associated with user engagement with COVID-related misinformation on the social media platform, TikTok. Methods: A sample of TikTok videos associated with the hashtag #coronavirus were downloaded on September 20, 2020. Misinformation was evaluated on a scale (low, medium, high) using a codebook developed by experts in infectious diseases. Multivariable modeling was used to evaluate factors associated with number of views and presence of user comments indicating intention to change behavior. Results: 166 TikTok videos were identified. Moderate misinformation was present in 36 (22%) videos, and high-level misinformation was present in 11 (7%). After controlling for characteristics and content, videos containing moderate misinformation were less likely to generate a user response indicating intended behavior change. By contrast, videos containing high-le...
In the fourth quarter of 2024, TikTok generated around 186 million downloads from users worldwide. First launched in China by ByteDance as Douyin, the short-video format was popularized by TikTok and took over the global social media environment in 2020. In the first quarter of 2020, TikTok downloads peaked at over 313.5 million worldwide, up by 62.3 percent compared to the first quarter of 2019.
TikTok interactions: is there a magic formula for content success?
In 2024, TikTok registered an engagement rate of approximately 4.64 percent on video content hosted on its platform. During the same year, the social video app recorded over 1,100 interactions on average. These interactions were primarily likes; comments averaged fewer than 20 per piece of content in 2024.
The platform has been actively monitoring the issue of fake interactions, removing around 236 million fake likes during the first quarter of 2024. Although there is no secret formula for maximizing these metrics, following the recommended video length may contribute to the success of content on TikTok.
As of the first quarter of 2024, it was recommended that tiny TikTok accounts with up to 500 followers post videos that are around 2.6 minutes long, while the ideal video duration for huge TikTok accounts with over 50,000 followers was 7.28 minutes. The average length of TikTok videos posted by creators in 2024 was around 43 seconds.
What’s trending on TikTok Shop?
Since its launch in September 2023, TikTok Shop has become one of the most popular online shopping platforms, offering consumers a wide variety of products. In 2023, TikTok shops featuring beauty and personal care items sold over 370 million products worldwide.
TikTok shops featuring womenswear and underwear, as well as food and beverages, followed with 285 million and 138 million products sold, respectively. Similarly, in the United States market, health and beauty products were the best-selling items, accounting for 85 percent of sales made via the TikTok Shop feature during the first month of its launch. In 2023, Indonesia was the market with the largest number of TikTok Shops, hosting over 20 percent of all TikTok Shops. Thailand and Vietnam followed with 18.29 and 17.54 percent of the total shops listed on the short-video platform, respectively.
https://brightdata.com/license
Use our TikTok Shop dataset to extract detailed e-commerce insights, including product names, prices, discounts, seller details, product descriptions, categories, customer ratings, and reviews. You may purchase the entire dataset or a customized subset tailored to your needs. Popular use cases include trend analysis, pricing optimization, customer behavior studies, and marketing strategy refinement. The TikTok Shop dataset includes key data points: product performance metrics, user engagement, customer reviews, and more. Unlock the potential of TikTok's shopping platform today with our comprehensive dataset!
https://creativecommons.org/publicdomain/zero/1.0/
By [source]
This dataset explores various factors associated with the reception of COVID-19 related content on TikTok. It captures overall levels of user engagement such as likes, comments, and views, and also covers source credibility, including information from healthcare professionals, news sources, patients, and other outlets. It further covers demographic factors such as gender and age range, as well as content type, like humor or provision of clinical instruction. It also records whether posts describe risk factors and symptoms, the modes of transmission they mention, and the prevention measures they discuss. Finally, each post is rated for its level of misinformation (low/moderate/high). Combined, these measures provide insights into how users engage with COVID-19 related misinformation on TikTok.
This dataset contains user engagement data and measures of source credibility related to COVID-19 misinformation on TikTok. It can be used to examine the factors associated with content reception, such as views, likes, comments, as well as factors relating to credibility, demographics and content type.
Using this dataset:
- Explore the columns available in the dataset. There are a number of columns that measure user engagement (views, likes, and comments) as well as source credibility (official source, healthcare professional, etc.), demographic factors (gender, age group, etc.), and content type (humor, etc.). Get familiar with all these columns so that you know what information is available for analysis.
- Decide what kind of analysis you want to perform. You can use this data for exploratory or explanatory work, depending on your aims or research question. For example, if you want to see how source credibility affects user engagement, you would need statistical techniques such as correlation tests or regression analyses, whereas if you just want to gain an overall understanding of patterns in this data, exploratory techniques such as cross tabulations may be more suitable.
- Developing a predictive model to identify which demographic and source characteristics are correlated with high user engagement for COVID-related posts on TikTok (e.g. views, likes, and comments).
- Investigating the difference in user engagement for posts from healthcare professionals vs non-professional sources to compare how different types of content are received by users on TikTok.
- Analyzing the sentiment of words related to masks and tests in order to gain insights into how content about this topic is perceived by users on TikTok (i.e., positive or negative sentiment)
If you use this dataset in your research, please credit the original authors.
License: CC0 1.0 Universal (CC0 1.0), Public Domain Dedication. No copyright: you can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission.
File: tiktok_data_open.csv

| Column name | Description |
|:----------------|:------------------------------------------------------------------------|
| views | Number of views for the video. (Integer) |
| likes | Number of likes for the video. (Integer) |
| comments | Number of comments for the video. (Integer) |
| official_source | Whether the source of the video is an official source. (Boolean) |
| pub_hcp | Whether the source of the video is a healthcare professional. (Boolean) |
| pub_news | Whether the source of the video is a news source. (Boolean) |
| pub_patient | Whether the source of the video is a patient. (Boolean) |
| pub_other | Whether the source of the video is another source. (Boolean) |
| female ...
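As a hedged illustration of the kind of comparison suggested above (engagement for healthcare professionals versus other sources), the sketch below computes median views split on the `pub_hcp` column. It assumes `views` is stored as an integer and that Boolean columns are encoded as strings such as `True` or `1`; that encoding, the helper name `median_views_by_flag`, and the sample rows are assumptions made only so the example runs on its own.

```python
import csv
import io
from statistics import median

# Hypothetical sample rows using column names from the table above.
# The values are invented for illustration; they are not from the dataset.
SAMPLE_CSV = """views,likes,comments,pub_hcp
1000,120,10,True
5000,700,45,False
300,20,2,True
8000,900,60,False
1200,150,12,False
"""

TRUTHY = {"true", "1", "yes"}  # assumed encodings of a Boolean column


def median_views_by_flag(rows, flag="pub_hcp"):
    """Return the median view count for rows where `flag` is set and not set."""
    groups = {True: [], False: []}
    for row in rows:
        is_set = row[flag].strip().lower() in TRUTHY
        groups[is_set].append(int(row["views"]))
    # Drop empty groups so median() is never called on an empty list.
    return {key: median(vals) for key, vals in groups.items() if vals}


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
result = median_views_by_flag(rows)
print(result)
```

To try this on the real file, the `io.StringIO(SAMPLE_CSV)` argument could be swapped for `open("tiktok_data_open.csv", newline="")`, after checking how the Boolean columns are actually encoded.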
https://cubig.ai/store/terms-of-service
1) Data Introduction
• The Dataset from TikTok contains 19,382 reports of videos or comments that users flagged as including a "claim," along with video length, transcription text, account status, and participation (engagement) indicators. It is suitable for analyzing reporting reasons and viewer reactions by content.
2) Data Utilization
(1) Dataset from TikTok has characteristics that:
• This dataset consists of 12 columns, providing both the reported content type and the video's participation indicators.
(2) Dataset from TikTok can be used to:
• Claim Judgment Classification Model Development: Given the video transcription text, participation indicators such as views, likes, shares, and comments, and account verification and sanction information, a machine learning classification model can automatically determine whether the content contains "claims."
• Optimizing moderation tasks: Automate report prioritization based on the classification model's predictions to speed up report processing and reduce the supervision burden by selecting the content that managers most urgently need to review.
https://www.gesis.org/en/institute/data-usage-terms
TikTok is developing into a key platform for news, advertising, politics, online shopping, and entertainment in Germany, with over 20 million monthly users. Especially among young people, TikTok plays an increasing role in their information environment. We provide a human-coded dataset of over 4,000 TikTok videos from German-speaking news outlets from 2023. The coding includes descriptive variables of the videos (e.g., visual style, text overlays, and audio presence) and theory-derived concepts from the journalism sciences (e.g., news values).
This dataset consists of every second video published in 2023 by major news outlets active on TikTok from Germany, Austria, and Switzerland. The data collection was facilitated with the official TikTok API in January 2024. The manual coding took place between September 2024 and December 2024. For a detailed description of the data collection, validation, annotation and descriptive analysis, please refer to [Forthcoming dataset paper publication].
At barely 10 seconds long and with 61 million likes, Bella Poarch's lip sync to "M to the B" by Millie B was the most engaging video on TikTok as of March 2023. Bella Poarch, who as of the beginning of 2023 was the third-most followed creator on the popular social video platform, has risen to popularity as a singer and content creator since opening a TikTok account in January 2020. The second-ranked video, "dancing in front of the bathroom mirror" by user @jamie32bsh, generated almost 52 million likes between its upload in January 2022 and March 2023.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
TikTok network graph with 5,638 nodes and 318,986 unique links, representing up to 790,599 weighted links between labels, for use with Gephi network analysis software.
Source of:
Peña-Fernández, Simón, Larrondo-Ureta, Ainara, & Morales-i-Gras, Jordi. (2022). Current affairs on TikTok. Virality and entertainment for digital natives. Profesional De La Información, 31(1), 1–12. https://doi.org/10.5281/zenodo.5962655
Abstract:
Since its appearance in 2018, TikTok has become one of the most popular social media platforms among digital natives because of its algorithm-based engagement strategies, a policy of public accounts, and a simple, colorful, and intuitive content interface. As happened in the past with other platforms such as Facebook, Twitter, and Instagram, various media are currently seeking ways to adapt to TikTok and its particular characteristics to attract a younger audience less accustomed to the consumption of journalistic material. Against this background, the aim of this study is to identify the presence of the media and journalists on TikTok, measure the virality and engagement of the content they generate, describe the communities created around them, and identify the presence of journalistic use of these accounts. For this, 23,174 videos from 143 accounts belonging to media from 25 countries were analyzed. The results indicate that, in general, the presence and impact of the media in this social network are low and that most of their content is oriented towards the creation of user communities based on viral content and entertainment. However, albeit with a lesser presence, one can also identify accounts and messages that adapt their content to the specific characteristics of TikTok. Their virality and engagement figures illustrate that there is indeed a niche for current affairs on this social network.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is a dataset of videos and comments related to the invasion of Ukraine, published on TikTok by a number of users over the year of 2022. It was compiled by Benjamin Steel, Sara Parker and Derek Ruths at the Network Dynamics Lab, McGill University. We created this dataset to facilitate the study of TikTok, and the nature of social interaction on the platform relevant to a major political event.
The dataset has been released here on Zenodo: https://doi.org/10.5281/zenodo.7926959 as well as on GitHub: https://github.com/networkdynamics/data-and-code/tree/master/ukraine_tiktok
To create the dataset, we identified hashtags and keywords explicitly related to the conflict to collect a core set of videos (or "TikToks"). We then compiled comments associated with these videos. All of the data captured is publicly available information, and contains personally identifiable information. In total we collected approximately 16 thousand videos and 12 million comments, from approximately 6 million users. There are approximately 1.9 comments on average per user captured, and 1.5 videos per user who posted a video. The author personally collected this data using the web scraping PyTok library, developed by the author: https://github.com/networkdynamics/pytok.
Due to scraping duration, this is just a sample of the publicly available discourse concerning the invasion of Ukraine on TikTok. Due to the fuzzy search functionality of TikTok, the dataset contains videos with a range of relatedness to the invasion.
We release here the unique video IDs of the dataset in a CSV format. The data was collected without the specific consent of the content creators, so we have released only the data required to re-create it, to allow users to delete content from TikTok and be removed from the dataset if they wish. Contained in this repository are scripts that will automatically pull the full dataset, which will take the form of JSON files organised into a folder for each video. The JSON files are the entirety of the data returned by the TikTok API. We include a script to parse the JSON files into CSV files with the most commonly used data. We plan to further expand this dataset as collection processes progress and the war continues. We will version the dataset to ensure reproducibility.
To build this dataset from the IDs here:
- Run pip install -e . in the pytok directory
- Run pip install pandas tqdm to install these libraries if not already installed
- Run get_videos.py to get the video data
- Run video_comments.py to get the comment data
- Run user_tiktoks.py to get the video history of the users
- Run hashtag_tiktoks.py or search_tiktoks.py to get more videos from other hashtags and search terms
- Run load_json_to_csv.py to compile the JSON files into two CSV files, comments.csv and videos.csv
If you get an error about the wrong chrome version, use the command line argument get_videos.py --chrome-version YOUR_CHROME_VERSION
Please note pulling data from TikTok takes a while! We recommend leaving the scripts running on a server for a while for them to finish downloading everything. Feel free to play around with the delay constants to either speed up the process or avoid TikTok rate limiting.
Please do not hesitate to make an issue in this repo to get our help with this!
The videos.csv will contain the following columns:
video_id: Unique video ID
createtime: UTC datetime of video creation time in YYYY-MM-DD HH:MM:SS format
author_name: Unique author name
author_id: Unique author ID
desc: The full video description from the author
hashtags: A list of hashtags used in the video description
share_video_id: If the video is sharing another video, this is the video ID of that original video, else empty
share_video_user_id: If the video is sharing another video, this is the user ID of the author of that video, else empty
share_video_user_name: If the video is sharing another video, this is the user name of the author of that video, else empty
share_type: If the video is sharing another video, this is the type of the share (stitch, duet, etc.)
mentions: A list of users mentioned in the video description, if any
The comments.csv will contain the following columns:
comment_id: Unique comment ID
createtime: UTC datetime of comment creation time in YYYY-MM-DD HH:MM:SS format
author_name: Unique author name
author_id: Unique author ID
text: Text of the comment
mentions: A list of users that are tagged in the comment
video_id: The ID of the video the comment is on
comment_language: The language of the comment, as predicted by the TikTok API
reply_comment_id: If the comment is replying to another comment, this is the ID of that comment
The data can be compiled into a user interaction network to facilitate study of interaction dynamics. There is code to help with that here: https://github.com/networkdynamics/polar-seeds. Additional scripts for further preprocessing of this data can be found there too.
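A minimal sketch of one way such a network could be assembled, assuming the column layouts listed above for videos.csv and comments.csv. This is not the polar-seeds code: the edge definition (commenter to video author, weighted by number of comments) and the helper name `build_edges` are illustrative assumptions, and the in-memory rows are invented sample data.

```python
import csv
import io
from collections import Counter

# Invented sample rows that follow the documented column names.
VIDEOS_CSV = """video_id,author_id
v1,u1
v2,u2
"""
COMMENTS_CSV = """comment_id,author_id,video_id,reply_comment_id
c1,u2,v1,
c2,u3,v1,c1
c3,u3,v1,
c4,u1,v2,
"""


def build_edges(video_rows, comment_rows):
    """Count directed edges (commenter -> video author) across all comments."""
    video_author = {row["video_id"]: row["author_id"] for row in video_rows}
    edges = Counter()
    for row in comment_rows:
        target = video_author.get(row["video_id"])
        # Skip comments on videos missing from videos.csv, and self-comments.
        if target is None or target == row["author_id"]:
            continue
        edges[(row["author_id"], target)] += 1
    return edges


videos = list(csv.DictReader(io.StringIO(VIDEOS_CSV)))
comments = list(csv.DictReader(io.StringIO(COMMENTS_CSV)))
edges = build_edges(videos, comments)
for (source, target), weight in sorted(edges.items()):
    print(source, "->", target, weight)
```

Reply edges (commenter to the author of the comment named in `reply_comment_id`) could be added the same way with a second lookup table keyed on `comment_id`; they are left out here to keep the sketch short.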
As of February 2025, the United States was the country with the largest TikTok audience by far, with almost ****** million users engaging with the popular social video platform. Indonesia followed, with around ***** million TikTok users. Brazil came in third, with almost ***** million users watching short videos on TikTok.
From Reels to Shorts: social short video takes the internet
Between 2021 and 2022, some of the most popular social media platforms added short-video features on the heels of TikTok’s popularity. YouTube Shorts, which rolled out to the global market in June 2021, reached *** billion monthly active logged-in users in 2023. In comparison, Instagram’s short-video format Reels, which launched in August 2020, presented a higher view rate than regular videos on the platform between June 2021 and June 2022, as well as a higher likes rate than other content types on Instagram.
TikTok business model
TikTok is owned by the Beijing-based ByteDance, along with the short-video app Douyin (TikTok’s version for the Chinese market), video platform Xigua, and popular news app Toutiao. While the products intended for the domestic market operate in the Chinese digital ecosystem and have a plurality of established monetization methods, such as live-shopping events hosted by famous influencers, TikTok’s main revenue stream comes from online advertising. In 2022, TikTok was estimated to have generated around **** billion U.S. dollars worldwide via online advertising.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
The dataset primarily consists of daily-refreshed reviews and ratings from users of the TikTok app. Supplementary data, such as the relevancy of the reviews and the dates they were published, is also part of the dataset.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
In a context where there is permanent electoral campaigning, an increasing number of political communication experts are trying to unravel the resources used by government officials and their parties to influence TikTok users. From a broad perspective, the subject matter is not new, but it is topical; nonetheless, this research discloses a gap in the literature by amalgamating the recognition of idiosyncratic attributes of the feminisation of political discourse on TikTok with the analysis of the reactions (text and emojis) that the audiovisual content imbued by this trend elicits in users. The purpose is to ascertain whether the inclusive tone of the feminised rhetorical style can be extrapolated to TikTok and, if so, whether its particular characteristics mitigate expressions of incivility. To do so, the initial content posted (first seven months) on TikTok by the Spanish political platform Sumar, with its leader, Yolanda Díaz, featuring prominently in most of the videos, was selected for scrutiny. A mixed methodology analysis of audiovisual content and comments showed that the anti-polarisation rhetoric and storytelling contributed to neutralising the extreme forms of flaming, although Sumar did not use a strategy tailor-made to suit TikTok.
A high-engagement dataset of TikTok posts with 50K+ plays. Includes video captions, play counts, likes, comments, shares, and creator details; ideal for trend analysis, viral content tracking, and performance benchmarking across TikTok creators and campaigns.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Please refer to the "README.rtf" and "Data Details for Replication.pdf" files.
Paper Abstract
Social media platforms have become vital channels for businesses to reach consumers through advertising. But in the U.S., the digital advertising market in which these platforms operate is dominated by a few major players, raising concerns for antitrust regulators. In such a concentrated market, the entry or exit of a single platform can reallocate billions in ad spending, affecting businesses and users. TikTok's temporary suspension in the U.S. in January 2025 provides a unique natural experiment to examine how the removal of a major player would shift advertising demand and supply on competitors, specifically Facebook and Instagram, revealing the degree of substitutability across platforms and the intensity of competition. Using a difference-in-differences approach comparing advertising activity in the U.S. to other countries, we find that Meta ad volume and spend rose by 6.3% and 22.4%, as a result of the outage, without a corresponding increase in ad impressions. Consequently, Meta ad prices, as measured by cost per thousand impressions, jumped by 12.1%. Shifts in demand were three times greater for larger advertisers relative to smaller ones, suggesting that Meta platforms and TikTok are closer substitutes for larger firms and that a TikTok ban would therefore impose greater challenges on smaller businesses.
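The difference-in-differences idea mentioned in the abstract can be illustrated with a small sketch. The numbers below are invented and are not the paper's data, and the function name `did_estimate` is hypothetical. The paper's actual specification is not described here, so this shows only the basic two-group, two-period estimator: the change in the treated group minus the change in the comparison group.

```python
def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Basic 2x2 difference-in-differences estimate.

    Subtracting the control group's change removes any trend that is
    common to both groups, leaving the change attributed to the event.
    """
    return (treated_post - treated_pre) - (control_post - control_pre)


# Invented ad-volume index values before and during the outage.
us_pre, us_post = 100.0, 110.0        # treated group: U.S.
other_pre, other_post = 100.0, 103.0  # comparison group: other countries

effect = did_estimate(us_pre, us_post, other_pre, other_post)
print(effect)
```

In this invented example the treated group rises by 10 index points and the comparison group by 3, so 7 points are attributed to the event, under the usual assumption that both groups would otherwise have followed parallel trends.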
A curated dataset of TikTok Mom creators with rich engagement metrics and post-level insights. Ideal for analyzing parenting and lifestyle content trends, influencer performance, and audience behavior, and delivered in flexible, structured formats for easy integration.
A curated dataset of 18K+ verified TikTok influencers across top content categories. Includes bios, follower counts, engagement metrics, and trend signals; ideal for premium brand campaigns, influencer analytics, or creator discovery.
This dataset encompasses social media exposure to sponsored posts, collected from over 150,000 triple-opt-in first-party U.S. Daily Active Users (DAU). Use it for measurement, attribution or brand lift surveying. Platforms covered include Facebook, TikTok, X, Instagram and YouTube.