License: https://creativecommons.org/publicdomain/zero/1.0/
We are probably all familiar with TikTok. People spend hours each day scrolling through the millions of videos that are uploaded every single day. Not to mention the uploaders who will do anything to get as many likes and followers as possible. But what makes one TikTok video a true hit or a miss? I give you an opportunity to figure this out ;)
I scraped the first 1000 trending videos on TikTok, using an unofficial TikTok web-scraper. Note that I had to provide my user information to scrape the trending feed, so the trending page may be personalized. But that doesn't change the fact that certain people and videos got a certain number of likes and comments.
I transformed the data into usable CSV files and attached the actual videos as well.
Videos.zip: This file contains the actual 1000 trending TikTok videos. Each filename corresponds to the id key in the trending.json file.
trending.json: The raw scraped dataset. I found that splitting up the dataset resulted in messy errors. For example, a user might have one avatar while posting a video and another while posting the next video. This resulted in multiple users with the same name, id, etc., differing only in the avatar. So I decided to post the raw data, and I will show you how to translate this multi-level JSON structure to a single DataFrame in my first Notebook.
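The flattening of a multi-level JSON record into one flat row can be sketched with the standard library alone. This is a minimal illustration, not the author's notebook code: the sample record and its keys (`authorMeta`, `musicMeta`, `diggCount`) are hypothetical stand-ins for one entry of trending.json, and the real file may be structured differently.

```python
import json

def flatten(record, parent_key="", sep="_"):
    """Recursively flatten nested dicts into one level of prefixed keys."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value  # lists and scalars are kept as-is
    return flat

# Hypothetical sample; the key names are assumptions, not the real schema.
raw = json.loads("""
[
  {"id": "1", "diggCount": 120,
   "authorMeta": {"name": "alice", "verified": false},
   "musicMeta": {"musicName": "original sound"}}
]
""")

rows = [flatten(item) for item in raw]
print(rows[0])
```

Each flattened dict can then serve as one row of a table. If pandas is available, `pandas.json_normalize(raw)` performs a similar flattening (with `.` as the default separator).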
Many thanks to Andrew Nord, the creator of the tiktok-scraper, and his contributors.
So what does make a TikTok video a true hit? Is it the moment when a video is uploaded? Or perhaps the number of followers is an important factor? Maybe the hashtags or even the music being used?
So... are you the one who unlocks the mystery?
License: MIT License, https://opensource.org/licenses/MIT
License information was derived automatically
This dataset contains a large collection of TikTok video metadata fetched using the TikTok Scraper API. It includes videos from multiple regions (e.g., US, India) and categories (e.g., fyp, dance, comedy, food, travel). Each video entry provides detailed information such as:
- Video ID: Unique identifier for the video.
- Region: The region where the video is popular.
- Category: The keyword/category used to fetch the video (e.g., dance, comedy).
- Title: The title of the video.
- Duration: The length of the video in seconds.
- Play URL: Direct link to the video.
- Watermarked URL: Link to the watermarked version of the video.
- Cover Image: URL of the video's cover image.
- Music URL: Link to the music used in the video.
- Timestamp: The date and time when the data was fetched.
How This Dataset Can Be Helpful
Trend Analysis: Analyze trending videos across different regions and categories. Identify patterns in video popularity based on region, duration, or category.
Machine Learning: Train models to predict video popularity based on features like duration, region, and category. Build recommendation systems for TikTok videos.
Content Moderation: Use the dataset to analyze video content for moderation purposes.
Sentiment Analysis: Perform sentiment analysis on video titles to understand user preferences.
Cross-Region Insights: Compare video trends across different regions to understand cultural differences.
How to Use This Dataset
Filter by Region: Analyze videos from a specific region (e.g., US or India).
Filter by Category: Focus on videos from a specific category (e.g., dance or comedy).
Trend Analysis: Identify trending videos based on timestamp and region.
Machine Learning: Use the dataset to train models for video popularity prediction or recommendation systems.
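The region and category filters described above can be sketched with the standard `csv` module. The header names (`video_id`, `region`, `category`, `duration`) and the sample rows are assumptions made for illustration; check the actual file's header before reusing this.

```python
import csv
import io

# Hypothetical sample rows; the header names are assumptions.
SAMPLE = """video_id,region,category,duration
1,US,dance,15
2,IN,comedy,32
3,US,comedy,8
4,IN,dance,21
"""

def filter_rows(rows, region=None, category=None):
    """Keep rows matching the given region and/or category (None = no filter)."""
    return [
        row for row in rows
        if (region is None or row["region"] == region)
        and (category is None or row["category"] == category)
    ]

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
us_rows = filter_rows(rows, region="US")
us_comedy = filter_rows(rows, region="US", category="comedy")
print([r["video_id"] for r in us_rows])
print([r["video_id"] for r in us_comedy])
```

To run this against the real data, replace `io.StringIO(SAMPLE)` with an open file handle for the downloaded CSV.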
License: Attribution-ShareAlike 4.0 (CC BY-SA 4.0), https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
This dataset, titled TikTok Viral Trends 2025, provides a curated snapshot of 50 trending TikTok videos from September 2025, capturing the platform's dynamic content landscape. Sourced from real-time web analyses and social media insights (e.g., X posts, trend reports from reputable sources like Ramdam, NapoleonCat, and Tokchart), it focuses on viral videos across diverse categories such as Entertainment, Music, Comedy, Lifestyle, Beauty, Sustainability, and Technology. The dataset is designed for data scientists, researchers, and enthusiasts interested in analyzing social media trends, predicting virality, or exploring multimodal machine learning applications (e.g., NLP, time-series, or clustering). It stands out from existing Kaggle datasets by offering fresh, 2025-specific data with rich metadata, including engagement metrics, hashtags, and sound/trend associations.
The dataset (tiktok_data.csv) contains the following 12 columns:
- video_id: Unique identifier for each video or trend (integer or hashtag-based).
- author: Creator username or group (anonymized as "Unknown" where not specified).
- description: Brief summary of the video content or trend, derived from source context.
- upload_date: Approximate or exact posting date (YYYY-MM-DD).
- views: Reported view count (e.g., millions, billions for hashtag aggregates; "N/A" if unavailable).
- likes: Reported like count (e.g., thousands, millions; "N/A" if unavailable).
- shares: Share count (often "N/A" due to limited public data).
- comments: Comment count (often "N/A" due to limited public data).
- hashtags: Key hashtags associated with the video or trend (e.g., #Kpop, #Viral).
- category: Inferred content category (e.g., Entertainment, Music, Comedy, Lifestyle, Sustainability, Tech).
- sound_or_trend: Associated audio track or challenge name driving the trend (e.g., "Soda Pop dance", "JUMP").
- source: Citation of data origin (e.g., post:72 for X post ID, web:65 for web source ID).
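Because the views, likes, shares, and comments columns may hold abbreviated counts or "N/A", they usually need normalising before any numeric analysis. The sketch below assumes values are written like `39.3B`, `2.5M`, `1,200`, or `N/A`; that exact notation is an assumption, and the helper name `parse_count` is hypothetical.

```python
import re

# Suffix multipliers; assumed notation, not confirmed by the dataset.
MULTIPLIERS = {"K": 1_000, "M": 1_000_000, "B": 1_000_000_000}

def parse_count(text):
    """Convert strings such as '39.3B' or '1,200' to an int; 'N/A' becomes None."""
    cleaned = text.strip().replace(",", "")
    if cleaned.upper() in {"", "N/A"}:
        return None
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([KMB]?)", cleaned, flags=re.IGNORECASE)
    if match is None:
        return None  # unrecognised format: treat as missing
    number, suffix = match.groups()
    return round(float(number) * MULTIPLIERS.get(suffix.upper(), 1))

for value in ["39.3B", "2.5M", "1,200", "N/A"]:
    print(value, "->", parse_count(value))
```

Returning `None` for anything unrecognised keeps missing and malformed values distinct from a genuine zero.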
Among the reported figures is #Perfume reaching 39.3B views.
This dataset is ideal for a variety of machine learning and data analysis tasks on Kaggle, including but not limited to:
- Virality Prediction: Use views, likes, and hashtags to train regression or classification models (e.g., XGBoost, neural networks) to predict video success.
- Trend Analysis: Apply clustering (e.g., K-means) or topic modeling (e.g., LDA) to identify emerging content themes or regional differences.
- NLP Applications: Analyze descriptions and hashtags with BERT or word embeddings to study sentiment, cultural trends, or influencer impact.
- Time-Series Forecasting: Leverage upload_date and engagement metrics for temporal analysis of trend lifecycles.
- Recommendation Systems: Build content recommendation models based on category, sound, or hashtag similarities.
- Social Media Ethics: Explore AI-driven trends (e.g., deepfake Identity Swaps) for studies on misinformation or content authenticity.
Exact metrics may vary slightly due to real-time fluctuations.
License: https://creativecommons.org/publicdomain/zero/1.0/
This dataset records various features of top trending videos on TikTok and YouTube Shorts in the summer of 2022. Features include video attributes (theme, type, style, length) and music attributes (genre, release year, and the part of the music used).
For use of data examples, please refer to the dashboards I made with Tableau here: TikTok Top Trending Video dashboard: https://public.tableau.com/app/profile/caroline.zhu6047/viz/TopTrendingVideoDashboard_16691429927590/Overview
License: https://creativecommons.org/publicdomain/zero/1.0/
This dataset was created by Marcus Ong
Released under CC0: Public Domain
License: MIT License, https://opensource.org/licenses/MIT
License information was derived automatically
The TikHarm dataset is a curated collection of TikTok videos designed to train models for classifying harmful content. The dataset is in the format of UCF101, and it is specifically focused on content accessible to children, with the aim of distinguishing between different types of potentially harmful material.
Data was gathered from TikTok, targeting videos that are accessible to children to ensure the dataset reflects the type of content they are likely to encounter.
Collected videos were manually labeled into four predefined categories:
- Harmful Content: Videos that depict violence, dangerous actions that children might imitate, or other harmful behavior.
- Adult Content: Videos containing sexual content or other material deemed inappropriate for children.
- Safe: Videos that are appropriate and safe for children to view, such as popular cartoons.
- Suicide: Videos that depict, suggest, or discuss suicidal behavior or ideation.
| Subset | Samples | Min Duration (s) | Max Duration (s) | Avg Duration (s) | Total Duration (h) |
|---|---|---|---|---|---|
| Train | 2762 | 3.88 | 600 | 38.71 | 29.71 |
| Dev | 396 | 5.04 | 600 | 38.57 | 4.24 |
| Test | 790 | 1.95 | 600 | 38.77 | 8.51 |
| Class | Samples | Min Duration (s) | Max Duration (s) | Avg Duration (s) | Total Duration (h) |
|---|---|---|---|---|---|
| Safe | 997 | 5.04 | 568.8 | 65.36 | 18.1 |
| Adult | 977 | 1.95 | 600 | 36.25 | 9.84 |
| Harmful | 990 | 4.8 | 600 | 35.92 | 9.88 |
| Suicide | 984 | 3.88 | 181.23 | 16.96 | 4.63 |
These tables present the duration statistics for each subset and class within the TikHarm dataset.
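The columns of these tables are linked by simple arithmetic: total duration in hours should equal samples × average duration in seconds ÷ 3600. The sketch below applies that check to the per-class figures copied from the table above; the helper name `total_hours` is hypothetical.

```python
# (samples, average duration in seconds, reported total duration in hours)
CLASS_STATS = {
    "Safe": (997, 65.36, 18.1),
    "Adult": (977, 36.25, 9.84),
    "Harmful": (990, 35.92, 9.88),
    "Suicide": (984, 16.96, 4.63),
}

def total_hours(samples, avg_seconds):
    """Total duration in hours implied by a sample count and a mean length."""
    return samples * avg_seconds / 3600

for name, (samples, avg_s, reported_h) in CLASS_STATS.items():
    computed = total_hours(samples, avg_s)
    print(f"{name}: computed {computed:.2f} h, reported {reported_h} h")
```

The same check can be run on the per-subset table; small differences are expected because the published averages are rounded.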
This comprehensive dataset is invaluable for developing robust video classification models to automatically detect and categorize harmful content on social media platforms.
License: https://creativecommons.org/publicdomain/zero/1.0/
TikTok is one of the hottest social media platforms out there, and it's only getting bigger. If you're looking to get in on the action, this dataset is for you!
This dataset contains a collection of videos from TikTok, including information on the user who posted the video, the number of likes, shares, and comments the video received, as well as the video's length and description. With this data, you can see what types of videos are popular on TikTok and start planning your own viral content!
- The dataset contains a collection of videos from the social media platform TikTok.
- The videos include information on the user who posted the video, the number of likes, shares, and comments the video received, as well as the video's length and description.
- The dataset also contains information on popular TikTok authors, including their unique ID, nickname, avatar thumbnail, signature, and whether or not their account is verified or private.
- Additionally, the dataset includes a list of trending videos on TikTok, as well as the number of likes, shares, comments, and plays each video has received
- Identifying popular TikTok authors to target for scraping videos and liked videos
- Finding trending videos on TikTok for further analysis
- Generating a list of videos from the TikTok app that are tagged with the #funny hashtag
License
License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.
File: tiktok_collected_liked_videos.csv
| Column name | Description |
|:---|:---|
| user_name | The name of the user who posted the video. (String) |
| n_likes | The number of likes the video has received. (Integer) |
| n_shares | The number of shares the video has received. (Integer) |
| n_comments | The number of comments the video has received. (Integer) |
| n_plays | The number of times the video has been played. (Integer) |

File: tiktok_collected_videos.csv
| Column name | Description |
|:---|:---|
| user_name | The name of the user who posted the video. (String) |
| n_likes | The number of likes the video has received. (Integer) |
| n_shares | The number of shares the video has received. (Integer) |
| n_comments | The number of comments the video has received. (Integer) |
| n_plays | The number of times the video has been played. (Integer) |

File: tiktok_funny_hashtag_videos.csv
| Column name | Description |
|:---|:---|
| author_nickname | The author's nickname. (String) |
| author_avatarThumb | The author's avatar thumbnail. (String) |
| author_signature | The author's signature. (String) |
| author_verification | Whether or not the author's account is verified. (Boolean) |
| author_privateAccount | Whether or not the author's account is private. (Boolean) |
| author_followingCount | The number of people the author is following. (Integer) |
| author_followerCount | The number of people following the author. (Integer) |
| author_heartCount | The number of hearts the author has. (Integer) |
| author_diggCount | The number of diggs the author has. (Integer) |
| music_title | The title of the music. (String) |
| music_playUrl | The play url of the music. (String) |
| music_coverThumb | The cover thumbnail of the music. (String) |
| music_authorName | The author name of the music. (String) |
| music_originality | The originality of the music. (String) |
| music_duration | The duration of the music. (String) |
File: trending_authors.csv | Column name | Description ...
This dataset contains comprehensive information about TikTok posts, originally fetched from RapidAPI. It provides valuable insights into various aspects of TikTok content, including details about the videos, their creators, and audience engagement metrics.
Here's a breakdown of the columns included in this dataset:
- video_id: A unique identifier for each TikTok video.
- author: The username or handle of the TikTok account that posted the video.
- description: The textual description or caption provided by the creator for the video. (Note: This column contains some missing values.)
- likes: The number of likes the video has received.
- comments: The number of comments on the video.
- shares: The number of times the video has been shared.
- plays: The total number of plays or views the video has accumulated. (Note: This column contains some missing values.)
- hashtags: A list of hashtags used in the video's description, which helps categorize content and improve discoverability. (Note: This column contains some missing values.)
- music: Information about the background music or sound used in the video.
- create_time: The timestamp indicating when the video was created or published. (Note: This column contains some missing values.)
- video_url: The direct URL to the TikTok video.
- fetch_time: The timestamp when the data for the video was fetched from the API. (Note: This column has a high number of missing values.)
- views: Another metric for the number of views. (Note: This column has a high number of missing values and appears to overlap with plays.)
- posted_time: The time the video was posted. (Note: This column has a high number of missing values and appears to overlap with create_time.)

Potential Uses of This Dataset:
- Content Analysis: Analyze popular TikTok content by examining descriptions, hashtags, and engagement metrics.
- Trend Identification: Identify trending topics, music, and creators on TikTok.
- Audience Engagement Studies: Understand how different types of content generate likes, comments, shares, and plays.
- Creator Analysis: Study the posting habits and performance of various TikTok creators.
- Social Media Research: Conduct research on the dynamics of content dissemination and user interaction on short-form video platforms.

Notes on Data Quality:
- The description, plays, hashtags, and create_time columns have some missing values, which may require handling (e.g., imputation or removal) depending on your analysis.
- The fetch_time, views, and posted_time columns are largely empty, suggesting they may not be reliable for comprehensive analysis.
- It is recommended to primarily rely on create_time for timestamps and plays for engagement metrics.

This dataset can be a valuable resource for anyone looking to explore the vast and dynamic world of TikTok content and user engagement.
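One way to act on the data quality notes is to treat plays and create_time as the primary columns and consult the overlapping views and posted_time columns only when the primary value is missing. That fallback is an assumption layered on the recommendation above, not something the dataset prescribes; the sample rows and the helper `first_present` are hypothetical.

```python
import csv
import io

# Hypothetical sample; real rows and timestamp formats may differ.
SAMPLE = """video_id,plays,views,create_time,posted_time
a1,1500,,2024-01-05,
a2,,2300,,2024-02-10
a3,,,,
"""

def first_present(*values):
    """Return the first non-empty value, or None if all are missing."""
    for value in values:
        if value not in (None, ""):
            return value
    return None

cleaned = []
for row in csv.DictReader(io.StringIO(SAMPLE)):
    plays = first_present(row["plays"], row["views"])
    timestamp = first_present(row["create_time"], row["posted_time"])
    if plays is None or timestamp is None:
        continue  # drop rows that lack a play count or a timestamp
    cleaned.append(
        {"video_id": row["video_id"], "plays": int(plays), "timestamp": timestamp}
    )

print(cleaned)
```

Dropping incomplete rows is the simplest policy; imputation may be preferable when too many rows would otherwise be lost.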
License: https://creativecommons.org/publicdomain/zero/1.0/
This dataset captures the pulse of viral social media trends across TikTok, Instagram, Twitter, and YouTube. It provides insights into the most popular hashtags, content types, and user engagement levels, offering a comprehensive view of how trends unfold across platforms. With regional data and influencer-driven content, this dataset is perfect for:
Dive in to explore what makes content go viral, the behaviors that drive engagement, and how trends evolve on a global scale!
License: https://creativecommons.org/publicdomain/zero/1.0/
How do you measure the success of a video on social media? Is it the number of likes? The number of shares? The number of comments?
This dataset contains information on videos posted to the social media platform TikTok. The data includes the video ID, description, creation time, length, number of likes, shares, and comments, as well as a link to the video.
With this data, you can explore what factors make a video popular on TikTok and learn more about user preferences on this rapidly growing social media platform
This dataset can be used to study user preferences in social media. The data includes the number of likes, shares, comments, and plays for each video, as well as the video's description, length, and link
- Identifying trends in social media
- Analyzing user preferences in social media
- Predicting future trends in social media
Dataset by TikTok
License
License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.
File: omnibuslaw_videos.csv
| Column name | Description |
|:---|:---|
| createTime | The date and time the video was posted. (DateTime) |
| n_likes | The number of likes the video has received. (Integer) |
| n_shares | The number of times the video has been shared. (Integer) |
| n_comments | The number of comments the video has received. (Integer) |
| n_plays | The number of times the video has been played. (Integer) |

File: tiktok_liked_videos.csv
| Column name | Description |
|:---|:---|
| n_likes | The number of likes the video has received. (Integer) |
| n_shares | The number of times the video has been shared. (Integer) |
| n_comments | The number of comments the video has received. (Integer) |
| n_plays | The number of times the video has been played. (Integer) |
| user_name | The username of the person who posted the video. (String) |

File: trending.csv
| Column name | Description |
|:---|:---|
| user_name | The username of the person who posted the video. (String) |
| n_likes | The number of likes the video has received. (Integer) |
| n_shares | The number of times the video has been shared. (Integer) |
| n_comments | The number of comments the video has received. (Integer) |
| n_plays | The number of times the video has been played. (Integer) |

File: washingtonpost_videos.csv
| Column name | Description |
|:---|:---|
| user_name | The username of the person who posted the video. (String) |
| n_likes | The number of likes the video has received. (Integer) |
| n_shares | The number of times the video has been shared. (Integer) |
| n_comments | The number of comments the video has received. (Integer) |
| n_plays | The number of times the video has been played. (Integer) |
License: https://creativecommons.org/publicdomain/zero/1.0/
With the rise of short-form video platforms such as TikTok, Instagram Reels, YouTube Shorts, and Snapchat, the music industry is yet again adapting to changing consumer consumption trends.
Here's the data for music used in trending videos in the second half of 2022.
US supermarkets have seen a recent shortage of feta cheese due to a TikTok pasta recipe that went viral (https://www.fox5ny.com/news/viral-tiktok-video-recipe-prompts-feta-cheese-shortage).
The Brazilian music industry is already experiencing huge shifts in its business model: TikTok has changed young people's playlists. Most of the biggest players in this market have noticed the revolution happening in broad daylight and are trying to influence, as much as possible, something many believe to be random: songs going viral.
This data contains 10,000 rows, each describing a single video, and 14 columns, including: username, user id, video id, video desc, videotime, video length, video link, n likes, n shares, n comments, n plays, music name, music url.
Thank you David Teather for developing a nice and easy-to-use API.
This dataset contains information on over 19,000 TikTok videos, sourced from the Google Advanced Data Analytics course. It includes details on video duration, transcriptions, engagement metrics (views, likes, shares, comments), and author attributes like verification and ban status. Use this dataset to explore video trends, analyze social media engagement, or build machine learning models for content recommendation and trend prediction.
Key Features: - Claim Status: Status of claims on the videos. - Duration: Length of videos in seconds. - Engagement Metrics: Likes, shares, views, downloads, and comments. - Transcriptions: Textual transcriptions for content analysis. - Author Information: Verification and ban status of video creators.
As of January 2022, the United States was the country with the largest TikTok audience by far, with approximately 131 million users engaging with the popular social video platform. Indonesia followed, with around 92 million TikTok users. Brazil came in third, with 74 million users using TikTok to watch short videos.
License: https://creativecommons.org/publicdomain/zero/1.0/
The reviews and ratings for the TikTok application on the Android platform provide valuable insights into user experiences, satisfaction levels, and overall performance of the app. TikTok, a popular social media platform known for its short-form video content, has garnered millions of downloads and active users worldwide. On the Google Play Store, users have the opportunity to rate the app on a scale of 1 to 5 stars and leave detailed reviews highlighting their thoughts, feedback, and suggestions.
Positive reviews often praise TikTok for its user-friendly interface, innovative video editing tools, and the ability to discover entertaining and creative content from a diverse global community. Many users appreciate the app's algorithm, which curates personalized content tailored to individual preferences, making it highly engaging and addictive. Additionally, the frequent updates and introduction of new features, such as filters, effects, and music integration, are frequently mentioned as reasons for high ratings.
On the other hand, some negative reviews highlight concerns about privacy, data security, and the presence of inappropriate content. A few users have reported occasional bugs, crashes, or performance issues, particularly on older Android devices. Despite these criticisms, TikTok's overall rating remains high, reflecting its widespread popularity and the enjoyment it brings to the majority of its users. The reviews and ratings collectively serve as a useful resource for potential new users to gauge the app's strengths and weaknesses before downloading it.
TikTok's platform is mostly fueled by viral videos of users doing outlandish, scary, or funny things. On the platform, these trend and meme videos typically come with a hashtag that includes the word "challenge". But what is a TikTok challenge, and how do you find or create one? Here's everything you need to know.
This TikTok book challenge was created by @haleyisfearless. It asks you to show your prettiest book, your tiniest book, a book you highly recommend, a book you're currently reading, and one of your favorite books. In the most basic sense, these challenges originate from viral videos, and a TikTok challenge isn't complete without its defining hashtag in the video's description.
These TikTok challenges are the perfect way to ease into what can be an intimidating social media platform and help you find your fellow book lovers.
This dataset is generated entirely from TikTok, so we want to thank @haleyisfearless for creating this challenge video.
The goal of this project is to make a Python script which takes a video as input and returns all text visible in the video. The videos are TikTok videos, so text can appear anywhere on screen, with different backgrounds, font sizes, etc.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains metadata from TikTok videos in the beauty and personal care niche. The data is structured to analyze video performance, user interaction, and content features, with specific metrics such as play count, share count, comments, and video details. It also includes user-level attributes like follower count, region, and engagement metrics, enabling analysis of influencer activity and content trends in this domain.
Source: Public TikTok profiles collected via Apify (a web scraping tool).
Inspiration: Explore how users engage with TikTok content and profiles. Use this data to create predictive models or track trends in social media engagement.
License: https://creativecommons.org/publicdomain/zero/1.0/
This dataset provides information about the top 100 TikTok accounts worldwide in 2025, ranked based on their popularity. The data has been manually curated and includes essential metrics that reflect the performance and engagement of TikTok creators. It can be used for various purposes such as trend analysis, content strategy development, or understanding the growth of social media influencers.
Features Included:
- Rank: Ranking based on follower count.
- Uploads: The total number of videos uploaded by the account.
- Views: Total views generated by the account's videos.
- Followers: Number of followers for the account.
- Following: Number of accounts the user is following.
- Username: The username of the TikTok account.
This dataset is suitable for data analysis, machine learning model development, and studying trends in social media content.
License: https://creativecommons.org/publicdomain/zero/1.0/
This dataset provides unprecedented insight into public opinion and discourse related to a major foreign policy event: the hypothetical invasion of Ukraine in 2022. Through this dataset, researchers have access to 16 thousand TikTok videos, spanning 6 million unique users, as well as 12 million associated comments. Explore discourse themes on the platform and investigate how opinions are shaped by political events through sentiment analysis. As further research develops, compare findings from this dataset with similar datasets from other social media platforms to better illuminate the nature of digital public opinion and its potential influence on national policies
This dataset provides an opportunity to gain a broad understanding of how users engage with and contribute to the conversation around a major political event on the TikTok platform. Here are some tips on how you can use this dataset:
- Analyze User Engagement: You can study user engagement by exploring the comment threads associated with each video in the dataset, examining trends for particular user types or locations, or exploring any features that could have predictive value in terms of engagement levels.
- Compare User Participation: You can compare user participation from different countries or regions by analyzing comments and likes over time in relation to nationality. This would allow you to better understand where conversations about this particular event are most popular, and which countries/regions are more likely to have an opinion about it.
- Explore Topics & Narratives: By applying NLP techniques such as sentiment analysis and topic modeling to the comments data, you will be able to uncover common themes among videos with shared narratives related to the event in question.
By leveraging these tools, you will be able to extract meaning from this massive dataset and gain insight into individual users' behavior as well as the overall discourse around the invasion of Ukraine in 2022.
- Cultural attitudes towards the invasion of Ukraine in 2022: This dataset can be used to determine public attitudes towards the event by analyzing both the comments and videos from users, providing an alternative means of studying cultural predispositions than traditional polls or surveys.
- Influence of online communities on discussing issues: This dataset can be used to study how online communities influence people's mindset and opinions on a certain topic. By analyzing how conversations change across different platforms, academics may be able to determine what makes certain communities more effective at forming consensus around issues compared to others.
- Interpersonal dynamics among users regarding significant events: Analyzing this data can shed light on how conversations turn into heated debates between two groups of users, establishing either agreement or dissent over a particular topic related to the invasion in 2022. It can also help identify which individuals are influential in certain circles for sparking engagement with their ideas or statements about the event.
If you use this dataset in your research, please credit the original authors.
License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.
File: video_ids.csv
License: https://creativecommons.org/publicdomain/zero/1.0/
As per Wikipedia (https://en.wikipedia.org/wiki/TikTok):
TikTok, whose mainland Chinese counterpart is Douyin (Chinese: 抖音; pinyin: Dǒuyīn), is a short-form video hosting service owned by ByteDance. It hosts user-submitted videos, which can range in duration from 3 seconds to 10 minutes. Since their launches, TikTok and Douyin have gained global popularity. In October 2020, TikTok surpassed 2 billion mobile downloads worldwide. Morning Consult named TikTok the third-fastest growing infotech brand of 2020, after Zoom and Peacock. Cloudflare ranked TikTok the most popular website of 2021, surpassing Google. TikTok's popularity has resulted in the platform having an increasing cultural impact worldwide.
These reviews were extracted from the app's [Google Play Store page (non-lite / full version)](https://play.google.com/store/apps/details?id=com.ss.android.ugc.trill).
This dataset should paint a good picture on what is the public's perception of the app over the years. Using this dataset, we can do the following
(AND MANY MORE!)
Images generated using Bing Image Generator