https://creativecommons.org/publicdomain/zero/1.0/
This dataset covers popular hashtags on TikTok. It includes the author name, author ID, author signature, comment count, hashtag details, URL, and share count. The scraped hashtags include meme, funny, humor, comedy, education, lol, dance, song, and music, among others.
This dataset contains comprehensive information about TikTok posts, originally fetched from RapidAPI. It provides valuable insights into various aspects of TikTok content, including details about the videos, their creators, and audience engagement metrics.
Here's a breakdown of the columns included in this dataset:
- video_id: A unique identifier for each TikTok video.
- author: The username or handle of the TikTok account that posted the video.
- description: The textual description or caption provided by the creator for the video. (Note: This column contains some missing values.)
- likes: The number of likes the video has received.
- comments: The number of comments on the video.
- shares: The number of times the video has been shared.
- plays: The total number of plays or views the video has accumulated. (Note: This column contains some missing values.)
- hashtags: A list of hashtags used in the video's description, which helps categorize content and improve discoverability. (Note: This column contains some missing values.)
- music: Information about the background music or sound used in the video.
- create_time: The timestamp indicating when the video was created or published. (Note: This column contains some missing values.)
- video_url: The direct URL to the TikTok video.
- fetch_time: The timestamp when the data for the video was fetched from the API. (Note: This column has a high number of missing values.)
- views: Another metric for the number of views. (Note: This column has a high number of missing values and appears to overlap with plays.)
- posted_time: The time the video was posted. (Note: This column has a high number of missing values and appears to overlap with create_time.)

Potential Uses of This Dataset:
- Content Analysis: Analyze popular TikTok content by examining descriptions, hashtags, and engagement metrics.
- Trend Identification: Identify trending topics, music, and creators on TikTok.
- Audience Engagement Studies: Understand how different types of content generate likes, comments, shares, and plays.
- Creator Analysis: Study the posting habits and performance of various TikTok creators.
- Social Media Research: Conduct research on the dynamics of content dissemination and user interaction on short-form video platforms.

Notes on Data Quality:
- The description, plays, hashtags, and create_time columns have some missing values, which may require handling (e.g., imputation or removal) depending on your analysis.
- The fetch_time, views, and posted_time columns are largely empty, suggesting they may not be reliable for comprehensive analysis.
- It is recommended to primarily rely on create_time for timestamps and plays for engagement metrics.

This dataset can be a valuable resource for anyone looking to explore the vast and dynamic world of TikTok content and user engagement.
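The data-quality notes recommend relying on plays and create_time, while views and posted_time appear to overlap with them. A minimal sketch of that idea follows, using only the standard library. The sample rows and their formatting are hypothetical, and falling back to the overlapping column when the preferred one is empty is an assumption about how one might handle the gaps, not something the dataset prescribes.

```python
import csv
import io

# Hypothetical sample rows; the real file's exact formatting is an assumption.
SAMPLE = """video_id,plays,views,create_time,posted_time
1,1500,,2024-01-05,
2,,2300,,2024-02-10
3,,,,
"""


def coalesce(row, primary, fallback):
    """Return the primary column's value, else the fallback's, else None."""
    for key in (primary, fallback):
        value = (row.get(key) or "").strip()
        if value:
            return value
    return None


def clean_rows(text):
    """Keep rows that have a usable play count and timestamp."""
    cleaned = []
    for row in csv.DictReader(io.StringIO(text)):
        plays = coalesce(row, "plays", "views")
        created = coalesce(row, "create_time", "posted_time")
        if plays is None or created is None:
            continue  # drop rows with no usable view count or timestamp
        cleaned.append(
            {"video_id": row["video_id"], "plays": int(plays), "create_time": created}
        )
    return cleaned


rows = clean_rows(SAMPLE)
for row in rows:
    print(row)
```

Dropping incomplete rows is only one option; imputation may suit analyses where losing rows would bias the result.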
https://creativecommons.org/publicdomain/zero/1.0/
This dataset captures the pulse of viral social media trends across TikTok, Instagram, Twitter, and YouTube. It provides insights into the most popular hashtags, content types, and user engagement levels, offering a comprehensive view of how trends unfold across platforms. With regional data and influencer-driven content, this dataset is perfect for:
Dive in to explore what makes content go viral, the behaviors that drive engagement, and how trends evolve on a global scale! 🌍
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
This dataset, titled TikTok Viral Trends 2025, provides a curated snapshot of 50 trending TikTok videos from September 2025, capturing the platform's dynamic content landscape. Sourced from real-time web analyses and social media insights (e.g., X posts, trend reports from reputable sources like Ramdam, NapoleonCat, and Tokchart), it focuses on viral videos across diverse categories such as Entertainment, Music, Comedy, Lifestyle, Beauty, Sustainability, and Technology. The dataset is designed for data scientists, researchers, and enthusiasts interested in analyzing social media trends, predicting virality, or exploring multimodal machine learning applications (e.g., NLP, time-series, or clustering). It stands out from existing Kaggle datasets by offering fresh, 2025-specific data with rich metadata, including engagement metrics, hashtags, and sound/trend associations.
tiktok_data.csv).post:72, web:65).
The dataset contains the following 12 columns:
- video_id: Unique identifier for each video or trend (integer or hashtag-based).
- author: Creator username or group (anonymized as "Unknown" where not specified).
- description: Brief summary of the video content or trend, derived from source context.
- upload_date: Approximate or exact posting date (YYYY-MM-DD).
- views: Reported view count (e.g., millions, billions for hashtag aggregates; "N/A" if unavailable).
- likes: Reported like count (e.g., thousands, millions; "N/A" if unavailable).
- shares: Share count (often "N/A" due to limited public data).
- comments: Comment count (often "N/A" due to limited public data).
- hashtags: Key hashtags associated with the video or trend (e.g., #Kpop, #Viral).
- category: Inferred content category (e.g., Entertainment, Music, Comedy, Lifestyle, Sustainability, Tech).
- sound_or_trend: Associated audio track or challenge name driving the trend (e.g., "Soda Pop dance", "JUMP").
- source: Citation of data origin (e.g., post:72 for X post ID, web:65 for web source ID).
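Because the views, likes, shares, and comments columns may hold abbreviated figures or "N/A", they usually need normalizing before any numeric analysis. The sketch below assumes values are written with K, M, or B suffixes (for example "39.3B"); the exact string format in the file is not documented here, so `parse_count` is a hypothetical helper that would need adjusting to the real data.

```python
import re

# Assumed suffix convention: K = thousand, M = million, B = billion.
_SUFFIXES = {"K": 1e3, "M": 1e6, "B": 1e9}
_PATTERN = re.compile(r"^\s*([\d,]*\.?\d+)\s*([KMB])?\s*$", re.IGNORECASE)


def parse_count(raw):
    """Convert strings such as '39.3B' or '1,200' to a float.

    Returns None for 'N/A', empty strings, or anything unrecognized.
    """
    if raw is None:
        return None
    match = _PATTERN.match(str(raw))
    if not match:
        return None
    number = float(match.group(1).replace(",", ""))
    suffix = (match.group(2) or "").upper()
    return number * _SUFFIXES.get(suffix, 1)


for value in ["39.3B", "1,200", "450k", "N/A"]:
    print(value, "->", parse_count(value))
```

Returning None rather than 0 for unavailable values keeps missing data distinguishable from a genuine zero count.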
#Perfume reaching 39.3B views.
This dataset is ideal for a variety of machine learning and data analysis tasks on Kaggle, including but not limited to:
- Virality Prediction: Use views, likes, and hashtags to train regression or classification models (e.g., XGBoost, neural networks) to predict video success.
- Trend Analysis: Apply clustering (e.g., K-means) or topic modeling (e.g., LDA) to identify emerging content themes or regional differences.
- NLP Applications: Analyze descriptions and hashtags with BERT or word embeddings to study sentiment, cultural trends, or influencer impact.
- Time-Series Forecasting: Leverage upload_date and engagement metrics for temporal analysis of trend lifecycles.
- Recommendation Systems: Build content recommendation models based on category, sound, or hashtag similarities.
- Social Media Ethics: Explore AI-driven trends (e.g., deepfake Identity Swaps) for studies on misinformation or content authenticity.
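As one illustration of the recommendation use case listed above, videos can be ranked by how many hashtags they share. This is a minimal sketch using Jaccard similarity. The sample entries are hypothetical, and the assumption that the hashtags field is a comma- or space-separated string of `#tags` should be checked against the actual file.

```python
def hashtag_set(field):
    """Split a hashtag field such as '#Kpop, #Viral' into a lowercase set."""
    tokens = field.replace(",", " ").split()
    return {t.lstrip("#").lower() for t in tokens if t.strip("#")}


def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def most_similar(target_id, videos):
    """Rank the other videos by hashtag overlap with the target, best first."""
    target = hashtag_set(videos[target_id])
    scores = [
        (jaccard(target, hashtag_set(tags)), vid)
        for vid, tags in videos.items()
        if vid != target_id
    ]
    return sorted(scores, reverse=True)


# Hypothetical video_id -> hashtags mapping for illustration only.
videos = {
    "a": "#Kpop, #Viral, #Dance",
    "b": "#kpop #dance",
    "c": "#Perfume, #Beauty",
}
ranked = most_similar("a", videos)
print(ranked)
```

Hashtag overlap ignores category and sound_or_trend; those columns could be added as further set elements if a richer similarity is wanted.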
#Ominous). Exact metrics may vary slightly due to real-time fluctuations.
As of January 2022, the hashtag "fyp," which stands for "for you page," was the most used hashtag on TikTok, amassing over 18.57 trillion views across posts using it. The hashtag "viral" ranked second, with approximately 6.3 trillion views on TikTok short-video posts using the hashtag. Posts using the hashtag "duet," which refers to TikTok videos that can be shared, mirrored, and commented on by creators, collected around 2.4 trillion views as of January 2022.
https://creativecommons.org/publicdomain/zero/1.0/
TikTok is one of the hottest social media platforms out there, and it's only getting bigger. If you're looking to get in on the action, this dataset is for you!
This dataset contains a collection of videos from TikTok, including information on the user who posted the video, the number of likes, shares, and comments the video received, as well as the video's length and description. With this data, you can see what types of videos are popular on TikTok and start planning your own viral content!
- The dataset contains a collection of videos from the social media platform TikTok.
- The videos include information on the user who posted the video, the number of likes, shares, and comments the video received, as well as the video's length and description.
- The dataset also contains information on popular TikTok authors, including their unique ID, nickname, avatar thumbnail, signature, and whether or not their account is verified or private.
- Additionally, the dataset includes a list of trending videos on TikTok, as well as the number of likes, shares, comments, and plays each video has received.
- Identifying popular TikTok authors to target for scraping videos and liked videos
- Finding trending videos on TikTok for further analysis
- Generating a list of videos from the TikTok app that are tagged with the #funny hashtag
License
License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.
File: tiktok_collected_liked_videos.csv

| Column name | Description |
|:---------------|:---------------------------------------------------------|
| user_name | The name of the user who posted the video. (String) |
| n_likes | The number of likes the video has received. (Integer) |
| n_shares | The number of shares the video has received. (Integer) |
| n_comments | The number of comments the video has received. (Integer) |
| n_plays | The number of times the video has been played. (Integer) |

File: tiktok_collected_videos.csv

| Column name | Description |
|:---------------|:---------------------------------------------------------|
| user_name | The name of the user who posted the video. (String) |
| n_likes | The number of likes the video has received. (Integer) |
| n_shares | The number of shares the video has received. (Integer) |
| n_comments | The number of comments the video has received. (Integer) |
| n_plays | The number of times the video has been played. (Integer) |

File: tiktok_funny_hashtag_videos.csv

| Column name | Description |
|:--------------------------|:-----------------------------------------------------------|
| author_nickname | The author's nickname. (String) |
| author_avatarThumb | The author's avatar thumbnail. (String) |
| author_signature | The author's signature. (String) |
| author_verification | Whether or not the author's account is verified. (Boolean) |
| author_privateAccount | Whether or not the author's account is private. (Boolean) |
| author_followingCount | The number of people the author is following. (Integer) |
| author_followerCount | The number of people following the author. (Integer) |
| author_heartCount | The number of hearts the author has. (Integer) |
| author_diggCount | The number of diggs the author has. (Integer) |
| music_title | The title of the music. (String) |
| music_playUrl | The play url of the music. (String) |
| music_coverThumb | The cover thumbnail of the music. (String) |
| music_authorName | The author name of the music. (String) |
| music_originality | The originality of the music. (String) |
| music_duration | The duration of the music. (String) |
File: trending_authors.csv | Column name | Description ...
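The n_likes, n_shares, n_comments, and n_plays columns can be combined into a simple engagement rate. The sketch below uses interactions per play, which is one common definition and an assumption here rather than a metric the dataset defines. The sample rows are hypothetical.

```python
def engagement_rate(n_likes, n_shares, n_comments, n_plays):
    """Interactions per play; returns None when there are no plays."""
    if n_plays <= 0:
        return None
    return (n_likes + n_shares + n_comments) / n_plays


# Hypothetical rows shaped like tiktok_collected_videos.csv.
sample = [
    {"user_name": "user_a", "n_likes": 100, "n_shares": 20, "n_comments": 30, "n_plays": 1000},
    {"user_name": "user_b", "n_likes": 900, "n_shares": 50, "n_comments": 50, "n_plays": 20000},
    {"user_name": "user_c", "n_likes": 5, "n_shares": 0, "n_comments": 1, "n_plays": 0},
]

rated = []
for row in sample:
    rate = engagement_rate(row["n_likes"], row["n_shares"], row["n_comments"], row["n_plays"])
    if rate is not None:
        rated.append((rate, row["user_name"]))

# Highest engagement first.
for rate, name in sorted(rated, reverse=True):
    print(f"{name}: {rate:.3f}")
```

Normalizing by plays keeps a video with many views but little interaction from outranking a smaller video that its audience engaged with heavily.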
TikTok's platform is mostly fueled by viral videos of users doing outlandish, scary, or funny things. On the platform, these trend and meme videos typically come with a hashtag that includes the word challenge. But what is a TikTok challenge and how do you find or create them? Here's everything you need to know.
This TikTok book challenge was created by @haleyisfearless. It asks you to show your prettiest book, your tiniest book, a book you highly recommend, a book you're currently reading, and one of your favorite books. In the most basic sense, these challenges originate from viral videos, and a TikTok challenge isn't complete without its defining hashtag in the video's description.
These TikTok challenges are the perfect way to ease into what can be an intimidating social media platform and help you find your fellow book lovers.
This dataset is generated entirely from TikTok, so we want to thank @haleyisfearless for creating this challenge video.
The goal of this project is to write a Python script that takes a video as input and returns all text visible in the video. The videos are TikTok videos, so text can appear anywhere on screen, with different backgrounds, font sizes, etc.
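The overall shape of such a script can be sketched without committing to a particular video or OCR library. In the outline below, `collect_texts` is a hypothetical helper: it samples every n-th frame, passes it to an OCR callable supplied by the caller, and keeps each distinct line of text once. Which decoder and OCR engine to plug in (for instance OpenCV for frames and Tesseract for recognition) is an assumption left open here, as the project description does not name them.

```python
from typing import Callable, Iterable, List, TypeVar

Frame = TypeVar("Frame")


def collect_texts(
    frames: Iterable[Frame],
    ocr: Callable[[Frame], str],
    every_n: int = 10,
) -> List[str]:
    """Run OCR on every n-th frame and return distinct non-empty lines.

    Lines are compared case-insensitively after collapsing whitespace,
    and returned in the order they were first seen.
    """
    seen = set()
    ordered = []
    for index, frame in enumerate(frames):
        if index % every_n != 0:
            continue  # skip frames between samples to save OCR calls
        for line in ocr(frame).splitlines():
            text = " ".join(line.split())
            key = text.casefold()
            if text and key not in seen:
                seen.add(key)
                ordered.append(text)
    return ordered


# Stand-in data: strings play the role of frames, and the "OCR" simply
# returns them, so the control flow can be checked without real video.
fake_frames = ["Hello  world", "skipped frame", "hello world\nNew caption"]
texts = collect_texts(fake_frames, ocr=lambda frame: frame, every_n=2)
print(texts)
```

Sampling frames rather than processing all of them trades some recall for speed; on-screen text that appears for less than `every_n` frames could be missed.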
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Please cite the following paper when using this dataset:
N. Thakur, “Five Years of COVID-19 Discourse on Instagram: A Labeled Instagram Dataset of Over Half a Million Posts for Multilingual Sentiment Analysis”, Proceedings of the 7th International Conference on Machine Learning and Natural Language Processing (MLNLP 2024), Chengdu, China, October 18-20, 2024 (Paper accepted for publication, Preprint available at: https://arxiv.org/abs/2410.03293)
Abstract
The outbreak of COVID-19 served as a catalyst for content creation and dissemination on social media platforms, as such platforms serve as virtual communities where people can connect and communicate with one another seamlessly. While there have been several works related to the mining and analysis of COVID-19-related posts on social media platforms such as Twitter (or X), YouTube, Facebook, and TikTok, there is still limited research that focuses on the public discourse on Instagram in this context. Furthermore, the prior works in this field have only focused on the development and analysis of datasets of Instagram posts published during the first few months of the outbreak. The work presented in this paper aims to address this research gap and presents a novel multilingual dataset of 500,153 Instagram posts about COVID-19 published between January 2020 and September 2024. This dataset contains Instagram posts in 161 different languages. After the development of this dataset, multilingual sentiment analysis was performed using VADER and twitter-xlm-roberta-base-sentiment. This process involved classifying each post as positive, negative, or neutral. The results of sentiment analysis are presented as a separate attribute in this dataset.
For each of these posts, the Post ID, Post Description, Date of publication, language code, full version of the language, and sentiment label are presented as separate attributes in the dataset.
The Instagram posts in this dataset are present in 161 different languages, of which the top 10 in terms of frequency are English (343041 posts), Spanish (30220 posts), Hindi (15832 posts), Portuguese (15779 posts), Indonesian (11491 posts), Tamil (9592 posts), Arabic (9416 posts), German (7822 posts), Italian (5162 posts), and Turkish (4632 posts).
There are 535,021 distinct hashtags in this dataset, with the top 10 in terms of frequency being #covid19 (169865 posts), #covid (132485 posts), #coronavirus (117518 posts), #covid_19 (104069 posts), #covidtesting (95095 posts), #coronavirusupdates (75439 posts), #corona (39416 posts), #healthcare (38975 posts), #staysafe (36740 posts), and #coronavirusoutbreak (34567 posts).
The following is a description of the attributes present in this dataset:
- Post ID: Unique ID of each Instagram post
- Post Description: Complete description of each post in the language in which it was originally published
- Date: Date of publication in MM/DD/YYYY format
- Language code: Language code (for example: “en”) that represents the language of the post as detected using the Google Translate API
- Full Language: Full form of the language (for example: “English”) that represents the language of the post as detected using the Google Translate API
- Sentiment: Results of sentiment analysis (using the preprocessed version of each post) where each post was classified as positive, negative, or neutral
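Two of these attributes lend themselves to a quick illustration: the Date field in MM/DD/YYYY format and hashtags embedded in the Post Description. The sketch below parses the date and counts how many posts use each hashtag. The sample descriptions are hypothetical, and the `#(\w+)` pattern is an assumed approximation of what counts as a hashtag, not the method used in the paper.

```python
import re
from collections import Counter
from datetime import datetime

HASHTAG = re.compile(r"#(\w+)")


def parse_date(value):
    """Parse a Date attribute given in MM/DD/YYYY format."""
    return datetime.strptime(value, "%m/%d/%Y").date()


def top_hashtags(descriptions, n=10):
    """Count the number of posts that use each hashtag (case-insensitive)."""
    counts = Counter()
    for text in descriptions:
        # A set ensures a hashtag repeated within one post is counted once.
        counts.update({tag.lower() for tag in HASHTAG.findall(text)})
    return counts.most_common(n)


# Hypothetical post descriptions for illustration only.
posts = [
    "Stay home #covid19 #StaySafe",
    "Testing today #covid19 #covid19",
    "#staysafe everyone",
    "#COVID19 update",
]
top = top_hashtags(posts)
print(top)
print(parse_date("09/30/2024"))
```

Counting per post rather than per occurrence matches how the frequencies above are reported, as numbers of posts.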
Open Research Questions
This dataset is expected to be helpful for the investigation of the following research questions and even beyond:
All the Instagram posts that were collected during this data mining process to develop this dataset were publicly available on Instagram and did not require a user to log in to Instagram to view the same (at the time of writing this paper).