We learn high fidelity human depths by leveraging a collection of social media dance videos scraped from the TikTok mobile social networking application. It is by far one of the most popular video sharing applications across generations, which include short videos (10-15 seconds) of diverse dance challenges as shown above. We manually find more than 300 dance videos that capture a single person performing dance moves from TikTok dance challenge compilations for each month, variety, type of dances, which are moderate movements that do not generate excessive motion blur. For each video, we extract RGB images at 30 frame per second, resulting in more than 100K images. We segmented these images using Removebg application, and computed the UV coordinates from DensePose.
Download TikTok Dataset:
Please use the dataset only for the research purpose.
The dataset can be viewed and downloaded from the Kaggle page. (you need to make an account in Kaggle to be able to download the data. It is free!)
The dataset can also be downloaded from here (42 GB). The dataset resolution is: (1080 x 604)
The original YouTube videos corresponding to each sequence and the dance name can be downloaded from here (2.6 GB).
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is a dataset of videos and comments related to the invasion of Ukraine, published on TikTok by a number of users over the year of 2022. It was compiled by Benjamin Steel, Sara Parker and Derek Ruths at the Network Dynamics Lab, McGill University. We created this dataset to facilitate the study of TikTok, and the nature of social interaction on the platform relevant to a major political event.
The dataset has been released here on Zenodo: https://doi.org/10.5281/zenodo.7926959 as well as on Github: https://github.com/networkdynamics/data-and-code/tree/master/ukraine_tiktok
To create the dataset, we identified hashtags and keywords explicitly related to the conflict to collect a core set of videos (or ”TikToks”). We then compiled comments associated with these videos. All of the data captured is publically available information, and contains personally identifiable information. In total we collected approximately 16 thousand videos and 12 million comments, from approximately 6 million users. There are approximately 1.9 comments on average per user captured, and 1.5 videos per user who posted a video. The author personally collected this data using the web scraping PyTok library, developed by the author: https://github.com/networkdynamics/pytok.
Due to scraping duration, this is just a sample of the publically available discourse concerning the invasion of Ukraine on TikTok. Due to the fuzzy search functionality of the TikTok, the dataset contains videos with a range of relatedness to the invasion.
We release here the unique video IDs of the dataset in a CSV format. The data was collected without the specific consent of the content creators, so we have released only the data required to re-create it, to allow users to delete content from TikTok and be removed from the dataset if they wish. Contained in this repository are scripts that will automatically pull the full dataset, which will take the form of JSON files organised into a folder for each video. The JSON files are the entirety of the data returned by the TikTok API. We include a script to parse the JSON files into CSV files with the most commonly used data. We plan to further expand this dataset as collection processes progress and the war continues. We will version the dataset to ensure reproducibility.
To build this dataset from the IDs here:
pip install -e .
in the pytok directorypip install pandas tqdm
to install these libraries if not already installedget_videos.py
to get the video datavideo_comments.py
to get the comment datauser_tiktoks.py
to get the video history of the usershashtag_tiktoks.py
or search_tiktoks.py
to get more videos from other hashtags and search termsload_json_to_csv.py
to compile the JSON files into two CSV files, comments.csv
and videos.csv
If you get an error about the wrong chrome version, use the command line argument get_videos.py --chrome-version YOUR_CHROME_VERSION
Please note pulling data from TikTok takes a while! We recommend leaving the scripts running on a server for a while for them to finish downloading everything. Feel free to play around with the delay constants to either speed up the process or avoid TikTok rate limiting.
Please do not hesitate to make an issue in this repo to get our help with this!
The videos.csv
will contain the following columns:
video_id
: Unique video ID
createtime
: UTC datetime of video creation time in YYYY-MM-DD HH:MM:SS format
author_name
: Unique author name
author_id
: Unique author ID
desc
: The full video description from the author
hashtags
: A list of hashtags used in the video description
share_video_id
: If the video is sharing another video, this is the video ID of that original video, else empty
share_video_user_id
: If the video is sharing another video, this the user ID of the author of that video, else empty
share_video_user_name
: If the video is sharing another video, this is the user name of the author of that video, else empty
share_type
: If the video is sharing another video, this is the type of the share, stitch, duet etc.
mentions
: A list of users mentioned in the video description, if any
The comments.csv
will contain the following columns:
comment_id
: Unique comment ID
createtime
: UTC datetime of comment creation time in YYYY-MM-DD HH:MM:SS format
author_name
: Unique author name
author_id
: Unique author ID
text
: Text of the comment
mentions
: A list of users that are tagged in the comment
video_id
: The ID of the video the comment is on
comment_language
: The language of the comment, as predicted by the TikTok API
reply_comment_id
: If the comment is replying to another comment, this is the ID of that comment
The date can be compiled into a user interaction network to facilitate study of interaction dynamics. There is code to help with that here: https://github.com/networkdynamics/polar-seeds. Additional scripts for further preprocessing of this data can be found there too.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
In 2020, TikTok brought in $33.4 billion in revenue.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Top 1000 TikTok Influencers Ranking’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/prasertk/top-1000-tiktok-influencers-ranking on 13 February 2022.
--- Dataset description provided by original source is as follows ---
Find the top TikTok accounts.
What's inside is more than just rows and columns. Make it easy for others to get started by describing how you acquired the data and what time period it represents, too.
Data source: https://hypeauditor.com/top-tiktok/
--- Original source retains full ownership of the source dataset ---
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Interpretative struggles of global crises are increasingly being reflected on social media networks. TikTok is a relatively new social media platform that has achieved substantial popularity among young people in many parts of the world and is now being used to disseminate and make sense of information about the Russian invasion of Ukraine. Through a user-centered sampling approach, we collected 62 TikTok videos and conducted an in-depth qualitative analysis of them and their uploading profiles to explore how the war was being represented on the platform. Our analysis revealed a strong prevalence of remixing practices among content creators; that is, they recontextualise images, sounds and embodied self-performance within the platform-specific affordances of trends. We found that distant suffering is mediated through the emotive online self-performance of content creators, cuing their audiences toward appropriate emotional responses. Trending sounds situate videos within a singular-motif and context-diverse environment, facilitating what we theorize as affective audio networks.
The number of Instagram users in the United Kingdom was forecast to continuously increase between 2024 and 2028 by in total 2.1 million users (+7.02 percent). After the ninth consecutive increasing year, the Instagram user base is estimated to reach 32 million users and therefore a new peak in 2028. Notably, the number of Instagram users of was continuously increasing over the past years.User figures, shown here with regards to the platform instagram, have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period and count multiple accounts by persons only once.The shown data are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press and they are processed to generate comparable data sets (see supplementary notes under details for more information).
The number of LinkedIn users in the United Kingdom was forecast to continuously increase between 2024 and 2028 by in total 1.5 million users (+4.51 percent). After the eighth consecutive increasing year, the LinkedIn user base is estimated to reach 34.7 million users and therefore a new peak in 2028. User figures, shown here with regards to the platform LinkedIn, have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period and count multiple accounts by persons only once.The shown data are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press and they are processed to generate comparable data sets (see supplementary notes under details for more information).
The number of Pinterest users in the United Kingdom was forecast to continuously increase between 2024 and 2028 by in total 0.3 million users (+3.14 percent). After the ninth consecutive increasing year, the Pinterest user base is estimated to reach 9.88 million users and therefore a new peak in 2028. Notably, the number of Pinterest users of was continuously increasing over the past years.User figures, shown here regarding the platform pinterest, have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period and count multiple accounts by persons only once.The shown data are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press and they are processed to generate comparable data sets (see supplementary notes under details for more information).
Not seeing a result you expected?
Learn how you can add new datasets to our index.
We learn high fidelity human depths by leveraging a collection of social media dance videos scraped from the TikTok mobile social networking application. It is by far one of the most popular video sharing applications across generations, which include short videos (10-15 seconds) of diverse dance challenges as shown above. We manually find more than 300 dance videos that capture a single person performing dance moves from TikTok dance challenge compilations for each month, variety, type of dances, which are moderate movements that do not generate excessive motion blur. For each video, we extract RGB images at 30 frame per second, resulting in more than 100K images. We segmented these images using Removebg application, and computed the UV coordinates from DensePose.
Download TikTok Dataset:
Please use the dataset only for the research purpose.
The dataset can be viewed and downloaded from the Kaggle page. (you need to make an account in Kaggle to be able to download the data. It is free!)
The dataset can also be downloaded from here (42 GB). The dataset resolution is: (1080 x 604)
The original YouTube videos corresponding to each sequence and the dance name can be downloaded from here (2.6 GB).