The global number of YouTube users was forecast to increase continuously between 2024 and 2029 by a total of 232.5 million users (+24.91 percent). After a ninth consecutive year of growth, the YouTube user base is estimated to reach 1.2 billion users in 2029, a new peak. Notably, the number of YouTube users has increased continuously over the past years. User figures, shown here for the platform YouTube, have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information). Find more key insights on the number of YouTube users in regions such as Africa and South America.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Over the past few years, YouTube has become a popular site for broadcasting videos and earning money by publishing videos that showcase a wide variety of skills. For some people it has become their main source of income. Getting videos to trend among viewers is one of the major goals of every content creator. The popularity of any video and its reach to the audience depend entirely on YouTube's recommendation algorithm. This document is a dataset descriptor for a dataset collected over a time span of about 45 days during the Israel-Hamas War.
This data set was prepared from 88 open-source YouTube cooking videos. The YouCook dataset contains videos of people cooking various recipes. The videos were downloaded from YouTube and are all in the third-person viewpoint; they represent a significantly more challenging visual problem than existing cooking and kitchen datasets (the background kitchen/scene differs across many videos, and most videos have dynamic camera changes). In addition, frame-by-frame object and action annotations are provided for the training data (as well as a number of precomputed low-level features). Finally, each video has a number of human-provided natural language descriptions (on average, there are eight different descriptions per video). This dataset has been created to serve as a benchmark for describing complex real-world videos with natural language descriptions.
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
When modelling social phenomena, we need to consider more than one medium. Little is known about how platform community characteristics shape discussion and how communicators could best engage each community, taking these characteristics into consideration. In this dataset, we consider comments on TED videos featuring roboticists, shared at TED.com and on YouTube. The textual comments were analysed with the Linguistic Inquiry and Word Count (LIWC) tool.
https://creativecommons.org/publicdomain/zero/1.0/
I have created this dataset for people interested in League of Legends who want to approach the game from a more analytical side.
Most of the data was acquired from Games of Legends (https://gol.gg/tournament/tournament-stats/LEC%20Spring%20Season%202024/) and from the official YouTube channel of the League of Legends EMEA Championship (https://www.youtube.com/c/LEC).
Attribution-ShareAlike 4.0 (CC BY-SA 4.0)https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
We present an English YouTube dataset manually annotated for hate speech types and targets. The comments to be annotated were sampled from English YouTube comments on videos about the Covid-19 pandemic posted between January 2020 and May 2020. The following sets were annotated: a training set with 51,655 comments (IMSyPP_EN_YouTube_comments_train.csv) and two evaluation sets, one annotated in-context (IMSyPP_EN_YouTube_comments_evaluation_context.csv) and one out-of-context (IMSyPP_EN_YouTube_comments_evaluation_no_context.csv), each based on the same 10,759 comments. The dataset was annotated by 10 annotators, with most (99.9%) of the comments being annotated by two annotators. It was used to train a classification model for hate speech type detection that is publicly available at the following URL: https://huggingface.co/IMSyPP/hate_speech_en.
The dataset consists of the following fields:
Video_ID - YouTube ID of the video under which the comment was posted
Comment_ID - YouTube ID of the comment
Text - text of the comment
Type - type of hate speech
Target - target of the hate speech
Annotator - code of the human annotator
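As a rough sketch of how the released classifier might be queried (assuming the model linked above loads with the standard Hugging Face transformers text-classification pipeline; the example comments are made up and the output labels are whatever the model card defines):

```python
# Minimal sketch: querying the publicly released hate speech type classifier.
# Assumption: the model loads with the standard text-classification pipeline.
from transformers import pipeline

classifier = pipeline("text-classification", model="IMSyPP/hate_speech_en")

comments = [
    "This video explains the situation really well.",
    "People like you should not be allowed to post here.",
]
for comment in comments:
    result = classifier(comment)[0]
    print(f"{result['label']!r} ({result['score']:.2f}): {comment}")
```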
YouTube-BoundingBoxes (YT-BB) is a large-scale data set of video URLs with densely-sampled object bounding box annotations. The data set consists of approximately 380,000 video segments about 19s long, automatically selected to feature objects in natural settings without editing or post-processing, with a recording quality often akin to that of a hand-held cell phone camera. The objects represent a subset of the MS COCO label set. All video segments were human-annotated with high-precision classification labels and bounding boxes at 1 frame per second.
This dataset provides estimated YouTube RPM (Revenue Per Mille) ranges for different niches in 2025, based on ad revenue earned per 1,000 monetized views.
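For reference, a minimal sketch of the RPM calculation the dataset is based on (the figures are illustrative, not values taken from the dataset):

```python
# RPM (Revenue Per Mille) = ad revenue earned per 1,000 monetized views.
def rpm(ad_revenue_usd: float, monetized_views: int) -> float:
    """Return ad revenue per 1,000 monetized views."""
    return ad_revenue_usd / monetized_views * 1000

print(rpm(ad_revenue_usd=480.0, monetized_views=120_000))  # 4.0 USD RPM
```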
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
News dissemination plays a vital role in helping people adopt beneficial actions during public health emergencies, thereby significantly reducing the adverse influences of such events. Based on big data from YouTube, this study takes the declaration of the COVID-19 National Public Health Emergency (PHE) as the event of interest and employs a difference-in-differences (DiD) model to investigate the effect of the PHE on the news dissemination strength of relevant videos. The findings indicate that views, comments, and likes on relevant videos significantly increased during the COVID-19 public health emergency. Moreover, the public’s response to the PHE was rapid, with the highest growth in comments and views on videos observed within the first week of the public health emergency, followed by a gradual decline and a return to normal levels within four weeks. In addition, during the COVID-19 public health emergency, across different types of media, lifestyle bloggers, local media, and institutional media demonstrated higher growth in the news dissemination strength of relevant videos than news & political bloggers, foreign media, and personal media, respectively. Further, the audience attracted by related news tends to display a certain level of stickiness; this audience may therefore subscribe to these channels during public health emergencies, which confirms the incentive mechanisms of social media platforms to foster relevant news dissemination during public health emergencies. The proposed findings provide essential insights into effective news dissemination in potential future public health events.
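A minimal sketch of the kind of difference-in-differences specification described above (the column names and toy data are assumptions for illustration, not the study's actual variables):

```python
# Toy difference-in-differences (DiD) sketch: effect of the PHE declaration on views.
# "relevant" marks COVID-related videos, "post_phe" marks the period after the
# declaration; the coefficient on relevant:post_phe is the DiD estimate.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.DataFrame({
    "views":    [1200, 1250, 3100, 3300, 900, 950, 1000, 980],
    "relevant": [1, 1, 1, 1, 0, 0, 0, 0],
    "post_phe": [0, 0, 1, 1, 0, 0, 1, 1],
})

model = smf.ols("views ~ relevant * post_phe", data=df).fit()
print(model.summary().tables[1])
```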
The Kinetics dataset is a large-scale, high-quality dataset for human action recognition in videos. The dataset consists of around 500,000 video clips covering 600 human action classes with at least 600 video clips for each action class. Each video clip lasts around 10 seconds and is labeled with a single action class. The videos are collected from YouTube.
Researchers mostly use the dataset launched for ChaLearn Looking At People First Impression Challenge (ECCV Challenge). The CVPR’17 dataset (an extension to the ECCV challenge dataset) consists of video files labelled with Big Five Personality Traits. The dataset consists of 3,000 high-definition YouTube videos featuring YouTubers speaking in English. To create the dataset, selected videos were divided into 10,000 clips with an average duration of 15 seconds. The dataset comprises three sets for training, validation, and testing, with a ratio of 3:1:1. The videos feature YouTubers from various nationalities, genders, and age groups. To label the videos with Big-Five personality traits, Amazon Mechanical Turk (AMT) was used. Each video clip was assigned a label corresponding to its Big-Five values, which range from 0 to 1.
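As a simple illustration of the 3:1:1 split described above (the clip identifiers below are placeholders, not the dataset's actual IDs):

```python
# 10,000 clips split 3:1:1 into train/validation/test (6,000 / 2,000 / 2,000).
import random

clip_ids = [f"clip_{i:05d}" for i in range(10_000)]  # placeholder identifiers
random.seed(0)
random.shuffle(clip_ids)

train, val, test = clip_ids[:6000], clip_ids[6000:8000], clip_ids[8000:]
print(len(train), len(val), len(test))  # 6000 2000 2000
```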
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dataset Card for Dataset Name
A data set of images of faces of people affected with Bell's palsy (Facial palsy).
Dataset Details
Dataset Description
A data set of images of faces of people affected by Bell's palsy (facial palsy), created by curating and editing publicly available YouTube videos. Also included are images of people not affected by it, collected using the same method.
License: CC-BY-4.0
Uses
Can be used to train image models to detect… See the full description on the dataset page: https://huggingface.co/datasets/jasir/palsynet-data.
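If the images are published as a standard Hugging Face dataset, loading them might look like the sketch below (split and column names are assumptions; check the dataset page linked above):

```python
# Minimal sketch: loading the dataset from the Hugging Face Hub.
from datasets import load_dataset

ds = load_dataset("jasir/palsynet-data")   # dataset id taken from the page above
print(ds)                                  # available splits and features
first_split = list(ds.keys())[0]
print(ds[first_split][0].keys())           # e.g. an image column plus a label
```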
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Content
The dataset contains two basic attributes from which you can extract a range of interesting features, from DateTime-based features to text-based features.
The first is the time in the video at which the comment was posted; it is important to note that the live stream started at 2:15 EST.
The second is the comment that was posted; note that non-English comments were removed.
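A minimal sketch of turning the in-video comment time into an EST wall-clock time, assuming the 2:15 start refers to 2:15 PM EST on landing day and that the columns are named video_time and comment (both assumptions):

```python
# Convert an offset into the live stream to an EST wall-clock timestamp.
from datetime import datetime, timedelta

import pandas as pd

# Assumption: stream start interpreted as 2:15 PM EST, 18 Feb 2021 (landing day).
stream_start = datetime(2021, 2, 18, 14, 15)

df = pd.DataFrame({
    "video_time": ["0:05:30", "1:12:04"],   # hours:minutes:seconds into the stream
    "comment": ["Go Percy!", "Touchdown confirmed!"],
})

def to_est(offset: str) -> datetime:
    h, m, s = (int(x) for x in offset.split(":"))
    return stream_start + timedelta(hours=h, minutes=m, seconds=s)

df["est_time"] = df["video_time"].map(to_est)
print(df)
```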
Inspiration
I think it might be interesting to get a better understanding of how people around the world reacted to the rover landing on Mars and the content shown in the video. There were many points where the video lagged, or the site crashed.
CC0
Original Data Source: Perseverance Land on Mars YouTube Live Comments
The ShareGPT4Video dataset is a large-scale resource designed to improve video understanding and generation¹. It features 1.2 million highly descriptive captions⁴ for video clips, surpassing existing datasets in diversity and information content⁴. The captions cover a wide range of aspects, including world knowledge, object properties, spatial relationships, and aesthetic evaluations⁴.
The dataset includes detailed captions of 40K videos generated by GPT-4V¹ and 4.8M videos generated by ShareCaptioner-Video¹. The videos are sourced from YouTube and other user-uploaded video websites, and they cover a variety of scenarios, such as human activities and auto-driving¹.
The ShareGPT4Video dataset also provides the basis for ShareCaptioner-Video, an exceptional video captioner capable of efficiently generating high-quality captions for videos with a wide range of resolutions, aspect ratios, and durations¹.
For example, the dataset includes a detailed caption of a video documenting a meticulous meal preparation by an individual with tattooed forearms¹. The caption describes the individual's actions in detail, from slicing a cucumber to mixing the dressing and adding croutons to the salad¹.
In addition to its use in research, the ShareGPT4Video dataset has been used to train the sharegpt4video-8b model, an open-source video chatbot². This model was trained on open-source video instruction data and is primarily intended for researchers and hobbyists in computer vision, natural language processing, machine learning, and artificial intelligence².
(1) arXiv:2406.04325v1 [cs.CV] 6 Jun 2024. https://arxiv.org/pdf/2406.04325. (2) ShareGPT4V: Improving Large Multi-Modal Models with Better Captions. https://arxiv.org/abs/2311.12793. (3) Lin-Chen/sharegpt4video-8b · Hugging Face. https://huggingface.co/Lin-Chen/sharegpt4video-8b. (4) ShareGPT4Video: Improving Video Understanding and Generation with .... https://www.aimodels.fyi/papers/arxiv/sharegpt4video-improving-video-understanding-generation-better-captions. (5) GitHub - ShareGPT4Omni/ShareGPT4Video: An official implementation of .... https://github.com/ShareGPT4Omni/ShareGPT4Video. (6) ShareGPT4Video project page. https://sharegpt4video.github.io/.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Concise comparison of the top 10 YouTube alternatives for content creators in 2025. Covers monetization, audience size, and ideal use cases.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
RealVAD: A Real-world Dataset for Voice Activity Detection
The task of automatically detecting “Who is Speaking and When” is broadly known as Voice Activity Detection (VAD). Automatic VAD is a very important task and the foundation of several domains, e.g., human-human, human-computer/robot/virtual-agent interaction analyses, and industrial applications.
The RealVAD dataset is constructed from a YouTube video of a panel discussion lasting approximately 83 minutes. The audio is available from a single channel. A single static camera captures all panelists, the moderator, and the audience.
Particular aspects of the RealVAD dataset are:
It is composed of panelists with different nationalities (British, Dutch, French, German, Italian, American, Mexican, Colombian, Thai). This aspect allows studying the effect of ethnic-origin variety on automatic VAD.
There is a gender balance such that there are four female and five male panelists.
The panelists are sitting in two rows, and they can be gazing at the audience, other panelists, their laptops, the moderator, or anywhere in the room while speaking or not speaking. Therefore, they were captured not only from a frontal view but also from side views, varying with their instantaneous posture and head orientation.
The panelists move freely and perform various spontaneous actions (e.g., drinking water, checking their cell phones, using their laptops), resulting in different postures.
The panelists’ body parts are sometimes partially occluded by their own or others' body parts or belongings (e.g., a laptop).
There are also natural changes in illumination and shadows cast on the wall behind the panelists in the back row.
In particular, for the panelists sitting in the front row, there is sometimes background motion when the person(s) behind them move.
The annotations include:
Upper-body detections of the nine panelists in bounding-box form.
Associated VAD ground-truth (speaking, not-speaking) for nine panelists.
Acoustic features extracted from the video: MFCC and raw filterbank energies (a brief extraction sketch is shown below).
All information regarding the annotations is given in the ReadMe.txt and Acoustic Features README.txt files.
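As an illustration of how MFCCs and filterbank energies of the kind provided could be recomputed from the audio track (a sketch using librosa; the file name, sampling rate, and feature parameters are assumptions, not the dataset's exact settings):

```python
# Minimal sketch: MFCCs and mel filterbank energies from the panel-discussion audio.
import librosa

y, sr = librosa.load("panel_discussion.wav", sr=16_000)        # hypothetical file name
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)             # (13, n_frames)
fbank = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=40)  # raw filterbank energies
print(mfcc.shape, fbank.shape)
```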
When using this dataset for your research, please cite the following paper in your publication:
C. Beyan, M. Shahid and V. Murino, "RealVAD: A Real-world Dataset and A Method for Voice Activity Detection by Body Motion Analysis", in IEEE Transactions on Multimedia, 2020.
https://researchdata.ntu.edu.sg/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.21979/N9/5G18B1
Existing research on avatar creation is typically limited to laboratory datasets, which are costly to scale and insufficiently representative of the real world. On the other hand, the web abounds with off-the-shelf real-world human videos, but these videos vary in quality and require accurate annotations for avatar creation. To this end, we propose an automatic annotation pipeline with filtering protocols to curate these humans from the web. Our pipeline surpasses state-of-the-art methods on the EMDB benchmark, and the filtering protocols boost verification metrics on web videos. We then curate WildAvatar, a web-scale in-the-wild human avatar creation dataset extracted from YouTube, with 10,000+ different human subjects and scenes. WildAvatar is at least 10x richer than previous datasets for 3D human avatar creation and closer to the real world. To explore its potential, we demonstrate the quality and generalizability of avatar creation methods on WildAvatar. We will publicly release our code, data source links and annotations to push forward 3D human avatar creation and other related fields for real-world applications.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
To enable the development of an objective, real-time CPR (cardiopulmonary resuscitation) quality assessment system using object detection, this specialized dataset was created. It was manually compiled and annotated from existing open-source datasets and from screenshots extracted from YouTube video segments, focusing on critical elements such as people, hands, manikins and the CPR action.
The dataset includes five classes representing essential hands-only CPR process components: "CPR_Massage", "man", "Dummy", "hands". These objects were determined to be the main parts of tracking CPR actions, giving a model access to enough contextual data for evaluating the quality of the CPR performed using computer vision.
The purpose of creating this dataset was to provide a targeted, clinically relevant resource for training deep learning models capable of recognizing critical CPR objects. Publicly available datasets often lack detailed medical context, do not focus on CPR-specific objects, or lack sufficient variety and quality in the performed CPR. As a result, this dataset fills an important gap, especially for developing automated medical quality control systems that focus on metrics like CPM (compressions per minute), compression depth and pause time.
Note: This dataset is not meant for the assessment of posture and hand placement accuracy. Even though this is one of the biggest datasets for hands-only CPR object detection, it still lacks enough contextual data for detection under new angles, distances and image distortions.
Overview
This data set consists of links to social network items for 34 different forensic events that took place between August 14th, 2018 and January 6th, 2021. The majority of the text and images are from Twitter (a minor part is from Flickr, Facebook and Google+), and every video is from YouTube.
Data Collection
We used Social Tracker, along with the social media APIs, to gather most of the collections. For a minor part, we used Twint. In both cases, we provided keywords related to the event to receive the data. It is important to mention that, in procedures like this one, usually only a small fraction of the collected data is in fact related to the event and useful for a further forensic analysis.
Content
We have data from 34 events, and for each of them we provide the following files (a loading sketch follows below):
items_full.csv: Contains links to every social media post that was collected.
images.csv: Lists the collected images. In some files there is a field called "ItemUrl" that refers to the social network post (e.g., a tweet) that mentions that media.
video.csv: URLs of YouTube videos that were gathered about the event.
video_tweet.csv: Contains IDs of tweets and IDs of YouTube videos. A tweet whose ID is in this file has a video in its content. In turn, the link of a YouTube video whose ID is in this file was mentioned by at least one collected tweet. Only two collections have this file.
description.txt: Contains some standard information about the event, and possibly some comments about any specific issue related to it.
In fact, most of the collections do not have all the files above. This is due to changes in our collection procedure over the course of this work.
Events
We divided the events into six groups:
Fire: Devastating fire is the main issue of the event, so most of the informative pictures show flames or burned constructions. 14 events.
Collapse: Most of the relevant images depict collapsed buildings, bridges, etc. (not caused by fire). 5 events.
Shooting: Likely images of guns and police officers. Little or no destruction of the environment. 5 events.
Demonstration: Plethora of people on the streets. Possibly some problem took place, but in most cases the demonstration is the actual event. 7 events.
Collision: Traffic collision. Pictures of damaged vehicles in an urban landscape. Possibly there are images with victims on the street. 1 event.
Flood: Events that range from fierce rain to a tsunami. Many pictures depict water. 2 events.
Media Content
Due to the terms of use of the social networks, we do not make the collected texts, images, and videos publicly available. However, we can provide some extra media content related to one (or more) events; please contact the authors.
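A minimal loading sketch for the per-event files listed above (the folder layout and any columns other than "ItemUrl" are assumptions; check each collection's description.txt):

```python
# Load the CSV files of one event collection with pandas.
from pathlib import Path

import pandas as pd

event_dir = Path("events/some_event")               # hypothetical path to one collection

items = pd.read_csv(event_dir / "items_full.csv")   # links to all collected posts
images = pd.read_csv(event_dir / "images.csv")      # collected images (may have ItemUrl)
videos = pd.read_csv(event_dir / "video.csv")       # YouTube video URLs for the event

print(len(items), len(images), len(videos))
```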
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A structured dataset comparing viral view thresholds and timeframes across major platforms, including TikTok, YouTube (long-form & Shorts), Instagram Reels, Facebook, Twitter (X), LinkedIn Video, and LinkedIn Posts.