https://creativecommons.org/publicdomain/zero/1.0/
YouTube was created in 2005, with the first video, Me at the Zoo, being uploaded on 23 April 2005. Since then, 1.3 billion people have set up YouTube accounts. As of 2018, people watch nearly 5 billion videos each day and upload 300 hours of video to the site every minute.
According to 2016 research undertaken by Pexeso, music accounts for only 4.3% of YouTube’s content, yet it draws 11% of the views. Clearly, an awful lot of people watch a comparatively small number of music videos. It should be no surprise, therefore, that the most watched videos of all time on YouTube are predominantly music videos.
On August 13, BTS became the most-viewed artist in YouTube history, accumulating over 26.7 billion views across all their official channels. This count includes all music videos and dance practice videos.
Justin Bieber and Ed Sheeran now hold the records for second and third-highest views, with over 26 billion views each.
Currently, BTS’s most viewed videos are their music videos for “**Boy With Luv**,” “**Dynamite**,” and “**DNA**,” which all have over 1.4 billion views.
Headers of the Dataset
Total = Total views (in millions) across all official channels
Avg = Current daily average of all videos combined
100M = Number of videos with more than 100 million views
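As a minimal sketch of how the three headers above might be read, the snippet below parses a tiny in-memory CSV. Only the column names `Total`, `Avg` and `100M` come from the description; the `Artist` column name, the file layout, and every value in the sample rows are placeholders invented for illustration, not real figures.

```python
import csv
import io

# Placeholder rows for illustration only; these are NOT real view counts.
# The "Artist" column name is an assumption; Total/Avg/100M follow the description.
SAMPLE_CSV = """Artist,Total,Avg,100M
Artist A,1200.5,3.2,4
Artist B,800.0,1.1,1
"""


def load_rows(text: str) -> list[dict]:
    """Parse the CSV text and convert the three documented columns to numbers."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        rows.append({
            "artist": row["Artist"],
            "total_views_millions": float(row["Total"]),  # Total: views in millions
            "daily_avg": float(row["Avg"]),                # Avg: current daily average
            "videos_over_100m": int(row["100M"]),          # 100M: videos above 100M views
        })
    return rows


rows = load_rows(SAMPLE_CSV)
most_viewed = max(rows, key=lambda r: r["total_views_millions"])
print(most_viewed["artist"])
```

If the real file formats numbers with thousands separators, the `float` and `int` conversions would need an extra cleaning step.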
The global number of YouTube users was forecast to increase continuously between 2024 and 2029 by a total of ***** million users (+***** percent). After the ninth consecutive year of growth, the YouTube user base is estimated to reach *** billion users, a new peak, in 2029. Notably, the number of YouTube users has increased continuously over the past years. User figures, shown here for the platform YouTube, have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to *** countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information). Find more key insights for the number of YouTube users in regions like Africa and South America.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Over the past few years, YouTube has become a popular site for broadcasting video and for earning money by showcasing various skills in video form. For some people it has become a main source of income. Getting videos to trend among viewers is a major goal for every content creator. The popularity of a video and its reach are largely determined by YouTube's recommendation algorithm. This document is a dataset descriptor for the dataset collected over a time span of about 45 days during the Israel-Hamas War.
The number of YouTube users in India was forecast to increase continuously between 2024 and 2029 by a total of 222.2 million users (+34.88 percent). After the ninth consecutive year of growth, the YouTube user base is estimated to reach 859.26 million users, a new peak, in 2029. Notably, the number of YouTube users has increased continuously over the past years. User figures, shown here for the platform YouTube, have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information). Find more key insights for the number of YouTube users in countries like Sri Lanka and Nepal.
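The quoted figures can be cross-checked with simple arithmetic. The sketch below assumes the percentage is measured against the 2024 base, which the description does not state explicitly; the function names are illustrative.

```python
def implied_baseline(target_millions: float, increase_millions: float) -> float:
    """Return the starting user base implied by a target and an absolute increase."""
    return target_millions - increase_millions


def percent_growth(baseline_millions: float, increase_millions: float) -> float:
    """Return the growth relative to the baseline, in percent."""
    return increase_millions / baseline_millions * 100


# Figures quoted in the description above (millions of users, India).
target_2029 = 859.26
increase_2024_2029 = 222.2

baseline_2024 = implied_baseline(target_2029, increase_2024_2029)
growth_pct = percent_growth(baseline_2024, increase_2024_2029)

print(f"Implied 2024 base: {baseline_2024:.2f} million")
print(f"Growth 2024-2029: {growth_pct:.2f}%")
```

Under that assumption the implied 2024 base is about 637.06 million users, and the growth rounds to the quoted 34.88 percent.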
https://creativecommons.org/publicdomain/zero/1.0/
The data was collected from the BBC YouTube channel using YouTube Data Tools. It contains information about all videos from this channel, starting in 2007.
Using YouTube Data Tools, one can access the metadata for YouTube channels, videos, comments, and upvotes.
Use this dataset to analyze the impact of these videos by looking at views, likes, dislikes, favorites, and comments. Try to work out from the video descriptions whether some subjects have a larger impact. Factor in the "age" of each video, since the dataset collects video metadata starting from 2007.
This data set was prepared from 88 open-source YouTube cooking videos. The YouCook dataset contains videos of people cooking various recipes. The videos were downloaded from YouTube and are all in the third-person viewpoint; they represent a significantly more challenging visual problem than existing cooking and kitchen datasets (the background kitchen/scene is different for many and most videos have dynamic camera changes). In addition, frame-by-frame object and action annotations are provided for training data (as well as a number of precomputed low-level features). Finally, each video has a number of human provided natural language descriptions (on average, there are eight different descriptions per video). This dataset has been created to serve as a benchmark in describing complex real-world videos with natural language descriptions.
I was finding a specific dataset but never got one.
This is a text dataset focussing on the top comments on the best youtube videos (views>1B)
I wanna thank youtube api for helping me, lol and mongo db where I stored all the raw data.
I shared this dataset to see how the world will react and what will people do with this dataset. I hope this helps me learn more about NLP and ML
https://cdla.io/sharing-1-0/
Data is from the YouTube channel "TED".
The data was scraped on March 16th, 2024 through an API, by fetching the channel's ID with Node.js code.
"The TED Talks channel features the best talks and performances from the TED Conference, where the world's leading thinkers and doers give the talk of their lives in 18 minutes (or less). Look for talks on Technology, Entertainment and Design -- plus science, business, global issues, the arts and more. You're welcome to link to or embed these videos, forward them to others and share these ideas with people you know." - Information from the Ted talks - Youtube page https://www.youtube.com/@TED
Deleted columns: "channelId", "publishedAt", "position", "duration", "dimension", "definition", "defaultLanguage", "thumbnail_maxres", "licensedContent", "locationDescription", "latitude", "longitude", "dislikeCount", "favoriteCount"
Split column publishedAtSQL into Date (release_date) and Time (release_time).
Changed durationSec - duration of video in seconds - to duration - duration of video mm:ss.
Split information in "Title" into "Title" of episode and "Speaker".
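The cleaning steps above can be sketched as follows. This is a hedged illustration rather than the author's actual code (which used Node.js): the timestamp format `YYYY-MM-DD HH:MM:SS`, the ` | ` separator between title and speaker, and all function names are assumptions.

```python
from datetime import datetime


def split_published_at(published_at_sql: str) -> tuple[str, str]:
    """Split an SQL-style timestamp into (release_date, release_time)."""
    # Assumed input format: "YYYY-MM-DD HH:MM:SS".
    parsed = datetime.strptime(published_at_sql, "%Y-%m-%d %H:%M:%S")
    return parsed.strftime("%Y-%m-%d"), parsed.strftime("%H:%M:%S")


def seconds_to_mmss(duration_sec: int) -> str:
    """Format a duration in seconds as mm:ss (minutes may exceed 59)."""
    minutes, seconds = divmod(duration_sec, 60)
    return f"{minutes:02d}:{seconds:02d}"


def split_title(raw_title: str, separator: str = " | ") -> tuple[str, str]:
    """Split a raw video title into (title, speaker).

    The separator is an assumption; a title without it keeps an empty speaker.
    """
    title, found, speaker = raw_title.partition(separator)
    if not found:
        return raw_title.strip(), ""
    return title.strip(), speaker.strip()


# Hypothetical row, used only to exercise the helpers.
release_date, release_time = split_published_at("2024-03-16 09:05:30")
duration = seconds_to_mmss(754)
title, speaker = split_title("A hypothetical talk | Jane Doe")
print(release_date, release_time, duration, title, speaker)
```

Keeping each transformation in its own small function makes it easy to swap in the real separator or timestamp format once they are known.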
Overview
This data set consists of links to social network items for 34 different forensic events that took place between August 14th, 2018 and January 06th, 2021. The majority of the text and images are from Twitter (a minor part is from Flickr, Facebook and Google+), and every video is from YouTube.
Data Collection
We used Social Tracker, along with the social media platforms' APIs, to gather most of the collections. For a minor part, we used Twint. In both cases, we provided keywords related to the event to receive the data. It is important to mention that, in procedures like this one, usually only a small fraction of the collected data is in fact related to the event and useful for further forensic analysis.
Content
We have data from 34 events, and for each of them we provide the following files:
items_full.csv: Contains links to every social media post that was collected.
images.csv: Lists the images collected. In some files there is a field called "ItemUrl" that refers to the social network post (e.g., a tweet) that mentions that media item.
video.csv: URLs of YouTube videos that were gathered about the event.
video_tweet.csv: Contains IDs of tweets and IDs of YouTube videos. A tweet whose ID is in this file has a video in its content. In turn, the link of a YouTube video whose ID is in this file was mentioned by at least one collected tweet. Only two collections have this file.
description.txt: Contains some standard information about the event, and possibly some comments about any specific issue related to it.
In fact, most of the collections do not have all the files above. This is due to changes in our collection procedure over the course of this work.
Events
We divided the events into six groups:
Fire (14 events): Devastating fire is the main issue of the event, therefore most of the informative pictures show flames or burned constructions.
Collapse (5 events): Most of the relevant images depict collapsed buildings, bridges, etc. (not caused by fire).
Shooting (5 events): Likely images of guns and police officers. Little or no destruction of the environment.
Demonstration (7 events): Plethora of people on the streets. Possibly some problem took place during it, but in most cases the demonstration is the actual event.
Collision (1 event): Traffic collision. Pictures of damaged vehicles in an urban landscape. Possibly there are images with victims on the street.
Flood (2 events): Events that range from fierce rain to a tsunami. Many pictures depict water.
Media Content
Due to the terms of use of the social networks, we do not make the collected texts, images and videos publicly available. However, we can provide some extra media content related to one (or more) events upon request to the authors.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
If you have to have a G+ account (for YouTube, location services, or other reasons), here's how you can make it totally private. No one will be able to add you, send you spammy links, or otherwise annoy you. You need to visit the "Audience Settings" page: https://plus.google.com/u/0/settings/audience You can then set a "custom audience". Usually you would use this to restrict your account to people from a specific geographic location or within a specific age range. In this case, we're going to choose a custom audience of "No-one". Check the box and hit save. Now, when people try to visit your Google+ profile, they'll see this "restricted" message. You can visit my G+ profile if you want to see this working (https://plus.google.com/114725651137252000986). If anything is unclear, you can follow this website: http://www.livehuntz.com/google-plus/support-phone-number
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The COVYT dataset contains speech samples from individuals who self-reported their COVID-19 infection on public social media platforms (YouTube, Xiaohongshu). These videos, as well as accompanying videos of the same people prior to infection, were mined in an attempt to gather publicly-available data for COVID-19 research. This release includes the links to the original videos along with the accompanying manual segmentation and diarisation that identifies the utterances of the target individuals. We are additionally releasing features derived from the segmented utterances. Finally, the dataset includes partitioning information according to 4 different cross-validation schemes. See the arxiv pre-print for more details: https://arxiv.org/abs/2206.11045
The number of YouTube users in Europe was forecast to increase continuously between 2024 and 2029 by a total of 7.8 million users (+3.61 percent). After the ninth consecutive year of growth, the YouTube user base is estimated to reach 223.61 million users, a new peak, in 2029. Notably, the number of YouTube users has increased continuously over the past years. User figures, shown here for the platform YouTube, have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information). Find more key insights for the number of YouTube users in regions like North America and Australia & Oceania.
VLEP contains 28,726 future event prediction examples (along with their rationales) from 10,234 diverse TV Show and YouTube Lifestyle Vlog video clips. Each example (see Figure 1) consists of a Premise Event (a short video clip with dialogue), a Premise Summary (a text summary of the premise event), and two potential natural language Future Events (along with Rationales) written by people. These clips are on average 6.1 seconds long and are harvested from diverse event-rich sources, i.e., TV show and YouTube Lifestyle Vlog videos.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
The video dataset comprises 65 videos that consist of both anomalous and normal video segments. These videos are temporally annotated by human annotators for anomalous and normal segments. Annotations are cross-validated by a different person who was not part of the annotators' group; this is done to reduce human error. Annotation data for each video is represented as a set of frame ranges that contain anomalous segments; frames that fall outside these ranges are considered normal video segments.
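The frame-range representation described above can be expanded into per-frame labels. The following is a minimal sketch, assuming ranges are inclusive and 0-indexed; the dataset's actual indexing convention and file format are not stated here, and the function name is illustrative.

```python
def frame_labels(num_frames: int, anomalous_ranges: list[tuple[int, int]]) -> list[int]:
    """Return one label per frame: 1 = anomalous, 0 = normal.

    Ranges are assumed to be inclusive and 0-indexed.
    """
    labels = [0] * num_frames
    for start, end in anomalous_ranges:
        # Clamp to the video length so an over-long range cannot index out of bounds.
        for frame in range(max(start, 0), min(end, num_frames - 1) + 1):
            labels[frame] = 1
    return labels


# Hypothetical 10-frame video with two anomalous segments.
labels = frame_labels(10, [(2, 4), (7, 8)])
print(labels)  # [0, 0, 1, 1, 1, 0, 0, 1, 1, 0]
```

Every frame not covered by a range stays at 0, which mirrors the rule that frames outside the annotated ranges are treated as normal.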
To ensure diversification in terms of location and people, the data for both image and video formats have been collected manually from the internet. Mostly, multimedia sharing platforms such as YouTube, Kaotic, Dailymail, Itemfix, leakedreality, GettyImages, and Shutterstock are leveraged as sources. Collection from internet sources is done with the help of multiple text-based search queries that are slightly varied in terms of vocabulary and language such as "atm robbery", "atm theft", "atm chori", and "atm Diebstahl". Genuine ATM-based data on the internet is meager, so this approach of search and collection has mitigated the challenge to some extent. To prepare a high-quality dataset, certain conditions are imposed during the collection process such as: avoiding shaky, overly labeled videos/images, and videos that are compiled.
In 2021, YouTube's user base in Vietnam amounted to approximately ***** million users. The number of YouTube users in Vietnam is projected to reach ***** million users by 2025. User figures have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to *** countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information).
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
The IdiapVideoAge dataset is a set of YouTube video IDs with age labels, intended to facilitate research in audio-visual age verification with a focus on detecting people below 18 years old. The dataset contains 4260 IDs of YouTube videos that come from two existing video databases: VoxCeleb2 and the child speech dataset from Google. Our main contribution is the age labels of the people in the videos. Three different human annotators were used for labeling. They were instructed to give a valid age label if a person's face is visible in more than 80% of a video's frames and it is clear that the audible speech matches the person in the video. As the age label, we used the average of the three annotators. Out of the total 4260 videos, 1973 videos are of minors below 18 years old.
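The labeling rule described above (average of three annotators, minors defined as below 18) can be sketched as follows. The function names and the sample annotations are hypothetical, and whether the minor flag is derived from the averaged label in exactly this way is an assumption.

```python
from statistics import mean

MINOR_AGE_THRESHOLD = 18  # "below 18 years old", per the description


def age_label(annotations: list[float]) -> float:
    """Average the annotators' age estimates into a single label."""
    return mean(annotations)


def is_minor(age: float, threshold: float = MINOR_AGE_THRESHOLD) -> bool:
    """Return True when the averaged age is strictly below the threshold."""
    return age < threshold


# Hypothetical estimates from three annotators for one video.
label = age_label([16.0, 17.0, 19.0])
print(round(label, 2), is_minor(label))
```

Note that averaging can place a video on either side of the threshold even when annotators disagree, as in the example where one annotator estimated 19.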
Reference
If you use this dataset, please cite the following publication:
Pavel Korshunov and Sebastien Marcel, "Face Anthropometry Aware Audio-visual Age Verification", ACM Multimedia international conference (MM'22), October 2022.
https://publications.idiap.ch/index.php/publications/show/4862
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Accident Detection Model is built using YOLOv8, Google Colab, Python, Roboflow, deep learning, OpenCV, machine learning, and artificial intelligence. It can detect an accident from a live camera feed, an image, or a video. The model is trained on a dataset of 3200+ images, which were annotated on Roboflow.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Concise comparison of the top 10 YouTube alternatives for content creators in 2025. Covers monetization, audience size, and ideal use cases.
We learn high fidelity human depths by leveraging a collection of social media dance videos scraped from the TikTok mobile social networking application. TikTok is one of the most popular video sharing applications across generations and features short videos (10-15 seconds) of diverse dance challenges. We manually find more than 300 dance videos, each capturing a single person performing dance moves, from TikTok dance challenge compilations for each month, variety, and type of dance, favoring moderate movements that do not generate excessive motion blur. For each video, we extract RGB images at 30 frames per second, resulting in more than 100K images. We segmented these images using the Removebg application and computed the UV coordinates from DensePose.
Download TikTok Dataset:
Please use the dataset for research purposes only.
The dataset can be viewed and downloaded from the Kaggle page. (you need to make an account in Kaggle to be able to download the data. It is free!)
The dataset can also be downloaded from here (42 GB). The dataset resolution is: (1080 x 604)
The original YouTube videos corresponding to each sequence and the dance name can be downloaded from here (2.6 GB).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
We introduce the Group Affect from ViDeos (GAViD) dataset, which comprises 5091 video clips with multimodal data (video, audio, and context), annotated with ternary valence and discrete emotion labels, and enriched with VideoGPT-generated contextual metadata and human-annotated action cues. We also present CAGNet, a baseline model for multimodal context-aware group affect recognition. CAGNet achieves 61.20% test accuracy on GAViD, comparable to state-of-the-art performance in the field.
NOTE: For now we are providing only the train video clips. The corresponding paper is under review in the ACM Multimedia 2025 Dataset Track. After its publication, access to the validation and test sets will be granted upon request and approval, in accordance with the Responsible Use Policy.
GAViD is a large-scale, in-the-wild multimodal dataset of 5091 samples, each annotated with the elements listed below. The following sections describe its key details and compilation procedure.
Dataset details
Positive | Positive | Negative | Negative | Neutral | Neutral |
Team Celebration | Happy | Protest | Angry Sport | Group Meeting | Panel Discussion |
Group Meeting | Video Conference | Heated Argument | Violent Protest | Parliament speech | People on street |
Get Together | Meeting | Emotional breakdown in Public | Aggressive Argument | People walking on street | Team brainstorming Session |
Celebration | Press Conference | Spiritual Gathering | Aggressive Group | Team Building Activities | Group Discussion |
Religious gathering | Talk Show | Street Race | Condolence | Group work session | Team Planning session |
Farewell | Group Performance | Group Fight | Wrestling | Students in Discussion | Wedding Group Dance |
People Dancing on Street | Street Comedy | MMA Fight | Violence | Roundtable Discussion | Oath |
Wedding Performance | Dhol masti | Boxing | Silent Protest | Mental health address | General Talk |
Couple group dance | Comedy show | People in the fight | Group Fight | Wedding Celebration | Festival Celebration |
Model | Val Acc. | Val F1 | Test Acc. | Test F1 |
CAGNet | 62.55% | 0.454 | 60.33% | 0.448 |
The dataset comprises two main components:
The dataset is structured in GAViD.csv file along with corresponding Videos in related folders. This CSV file includes the following fields: