56 datasets found
  1. Top Youtube Artist

    • kaggle.com
    Updated Jan 12, 2023
    Cite
    Mrityunjay Pathak (2023). Top Youtube Artist [Dataset]. https://www.kaggle.com/datasets/themrityunjaypathak/top-youtube-artist
    Dataset updated
    Jan 12, 2023
    Dataset provided by
    Kaggle
    Authors
    Mrityunjay Pathak
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    YouTube
    Description

    YouTube was created in 2005, with the first video, Me at the Zoo, uploaded on 23 April 2005. Since then, 1.3 billion people have set up YouTube accounts. In 2018, people watched nearly 5 billion videos each day. People upload 300 hours of video to the site every minute.

    According to 2016 research undertaken by Pexeso, music accounts for only 4.3% of YouTube’s content, yet it draws 11% of the views. Clearly, an awful lot of people watch a comparatively small number of music videos. It should be no surprise, therefore, that the most watched videos of all time on YouTube are predominantly music videos.

    On August 13, BTS became the most-viewed artist in YouTube history, accumulating over 26.7 billion views across all their official channels. This count includes all music videos and dance practice videos.

    Justin Bieber and Ed Sheeran now hold the records for second and third-highest views, with over 26 billion views each.

    Currently, BTS’s most viewed videos are their music videos for “**Boy With Luv**,” “**Dynamite**,” and “**DNA**,” which all have over 1.4 billion views.

    Headers of the Dataset

    • Total = Total views (in millions) across all official channels
    • Avg = Current daily average of all videos combined
    • 100M = Number of videos with more than 100 million views
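
As a minimal sketch of how the three headers above might be read, the snippet below parses a tiny in-memory CSV with Python's standard `csv` module and converts the Total column from millions to billions of views. The `Artist` column name and both sample rows are hypothetical placeholders, not values taken from the dataset.

```python
import csv
import io

# Hypothetical two-row sample. The real file's artist column name and
# values may differ; only Total, Avg and 100M come from the description.
SAMPLE_CSV = """Artist,Total,Avg,100M
Artist A,26700,12.5,40
Artist B,1500,0.8,3
"""


def total_views_in_billions(row: dict) -> float:
    """Convert the Total column (millions of views) to billions of views."""
    return float(row["Total"]) / 1000.0


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
billions = {row["Artist"]: total_views_in_billions(row) for row in rows}
videos_over_100m = {row["Artist"]: int(row["100M"]) for row in rows}
```

With the real file, `io.StringIO(SAMPLE_CSV)` would be replaced by an opened file handle.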

  2. YouTube users worldwide 2020-2029

    • statista.com
    Updated Jul 7, 2025
    Cite
    Statista (2025). YouTube users worldwide 2020-2029 [Dataset]. https://www.statista.com/forecasts/1144088/youtube-users-in-the-world
    Dataset updated
    Jul 7, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    Worldwide, YouTube
    Description

    The global number of YouTube users was forecast to increase continuously between 2024 and 2029 by a total of ***** million users (+***** percent). After the ninth consecutive year of growth, the YouTube user base is estimated to reach *** billion users, and therefore a new peak, in 2029. Notably, the number of YouTube users has increased continuously over the past years. User figures, shown here for the platform YouTube, have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to *** countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information). Find more key insights for the number of YouTube users in regions like Africa and South America.

  3. Data from: YouTube Video Network Dataset for Israel-Hamas War

    • ieee-dataport.org
    Updated Dec 23, 2023
    Cite
    Thejas T (2023). YouTube Video Network Dataset for Israel-Hamas War [Dataset]. https://ieee-dataport.org/documents/youtube-video-network-dataset-israel-hamas-war
    Dataset updated
    Dec 23, 2023
    Authors
    Thejas T
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    YouTube, Israel
    Description

    Over the past few years, YouTube has become a popular site for broadcasting video and for earning money by showcasing various skills in video form. For some people it has become a main source of income. Getting videos to trend among viewers is one of the major goals of every content creator. The popularity of a video and its reach to the audience depend entirely on YouTube's recommendation algorithm. This document is a dataset descriptor for the dataset collected over a time span of about 45 days during the Israel-Hamas War.

  4. YouTube users in India 2020-2029

    • statista.com
    Updated Mar 3, 2025
    Cite
    Statista (2025). YouTube users in India 2020-2029 [Dataset]. https://www.statista.com/forecasts/1146150/youtube-users-in-india
    Dataset updated
    Mar 3, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    India
    Description

    The number of YouTube users in India was forecast to increase continuously between 2024 and 2029 by a total of 222.2 million users (+34.88 percent). After the ninth consecutive year of growth, the YouTube user base is estimated to reach 859.26 million users, and therefore a new peak, in 2029. Notably, the number of YouTube users has increased continuously over the past years. User figures, shown here for the platform YouTube, have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information). Find more key insights for the number of YouTube users in countries like Sri Lanka and Nepal.

  5. BBC YouTube Videos Metadata

    • kaggle.com
    zip
    Updated Aug 13, 2020
    Cite
    Gabriel Preda (2020). BBC YouTube Videos Metadata [Dataset]. https://www.kaggle.com/gpreda/bbc-youtube-videos-metadata
    Available download formats: zip (1856076 bytes)
    Dataset updated
    Aug 13, 2020
    Authors
    Gabriel Preda
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Introduction

    The data is collected using YouTube Data Tools from the BBC YouTube channel. It shows information about all videos from this channel, starting with 2007.

    Data collection

    Using YouTube Data Tools one can access the metadata for YouTube channels, videos, comments, upvotes.

    Inspiration

    Use this amazing dataset to analyze the impact of these videos by looking at views, likes, dislikes, favorites and comments. Try to understand from the description of each video whether some subjects have a larger impact. Factor in the "age" of each video, since this dataset collects video metadata starting from 2007.

  6. YouCook

    • opendatalab.com
    • paperswithcode.com
    zip
    Updated Mar 22, 2023
    Cite
    State University of New York (2023). YouCook [Dataset]. https://opendatalab.com/OpenDataLab/YouCook
    Available download formats: zip (1865855952 bytes)
    Dataset updated
    Mar 22, 2023
    Dataset provided by
    State University of New York
    Description

    This data set was prepared from 88 open-source YouTube cooking videos. The YouCook dataset contains videos of people cooking various recipes. The videos were downloaded from YouTube and are all in the third-person viewpoint; they represent a significantly more challenging visual problem than existing cooking and kitchen datasets (the background kitchen/scene is different for many and most videos have dynamic camera changes). In addition, frame-by-frame object and action annotations are provided for training data (as well as a number of precomputed low-level features). Finally, each video has a number of human provided natural language descriptions (on average, there are eight different descriptions per video). This dataset has been created to serve as a benchmark in describing complex real-world videos with natural language descriptions.

  7. MOST LIKED COMMENTS ON YOUTUBE

    • kaggle.com
    Updated Sep 9, 2020
    Cite
    Nipun Arora (2020). MOST LIKED COMMENTS ON YOUTUBE [Dataset]. https://www.kaggle.com/nipunarora8/most-liked-comments-on-youtube/notebooks
    Dataset updated
    Sep 9, 2020
    Dataset provided by
    Kaggle
    Authors
    Nipun Arora
    Area covered
    YouTube
    Description

    Context

    I was looking for a specific dataset but never found one.

    Content

    This is a text dataset focusing on the top comments on the best YouTube videos (views > 1B).

    Acknowledgements

    I wanna thank youtube api for helping me, lol and mongo db where I stored all the raw data.

    Inspiration

    I shared this dataset to see how the world will react and what will people do with this dataset. I hope this helps me learn more about NLP and ML

  8. TED talks - Youtube

    • kaggle.com
    Updated May 11, 2024
    Cite
    Ulrike Herold (2024). TED talks - Youtube [Dataset]. https://www.kaggle.com/datasets/ulrikeherold/ted-talks-youtube
    Dataset updated
    May 11, 2024
    Dataset provided by
    Kaggle
    Authors
    Ulrike Herold
    License

    https://cdla.io/sharing-1-0/

    Area covered
    YouTube
    Description

    Data is from the Youtube Channel "TED".

    The data was scraped on March 16th, 2024 through an API, by fetching the channel's ID with Node.js code.

    "The TED Talks channel features the best talks and performances from the TED Conference, where the world's leading thinkers and doers give the talk of their lives in 18 minutes (or less). Look for talks on Technology, Entertainment and Design -- plus science, business, global issues, the arts and more. You're welcome to link to or embed these videos, forward them to others and share these ideas with people you know." - Information from the Ted talks - Youtube page https://www.youtube.com/@TED

    Deleted columns: "channelId", "publishedAt", "position", "duration", "dimension", "definition", "defaultLanguage", "thumbnail_maxres", "licensedContent", "locationDescription", "latitude", "longitude", "dislikeCount", "favoriteCount"

    Split column publishedAtSQL into Date (release_date) and Time (release_time).

    Changed durationSec - duration of video in seconds - to duration - duration of video mm:ss.

    Split information in "Title" into "Title" of episode and "Speaker".
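
The three column transformations listed above can be sketched in plain Python. This is an illustrative sketch, not the author's Node.js code: it assumes `publishedAtSQL` looks like `YYYY-MM-DD HH:MM:SS` and that the raw title uses a `Title | Speaker` layout, and neither assumption is confirmed by the description. The function names are hypothetical.

```python
def split_published_at(published_at_sql: str) -> tuple[str, str]:
    """Split an assumed 'YYYY-MM-DD HH:MM:SS' value into (release_date, release_time)."""
    release_date, release_time = published_at_sql.split(" ", 1)
    return release_date, release_time


def seconds_to_mmss(duration_sec: int) -> str:
    """Format a duration in seconds as mm:ss (minutes may exceed 59 for long talks)."""
    minutes, seconds = divmod(int(duration_sec), 60)
    return f"{minutes:02d}:{seconds:02d}"


def split_title(raw_title: str, separator: str = " | ") -> tuple[str, str]:
    """Split an assumed 'Title | Speaker' string into (title, speaker).

    If the separator is absent, the speaker is returned as an empty string.
    """
    title, _, speaker = raw_title.partition(separator)
    return title.strip(), speaker.strip()
```

For example, `seconds_to_mmss(754)` gives `"12:34"`, and a title without the separator is returned unchanged with an empty speaker.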

  9. RECOD.ai events dataset

    • redu.unicamp.br
    Updated Mar 21, 2025
    + more versions
    Cite
    Repositório de Dados de Pesquisa da Unicamp (2025). RECOD.ai events dataset [Dataset]. http://doi.org/10.25824/redu/BLIYYR
    Dataset updated
    Mar 21, 2025
    Dataset provided by
    Repositório de Dados de Pesquisa da Unicamp
    Dataset funded by
    Fundação de Amparo à Pesquisa do Estado de São Paulo
    Description

    Overview

    This data set consists of links to social network items for 34 different forensic events that took place between August 14th, 2018 and January 06th, 2021. The majority of the text and images are from Twitter (a minor part is from Flickr, Facebook and Google+), and every video is from YouTube.

    Data Collection

    We used Social Tracker, along with the social medias' APIs, to gather most of the collections. For a minor part, we used Twint. In both cases, we provided keywords related to the event to receive the data. It is important to mention that, in procedures like this one, usually only a small fraction of the collected data is in fact related to the event and useful for further forensic analysis.

    Content

    We have data from 34 events, and for each of them we provide the files:

    • items_full.csv: Contains links to any social media post that was collected.
    • images.csv: Lists the images collected. In some files there is a field called "ItemUrl", which refers to the social network post (e.g., a tweet) that mentions that media.
    • video.csv: URLs of YouTube videos that were gathered about the event.
    • video_tweet.csv: Contains IDs of tweets and IDs of YouTube videos. A tweet whose ID is in this file has a video in its content. In turn, the link of a YouTube video whose ID is in this file was mentioned by at least one collected tweet. Only two collections have this file.
    • description.txt: Contains some standard information about the event, and possibly some comments about any specific issue related to it.

    In fact, most of the collections do not have all the files above. This is due to changes in our collection procedure throughout the time of this work.

    Events

    We divided the events into six groups. They are:

    • Fire: Devastating fire is the main issue of the event, therefore most of the informative pictures show flames or burned constructions. 14 Events
    • Collapse: Most of the relevant images depict collapsed buildings, bridges, etc. (not caused by fire). 5 Events
    • Shooting: Likely images of guns and police officers. Little or no destruction of the environment. 5 Events
    • Demonstration: Plethora of people on the streets. Possibly some problem took place during it, but in most cases the demonstration is the actual event. 7 Events
    • Collision: Traffic collision. Pictures of damaged vehicles on an urban landscape. Possibly there are images with victims on the street. 1 Event
    • Flood: Events that range from fierce rain to a tsunami. Many pictures depict water. 2 Events

    Media Content

    Due to the terms of use of the social networks, we do not make publicly available the texts, images and videos that were collected. However, some extra media content related to one (or more) events can be provided on request by contacting the authors.

  10. How to make google plus posts private - Dataset - openAFRICA

    • open.africa
    Updated Jan 4, 2018
    + more versions
    Cite
    (2018). How to make google plus posts private - Dataset - openAFRICA [Dataset]. https://open.africa/dataset/how-to-make-google-plus-posts-private
    Dataset updated
    Jan 4, 2018
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    so if you have to have a G+ account (for YouTube, location services, or other reasons) - here's how you can make it totally private! No one will be able to add you, send you spammy links, or otherwise annoy you. You need to visit the "Audience Settings" page - https://plus.google.com/u/0/settings/audience You can then set a "custom audience" - usually you would use this to restrict your account to people from a specific geographic location, or within a specific age range. In this case, we're going to choose a custom audience of "No-one" Check the box and hit save. Now, when people try to visit your Google+ profile - they'll see this "restricted" message. You can visit my G+ Profile if you want to see this working. (https://plus.google.com/114725651137252000986) If you are not able to understand you can follow this website : http://www.livehuntz.com/google-plus/support-phone-number

  11. Data from: Introducing the COVID-19 YouTube (COVYT) speech dataset featuring...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Sep 8, 2022
    Cite
    Andreas Triantafyllopoulos (2022). Introducing the COVID-19 YouTube (COVYT) speech dataset featuring the same speakers with and without infection [Dataset]. https://data.niaid.nih.gov/resources?id=ZENODO_6962929
    Dataset updated
    Sep 8, 2022
    Dataset provided by
    Meishu Song
    Anastasia Semertzidou
    Andreas Triantafyllopoulos
    Florian B. Pokorny
    Björn W. Schuller
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    YouTube
    Description

    The COVYT dataset contains speech samples from individuals who self-reported their COVID-19 infection on public social media platforms (YouTube, Xiaohongshu). These videos, as well as accompanying videos of the same people prior to infection, were mined in an attempt to gather publicly-available data for COVID-19 research. This release includes the links to the original videos along with the accompanying manual segmentation and diarisation that identifies the utterances of the target individuals. We are additionally releasing features derived from the segmented utterances. Finally, the dataset includes partitioning information according to 4 different cross-validation schemes. See the arxiv pre-print for more details: https://arxiv.org/abs/2206.11045

  12. YouTube users in Europe 2020-2029

    • statista.com
    Updated May 21, 2025
    Cite
    Statista Research Department (2025). YouTube users in Europe 2020-2029 [Dataset]. https://www.statista.com/topics/3853/internet-usage-in-europe/
    Dataset updated
    May 21, 2025
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Statista Research Department
    Area covered
    Europe
    Description

    The number of YouTube users in Europe was forecast to increase continuously between 2024 and 2029 by a total of 7.8 million users (+3.61 percent). After the ninth consecutive year of growth, the YouTube user base is estimated to reach 223.61 million users, and therefore a new peak, in 2029. Notably, the number of YouTube users has increased continuously over the past years. User figures, shown here for the platform YouTube, have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information). Find more key insights for the number of YouTube users in regions like North America and Australia & Oceania.

  13. VLEP Dataset

    • paperswithcode.com
    Updated Oct 12, 2021
    + more versions
    Cite
    Jie Lei; Licheng Yu; Tamara L. Berg; Mohit Bansal (2021). VLEP Dataset [Dataset]. https://paperswithcode.com/dataset/vlep
    Dataset updated
    Oct 12, 2021
    Authors
    Jie Lei; Licheng Yu; Tamara L. Berg; Mohit Bansal
    Description

    VLEP contains 28,726 future event prediction examples (along with their rationales) from 10,234 diverse TV Show and YouTube Lifestyle Vlog video clips. Each example (see Figure 1) consists of a Premise Event (a short video clip with dialogue), a Premise Summary (a text summary of the premise event), and two potential natural language Future Events (along with Rationales) written by people. These clips are on average 6.1 seconds long and are harvested from diverse event-rich sources, i.e., TV show and YouTube Lifestyle Vlog videos.

  14. ATM Anomaly Video Dataset (ATMA-V)

    • kaggle.com
    Updated Apr 13, 2022
    Cite
    Mehant Kammakomati (2022). ATM Anomaly Video Dataset (ATMA-V) [Dataset]. http://doi.org/10.34740/kaggle/dsv/3455016
    Dataset updated
    Apr 13, 2022
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Mehant Kammakomati
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    ATMA-V Dataset

    The video dataset comprises 65 videos that consist of both anomalous and normal video segments. These videos are temporally annotated by human annotators for anomalous and normal segments. Annotations are cross-validated by a different person who was not part of the annotators' group, to minimize human error to a certain extent. Annotation data for each video is represented as a set of frame ranges that contain anomalous segments; frames not included in any range are considered normal video segments.
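
The frame-range annotation scheme described above can be expanded into per-frame labels. The sketch below is illustrative only: it assumes 0-based frame indices and inclusive `(start, end)` ranges, which the description does not specify, and `label_frames` is a hypothetical helper rather than part of the dataset's tooling.

```python
def label_frames(num_frames: int, anomalous_ranges: list) -> list:
    """Return one label per frame: 1 inside any anomalous range, 0 otherwise.

    Assumes 0-based indices and inclusive (start, end) pairs; ranges are
    clipped to the video length.
    """
    labels = [0] * num_frames
    for start, end in anomalous_ranges:
        first = max(start, 0)
        last = min(end, num_frames - 1)
        for frame in range(first, last + 1):
            labels[frame] = 1
    return labels
```

For a 10-frame clip with anomalous ranges `(2, 4)` and `(8, 9)`, frames 2 to 4 and 8 to 9 are labelled 1 and all other frames 0.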

    Data Collection

    To ensure diversification in terms of location and people, the data for both image and video formats have been collected manually from the internet. Mostly, multimedia sharing platforms such as YouTube, Kaotic, Dailymail, Itemfix, leakedreality, GettyImages, and Shutterstock are leveraged as sources. Collection from internet sources is done with the help of multiple text-based search queries that are slightly varied in terms of vocabulary and language such as "atm robbery", "atm theft", "atm chori", and "atm Diebstahl". Genuine ATM-based data on the internet is meager, so this approach of search and collection has mitigated the challenge to some extent. To prepare a high-quality dataset, certain conditions are imposed during the collection process such as: avoiding shaky, overly labeled videos/images, and videos that are compiled.

  15. Youtube users in Vietnam 2017-2025

    • statista.com
    Updated Jul 10, 2025
    Cite
    Statista (2025). Youtube users in Vietnam 2017-2025 [Dataset]. https://www.statista.com/forecasts/1146013/youtube-users-in-vietnam
    Dataset updated
    Jul 10, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    2017 - 2019
    Area covered
    Vietnam
    Description

    In 2021, YouTube's user base in Vietnam amounted to approximately ***** million users. The number of YouTube users in Vietnam is projected to reach ***** million users by 2025. User figures have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to *** countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information).

  16. IdiapVideoAge

    • zenodo.org
    • explore.openaire.eu
    application/gzip
    Updated Sep 7, 2022
    Cite
    Pavel Korshunov; Sébastien Marcel (2022). IdiapVideoAge [Dataset]. http://doi.org/10.34777/e6vt-fz55
    Available download formats: application/gzip
    Dataset updated
    Sep 7, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Pavel Korshunov; Sébastien Marcel
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    The IdiapVideoAge dataset is a set of YouTube video IDs with age labels, intended to facilitate research in audio-visual age verification with a focus on detecting people below 18 years old. The dataset contains 4260 IDs of YouTube videos that come from two existing video databases: VoxCeleb2 and the child speech dataset from Google. Our main contribution is the age labels of the people in the videos. Three different human annotators did the labeling. They were instructed to give a valid age label if a person's face is visible in more than 80% of the frames of a video and it is clear that the audible speech matches the person in the video. As the age label, we used the average of the three annotators. Out of the total 4260 videos, 1973 are of minors below 18 years old.
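
The labelling rule described above (average of three annotators, minors being those below 18) can be written out as a short sketch. The function names are hypothetical, and treating the averaged label itself as the value compared against 18 is an assumption; the description does not say how borderline averages were handled.

```python
from statistics import mean

MINOR_AGE_THRESHOLD = 18  # "below 18 years old", per the description


def aggregate_age(annotations: list) -> float:
    """Average the age estimates of exactly three annotators."""
    if len(annotations) != 3:
        raise ValueError("expected exactly three annotator labels")
    return mean(annotations)


def is_minor(age_label: float) -> bool:
    """Assumed rule: a video counts as a minor's if the averaged label is below 18."""
    return age_label < MINOR_AGE_THRESHOLD
```

For instance, annotator estimates of 16, 17 and 19 average to about 17.3, which this sketch would count as a minor.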

    Reference

    If you use this dataset, please cite the following publication:

    Pavel Korshunov and Sebastien Marcel, "Face Anthropometry Aware Audio-visual Age Verification", ACM Multimedia international conference (MM'22), October 2022.
    https://publications.idiap.ch/index.php/publications/show/4862

  17. Accident Detection Model Dataset

    • universe.roboflow.com
    zip
    Updated Apr 8, 2024
    Cite
    Accident detection model (2024). Accident Detection Model Dataset [Dataset]. https://universe.roboflow.com/accident-detection-model/accident-detection-model/model/1
    Available download formats: zip
    Dataset updated
    Apr 8, 2024
    Dataset authored and provided by
    Accident detection model
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Accident Bounding Boxes
    Description

    Accident-Detection-Model

    Accident Detection Model is made using YOLOv8, Google Colab, Python, Roboflow, Deep Learning, OpenCV, Machine Learning, Artificial Intelligence. It can detect an accident from a live camera feed, image or video. The model is trained on a dataset of 3200+ images, which were annotated on Roboflow.

    Problem Statement

    • Road accidents are a major problem in India, with thousands of people losing their lives and many more suffering serious injuries every year.
    • According to the Ministry of Road Transport and Highways, India witnessed around 4.5 lakh road accidents in 2019, which resulted in the deaths of more than 1.5 lakh people.
    • The age range that is most severely hit by road accidents is 18 to 45 years old, which accounts for almost 67 percent of all accidental deaths.

    Accidents survey

    [Image: Survey]

    Literature Survey

    • Sreyan Ghosh (Mar 2019): the goal is to develop a system using a deep learning convolutional neural network that has been trained to identify video frames as accident or non-accident.
    • Deeksha Gour (Sep 2019): uses computer vision technology, neural networks, deep learning, and various approaches and algorithms to detect objects.

    Research Gap

    • Lack of real-world data - We trained the model on more than 3200 images.
    • Large interpretability time and space needed - Using Google Colab to reduce the interpretability time and space required.
    • Outdated versions in previous works - We are using the latest version, YOLOv8.

    Proposed methodology

    • We are using YOLOv8 to train on our custom dataset of 3200+ images, collected from different platforms.
    • After training for 25 iterations, the model is ready to detect an accident with a significant probability.

    Model Set-up

    Preparing Custom dataset

    • We have collected 1200+ images from different sources like YouTube, Google images, Kaggle.com etc.
    • Then we annotated all of them individually on a tool called roboflow.
    • During annotation we marked the images with no accident as NULL, and we drew a box around the site of the accident on the images showing an accident.
    • Then we divided the data set into train, val, test in the ratio of 8:1:1
    • At the final step we downloaded the dataset in yolov8 format.
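
The 8:1:1 train/val/test split mentioned above can be sketched with the standard library alone. This is an assumption-laden illustration, not the authors' procedure (Roboflow can perform the split itself): `split_dataset`, the fixed seed, and the placeholder file names are all hypothetical.

```python
import random


def split_dataset(items, ratios=(8, 1, 1), seed=0):
    """Shuffle items reproducibly and split them into train/val/test by ratio.

    Any remainder from integer division goes to the test split.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    total = sum(ratios)
    n_train = len(shuffled) * ratios[0] // total
    n_val = len(shuffled) * ratios[1] // total
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


# Placeholder image names standing in for an annotated image set.
image_names = [f"image_{i:04d}.jpg" for i in range(3200)]
train_set, val_set, test_set = split_dataset(image_names)
```

With 3200 placeholder items, this yields 2560 training, 320 validation and 320 test items, with no item in more than one split.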

    Using Google Colab

    • We are using Google Colaboratory to code this model because Google Colab provides a GPU, which is faster than local environments.
    • Google Colab lets you write and run Python code in Jupyter notebooks, which blend code, text, and visualisations in a single document.
    • Users can run individual code cells in Jupyter Notebooks and quickly view the results, which is helpful for experimenting and debugging. Additionally, they enable the development of visualisations that make use of well-known frameworks like Matplotlib, Seaborn, and Plotly.
    • In Google Colab, we first changed the runtime from TPU to GPU.
    • We cross-checked it by running the command ‘!nvidia-smi’.

    Coding

    • First, we installed YOLOv8 with the command ‘!pip install ultralytics==8.0.20’.
    • Next, we checked the YOLOv8 installation with the imports ‘from ultralytics import YOLO’ and ‘from IPython.display import display, Image’.
    • Then we connected and mounted our Google Drive account with the code ‘from google.colab import drive’ followed by ‘drive.mount('/content/drive')’.
    • Then we ran the main commands to start the training process: ‘%cd /content/drive/MyDrive/Accident Detection model’ followed by ‘!yolo task=detect mode=train model=yolov8s.pt data=data.yaml epochs=1 imgsz=640 plots=True’.
    • After training, we ran commands to validate and test the model: ‘!yolo task=detect mode=val model=runs/detect/train/weights/best.pt data=data.yaml’ and ‘!yolo task=detect mode=predict model=runs/detect/train/weights/best.pt conf=0.25 source=data/test/images’.
    • To get a result for any video or image, we ran the command ‘!yolo task=detect mode=predict model=runs/detect/train/weights/best.pt source="/content/drive/MyDrive/Accident-Detection-model/data/testing1.jpg/mp4"’.
    • The results are stored in the runs/detect/predict folder.
      Hence our model is trained, validated and tested to be able to detect accidents on any video or image.
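
The three `yolo` commands above share the same `key=value` shape, so they can be assembled from one small helper. The sketch below only builds the command strings; it does not call YOLO. The helper name `yolo_command` is an assumption for illustration, and every argument value is copied from the commands listed above.

```python
def yolo_command(mode, model, **options):
    """Build a `yolo` CLI command string in the key=value form used above."""
    parts = ["yolo", "task=detect", f"mode={mode}", f"model={model}"]
    parts += [f"{key}={value}" for key, value in options.items()]
    return " ".join(parts)


# Path to the best weights written by the training run, as in the steps above.
best = "runs/detect/train/weights/best.pt"

train_cmd = yolo_command("train", "yolov8s.pt", data="data.yaml", epochs=1, imgsz=640, plots=True)
val_cmd = yolo_command("val", best, data="data.yaml")
predict_cmd = yolo_command("predict", best, conf=0.25, source="data/test/images")

for cmd in (train_cmd, val_cmd, predict_cmd):
    print(cmd)
```

In a Colab cell each printed string would be run with a leading `!`, exactly as in the list above.
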

    Challenges I ran into

    I ran into 3 major problems while making this model:

    • I had difficulty saving the results to a folder. YOLOv8 is the latest version and is still under development, so I read some blogs and referred to Stack Overflow, and learned that v8 needs an extra argument, ''save=true''. Adding it let me save my results in a folder.
    • I was facing a problem on the CVAT website because I was not sure what
  18. TL;DR Dataset: Best YouTube Alternatives for Creators in 2025

    • learningrevolution.net
    html
    Updated Sep 25, 2024
    Cite
    Jawad Khan (2024). TL;DR Dataset: Best YouTube Alternatives for Creators in 2025 [Dataset]. https://www.learningrevolution.net/youtube-alternatives/
    Explore at:
    Available download formats: html
    Dataset updated
    Sep 25, 2024
    Dataset provided by
    Learning Revolution
    Authors
    Jawad Khan
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    YouTube
    Variables measured
    Platform, Best Use Case
    Description

    Concise comparison of the top 10 YouTube alternatives for content creators in 2025. Covers monetization, audience size, and ideal use cases.

  19. TikTok Dataset Dataset

    • paperswithcode.com
    Updated Jul 22, 2024
    Cite
    Yasamin Jafarian; Hyun Soo Park (2024). TikTok Dataset Dataset [Dataset]. https://paperswithcode.com/dataset/tiktok-dataset
    Explore at:
    Dataset updated
    Jul 22, 2024
    Authors
    Yasamin Jafarian; Hyun Soo Park
    Description

    We learn high-fidelity human depths by leveraging a collection of social media dance videos scraped from the TikTok mobile social networking application. TikTok is one of the most popular video-sharing applications across generations, and it includes short videos (10-15 seconds) of diverse dance challenges. We manually selected more than 300 dance videos, each capturing a single person performing dance moves, from TikTok dance challenge compilations covering each month and a variety of dance types. The selected dances have moderate movements that do not generate excessive motion blur. For each video, we extract RGB images at 30 frames per second, resulting in more than 100K images. We segmented these images using the Removebg application and computed the UV coordinates with DensePose.

    Download TikTok Dataset:

    Please use the dataset for research purposes only.

    The dataset can be viewed and downloaded from the Kaggle page. (You need a Kaggle account to download the data. It is free!)

    The dataset can also be downloaded from here (42 GB). The dataset resolution is: (1080 x 604)

    The original YouTube videos corresponding to each sequence and the dance name can be downloaded from here (2.6 GB).

  20. GAViD: Group Affect from ViDeos

    • zenodo.org
    csv, zip
    Updated Jun 5, 2025
    Cite
    Deepak Kumar; Puneet Kumar; Xiaobai Li; Balasubramanian Raman (2025). GAViD: Group Affect from ViDeos [Dataset]. http://doi.org/10.5281/zenodo.15448846
    Explore at:
    Available download formats: csv, zip
    Dataset updated
    Jun 5, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Deepak Kumar; Puneet Kumar; Xiaobai Li; Balasubramanian Raman
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jun 1, 2025
    Description

    Overview

    We introduce the Group Affect from ViDeos (GAViD) dataset, which comprises 5091 video clips with multimodal data (video, audio, and context), annotated with ternary valence and discrete emotion labels, and enriched with VideoGPT-generated contextual metadata and human-annotated action cues. We also present CAGNet, a baseline model for multimodal context-aware group affect recognition. CAGNet achieves 61.20% test accuracy on GAViD, comparable to state-of-the-art performance in the field.

    NOTE: For now we are providing only the train video clips. The corresponding paper is under review at the ACM Multimedia 2025 Dataset Track. After its publication, access to the validation and test sets will be granted upon request and approval, in accordance with the Responsible Use Policy.

    Dataset Description

    GAViD is a large-scale, in-the-wild multimodal dataset of 5091 samples, each annotated with the elements listed below. The following sections describe its key details and compilation procedure.

    1. Raw video clips of an average duration of five seconds,
    2. Audio aligned with the video clips,
    3. Contextual metadata (scene descriptions, event labels) generated by a multimodal LLM and human-verified,
    4. Group affect labels: ternary valence (positive, neutral, negative) and five discrete emotions (happy, sad, fear, anger, neutral),
    5. Emotion intensity ratings (high, medium, low),
    6. Interaction type labels (cooperative, hostile, neutral),
    7. Action cues (e.g. smiling, clapping, shouting, dancing, singing).

    Dataset details

    • Number of clips (samples) in GAViD: 5130
    • Number of samples with some problem: 39
    • Number of samples after filtering: 5091
    • Duration per clip: 5 sec
    • Clip count per video: 1–35
    • Dataset split: Train: 3503; Val: 542; Test: 1046
    • Affect labels (classwise distribution): Positive: 2600; Negative: 1189; Neutral: 1302
    • Emotion label distribution: Neutral: 1522; Happy: 2428; Anger: 884; Sad: 201; Fear: 56
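
The counts above can be cross-checked with a few lines of arithmetic. This is a small sanity check, not part of the dataset release; the dictionary names are chosen here for illustration and the numbers are copied from the list above.

```python
# Clips remaining after removing the problematic samples.
filtered = 5130 - 39

split = {"train": 3503, "val": 542, "test": 1046}
affect = {"positive": 2600, "negative": 1189, "neutral": 1302}
emotion = {"neutral": 1522, "happy": 2428, "anger": 884, "sad": 201, "fear": 56}

for name, counts in (("split", split), ("affect", affect), ("emotion", emotion)):
    total = sum(counts.values())
    # Percentage share of each class, rounded to one decimal place.
    shares = {label: round(100 * n / total, 1) for label, n in counts.items()}
    print(name, total, total == filtered, shares)
```

Each of the three breakdowns sums to 5091, matching the number of samples after filtering. The printed shares also make the class imbalance visible, for example the small number of Fear clips.
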

    Keywords used to search for the raw videos on YouTube

    | Positive | Positive | Negative | Negative | Neutral | Neutral |
    |---|---|---|---|---|---|
    | Team Celebration | Happy | Protest | Angry Sport | Group Meeting | Panel Discussion |
    | Group Meeting | Video Conference | Heated Argument | Violent Protest | Parliament speech | People on street |
    | Get Together | Meeting | Emotional breakdown in Public | Aggressive Argument | People walking on street | Team brainstorming Session |
    | Celebration | Press Conference | Spiritual Gathering | Aggressive Group | Team Building Activities | Group Discussion |
    | Religious gathering | Talk Show | Street Race | Condolence | Group work session | Team Planning session |
    | Farewell | Group Performance | Group Fight | Wrestling | Students in Discussion | Wedding Group Dance |
    | People Dancing on Street | Street Comedy | MMA Fight | Violence | Roundtable Discussion | Oath |
    | Wedding Performance | Dhol masti | Boxing | Silent Protest | Mental health address | General Talk |
    | Couple group dance | Comedy show | People in the fight | Group Fight | Wedding Celebration | Festival Celebration |

    Emotion Recognition Results using CAGNet

    | Model | Val Acc. | Val F1 | Test Acc. | Test F1 |
    |---|---|---|---|---|
    | CAGNet | 62.55% | 0.454 | 60.33% | 0.448 |

    Components of the Dataset

    The dataset comprises two main components:

    • GAViD_train.csv file: Contains the bin number used by Labelbox in the annotation process, video_id, group_emotion (Positive, Negative, Neutral), specific_emotion (happy, sad, fear, anger, neutral), emotion_intensity, interaction_type, action_cuse, and the video description generated using the Video-ChatGPT model.
    • GAViD_Train_VideoClips.zip folder: Contains the video clips of the train set. (For now we are providing only the train video clips. Validation and test set video clips will be provided on request.)

    Data Format and Fields of the CSV File

    The dataset is organised as a GAViD.csv file, with the corresponding videos in related folders. The CSV file includes the following fields:

    • Video_ID: Unique Identifier of a video
    • Group_Affect: Positive, Negative, Neutral
    • Descrete_Emotion: Happy, Sad, Fear, Anger, Neutral
    • Emotion_Intensity: High, Medium, Low
    • Interaction_Type: Cooperative, Hostile, Neutral
    • Action_Cues: e.g. Smiling, Clapping, Shouting, Dancing, Singing etc.
    • Context: Each video clip's summary generated from the Video-ChatGPT model.
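
As a rough illustration of working with these fields, the sketch below reads CSV rows with Python's standard `csv` module and counts clips per `Group_Affect` value. The three sample rows and the helper name `affect_counts` are hypothetical, and the header simply follows the field list above, so the column names in the released file may differ.

```python
import csv
import io
from collections import Counter

# Hypothetical rows for illustration only; not taken from the real GAViD file.
sample_csv = """Video_ID,Group_Affect,Descrete_Emotion,Emotion_Intensity,Interaction_Type,Action_Cues,Context
clip_001,Positive,Happy,High,Cooperative,"Smiling, Clapping",A group celebrates together.
clip_002,Negative,Anger,Medium,Hostile,Shouting,Two groups argue loudly.
clip_003,Positive,Happy,Low,Cooperative,Dancing,People dance at a gathering.
"""


def affect_counts(csv_file):
    """Count rows per Group_Affect label in an open CSV file object."""
    reader = csv.DictReader(csv_file)
    return Counter(row["Group_Affect"] for row in reader)


counts = affect_counts(io.StringIO(sample_csv))
print(counts["Positive"], counts["Negative"], counts["Neutral"])  # 2 1 0
```

To use it on a downloaded file, pass an open file handle instead of the in-memory sample, for example `open(path, newline="")`.
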

    Ethical considerations, data privacy and misuse prevention

    • Data Collection and Consent: The data collection and annotation strictly followed established ethical protocols in line with YouTube's Terms, which state “Public videos with a Creative Commons license may be reused”. We downloaded only public-domain videos licensed under Creative Commons (CC BY 4.0), which “allows others to share, copy and redistribute the material in any medium or format, and to adapt, remix, transform, and build upon it for any purpose, even commercially”.
    • Privacy: All content was reviewed to ensure no private or sensitive information is present. Faces are included only from public domain videos as needed for group affect research; only group-level content is released, with no attempt or risk of individual identification. Other personally identifiable information, such as names, addresses and contacts, was removed.

    Code and Citation

    • Code Repository: https://github.com/deepakkumar-iitr/GAViD/tree/main
    • Citing the Dataset: Users of the dataset should cite the corresponding paper described at the above GitHub Repository.

    License & Access

    • This dataset is released for academic research only and is free to researchers from educational or
