35 datasets found
  1. YouTube Datasets

    • brightdata.com
    .json, .csv, .xlsx
    Updated Sep 12, 2024
    Cite
    Bright Data (2024). YouTube Datasets [Dataset]. https://brightdata.com/products/datasets/youtube
    Available download formats
    .json, .csv, .xlsx
    Dataset updated
    Sep 12, 2024
    Dataset authored and provided by
    Bright Data (https://brightdata.com/)
    License

    https://brightdata.com/license

    Area covered
    Worldwide, YouTube
    Description

    Use our YouTube profiles dataset to extract both business and non-business information from public channels and filter by channel name, views, creation date, or subscribers. Datapoints include URL, handle, banner image, profile image, name, subscribers, description, video count, create date, views, details, and more. You may purchase the entire dataset or a customized subset, depending on your needs. Popular use cases for this dataset include sentiment analysis, brand monitoring, influencer marketing, and more.

  2. YouTube users worldwide 2020-2029

    • statista.com
    Updated Mar 3, 2025
    Cite
    Statista (2025). YouTube users worldwide 2020-2029 [Dataset]. https://www.statista.com/forecasts/1144088/youtube-users-in-the-world
    Dataset updated
    Mar 3, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    World
    Description

    The global number of YouTube users was forecast to increase continuously between 2024 and 2029 by a total of 232.5 million users (+24.91 percent). After the ninth consecutive year of growth, the YouTube user base is estimated to reach 1.2 billion users, a new peak, in 2029. Notably, the number of YouTube users has increased continuously over the past years. User figures for the YouTube platform have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information). Find more key insights for the number of YouTube users in regions like Africa and South America.
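
    The forecast figures in this description can be cross-checked with a little arithmetic. The sketch below assumes that the quoted percentage is measured relative to the 2024 user base, which the description does not state explicitly; the function name `implied_forecast` is illustrative only.

```python
def implied_forecast(increase_millions: float, percent_change: float) -> tuple[float, float]:
    """Return (base, end) user counts in millions implied by an absolute
    increase and a percentage change.

    Assumption: percent_change is measured against the base-year value.
    """
    base = increase_millions / (percent_change / 100.0)
    return base, base + increase_millions


# Figures quoted in the description: +232.5 million users, +24.91 percent.
base_2024, end_2029 = implied_forecast(232.5, 24.91)
print(f"implied 2024 base:  {base_2024:.1f} million")
print(f"implied 2029 total: {end_2029:.1f} million")
```

    Under that assumption the implied 2029 total is roughly 1.17 billion users, which is consistent with the rounded figure of 1.2 billion quoted in the description.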

  3. YouTube users in India 2020-2029

    • statista.com
    Updated Mar 3, 2025
    Cite
    Statista (2025). YouTube users in India 2020-2029 [Dataset]. https://www.statista.com/forecasts/1146150/youtube-users-in-india
    Dataset updated
    Mar 3, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    India
    Description

    The number of YouTube users in India was forecast to increase continuously between 2024 and 2029 by a total of 222.2 million users (+34.88 percent). After the ninth consecutive year of growth, the YouTube user base is estimated to reach 859.26 million users, a new peak, in 2029. Notably, the number of YouTube users has increased continuously over the past years. User figures for the YouTube platform have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information). Find more key insights for the number of YouTube users in countries like Sri Lanka and Nepal.

  4. Youtube Channel ZeeshanUsmani78 Data

    • kaggle.com
    Updated Feb 1, 2021
    Cite
    Ayyaz Shaukat (2021). Youtube Channel ZeeshanUsmani78 Data [Dataset]. https://www.kaggle.com/ayyazshaukat/youtube-channel-zeeshanusmani78-data/activity
    Dataset updated
    Feb 1, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Ayyaz Shaukat
    Area covered
    YouTube
    Description

    Context

    This dataset was extracted for an assignment in a Data Science course. The data was extracted from "https://www.youtube.com/c/ZeeshanUsmani78". If you are interested in the Python code I used for the extraction, you can view it on my profile: "https://github.com/meayyaz/ParsingInPython/blob/main/ChannelData.py". This kind of data can help you learn about any YouTube channel's statistics.

    Content

    Dataset: There are only 325 rows in this dataset, and the columns are "VideoId", "Title" (title of the video), "PublishTime", "ViewCount", "LikeCount", "DislikeCount", "favoriteCount", and "commentCount".

    Acknowledgements

    I would like to thank Zeeshan-ul-hassan Usmani for allowing me to upload this data and for providing such a good live example.

    Inspiration

    I would like to learn Data Science and Machine Learning together with my fellow learners. Here is what I think we should get from this dataset:

    - Main target: "After uploading any new video, what will the 'view-count' and 'Like-count' be in the next 24 hours, after 7 days ..."

    - What kind of videos get more views?

    - Is there any relationship with the video publish timestamp?

  5. Data from: YouTube Video Network Dataset for Israel-Hamas War

    • ieee-dataport.org
    Updated Dec 23, 2023
    Cite
    Thejas T (2023). YouTube Video Network Dataset for Israel-Hamas War [Dataset]. https://ieee-dataport.org/documents/youtube-video-network-dataset-israel-hamas-war
    Dataset updated
    Dec 23, 2023
    Authors
    Thejas T
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Israel, YouTube
    Description

    Over the past few years, YouTube has become a popular site for broadcasting videos and for earning money by publishing videos that showcase various skills. For some people it has become a main source of income. Getting videos to trend among viewers is one of the major goals of every content creator. The popularity of any video and its reach to the audience is completely based on YouTube's recommendation algorithm. This document is a dataset descriptor for the dataset collected over a span of about 45 days during the Israel-Hamas War.

  6. YouTube-100M Dataset

    • paperswithcode.com
    Updated Mar 25, 2024
    Cite
    Shawn Hershey; Sourish Chaudhuri; Daniel P. W. Ellis; Jort F. Gemmeke; Aren Jansen; R. Channing Moore; Manoj Plakal; Devin Platt; Rif A. Saurous; Bryan Seybold; Malcolm Slaney; Ron J. Weiss; Kevin Wilson (2024). YouTube-100M Dataset [Dataset]. https://paperswithcode.com/dataset/youtube-100m
    Dataset updated
    Mar 25, 2024
    Authors
    Shawn Hershey; Sourish Chaudhuri; Daniel P. W. Ellis; Jort F. Gemmeke; Aren Jansen; R. Channing Moore; Manoj Plakal; Devin Platt; Rif A. Saurous; Bryan Seybold; Malcolm Slaney; Ron J. Weiss; Kevin Wilson
    Area covered
    YouTube
    Description

    The YouTube-100M data set consists of 100 million YouTube videos: 70M training videos, 10M evaluation videos, and 20M validation videos. Videos average 4.6 minutes each for a total of 5.4M training hours. Each of these videos is labeled with 1 or more topic identifiers from a set of 30,871 labels. There are an average of around 5 labels per video. The labels are assigned automatically based on a combination of metadata (title, description, comments, etc.), context, and image content for each video. The labels apply to the entire video and range from very generic (e.g. “Song”) to very specific (e.g. “Cormorant”). Being machine generated, the labels are not 100% accurate and of the 30K labels, some are clearly acoustically relevant (“Trumpet”) and others are less so (“Web Page”). Videos often bear annotations with multiple degrees of specificity. For example, videos labeled with “Trumpet” are often labeled “Entertainment” as well, although no hierarchy is enforced.
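
    The split sizes and the training-hours figure in this description can be checked directly. The short sketch below only uses numbers quoted in the description; the variable names are illustrative.

```python
# Split sizes in millions of videos, as quoted in the description.
splits_millions = {"training": 70, "evaluation": 10, "validation": 20}
total_millions = sum(splits_millions.values())

# Average video length in minutes, as quoted in the description.
AVG_MINUTES_PER_VIDEO = 4.6

# Training hours = number of training videos * minutes per video / 60.
training_hours = splits_millions["training"] * 1_000_000 * AVG_MINUTES_PER_VIDEO / 60

print(f"total videos: {total_millions}M")
print(f"training hours: {training_hours / 1e6:.1f}M")
```

    The three splits add up to the 100 million videos in the dataset name, and 70M videos at an average of 4.6 minutes each come to about 5.4M hours, matching the stated total.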

  7. T-Series Estimate Revenue From Youtube

    • kaggle.com
    Updated Dec 7, 2020
    Cite
    Maithil Tandel (2020). T-Series Estimate Revenue From Youtube [Dataset]. https://www.kaggle.com/maithiltandel/tseries-estimate-revenue-from-youtube/code
    Dataset updated
    Dec 7, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Maithil Tandel
    Description

    Context

    As YouTube is now one of the biggest online earning platforms for content creators, many new content creators join every day and upload thousands of videos daily. This creates an enormous amount of data every day, which can be used in many ways. Here I have taken data from T-Series, one of the most subscribed channels on YouTube, including the views and ratings of its past videos, and estimated the revenue for each video.

    Content

    There are very few features in this dataset, namely:

    Date: The date when the particular video was released

    Name: Name of the video on YouTube

    Views: The views on YouTube as of December 2020

    Ratings: The ratings of the video

    Comments: Number of comments on the video

    Estimated Revenue: The revenue generated by the video on YouTube

    Acknowledgements

    This data search wouldn't have been possible without my sister, who was constantly watching videos on YouTube. That led me to this idea, and I then started working on this dataset.

  8. Youtube users in Vietnam 2017-2025

    • statista.com
    Updated Mar 3, 2025
    Cite
    Statista (2025). Youtube users in Vietnam 2017-2025 [Dataset]. https://www.statista.com/forecasts/1146013/youtube-users-in-vietnam
    Dataset updated
    Mar 3, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    2017 - 2019
    Area covered
    Vietnam
    Description

    In 2021, YouTube's user base in Vietnam amounted to approximately 66.63 million users. The number of YouTube users in Vietnam is projected to reach 75.44 million users by 2025. User figures have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information).

  9. Youtube user worldwide 2024, by country

    • statista.com
    Updated Mar 3, 2025
    Cite
    Statista (2025). Youtube user worldwide 2024, by country [Dataset]. https://www.statista.com/forecasts/1146465/youtube-user-by-country
    Dataset updated
    Mar 3, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Jan 1, 2024 - Dec 31, 2024
    Area covered
    Albania, YouTube
    Description

    The ranking by number of YouTube users is led by India with 637.1 million users, followed by Russia with 95.38 million users. In contrast, Iceland is at the bottom of the ranking with 0.26 million users, a difference of 636.84 million users from India. User figures for the YouTube platform have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information).

  10. Data from: Tag Recommendation Datasets

    • figshare.com
    txt
    Updated Jan 25, 2016
    Cite
    Fabiano Belem (2016). Tag Recommendation Datasets [Dataset]. http://doi.org/10.6084/m9.figshare.2067183.v4
    Available download formats
    txt
    Dataset updated
    Jan 25, 2016
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Fabiano Belem
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Associative Tag Recommendation Exploiting Multiple Textual Features

    Fabiano Belem, Eder Martins, Jussara M. Almeida, Marcos Goncalves. In Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval, July 2011.

    Abstract

    This work addresses the task of recommending relevant tags to a target object by jointly exploiting three dimensions of the problem: (i) term co-occurrence with tags preassigned to the target object, (ii) terms extracted from multiple textual features, and (iii) several metrics of tag relevance. In particular, we propose several new heuristic methods, which extend previous, highly effective and efficient, state-of-the-art strategies by including new metrics that try to capture how accurately a candidate term describes the object’s content. We also exploit two learning to rank techniques, namely RankSVM and Genetic Programming, for the task of generating ranking functions that combine multiple metrics to accurately estimate the relevance of a tag to a given object. We evaluate all proposed methods in various scenarios for three popular Web 2.0 applications, namely, LastFM, YouTube and YahooVideo. We found that our new heuristics greatly outperform the methods on which they are based, producing gains in precision of up to 181%, as well as another state-of-the-art technique, with improvements in precision of up to 40% over the best baseline in any scenario. Some further improvements can also be achieved, in some scenarios, with the new learning-to-rank based strategies, which have the additional advantage of being quite flexible and easily extensible to exploit other aspects of the tag recommendation problem.

    Bibtex Citation

    @inproceedings{belem@sigir11,
      author = {Fabiano Bel\'em and Eder Martins and Jussara Almeida and Marcos Gon\c{c}alves},
      title = {Associative Tag Recommendation Exploiting Multiple Textual Features},
      booktitle = {{Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval (SIGIR'11)}},
      month = {{July}},
      year = {2011}
    }

  11. Youtube users in Latin America and the Caribbean 2020, by country

    • statista.com
    • ai-chatbox.pro
    Updated Mar 10, 2025
    Cite
    Statista (2025). Youtube users in Latin America and the Caribbean 2020, by country [Dataset]. https://www.statista.com/forecasts/1169298/youtube-users-in-latin-america-by-country
    Dataset updated
    Mar 10, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    2019
    Area covered
    Latin America, Caribbean, Americas, YouTube
    Description

    This statistic shows a ranking of the estimated number of YouTube users in 2020 in Latin America and the Caribbean, by country. The user numbers have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in more than 150 countries and regions worldwide. All input data are sourced from international institutions, national statistical offices, and trade associations. All data are processed to generate comparable datasets (see supplementary notes under details for more information).

  12. Data from: English YouTube Hate Speech Corpus

    • live.european-language-grid.eu
    binary format
    Updated Oct 13, 2021
    Cite
    (2021). English YouTube Hate Speech Corpus [Dataset]. https://live.european-language-grid.eu/catalogue/corpus/20160
    Available download formats
    binary format
    Dataset updated
    Oct 13, 2021
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Area covered
    YouTube
    Description

    We present an English YouTube dataset manually annotated for hate speech types and targets. The comments to be annotated were sampled from the English YouTube comments on videos about the Covid-19 pandemic in the period from January 2020 to May 2020. Two sets were annotated: a training set with 51,655 comments (IMSyPP_EN_YouTube_comments_train.csv) and two evaluation sets, one annotated in-context (IMSyPP_EN_YouTube_comments_evaluation_context.csv), another out-of-context (IMSyPP_EN_YouTube_comments_evaluation_no_context.csv), each based on the same 10,759 comments. The dataset was annotated by 10 annotators with most (99.9%) of the comments being annotated by two annotators. It was used to train a classification model for hate speech types detection that is publicly available at the following URL: https://huggingface.co/IMSyPP/hate_speech_en.

    The dataset consists of the following fields:

    Video_ID - YouTube ID of the video under which the comment was posted

    Comment_ID - YouTube ID of the comment

    Text - text of the comment

    Type - type of hate speech

    Target - the target of hate speech

    Annotator - code of the human annotator
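
    As a rough illustration of how the listed fields might be used, the sketch below groups annotations by comment and finds comments on which two annotators agree. It assumes one CSV row per (comment, annotator) pair, which the description does not confirm. The sample rows, label values, and annotator codes are made up for illustration and are not taken from the corpus; only the column names come from the description.

```python
import csv
import io
from collections import defaultdict

# Made-up sample rows for illustration only; NOT real corpus data.
# Only the header names are taken from the dataset description.
SAMPLE_CSV = """Video_ID,Comment_ID,Text,Type,Target,Annotator
vid1,c1,placeholder text one,type_a,target_x,A01
vid1,c1,placeholder text one,type_a,target_x,A02
vid1,c2,placeholder text two,type_b,target_y,A01
vid2,c3,placeholder text three,type_a,target_x,A03
vid2,c3,placeholder text three,type_b,target_y,A04
"""


def annotations_per_comment(csv_file):
    """Group (annotator, type) pairs by Comment_ID.

    Assumption: each row holds one annotator's label for one comment.
    """
    by_comment = defaultdict(list)
    for row in csv.DictReader(csv_file):
        by_comment[row["Comment_ID"]].append((row["Annotator"], row["Type"]))
    return by_comment


by_comment = annotations_per_comment(io.StringIO(SAMPLE_CSV))

# Comments labeled by exactly two annotators.
double_annotated = {cid: labels for cid, labels in by_comment.items() if len(labels) == 2}

# Of those, the comments where both annotators chose the same Type.
agreeing = [cid for cid, labels in double_annotated.items() if labels[0][1] == labels[1][1]]

print(len(by_comment), len(double_annotated), agreeing)
```

    To run this against the real files, the in-memory sample would be replaced with an opened file such as IMSyPP_EN_YouTube_comments_train.csv, after checking that its layout matches the assumption above.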

  13. youtube_subs_howto100M

    • huggingface.co
    Updated Mar 31, 2023
    Cite
    Wonchang Chung (2023). youtube_subs_howto100M [Dataset]. https://huggingface.co/datasets/totuta/youtube_subs_howto100M
    Dataset updated
    Mar 31, 2023
    Authors
    Wonchang Chung
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Area covered
    YouTube
    Description

    Dataset Card for youtube_subs_howto100M

    Dataset Summary

    The youtube_subs_howto100M dataset is an English-language dataset of instruction-response pairs extracted from 309136 YouTube videos. The dataset was originally inspired by and sourced from the HowTo100M dataset, which was developed for natural language search for video clips.

    Supported Tasks and Leaderboards

    conversational: The dataset can be used to train a model for instruction(request) and a long form… See the full description on the dataset page: https://huggingface.co/datasets/totuta/youtube_subs_howto100M.

  14. YT-BB Dataset

    • paperswithcode.com
    Updated Jan 28, 2021
    Cite
    Esteban Real; Jonathon Shlens; Stefano Mazzocchi; Xin Pan; Vincent Vanhoucke (2021). YT-BB Dataset [Dataset]. https://paperswithcode.com/dataset/youtube-boundingboxes
    Dataset updated
    Jan 28, 2021
    Authors
    Esteban Real; Jonathon Shlens; Stefano Mazzocchi; Xin Pan; Vincent Vanhoucke
    Area covered
    YouTube
    Description

    YouTube-BoundingBoxes (YT-BB) is a large-scale data set of video URLs with densely-sampled object bounding box annotations. The data set consists of approximately 380,000 video segments about 19s long, automatically selected to feature objects in natural settings without editing or post-processing, with a recording quality often akin to that of a hand-held cell phone camera. The objects represent a subset of the MS COCO label set. All video segments were human-annotated with high-precision classification labels and bounding boxes at 1 frame per second.

  15. MeLa BitChute Dataset

    • paperswithcode.com
    Updated Feb 18, 2022
    Cite
    Milo Trujillo; Maurício Gruppi; Cody Buntain; Benjamin D. Horne (2022). MeLa BitChute Dataset [Dataset]. https://paperswithcode.com/dataset/mela-bitchute
    Dataset updated
    Feb 18, 2022
    Authors
    Milo Trujillo; Maurício Gruppi; Cody Buntain; Benjamin D. Horne
    Description

    MeLa BitChute is a near-complete dataset of over 3M videos from 61K channels over 2.5 years (June 2019 to December 2021) from the social video hosting platform BitChute, a commonly used alternative to YouTube. Additionally, the dataset includes a variety of video-level metadata, including comments, channel descriptions, and views for each video.

    The dataset contains data from 3,036,190 videos, 61,229 channels, and 11,434,571 comments between June 28th, 2019 and December 31st, 2021. This dataset provides timestamped activities and estimates on views for the majority of channels and videos on the platform, allowing researchers to align BitChute videos with behavior on other platforms. Therefore, this dataset can facilitate both studies of BitChute in isolation and studies of BitChute’s role in the larger ecosystem.
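
    The totals quoted in this description give a rough sense of scale per channel and per video. The sketch below derives simple averages from them; these are plain means computed here for illustration, not figures reported by the dataset authors, and they say nothing about how unevenly activity is distributed.

```python
# Totals quoted in the description.
videos = 3_036_190
channels = 61_229
comments = 11_434_571

# Simple means; the real distributions may be heavily skewed.
videos_per_channel = videos / channels
comments_per_video = comments / videos

print(f"videos per channel: {videos_per_channel:.1f}")
print(f"comments per video: {comments_per_video:.2f}")
```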

  16. Netflix

    • ieee-dataport.org
    Updated Oct 1, 2021
    Cite
    Danil Shamsimukhametov (2021). Netflix [Dataset]. https://ieee-dataport.org/documents/youtube-netflix-web-dataset-encrypted-traffic-classification
    Dataset updated
    Oct 1, 2021
    Authors
    Danil Shamsimukhametov
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    YouTube
    Description

    YouTube flows

  17. YouTube users in Europe 2020-2029

    • statista.com
    Updated May 21, 2025
    Cite
    Statista Research Department (2025). YouTube users in Europe 2020-2029 [Dataset]. https://www.statista.com/topics/3853/internet-usage-in-europe/
    Dataset updated
    May 21, 2025
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Statista Research Department
    Area covered
    Europe
    Description

    The number of YouTube users in Europe was forecast to increase continuously between 2024 and 2029 by a total of 7.8 million users (+3.61 percent). After the ninth consecutive year of growth, the YouTube user base is estimated to reach 223.61 million users, a new peak, in 2029. Notably, the number of YouTube users has increased continuously over the past years. User figures for the YouTube platform have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information). Find more key insights for the number of YouTube users in regions like North America and Australia & Oceania.

  18. RECOD.ai Events Dataset

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jul 17, 2024
    Cite
    Nascimento, José (2024). RECOD.ai Events Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_5547605
    Dataset updated
    Jul 17, 2024
    Dataset provided by
    Rocha, Anderson
    Nascimento, José
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Overview

    This data set consists of links to social network items for 34 different forensic events that took place between August 14th, 2018 and January 06th, 2021. The majority of the text and images are from Twitter (a minor part is from Flickr, Facebook and Google+), and every video is from YouTube.

    Data Collection

    We used Social Tracker (https://github.com/MKLab-ITI/mmdemo-dockerized), along with the social media platforms' APIs, to gather most of the collections. For a minor part, we used Twint (https://github.com/twintproject/twint). In both cases, we provided keywords related to the event to receive the data.

    It is important to mention that, in procedures like this one, usually only a small fraction of the collected data is in fact related to the event and useful for a further forensic analysis.

    Content

    We have data from 34 events, and for each of them we provide the files:

    items_full.csv: It contains links to any social media post that was collected.

    images.csv: Lists the images collected. In some files there is a field called "ItemUrl" that refers to the social network post (e.g., a tweet) that mentions that media item.

    video.csv: URLs of YouTube videos that were gathered about the event.

    video_tweet.csv: This file contains IDs of tweets and IDs of YouTube videos. A tweet whose ID is in this file has a video in its content. In turn, the link of a YouTube video whose ID is in this file was mentioned by at least one collected tweet. Only two collections have this file.

    description.txt: Contains some standard information about the event, and possibly some comments about any specific issue related to it.

    In fact, most of the collections do not have all the files above. Such an issue is due to changes in our collection procedure throughout the time of this work.

    Events

    We divided the events into six groups:

    1. Fire

    Devastating fire is the main issue of the event; therefore, most of the informative pictures show flames or burned constructions.

    14 Events

    2. Collapse

    Most of the relevant images depict collapsed buildings, bridges, etc. (not caused by fire).

    5 Events

    3. Shooting

    Likely images of guns and police officers. Little or no destruction of the environment.

    5 Events

    4. Demonstration

    Plethora of people on the streets. Some incident may have taken place during it, but in most cases the demonstration is the actual event.

    7 Events

    5. Collision

    Traffic collision. Pictures of damaged vehicles in an urban landscape. There may be images with victims on the street.

    1 Event

    6. Flood

    Events that range from fierce rain to a tsunami. Many pictures depict water.

    2 Events

    We list the events in the file recod-ai-events-dataset-list.pdf.
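
    The per-group event counts listed for this dataset can be tallied against the stated total of 34 events. The sketch below does only that; the dictionary name is illustrative.

```python
# Event counts per group, as listed in the description.
events_per_group = {
    "Fire": 14,
    "Collapse": 5,
    "Shooting": 5,
    "Demonstration": 7,
    "Collision": 1,
    "Flood": 2,
}

total_events = sum(events_per_group.values())
print(f"{len(events_per_group)} groups, {total_events} events")
```

    The six groups add up to 34 events, in agreement with the overview.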

    Media Content

    Due to the terms of use of the social networks, we do not make the collected texts, images and videos publicly available. However, additional media content related to one or more events can be provided on request by contacting the authors.

    Funding

    DéjàVu thematic project, São Paulo Research Foundation (grants 2017/12646-3, 2018/18264-8 and 2020/02241-9)

  19. YouTube users in Africa 2020-2029

    • statista.com
    Updated Feb 15, 2025
    Cite
    Statista Research Department (2025). YouTube users in Africa 2020-2029 [Dataset]. https://www.statista.com/topics/9813/internet-usage-in-africa/
    Dataset updated
    Feb 15, 2025
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Statista Research Department
    Description

    The number of YouTube users in Africa was forecast to increase continuously between 2024 and 2029 by a total of 0.03 million users (+3.95 percent). The YouTube user base is estimated to amount to 0.79 million users in 2029. User figures for the YouTube platform have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information). Find more key insights for the number of YouTube users in regions like Worldwide and the Americas.

  20. EOAD (Egocentric Outdoor Activity Dataset)

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jul 12, 2024
    Cite
    Mehmet Ali Arabacı (2024). EOAD (Egocentric Outdoor Activity Dataset) [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7742659
    Dataset updated
    Jul 12, 2024
    Dataset provided by
    Alptekin Temizel
    Mehmet Ali Arabacı
    Elif Surer
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    EOAD is a collection of videos captured by wearable cameras, mostly of sports activities. It contains both visual and audio modalities.

    EOAD started from the HUJI and FPVSum egocentric activity datasets. However, the number of samples and the diversity of activities in HUJI and FPVSum were insufficient. Therefore, we combined these datasets and extended them with new YouTube videos.

    The selection of videos was based on the following criteria:

    The videos should not include text overlays.

    The videos should contain natural sound (no external music).

    The actions in videos should be continuous (no scene cuts or jumps in time).

    Video samples were trimmed depending on scene changes for long videos (such as driving, scuba diving, and cycling). As a result, a video may have several clips depicting egocentric actions. Hence, video clips were extracted from carefully defined time intervals within videos. The final dataset includes video clips with a single action and natural audio information.
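
    The trimming step described above can be sketched as follows. This is only an illustration and not the authors' script: the function name `trim_commands`, the output naming scheme, and the example paths are hypothetical, and it assumes the `ffmpeg` command-line tool is used, with `-ss` and `-to` given as output options to mark the start and end of each clip. The sketch only builds the commands; it does not run them.

```python
from pathlib import Path


def trim_commands(video_path, intervals, out_dir="clips"):
    """Build one ffmpeg command per (start, end) interval of a raw video.

    `intervals` is a list of ("HH:MM:SS", "HH:MM:SS") pairs. The output
    file names (<stem>_clipNN<suffix>) are a hypothetical convention.
    """
    src = Path(video_path)
    commands = []
    for i, (start, end) in enumerate(intervals, start=1):
        dst = Path(out_dir) / f"{src.stem}_clip{i:02d}{src.suffix}"
        # -ss / -to after -i: keep only the part between start and end.
        commands.append(
            ["ffmpeg", "-i", str(src), "-ss", start, "-to", end, str(dst)]
        )
    return commands


if __name__ == "__main__":
    # Two clips cut from one (hypothetical) raw video.
    for cmd in trim_commands(
        "raw/surfing_01.mp4",
        [("00:00:05", "00:00:20"), ("00:01:00", "00:01:30")],
    ):
        print(" ".join(cmd))
```

    Each command could then be passed to `subprocess.run(cmd, check=True)` once the interval list for a video has been defined.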

    Statistics for EOAD:

    30 activities

    303 distinct videos

    1392 video clips

    2243 minutes of labeled video clips

    The detailed statistics for the selected datasets and the video clips crawled from YouTube are given below:

    HUJI: 49 distinct videos - 148 video clips for 9 activities (driving, biking, motorcycle, walking, boxing, horse riding, running, skiing, stair climbing)

    FPVSum: 39 distinct videos - 124 video segments for 8 activities (biking, horse riding, skiing, longboarding, rock climbing, scuba, skateboarding, surfing)

    YouTube: 216 distinct videos - 1120 video clips for 27 activities (american football, basketball, bungee jumping, driving, go-kart, horse riding, ice hockey, jet ski, kayaking, kitesurfing, longboarding, motorcycle, paintball, paragliding, rafting, rock climbing, rowing, running, sailing, scuba diving, skateboarding, soccer, stair climbing, surfing, tennis, volleyball, walking)

    The video clips used for the training, validation and test sets for each activity are listed in Table 1. Multiple video clips may belong to a single video because the video was trimmed (i.e., at scene cuts, temporarily overlaid text, or parts unrelated to the activity).

    When splitting the dataset, the minimum number of videos for each activity was set to 8. The videos were divided 50%, 25%, and 25% into training (minimum four videos), validation (minimum two videos), and test (minimum two videos) sets, respectively. Videos were split at the level of the raw footage to prevent similar video clips (with the same actors and scenes) from being spread across the training, validation, and test sets. That is, all video clips trimmed from the same video were assigned together to the training, validation, or test set to ensure a fair comparison.
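
    A minimal sketch of such a video-level split for a single activity is given below. It is not the authors' code: the function name `split_by_video`, the input layout (pairs of clip id and source video id), the random shuffling, and the rounding rule are all assumptions made for illustration. The point it shows is that whole videos, not individual clips, are assigned to a set.

```python
import random
from collections import defaultdict


def split_by_video(clips, seed=0):
    """Split clips of one activity into train/val/test by source video.

    `clips` is a list of (clip_id, video_id) pairs. Roughly 50/25/25 of
    the videos go to train/val/test, with at least 4/2/2 videos each.
    """
    by_video = defaultdict(list)
    for clip_id, video_id in clips:
        by_video[video_id].append(clip_id)

    videos = sorted(by_video)
    if len(videos) < 8:
        raise ValueError("at least 8 source videos are needed per activity")
    random.Random(seed).shuffle(videos)  # assumed: random assignment

    n_val = max(2, len(videos) // 4)
    n_test = max(2, len(videos) // 4)
    n_train = len(videos) - n_val - n_test  # >= 4 whenever len(videos) >= 8

    groups = {
        "train": videos[:n_train],
        "val": videos[n_train:n_train + n_val],
        "test": videos[n_train + n_val:],
    }
    # All clips of a video follow that video into its set.
    return {
        name: [clip for video in vids for clip in by_video[video]]
        for name, vids in groups.items()
    }


if __name__ == "__main__":
    # Hypothetical activity with 8 videos and 3 clips per video.
    demo = [(f"v{v}_c{c}", f"v{v}") for v in range(8) for c in range(3)]
    for name, ids in split_by_video(demo).items():
        print(name, len(ids))
```

    Because the split is made on videos, the number of clips per set can differ considerably from the 50/25/25 ratio when some videos yield many more clips than others.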

    Some activities, such as scuba, longboarding, or horse riding, are continuous throughout the video, so their number of video segments equals their number of videos. Other activities, such as skating, occur over a short time, which makes their number of video segments higher than the others. As a result, the number of video clips in the training, validation, and test sets is highly imbalanced across the selected activities (e.g., jet ski and rafting have 4 video clips for training, whereas soccer has 99).

    Table 1 - Dataset splitting for EOAD

                          Train                  Validation             Test
    Action Label          Clips  Total Duration  Clips  Total Duration  Clips  Total Duration
    AmericanFootball         34  00:06:09           36  00:05:03            9  00:01:20
    Basketball               43  01:13:22           19  00:08:13           10  00:28:46
    Biking                    9  01:58:01            6  00:32:22           11  00:36:16
    Boxing                    7  00:24:54           11  00:14:14            5  00:17:30
    BungeeJumping             7  00:02:22            4  00:01:36            4  00:01:31
    Driving                  19  00:37:23            9  00:24:46            9  00:29:23
    GoKart                    5  00:40:00            3  00:11:46            3  00:19:46
    Horseback                 5  01:15:14            5  01:02:26            2  00:20:38
    IceHockey                52  00:19:22           46  00:20:34           10  00:36:59
    Jetski                    4  00:23:35            5  00:18:42            6  00:02:43
    Kayaking                 28  00:43:11           22  00:14:23            4  00:11:05
    Kitesurfing              30  00:21:51           17  00:05:38            6  00:01:32
    Longboarding              5  00:15:40            4  00:18:03            4  00:09:11
    Motorcycle               20  00:49:38           21  00:13:53            8  00:20:30
    Paintball                 7  00:33:52            4  00:12:08            4  00:08:52
    Paragliding              11  00:28:42            4  00:10:16            4  00:19:50
    Rafting                   4  00:15:41            3  00:07:27            3  00:06:13
    RockClimbing              6  00:49:38            2  00:21:59            2  00:18:50
    Rowing                    5  00:47:05            3  00:13:21            3  00:03:26
    Running                  21  01:21:56           19  00:46:29           11  00:42:59
    Sailing                   7  00:39:30            4  00:14:39            6  00:15:43
    Scuba                     5  00:35:02            3  00:23:43            2  00:18:52
    Skate                    91  00:15:53           30  00:07:01           10  00:02:03
    Ski                      14  01:48:15           17  01:01:59            7  00:39:15
    Soccer                  102  00:48:39           52  00:13:17           16  00:06:54
    StairClimbing             6  01:05:32            6  00:17:18            5  00:20:22
    Surfing                  23  00:12:51           17  00:06:52           10  00:07:04
    Tennis                   34  00:27:04            9  00:06:03            9  00:03:14
    Volleyball               87  00:19:14           35  00:07:46            7  00:18:58
    Walking                  49  00:43:02           36  00:38:25           10  00:10:23
    Total (30 activities)   740  20:22:37          452  09:20:23          200  08:00:08
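
    The totals in Table 1 can be cross-checked with a few lines of arithmetic. The sketch below uses only the "Total" row values from the table; the helper names `hms_to_seconds` and `seconds_to_hms` are introduced here for illustration. The clip counts add up to the 1392 clips quoted in the statistics, and the three durations add up to roughly 2263 minutes, which is close to, but not identical to, the 2243 minutes quoted there.

```python
def hms_to_seconds(hms):
    """Convert an "HH:MM:SS" string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def seconds_to_hms(total):
    """Convert a number of seconds back to "HH:MM:SS"."""
    hours, remainder = divmod(total, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


# (clips, total duration) per split, copied from the "Total" row of Table 1.
TOTALS = {
    "train": (740, "20:22:37"),
    "validation": (452, "09:20:23"),
    "test": (200, "08:00:08"),
}

total_clips = sum(clips for clips, _ in TOTALS.values())
total_seconds = sum(hms_to_seconds(duration) for _, duration in TOTALS.values())

if __name__ == "__main__":
    print(total_clips)                    # 1392
    print(seconds_to_hms(total_seconds))  # 37:43:08
    print(round(total_seconds / 60))      # 2263
```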

    EOAD Code Repository

    Scripts for downloading the raw videos and trimming them into video clips are provided in this GitHub repository.

    For questions, please contact mali.arabaci@gmail.com.
