100+ datasets found
  1. YouTube Datasets

    • brightdata.com
    .json, .csv, .xlsx
    Updated Jan 9, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Bright Data (2023). YouTube Datasets [Dataset]. https://brightdata.com/products/datasets/youtube
    Explore at:
    .json, .csv, .xlsxAvailable download formats
    Dataset updated
    Jan 9, 2023
    Dataset authored and provided by
    Bright Datahttps://brightdata.com/
    License

    https://brightdata.com/licensehttps://brightdata.com/license

    Area covered
    Worldwide, YouTube
    Description

    Use our YouTube profiles dataset to extract both business and non-business information from public channels and filter by channel name, views, creation date, or subscribers. Datapoints include URL, handle, banner image, profile image, name, subscribers, description, video count, create date, views, details, and more. You may purchase the entire dataset or a customized subset, depending on your needs. Popular use cases for this dataset include sentiment analysis, brand monitoring, influencer marketing, and more.

  2. h

    YouTube-Commons

    • huggingface.co
    Updated Apr 17, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    PleIAs (2024). YouTube-Commons [Dataset]. https://huggingface.co/datasets/PleIAs/YouTube-Commons
    Explore at:
    Dataset updated
    Apr 17, 2024
    Dataset authored and provided by
    PleIAs
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    YouTube
    Description

    📺 YouTube-Commons 📺

    YouTube-Commons is a collection of audio transcripts of 2,063,066 videos shared on YouTube under a CC-By license.

      Content
    

    The collection comprises 22,709,724 original and automatically translated transcripts from 3,156,703 videos (721,136 individual channels). In total, this represents nearly 45 billion words (44,811,518,375). All the videos where shared on YouTube with a CC-BY license: the dataset provide all the necessary provenance information… See the full description on the dataset page: https://huggingface.co/datasets/PleIAs/YouTube-Commons.

  3. P

    MLB-YouTube Dataset Dataset

    • paperswithcode.com
    Updated Mar 23, 2021
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    AJ Piergiovanni; Michael S. Ryoo (2021). MLB-YouTube Dataset Dataset [Dataset]. https://paperswithcode.com/dataset/mlb-youtube-dataset
    Explore at:
    Dataset updated
    Mar 23, 2021
    Authors
    AJ Piergiovanni; Michael S. Ryoo
    Area covered
    YouTube
    Description

    The MLB-YouTube dataset is a new, large-scale dataset consisting of 20 baseball games from the 2017 MLB post-season available on YouTube with over 42 hours of video footage. The dataset consists of two components: segmented videos for activity recognition and continuous videos for activity classification. It is quite challenging as it is created from TV broadcast baseball games where multiple different activities share the camera angle. Further, the motion/appearance difference between the various activities is quite small.

  4. h

    youtube-transcriptions

    • huggingface.co
    Updated Jul 6, 2021
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    James Briggs (2021). youtube-transcriptions [Dataset]. https://huggingface.co/datasets/jamescalam/youtube-transcriptions
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jul 6, 2021
    Authors
    James Briggs
    License

    https://choosealicense.com/licenses/afl-3.0/https://choosealicense.com/licenses/afl-3.0/

    Area covered
    YouTube
    Description

    The YouTube transcriptions dataset contains technical tutorials (currently from James Briggs, Daniel Bourke, and AI Coffee Break) transcribed using OpenAI's Whisper (large). Each row represents roughly a sentence-length chunk of text alongside the video URL and timestamp. Note that each item in the dataset contains just a short chunk of text. For most use cases you will likely need to merge multiple rows to create more substantial chunks of text, if you need to do that, this code snippet will… See the full description on the dataset page: https://huggingface.co/datasets/jamescalam/youtube-transcriptions.

  5. P

    YouTube-100M Dataset

    • paperswithcode.com
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Shawn Hershey; Sourish Chaudhuri; Daniel P. W. Ellis; Jort F. Gemmeke; Aren Jansen; R. Channing Moore; Manoj Plakal; Devin Platt; Rif A. Saurous; Bryan Seybold; Malcolm Slaney; Ron J. Weiss; Kevin Wilson, YouTube-100M Dataset [Dataset]. https://paperswithcode.com/dataset/youtube-100m
    Explore at:
    Authors
    Shawn Hershey; Sourish Chaudhuri; Daniel P. W. Ellis; Jort F. Gemmeke; Aren Jansen; R. Channing Moore; Manoj Plakal; Devin Platt; Rif A. Saurous; Bryan Seybold; Malcolm Slaney; Ron J. Weiss; Kevin Wilson
    Area covered
    YouTube
    Description

    The YouTube-100M data set consists of 100 million YouTube videos: 70M training videos, 10M evaluation videos, and 20M validation videos. Videos average 4.6 minutes each for a total of 5.4M training hours. Each of these videos is labeled with 1 or more topic identifiers from a set of 30,871 labels. There are an average of around 5 labels per video. The labels are assigned automatically based on a combination of metadata (title, description, comments, etc.), context, and image content for each video. The labels apply to the entire video and range from very generic (e.g. “Song”) to very specific (e.g. “Cormorant”). Being machine generated, the labels are not 100% accurate and of the 30K labels, some are clearly acoustically relevant (“Trumpet”) and others are less so (“Web Page”). Videos often bear annotations with multiple degrees of specificity. For example, videos labeled with “Trumpet” are often labeled “Entertainment” as well, although no hierarchy is enforced.

  6. c

    Youtube programming videos sample dataset

    • crawlfeeds.com
    json, zip
    Updated Apr 27, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Crawl Feeds (2025). Youtube programming videos sample dataset [Dataset]. https://crawlfeeds.com/datasets/youtube-programming-videos-sample-dataset
    Explore at:
    zip, jsonAvailable download formats
    Dataset updated
    Apr 27, 2025
    Dataset authored and provided by
    Crawl Feeds
    License

    https://crawlfeeds.com/privacy_policyhttps://crawlfeeds.com/privacy_policy

    Area covered
    YouTube
    Description

    Programming youtube videos dataset. Total records extracted more than 300. Last extracted on 24 jan 2022.

    Get in touch with crawlfeeds team for large datasets and customized youtube datasets.

  7. YouTube 8 Million - Data Lakehouse Ready

    • registry.opendata.aws
    Updated Feb 17, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Amazon Web Services (2022). YouTube 8 Million - Data Lakehouse Ready [Dataset]. https://registry.opendata.aws/yt8m/
    Explore at:
    Dataset updated
    Feb 17, 2022
    Dataset provided by
    Amazon Web Serviceshttp://aws.amazon.com/
    Amazon Web Serviceshttps://aws.amazon.com/
    Area covered
    YouTube
    Description

    This both the original .tfrecords and a Parquet representation of the YouTube 8 Million dataset. YouTube-8M is a large-scale labeled video dataset that consists of millions of YouTube video IDs, with high-quality machine-generated annotations from a diverse vocabulary of 3,800+ visual entities. It comes with precomputed audio-visual features from billions of frames and audio segments, designed to fit on a single hard disk. This dataset also includes the YouTube-8M Segments data from June 2019. This dataset is 'Lakehouse Ready'. Meaning, you can query this data in-place straight out of the Registry of Open Data S3 bucket. Deploy this dataset's corresponding CloudFormation template to create the AWS Glue Catalog entries into your account in about 30 seconds. That one step will enable you to interact with the data with AWS Athena, AWS SageMaker, AWS EMR, or join into your AWS Redshift clusters. More detail in (the documentation)[https://github.com/aws-samples/data-lake-as-code/blob/roda-ml/README.md.

  8. Data from: Youtube social network

    • kaggle.com
    zip
    Updated Sep 1, 2019
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Lorenzo De Tomasi (2019). Youtube social network [Dataset]. https://www.kaggle.com/datasets/lodetomasi1995/youtube-social-network
    Explore at:
    zip(10604317 bytes)Available download formats
    Dataset updated
    Sep 1, 2019
    Authors
    Lorenzo De Tomasi
    License

    https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    YouTube
    Description

    Youtube social network and ground-truth communities Dataset information Youtube is a video-sharing web site that includes a social network. In the Youtube social network, users form friendship each other and users can create groups which other users can join. We consider such user-defined groups as ground-truth communities. This data is provided by Alan Mislove et al.

    We regard each connected component in a group as a separate ground-truth community. We remove the ground-truth communities which have less than 3 nodes. We also provide the top 5,000 communities with highest quality which are described in our paper. As for the network, we provide the largest connected component.

    more info : https://snap.stanford.edu/data/com-Youtube.html

  9. Hours of video uploaded to YouTube every minute 2007-2022

    • statista.com
    Updated Jun 20, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Statista (2025). Hours of video uploaded to YouTube every minute 2007-2022 [Dataset]. https://www.statista.com/statistics/259477/hours-of-video-uploaded-to-youtube-every-minute/
    Explore at:
    Dataset updated
    Jun 20, 2025
    Dataset authored and provided by
    Statistahttp://statista.com/
    Time period covered
    Jun 2007 - Jun 2022
    Area covered
    Worldwide, YouTube
    Description

    As of June 2022, more than *** hours of video were uploaded to YouTube every minute. This equates to approximately ****** hours of newly uploaded content per hour. The amount of content on YouTube has increased dramatically as consumer’s appetites for online video has grown. In fact, the number of video content hours uploaded every 60 seconds grew by around ** percent between 2014 and 2020. YouTube global users Online video is one of the most popular digital activities worldwide, with ** percent of internet users worldwide watching more than ** hours of online videos on a weekly basis in 2023. It was estimated that in 2023 YouTube would reach approximately *** million users worldwide. In 2022, the video platform was one of the leading media and entertainment brands worldwide, with a value of more than ** billion U.S. dollars. YouTube video content consumption The most viewed YouTube channels of all time have racked up billions of viewers, millions of subscribers and cover a wide variety of topics ranging from music to cosmetics. The YouTube channel owner with the most video views is Indian music label T-Series, which counted ****** billion lifetime views. Other popular YouTubers are gaming personalities such as PewDiePie, DanTDM and Markiplier.

  10. h

    youtube-dataset

    • huggingface.co
    Updated Feb 8, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    DrTim-AI (2025). youtube-dataset [Dataset]. https://huggingface.co/datasets/BitiBytes123/youtube-dataset
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Feb 8, 2025
    Authors
    DrTim-AI
    Area covered
    YouTube
    Description

    BitiBytes123/youtube-dataset dataset hosted on Hugging Face and contributed by the HF Datasets community

  11. YouTube users worldwide 2020-2029

    • statista.com
    Updated Mar 3, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Statista (2025). YouTube users worldwide 2020-2029 [Dataset]. https://www.statista.com/forecasts/1144088/youtube-users-in-the-world
    Explore at:
    Dataset updated
    Mar 3, 2025
    Dataset authored and provided by
    Statistahttp://statista.com/
    Area covered
    World
    Description

    The global number of Youtube users in was forecast to continuously increase between 2024 and 2029 by in total 232.5 million users (+24.91 percent). After the ninth consecutive increasing year, the Youtube user base is estimated to reach 1.2 billion users and therefore a new peak in 2029. Notably, the number of Youtube users of was continuously increasing over the past years.User figures, shown here regarding the platform youtube, have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period.The shown data are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press and they are processed to generate comparable data sets (see supplementary notes under details for more information).Find more key insights for the number of Youtube users in countries like Africa and South America.

  12. h

    youtube-dataset

    • huggingface.co
    Updated Oct 23, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Prompt Engineer (2024). youtube-dataset [Dataset]. https://huggingface.co/datasets/Prompt48/youtube-dataset
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Oct 23, 2024
    Authors
    Prompt Engineer
    Area covered
    YouTube
    Description

    Prompt48/youtube-dataset dataset hosted on Hugging Face and contributed by the HF Datasets community

  13. Countries with the most YouTube users 2025

    • statista.com
    • ai-chatbox.pro
    Updated Feb 17, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Statista (2025). Countries with the most YouTube users 2025 [Dataset]. https://www.statista.com/statistics/280685/number-of-monthly-unique-youtube-users/
    Explore at:
    Dataset updated
    Feb 17, 2025
    Dataset authored and provided by
    Statistahttp://statista.com/
    Time period covered
    Feb 2025
    Area covered
    YouTube, Worldwide
    Description

    As of February 2025, India was the country with the largest YouTube audience by far, with approximately 491 million users engaging with the popular social video platform. The United States followed, with around 253 million YouTube viewers. Brazil came in third, with 144 million users watching content on YouTube. The United Kingdom saw around 54.8 million internet users engaging with the platform in the examined period. What country has the highest percentage of YouTube users? In July 2024, the United Arab Emirates was the country with the highest YouTube penetration worldwide, as around 94 percent of the country's digital population engaged with the service. In 2024, YouTube counted around 100 million paid subscribers for its YouTube Music and YouTube Premium services. YouTube mobile markets In 2024, YouTube was among the most popular social media platforms worldwide. In terms of revenues, the YouTube app generated approximately 28 million U.S. dollars in revenues in the United States in January 2024, as well as 19 million U.S. dollars in Japan.

  14. A YouTube Dataset with User-Level Usage Data

    • kaggle.com
    Updated May 28, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Shruti Lall (2025). A YouTube Dataset with User-Level Usage Data [Dataset]. https://www.kaggle.com/datasets/shrutilall/a-youtube-dataset-with-user-level-usage-data
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    May 28, 2025
    Dataset provided by
    Kagglehttp://kaggle.com/
    Authors
    Shruti Lall
    License

    MIT Licensehttps://opensource.org/licenses/MIT
    License information was derived automatically

    Area covered
    YouTube
    Description

    This dataset contains anonymized logs of user-level YouTube viewing activity, collected via Amazon Mechanical Turk. Each user in the dataset provided at least six months of their YouTube watch history, enabling longitudinal analysis of personal viewing patterns.

    Each row in the dataset represents a single watch event and includes metadata such as: - the video ID - watch timestamp - whether the user was subscribed to the channel at the time - and whether the video was part of a playlist

    This dataset is intended to support research in user behavior modeling, content recommendation systems, temporal video engagement, and personalized analytics.

    The dataset accompanies the paper:

    "A YouTube dataset with user-level usage data: Baseline characteristics and key insights"
    Authors: Shruti Lall, Mohit Agarwal, Raghupathy Sivakumar
    Conference: IEEE ICC 2020 – International Conference on Communications

    If you use this dataset in your research, please cite the paper above.

  15. Data from: YouTube Videos Datasets

    • brightdata.com
    .json, .csv, .xlsx
    Updated Dec 20, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Bright Data (2024). YouTube Videos Datasets [Dataset]. https://brightdata.com/products/datasets/youtube/videos
    Explore at:
    .json, .csv, .xlsxAvailable download formats
    Dataset updated
    Dec 20, 2024
    Dataset authored and provided by
    Bright Datahttps://brightdata.com/
    License

    https://brightdata.com/licensehttps://brightdata.com/license

    Area covered
    Worldwide, YouTube
    Description

    Use our YouTube Videos dataset to extract detailed information from public videos and filter by video title, views, upload date, or likes. Data points include video URL, title, description, thumbnail, upload date, view count, like count, comment count, tags, and more. You can purchase the entire dataset or a customized subset, tailored to your needs. Popular use cases for this dataset include trend analysis, content performance tracking, brand monitoring, and influencer campaign optimization.

  16. f

    YouTube Dataset on Mobile Streaming for Internet Traffic Modeling, Network...

    • figshare.com
    txt
    Updated Apr 14, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Frank Loh; Florian Wamser; Fabian Poignée; Stefan Geißler; Tobias Hoßfeld (2022). YouTube Dataset on Mobile Streaming for Internet Traffic Modeling, Network Management, and Streaming Analysis [Dataset]. http://doi.org/10.6084/m9.figshare.19096823.v2
    Explore at:
    txtAvailable download formats
    Dataset updated
    Apr 14, 2022
    Dataset provided by
    figshare
    Authors
    Frank Loh; Florian Wamser; Fabian Poignée; Stefan Geißler; Tobias Hoßfeld
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    YouTube
    Description

    Streaming is by far the predominant type of traffic in communication networks. With thispublic dataset, we provide 1,081 hours of time-synchronous video measurements at network, transport, and application layer with the native YouTube streaming client on mobile devices. The dataset includes 80 network scenarios with 171 different individual bandwidth settings measured in 5,181 runs with limited bandwidth, 1,939 runs with emulated 3G/4G traces, and 4,022 runs with pre-defined bandwidth changes. This corresponds to 332GB video payload. We present the most relevant quality indicators for scientific use, i.e., initial playback delay, streaming video quality, adaptive video quality changes, video rebuffering events, and streaming phases.

  17. Most Watched Youtube Videos

    • kaggle.com
    zip
    Updated Apr 19, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Jatinthakur706 (2024). Most Watched Youtube Videos [Dataset]. https://www.kaggle.com/datasets/jatinthakur706/most-watched-youtube-videos
    Explore at:
    zip(0 bytes)Available download formats
    Dataset updated
    Apr 19, 2024
    Authors
    Jatinthakur706
    License

    https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    YouTube
    Description

    This dataset contains data related to most watched YouTube videos till April 2024 . This contains different columns namely views,artist,channel,etc. The data is ranked on the basis of number of views.

  18. a

    Yale YouTube Video Text

    • academictorrents.com
    bittorrent
    Updated Oct 20, 2014
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Yale (2014). Yale YouTube Video Text [Dataset]. https://academictorrents.com/details/156802226bcf5747e0bea4e4f14c03b3b952de80
    Explore at:
    bittorrent(434765881)Available download formats
    Dataset updated
    Oct 20, 2014
    Dataset authored and provided by
    Yale
    License

    https://academictorrents.com/nolicensespecifiedhttps://academictorrents.com/nolicensespecified

    Area covered
    YouTube
    Description

    YouTube Video Text (YVT) contains 30 videos. Each video has 15-second length, 30 frames per second, HD 720p quality and was collected from YouTube. The text content in the dataset can be divided into two categories, overlay text (e.g., captions, songs title, logos) and scene text (e.g. street signs, business signs, words on shirt).

  19. T

    youtube_vis

    • tensorflow.org
    Updated Feb 11, 2021
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    (2021). youtube_vis [Dataset]. https://www.tensorflow.org/datasets/catalog/youtube_vis
    Explore at:
    Dataset updated
    Feb 11, 2021
    Area covered
    YouTube
    Description

    Youtube-vis is a video instance segmentation dataset. It contains 2,883 high-resolution YouTube videos, a per-pixel category label set including 40 common objects such as person, animals and vehicles, 4,883 unique video instances, and 131k high-quality manual annotations.

    The YouTube-VIS dataset is split into 2,238 training videos, 302 validation videos and 343 test videos.

    No files were removed or altered during preprocessing.

    To use this dataset:

    import tensorflow_datasets as tfds
    
    ds = tfds.load('youtube_vis', split='train')
    for ex in ds.take(4):
     print(ex)
    

    See the guide for more informations on tensorflow_datasets.

  20. R

    Racoons Youtube Dataset

    • universe.roboflow.com
    zip
    Updated Oct 18, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    PUP (2022). Racoons Youtube Dataset [Dataset]. https://universe.roboflow.com/pup/racoons-youtube-io8ou
    Explore at:
    zipAvailable download formats
    Dataset updated
    Oct 18, 2022
    Dataset authored and provided by
    PUP
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    YouTube
    Variables measured
    Coons Bounding Boxes
    Description

    Racoons Youtube

    ## Overview
    
    Racoons Youtube is a dataset for object detection tasks - it contains Coons annotations for 400 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
      ## License
    
      This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/CC BY 4.0).
    
Share
FacebookFacebook
TwitterTwitter
Email
Click to copy link
Link copied
Close
Cite
Bright Data (2023). YouTube Datasets [Dataset]. https://brightdata.com/products/datasets/youtube
Organization logo

YouTube Datasets

Explore at:
.json, .csv, .xlsxAvailable download formats
Dataset updated
Jan 9, 2023
Dataset authored and provided by
Bright Datahttps://brightdata.com/
License

https://brightdata.com/licensehttps://brightdata.com/license

Area covered
Worldwide, YouTube
Description

Use our YouTube profiles dataset to extract both business and non-business information from public channels and filter by channel name, views, creation date, or subscribers. Datapoints include URL, handle, banner image, profile image, name, subscribers, description, video count, create date, views, details, and more. You may purchase the entire dataset or a customized subset, depending on your needs. Popular use cases for this dataset include sentiment analysis, brand monitoring, influencer marketing, and more.

Search
Clear search
Close search
Google apps
Main menu