73 datasets found
  1. YouTube users worldwide 2020-2029

    • statista.com
    Updated Jul 7, 2025
    Cite
    Statista (2025). YouTube users worldwide 2020-2029 [Dataset]. https://www.statista.com/forecasts/1144088/youtube-users-in-the-world
    Explore at:
    Dataset updated
    Jul 7, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    Worldwide, YouTube
    Description

    The global number of YouTube users was forecast to increase continuously between 2024 and 2029, by a total of ***** million users (+***** percent). After the ninth consecutive year of growth, the YouTube user base is estimated to reach *** billion users, a new peak, in 2029. Notably, the number of YouTube users has increased continuously over the past years. User figures, shown here for the YouTube platform, were estimated by taking into account company filings or press material, secondary research, app downloads, and traffic data. They refer to the average monthly active users over the period. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic, and technological environment in up to *** countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations, and the trade press, and are processed to generate comparable data sets (see supplementary notes under details for more information). Find more key insights for the number of YouTube users in regions like Africa and South America.

  2. YouTube Datasets

    • brightdata.com
    .json, .csv, .xlsx
    Updated Jan 9, 2023
    Cite
    Bright Data (2023). YouTube Datasets [Dataset]. https://brightdata.com/products/datasets/youtube
    Explore at:
    Available download formats: .json, .csv, .xlsx
    Dataset updated
    Jan 9, 2023
    Dataset authored and provided by
    Bright Data (https://brightdata.com/)
    License

    https://brightdata.com/license

    Area covered
    Worldwide, YouTube
    Description

    Use our YouTube profiles dataset to extract both business and non-business information from public channels and filter by channel name, views, creation date, or subscribers. Datapoints include URL, handle, banner image, profile image, name, subscribers, description, video count, create date, views, details, and more. You may purchase the entire dataset or a customized subset, depending on your needs. Popular use cases for this dataset include sentiment analysis, brand monitoring, influencer marketing, and more.

  3. Youtube video statistics for 1 million videos

    • kaggle.com
    Updated Jun 29, 2020
    Cite
    Mattia Zeni (2020). Youtube video statistics for 1 million videos [Dataset]. https://www.kaggle.com/datasets/mattiazeni/youtube-video-statistics-1million-videos/data
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jun 29, 2020
    Dataset provided by
    Kaggle
    Authors
    Mattia Zeni
    License

    Open Data Commons Attribution License (ODC-By) v1.0 (https://www.opendatacommons.org/licenses/by/1.0/)
    License information was derived automatically

    Area covered
    YouTube
    Description

    Motivation

    Studying how YouTube videos become viral or, more generally, how they evolve in terms of views, likes, and subscriptions is a topic of interest in many disciplines. With this dataset, which contains statistics about 1 million YouTube videos, you can study such phenomena. The information was collected in 2013, when YouTube exposed the data publicly. YouTube has since removed this functionality, and such statistics are now available only to the owner of the video. This makes the dataset unique.

    Context

    This dataset was generated with YOUStatAnalyzer, a tool I (Mattia Zeni) developed while working for CREATE-NET (www.create-net.org) within the framework of the CONGAS FP7 project (http://www.congas-project.eu). For the project we needed to collect and analyse the dynamics of YouTube video popularity. The dataset contains statistics of more than 1 million YouTube videos, chosen according to random keywords extracted from the WordNet library (http://wordnet.princeton.edu).

    The motivation that led us to develop the YOUStatAnalyzer data collection tool and create this dataset is that there is an active research community working on the interplay among individual user preferences, social dynamics, and advertising mechanisms, and a common problem is the lack of open large-scale datasets. At the time, no such tool existed. YouTube has since removed the possibility of viewing these data on each video's page, making this dataset unique.

    When using our dataset for research purposes, please cite it as:

    @INPROCEEDINGS{YOUStatAnalyzer,
      author    = {Mattia Zeni and Daniele Miorandi and Francesco {De Pellegrini}},
      title     = {{YOUStatAnalyzer}: a Tool for Analysing the Dynamics of {YouTube} Content Popularity},
      booktitle = {Proc. 7th International Conference on Performance Evaluation Methodologies and Tools (Valuetools, Torino, Italy, December 2013)},
      address   = {Torino, Italy},
      year      = {2013}
    }

    Content

    The dataset contains statistics and metadata of 1 million YouTube videos, collected in 2013. The videos were chosen according to random keywords extracted from the WordNet library (http://wordnet.princeton.edu).

    Dataset structure

    Each entry in the dataset has the following structure:

        {
          u'_id': u'9eToPjUnwmU',
          u'title': u'Traitor Compilation # 1 (Trouble ...',
          u'description': u'A traitor compilation by one are ...',
          u'category': u'Games',
          u'commentsNumber': u'6',
          u'publishedDate': u'2012-10-09T23:42:12.000Z',
          u'author': u'ServilityGaming',
          u'duration': u'208',
          u'type': u'video/3gpp',
          u'relatedVideos': [u'acjHy7oPmls', u'EhW2LbCjm7c', u'UUKigFAQLMA', ...],
          u'accessControl': {
            u'comment': {u'permission': u'allowed'},
            u'list': {u'permission': u'allowed'},
            u'videoRespond': {u'permission': u'moderated'},
            u'rate': {u'permission': u'allowed'},
            u'syndicate': {u'permission': u'allowed'},
            u'embed': {u'permission': u'allowed'},
            u'commentVote': {u'permission': u'allowed'},
            u'autoPlay': {u'permission': u'allowed'}
          },
          u'views': {
            u'cumulative': { u'data': [15.0, 25.0, 26.0, 26.0, ...] },
            u'daily': { u'data': [15.0, 10.0, 1.0, 0.0, ...] }
          },
          u'shares': {
            u'cumulative': { u'data': [0.0, 0.0, 0.0, 0.0, ...] },
            u'daily': { u'data': [0.0, 0.0, 0.0, 0.0, ...] }
          },
          u'watchtime': {
            u'cumulative': { u'data': [22.5666666667, 36.5166666667, 36.7, 36.7, ...] },
            u'daily': { u'data': [22.5666666667, 13.95, 0.166666666667, 0.0, ...] }
          },
          u'subscribers': {
            u'cumulative': { u'data': [0.0, 0.0, 0.0, 0.0, ...] },
            u'daily': { u'data': [-1.0, 0.0, 0.0, 0.0, ...] }
          },
          u'day': {
            u'data': [1349740800000.0, 1349827200000.0, 1349913600000.0, 1350000000000.0, ...]
          }
        }

    The structure above shows which fields an entry in the dataset has. They can be divided into 2 sections:

    1) Video Information.

    _id -> The video ID, which is also the unique identifier of the entry in the database.
    title -> The video's title.
    description -> The video's description.
    category -> The YouTube category the video is listed in.
    commentsNumber -> The number of comments posted by users.
    publishedDate -> The date the video was published.
    author -> The author of the video.
    duration -> The video duration in seconds.
    type -> The encoding type of the video.
    relatedVideos -> A list of related videos.
    accessControl -> A list of access policies for different aspects related to the video.

    2) Video Statistics.

    Each video can have 4 different statistics variables: views, shares, subscribers and watchtime. Recent videos have all of them, while older videos may have only the 'views' variable. Each variable has 2 dimensions, daily and cumulative.

    `views -> number of views collected by the vi...
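
    As a rough illustration of the daily and cumulative dimensions, the sketch below derives a daily series from a cumulative one and converts a 'day' value to a calendar date. It assumes that each daily value is the difference between consecutive cumulative values and that 'day' holds Unix epoch milliseconds in UTC; both assumptions match the sample 'views' and 'day' data above but are not stated by the dataset author, and the sample 'subscribers' series does not follow the first one. The function names are hypothetical.

```python
from datetime import datetime, timezone


def daily_from_cumulative(cumulative):
    """Return per-day increments for a cumulative series (assumed relation)."""
    daily = []
    previous = 0.0
    for total in cumulative:
        daily.append(total - previous)
        previous = total
    return daily


def day_to_date(epoch_ms):
    """Convert a 'day' value, assumed to be Unix epoch milliseconds (UTC), to a date."""
    return datetime.fromtimestamp(epoch_ms / 1000, tz=timezone.utc).date()


# Values copied from the sample entry shown above.
cumulative_views = [15.0, 25.0, 26.0, 26.0]
print(daily_from_cumulative(cumulative_views))  # [15.0, 10.0, 1.0, 0.0]
print(day_to_date(1349740800000.0))             # 2012-10-09
```

    The first converted day falls on the same date as the sample entry's publishedDate, which is consistent with the millisecond interpretation.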

  4. YouTube users in India 2020-2029

    • statista.com
    Updated Jul 10, 2025
    Cite
    Statista (2025). YouTube users in India 2020-2029 [Dataset]. https://www.statista.com/forecasts/1146150/youtube-users-in-india
    Explore at:
    Dataset updated
    Jul 10, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    India
    Description

    The number of YouTube users in India was forecast to increase continuously between 2024 and 2029, by a total of ***** million users (+***** percent). After the ninth consecutive year of growth, the YouTube user base is estimated to reach ****** million users, a new peak, in 2029. Notably, the number of YouTube users has increased continuously over the past years. User figures, shown here for the YouTube platform, were estimated by taking into account company filings or press material, secondary research, app downloads, and traffic data. They refer to the average monthly active users over the period. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic, and technological environment in up to *** countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations, and the trade press, and are processed to generate comparable data sets (see supplementary notes under details for more information). Find more key insights for the number of YouTube users in countries like Sri Lanka and Nepal.

  5. Data from: Youtube social network

    • kaggle.com
    zip
    Updated Sep 1, 2019
    Cite
    Lorenzo De Tomasi (2019). Youtube social network [Dataset]. https://www.kaggle.com/datasets/lodetomasi1995/youtube-social-network
    Explore at:
    Available download formats: zip (10604317 bytes)
    Dataset updated
    Sep 1, 2019
    Authors
    Lorenzo De Tomasi
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    YouTube
    Description

    Youtube social network and ground-truth communities

    Dataset information

    YouTube is a video-sharing web site that includes a social network. In the YouTube social network, users form friendships with each other and can create groups which other users can join. We consider such user-defined groups as ground-truth communities. This data is provided by Alan Mislove et al.

    We regard each connected component in a group as a separate ground-truth community. We remove the ground-truth communities which have fewer than 3 nodes. We also provide the top 5,000 communities with the highest quality, which are described in our paper. As for the network, we provide the largest connected component.
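
    The community construction described above can be sketched as follows. This is a minimal illustration, not the code used to build the dataset: the edge list, the group membership lists, and the function name are hypothetical, and connectivity is assumed to be measured on the subgraph induced by each group's members.

```python
from collections import defaultdict, deque


def ground_truth_communities(edges, groups, min_size=3):
    """Split each group into connected components and keep those with >= min_size nodes."""
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)

    communities = []
    for group in groups:
        members = set(group)
        seen = set()
        for start in sorted(members):
            if start in seen:
                continue
            # Breadth-first search restricted to members of this group.
            component = {start}
            seen.add(start)
            queue = deque([start])
            while queue:
                node = queue.popleft()
                for neighbour in adjacency[node] & members:
                    if neighbour not in seen:
                        seen.add(neighbour)
                        component.add(neighbour)
                        queue.append(neighbour)
            if len(component) >= min_size:
                communities.append(sorted(component))
    return communities


# Hypothetical friendship edges and one user-defined group.
edges = [(1, 2), (2, 3), (4, 5), (3, 6)]
groups = [[1, 2, 3, 4, 5]]
print(ground_truth_communities(edges, groups))  # [[1, 2, 3]]
```

    In this toy input the group splits into the components {1, 2, 3} and {4, 5}; the second is dropped because it has fewer than 3 nodes.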

    more info : https://snap.stanford.edu/data/com-Youtube.html

  6. Data from: YouTube Video Network Dataset for Israel-Hamas War

    • ieee-dataport.org
    Updated Dec 23, 2023
    Cite
    Thejas T (2023). YouTube Video Network Dataset for Israel-Hamas War [Dataset]. https://ieee-dataport.org/documents/youtube-video-network-dataset-israel-hamas-war
    Explore at:
    Dataset updated
    Dec 23, 2023
    Authors
    Thejas T
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Area covered
    Israel, YouTube
    Description

    Over the past few years, YouTube has become a popular site for broadcasting videos and for earning money by showcasing various skills in video form. For some people it has become a main source of income. Getting videos to trend among viewers is a major goal of every content creator. The popularity of a video and its reach depend entirely on YouTube's recommendation algorithm. This document is a dataset descriptor for the dataset collected over a time span of about 45 days during the Israel-Hamas War.

  7. YouTube Video and Channel Analytics

    • kaggle.com
    Updated Dec 8, 2023
    Cite
    The Devastator (2023). YouTube Video and Channel Analytics [Dataset]. https://www.kaggle.com/datasets/thedevastator/youtube-video-and-channel-analytics/code
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Dec 8, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    The Devastator
    Area covered
    YouTube
    Description

    YouTube Video and Channel Analytics: Statistics and Features

    By VISHWANATH SESHAGIRI [source]

    About this dataset

    The YouTube Video and Channel Metadata dataset is a comprehensive collection of data related to YouTube videos and channels. It consists of various features and statistics that provide insights into the performance and engagement of videos, as well as the overall popularity and success of channels.

    The dataset includes direct features, such as total views, channel elapsed time, channel ID, video category ID, channel view count, likes per subscriber, dislikes per subscriber, comments per subscriber, and more. Additionally, there are indirect features derived from YouTube's API that provide additional metrics for analysis.

    One important aspect covered in this dataset is the ratio between certain metrics. For example:

    • The totalviews/channelelapsedtime ratio represents the average number of views a video has received relative to the elapsed time since the channel was created.
    • The likes/dislikes ratio indicates the proportion of likes on a video compared to dislikes.
    • The views/subscribers ratio showcases how engaged subscribers are by measuring the number of views relative to the number of subscribers.

    Other metrics explored in this dataset include comments/views ratio (representing viewer engagement), dislikes/views ratio (measuring viewer sentiment), comments/subscriber ratio (indicating community participation), likes/subscriber ratio (reflecting audience loyalty), dislikes/subscriber ratio (highlighting dissatisfaction levels), total number of subscribers for a channel (subscriberCount), total views on a channel (channelViewCount), total number of comments on a channel (channelCommentCount), among others.
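
    To make the ratio definitions above concrete, here is a minimal sketch that computes several of them from raw counts. The function names and the input counts are illustrative assumptions; the dataset itself ships the ratios as precomputed columns, and how it handles zero denominators is not documented here.

```python
def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    if denominator == 0:
        return None
    return numerator / denominator


def engagement_ratios(views, likes, dislikes, comments, subscribers):
    """Compute ratio features named after the columns described above."""
    return {
        "likes/dislikes": safe_ratio(likes, dislikes),
        "views/subscribers": safe_ratio(views, subscribers),
        "comments/views": safe_ratio(comments, views),
        "dislikes/views": safe_ratio(dislikes, views),
        "comments/subscriber": safe_ratio(comments, subscribers),
        "likes/subscriber": safe_ratio(likes, subscribers),
        "dislikes/subscriber": safe_ratio(dislikes, subscribers),
    }


# Made-up counts for a single video and its channel.
ratios = engagement_ratios(views=1000, likes=50, dislikes=10, comments=20, subscribers=200)
for name, value in ratios.items():
    print(name, value)
```

    Returning None for a zero denominator is one possible convention; dropping such rows or substituting a sentinel value would be equally defensible.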

    By analyzing these features and statistics within this dataset, researchers or data analysts can gain valuable insights into various aspects related to YouTube videos and channels. Furthermore, it may be possible to build statistical relationships between videos based on their performance characteristics or even develop topic trees based on similarities between different content categories. This dataset serves as an excellent resource for studying YouTube's ecosystem comprehensively.

    To access additional resources related to this dataset or explore code repositories associated with it, users can refer to the provided GitHub repository.

    How to use the dataset

    Introduction:

    Step 1: Understanding the Dataset

    Start by familiarizing yourself with the columns in the dataset. Here are some key features to pay attention to:

    • totalviews/channelelapsedtime: The ratio of total views of a video to the elapsed time of the channel.
    • channelViewCount: The total number of views on the channel.
    • likes/subscriber: The ratio of likes on a video to the number of subscribers of the channel.
    • views/subscribers: The ratio of views on a video to the number of subscribers of the channel.
    • subscriberCount: The total number of subscribers for a channel.
    • dislikes/views: The ratio of dislikes on a video to its total views.
    • comments/subscriber: The ratio of comments on a video to the number of subscribers of the channel.

    Step 2: Determining Data Analysis Objectives

    Define your objectives or research questions before diving into data analysis using this dataset. For example, you may want to explore relationships between viewership, engagement metrics, and various attributes such as category ID or elapsed time.

    Step 3: Analyzing Relationships between Variables

    Use statistical techniques like correlation analysis or visualization tools like scatter plots, bar graphs, or heatmaps to understand relationships between variables in this dataset.

    For example:

    • Plotting totalviews/channelelapsedtime against channelViewCount can help identify patterns between overall video popularity and channels' view count growth over time.
    • Comparing likes/dislikes with comments/views can give insights into viewer engagement levels across different videos.
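
    The correlation analysis mentioned in Step 3 can be sketched with a plain-Python Pearson coefficient. The two columns below are made-up values standing in for totalviews/channelelapsedtime and channelViewCount; they are not taken from the dataset, and the function name is hypothetical.

```python
import math


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / math.sqrt(var_x * var_y)


# Illustrative values only.
views_per_elapsed_time = [0.5, 1.2, 3.4, 2.2, 5.0]
channel_view_count = [1200, 3100, 9800, 5400, 15000]
print(round(pearson(views_per_elapsed_time, channel_view_count), 3))
```

    A coefficient near 1 or -1 suggests a strong linear relationship; it says nothing about causation or about non-linear patterns, which a scatter plot may reveal.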

    Step 4: Building Machine Learning Models (Optional)

    If your objective includes predictive analysis or building machine learning models, select relevant features as predictors and the target variable (e.g., totalviews/channelelapsedtime) for training and evaluation.

    You can use various algorithms such as linear regression, decision trees, or neural networks to predict video performance or channel growth based on available attributes.

    Step 5: Evaluating Model Performance

    Assess the predictive model's performance using appropriate evaluation metrics like mean square...
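
    As a hedged illustration of Steps 4 and 5, the sketch below fits a one-variable linear regression by ordinary least squares and scores it with mean squared error. It is a toy stand-in for whichever model you choose; the predictor and target values are invented, and the helper names are hypothetical.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("predictor must not be constant")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def mean_squared_error(y_true, y_pred):
    """Average of squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Invented training data: one predictor and one target.
train_x = [1.0, 2.0, 3.0, 4.0]
train_y = [3.1, 4.9, 7.2, 8.8]
slope, intercept = fit_line(train_x, train_y)

# Invented held-out data used only for evaluation.
test_x = [5.0, 6.0]
test_y = [11.0, 13.1]
predictions = [slope * x + intercept for x in test_x]
print(mean_squared_error(test_y, predictions))
```

    Evaluating on data that was not used for fitting, as done here, gives a more honest error estimate than scoring the training set.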

  8. Top 1000 YouTube Channels in the World 🌐📊🎥

    • kaggle.com
    Updated Jun 25, 2024
    Cite
    Mayank Anand (2024). Top 1000 YouTube Channels in the World 🌐📊🎥 [Dataset]. https://www.kaggle.com/datasets/mayankanand2701/top-1000-youtube-channels-in-the-world/data
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jun 25, 2024
    Dataset provided by
    Kaggle
    Authors
    Mayank Anand
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Area covered
    YouTube
    Description

    YouTube is the world's largest video-sharing platform, launched in 2005. It allows users to upload, view, and share videos, and has grown to be a central hub for content creators across various fields, including entertainment, education, music, and more. With over 2 billion logged-in users monthly, YouTube has become an essential platform for digital content and marketing.

    The Top 1000 YouTube Channels Dataset captures detailed information about the top-performing YouTube channels globally. This dataset includes the following columns:

    • Rank : The ranking of the YouTube channel based on its overall popularity and performance.
    • Youtuber : The name of the YouTuber or the title of the YouTube channel.
    • Subscribers : The total number of subscribers to the channel, indicating its reach and popularity.
    • Video Views : The total number of video views the channel has accumulated, reflecting its engagement and audience interaction.
    • Video Count : The total number of videos uploaded by the channel, showing the content volume produced.
    • Category : The genre or category the channel belongs to, such as music, education, entertainment, etc.
    • Started : The year the channel was created, providing insight into its longevity and growth over time.

    This dataset is invaluable for analyzing trends, understanding content strategies, and benchmarking channel performances within the YouTube ecosystem.

  9. youtube

    • huggingface.co
    + more versions
    Cite
    Common Pile, youtube [Dataset]. https://huggingface.co/datasets/common-pile/youtube
    Explore at:
    Dataset authored and provided by
    Common Pile
    Area covered
    YouTube
    Description

    Creative Commons YouTube


    YouTube is a large-scale video-sharing platform where users have the option of uploading content under a CC BY license. To collect high-quality speech-based textual content and combat the rampant license laundering on YouTube, we manually curated a set of over 2,000 YouTube channels that consistently release original openly licensed content containing speech. The resulting collection spans a wide range of genres, including lectures… See the full description on the dataset page: https://huggingface.co/datasets/common-pile/youtube.

  10. Data from: Introducing the COVID-19 YouTube (COVYT) speech dataset featuring...

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Sep 8, 2022
    Cite
    Andreas Triantafyllopoulos; Anastasia Semertzidou; Meishu Song; Florian B. Pokorny; Björn W. Schuller (2022). Introducing the COVID-19 YouTube (COVYT) speech dataset featuring the same speakers with and without infection [Dataset]. http://doi.org/10.5281/zenodo.6962930
    Explore at:
    Available download formats: zip
    Dataset updated
    Sep 8, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Andreas Triantafyllopoulos; Anastasia Semertzidou; Meishu Song; Florian B. Pokorny; Björn W. Schuller
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Area covered
    YouTube
    Description

    The COVYT dataset contains speech samples from individuals who self-reported their COVID-19 infection on public social media platforms (YouTube, Xiaohongshu). These videos, as well as accompanying videos of the same people prior to infection, were mined in an attempt to gather publicly-available data for COVID-19 research. This release includes the links to the original videos along with the accompanying manual segmentation and diarisation that identifies the utterances of the target individuals. We are additionally releasing features derived from the segmented utterances. Finally, the dataset includes partitioning information according to 4 different cross-validation schemes. See the arxiv pre-print for more details: https://arxiv.org/abs/2206.11045

  11. YouTube Videos and Channels Metadata

    • kaggle.com
    Updated Dec 14, 2022
    Cite
    The Devastator (2022). YouTube Videos and Channels Metadata [Dataset]. https://www.kaggle.com/datasets/thedevastator/revealing-insights-from-youtube-video-and-channe
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Dec 14, 2022
    Dataset provided by
    Kaggle
    Authors
    The Devastator
    Area covered
    YouTube
    Description

    Analyze the statistical relation between videos and form a topic tree

    By VISHWANATH SESHAGIRI [source]

    About this dataset

    This dataset contains YouTube video and channel metadata for analyzing the statistical relation between videos and forming a topic tree. With 9 direct features and 13 indirect features, it supports a detailed analysis of how videos are related, including information such as total views per unit time, channel views, likes/subscribers ratio, comments/views ratio, and dislikes/subscribers ratio. The data makes it possible to study topics such as subscriber count trends over time or the impact of trends on subscriber engagement. It can be used to build models that show how different types of content drive viewership and to identify the most popular styles or topics within YouTube's catalogue. The data also offers a look into consumer behaviour: ratios such as likes per subscriber and dislikes per view can help explore what drives people to watch specific videos at certain times or to favour certain channels over others. Finally, the dataset is open source and comes with a GitHub repo, which makes it a useful resource for anyone who wants to understand how their audience interacts with their content and how they might improve it.

    How to use the dataset

    In general, it is important to understand each parameter in the data set before proceeding with analysis. The parameters included are totalviews/channelelapsedtime, channelViewCount, likes/subscriber, views/subscribers, subscriberCount, dislikes/views, comments/subscriber, channelCommentCount, likes/dislikes, comments/views, dislikes/subscriber, and totviewes /totsubsvews /elapsedtime.

    To use this dataset for your own analysis:

    1) Review each parameter's meaning and purpose in the dataset.
    2) Get familiar with basic descriptive statistics such as mean, median, mode, and range.
    3) Create visualizations or tables based on subsets of the data.
    4) Understand correlations between different sets of variables or parameters.
    5) Generate meaningful conclusions about specific channels or topics based on organized graph hierarchies or tables.
    6) Analyze trends over time for individual parameters, as well as the aggregate reaction from all users when videos are released.
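
    The descriptive statistics named in the steps above (mean, median, mode, range) can be computed with Python's standard `statistics` module. The sketch below uses a made-up list of values; in practice you would pass in one column of the dataset. The `describe` helper is a hypothetical name.

```python
import statistics


def describe(values):
    """Return the mean, median, mode, and range of a non-empty list of numbers."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "range": max(values) - min(values),
    }


# Made-up sample values, not taken from the dataset.
sample = [2, 4, 4, 6, 9]
print(describe(sample))
```

    For heavily skewed columns such as view counts, the median is usually a more robust summary than the mean.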

    Research Ideas

    • Predicting the Relative Popularity of Videos: This dataset can be used to build a statistical model that can predict the relative popularity of videos based on various factors such as total views, channel viewers, likes/dislikes ratio, and comments/views ratio. This model could then be used to make recommendations and predict which videos are likely to become popular or go viral.

    • Creating Topic Trees: The dataset can also be used to create topic trees or taxonomies by analyzing the content of videos and looking at what topics they cover. For example, one could analyze the most popular YouTube channels in a specific subject area, group together those that discuss similar topics, and then build an organized tree structure around those topics in order to better understand viewer interests in that area.

    • Viewer Engagement Analysis: This dataset could also be used for viewer engagement analysis by examining factors such as subscriber count, average time spent watching a video per user (elapsed time), and comments made per view, to gain insights into how engaged viewers are with specific content or channels on YouTube. From this information it would be possible to optimize content strategy in order to improve overall engagement rates across various types of video content and channel types.

    Acknowledgements

    If you use this dataset in your research, please credit the original authors.

    Data Source

    License

    Unknown License - Please check the dataset description for more information.

    Columns

    File: YouTubeDataset_withChannelElapsed.csv

    | Column name                   | Description                                           |
    |:------------------------------|:------------------------------------------------------|
    | totalviews/channelelapsedtime | Ratio of total views to channel elapsed time. (Ratio) |
    | channelViewCount              | Total number of views for the channel. (Integer)      |
    | likes/subscriber ...

  12. Replication Data for: The Momo Challenge: Measuring the extent to which...

    • dataverse.harvard.edu
    Updated Jan 26, 2021
    Cite
    Lara Kobilke (2021). Replication Data for: The Momo Challenge: Measuring the extent to which YouTube portrays harmful and helpful depictions of a suicide game [Dataset]. http://doi.org/10.7910/DVN/6ESJEF
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jan 26, 2021
    Dataset provided by
    Harvard Dataverse
    Authors
    Lara Kobilke
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Area covered
    YouTube
    Description

    Allegedly, the online suicide game "Momo Challenge" dares players to perform self-harming tasks and, ultimately, commit suicide. Journalists have criticized the large number of YouTube videos in which YouTubers promote the challenge by passing on the phone numbers to their viewers. However, empirical knowledge about this and similar cyber threats is lacking. This data set was created to give insight into the reach of the Momo Challenge on YouTube, how users form communities around this video material, and to what extent it puts them at risk. It contains the results of a data crawl with NodeXL. Using the keywords ‘Momo Challenge English’, we ran the crawl for titles, descriptions, and tags at the turn of 2018/2019. We identified 487 videos, which we manually cleansed of videos unrelated to the challenge. The remaining data set consists of 209 videos.

  13. Microsoft Excel dataset file of YouTube videos.

    • plos.figshare.com
    xlsx
    Updated Nov 29, 2023
    + more versions
    Cite
    Dan Sun; Guochang Zhao (2023). Microsoft Excel dataset file of YouTube videos. [Dataset]. http://doi.org/10.1371/journal.pone.0294665.s002
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Nov 29, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Dan Sun; Guochang Zhao
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Area covered
    YouTube
    Description

    News dissemination plays a vital role in supporting people to incorporate beneficial actions during public health emergencies, thereby significantly reducing the adverse influences of events. Based on big data from YouTube, this research study takes the declaration of COVID-19 National Public Health Emergency (PHE) as the event impact and employs a DiD model to investigate the effect of PHE on the news dissemination strength of relevant videos. The study findings indicate that the views, comments, and likes on relevant videos significantly increased during the COVID-19 public health emergency. Moreover, the public’s response to PHE has been rapid, with the highest growth in comments and views on videos observed within the first week of the public health emergency, followed by a gradual decline and returning to normal levels within four weeks. In addition, during the COVID-19 public health emergency, in the context of different types of media, lifestyle bloggers, local media, and institutional media demonstrated higher growth in the news dissemination strength of relevant videos as compared to news & political bloggers, foreign media, and personal media, respectively. Further, the audience attracted by related news tends to display a certain level of stickiness, therefore this audience may subscribe to these channels during public health emergencies, which confirms the incentive mechanisms of social media platforms to foster relevant news dissemination during public health emergencies. The proposed findings provide essential insights into effective news dissemination in potential future public health events.
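
    For readers unfamiliar with difference-in-differences (DiD), the basic two-group, two-period estimator can be sketched as below. This is the textbook 2x2 form, not the study's actual specification, which is not given in this description; the group means are invented numbers and the function name is hypothetical.

```python
def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Return the 2x2 difference-in-differences estimate from four group means."""
    treated_change = treated_post - treated_pre
    control_change = control_post - control_pre
    return treated_change - control_change


# Invented average daily views before and after an event, for videos related
# to the event (treated) and unrelated videos (control).
effect = did_estimate(treated_pre=100, treated_post=180, control_pre=90, control_post=110)
print(effect)  # 60
```

    The estimate attributes to the event only the change in the treated group beyond the change seen in the control group, which relies on the assumption that both groups would otherwise have followed parallel trends.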

  14. Youtube Statistics and MacroEconomics - 2023

    • kaggle.com
    Updated May 20, 2024
    raahul raj (2024). Youtube Statistics and MacroEconomics - 2023 [Dataset]. https://www.kaggle.com/datasets/raahulraj/youtube-statistics-and-macroeconomics-2023
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    May 20, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    raahul raj
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Area covered
    YouTube
    Description

    The dataset provides a comprehensive overview of leading YouTube channels, capturing key metrics such as subscriber counts, video views, and estimated annual earnings. It includes information on the channel's category, number of uploads, and geographical data like country and urban population. Additionally, socio-economic indicators such as gross tertiary education enrollment, unemployment rate, and development status of the channel's country are included. For instance, T-Series, the top-ranked channel, has 245 million subscribers and 228 billion video views, generating significant annual earnings. This dataset is invaluable for analyzing the dynamics of content creation on YouTube and understanding how geographical and economic factors influence channel success.

  15. Replication Data for: Cross-Partisan Discussions on YouTube: Conservatives...

    • dataverse.harvard.edu
    bz2, tsv
    Updated Apr 27, 2021
    Harvard Dataverse (2021). Replication Data for: Cross-Partisan Discussions on YouTube: Conservatives Talk to Liberals but Liberals Don't Talk to Conservatives [Dataset]. http://doi.org/10.7910/DVN/KF5JC5
    Explore at:
    Available download formats: bz2 (74499679), bz2 (31294272), tsv (24181195), bz2 (1971422760), tsv (515670)
    Dataset updated
    Apr 27, 2021
    Dataset provided by
    Harvard Dataverse
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    The dataset was first introduced in the following paper: Siqi Wu and Paul Resnick. Cross-Partisan Discussions on YouTube: Conservatives Talk to Liberals but Liberals Don't Talk to Conservatives. In AAAI International Conference on Weblogs and Social Media (ICWSM), 2021.

    us_partisan.csv: Metadata for 1,267 US partisan media on YouTube. The first row is the header. Fields include "title, url, channel_title, channel_id, leaning, type, source, channel_description".

    video_meta.csv: Metadata for 274,241 YouTube political videos from US partisan media. The first row is the header. Fields include "video_id, channel_id, media_leaning, media_type, num_view, num_comment, num_cmt_from_liberal, num_cmt_from_conservative, num_cmt_from_unknown".

    user_comment_meta.csv.bz2: Metadata for 9,304,653 YouTube users who have commented on YouTube political videos. The first row is the header. Fields include "hashed_user_id, predicted_user_leaning, num_comment, num_cmt_on_left, num_cmt_on_right".

    user_comment_trace.tsv.bz2: Comment trace for 9,304,653 YouTube users who have commented on YouTube political videos. The first row is the header. Fields include "hashed_user_id predicted_user_leaning comment_trace" (split by \t). "comment_trace" consists of "channel_id1,num_comment_on_this_channel1;channel_id2,num_comment_on_this_channel2;..." (split by ;).

    trained_HAN_models.tar.bz2: Five trained HAN models for predicting user political leanings. Each model consists of a ".h5" model file and a ".tokenizer" tokenizer file. See this for how to use our pre-trained HAN models. See more details in this data description.
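
    The "comment_trace" layout described above (tab-separated row, entries split by ";", each entry "channel_id,count") can be parsed with a few lines of Python. This is a minimal sketch based only on the field description in this listing; the function names and the sample row are hypothetical, and the real values in the file (for example, how leanings are encoded) may look different.

```python
def parse_comment_trace(trace):
    """Parse 'channel_id1,n1;channel_id2,n2;...' into {channel_id: comment_count}."""
    counts = {}
    for entry in trace.split(";"):
        if not entry:  # tolerate an empty trace or a trailing ';'
            continue
        # Split on the last comma so the count is always the final field.
        channel_id, _, num_comments = entry.rpartition(",")
        counts[channel_id] = int(num_comments)
    return counts


def parse_row(line):
    """Split one tab-separated data row into (user id, leaning, per-channel counts)."""
    hashed_user_id, leaning, trace = line.rstrip("\n").split("\t")
    return hashed_user_id, leaning, parse_comment_trace(trace)


# Hypothetical row, made up for illustration.
sample = "user_abc\tL\tUC_one,3;UC_two,1\n"
print(parse_row(sample))
```

    For the compressed file itself, the standard-library `bz2.open(path, "rt")` yields text lines that could be passed to `parse_row` after skipping the header row.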

  16. QGIS Training Tutorials: Using Spatial Data in Geographic Information...

    • open.canada.ca
    • datasets.ai
    • +2more
    html
    Updated Oct 5, 2021
    + more versions
    Statistics Canada (2021). QGIS Training Tutorials: Using Spatial Data in Geographic Information Systems [Dataset]. https://open.canada.ca/data/en/dataset/89be0c73-6f1f-40b7-b034-323cb40b8eff
    Explore at:
    Available download formats: html
    Dataset updated
    Oct 5, 2021
    Dataset provided by
    Statistics Canada
    License

    Open Government Licence - Canada 2.0: https://open.canada.ca/en/open-government-licence-canada
    License information was derived automatically

    Description

    Have you ever wanted to create your own maps, or integrate and visualize spatial datasets to examine changes in trends between locations and over time? Follow along with these training tutorials on QGIS, an open source geographic information system (GIS) and learn key concepts, procedures and skills for performing common GIS tasks – such as creating maps, as well as joining, overlaying and visualizing spatial datasets. These tutorials are geared towards new GIS users. We’ll start with foundational concepts, and build towards more advanced topics throughout – demonstrating how with a few relatively easy steps you can get quite a lot out of GIS. You can then extend these skills to datasets of thematic relevance to you in addressing tasks faced in your day-to-day work.

  17. Youtube users in Vietnam 2017-2025

    • statista.com
    Updated Jul 10, 2025
    Statista (2025). Youtube users in Vietnam 2017-2025 [Dataset]. https://www.statista.com/forecasts/1146013/youtube-users-in-vietnam
    Explore at:
    Dataset updated
    Jul 10, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    2017 - 2019
    Area covered
    Vietnam
    Description

    In 2021, YouTube's user base in Vietnam amounted to approximately ***** million users. The number of YouTube users in Vietnam is projected to reach ***** million users by 2025. User figures have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to *** countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information).

  18. Youth/YouTube/Cultural Education. Horizon 2019 - Dataset - B2FIND

    • b2find.eudat.eu
    Updated Jun 2, 2023
    (2023). Youth/YouTube/Cultural Education. Horizon 2019 - Dataset - B2FIND [Dataset]. https://b2find.eudat.eu/dataset/5a42f23e-eafc-5cab-b0c3-a0271270c48e
    Explore at:
    Dataset updated
    Jun 2, 2023
    Area covered
    YouTube
    Description

    The increasing popularity and use of digital platforms and social media such as WhatsApp, Facebook, YouTube and Instagram are opening up new opportunities for children, young people and adults to pursue cultural interests or to stage themselves aesthetically. If we focus on young people between the ages of 12 and 19, a number of studies on media use show that YouTube in particular has become the leading medium for this age group. Given the growing importance of this web video platform, questions arise about the receptive and productive content of experience and the significance of cultural content and practices. Furthermore, there are hardly any findings on the extent to which YouTube stimulates young people to engage in cultural activities and self-organized learning processes.

    The sample is composed of n=818 adolescents aged 12-19 years. The study units were selected using a quota procedure. The adolescent target subjects were recruited via the IFAK interviewer staff according to predefined quotas for age, gender, region, place size class, type of school attended (for students), and occupation (for non-students). The characteristics "age and gender" and "region and place size" were crossed or combined with each other to produce as accurate a representation of the population as possible. The characteristic "migration background" was not used as a quota characteristic. The specifications for this are based on the latest data from the Federal Statistical Office and ma Radio 2018 II. The structural composition of the sample corresponds to the data for the population according to the characteristics mentioned.

    The study was conducted as a face-to-face oral survey. The answers of the young people were recorded by an interviewer on a laptop via a corresponding survey program. 111 face-to-face interviewers from the in-house interviewing staff, who have experience in interviewing children and adolescents, were deployed. The predefined questionnaire was binding for all interviewers with regard to the wording and sequence of questions. The maximum number of interviews per interviewer was n=10. Each interviewer received a detailed written briefing on the project at the beginning of the study.

  19. NCSRD-DS-5GDDoS: 5G Radio and Core metrics containing sporadic DDoS attacks

    • zenodo.org
    • data.europa.eu
    bin, csv, txt
    Updated Oct 7, 2024
    Zenodo (2024). NCSRD-DS-5GDDoS: 5G Radio and Core metrics containing sporadic DDoS attacks [Dataset]. http://doi.org/10.5281/zenodo.13898091
    Explore at:
    Available download formats: csv, bin, txt
    Dataset updated
    Oct 7, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    NCSRD-DS-5GDDos v2.0 Dataset
    ===
    NCSRD-DS-5GDDos is a comprehensive dataset recorded in a real-world 5G testbed that aligns with the 3GPP specifications. The dataset captures Distributed Denial of Service (DDoS) attacks initiated by malicious connected users (UEs).

    The setup comprises 3 cells with a total of 9 UEs connected to the same core network. Cell 2 is implemented by the Amarisoft Callbox Mini solution, and cells 1 and 3 by the Amarisoft Classic, which also hosts the 5G core.

    The setup uses a broad set of UE devices: smartphones (Huawei P40), microcomputers (Raspberry Pi 4 - Waveshare 5G Hat M2), industrial 5G routers (Industrial Waveshare 5G Router), a WiFi-6 mobile hotspot (DWR-2101 5G Wi-Fi 6 Mobile Hotspot) and a CPE box (Waveshare 5G CPE Box). All UEs are operated by subsidiary hosts, which are responsible for generating traffic at scheduled communication times.

    All identifiers are artificially generated and neither represent nor are based on personal data. We identify each UE by its ‘imeisv’ ID, which corresponds to the device in use, because the vendor implementation uses the same IMSI for all UEs.

    This dataset captures attack data from a total of 5 malicious User Equipment (UE) devices that initiated various flooding attacks on a 5G network. Each record includes key identifiers such as the IMEISV (International Mobile Equipment Identity Software Version number) and IP address of the attacking UE, along with the device type. The file "summary_report.csv" summarizes this information. The traffic types used in the attacks include SYN flooding, UDP flooding, ICMP flooding, DNS flooding, and GTP-U flooding. The benign users stream YouTube and Skype traffic.

    The dataset is recorded by a data collector that interfaces with the 5G network and gathers data on UEs, gNBs and the Core Network. The data are recorded in InfluxDB and pre-processed into three separate tabular .csv files for more efficient processing: “amari_ue_data.csv”, “enb_counters.csv” and “mme_counters.csv”. In this version, we use an Amarisoft Classic (cells 1 & 3, Core Network) and an Amarisoft Mini (cell 2) (more information on the products can be found at https://www.amarisoft.com/).

    The “amari_ue_data.csv” provides information on the UEs regarding identification (“imeisv”, “5g_tmsi”, “rnti”), IP addressing, bearer information, cell information (“tac”, “ran_plmn”), and cell information (“ul_bitrate”, “dl_bitrate”, “cell_id”, retransmissions per user per cell “ul_retx” as well as aggregated bit rates for each cell).

    The “enb_counters.csv” focuses on cell-level information, providing downlink and uplink bitrates, the usage ratio per user, and the CPU load of the gNB.

    We provide separate files of “amari_ue_data.csv” and “enb_counters.csv” generated from each gNB (Amarisoft Classic and Mini).

    The “mme_counters.csv” provides information on the Non-Access Stratum (NAS) of the 5G Network and focuses on session status reports (e.g., number of PDU session establishments, paging, context setup). This part gives an overview of the connection management throughout the recording session, and provides information on features suggested by 3GPP for detecting abnormal user behavior.

    We also provide a separate pre-processed dataset that merges the two "amari_ue_data_*.csv" files and labels the malicious/benign samples, which may be more convenient for interested data scientists.
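
    The merge-and-label step described above can be approximated as follows. This is only a sketch under stated assumptions: the "imeisv" and "ul_bitrate" column names come from the description, but the "label" column, the helper name, and all sample values are hypothetical, and the way the published pre-processed file was actually built is not documented in this listing.

```python
import csv
import io


def merge_and_label(csv_texts, malicious_imeisv):
    """Concatenate per-gNB UE tables and tag each row by its 'imeisv' value."""
    rows = []
    for text in csv_texts:
        for row in csv.DictReader(io.StringIO(text)):
            # 'label' is an assumed column name, not one confirmed by the dataset.
            row["label"] = "malicious" if row["imeisv"] in malicious_imeisv else "benign"
            rows.append(row)
    return rows


# Hypothetical miniature stand-ins for the two per-gNB UE files.
classic_csv = "imeisv,ul_bitrate\n0000000000000001,1200\n0000000000000002,90000\n"
mini_csv = "imeisv,ul_bitrate\n0000000000000003,800\n"

merged = merge_and_label([classic_csv, mini_csv], {"0000000000000002"})
for row in merged:
    print(row["imeisv"], row["label"])
```

    In practice the set of malicious identifiers would presumably be read from "summary_report.csv", whose exact columns are listed in README.txt rather than here.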

    Please refer to README.txt for the features included in each file.

  20. SportsOri: A Novel Dataset for Analyzing Public Sentiment on Controversial...

    • zenodo.org
    bin, json, pdf
    Updated Jun 15, 2025
    Yuvraj Singh; Devadripta Jadhav; Samiksha Boduwar; Kripabandhu Ghosh (2025). SportsOri: A Novel Dataset for Analyzing Public Sentiment on Controversial Sports Events in YouTube Comments [Dataset]. http://doi.org/10.5281/zenodo.15599619
    Explore at:
    Available download formats: bin, pdf, json
    Dataset updated
    Jun 15, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Yuvraj Singh; Devadripta Jadhav; Samiksha Boduwar; Kripabandhu Ghosh
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    May 2024
    Area covered
    YouTube
    Description

    Sports engage billions of followers worldwide and affect the economy. Sports controversies often ignite passionate discussions among fans, analysts, and players. With the rise of social media, platforms like YouTube have become central to these discussions. This study performs stance detection (opinion mining with the labels for, against, and neutral) on comments about well-known public sports controversies from popular social media platforms such as YouTube.

    To our knowledge, this is the first study and hand-curated dataset of civic engagement in controversial sports events, spanning around 40 years. LLMs (the Llama and Deepseek reasoning families) were used for the initial stance annotation of comments and were later fine-tuned for comparative performance analysis (30% boost in accuracy).

    This dataset presents a collection of around 43k YouTube comments on famous and controversial public sports events.

    We explore public sentiment analysis (stance detection) on a total of 6 famous controversial sports incidents by extracting and processing YouTube comments. Stance detection is performed on those events by fine-tuning models such as Llama-3.1-8b and Deepseek reasoning models (Llama-8b distilled) on comments from events such as The Underarm Incident, Jonny Bairstow’s Run-Out Incident, Ashwin’s Mankading Event, and the Luis Suarez Handball Event.

