61 datasets found

YouTube Trending Video Dataset (updated daily)
kaggle.com
zip
Updated Apr 15, 2024
Cite
Rishav Sharma (2024). YouTube Trending Video Dataset (updated daily) [Dataset]. https://www.kaggle.com/rsrishav/youtube-trending-video-dataset
Explore at:
Available download formats: zip (0 bytes)
Dataset updated
Apr 15, 2024
Authors
Rishav Sharma
License
https://creativecommons.org/publicdomain/zero/1.0/
Area covered
YouTube
Description
This dataset is a daily record of the top trending YouTube videos and it will be updated daily.

Context

YouTube maintains a list of the top trending videos on the platform. According to Variety magazine, “To determine the year’s top-trending videos, YouTube uses a combination of factors including measuring users interactions (number of views, shares, comments and likes). Note that they’re not the most-viewed videos overall for the calendar year”.

Note that this dataset is a structurally improved version of this dataset.

Content

This dataset includes several months (and counting) of data on daily trending YouTube videos. Data is included for the IN, US, GB, DE, CA, FR, RU, BR, MX, KR, and JP regions (India, USA, Great Britain, Germany, Canada, France, Russia, Brazil, Mexico, South Korea, and Japan, respectively), with up to 200 listed trending videos per day.

Each region’s data is in a separate file. Data includes the video title, channel title, publish time, tags, views, likes and dislikes, description, and comment count.

The data also includes a category_id field, which varies between regions. To retrieve the categories for a specific video, find it in the associated JSON. One such file is included for each of the 11 regions in the dataset.
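The lookup described above can be sketched as follows. This is a minimal illustration, assuming the per-region JSON follows the YouTube Data API `videoCategories` layout (a top-level `items` list whose entries carry an `id` and a `snippet.title`). The sample payload and the names `build_category_map` and `category_for` are hypothetical and not taken from the dataset.

```python
import json
from typing import Dict

# Hypothetical excerpt shaped like a videoCategories response (an assumption).
SAMPLE_CATEGORY_JSON = """
{"items": [
  {"id": "10", "snippet": {"title": "Music"}},
  {"id": "20", "snippet": {"title": "Gaming"}}
]}
"""


def build_category_map(raw_json: str) -> Dict[int, str]:
    """Map each numeric category id to its human-readable title."""
    payload = json.loads(raw_json)
    return {int(item["id"]): item["snippet"]["title"] for item in payload["items"]}


category_names = build_category_map(SAMPLE_CATEGORY_JSON)


def category_for(category_id: int) -> str:
    """Return the category title, or a placeholder if the id is absent."""
    return category_names.get(category_id, "unknown")
```

Because category ids vary between regions, a separate map would be built from each region's JSON file rather than shared across regions.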

For more information on specific columns in the dataset refer to the column metadata.

Acknowledgements

This dataset was collected using the YouTube API. This dataset is the updated version of Trending YouTube Video Statistics.

Inspiration

Possible uses for this dataset could include:
- Sentiment analysis in a variety of forms.
- Categorizing YouTube videos based on their comments and statistics.
- Training ML algorithms like RNNs to generate their own YouTube comments.
- Analyzing what factors affect how popular a YouTube video will be.
- Statistical analysis over time.

For further inspiration, see the kernels on this dataset!
YouTube Videos and Channels Metadata
kaggle.com
Updated Dec 14, 2022
Cite
The Devastator (2022). YouTube Videos and Channels Metadata [Dataset]. https://www.kaggle.com/datasets/thedevastator/revealing-insights-from-youtube-video-and-channe
Explore at:
Dataset updated
Dec 14, 2022
Dataset provided by
Kaggle
Authors
The Devastator
Area covered
YouTube
Description
YouTube Videos and Channels Metadata

Analyze the statistical relation between videos and form a topic tree

By VISHWANATH SESHAGIRI [source]

About this dataset

This dataset contains YouTube video and channel metadata to analyze the statistical relation between videos and form a topic tree. With 9 direct features, 13 more indirect features, it has all that you need to build a deep understanding of how videos are related – including information like total views per unit time, channel views, likes/subscribers ratio, comments/views ratio, dislikes/subscribers ratio etc. This data provides us with a unique opportunity to gain insights on topics such as subscriber count trends over time or calculating the impact of trends on subscriber engagement. We can develop powerful models that show us how different types of content drive viewership and identify the most popular styles or topics within YouTube's vast catalogue. Additionally this data offers an intriguing look into consumer behaviour as we can explore what drives people to watch specific videos at certain times or appreciate certain channels more than others - by analyzing things like likes per subscribers and dislikes per views ratios for example! Finally this dataset is completely open source with an easy-to-understand Github repo making it an invaluable resource for anyone looking to gain better insights into how their audience interacts with their content and how they might improve it in the future


How to use the dataset

In general, it is important to understand each parameter in the dataset before proceeding with analysis. The parameters included are totalviews/channelelapsedtime, channelViewCount, likes/subscriber, views/subscribers, subscriberCount, dislikes/views, comments/subscriber, channelCommentCount, likes/dislikes, comments/views, dislikes/subscriber, totviews/totsubs, and views/elapsedtime.

To use this dataset for your own analysis:
1) Review each parameter’s meaning and purpose in the dataset.
2) Get familiar with basic descriptive statistics such as mean, median, mode, and range.
3) Create visualizations or tables based on subsets of the data.
4) Understand correlations between different sets of variables or parameters.
5) Generate meaningful conclusions about specific channels or topics based on organized graph hierarchies or tables.
6) Analyze trends over time for individual parameters as well as the aggregate reaction from all users when videos are released.
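The descriptive statistics named in step 2 can be computed with the Python standard library. This is a minimal sketch. The `values` list is a hypothetical stand-in for one numeric column (for example views/subscribers) and does not come from the dataset.

```python
import statistics

# Hypothetical values standing in for one numeric column of the dataset.
values = [0.5, 1.2, 1.2, 3.4, 0.9]

summary = {
    "mean": statistics.mean(values),
    "median": statistics.median(values),
    "mode": statistics.mode(values),      # most frequent value
    "range": max(values) - min(values),   # spread between the extremes
}
```

For ratio columns with heavy tails, the median is often a more robust summary than the mean.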

Research Ideas

Predicting the Relative Popularity of Videos: This dataset can be used to build a statistical model that can predict the relative popularity of videos based on various factors such as total views, channel viewers, likes/dislikes ratio, and comments/views ratio. This model could then be used to make recommendations and predict which videos are likely to become popular or go viral.

Creating Topic Trees: The dataset can also be used to create topic trees or taxonomies by analyzing the content of videos and looking at what topics they cover. For example, one could analyze the most popular YouTube channels in a specific subject area, group together those that discuss similar topics, and then build an organized tree structure around those topics in order to better understand viewer interests in that area.

Viewer Engagement Analysis: This dataset could also be used for viewer engagement analysis by examining factors such as subscriber count, average time spent watching a video per user (elapsed time), and comments made per view, so as to gain insights into how engaged viewers are with specific content or channels on YouTube. From this information it would be possible to optimize content strategy in order to improve overall engagement rates across various types of video content and channel types.

Acknowledgements

If you use this dataset in your research, please credit the original authors.

Data Source

License

Unknown License - Please check the dataset description for more information.

Columns

File: YouTubeDataset_withChannelElapsed.csv

| Column name | Description |
|:----------------------------------|:-------------------------------------------------------|
| totalviews/channelelapsedtime | Ratio of total views to channel elapsed time. (Ratio) |
| channelViewCount | Total number of views for the channel. (Integer) |
| likes/subscriber ...
YouTube-Commons
huggingface.co
Updated Apr 17, 2024
+ more versions
Cite
PleIAs (2024). YouTube-Commons [Dataset]. https://huggingface.co/datasets/PleIAs/YouTube-Commons
Explore at:
Dataset updated
Apr 17, 2024
Dataset authored and provided by
PleIAs
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
YouTube
Description
📺 YouTube-Commons 📺

YouTube-Commons is a collection of audio transcripts of 2,063,066 videos shared on YouTube under a CC-By license.

Content

The collection comprises 22,709,724 original and automatically translated transcripts from 3,156,703 videos (721,136 individual channels). In total, this represents nearly 45 billion words (44,811,518,375). All the videos were shared on YouTube under a CC-BY license: the dataset provides all the necessary provenance information… See the full description on the dataset page: https://huggingface.co/datasets/PleIAs/YouTube-Commons.
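As a quick sanity check on scale, the figures quoted above can be turned into per-item averages. This is only arithmetic on the stated totals. The variable names are illustrative, and the averages say nothing about how words are actually distributed across transcripts.

```python
# Totals quoted in the description above.
TRANSCRIPTS = 22_709_724
VIDEOS = 3_156_703
CHANNELS = 721_136
WORDS = 44_811_518_375

# Simple averages derived from the totals.
words_per_transcript = WORDS / TRANSCRIPTS
transcripts_per_video = TRANSCRIPTS / VIDEOS   # above 1 because translations are counted
videos_per_channel = VIDEOS / CHANNELS
```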
YouTube Datasets
brightdata.com
.json, .csv, .xlsx
Updated Jan 9, 2023
Cite
Bright Data (2023). YouTube Datasets [Dataset]. https://brightdata.com/products/datasets/youtube
Explore at:
Available download formats: .json, .csv, .xlsx
Dataset updated
Jan 9, 2023
Dataset authored and provided by
Bright Data (https://brightdata.com/)
License
https://brightdata.com/license
Area covered
Worldwide, YouTube
Description
Use our YouTube profiles dataset to extract both business and non-business information from public channels and filter by channel name, views, creation date, or subscribers. Datapoints include URL, handle, banner image, profile image, name, subscribers, description, video count, create date, views, details, and more. You may purchase the entire dataset or a customized subset, depending on your needs. Popular use cases for this dataset include sentiment analysis, brand monitoring, influencer marketing, and more.
Hours of video uploaded to YouTube every minute 2007-2022
statista.com
Updated Jun 20, 2025
Cite
Statista (2025). Hours of video uploaded to YouTube every minute 2007-2022 [Dataset]. https://www.statista.com/statistics/259477/hours-of-video-uploaded-to-youtube-every-minute/
Explore at:
Dataset updated
Jun 20, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
Jun 2007 - Jun 2022
Area covered
Worldwide, YouTube
Description
As of June 2022, more than *** hours of video were uploaded to YouTube every minute. This equates to approximately ****** hours of newly uploaded content per hour. The amount of content on YouTube has increased dramatically as consumers’ appetite for online video has grown. In fact, the number of video content hours uploaded every 60 seconds grew by around ** percent between 2014 and 2020.

YouTube global users

Online video is one of the most popular digital activities worldwide, with ** percent of internet users worldwide watching more than ** hours of online videos on a weekly basis in 2023. It was estimated that in 2023 YouTube would reach approximately *** million users worldwide. In 2022, the video platform was one of the leading media and entertainment brands worldwide, with a value of more than ** billion U.S. dollars.

YouTube video content consumption

The most viewed YouTube channels of all time have racked up billions of viewers and millions of subscribers, and cover a wide variety of topics ranging from music to cosmetics. The YouTube channel owner with the most video views is Indian music label T-Series, which counted ****** billion lifetime views. Other popular YouTubers are gaming personalities such as PewDiePie, DanTDM and Markiplier.
Youtube video statistics for 1 million videos
kaggle.com
Updated Jun 29, 2020
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Mattia Zeni (2020). Youtube video statistics for 1 million videos [Dataset]. https://www.kaggle.com/datasets/mattiazeni/youtube-video-statistics-1million-videos/data
Explore at:
Dataset updated
Jun 29, 2020
Dataset provided by
Kaggle
Authors
Mattia Zeni
License
Open Data Commons Attribution License (ODC-By) v1.0, https://www.opendatacommons.org/licenses/by/1.0/
License information was derived automatically
Area covered
YouTube
Description
Motivation

Studying how YouTube videos become viral or, more generally, how they evolve in terms of views, likes and subscriptions is a topic of interest in many disciplines. With this dataset you can study such phenomena, using statistics about 1 million YouTube videos. The information was collected in 2013, when YouTube exposed the data publicly. YouTube has since removed this functionality, and such statistics are now available only to the owner of the video. This makes this dataset unique.

Context

This dataset was generated with YOUStatAnalyzer, a tool I (Mattia Zeni) developed while working for CREATE-NET (www.create-net.org) within the framework of the CONGAS FP7 project (http://www.congas-project.eu). For the project we needed to collect and analyse the dynamics of YouTube video popularity. The dataset contains statistics of more than 1 million YouTube videos, chosen according to random keywords extracted from the WordNet library (http://wordnet.princeton.edu).

The motivation that led us to develop the YOUStatAnalyzer data collection tool and create this dataset is that there is an active research community working on the interplay among individual user preferences, social dynamics, and advertising mechanisms, and a common problem is the lack of open large-scale datasets. In addition, no such tool existed at that time. YouTube has since removed the ability to view these data on each video's page, making this dataset unique.

When using our dataset for research purposes, please cite it as:

@INPROCEEDINGS{YOUStatAnalyzer,
  author = {Mattia Zeni and Daniele Miorandi and Francesco {De Pellegrini}},
  title = {{YOUStatAnalyzer}: a Tool for Analysing the Dynamics of {YouTube} Content Popularity},
  booktitle = {Proc. 7th International Conference on Performance Evaluation Methodologies and Tools (Valuetools, Torino, Italy, December 2013)},
  address = {Torino, Italy},
  year = {2013}
}

Content

The dataset contains statistics and metadata of 1 million YouTube videos, collected in 2013. The videos have been chosen according to random keywords extracted from the WordNet library (http://wordnet.princeton.edu).

Dataset structure

The structure of a dataset is the following:

{
  u'_id': u'9eToPjUnwmU',
  u'title': u'Traitor Compilation # 1 (Trouble ...',
  u'description': u'A traitor compilation by one are ...',
  u'category': u'Games',
  u'commentsNumber': u'6',
  u'publishedDate': u'2012-10-09T23:42:12.000Z',
  u'author': u'ServilityGaming',
  u'duration': u'208',
  u'type': u'video/3gpp',
  u'relatedVideos': [u'acjHy7oPmls', u'EhW2LbCjm7c', u'UUKigFAQLMA', ...],
  u'accessControl': {
    u'comment': {u'permission': u'allowed'},
    u'list': {u'permission': u'allowed'},
    u'videoRespond': {u'permission': u'moderated'},
    u'rate': {u'permission': u'allowed'},
    u'syndicate': {u'permission': u'allowed'},
    u'embed': {u'permission': u'allowed'},
    u'commentVote': {u'permission': u'allowed'},
    u'autoPlay': {u'permission': u'allowed'}
  },
  u'views': {
    u'cumulative': { u'data': [15.0, 25.0, 26.0, 26.0, ...] },
    u'daily': { u'data': [15.0, 10.0, 1.0, 0.0, ..] }
  },
  u'shares': {
    u'cumulative': { u'data': [0.0, 0.0, 0.0, 0.0, ...] },
    u'daily': { u'data': [0.0, 0.0, 0.0, 0.0, ...] }
  },
  u'watchtime': {
    u'cumulative': { u'data': [22.5666666667, 36.5166666667, 36.7, 36.7, ...] },
    u'daily': { u'data': [22.5666666667, 13.95, 0.166666666667, 0.0, ...] }
  },
  u'subscribers': {
    u'cumulative': { u'data': [0.0, 0.0, 0.0, 0.0, ...] },
    u'daily': { u'data': [-1.0, 0.0, 0.0, 0.0, ...] }
  },
  u'day': {
    u'data': [1349740800000.0, 1349827200000.0, 1349913600000.0, 1350000000000.0, ...]
  }
}

From the structure above it is possible to see which fields an entry in the dataset has. They can be divided into 2 sections:

1) Video Information.

_id -> The video ID, which is also the unique identifier of the entry in the database.
title -> The video's title.
description -> The video's description.
category -> The YouTube category the video is listed in.
commentsNumber -> The number of comments posted by users.
publishedDate -> The date the video was published.
author -> The author of the video.
duration -> The video duration in seconds.
type -> The encoding type of the video.
relatedVideos -> A list of related videos.
accessControl -> A list of access policies for different aspects related to the video.

2) Video Statistics.

Each video can have 4 different statistics variables: views, shares, subscribers and watchtime. Recent videos have all of them, while older videos may have only the 'views' variable. Each variable has 2 dimensions, daily and cumulative.
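The relationship between the two dimensions can be checked on the sample values from the example record. This is a minimal sketch under two assumptions that the description does not state explicitly: that the cumulative series is the running sum of the daily series, and that the `day` values are Unix timestamps in milliseconds (UTC). The variable names are illustrative.

```python
from datetime import datetime, timezone
from itertools import accumulate

# First four samples copied from the example record above.
daily_views = [15.0, 10.0, 1.0, 0.0]
cumulative_views = [15.0, 25.0, 26.0, 26.0]
day_ms = [1349740800000.0, 1349827200000.0, 1349913600000.0, 1350000000000.0]

# Assumption: cumulative is the running sum of daily.
rebuilt_cumulative = list(accumulate(daily_views))

# Assumption: `day` holds Unix timestamps in milliseconds, interpreted as UTC.
days = [
    datetime.fromtimestamp(ms / 1000, tz=timezone.utc).date().isoformat()
    for ms in day_ms
]
```

On these samples the rebuilt series matches the published cumulative values, and the first day falls on the video's publication date.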

`views -> number of views collected by the vi...
YouTube Dataset on Mobile Streaming for Internet Traffic Modeling, Network...
figshare.com
txt
Updated Apr 14, 2022
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Frank Loh; Florian Wamser; Fabian Poignée; Stefan Geißler; Tobias Hoßfeld (2022). YouTube Dataset on Mobile Streaming for Internet Traffic Modeling, Network Management, and Streaming Analysis [Dataset]. http://doi.org/10.6084/m9.figshare.19096823.v2
Explore at:
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.19096823.v2
Dataset updated
Apr 14, 2022
Dataset provided by
figshare
Authors
Frank Loh; Florian Wamser; Fabian Poignée; Stefan Geißler; Tobias Hoßfeld
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
YouTube
Description
Streaming is by far the predominant type of traffic in communication networks. With this public dataset, we provide 1,081 hours of time-synchronous video measurements at the network, transport, and application layers with the native YouTube streaming client on mobile devices. The dataset includes 80 network scenarios with 171 different individual bandwidth settings measured in 5,181 runs with limited bandwidth, 1,939 runs with emulated 3G/4G traces, and 4,022 runs with pre-defined bandwidth changes. This corresponds to 332 GB of video payload. We present the most relevant quality indicators for scientific use, i.e., initial playback delay, streaming video quality, adaptive video quality changes, video rebuffering events, and streaming phases.
YouTube users worldwide 2020-2029
statista.com
tokrwards.com
Updated Jul 7, 2025
Cite
Statista (2025). YouTube users worldwide 2020-2029 [Dataset]. https://www.statista.com/forecasts/1144088/youtube-users-in-the-world
Explore at:
Dataset updated
Jul 7, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
World
Description
The global number of YouTube users was forecast to increase continuously between 2024 and 2029 by a total of ***** million users (+***** percent). After the ninth consecutive year of growth, the YouTube user base is estimated to reach *** billion users, a new peak, in 2029. Notably, the number of YouTube users has increased continuously over the past years.

User figures, shown here for the platform YouTube, have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period.

The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to *** countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information).

Find more key insights for the number of YouTube users in regions like Africa and South America.
YouTube Channel Statistics Dataset
kaggle.com
Updated Jul 11, 2023
Cite
Vamshi krishna Pennakoduru (2023). YouTube Channel Statistics Dataset [Dataset]. https://www.kaggle.com/datasets/vamshikrishna305/youtube-channel-statistics-dataset/versions/1
Explore at:
Dataset updated
Jul 11, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Vamshi krishna Pennakoduru
Area covered
YouTube
Description
This comprehensive YouTube Video Analytics Dataset provides valuable insights into the performance of a wide range of videos on the popular platform. Spanning various genres, the dataset encompasses essential information such as:
1. genre
2. video titles
3. publish times
4. view counts
5. watch time (in hours)
6. subscriber counts
7. average view durations
8. impressions
9. impressions click-through rates (%)

By leveraging this dataset, researchers, analysts, and data enthusiasts can delve into the factors that influence video success on YouTube. Analyze the correlation between genre and view counts, investigate the impact of subscriber counts on watch time, or explore how average view durations and click-through rates affect video impressions.

Whether you're interested in exploring video trends, identifying patterns in user behavior, or developing machine learning models, this dataset serves as a valuable resource. Gain actionable insights into YouTube video performance and contribute to the ever-growing field of online content analysis.

LICENCE NOTE - This is the dataset of my own channel.
Dataset of Video Comments of a Vision Video Classified by Their Relevance,...
data.niaid.nih.gov
Updated Jul 19, 2024
Cite
Karras, Oliver (2024). Dataset of Video Comments of a Vision Video Classified by Their Relevance, Polarity, Intention, and Topic [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4533301
Explore at:
Dataset updated
Jul 19, 2024
Dataset provided by
Kristo, Eklekta
Karras, Oliver
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset contains all comments (comments and replies) of the YouTube vision video "Tunnels" by "The Boring Company", fetched on 2020-10-13 using the YouTube API. The comments were classified manually by three persons. We performed a single-class labeling of the video comments regarding their relevance for requirements engineering (RE) (ham/spam) and their polarity (positive/neutral/negative). Furthermore, we performed a multi-class labeling of the comments regarding their intention (feature request and problem report) and their topic (efficiency and safety). While a comment can only be relevant or not relevant and have only one polarity, a comment can have one or more intentions and also one or more topics.
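The labeling scheme above can be expressed as a small record type: one value each for relevance and polarity, and sets for intention and topic. This is a minimal sketch. The class `LabeledComment`, its field names, and the choice to accept empty intention or topic sets are assumptions made for illustration and are not taken from the dataset's files.

```python
from dataclasses import dataclass, field

# Label vocabularies as named in the description above.
RELEVANCE = {"ham", "spam"}
POLARITY = {"positive", "neutral", "negative"}
INTENTIONS = {"feature request", "problem report"}
TOPICS = {"efficiency", "safety"}


@dataclass
class LabeledComment:
    """Hypothetical record: single-class and multi-class labels for one comment."""
    text: str
    relevance: str                                  # exactly one value
    polarity: str                                   # exactly one value
    intentions: set = field(default_factory=set)    # zero or more (assumption)
    topics: set = field(default_factory=set)        # zero or more (assumption)

    def is_valid(self) -> bool:
        """Check every label against its vocabulary."""
        return (
            self.relevance in RELEVANCE
            and self.polarity in POLARITY
            and self.intentions <= INTENTIONS
            and self.topics <= TOPICS
        )


example = LabeledComment(
    text="Please add more exits, this looks unsafe.",
    relevance="ham",
    polarity="negative",
    intentions={"feature request", "problem report"},
    topics={"safety"},
)
```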

For the replies, one person also classified them regarding their relevance for RE. However, the investigation of the replies is ongoing and future work.

Remark: For 126 comments and 26 replies, we could not determine the date and time since they were no longer accessible on YouTube at the time this data set was created. In the case of a missing date and time, we inserted "NULL" in the corresponding cell.

This data set includes the following files:

Dataset.xlsx contains the raw and labeled video comments and replies:

For each comment, the data set contains:

ID: An identification number generated by YouTube for the comment

Date: The date and time of the creation of the comment

Author: The username of the author of the comment

Likes: The number of likes of the comment

Replies: The number of replies to the comment

Comment: The written comment

Relevance: Label indicating the relevance of the comment for RE (ham = relevant, spam = irrelevant)

Polarity: Label indicating the polarity of the comment

Feature request: Label indicating that the comment requests a feature

Problem report: Label indicating that the comment reports a problem

Efficiency: Label indicating that the comment deals with the topic efficiency

Safety: Label indicating that the comment deals with the topic safety

For each reply, the data set contains:

ID: The identification number of the comment to which the reply belongs

Date: The date and time of the creation of the reply

Author: The username of the author of the reply

Likes: The number of likes of the reply

Comment: The written reply

Relevance: Label indicating the relevance of the reply for RE (ham = relevant, spam = irrelevant)

Detailed analysis results.xlsx contains the detailed results of the ten-times-repeated 10-fold cross-validation analyses for every considered combination of machine learning algorithms and features

Guide Sheet - Multi-class labeling.pdf describes the coding task, defines the categories, and lists examples to reduce inconsistencies and increase the quality of manual multi-class labeling

Guide Sheet - Single-class labeling.pdf describes the coding task, defines the categories, and lists examples to reduce inconsistencies and increase the quality of manual single-class labeling

Python scripts for analysis.zip contains the scripts (as jupyter notebooks) and prepared data (as csv-files) for the analyses
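The ten-times-repeated 10-fold cross-validation mentioned in the file list can be sketched as an index generator. This is a plain, non-stratified illustration using only the standard library. The function name `repeated_kfold`, the seed, and the sample size are assumptions, and the authors' notebooks may split the data differently.

```python
import random


def repeated_kfold(n_samples, n_splits=10, n_repeats=10, seed=0):
    """Yield (train_indices, test_indices) for each fold of each repetition."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)  # new random partition per repetition
        # Striding by n_splits gives folds whose sizes differ by at most one.
        folds = [indices[i::n_splits] for i in range(n_splits)]
        for i, test in enumerate(folds):
            train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
            yield train, test


# Hypothetical sample size, chosen only to keep the example small.
splits = list(repeated_kfold(n_samples=25))
```

Each repetition reshuffles the indices, so every sample appears in exactly one test fold per repetition.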
YouTube Dataset - Medical papers analising YouTube videos
figshare.com
xlsx
Updated Jun 1, 2019
Cite
Juan-José Boté (2019). YouTube Dataset - Medical papers analising YouTube videos [Dataset]. http://doi.org/10.6084/m9.figshare.7108511.v3
Explore at:
Available download formats: xlsx
Unique identifier
https://doi.org/10.6084/m9.figshare.7108511.v3
Dataset updated
Jun 1, 2019
Dataset provided by
Figshare (http://figshare.com/)
Authors
Juan-José Boté
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
YouTube
Description
This dataset is a review of YouTube content analysis papers in the field of health. The data were used to publish a review paper in the Revista Cubana de Información en Ciencias de la Salud. It lists the scientific papers that were reviewed to analyse how health care scientists classify the videos they analyse.
YouTube users in India 2020-2029
statista.com
tokrwards.com
Updated Jul 10, 2025
Cite
Statista (2025). YouTube users in India 2020-2029 [Dataset]. https://www.statista.com/forecasts/1146150/youtube-users-in-india
Explore at:
Dataset updated
Jul 10, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
India
Description
The number of YouTube users in India was forecast to increase continuously between 2024 and 2029 by a total of ***** million users (+***** percent). After the ninth consecutive year of growth, the YouTube user base is estimated to reach ****** million users, a new peak, in 2029. Notably, the number of YouTube users has increased continuously over the past years.

User figures, shown here for the platform YouTube, have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period.

The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to *** countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information).

Find more key insights for the number of YouTube users in countries like Sri Lanka and Nepal.
YouTube Video and Channel Analytics
kaggle.com
Updated Dec 8, 2023
Cite
The Devastator (2023). YouTube Video and Channel Analytics [Dataset]. https://www.kaggle.com/datasets/thedevastator/youtube-video-and-channel-analytics/code
Explore at:
Dataset updated
Dec 8, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
The Devastator
Area covered
YouTube
Description
YouTube Video and Channel Analytics

YouTube Video and Channel Analytics: Statistics and Features

By VISHWANATH SESHAGIRI [source]

About this dataset

The YouTube Video and Channel Metadata dataset is a comprehensive collection of data related to YouTube videos and channels. It consists of various features and statistics that provide insights into the performance and engagement of videos, as well as the overall popularity and success of channels.

The dataset includes both direct features, such as total views, channel elapsed time, channel ID, video category ID, channel view count, likes per subscriber, dislikes per subscriber, comments per subscriber, and more. Additionally, there are indirect features derived from YouTube's API that provide additional metrics for analysis.

One important aspect covered in this dataset is the ratio between certain metrics. For example:
- The totalviews/channelelapsedtime ratio represents the average number of views a video has received relative to the elapsed time since the channel was created.
- The likes/dislikes ratio indicates the proportion of likes on a video compared to dislikes.
- The views/subscribers ratio showcases how engaged subscribers are by measuring the number of views relative to the number of subscribers.
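Ratio features of this kind can be derived from raw counts as sketched below. The counts, the dictionary keys, and the helper `safe_ratio` are hypothetical. Returning `None` for a zero denominator is one possible convention, and the description does not say how the dataset itself handles that case.

```python
def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


# Hypothetical raw counts for one video and its channel.
video = {"views": 12000, "likes": 300, "dislikes": 20, "comments": 60}
channel = {"subscribers": 4000}

ratios = {
    "likes/dislikes": safe_ratio(video["likes"], video["dislikes"]),
    "views/subscribers": safe_ratio(video["views"], channel["subscribers"]),
    "comments/views": safe_ratio(video["comments"], video["views"]),
    "likes/subscriber": safe_ratio(video["likes"], channel["subscribers"]),
}
```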

Other metrics explored in this dataset include comments/views ratio (representing viewer engagement), dislikes/views ratio (measuring viewer sentiment), comments/subscriber ratio (indicating community participation), likes/subscriber ratio (reflecting audience loyalty), dislikes/subscriber ratio (highlighting dissatisfaction levels), total number of subscribers for a channel (subscriberCount), total views on a channel (channelViewCount), total number of comments on a channel (channelCommentCount), among others.

By analyzing these features and statistics within this dataset, researchers or data analysts can gain valuable insights into various aspects related to YouTube videos and channels. Furthermore, it may be possible to build statistical relationships between videos based on their performance characteristics or even develop topic trees based on similarities between different content categories. This dataset serves as an excellent resource for studying YouTube's ecosystem comprehensively.

For accessing additional resources related to this dataset or exploring code repositories associated with it, users can refer to the provided GitHub repository

How to use the dataset

Introduction:

Step 1: Understanding the Dataset

Start by familiarizing yourself with the columns in the dataset. Here are some key features to pay attention to:

totalviews/channelelapsedtime: The ratio of total views of a video to the elapsed time of the channel.

channelViewCount: The total number of views on the channel.

likes/subscriber: The ratio of likes on a video to the number of subscribers of the channel.

views/subscribers: The ratio of views on a video to the number of subscribers of the channel.

subscriberCount: The total number of subscribers for a channel.

dislikes/views: The ratio of dislikes on a video to its total views.

comments/subscriber: The ratio of comments on a video to the number of subscribers of the channel.

Step 2: Determining Data Analysis Objectives Define your objectives or research questions before diving into data analysis using this dataset. For example, you may want to explore relationships between viewership, engagement metrics, and various attributes such as category ID or elapsed time.

Step 3: Analyzing Relationships between Variables Use statistical techniques like correlation analysis or visualization tools like scatter plots, bar graphs, or heatmaps to understand relationships between variables in this dataset.

For example:
- Plotting totalviews/channelelapsedtime against channelViewCount can help identify patterns between overall video popularity and channels' view count growth over time.
- Comparing likes/dislikes with comments/views can give insights into viewer engagement levels across different videos.

Step 4: Building Machine Learning Models (Optional) If your objective includes predictive analysis or building machine learning models, select relevant features as predictors and the target variable (e.g., totalviews/channelelapsedtime) for training and evaluation.

You can use various algorithms such as linear regression, decision trees, or neural networks to predict video performance or channel growth based on available attributes.

Step 5: Evaluating Model Performance Assess the predictive model's performance using appropriate evaluation metrics like mean square...
howto100m
huggingface.co
Updated Jun 30, 2022
Cite
HuggingFaceM4 (2022). howto100m [Dataset]. https://huggingface.co/datasets/HuggingFaceM4/howto100m
Explore at:
Dataset updated
Jun 30, 2022
Dataset authored and provided by
HuggingFaceM4
Description
HowTo100M is a large-scale dataset of narrated videos with an emphasis on instructional videos where content creators teach complex tasks with an explicit intention of explaining the visual content on screen. HowTo100M features a total of:
- 136M video clips with captions sourced from 1.2M YouTube videos (15 years of video)
- 23k activities from domains such as cooking, hand crafting, personal care, gardening or fitness

Each video is associated with a narration available as subtitles automatically downloaded from YouTube.
Webis YouTube 8M Augmented 2018
live.european-language-grid.eu
zenodo.org
json
Updated Mar 19, 2024
Cite
(2024). Webis YouTube 8M Augmented 2018 [Dataset]. https://live.european-language-grid.eu/catalogue/corpus/7585
Explore at:
json (Available download formats)
Dataset updated
Mar 19, 2024
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
We used the YouTube Data API to augment the YouTube 8M corpus by crawling a variety of metadata for the videos.
The first point of interest was the "video resource," which comprises data about the video, such as the video’s title, description, uploader name, tags, view count, and more. The metadata also indicates whether comments have been left for the video. If so, we downloaded them as well, including information about their authors, likes, dislikes, and responses.
There is no property that specifies a video’s language, since this information is not mandatory when uploading a video. Also, the API provides only information about the available captions, but not the captions themselves. Only the uploader of a video is given access to its captions via the API; we extracted them using youtube-dl. For each video, all manually created captions were downloaded, along with auto-generated captions in the "default" language and in English. The "default" auto-generated caption gives perhaps the only hint at a video’s original language.
Finally, we downloaded all thumbnails used to advertise a video, which are not available via the API, but only via a canonical URL. Our corpus makes it possible to recreate the way a video is presented on YouTube (metadata and thumbnail), what the actual content is ((sub)titles and descriptions), and how its viewers reacted (comments).
If you use this dataset in your publication, please cite the dataset as outlined in the right column.
Data from: Using Multistreaming Social Media Video as a Research Method for...
research.usc.edu.au
researchdata.edu.au
Updated Mar 23, 2022
Cite
Karen Sutherland; Krisztina Morris (2022). Using Multistreaming Social Media Video as a Research Method for Interview Data Collection [Dataset]. https://research.usc.edu.au/esploro/outputs/dataset/Using-Multistreaming-Social-Media-Video-as/99620208702621
Explore at:
Dataset updated
Mar 23, 2022
Dataset provided by
Sage (http://www.sagepublications.com/)
Authors
Karen Sutherland; Krisztina Morris
Time period covered
2022
Description
This dataset is designed to explore multistreaming social media video as a research method used to collect semi-structured interview data. The data are provided by Dr Karen E. Sutherland and Ms Krisztina Morris from the School of Business and Creative Industries at the University of the Sunshine Coast in Queensland, Australia. The dataset is drawn from the publicly available video recording of an interview undertaken as part of the research project called: ‘Like, Share, Follow’, a multistreaming show, featuring Dr Sutherland interviewing university graduates about their career journeys, that is broadcast across Facebook, LinkedIn, and Twitter and later uploaded to YouTube. This dataset examines how multistreaming video interview data can be used to answer research questions and the benefits and challenges this specific method of data collection can pose in the process of data analysis. The video example is accompanied by a teaching guide and a student guide.
youtube_vis
tensorflow.org
Updated Dec 6, 2022
Cite
(2022). youtube_vis [Dataset]. https://www.tensorflow.org/datasets/catalog/youtube_vis
Explore at:
Dataset updated
Dec 6, 2022
Area covered
YouTube
Description
YouTube-VIS is a video instance segmentation dataset. It contains 2,883 high-resolution YouTube videos, a per-pixel category label set including 40 common objects such as person, animals and vehicles, 4,883 unique video instances, and 131k high-quality manual annotations.

The YouTube-VIS dataset is split into 2,238 training videos, 302 validation videos and 343 test videos.

No files were removed or altered during preprocessing.

To use this dataset:

import tensorflow_datasets as tfds

ds = tfds.load('youtube_vis', split='train')
for ex in ds.take(4):
    print(ex)

See the guide for more information on tensorflow_datasets.
Youtube users in the United States 2017-2025
statista.com
Updated Jul 9, 2025
Cite
Statista (2025). Youtube users in the United States 2017-2025 [Dataset]. https://www.statista.com/forecasts/1147203/youtube-users-in-the-united-states
Explore at:
Dataset updated
Jul 9, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
2017 - 2019
Area covered
United States
Description
In 2021, YouTube's user base in the United States amounted to approximately ****** million users. The number of YouTube users in the United States is projected to reach ****** million users by 2025. User figures have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to *** countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information).
youtube_subs_howto100M
huggingface.co
Updated Mar 31, 2023
Cite
Wonchang Chung (2023). youtube_subs_howto100M [Dataset]. https://huggingface.co/datasets/totuta/youtube_subs_howto100M
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Mar 31, 2023
Authors
Wonchang Chung
License
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Area covered
YouTube
Description
Dataset Card for youtube_subs_howto100M

Dataset Summary

The youtube_subs_howto100M dataset is an English-language dataset of instruction-response pairs extracted from 309136 YouTube videos. The dataset was originally inspired by and sourced from the HowTo100M dataset, which was developed for natural language search for video clips.

Supported Tasks and Leaderboards

conversational: The dataset can be used to train a model for instruction(request) and a long form… See the full description on the dataset page: https://huggingface.co/datasets/totuta/youtube_subs_howto100M.
Most viewed YouTube videos of all time 2025
statista.com
tokrwards.com
Updated Feb 17, 2025
Cite
Statista (2025). Most viewed YouTube videos of all time 2025 [Dataset]. https://www.statista.com/statistics/249396/top-youtube-videos-views/
Explore at:
Dataset updated
Feb 17, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
Feb 2025
Area covered
Worldwide, YouTube
Description
On June 17, 2016, Korean education brand Pinkfong released their video "Baby Shark Dance", and the rest is history. In January 2022, "Baby Shark Dance" became the first YouTube video to surpass 10 billion views, after snatching the crown of most-viewed YouTube video of all time from the former record holder "Despacito" one year before. "Baby Shark Dance" currently has over 15 billion lifetime views on YouTube.

Music videos on YouTube

“Baby Shark Dance” might be the current record-holder in terms of total views, but Korean artist Psy’s “Gangnam Style” video remained in the top spot for the longest time (1,689 days or 4.6 years) before ceding its spot to its successor. With figures like these, it comes as little surprise that the majority of the most popular videos on YouTube are music videos. Since 2010, all but one of the most-viewed videos on YouTube have been music videos, signifying the platform’s shift in focus from funny, viral videos to professionally produced content. As of 2022, about 40 percent of the U.S. digital music audience uses YouTube Music.

Popular video content on YouTube

Music fans are also highly engaged audiences, and it is not uncommon for music videos to garner significant amounts of traffic within the first 24 hours of release. Other popular types of videos that generate lots of views after their first release are movie trailers, especially for superhero movies related to the MCU (Marvel Cinematic Universe). The first official trailer for the film “Avengers: Endgame” generated 289 million views within the first 24 hours of release, while the movie trailer for “Spider-Man: No Way Home” generated over 355 million views on the first day after release, making it the most viral movie trailer.

YouTube Trending Video Dataset (updated daily)

YouTube Trending Video data-set which gets updated daily.

Inspiration

Possible uses for this dataset could include:
- Sentiment analysis in a variety of forms
- Categorizing YouTube videos based on their comments and statistics
- Training ML algorithms like RNNs to generate their own YouTube comments
- Analyzing what factors affect how popular a YouTube video will be
- Statistical analysis over time

For further inspiration, see the kernels on this dataset!
