77 datasets found

YouTube Videos and Channels Metadata
kaggle.com
Updated Dec 14, 2022
Cite
The Devastator (2022). YouTube Videos and Channels Metadata [Dataset]. https://www.kaggle.com/datasets/thedevastator/revealing-insights-from-youtube-video-and-channe
Explore at: Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Dec 14, 2022
Dataset provided by
Kaggle
Authors
The Devastator
Area covered
YouTube
Description
YouTube Videos and Channels Metadata

Analyze the statistical relation between videos and form a topic tree

By VISHWANATH SESHAGIRI [source]

About this dataset

This dataset contains YouTube video and channel metadata for analyzing the statistical relation between videos and forming a topic tree. It has 9 direct features and 13 indirect features, including total views per unit time, channel views, likes/subscribers ratio, comments/views ratio, and dislikes/subscribers ratio. The data can be used to study topics such as subscriber count trends over time or the impact of trends on subscriber engagement. It also supports models that show how different types of content drive viewership and that identify the most popular styles or topics within YouTube's catalogue. In addition, the data offers a look into consumer behaviour: ratios such as likes per subscriber and dislikes per view help explore what drives people to watch specific videos at certain times or to favour certain channels over others. Finally, the dataset is open source and comes with a GitHub repo, which makes it a useful resource for anyone who wants to understand how their audience interacts with their content and how they might improve it.

How to use the dataset

In general, it is important to understand each parameter in the dataset before proceeding with analysis. The parameters include totalviews/channelelapsedtime, channelViewCount, likes/subscriber, views/subscribers, subscriberCount, dislikes/views, comments/subscriber, channelCommentCount, likes/dislikes, comments/views, dislikes/subscriber, totviews/totsubs, and views/elapsedtime.

To use this dataset for your own analysis:

1) Review each parameter's meaning and purpose in the dataset.

2) Get familiar with basic descriptive statistics such as mean, median, mode, and range.

3) Create visualizations or tables based on subsets of the data.

4) Understand correlations between different sets of variables or parameters.

5) Generate meaningful conclusions about specific channels or topics based on organized graph hierarchies or tables.

6) Analyze trends over time for individual parameters, as well as the aggregate reaction from all users when videos are released.
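The descriptive statistics named in step 2 can be sketched with the Python standard library alone. This is a minimal illustration, not the dataset author's workflow: the CSV rows below are made-up values, and `load_column` and `describe` are hypothetical helper names. Only the column names come from the dataset description.

```python
import csv
import io
import statistics

# Illustrative rows only: the values are invented, not taken from the dataset.
SAMPLE_CSV = """channelViewCount,subscriberCount,likes/subscriber
1000,50,0.10
4000,200,0.25
2500,100,0.15
4000,400,0.05
"""


def load_column(csv_text, column):
    """Return one column of a CSV document as a list of floats."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [float(row[column]) for row in reader]


def describe(values):
    """Compute mean, median, mode, and range for a list of numbers."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "range": max(values) - min(values),
    }


views = load_column(SAMPLE_CSV, "channelViewCount")
summary = describe(views)
print(summary)
```

To run this against the real file, the in-memory string would be replaced by the opened CSV, assuming its columns parse as numbers.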

Research Ideas

Predicting the Relative Popularity of Videos: This dataset can be used to build a statistical model that can predict the relative popularity of videos based on various factors such as total views, channel viewers, likes/dislikes ratio, and comments/views ratio. This model could then be used to make recommendations and predict which videos are likely to become popular or go viral.

Creating Topic Trees: The dataset can also be used to create topic trees or taxonomies by analyzing the content of videos and looking at what topics they cover. For example, one could analyze the most popular YouTube channels in a specific subject area, group together those that discuss similar topics, and then build an organized tree structure around those topics in order to better understand viewer interests in that area.

Viewer Engagement Analysis: This dataset could also be used for viewer engagement analysis by examining factors such as subscriber count, average time spent watching a video per user (elapsed time), and comments made per view, in order to gain insights into how engaged viewers are with specific content or channels on YouTube. From this information it would be possible to adjust content strategy in order to improve overall engagement rates across various types of video content and channel types.

Acknowledgements

If you use this dataset in your research, please credit the original authors.

Data Source

License

Unknown License - Please check the dataset description for more information.

Columns

File: YouTubeDataset_withChannelElapsed.csv

| Column name                   | Description                                           |
|:------------------------------|:------------------------------------------------------|
| totalviews/channelelapsedtime | Ratio of total views to channel elapsed time. (Ratio) |
| channelViewCount              | Total number of views for the channel. (Integer)      |
| likes/subscriber ...
YouTube Revenue and Usage Statistics (2025)
businessofapps.com
Updated May 22, 2018
Cite
Business of Apps (2018). YouTube Revenue and Usage Statistics (2025) [Dataset]. https://www.businessofapps.com/data/youtube-statistics/
Dataset updated
May 22, 2018
Dataset authored and provided by
Business of Apps
License
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0), https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Area covered
YouTube
Description
YouTube was launched in 2005. It was founded by three PayPal employees: Chad Hurley, Steve Chen, and Jawed Karim, who ran the company from an office above a small restaurant in San Mateo. The first...
YouTube users worldwide 2020-2029
statista.com
Updated Jul 7, 2025
Cite
Statista (2025). YouTube users worldwide 2020-2029 [Dataset]. https://www.statista.com/forecasts/1144088/youtube-users-in-the-world
Dataset updated
Jul 7, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
World
Description
The global number of YouTube users was forecast to increase continuously between 2024 and 2029 by a total of ***** million users (+***** percent). After the ninth consecutive year of growth, the YouTube user base is estimated to reach *** billion users, a new peak, in 2029. Notably, the number of YouTube users has increased continuously over the past years.

User figures, shown here for the platform YouTube, have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period.

The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to *** countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information).

Find more key insights for the number of YouTube users in regions like Africa and South America.
YouTube Video and Channel Analytics
kaggle.com
Updated Dec 8, 2023
Cite
The Devastator (2023). YouTube Video and Channel Analytics [Dataset]. https://www.kaggle.com/datasets/thedevastator/youtube-video-and-channel-analytics/code
Dataset updated
Dec 8, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
The Devastator
Area covered
YouTube
Description
YouTube Video and Channel Analytics

YouTube Video and Channel Analytics: Statistics and Features

By VISHWANATH SESHAGIRI [source]

About this dataset

The YouTube Video and Channel Metadata dataset is a comprehensive collection of data related to YouTube videos and channels. It consists of various features and statistics that provide insights into the performance and engagement of videos, as well as the overall popularity and success of channels.

The dataset includes both direct features, such as total views, channel elapsed time, channel ID, video category ID, channel view count, likes per subscriber, dislikes per subscriber, comments per subscriber, and more. Additionally, there are indirect features derived from YouTube's API that provide additional metrics for analysis.

One important aspect covered in this dataset is the ratio between certain metrics. For example:

- The totalviews/channelelapsedtime ratio represents the average number of views a video has received relative to the elapsed time since the channel was created.

- The likes/dislikes ratio indicates the proportion of likes on a video compared to dislikes.

- The views/subscribers ratio showcases how engaged subscribers are by measuring the number of views relative to the number of subscribers.
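As a rough illustration of how such ratios could be derived from raw counts, the sketch below computes three of them for a single video. The input field names (`views`, `likes`, `dislikes`, `subscribers`, `channel_elapsed_time`) and the numbers are assumptions made for this example. The description does not state the unit of elapsed time or how zero denominators are handled, so `safe_ratio` returning `None` is one possible choice rather than the dataset's own rule.

```python
def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    if denominator == 0:
        return None
    return numerator / denominator


def ratio_features(video):
    """Derive ratio features from raw counts (input keys are hypothetical)."""
    return {
        "likes/dislikes": safe_ratio(video["likes"], video["dislikes"]),
        "views/subscribers": safe_ratio(video["views"], video["subscribers"]),
        "totalviews/channelelapsedtime": safe_ratio(
            video["views"], video["channel_elapsed_time"]
        ),
    }


# Made-up counts for one video; the elapsed-time unit is left unspecified.
example_video = {
    "views": 1200,
    "likes": 90,
    "dislikes": 10,
    "subscribers": 400,
    "channel_elapsed_time": 600,
}
features = ratio_features(example_video)
print(features)
```

Returning `None` instead of raising keeps rows with zero dislikes or zero subscribers in the table, so they can be filtered or imputed later.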

Other metrics explored in this dataset include comments/views ratio (representing viewer engagement), dislikes/views ratio (measuring viewer sentiment), comments/subscriber ratio (indicating community participation), likes/subscriber ratio (reflecting audience loyalty), dislikes/subscriber ratio (highlighting dissatisfaction levels), total number of subscribers for a channel (subscriberCount), total views on a channel (channelViewCount), total number of comments on a channel (channelCommentCount), among others.

By analyzing these features and statistics within this dataset, researchers or data analysts can gain valuable insights into various aspects related to YouTube videos and channels. Furthermore, it may be possible to build statistical relationships between videos based on their performance characteristics or even develop topic trees based on similarities between different content categories. This dataset serves as an excellent resource for studying YouTube's ecosystem comprehensively.

For accessing additional resources related to this dataset or exploring code repositories associated with it, users can refer to the provided GitHub repository

How to use the dataset

Step 1: Understanding the Dataset

Start by familiarizing yourself with the columns in the dataset. Here are some key features to pay attention to:

totalviews/channelelapsedtime: The ratio of total views of a video to the elapsed time of the channel.

channelViewCount: The total number of views on the channel.

likes/subscriber: The ratio of likes on a video to the number of subscribers of the channel.

views/subscribers: The ratio of views on a video to the number of subscribers of the channel.

subscriberCount: The total number of subscribers for a channel.

dislikes/views: The ratio of dislikes on a video to its total views.

comments/subscriber: The ratio of comments on a video to the number of subscribers of the channel.

Step 2: Determining Data Analysis Objectives

Define your objectives or research questions before diving into data analysis using this dataset. For example, you may want to explore relationships between viewership, engagement metrics, and various attributes such as category ID or elapsed time.

Step 3: Analyzing Relationships between Variables

Use statistical techniques like correlation analysis or visualization tools like scatter plots, bar graphs, or heatmaps to understand relationships between variables in this dataset.

For example:

- Plotting totalviews/channelelapsedtime against channelViewCount can help identify patterns between overall video popularity and channels' view count growth over time.

- Comparing likes/dislikes with comments/views can give insights into viewer engagement levels across different videos.
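The correlation analysis mentioned in Step 3 can be sketched with a plain Pearson coefficient. The function below is a generic textbook implementation, not code from the dataset's repository, and the two short lists are made-up stand-ins for columns such as likes/dislikes and comments/views.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Invented values standing in for two ratio columns.
likes_per_dislike = [2.0, 4.0, 6.0, 8.0]
comments_per_view = [0.01, 0.03, 0.02, 0.05]
print(pearson(likes_per_dislike, comments_per_view))
```

A coefficient near 1 or -1 suggests a strong linear relationship, while a value near 0 suggests little linear association; a scatter plot is still worth checking because Pearson misses non-linear patterns.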

Step 4: Building Machine Learning Models (Optional)

If your objective includes predictive analysis or building machine learning models, select relevant features as predictors and the target variable (e.g., totalviews/channelelapsedtime) for training and evaluation.

You can use various algorithms such as linear regression, decision trees, or neural networks to predict video performance or channel growth based on available attributes.

Step 5: Evaluating Model Performance

Assess the predictive model's performance using appropriate evaluation metrics like mean square...
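Steps 4 and 5 can be illustrated with the simplest of the models listed above, a one-predictor linear regression, evaluated with mean squared error on held-out points. This is a sketch under stated assumptions: the training and test values are invented, and the choice of a single predictor is made only to keep the example short.

```python
def fit_line(xs, ys):
    """Ordinary least squares with one predictor: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("predictor is constant; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_squared_error(actual, predicted):
    """Average of squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Made-up data: x stands in for a predictor column, y for the target column.
train_x = [1.0, 2.0, 3.0, 4.0]
train_y = [3.1, 4.9, 7.2, 8.8]
test_x = [5.0, 6.0]
test_y = [11.0, 13.1]

slope, intercept = fit_line(train_x, train_y)
predictions = [slope * x + intercept for x in test_x]
print(mean_squared_error(test_y, predictions))
```

Evaluating on points that were not used for fitting, as done here, gives a more honest error estimate than scoring the model on its own training data.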
Youtube users in the United Kingdom 2017-2025
statista.com
Updated Jul 10, 2025
Cite
Statista (2025). Youtube users in the United Kingdom 2017-2025 [Dataset]. https://www.statista.com/forecasts/1145489/youtube-users-in-the-united-kingdom
Dataset updated
Jul 10, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
2017 - 2019
Area covered
United Kingdom
Description
In 2021, YouTube's user base in the United Kingdom amounted to approximately ***** million users. The number of YouTube users in the United Kingdom is projected to reach ***** million users by 2025. User figures have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period.

The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to *** countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information).
2023 YouTube Most Viewed Top600
kaggle.com
Updated Dec 11, 2023
Cite
Kanchana1990 (2023). 2023 YouTube Most Viewed Top600 [Dataset]. http://doi.org/10.34740/kaggle/ds/4148346
Unique identifier
https://doi.org/10.34740/kaggle/ds/4148346
Dataset updated
Dec 11, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Kanchana1990
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0), https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Area covered
YouTube
Description
Dataset Name: 2023 YouTube Most Viewed Top600

Description: This dataset, titled "2023YouTubeMostViewed_Top600", comprises a curated selection of the top 600 YouTube videos based on view count, specifically from the year 2023. Each entry in the dataset represents a unique video, encompassing several key metrics:

Title: The title of the video, providing a glimpse into its content.

Published At: The upload date and time of the video, confirming its recency within the 2023 timeframe.

Duration: The length of the video, indicating its watch time.

View Count: The total number of views the video has accumulated, the primary metric for its inclusion in this dataset.

Like Count: The number of likes the video has received, offering insight into viewer engagement.

Comment Count: The total count of comments on the video, further reflecting audience interaction.

It's important to note that while these videos are among the most viewed as of the data retrieval date, the landscape of YouTube is dynamic. View counts are continually changing, and what constitutes the 'most viewed' can fluctuate. Thus, the dataset should be seen as a snapshot of popularity and viewer engagement during a specific period in 2023, rather than an absolute ranking. This dataset is invaluable for analysis of trending content, viewer preferences, and video engagement metrics on YouTube for the year 2023.

Note: Ethically mined data from YouTube
Youtube users in the United States 2017-2025
statista.com
Updated Jul 9, 2025
Cite
Statista (2025). Youtube users in the United States 2017-2025 [Dataset]. https://www.statista.com/forecasts/1147203/youtube-users-in-the-united-states
Dataset updated
Jul 9, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
2017 - 2019
Area covered
United States
Description
In 2021, YouTube's user base in the United States amounted to approximately ****** million users. The number of YouTube users in the United States is projected to reach ****** million users by 2025. User figures have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period.

The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to *** countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information).
Top 200 Youtubers Data (cleaned)
kaggle.com
Updated Jul 8, 2022
Cite
Syed Jafer (2022). Top 200 Youtubers Data (cleaned) [Dataset]. https://www.kaggle.com/syedjaferk/top-200-youtubers-cleaned/discussion
Dataset updated
Jul 8, 2022
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Syed Jafer
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
YouTube is an American online video sharing and social media platform headquartered in San Bruno, California. It was launched on February 14, 2005, by Steve Chen, Chad Hurley, and Jawed Karim. It is owned by Google, and is the second most visited website, after Google Search. YouTube has more than 2.5 billion monthly users who collectively watch more than one billion hours of videos each day. As of May 2019, videos were being uploaded at a rate of more than 500 hours of content per minute.

YouTube is widely used to influence and educate people (users and followers) on specific issues, and for some, including the dataset author, it serves as a free university. This can affect the social order in some ways.
Short Video Engagement Dataset
kaggle.com
Updated Feb 26, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Programmer3 (2025). Short Video Engagement Dataset [Dataset]. https://www.kaggle.com/datasets/programmer3/short-video-engagement-dataset
Dataset updated
Feb 26, 2025
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Programmer3
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
This dataset is web-scraped from popular short video platforms like YouTube Shorts, TikTok, and Instagram Reels. It captures user interaction data, including views, likes, comments, shares, and watch duration, along with multimodal features from video content: text (titles, descriptions), image (visual characteristics), and audio (sound properties). The data has been processed and flattened into a structured CSV format with 17,654 rows.
Countries with the most YouTube users 2025
statista.com
Updated Feb 17, 2025
Cite
Statista (2025). Countries with the most YouTube users 2025 [Dataset]. https://www.statista.com/statistics/280685/number-of-monthly-unique-youtube-users/
Dataset updated
Feb 17, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
Feb 2025
Area covered
Worldwide, YouTube
Description
As of February 2025, India was the country with the largest YouTube audience by far, with approximately 491 million users engaging with the popular social video platform. The United States followed, with around 253 million YouTube viewers. Brazil came in third, with 144 million users watching content on YouTube. The United Kingdom saw around 54.8 million internet users engaging with the platform in the examined period.

What country has the highest percentage of YouTube users?

In July 2024, the United Arab Emirates was the country with the highest YouTube penetration worldwide, as around 94 percent of the country's digital population engaged with the service. In 2024, YouTube counted around 100 million paid subscribers for its YouTube Music and YouTube Premium services.

YouTube mobile markets

In 2024, YouTube was among the most popular social media platforms worldwide. In terms of revenues, the YouTube app generated approximately 28 million U.S. dollars in revenues in the United States in January 2024, as well as 19 million U.S. dollars in Japan.
Streaming Mobile Media Exposure | 1st Party | 3B+ events verified, US...
datarade.ai
.csv, .parquet
Updated Jun 7, 2021
Cite
MFour (2021). Streaming Mobile Media Exposure | 1st Party | 3B+ events verified, US consumers | Netflix, YouTube, Disney+ and Amazon Prime Video [Dataset]. https://datarade.ai/data-products/streaming-mobile-media-exposure-1st-party-3b-events-veri-mfour
Available download formats: .csv, .parquet
Dataset updated
Jun 7, 2021
Dataset authored and provided by
MFour
Area covered
United States of America, YouTube
Description
This dataset encompasses mobile app based media consumption, collected from over 150,000 first-party US Daily Active Users on Android devices. Use it for measurement, journey understanding or to trigger surveys about sentiment. Platforms covered include Netflix, YouTube, Disney+ and Amazon Prime Video.

Fields include pre-roll ads played, viewing duration, channel, category and more. All data tied to demographics, all consumers can be surveyed about viewership (or other topics), and consumer journey understanding can be gleaned combining this dataset with other MFour OmniTraffic® products.
YouTube Crisis Actor Videos and Recommendations
kaggle.com
Updated Dec 18, 2023
Cite
The Devastator (2023). YouTube Crisis Actor Videos and Recommendations [Dataset]. https://www.kaggle.com/datasets/thedevastator/youtube-crisis-actor-videos-and-recommendations/suggestions
Dataset updated
Dec 18, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
The Devastator
License
https://creativecommons.org/publicdomain/zero/1.0/
Area covered
YouTube
Description
YouTube Crisis Actor Videos and Recommendations

Dataset of Crisis Actor Videos on YouTube and their Recommendations

By Jonathan A. [source]

About this dataset

This dataset provides valuable insights into crisis actor videos and their corresponding recommendations on YouTube. It consists of a total of 8823 videos, accounting for an astounding 3,956,454,363 views. These videos were retrieved from YouTube's API and cover various categories and topics.

Specifically, this dataset focuses on crisis actor videos related to mass shootings, false flags, and other conspiracy theories that comprise around 20% of the collection. The remaining 80% explores conspiracies revolving around history, government institutions, and religions.

The dataset includes essential information such as the name and channel of the video uploader. Additionally, it provides details about viewer engagement through likes and dislikes counts. Furthermore, each video is assigned a category or topic to facilitate analysis.

It is important to note that approximately 100 music videos were excluded from the initial data set to maintain relevance to crisis actors.

Overall, this project aims to shed light on the prevalent issue of crisis actors on YouTube by providing researchers with a comprehensive dataset for further exploration and analysis. This highly informative dataset serves as a valuable resource for investigating trends within crisis actor content while contributing towards raising public awareness surrounding this topic

How to use the dataset

Understanding the Dataset:

The dataset comprises several columns that provide specific information about each video and its corresponding recommendations. Here's a brief overview of the key columns:

name: The title or name of the YouTube video.

channel: The name of the YouTube channel that uploaded the video.

category: The category or topic of the video.

views: The number of views the video has received.

likes: The number of likes received by each video.

dislikes: The number of dislikes received by each video.

Exploring Categories:

One way to analyze this dataset is by examining different categories mentioned in each video entry. This could involve identifying patterns within categories or comparing engagement metrics (views, likes, dislikes) across various topics.

For example, you might want to investigate how crisis actor videos are categorized compared to other conspiracy-related videos present in this dataset.

Analyzing Engagement Metrics:

To gain insights into users' response towards different videos related to crisis actors or conspiracy theories, it is recommended that you examine engagement metrics such as views, likes, and dislikes.

You can compare these metrics between individual videos within specific categories or observe trends across all entries.

Investigating Popularity:

Understanding which channels have maximum viewership within this particular subject area can offer valuable information for further analysis.

Examining which channels have consistently high views or engagement metrics (likes/dislikes) can help identify influential content creators related to crisis actors or conspiracy theories.
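Comparing engagement across categories or finding the channels with the highest viewership both come down to grouping rows by a key and summing a metric. The sketch below assumes rows shaped like the columns listed above (name, channel, category, views, likes, dislikes). The rows themselves are invented, and `totals_by` is a hypothetical helper name.

```python
from collections import defaultdict


def totals_by(rows, key, metric):
    """Sum `metric` per distinct value of `key`, sorted from highest to lowest."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key]] += row[metric]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Made-up rows using the dataset's column names.
rows = [
    {"name": "Video 1", "channel": "Channel A", "category": "Topic X",
     "views": 1000, "likes": 40, "dislikes": 5},
    {"name": "Video 2", "channel": "Channel B", "category": "Topic Y",
     "views": 700, "likes": 10, "dislikes": 20},
    {"name": "Video 3", "channel": "Channel A", "category": "Topic Y",
     "views": 500, "likes": 25, "dislikes": 5},
]

print(totals_by(rows, "channel", "views"))
print(totals_by(rows, "category", "likes"))
```

Total views favour channels with many uploads, so a per-video average may be a fairer comparison when channel sizes differ widely.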

Identifying Recommendations:

The dataset also provides information about the recommendations associated with each video entry. By analyzing these recommendations, you can gain insights into the video content YouTube suggests to users who view crisis actor videos.

You could focus on specific keywords within recommendation titles or explore patterns in terms of topic relevance or common recommendations across multiple entries.

Cross-Referencing External Information:

As this dataset does not provide detailed descriptions or context for each video, it is advisable to cross-reference external sources to gather additional information if needed.

By using the provided video titles and channel names, you can search for more details about specific videos

Research Ideas

Analyzing the correlation between likes, dislikes, and views: This dataset can be used to analyze the relationship between the number of likes and dislikes a video receives and its overall views. By examining this relationship, one could gain insights into factors that contribute to increased engagement or disinterest in crisis actor videos.

Identifying popular YouTube channels in the crisis actor category: By analyzing the dataset, one can identify which YouTube channels have uploaded the most crisis actor videos and have gained high viewership. Th...
Microsoft Excel dataset file of YouTube videos.
plos.figshare.com
xlsx
Updated Nov 29, 2023
Cite
Dan Sun; Guochang Zhao (2023). Microsoft Excel dataset file of YouTube videos. [Dataset]. http://doi.org/10.1371/journal.pone.0294665.s002
Available download formats: xlsx
Unique identifier
https://doi.org/10.1371/journal.pone.0294665.s002
Dataset updated
Nov 29, 2023
Dataset provided by
PLOS ONE
Authors
Dan Sun; Guochang Zhao
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
YouTube
Description
News dissemination plays a vital role in supporting people to incorporate beneficial actions during public health emergencies, thereby significantly reducing the adverse influences of events. Based on big data from YouTube, this research study takes the declaration of COVID-19 National Public Health Emergency (PHE) as the event impact and employs a DiD model to investigate the effect of PHE on the news dissemination strength of relevant videos. The study findings indicate that the views, comments, and likes on relevant videos significantly increased during the COVID-19 public health emergency. Moreover, the public’s response to PHE has been rapid, with the highest growth in comments and views on videos observed within the first week of the public health emergency, followed by a gradual decline and returning to normal levels within four weeks. In addition, during the COVID-19 public health emergency, in the context of different types of media, lifestyle bloggers, local media, and institutional media demonstrated higher growth in the news dissemination strength of relevant videos as compared to news & political bloggers, foreign media, and personal media, respectively. Further, the audience attracted by related news tends to display a certain level of stickiness, therefore this audience may subscribe to these channels during public health emergencies, which confirms the incentive mechanisms of social media platforms to foster relevant news dissemination during public health emergencies. The proposed findings provide essential insights into effective news dissemination in potential future public health events.
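For readers unfamiliar with the DiD (difference-in-differences) approach named above, the sketch below shows the basic two-group, two-period estimate: the change in the treated group minus the change in the control group. It is a generic textbook form, not the study's actual specification, which is not given here and may include controls or fixed effects. The view counts and group labels are invented for illustration.

```python
from statistics import mean


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Two-by-two difference-in-differences on group means."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Made-up weekly view counts before and after an event.
relevant_before = [100, 120]    # videos related to the event (treated)
relevant_after = [180, 200]
unrelated_before = [90, 110]    # unrelated videos (control)
unrelated_after = [110, 130]

effect = did_estimate(
    relevant_before, relevant_after, unrelated_before, unrelated_after
)
print(effect)
```

The estimate is only interpretable as an effect of the event if the two groups would have followed parallel trends without it, which is the standard DiD assumption.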
YouTube users in India 2020-2029
statista.com
Updated Jul 10, 2025
Cite
Statista (2025). YouTube users in India 2020-2029 [Dataset]. https://www.statista.com/forecasts/1146150/youtube-users-in-india
Dataset updated
Jul 10, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
India
Description
The number of YouTube users in India was forecast to increase continuously between 2024 and 2029 by a total of ***** million users (+***** percent). After the ninth consecutive year of growth, the YouTube user base is estimated to reach ****** million users, a new peak, in 2029. Notably, the number of YouTube users has increased continuously over the past years.

User figures, shown here for the platform YouTube, have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period.

The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to *** countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information).

Find more key insights for the number of YouTube users in countries like Sri Lanka and Nepal.
TV Analytics Market size was USD 3815.2 million in 2024!
cognitivemarketresearch.com
pdf,excel,csv,ppt
Updated Aug 20, 2025
Cite
Cognitive Market Research (2025). TV Analytics Market size was USD 3815.2 million in 2024! [Dataset]. https://www.cognitivemarketresearch.com/tv-analytics-market-report
Explore at:
Available download formats: pdf, excel, csv, ppt
Dataset updated
Aug 20, 2025
Dataset authored and provided by
Cognitive Market Research
License
https://www.cognitivemarketresearch.com/privacy-policy
Time period covered
2021 - 2033
Area covered
Global
Description
According to Cognitive Market Research, the global TV analytics market size is USD 3815.2 million in 2024 and will expand at a compound annual growth rate (CAGR) of 18.20% from 2024 to 2031.

North America held the largest share, more than 40% of global revenue, with a market size of USD 1526.08 million in 2024, and will grow at a compound annual growth rate (CAGR) of 16.4% from 2024 to 2031.

Europe accounted for over 30% of the global market, with a market size of USD 1144.56 million.

Asia Pacific held around 23% of global revenue, with a market size of USD 877.50 million in 2024, and will grow at a CAGR of 20.2% from 2024 to 2031.

Latin America held more than 5% of global revenue, with a market size of USD 190.76 million in 2024, and will grow at a CAGR of 17.6% from 2024 to 2031.

Middle East and Africa held around 2% of global revenue, with a market size of USD 76.30 million in 2024, and will grow at a CAGR of 17.9% from 2024 to 2031.

The on-premise segment is set to rise, as on-premise solutions for OTT platforms are reasonably cost-effective in terms of equipment composition and cabling infrastructure. Additionally, under this model, viewers are authorized to determine the type of content, which results in more control. The TV analytics market is driven by growing consumer demand for digital original series, and the growing trend of subscription video-on-demand (SVoD) platforms has further fuelled industry expansion. Significant demand for the numerous genres and plays available on over-the-top (OTT) platforms such as Netflix and Amazon is contributing to market development.
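The regional figures above follow from applying each stated share to the 2024 global total, and the growth rates use the usual compound-growth formula. The sketch below checks that arithmetic. It treats the hedged shares ("more than 40%", "around 23%") as exact percentages, which is an assumption, and the 2031 projection is only an illustration of the CAGR formula, not a figure from the report.

```python
TOTAL_2024 = 3815.2  # global market size in USD million, from the text

# Shares treated as exact values; the text qualifies them with
# "more than", "over", or "around", so this is an assumption.
SHARES = {
    "North America": 0.40,
    "Europe": 0.30,
    "Asia Pacific": 0.23,
    "Latin America": 0.05,
    "Middle East and Africa": 0.02,
}


def regional_sizes(total, shares):
    """Apply each regional share to the global total (USD million)."""
    return {region: round(total * share, 2) for region, share in shares.items()}


def project(value, cagr, years):
    """Compound a value forward at a constant annual growth rate."""
    return value * (1 + cagr) ** years


sizes = regional_sizes(TOTAL_2024, SHARES)
for region, size in sizes.items():
    print(f"{region}: USD {size:.2f} million")

# Illustrative only: the global total compounded at 18.20% for 7 years.
projected_2031 = project(TOTAL_2024, 0.182, 2031 - 2024)
print(f"Projected 2031 (illustrative): USD {projected_2031:.1f} million")
```

The five shares used here sum to 100%, which is consistent with the regional sizes adding up to the global total.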

Integration of Advanced Technologies to Provide Viable Market Output

The TV analytics market is rapidly evolving with the integration of advanced technologies. Innovations such as AI-driven content recognition, real-time data processing, and machine learning algorithms transform how broadcasters and advertisers analyze audience behavior and content performance. These technologies enable precise targeting, personalized recommendations, and insightful audience insights, revolutionizing advertising strategies and content creation. As the industry embraces these advancements, it fosters more efficient decision-making processes and enhances the overall viewer experience, driving the evolution of television analytics.

For instance, in July 2022, MiQ launched its analytics and measurement capability for cross-channel YouTube and TV campaigns in the UK. The solution bridges the gap between the two channels. By connecting these often-disparate datasets, brands can reach almost 100% of their target viewers on YouTube and calculate reach deterministically across these channels.

(Source:https://www.wearemiq.com/press-releases/miqs-youtube-and-tv-analytics-capability-officially-lands-in-the-uk/ )

Increasing Digitalization and Shifting Viewer Preference to Propel Market Growth

The TV analytics market is experiencing significant growth due to increasing digitalization and shifting viewer preferences. As more viewers consume content across various digital platforms, there is a heightened need for data-driven insights into audience behavior and content performance. With the expansion of streaming services and on-demand viewing, traditional TV networks and advertisers are investing in analytics tools to understand viewer engagement, demographics, and content consumption patterns. This trend underscores the critical role of analytics in optimizing content strategies and advertising campaigns amidst evolving viewer dynamics.

For instance, in December 2022, TV analytics firm TVSquared launched its cross-platform measurement and attribution platform for all types of TV, ADvantage XP, in the UK and Germany. The scalable solution brings continuous and impression-based measurement of ad exposure and outcomes to TV campaigns across linear, streaming, and addressable TV.

(Source:https://www.marketingtechnews.net/news/2021/nov/02/tvsquared-launches-cross-platform-analytics-solution-in-uk-and-germany/ )

Complexity of Measuring Viewership across Multiple Platforms to Restrict Market Growth

The TV analytics market faces challenges in measuring viewership across multiple platforms due to the proliferation of streaming services, DVR, an...
Face Dataset Of People That Don't Exist
kaggle.com
Updated Sep 8, 2023
BwandoWando (2023). Face Dataset Of People That Don't Exist [Dataset]. http://doi.org/10.34740/kaggle/dsv/6433550
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Unique identifier
https://doi.org/10.34740/kaggle/dsv/6433550
Dataset updated
Sep 8, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
BwandoWando
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Context

All the images of faces here are generated using https://thispersondoesnotexist.com/

Copyrighting of AI Generated images

Under US copyright law, these images are technically not subject to copyright protection. Only "original works of authorship" are considered. "To qualify as a work of 'authorship' a work must be created by a human being," according to a US Copyright Office's report [PDF].

https://www.theregister.com/2022/08/14/ai_digital_artwork_copyright/

Tagging

I manually tagged all images as best as I could and separated them between the two classes below

Female- 3860 images

Male- 3013 images

Some may pass either female or male, but I will leave it to you to do the reviewing. I included toddlers and babies under Male/ Female

How it works

Each of the faces is entirely fake, created using a type of algorithm called a generative adversarial network (GAN).

A generative adversarial network (GAN) is a class of machine learning frameworks designed by Ian Goodfellow and his colleagues in June 2014. Two neural networks contest with each other in a game (in the form of a zero-sum game, where one agent's gain is another agent's loss).

Given a training set, this technique learns to generate new data with the same statistics as the training set. For example, a GAN trained on photographs can generate new photographs that look at least superficially authentic to human observers, having many realistic characteristics. Though originally proposed as a form of generative model for unsupervised learning, GANs have also proved useful for semi-supervised learning, fully supervised learning, and reinforcement learning.
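The zero-sum game described above is usually written as the minimax objective from the original GAN paper (Goodfellow et al., 2014). The notation is the standard one and does not come from this dataset page: D is the discriminator, G the generator, p_data the distribution of real images, and p_z the noise prior fed to the generator.

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_z(z)}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]
```

The discriminator tries to increase V by telling real samples from generated ones, while the generator tries to decrease it by producing samples the discriminator accepts as real.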

https://www.youtube.com/watch?v=u8qPvzk0AfY

https://www.youtube.com/watch?v=dCKbRCUyop8

https://www.youtube.com/watch?v=SWoravHhsUU

Github implementation of website

https://github.com/NVlabs/stylegan2

https://github.com/lucidrains/stylegan2-pytorch

https://github.com/lucidrains/lightweight-gan

How I gathered the images

Just a simple Jupyter notebook that looped and invoked the website https://thispersondoesnotexist.com/ , saving all images locally
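The collection loop described above might look roughly like the sketch below. This is not the author's notebook: the function names, the file naming scheme, the delay between requests, and the assumption that the site returns raw image bytes at its root URL are all hypothetical.

```python
import time
import urllib.request
from pathlib import Path

SITE_URL = "https://thispersondoesnotexist.com/"


def fetch_image(url=SITE_URL):
    """Fetch one response body; assumes the URL returns raw image bytes."""
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read()


def save_faces(count, out_dir, fetch=fetch_image, delay_seconds=1.0):
    """Call `fetch` `count` times and save each result as a numbered file."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    saved = []
    for index in range(count):
        target = out_path / f"face_{index:05d}.jpg"  # hypothetical naming scheme
        target.write_bytes(fetch())
        saved.append(target)
        # Pause between requests to avoid hammering the site.
        if delay_seconds and index < count - 1:
            time.sleep(delay_seconds)
    return saved
```

Calling `save_faces(10, "faces")` would attempt ten downloads into a local `faces` directory. The `fetch` parameter exists so the loop can be exercised with a stand-in function instead of a live request.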
Data from: The viewer doesn’t always seem to care - response to fake animal...
data.niaid.nih.gov
datadryad.org
zip
Updated Oct 14, 2022
Lauren Harrington; Angie Elwin; Suzi Patterson; Neil D'Cruze (2022). The viewer doesn’t always seem to care - response to fake animal rescues on YouTube and implications for social media self-policing policies [Dataset]. http://doi.org/10.5061/dryad.q573n5tn6
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.5061/dryad.q573n5tn6
Dataset updated
Oct 14, 2022
Dataset provided by
World Animal Protection (http://worldanimalprotection.org/)
University of Oxford
Authors
Lauren Harrington; Angie Elwin; Suzi Patterson; Neil D'Cruze
License
https://spdx.org/licenses/CC0-1.0.html
Area covered
YouTube
Description
Animal-related content on social media is hugely popular but is not always appropriate in terms of how animals are portrayed or how they are treated. This has potential implications beyond the individual animals involved, for viewers, for wild animal populations, and for societies and their interactions with animals. Whilst social media platforms usually publish guidelines for permitted content, enforcement relies at least in part on viewers reporting inappropriate posts. Currently, there is no external regulation of social media platforms. Based on a set of 241 "fake animal rescue" videos that exhibited clear signs of animal cruelty and strong evidence of being deliberately staged (i.e. fake), we found little evidence that viewers disliked the videos and an overall mixed response in terms of awareness of the fake nature of the videos, and their attitudes towards the welfare of the animals involved. Our findings suggest, firstly, that despite the narrowly defined nature of the videos used in this case study, exposure rates can be extremely high (one of the videos had been viewed over 100 million times), and, secondly, that many YouTube viewers cannot identify (or are not concerned by) animal welfare or conservation issues within a social media context. In terms of the current policy approach of social media platforms, our findings raise questions regarding the value of their current reliance on consumers as watchdogs.

Methods

Data collection

The dataset pertains to 241 YouTube videos identified using the search function in YouTube and the search terms "primitive man saves" and "primitive boy saves" between May and July 2021, supplemented with additional similar videos held in a database collated by Animals for Asia (www.asiaforanimals.com). Video metrics were extracted automatically between 24.06.21 and 02.08.21 using the "tuber" package in R (Sood 2020, https://cran.r-project.org/web/packages/tuber/tuber.pdf ). Additional information (e.g. on animal taxa) was obtained manually by screening the videos. For five of the videos that received > 1,000 comments, comment text was also extracted using the tuber package. Only publicly available videos were accessed.

Data processing

Users (video posters and commenters) have been de-identified. For each video for which comment text was analysed, the text was converted into a list of the most frequently used words and emojis. Please refer to the manuscript for further details on the methods and approach used to identify and define the most frequently used words/emojis, and to assign sentiment scores.
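The conversion of comment text into a list of the most frequent words and emojis can be sketched as follows. The authors worked in R and their exact tokenisation is described in the manuscript, so this Python version is only an illustration: the regular expressions, the function name `top_tokens`, and the sample comments are assumptions.

```python
import re
from collections import Counter

# Rough emoji ranges, chosen for illustration; not the authors' definition.
EMOJI_PATTERN = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")
# Lowercase ASCII words, allowing apostrophes (e.g. "don't").
WORD_PATTERN = re.compile(r"[a-z']+")


def top_tokens(comments, n=10):
    """Count words and emojis across comments and return the n most common."""
    counts = Counter()
    for comment in comments:
        counts.update(WORD_PATTERN.findall(comment.lower()))
        counts.update(EMOJI_PATTERN.findall(comment))
    return counts.most_common(n)


# Invented sample comments, not taken from the dataset.
sample_comments = ["So cute 😍", "cute but fake", "FAKE 😍 fake"]
top = top_tokens(sample_comments, n=3)
print(top)
```

A real analysis would likely also remove stop words and handle non-English text, both of which this sketch ignores.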
Data (i.e., evidence) about evidence based medicine
figshare.com
search.datacite.org
png
Updated May 30, 2023
Jorge H Ramirez (2023). Data (i.e., evidence) about evidence based medicine [Dataset]. http://doi.org/10.6084/m9.figshare.1093997.v24
Explore at:
Available download formats: png
Unique identifier
https://doi.org/10.6084/m9.figshare.1093997.v24
Dataset updated
May 30, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Jorge H Ramirez
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Update (December 7, 2014): Evidence-based medicine (EBM) is not working for many reasons, for example:

1. Incorrect in their foundations (paradox): hierarchical levels of evidence are supported by opinions (i.e., lowest strength of evidence according to EBM) instead of real data collected from different types of study designs (i.e., evidence). http://dx.doi.org/10.6084/m9.figshare.1122534

2. The effect of criminal practices by pharmaceutical companies is only possible because of the complicity of others: healthcare systems, professional associations, governmental and academic institutions. Pharmaceutical companies also corrupt at the personal level: politicians and political parties are on their payroll, and medical professionals are seduced by different types of gifts in exchange for prescriptions (i.e., bribery), which very likely results in patients not receiving the proper treatment for their disease. Many times there is no such thing: healthy persons not needing pharmacological treatments of any kind are constantly misdiagnosed and treated with unnecessary drugs. Some medical professionals are converted into K.O.L.s, who are only puppets appearing on stage to spread lies to their peers: a person supposedly trained to improve the well-being of others now deceives on behalf of pharmaceutical companies. Probably the saddest thing is that many honest doctors are being misled by these lies, created by the rules of pharmaceutical marketing instead of scientific, medical, and ethical principles. Interpretation of EBM in this context was not anticipated by their creators.

“The main reason we take so many drugs is that drug companies don’t sell drugs, they sell lies about drugs.” (Peter C. Gøtzsche)

“doctors and their organisations should recognise that it is unethical to receive money that has been earned in part through crimes that have harmed those people whose interests doctors are expected to take care of. Many crimes would be impossible to carry out if doctors weren’t willing to participate in them.” (Peter C Gøtzsche, The BMJ, 2012, Big pharma often commits corporate crime, and this must be stopped.)

Pending (Colombia): Health Promoter Entities (in Spanish: EPS, Empresas Promotoras de Salud).

Misinterpretations

New technologies or concepts are difficult to understand in the beginning, no matter how simple they are; we need to get used to new tools aimed at improving our professional practice. Probably the best explanation is in these videos (credits to Antonio Villafaina for sharing these videos with me).

English: https://www.youtube.com/watch?v=pQHX-SjgQvQ&w=420&h=315

Spanish: https://www.youtube.com/watch?v=DApozQBrlhU&w=420&h=315

Hypothesis: hierarchical levels of evidence based medicine are wrong

Dear Editor, I have data to support the hypothesis described in the title of this letter. Before rejecting the null hypothesis I would like to ask the following open question: Could you support with data that hierarchical levels of evidence based medicine are correct? (1,2)

Additional explanation to this question:

– Only respond to this question attaching publicly available raw data.

– Be aware that more than a question this is a challenge: I have data (i.e., evidence) which is contrary to classic (i.e., McMaster) or current (i.e., Oxford) hierarchical levels of evidence based medicine. An important part of this data (but not all) is publicly available.

References

Ramirez, Jorge H (2014): The EBM challenge. figshare. http://dx.doi.org/10.6084/m9.figshare.1135873

The EBM Challenge Day 1: No Answers. Competing interests: I endorse the principles of open data in human biomedical research. Read this letter on The BMJ, August 13, 2014: http://www.bmj.com/content/348/bmj.g3725/rr/762595 Re: Greenhalgh T, et al. Evidence based medicine: a movement in crisis? BMJ 2014; 348: g3725.

Fileset contents

Raw data: Excel archive: Raw data, interactive figures, and PubMed search terms. Google Spreadsheet is also available (URL below the article description).

Figure 1. Unadjusted (Fig 1A) and adjusted (Fig 1B) PubMed publication trends (01/01/1992 to 30/06/2014).

Figure 2. Adjusted PubMed publication trends (07/01/2008 to 29/06/2014).

Figure 3. Google search trends: Jan 2004 to Jun 2014 / 1-week periods.

Figure 4. PubMed publication trends (1962-2013): systematic reviews and meta-analysis, clinical trials, and observational studies.

Figure 5. Ramirez, Jorge H (2014): Infographics: Unpublished US phase 3 clinical trials (2002-2014) completed before Jan 2011 = 50.8%. figshare. http://dx.doi.org/10.6084/m9.figshare.1121675

Raw data: "13377 studies found for: Completed | Interventional Studies | Phase 3 | received from 01/01/2002 to 01/01/2014 | Worldwide". This database complies with the terms and conditions of ClinicalTrials.gov: http://clinicaltrials.gov/ct2/about-site/terms-conditions

Supplementary Figures (S1-S6). PubMed publication delay in the indexation processes does not explain the descending trends in the scientific output of evidence-based medicine.

Acknowledgments

I would like to acknowledge the following persons for providing valuable concepts in data visualization and infographics:

Maria Fernanda Ramírez. Professor of graphic design. Universidad del Valle. Cali, Colombia.

Lorena Franco. Graphic design student. Universidad del Valle. Cali, Colombia.

Related articles by this author (Jorge H. Ramírez)

Ramirez JH. Lack of transparency in clinical trials: a call for action. Colomb Med (Cali) 2013;44(4):243-6. URL: http://www.ncbi.nlm.nih.gov/pubmed/24892242

Ramirez JH. Re: Evidence based medicine is broken (17 June 2014). http://www.bmj.com/node/759181

Ramirez JH. Re: Global rules for global health: why we need an independent, impartial WHO (19 June 2014). http://www.bmj.com/node/759151

Ramirez JH. PubMed publication trends (1992 to 2014): evidence based medicine and clinical practice guidelines (04 July 2014). http://www.bmj.com/content/348/bmj.g3725/rr/759895

Recommended articles

Greenhalgh Trisha, Howick Jeremy, Maskrey Neal. Evidence based medicine: a movement in crisis? BMJ 2014; 348:g3725

Spence Des. Evidence based medicine is broken. BMJ 2014; 348:g22

Schünemann Holger J, Oxman Andrew D, Brozek Jan, Glasziou Paul, Jaeschke Roman, Vist Gunn E et al. Grading quality of evidence and strength of recommendations for diagnostic tests and strategies. BMJ 2008; 336:1106

Lau Joseph, Ioannidis John P A, Terrin Norma, Schmid Christopher H, Olkin Ingram. The case of the misleading funnel plot. BMJ 2006; 333:597

Moynihan R, Henry D, Moons KGM (2014) Using Evidence to Combat Overdiagnosis and Overtreatment: Evaluating Treatments, Tests, and Disease Definitions in the Time of Too Much. PLoS Med 11(7): e1001655. doi:10.1371/journal.pmed.1001655

Katz D. A-holistic view of evidence based medicine. http://thehealthcareblog.com/blog/2014/05/02/a-holistic-view-of-evidence-based-medicine/
youtube-kids-ads-www24
zenodo.org
zip
Updated Feb 20, 2024
Aima Shahid; Emaan Bilal Khan; Nida Tanveer (2024). youtube-kids-ads-www24 [Dataset]. http://doi.org/10.5281/zenodo.10685031
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.10685031
Dataset updated
Feb 20, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Aima Shahid; Emaan Bilal Khan; Nida Tanveer
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Area covered
YouTube
Description

Our methodology for gathering "made for kids" videos on YouTube involves identifying the most popular children’s videos worldwide. We utilise data from Social Blade, a YouTube-certified user-analytics platform, which maintains a list of the most popular channels with the "made for kids" tag, ranked based on total lifetime views. We select the top 10 most popular videos from the 75 highest-viewed kids channels on this list, by querying the YouTube API. This forms our labelled video dataset, comprising 750 videos, capturing a broad spectrum of content and styles which are likely to attract a large number of young viewers worldwide.
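The selection step above (top 10 videos from each of 75 channels, giving 750 videos) can be sketched as below. The authors query the YouTube API for this. The sketch instead works on an in-memory list with hypothetical field names (`channel`, `video_id`, `views`) and synthetic data, so it only illustrates the ranking logic.

```python
from collections import defaultdict


def top_videos_per_channel(videos, channels, per_channel=10):
    """Return the `per_channel` most-viewed videos for each listed channel."""
    by_channel = defaultdict(list)
    for video in videos:
        by_channel[video["channel"]].append(video)

    selected = []
    for channel in channels:
        ranked = sorted(by_channel[channel], key=lambda v: v["views"], reverse=True)
        selected.extend(ranked[:per_channel])
    return selected


# Synthetic stand-in data: 75 channels with 12 videos each.
channels = [f"channel_{i:02d}" for i in range(75)]
videos = [
    {"channel": c, "video_id": f"{c}_v{j}", "views": (j + 1) * 1000}
    for c in channels
    for j in range(12)
]

labelled = top_videos_per_channel(videos, channels)
print(len(labelled))
```

With 75 channels and 10 videos kept per channel, the result has the 750 entries the description mentions.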

While focusing on "made for kids" channels is a useful starting point for analysing ad patterns on kids' videos, it is also important to consider the wider landscape of child-oriented content on the platform, much of which remains unlabelled. To build a representative dataset of such videos, we use seed search words reflecting popular child interests, some of which include "toys", "kids cartoon", and "Barbie." The results are then parsed to find popular channels with unlabelled content, with a minimum threshold of 400,000 views.
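A minimal sketch of the filtering step for unlabelled content follows. The description does not say whether the 400,000-view threshold applies per video or per channel; the sketch assumes per video. The field names and sample records are hypothetical, and the seed list contains only the three terms quoted in the text.

```python
SEED_TERMS = ["toys", "kids cartoon", "Barbie"]  # the three terms quoted above
MIN_VIEWS = 400_000


def popular_unlabelled_channels(search_results, min_views=MIN_VIEWS):
    """Return channels with at least one unlabelled video meeting the threshold.

    Each result is a dict with hypothetical keys 'channel', 'views',
    and 'made_for_kids'.
    """
    channels = {
        result["channel"]
        for result in search_results
        if not result["made_for_kids"] and result["views"] >= min_views
    }
    return sorted(channels)


# Invented search results, standing in for what a seed-term search returns.
sample_results = [
    {"channel": "ToyReviewsTV", "views": 950_000, "made_for_kids": False},
    {"channel": "ToyReviewsTV", "views": 120_000, "made_for_kids": False},
    {"channel": "CartoonHub", "views": 2_000_000, "made_for_kids": True},
    {"channel": "SmallChannel", "views": 30_000, "made_for_kids": False},
]

print(popular_unlabelled_channels(sample_results))
```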

Next, we scrape ad data across all videos for further analysis, covering all major ad formats on the platform including (i) skippable and (ii) unskippable video ads, (iii) sidebar ads, (iv) in-feed ads, and (v) banner ads. We use a Selenium Webdriver script launched in a new logged-out Chrome window, with no previous history, cookies, or user data. We then scrape each ad’s unique YouTube-assigned video ID, and any embedded external link as the video plays.

Next, we use YouTube Data API to obtain additional metadata like video title, duration, and "made for kids" label for each video ad, the result of which is recorded in the dataset. The videos are played from different VPN locations to explore the varied experiences based on geographical location.
The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS)
zenodo.org
data.niaid.nih.gov
zip
Updated Oct 19, 2024
Steven R. Livingstone; Frank A. Russo (2024). The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) [Dataset]. http://doi.org/10.5281/zenodo.1188976
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.1188976
Dataset updated
Oct 19, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Steven R. Livingstone; Frank A. Russo
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
Description

The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) contains 7356 files (total size: 24.8 GB). The dataset contains 24 professional actors (12 female, 12 male), vocalizing two lexically-matched statements in a neutral North American accent. Speech includes calm, happy, sad, angry, fearful, surprise, and disgust expressions, and song contains calm, happy, sad, angry, and fearful emotions. Each expression is produced at two levels of emotional intensity (normal, strong), with an additional neutral expression. All conditions are available in three modality formats: Audio-only (16bit, 48kHz .wav), Audio-Video (720p H.264, AAC 48kHz, .mp4), and Video-only (no sound). Note, there are no song files for Actor_18.

The RAVDESS was developed by Dr Steven R. Livingstone, who now leads the Affective Data Science Lab, and Dr Frank A. Russo who leads the SMART Lab.

Citing the RAVDESS

The RAVDESS is released under a Creative Commons Attribution license, so please cite the RAVDESS if it is used in your work in any form. Published academic papers should use the academic paper citation for our PLoS1 paper. Personal works, such as machine learning projects/blog posts, should provide a URL to this Zenodo page, though a reference to our PLoS1 paper would also be appreciated.

Academic paper citation

Livingstone SR, Russo FA (2018) The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS): A dynamic, multimodal set of facial and vocal expressions in North American English. PLoS ONE 13(5): e0196391. https://doi.org/10.1371/journal.pone.0196391.

Personal use citation

Include a link to this Zenodo page - https://zenodo.org/record/1188976

Commercial Licenses

Commercial licenses for the RAVDESS can be purchased. For more information, please visit our license page of fees, or contact us at ravdess@gmail.com.

Contact Information

If you would like further information about the RAVDESS, to purchase a commercial license, or if you experience any issues downloading files, please contact us at ravdess@gmail.com.

Example Videos

Watch a sample of the RAVDESS speech and song videos.

Emotion Classification Users

If you're interested in using machine learning to classify emotional expressions with the RAVDESS, please see our new RAVDESS Facial Landmark Tracking data set [Zenodo project page].

Construction and Validation

Full details on the construction and perceptual validation of the RAVDESS are described in our PLoS ONE paper - https://doi.org/10.1371/journal.pone.0196391.

The RAVDESS contains 7356 files. Each file was rated 10 times on emotional validity, intensity, and genuineness. Ratings were provided by 247 individuals who were characteristic of untrained adult research participants from North America. A further set of 72 participants provided test-retest data. High levels of emotional validity, interrater reliability, and test-retest intrarater reliability were reported. Validation data is open-access, and can be downloaded along with our paper from PLoS ONE.

Contents

Audio-only files

Audio-only files of all actors (01-24) are available as two separate zip files (~200 MB each):

Speech file (Audio_Speech_Actors_01-24.zip, 215 MB) contains 1440 files: 60 trials per actor x 24 actors = 1440.

Song file (Audio_Song_Actors_01-24.zip, 198 MB) contains 1012 files: 44 trials per actor x 23 actors = 1012.

Audio-Visual and Video-only files

Video files are provided as separate zip downloads for each actor (01-24, ~500 MB each), and are split into separate speech and song downloads:

Speech files (Video_Speech_Actor_01.zip to Video_Speech_Actor_24.zip) collectively contain 2880 files: 60 trials per actor x 2 modalities (AV, VO) x 24 actors = 2880.

Song files (Video_Song_Actor_01.zip to Video_Song_Actor_24.zip) collectively contain 2024 files: 44 trials per actor x 2 modalities (AV, VO) x 23 actors = 2024.

File Summary

In total, the RAVDESS collection includes 7356 files (2880+2024+1440+1012 files).
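The file counts above can be reproduced from the design described earlier. The short check below uses only numbers stated in the text; the per-actor trial counts are derived from the listed emotions, intensities, statements, and repetitions, and the variable names are just labels for this sketch.

```python
ACTORS = 24
ACTORS_WITH_SONG = 23   # there are no song files for Actor_18
STATEMENTS = 2
REPETITIONS = 2
INTENSITIES = 2         # normal and strong; neutral has normal only
VIDEO_MODALITIES = 2    # audio-video (AV) and video-only (VO)

SPEECH_EMOTIONS = 7     # calm, happy, sad, angry, fearful, surprise, disgust
SONG_EMOTIONS = 5       # calm, happy, sad, angry, fearful

# Each emotion at two intensities, plus one neutral expression.
neutral_trials = STATEMENTS * REPETITIONS
speech_trials = SPEECH_EMOTIONS * INTENSITIES * STATEMENTS * REPETITIONS + neutral_trials
song_trials = SONG_EMOTIONS * INTENSITIES * STATEMENTS * REPETITIONS + neutral_trials

audio_speech = speech_trials * ACTORS
audio_song = song_trials * ACTORS_WITH_SONG
video_speech = speech_trials * VIDEO_MODALITIES * ACTORS
video_song = song_trials * VIDEO_MODALITIES * ACTORS_WITH_SONG

total_files = audio_speech + audio_song + video_speech + video_song
print(speech_trials, song_trials, total_files)
```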

File naming convention

Each of the 7356 RAVDESS files has a unique filename. The filename consists of a 7-part numerical identifier (e.g., 02-01-06-01-02-01-12.mp4). These identifiers define the stimulus characteristics:

Filename identifiers

Modality (01 = full-AV, 02 = video-only, 03 = audio-only).

Vocal channel (01 = speech, 02 = song).

Emotion (01 = neutral, 02 = calm, 03 = happy, 04 = sad, 05 = angry, 06 = fearful, 07 = disgust, 08 = surprised).

Emotional intensity (01 = normal, 02 = strong). NOTE: There is no strong intensity for the 'neutral' emotion.

Statement (01 = "Kids are talking by the door", 02 = "Dogs are sitting by the door").

Repetition (01 = 1st repetition, 02 = 2nd repetition).

Actor (01 to 24. Odd numbered actors are male, even numbered actors are female).

Filename example: 02-01-06-01-02-01-12.mp4

Video-only (02)

Speech (01)

Fearful (06)

Normal intensity (01)

Statement "dogs" (02)

1st Repetition (01)

12th Actor (12)

Female, as the actor ID number is even.
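The naming convention lends itself to a small parser. The sketch below decodes the seven fields using the code tables listed above. The function name `parse_ravdess_filename` and the keys of the returned dictionary are choices made for this example, not part of the RAVDESS distribution.

```python
from pathlib import Path

MODALITY = {"01": "full-AV", "02": "video-only", "03": "audio-only"}
VOCAL_CHANNEL = {"01": "speech", "02": "song"}
EMOTION = {
    "01": "neutral", "02": "calm", "03": "happy", "04": "sad",
    "05": "angry", "06": "fearful", "07": "disgust", "08": "surprised",
}
INTENSITY = {"01": "normal", "02": "strong"}
STATEMENT = {
    "01": "Kids are talking by the door",
    "02": "Dogs are sitting by the door",
}


def parse_ravdess_filename(filename):
    """Decode a RAVDESS filename into its seven stimulus characteristics."""
    parts = Path(filename).stem.split("-")
    if len(parts) != 7:
        raise ValueError(f"expected 7 hyphen-separated fields, got {len(parts)}")
    modality, channel, emotion, intensity, statement, repetition, actor = parts
    actor_id = int(actor)
    return {
        "modality": MODALITY[modality],
        "vocal_channel": VOCAL_CHANNEL[channel],
        "emotion": EMOTION[emotion],
        "intensity": INTENSITY[intensity],
        "statement": STATEMENT[statement],
        "repetition": int(repetition),
        "actor": actor_id,
        # Odd-numbered actors are male, even-numbered actors are female.
        "actor_sex": "female" if actor_id % 2 == 0 else "male",
    }


info = parse_ravdess_filename("02-01-06-01-02-01-12.mp4")
print(info)
```

For the example filename this yields a video-only, spoken, fearful, normal-intensity recording of the "dogs" statement, first repetition, by actor 12 (female), matching the breakdown above.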

License information

The RAVDESS is released under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, CC BY-NC-SA 4.0

Commercial licenses for the RAVDESS can also be purchased. For more information, please visit our license fee page, or contact us at ravdess@gmail.com.

Related Data sets

RAVDESS Facial Landmark Tracking data set [Zenodo project page].


The Devastator (2022). YouTube Videos and Channels Metadata [Dataset]. https://www.kaggle.com/datasets/thedevastator/revealing-insights-from-youtube-video-and-channe

YouTube Videos and Channels Metadata

Analyze the statistical relation between videos and form a topic tree

How to use the dataset

In general, it is important to understand each parameter in the data set before proceeding with analysis. The parameters included are totalviews/channelelapsedtime, channelViewCount, likes/subscriber, views/subscribers, subscriberCounts, dislikes/views, comments/subscriber, channelCommentCounts, likes/dislikes, comments/views, dislikes/subscribers, and totviewes /totsubsvews /elapsedtime.

To use this dataset for your own analysis:

1) Review each parameter’s meaning and purpose in our dataset.

2) Get familiar with basic descriptive statistics such as mean, median, mode, and range.

3) Create visualizations or tables based on subsets of our data.

4) Understand correlations between different sets of variables or parameters.

5) Generate meaningful conclusions about specific channels or topics based on organized graph hierarchies or tables.

6) Analyze trends over time for individual parameters as well as an aggregate reaction from all users when videos are released.
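Steps 2 and 4 above (descriptive statistics and correlations) can be sketched with the Python standard library. The two column names come from the Columns table of this dataset, but the rows are invented for illustration and the helper names `load_columns` and `pearson` are hypothetical. In practice the rows would be read from YouTubeDataset_withChannelElapsed.csv.

```python
import csv
import io
import statistics

# Invented rows; only the column names come from the dataset description.
SAMPLE_CSV = """channelViewCount,likes/subscriber
1000,0.10
2000,0.20
3000,0.30
4000,0.40
"""


def load_columns(text):
    """Parse CSV text into a dict mapping column name to a list of floats."""
    rows = list(csv.DictReader(io.StringIO(text)))
    return {name: [float(row[name]) for row in rows] for name in rows[0]}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


columns = load_columns(SAMPLE_CSV)
views = columns["channelViewCount"]

summary = {
    "mean": statistics.fmean(views),
    "median": statistics.median(views),
    "range": max(views) - min(views),
}
r = pearson(views, columns["likes/subscriber"])
print(summary, r)
```

The sample rows are perfectly linear, so the correlation comes out at essentially 1. Real data would of course be noisier.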

Research Ideas

Predicting the Relative Popularity of Videos: This dataset can be used to build a statistical model that can predict the relative popularity of videos based on various factors such as total views, channel viewers, likes/dislikes ratio, and comments/views ratio. This model could then be used to make recommendations and predict which videos are likely to become popular or go viral.

Creating Topic Trees: The dataset can also be used to create topic trees or taxonomies by analyzing the content of videos and looking at what topics they cover. For example, one could analyze the most popular YouTube channels in a specific subject area, group together those that discuss similar topics, and then build an organized tree structure around those topics in order to better understand viewer interests in that area.

Viewer Engagement Analysis: This dataset could also be used for viewer engagement analysis, by analyzing factors such as subscriber count, average time spent watching a video per user (elapsed time), and comments made per view, so as to gain insights into how engaged viewers are with specific content or channels on YouTube. From this information it would be possible to optimize content strategy in order to improve overall engagement rates across various types of video content and channel types.
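Engagement ratios of the kind the dataset provides can be recomputed from raw counts, which is useful when comparing new videos against it. The sketch below is an assumption about how such ratios might be derived: the input field names and the sample record are hypothetical, and a zero denominator is reported as `None` instead of raising an error.

```python
def safe_ratio(numerator, denominator):
    """Divide, returning None when the denominator is zero."""
    return numerator / denominator if denominator else None


def engagement_ratios(video):
    """Compute a few ratio features from raw counts for one video."""
    return {
        "likes/subscriber": safe_ratio(video["likes"], video["subscribers"]),
        "comments/views": safe_ratio(video["comments"], video["views"]),
        "likes/dislikes": safe_ratio(video["likes"], video["dislikes"]),
    }


# Invented record with hypothetical field names.
sample_video = {
    "likes": 50,
    "dislikes": 0,
    "comments": 10,
    "views": 1000,
    "subscribers": 200,
}

ratios = engagement_ratios(sample_video)
print(ratios)
```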

Acknowledgements

If you use this dataset in your research, please credit the original authors.

Data Source

License

Unknown License - Please check the dataset description for more information.

Columns

File: YouTubeDataset_withChannelElapsed.csv

| Column name | Description |
|:----------------------------------|:-------------------------------------------------------|
| totalviews/channelelapsedtime | Ratio of total views to channel elapsed time. (Ratio) |
| channelViewCount | Total number of views for the channel. (Integer) |
| likes/subscriber | ... |

