https://creativecommons.org/publicdomain/zero/1.0/
This dataset contains data on the most-watched YouTube videos up to April 2024. It includes columns such as views, artist, and channel. The data is ranked by number of views.
On June 17, 2016, Korean education brand Pinkfong released its video "Baby Shark Dance", and the rest is history. In January 2021, "Baby Shark Dance" became the first YouTube video to surpass 10 billion views, after taking the title of most-viewed YouTube video of all time from the former record holder "Despacito" one year earlier. "Baby Shark Dance" currently has over 15 billion lifetime views on YouTube.
Music videos on YouTube
“Baby Shark Dance” might be the current record holder in terms of total views, but Korean artist Psy’s “Gangnam Style” video held the top spot the longest (1,689 days, or 4.6 years) before ceding it to its successor. With figures like these, it comes as little surprise that the majority of the most popular videos on YouTube are music videos. Since 2010, all but one of the most-viewed videos on YouTube have been music videos, signifying the platform’s shift in focus from funny, viral videos to professionally produced content. As of 2022, about 40 percent of the U.S. digital music audience uses YouTube Music.
Popular video content on YouTube
Music fans are also highly engaged audiences, and it is not uncommon for music videos to garner significant amounts of traffic within the first 24 hours of release. Other popular types of videos that generate many views right after release are movie trailers, especially for superhero movies related to the MCU (Marvel Cinematic Universe). The first official trailer for the film “Avengers: Endgame” generated 289 million views within the first 24 hours of release, while the trailer for “Spider-Man: No Way Home” generated over 355 million views on its first day, making it the most viral movie trailer.
https://creativecommons.org/publicdomain/zero/1.0/
YouTube maintains a list of the top trending videos on the platform. According to Variety magazine, “To determine the year’s top-trending videos, YouTube uses a combination of factors including measuring users interactions (number of views, shares, comments and likes). Note that they’re not the most-viewed videos overall for the calendar year”.
Note that this dataset is a structurally improved version of this dataset.
This dataset includes several months (and counting) of data on daily trending YouTube videos. Data is included for the IN, US, GB, DE, CA, FR, RU, BR, MX, KR, and JP regions (India, USA, Great Britain, Germany, Canada, France, Russia, Brazil, Mexico, South Korea, and Japan, respectively), with up to 200 listed trending videos per day.
Each region’s data is in a separate file. Data includes the video title, channel title, publish time, tags, views, likes and dislikes, description, and comment count.
The data also includes a category_id field, which varies between regions. To retrieve the categories for a specific video, find it in the associated JSON. One such file is included for each of the 11 regions in the dataset.
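The category lookup described above could be sketched as follows. This is a minimal illustration, assuming each region's JSON file follows the layout of a YouTube Data API category listing (an `items` array whose entries carry an `id` and a `snippet.title`); the inline sample and the `load_category_names` helper are hypothetical and not part of the dataset.

```python
import json

def load_category_names(json_text):
    """Return {category_id (int): category title} from a category JSON document.

    Assumes the layout {"items": [{"id": "...", "snippet": {"title": "..."}}]}.
    """
    doc = json.loads(json_text)
    return {int(item["id"]): item["snippet"]["title"] for item in doc.get("items", [])}

# Tiny inline sample standing in for one region's category file (hypothetical content).
sample = (
    '{"items": ['
    '{"id": "10", "snippet": {"title": "Music"}}, '
    '{"id": "24", "snippet": {"title": "Entertainment"}}'
    ']}'
)

categories = load_category_names(sample)

# Look up the category of one (hypothetical) trending-video row.
video = {"title": "example video", "category_id": 10}
print(categories.get(video["category_id"], "unknown"))
```

Because the category IDs vary between regions, a separate mapping would be built from each region's own JSON file rather than shared across regions.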
For more information on specific columns in the dataset refer to the column metadata.
This dataset was collected using the YouTube API. This dataset is the updated version of Trending YouTube Video Statistics.
Possible uses for this dataset could include:
- Sentiment analysis in a variety of forms.
- Categorizing YouTube videos based on their comments and statistics.
- Training ML algorithms like RNNs to generate their own YouTube comments.
- Analyzing what factors affect how popular a YouTube video will be.
- Statistical analysis over time.
For further inspiration, see the kernels on this dataset!
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Dataset Name: 2023 YouTube Most Viewed Top600
Description: This dataset, titled "2023YouTubeMostViewed_Top600", comprises a curated selection of the top 600 YouTube videos based on view count, specifically from the year 2023. Each entry in the dataset represents a unique video, encompassing several key metrics:
It's important to note that while these videos are among the most viewed as of the data retrieval date, the landscape of YouTube is dynamic. View counts are continually changing, and what constitutes the 'most viewed' can fluctuate. Thus, the dataset should be seen as a snapshot of popularity and viewer engagement during a specific period in 2023, rather than an absolute ranking. This dataset is invaluable for analysis of trending content, viewer preferences, and video engagement metrics on YouTube for the year 2023.
Note: Ethically mined data from YouTube
By VISHWANATH SESHAGIRI [source]
This dataset contains YouTube video and channel metadata for analyzing the statistical relation between videos and forming a topic tree. With 9 direct features and 13 indirect features, it supports a detailed study of how videos are related, including information such as total views per unit time, channel views, likes/subscribers ratio, comments/views ratio, and dislikes/subscribers ratio. The data offers an opportunity to gain insights on topics such as subscriber count trends over time or the impact of trends on subscriber engagement. It can be used to develop models that show how different types of content drive viewership and to identify the most popular styles or topics within YouTube's catalogue. The data also offers a look into consumer behaviour: by analyzing ratios such as likes per subscriber and dislikes per view, one can explore what drives people to watch specific videos at certain times or to prefer certain channels over others. Finally, this dataset is completely open source with an easy-to-understand GitHub repo, making it a useful resource for anyone who wants to understand how their audience interacts with their content and how they might improve it in the future.
How to Use This Dataset
In general, it is important to understand each parameter in the dataset before proceeding with analysis. The parameters included are totalviews/channelelapsedtime, channelViewCount, likes/subscriber, views/subscribers, subscriberCounts, dislikes/views, comments/subscriber, channelCommentCounts, likes/dislikes, comments/views, dislikes/subscribers, totviews/totsubs, and views/elapsedtime.
To use this dataset for your own analysis:
1) Review each parameter’s meaning and purpose in our dataset.
2) Get familiar with basic descriptive statistics such as mean, median, mode, and range.
3) Create visualizations or tables based on subsets of our data.
4) Understand correlations between different sets of variables or parameters.
5) Generate meaningful conclusions about specific channels or topics based on organized graph hierarchies or tables.
6) Analyze trends over time for individual parameters as well as an aggregate reaction from all users when videos are released.
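As a rough illustration of how ratio parameters such as likes/subscriber or comments/views relate to raw counts, and of the descriptive-statistics step, here is a minimal sketch. The raw counts, the `safe_ratio` helper, and the raw-count key names are all hypothetical; the dataset itself ships the ratios precomputed.

```python
from statistics import mean, median

def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None

# Hypothetical raw counts for three videos; not taken from the dataset.
videos = [
    {"views": 1000, "likes": 50, "dislikes": 5, "comments": 10, "subscribers": 200},
    {"views": 4000, "likes": 100, "dislikes": 0, "comments": 20, "subscribers": 400},
    {"views": 500, "likes": 10, "dislikes": 2, "comments": 5, "subscribers": 0},
]

# Derive ratio features named after the dataset's parameters.
for v in videos:
    v["likes/subscriber"] = safe_ratio(v["likes"], v["subscribers"])
    v["comments/views"] = safe_ratio(v["comments"], v["views"])
    v["likes/dislikes"] = safe_ratio(v["likes"], v["dislikes"])

# Basic descriptive statistics over one ratio, skipping undefined values.
comments_per_view = [v["comments/views"] for v in videos if v["comments/views"] is not None]
summary = {"mean": mean(comments_per_view), "median": median(comments_per_view)}
print(summary)
```

Returning `None` for a zero denominator keeps undefined ratios out of the summary statistics instead of silently treating them as zero.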
Predicting the Relative Popularity of Videos: This dataset can be used to build a statistical model that can predict the relative popularity of videos based on various factors such as total views, channel viewers, likes/dislikes ratio, and comments/views ratio. This model could then be used to make recommendations and predict which videos are likely to become popular or go viral.
Creating Topic Trees: The dataset can also be used to create topic trees or taxonomies by analyzing the content of videos and looking at what topics they cover. For example, one could analyze the most popular YouTube channels in a specific subject area, group together those that discuss similar topics, and then build an organized tree structure around those topics in order to better understand viewer interests in that area.
Viewer Engagement Analysis: This dataset could also be used for viewer engagement analysis by examining factors such as subscriber count, average time spent watching a video per user (elapsed time), and comments made per view, to gain insights into how engaged viewers are with specific content or channels on YouTube. From this information it would be possible to adjust content strategy in order to improve overall engagement rates across various types of video content and channel types.
If you use this dataset in your research, please credit the original authors.
License
Unknown License - Please check the dataset description for more information.
File: YouTubeDataset_withChannelElapsed.csv

| Column name | Description |
|:----------------------------------|:-------------------------------------------------------|
| totalviews/channelelapsedtime | Ratio of total views to channel elapsed time. (Ratio) |
| channelViewCount | Total number of views for the channel. (Integer) |
| likes/subscriber ...
As of January 2025, the ranking of the most popular YouTube channels based on monthly views is dominated by music and children's content. Wiz Khalifa's Music channel was ranked first with six billion channel views, while Wow Kidz ranked second with over five billion video views in the last examined month. Indian music label T-Series, which ranked first continuously in 2021 and 2022, placed third with 2.72 billion video views.
Most subscribed YouTube channels
Among the most-subscribed YouTube channels, Indian music label T-Series was ranked first with 229 million channel subscribers. The video game commentator Felix Kjellberg, aka PewDiePie, ranked seventh with roughly 111 million subscribers, after being surpassed by Jimmy Donaldson, aka MrBeast, in November 2022. As of June 2022, PewDiePie was still the most subscribed gaming-content channel on YouTube, followed by Salvadorian YouTuber Fernanfloo with 45 million global subscribers.
Creators' earnings
MrBeast was the highest-earning YouTuber in 2021, with 54 million U.S. dollars. Ryan Kaji from Ryan's World was among the youngest content creators in the rankings of the highest-earning YouTubers in 2021, with an estimated revenue of 27 million U.S. dollars. Ryan’s channel was set up by his parents in March 2015, when the young content creator was three years old. MrBeast was also the leading content creator on YouTube based on Influence Media Value, while PewDiePie ranked third with an evaluation of around 3.91 million U.S. dollars.
The graph shows data on most viewed ads on YouTube in 2017. The Rock's trailer for Apple's iPhone 7 featuring Siri generated 25.3 million views. Dior's ad for Miss Dior featuring Natalie Portman generated 43 million views as of the measured period.
The dataset contains the following files:
| Filename | Data Format | Description |
|:---|:---|:---|
| 01_dataset_scholarly_references_on_YouTube.json.gz | JSON Lines | An integrated dataset of scholarly references in YouTube video descriptions, covering videos posted up to the end of December 2023. This dataset combines the Altmetric dataset and the YA Domain Dataset and is the basis for identifying references to retracted articles. This dataset contains 743,529 scholarly references (386,628 unique DOIs) found in 322,521 YouTube videos uploaded by 77,974 channels. |
| 02_dataset_references_to_retracted_articles_on_YouTube.json.gz | JSON Lines | A dataset of retracted articles referenced in YouTube videos, used as the primary source for analysis in this paper. The dataset was created by cross-referencing the integrated reference dataset with the Retraction Watch database. It includes metadata such as DOI, article title, retraction reason, and severity classification (Severe, Moderate, or Minor) based on Woo and Walsh (2024), along with video- and channel-level statistics (e.g., view counts and subscriber counts) retrieved via the YouTube Data API v3 as of April 22, 2025. This dataset contains 1,002 retracted articles (360 unique DOIs) found in 956 YouTube videos uploaded by 714 channels. |
| 03_full_list_table3_sorted_by_reference_count_retracted_articles_on_YouTube.json.gz | JSON Lines | Complete list corresponding to Table 3, "Top 7 retracted articles ranked by the number of YouTube videos in which they are referenced." in the paper. |
| 04_full_list_table5_top10_most-viewed_video.json.gz | JSON Lines | Complete list corresponding to Table 5, "Top 10 most-viewed YouTube videos that reference retracted articles, sorted by video view count." in the paper. |
| 05_detailed_manual_coding_40_sampled_retracted_articles.xlsx | XLSX | This file provides detailed annotations for a manually coded sample of 40 YouTube videos referencing retracted scholarly articles. The sample includes 10 randomly selected videos from each of the four analytical groups categorized by publication timing (before/after retraction) and retraction severity (Moderate/Severe). The file includes reference stance for each video, visual/verbal mention of the article, and relevant timestamps when applicable. This dataset supplements the manual analysis results presented in Tables 6 and 7 in the paper. |
Due to concerns over potential misuse (e.g., identification or harassment of individual content creators), this dataset is not made publicly available.
Researchers who wish to use this dataset for scholarly purposes may contact the authors to request access.
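The `.json.gz` files listed above are described as gzip-compressed JSON Lines, so they could be read one record per line along the following lines. This is only a sketch: the field names `doi` and `video_id` and the demo file are hypothetical stand-ins, since the actual record schema is not given here.

```python
import gzip
import json
import os
import tempfile

def read_jsonl_gz(path):
    """Yield one parsed JSON object per non-empty line of a gzip-compressed JSON Lines file."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)

# Write a tiny stand-in file; the "doi" and "video_id" keys are hypothetical.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.json.gz")
with gzip.open(demo_path, "wt", encoding="utf-8") as handle:
    handle.write('{"doi": "10.1000/a", "video_id": "v1"}\n')
    handle.write('{"doi": "10.1000/a", "video_id": "v2"}\n')
    handle.write('{"doi": "10.1000/b", "video_id": "v3"}\n')

records = list(read_jsonl_gz(demo_path))

# Count references versus unique DOIs, mirroring the kind of totals quoted in the table.
unique_dois = {record["doi"] for record in records}
print(len(records), len(unique_dois))
```

Reading lazily with a generator avoids loading a whole file into memory, which may matter for the larger reference dataset.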
References
Funding
JSPS KAKENHI Grant Numbers JP22K18147 and JP23K11761.
YouTube is an American online video-sharing platform headquartered in San Bruno, California. The service, created in February 2005 by three former PayPal employees—Chad Hurley, Steve Chen, and Jawed Karim—was bought by Google in November 2006 for US$1.65 billion and now operates as one of the company's subsidiaries. YouTube is the second most-visited website after Google Search, according to Alexa Internet rankings.
YouTube allows users to upload, view, rate, share, add to playlists, report, comment on videos, and subscribe to other users. Available content includes video clips, TV show clips, music videos, short and documentary films, audio recordings, movie trailers, live streams, video blogging, short original videos, and educational videos.
YouTube (the world-famous video sharing website) maintains a list of the top trending videos on the platform. According to Variety magazine, “To determine the year’s top-trending videos, YouTube uses a combination of factors including measuring users interactions (number of views, shares, comments, and likes). Note that they’re not the most-viewed videos overall for the calendar year”. Top performers on the YouTube trending list are music videos (such as the famously viral “Gangnam Style”), celebrity and/or reality TV performances, and the random dude-with-a-camera viral videos that YouTube is well-known for.
This dataset is a daily record of the top trending YouTube videos.
Note that this dataset is a structurally improved version of this dataset.
This dataset was collected using the YouTube API. This Description is cited in Wikipedia.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
While focusing on "made for kids" channels is a useful starting point for analysing ad patterns on kids' videos, it is also important to consider the wider landscape of child-oriented content on the platform, much of which remains unlabelled. To build a representative dataset of such videos, we use seed search words reflecting popular child interests, some of which include "toys", "kids cartoon", and "Barbie." The results are then parsed to find popular channels with unlabelled content, with a minimum threshold of 400,000 views.
Next, we scrape ad data across all videos for further analysis, covering all major ad formats on the platform including (i) skippable and (ii) unskippable video ads, (iii) sidebar ads, (iv) in-feed ads, and (v) banner ads. We use a Selenium Webdriver script launched in a new logged-out Chrome window, with no previous history, cookies, or user data. We then scrape each ad’s unique YouTube-assigned video ID, and any embedded external link as the video plays.
Next, we use YouTube Data API to obtain additional metadata like video title, duration, and "made for kids" label for each video ad, the result of which is recorded in the dataset. The videos are played from different VPN locations to explore the varied experiences based on geographical location.
As of June 2022, more than *** hours of video were uploaded to YouTube every minute. This equates to approximately ****** hours of newly uploaded content per hour. The amount of content on YouTube has increased dramatically as consumers' appetite for online video has grown. In fact, the number of video content hours uploaded every 60 seconds grew by around ** percent between 2014 and 2020.
YouTube global users
Online video is one of the most popular digital activities worldwide, with ** percent of internet users worldwide watching more than ** hours of online videos on a weekly basis in 2023. It was estimated that in 2023 YouTube would reach approximately *** million users worldwide. In 2022, the video platform was one of the leading media and entertainment brands worldwide, with a value of more than ** billion U.S. dollars.
YouTube video content consumption
The most viewed YouTube channels of all time have racked up billions of views and millions of subscribers, and cover a wide variety of topics ranging from music to cosmetics. The YouTube channel owner with the most video views is Indian music label T-Series, which counted ****** billion lifetime views. Other popular YouTubers are gaming personalities such as PewDiePie, DanTDM and Markiplier.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The dataset was first introduced in the following paper: Siqi Wu, Marian-Andrei Rizoiu, and Lexing Xie. Beyond Views: Measuring and Predicting Engagement in Online Videos. In AAAI International Conference on Weblogs and Social Media (ICWSM), 2018.
Tweeted videos dataset
This dataset contains YouTube videos published between July 1st and August 31st, 2016. To be collected, a video needs to (a) be mentioned on Twitter during the aforementioned collection period; (b) have insight statistics available; and (c) have at least 100 views within the first 30 days after upload.
Quality videos datasets
These datasets contain videos deemed of high quality by domain experts.
- Vevo videos: Videos of verified Vevo artists, as of August 31st, 2016.
- Billboard16 videos: Videos of 2016 Billboard Hot 100 chart.
- Top news videos: Videos of top 100 most viewed News channels.
freebase_mid_type_name.csv
It maps a freebase mid to a real-world entity. See more details in this data description.
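The collection criteria for the tweeted videos dataset (publication window plus conditions a, b, and c) can be read as a simple filter. The sketch below is illustrative only: the record keys `published`, `tweeted_in_window`, and `daily_views` are hypothetical, and a missing `daily_views` list is assumed to mean that insight statistics are unavailable.

```python
from datetime import date

# Publication window stated in the dataset description.
WINDOW_START = date(2016, 7, 1)
WINDOW_END = date(2016, 8, 31)

def is_eligible(video):
    """Apply the publication window and criteria (a), (b), (c) to one video record."""
    published_in_window = WINDOW_START <= video["published"] <= WINDOW_END
    has_insights = video["daily_views"] is not None  # criterion (b), by assumption
    if not (published_in_window and video["tweeted_in_window"] and has_insights):
        return False
    # Criterion (c): at least 100 views within the first 30 days after upload.
    return sum(video["daily_views"][:30]) >= 100

# Hypothetical records for illustration.
videos = [
    {"id": "a", "published": date(2016, 7, 15), "tweeted_in_window": True, "daily_views": [60, 30, 20]},
    {"id": "b", "published": date(2016, 7, 15), "tweeted_in_window": True, "daily_views": [10, 5]},
    {"id": "c", "published": date(2016, 9, 2), "tweeted_in_window": True, "daily_views": [500]},
    {"id": "d", "published": date(2016, 8, 1), "tweeted_in_window": True, "daily_views": None},
]

eligible_ids = [v["id"] for v in videos if is_eligible(v)]
print(eligible_ids)
```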
http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html
As YouTube has become one of the most popular video-sharing platforms, being a YouTuber has developed into a new type of career in recent decades. YouTubers earn money through advertising revenue from YouTube videos, sponsorships from companies, merchandise sales, and donations from their fans. To maintain a stable income, the popularity of their videos becomes the top priority for YouTubers. Meanwhile, some of our friends are YouTubers or channel owners on other video-sharing platforms. This raised our interest in predicting the performance of a video. If creators can have a preliminary prediction and understanding of their videos’ performance, they may adjust their videos to gain the most attention from the public.
You have been provided details on videos along with some features as well. Can you accurately predict the number of likes for each video using the set of input variables?
Train Set
video_id -> Identifier for each video
title -> Name of the Video on YouTube
channel_title -> Name of the Channel on YouTube
category_id -> Category of the Video (anonymous)
publish_date -> The date video was published
tags -> Different tags for the video
views -> Number of views received by the Video
dislikes -> Number of dislikes on the Video
comment_count -> Number of comments on the Video
description -> Textual description of the Video
country_code -> Country from which the Video was published
likes -> Number of Likes on the video
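A very simple baseline for the likes-prediction task could regress log-transformed likes on log-transformed views. The sketch below is an assumption-laden illustration, not the competition's method: the training rows are made up, only the `views` and `likes` columns are used, and the `fit_line` and `predict_likes` helpers are hypothetical.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept with a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Made-up training rows using the train-set column names.
train = [
    {"views": 1000, "likes": 100},
    {"views": 5000, "likes": 450},
    {"views": 20000, "likes": 2100},
    {"views": 100000, "likes": 9000},
]

# Counts are heavy-tailed, so fit in log1p space.
log_views = [math.log1p(row["views"]) for row in train]
log_likes = [math.log1p(row["likes"]) for row in train]
slope, intercept = fit_line(log_views, log_likes)

def predict_likes(views):
    """Map a view count to a predicted like count, undoing the log1p transform."""
    return max(0.0, math.expm1(slope * math.log1p(views) + intercept))

print(round(predict_likes(10000)))
```

A fuller model would also draw on the other listed columns (category_id, tags, comment_count, and so on); this baseline only shows the shape of the problem.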
Thank You Analytics Vidhya for providing this dataset.
As of February 2025, India was the country with the largest YouTube audience by far, with approximately 491 million users engaging with the popular social video platform. The United States followed, with around 253 million YouTube viewers. Brazil came in third, with 144 million users watching content on YouTube. The United Kingdom saw around 54.8 million internet users engaging with the platform in the examined period.
What country has the highest percentage of YouTube users?
In July 2024, the United Arab Emirates was the country with the highest YouTube penetration worldwide, as around 94 percent of the country's digital population engaged with the service. In 2024, YouTube counted around 100 million paid subscribers for its YouTube Music and YouTube Premium services.
YouTube mobile markets
In 2024, YouTube was among the most popular social media platforms worldwide. In terms of revenues, the YouTube app generated approximately 28 million U.S. dollars in revenues in the United States in January 2024, as well as 19 million U.S. dollars in Japan.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Video communication has played a key role in relaying important and complex information on the COVID-19 pandemic to the general public. The aim of the present study is to compare Norwegian health authorities’ and WHO’s use of video communication during the COVID-19 pandemic to the most viewed COVID-19 videos on YouTube, in order to identify how videos created by health authorities measure up to contemporary video content, both creatively and in reaching video consumers. Through structured search on YouTube we found that Norwegian health authorities have published 26 videos, and the WHO 29 videos on the platform. Press briefings, live videos, news reports, and videos recreated/translated into other languages than English or Norwegian, were not included. A content analysis comparing the 55 videos by the health authorities to the 27 most viewed videos on COVID-19 on YouTube demonstrates poor reach of health authorities’ videos in terms of views and it elucidates a clear creative gap. While the videos created by various YouTube creators communicate using a wide range of creative presentation means (such as professional presenters, contextual backgrounds, advanced graphic animations, and humour), videos created by the health authorities are significantly more homogenous in style often using field experts or public figures, plain backgrounds or PowerPoint style animations. We suggest that further studies into various creative presentation means and their influence on reach, recall, and on different groups of the population, are carried out in the future to evaluate specific factors of this creative gap.
As of January 2021, Dudesons was ranked first on the list of most viewed Finnish YouTube channels, reaching a number of views of nearly *** million. The Dudesons are a four-man group, known for their TV shows and live performances, which are a combination of stunts and comedy. Furthermore, they also had the most subscribed to channel, with *** million subscribers on YouTube. The second most viewed channel, Bass Boost, had almost *** million fewer views. A channel of a private person with a large amount of video views was Lakko.
Why do Finns use YouTube? According to the most recent data, the majority of Finnish YouTube users were accessing the platform for entertainment purposes. Other reasons to visit YouTube were following brands and companies, as well as searching for news sources. Like most social media platforms, YouTube was more popular among women than among men in Finland. However, it was relatively equally distributed, whereas platforms like WhatsApp, Facebook and Instagram were significantly more popular among women than among men.
YouTube advertising
YouTube has turned into a powerful advertising platform. The net advertising revenue of YouTube in the United States reached **** billion U.S. dollars and was expected to exceed *** billion U.S. dollars by 2022. What is more, the global YouTube advertising revenue hit over ** million U.S. dollars in 2019.
Abstract
This study aimed to assess the quality and reliability of the most-watched YouTube videos on Otago exercises. The keywords “Otago exercise” and “Otago exercise program” were searched between December 15-30, 2023. Sixty videos were selected for each keyword, sorted by number of views. Video metrics and upload sources were documented. The modified (m) DISCERN score and the Global Quality Score (GQS) were used to evaluate the reliability and quality of the videos, respectively. Out of the 34 videos reviewed, the majority (47.1%) were shared by physiotherapists. The median mDISCERN score was 2, indicating that a significant proportion (79.4%) of the videos exhibited low reliability (p<0.05). The median GQS score was 3, with 64.7% of videos classified as intermediate or high quality. However, no statistical differences in quality were observed (p>0.05). Although no statistical difference was noted, it was evident that physiotherapists uploaded a higher percentage of reliable and high-quality videos compared to other sources. Analysis of the video metrics among the quality groups revealed significant differences only in video duration (p<0.05). Positive correlations were found between certain video metrics and mDISCERN (video duration, number of comments) and GQS (video duration) scores. YouTube videos on OTAGO exercises demonstrate insufficient reliability and quality. Collaboration with professional organizations in geriatric rehabilitation is recommended for YouTube to produce high-quality and reliable videos aligned with their evolving health content policy.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The 10 most popular hashtags in our dataset.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This article provides a systematic portrait of the YouTube presence of U.S. Senate candidates during the 2008 election cycle. The evidence does not support the theory that democratized production, editing, and distribution of video content is markedly changing the formats and producers of political content. This is apparent from the predominance of 30-second ads among both the most popular videos and the broad range of campaign videos. Although other potential forms of accountability remain unrealized, YouTube is facilitating candidates being held accountable for their own advertising. The 2008 findings are compared to 2006 findings with the same methodology.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘5-Minute Crafts: Video Clickbait Titles?’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/shivamb/5minute-crafts-video-views-dataset on 12 November 2021.
--- Dataset description provided by original source is as follows ---
5-Minute Crafts is a DIY-style YouTube channel owned by TheSoul Publishing. As of October 2021, it is the 9th most-subscribed channel on the platform. It is also one of the most viewed channels. The channel has drawn criticism for unusual and potentially dangerous life hacks and its reliance on clickbait. Irrespective of the criticism, 5-Minute Crafts videos do get a lot of views.
In this dataset, a complete record of video titles from the 5-Minute Crafts YouTube channels is collected, along with many other meta-features of the titles. It also contains total video views, duration, active since, sentiment, etc.
Use this dataset to perform the following types of analysis and modeling:
1. Relation between the words used in the titles and the total views garnered
2. Which features of a video title are most important with respect to total views
3. Is there any correlation between title meta-features, total views, duration, and sentiment?
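The third analysis question (correlation between title meta-features and total views) could be approached roughly as follows. This is a sketch under stated assumptions: the titles and view counts are invented, and the three meta-features computed by the hypothetical `title_features` helper are illustrative choices rather than the dataset's own columns.

```python
import math

def title_features(title):
    """Compute a few simple meta-features of a video title (illustrative choices)."""
    words = title.split()
    letters = [ch for ch in title if ch.isalpha()]
    upper_ratio = sum(ch.isupper() for ch in letters) / len(letters) if letters else 0.0
    return {
        "word_count": len(words),
        "has_digit": any(ch.isdigit() for ch in title),
        "upper_ratio": upper_ratio,
    }

def pearson(xs, ys):
    """Pearson correlation coefficient; returns None if either series is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None
    return cov / math.sqrt(var_x * var_y)

# Invented (title, total views) pairs for illustration only.
rows = [
    ("25 EASY Hacks", 900_000),
    ("Simple ideas for your home and garden", 300_000),
    ("10 QUICK TRICKS", 1_200_000),
    ("How to fold a shirt", 150_000),
]

upper_ratios = [title_features(title)["upper_ratio"] for title, _ in rows]
views = [view_count for _, view_count in rows]
print(pearson(upper_ratios, views))
```

A correlation on a handful of rows says nothing by itself; on the real data the same computation would be run over all titles, and any pattern would still be correlational rather than causal.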
--- Original source retains full ownership of the source dataset ---