CC0 1.0 https://creativecommons.org/publicdomain/zero/1.0/
YouTube was created in 2005, with the first video, Me at the Zoo, uploaded on 23 April 2005. Since then, 1.3 billion people have set up YouTube accounts. In 2018, people watched nearly 5 billion videos each day. People upload 300 hours of video to the site every minute.
According to 2016 research undertaken by Pexeso, music accounts for only 4.3% of YouTube’s content, yet it accounts for 11% of the views. Clearly, an awful lot of people watch a comparatively small number of music videos. It should be no surprise, therefore, that the most watched videos of all time on YouTube are predominantly music videos.
On August 13, BTS became the most-viewed artist in YouTube history, accumulating over 26.7 billion views across all their official channels. This count includes all music videos and dance practice videos.
Justin Bieber and Ed Sheeran now hold the records for second and third-highest views, with over 26 billion views each.
Currently, BTS’s most viewed videos are their music videos for “**Boy With Luv**,” “**Dynamite**,” and “**DNA**,” which all have over 1.4 billion views.
Headers of the Dataset
Total = Total views (in millions) across all official channels
Avg = Current daily average of all videos combined
100M = Number of videos with more than 100 million views
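The header definitions above can be applied when loading the data. The following is a minimal sketch, assuming the file is a CSV with an `Artist` column next to `Total`, `Avg`, and `100M`; the `Artist` column name and the sample rows are hypothetical placeholders, not values from the dataset.

```python
import csv
import io

# Hypothetical sample rows: names and numbers are placeholders, not dataset values.
SAMPLE = """Artist,Total,Avg,100M
Artist A,26700,12.5,30
Artist B,1500,0.8,2
"""


def load_rows(text):
    """Parse the CSV text and convert Total (in millions) to absolute views."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        rows.append({
            "artist": row["Artist"],
            # Total is given in millions of views, so scale it up.
            "total_views": int(float(row["Total"]) * 1_000_000),
            # Avg is kept in whatever unit the file uses.
            "daily_avg": float(row["Avg"]),
            "videos_over_100m": int(row["100M"]),
        })
    return rows


rows = load_rows(SAMPLE)
# Rank artists by total views, highest first.
ranked = sorted(rows, key=lambda r: r["total_views"], reverse=True)
for r in ranked:
    print(r["artist"], r["total_views"], r["videos_over_100m"])
```

Scaling `Total` once at load time avoids mixing millions with absolute counts in later comparisons.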
ODC Public Domain Dedication and Licence (PDDL) v1.0 http://www.opendatacommons.org/licenses/pddl/1.0/
License information was derived automatically
This dataset shows the trending and top one hundred songs on YouTube, with related information about each. YouTube is an app and website that provides content from around the world, such as movies, songs, music, drama, TikTok clips, and other material in other forms. YouTube is the world's largest app or website that gives people the things they want. Many people earn money by making videos on any topic, which can be a good source of income.
File Information:
This dataset contains information about songs uploaded to YouTube: the song name, subtitle, a description of the song or music, the view count, the tags on the song, the duration of the song, and so on.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Dataset Description
This dataset captures the YouTube Top 100 Songs of 2025, featuring comprehensive metadata for the year’s most popular tracks. It includes details such as:
🎵 Song Titles & Full Titles – official video titles as listed on YouTube
📝 Descriptions – official artist/label descriptions, promotions, and album links
👀 View Counts – ranging from thousands to over 2 billion views
⏱️ Duration – from short tracks (~2 minutes) to extended versions (~6 minutes)
🎶 Categories – primarily Music with a few People & Blogs entries
🏷️ Tags – artist names, record labels, and genre-related keywords
🖼️ Thumbnails – video preview images for each song
📅 Collected Date – 22 September 2025
This dataset is ideal for:
With 100 unique entries and no missing or mismatched values for key fields, this dataset offers a clean and reliable resource for researchers, data scientists, and music enthusiasts.
Comprehensive ranking dataset of the top 100 YouTube channels in the Music category. This dataset features 100 channels with detailed statistics including subscriber counts, total video views, video count, and global rankings. The leading channel has 199,000,000 subscribers and 212,476,018,350 total views. Each entry includes comprehensive metrics to analyze channel performance, growth trends, and competitive positioning. This dataset is regularly updated to reflect the latest YouTube channel statistics and ranking changes, providing valuable insights for content creators, marketers, and researchers analyzing YouTube ecosystem trends and channel performance benchmarks.
CC0 1.0 https://creativecommons.org/publicdomain/zero/1.0/
These are the published dates of the music videos for every song in
https://www.kaggle.com/edumucelli/spotifys-worldwide-daily-song-ranking
Most of the time, the published date of a music video is the same as that of the music itself.
It is therefore reasonable to use these dates as release dates.
No other source covers as many songs as YouTube does.
The file contains no header.
20 songs remain NaN (no related video could be found).
This data was retrieved with the YouTube API.
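Because the file has no header and some songs are NaN, a reader has to name the columns and handle missing dates explicitly. The following is a minimal sketch, assuming each row has two columns (a song identifier and a published timestamp starting with `YYYY-MM-DD`); that layout and the sample rows are assumptions, not taken from the file.

```python
import csv
import io
from datetime import date

# Hypothetical rows: the two-column layout and the values are assumptions.
SAMPLE = """song-1,2017-01-06T05:00:01Z
song-2,NaN
"""


def read_dates(text):
    """Map each song identifier to a published date, or None when missing."""
    dates = {}
    for song, raw in csv.reader(io.StringIO(text)):
        raw = raw.strip()
        if not raw or raw.lower() == "nan":
            # No related video was found for this song.
            dates[song] = None
        else:
            # Keep only the calendar date part of the timestamp.
            dates[song] = date.fromisoformat(raw[:10])
    return dates


published = read_dates(SAMPLE)
missing = [song for song, d in published.items() if d is None]
print(published, missing)
```

Keeping missing entries as `None` rather than dropping them makes it easy to count how many songs lack a release date.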
CC0 1.0 https://creativecommons.org/publicdomain/zero/1.0/
**Trending on YouTube** Trending helps viewers see what’s happening on YouTube and in the world. Trending aims to surface videos and shorts that a wide range of viewers would find interesting. Some trends are predictable, like a new song from a popular artist or a new movie trailer. Others are surprising, like a viral video.
Trending isn't personalized and displays the same list of trending videos to all viewers in the same country, which is why you may see videos in Trending that aren’t in the same language as your browser. However, in India, Trending displays a list of results for each of the 9 most common Indic languages.
SOURCE The data has been scraped from "Mendeley.com". The source of this file is https://data.mendeley.com/datasets/7pkbvjtnxm/1/files/e7763107-45e9-4613-8c81-146e6a272266 The data was converted to a CSV file for use on Kaggle: ../input/youtube-vdos/youtube trending videos dataset.csv
The data contains the following columns.
*) Position (int) - An index column giving the position of the entry
1) Channel Id (String) - ID of the YouTube channel
2) Channel Title (String) - YouTube channel title
3) Video Id (String) - ID of the video in the YouTube channel
4) Published At (String) - Date the video was published
5) Video Title (String) - Title of the video
6) Video Description (String) - Description of the video (what the video is about)
7) Video Category Id (int) - Category of the video
8) Video Category Label (String) - Type of category the video belongs to
9) Duration (String) - Duration of the video
10) Duration Sec (int) - Duration of the video in seconds
11) Dimension (String) - Dimension of the video (2D, HD)
12) Definition (String) - Definition of the video
13) Caption (bool) - Whether the video has captions (True or False)
14) Licensed Content (float)
15) View Count (int) - Number of times the video was viewed
16) Like Count (float) - Number of likes the video got
17) Dislike Count (float) - Number of dislikes the video got
18) Favorite Count (int) - Number of people who marked the video as a favourite
19) Comment Count (float) - Number of comments on the video
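The list above has both a string `Duration` column and an integer `Duration Sec` column. The YouTube Data API reports durations as ISO 8601 strings such as `PT4M13S`; assuming the `Duration` column keeps that form (an assumption, not confirmed by the description), the seconds value could be derived as in this sketch.

```python
import re

# ISO 8601 duration with optional day, hour, minute, and second parts.
_DURATION = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?$"
)


def duration_to_seconds(value):
    """Convert a string such as 'PT4M13S' to a number of seconds."""
    match = _DURATION.match(value)
    if match is None:
        raise ValueError(f"not an ISO 8601 duration: {value!r}")
    # Parts that are absent from the string count as zero.
    parts = {name: int(num) if num else 0 for name, num in match.groupdict().items()}
    return (
        parts["days"] * 86400
        + parts["hours"] * 3600
        + parts["minutes"] * 60
        + parts["seconds"]
    )


print(duration_to_seconds("PT4M13S"))
```

Comparing the result against `Duration Sec` is a quick consistency check for each row.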
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dataset Summary
This dataset consists of nearly 5 hours of video from over 40 Creative Commons-licensed videos on YouTube. The videos contain the voices of more than 100 different people. The audio files have been resampled to 16 kHz. The videos have been divided into chunks of up to 25 seconds. This dataset is intended for developing Turkish STT (Speech-to-Text) models.
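To make the 16 kHz and 25-second figures concrete, the sketch below splits a recording into fixed-length spans of at most 25 seconds. It is only an illustration: the description does not say how chunk boundaries were chosen, and `chunk_spans` is a hypothetical helper, not the dataset author's method.

```python
SAMPLE_RATE = 16_000      # samples per second after resampling
MAX_CHUNK_SECONDS = 25    # upper bound on chunk length


def chunk_spans(total_samples, sample_rate=SAMPLE_RATE, max_seconds=MAX_CHUNK_SECONDS):
    """Return (start, end) sample index pairs, each at most max_seconds long."""
    max_len = sample_rate * max_seconds  # 400,000 samples at the defaults
    return [
        (start, min(start + max_len, total_samples))
        for start in range(0, total_samples, max_len)
    ]


# A 60-second recording at 16 kHz has 960,000 samples.
spans = chunk_spans(60 * SAMPLE_RATE)
print(spans)
```

At these settings a full chunk holds 400,000 samples, so a 60-second recording yields two full chunks and one 10-second remainder.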
Dataset Preparation
The audio files and transcript data were scraped from YouTube. The scraped… See the full description on the dataset page: https://huggingface.co/datasets/Anilosan15/YouTube_Video_Transkriptleri_TR.
Comprehensive ranking dataset of the top 500 YouTube channels worldwide. This dataset features 500 channels with detailed statistics including subscriber counts, total video views, video count, and global rankings. The leading channel has 452,000,000 subscribers and 101,598,825,577 total views. Each entry includes comprehensive metrics to analyze channel performance, growth trends, and competitive positioning. This dataset is regularly updated to reflect the latest YouTube channel statistics and ranking changes, providing valuable insights for content creators, marketers, and researchers analyzing YouTube ecosystem trends and channel performance benchmarks.
Comprehensive ranking dataset of the top 100 YouTube channels from India. This dataset features 100 channels with detailed statistics including subscriber counts, total video views, video count, and global rankings. The leading channel has 307,000,000 subscribers and 320,721,721,076 total views. Each entry includes comprehensive metrics to analyze channel performance, growth trends, and competitive positioning. This dataset is regularly updated to reflect the latest YouTube channel statistics and ranking changes, providing valuable insights for content creators, marketers, and researchers analyzing YouTube ecosystem trends and channel performance benchmarks.
This dataset provides estimated YouTube RPM (Revenue Per Mille) ranges for different niches in 2025, based on ad revenue earned per 1,000 monetized views.
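Since RPM is defined here as revenue per 1,000 monetized views, an RPM range translates directly into an estimated revenue range. The sketch below shows the arithmetic; the function names and the example RPM values are hypothetical and are not taken from the dataset.

```python
def estimated_revenue(monetized_views, rpm):
    """Revenue implied by an RPM value: rpm is earned per 1,000 monetized views."""
    return monetized_views / 1000 * rpm


def revenue_range(monetized_views, rpm_low, rpm_high):
    """Apply the low and high ends of an RPM range to the same view count."""
    return (
        estimated_revenue(monetized_views, rpm_low),
        estimated_revenue(monetized_views, rpm_high),
    )


# Hypothetical niche with an RPM range of 2.0 to 4.0 and 250,000 monetized views.
low, high = revenue_range(250_000, 2.0, 4.0)
print(low, high)
```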
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
I never look at a group’s chart until after I’ve fallen for their music. But once that happens, my astrologer brain kicks in. Was there something in the stars that day? This project is my way of testing that idea, using data from 120 K-pop groups.
What’s in the dataset?
Astrological data: Sun signs, moon signs, rising signs (when available), planetary retrogrades, and moon phases at debut
Career metrics: PAKs, music show wins, physical album sales, YouTube views
Time reliability: "Reliable" (verified debut time) or "Unreliable" (date only)
For years, I’ve casually tracked K-pop debuts (read: my YouTube history is 60% comeback stages, 30% astrology videos). When I started learning data analysis, I realized I could finally ask properly: do certain planetary alignments show up more often in "successful" groups? No mysticism. Just dates, numbers, and a lot of spreadsheet tabs.
How the data was collected
Group info and career stats come from Kpopping and SoriData
Debut times were taken from YouTube when available (for newer groups)
For older groups, exact debut times are often unavailable because many didn’t debut with YouTube videos in the early years
All astrological calculations were done using Astro-Seek’s calculator with Seoul as the default location
Some interesting notes
Leo sun signs appear frequently among award-winning boy groups
Want to explore?
Compare different generations: Are 4th-gen groups more likely to have certain signs?
Check if Mercury retrograde at debut had any impact on a group’s early success
This isn’t about proving astrology works. It’s about exploring whether patterns exist between the stars and K-pop success. The data is here for you to analyze and draw your own conclusions.
P.S. If your bias’s Moon sign matches yours… welcome to the "wait, why do I feel so seen?" club.
This dataset contains thumbnails with their specific video statistics and channel statistics (as of March 2023).
The 100 most subscribed channels in each category were chosen, and 25 thumbnails were taken from each channel. Channels with a hidden view count or hidden like count had to be skipped.
The name of each thumbnail encodes its statistics in this format:
[category ID]_[seconds from channel published]_[subscribers]_[total videos]_[total views in this video]_[seconds since video published]_[views]_[likes]
Category ID specifies:
1 Film & Animation
2 Autos & Vehicles
10 Music
15 Pets & Animals
17 Sports
18 Short Movies
19 Travel & Events
20 Gaming
21 Videoblogging
22 People & Blogs
23 Comedy
24 Entertainment
25 News & Politics
26 How-to & Style
27 Education
28 Science & Technology
29 Nonprofits & Activism
30 Movies
31 Anime/Animation
32 Action/Adventure
33 Classics
34 Comedy
35 Documentary
36 Drama
37 Family
38 Foreign
39 Horror
40 Sci-Fi/Fantasy
41 Thriller
42 Shorts
43 Shows
44 Trailers
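The filename format described above can be unpacked into named fields. This is a minimal sketch, assuming the eight values are plain integers joined by underscores without the square brackets and followed by an image extension; the field names, the `.jpg` extension, and the example filename are hypothetical.

```python
from pathlib import Path
from typing import NamedTuple


class ThumbnailStats(NamedTuple):
    """Fields in the order given by the filename format."""
    category_id: int
    channel_age_seconds: int
    subscribers: int
    total_videos: int
    total_views: int
    video_age_seconds: int
    views: int
    likes: int


def parse_thumbnail_name(filename):
    """Split a thumbnail filename into its eight integer statistics."""
    fields = Path(filename).stem.split("_")
    if len(fields) != len(ThumbnailStats._fields):
        raise ValueError(f"expected 8 fields, got {len(fields)}: {filename!r}")
    return ThumbnailStats(*(int(field) for field in fields))


# Hypothetical filename; the numbers are placeholders.
stats = parse_thumbnail_name("10_400000000_5000000_1200_900000000_86400_150000_9000.jpg")
print(stats.category_id, stats.views, stats.likes)
```

A named tuple keeps the positional order of the format while letting later code refer to `stats.likes` instead of an index.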
The dataset has been prepared using the Youtube Search v3 API.
Most of the time, the published date of a music video is the same as that of the music itself.
It is therefore reasonable to use these dates as release dates.
No other source covers as many songs as YouTube does. Music streaming is ubiquitous, and Spotify currently plays an important part in it. This dataset enables us to explore how the popularity of artists and songs varies over time.
This dataset contains the daily ranking of the 200 most listened songs in 53 countries from 2017 and 2018 by Spotify users. It contains more than 2 million rows, comprising 6629 artists and 18598 songs, for a total of one hundred five billion streams.
The data spans from 1st January 2017 to 9th January 2018 and will be kept up to date in following versions. It has been collected from Spotify's regional chart data.
Inspiration
Can you predict the rank position or the number of streams a song will have in the future?
How long do songs "resist" in the top 3, 5, 10, or 20 of the ranking?
What are the signs of a song that gets into the top ranks to stay?
Do continents share the same top-ranking artists or songs?
Are people listening to the very same top-ranking songs in countries far away from each other?
How long does a top-ranking song take to get into the ranking of neighboring countries?
To start out, you can take a look at a simple Kernel I have made to read the data, filter the data for a song, plot its temporal trend per country, and then make a simple forecast of its stream count.
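One of the questions above, how long a song stays in the top 3, 5, 10, or 20, reduces to finding the longest run of consecutive days at or above a rank threshold. The sketch below assumes the daily positions of one song in one country are already in date order, with `None` for days it did not chart; `longest_streak` and the sample positions are hypothetical.

```python
def longest_streak(daily_positions, top_n):
    """Length of the longest run of consecutive days ranked within the top_n."""
    best = current = 0
    for position in daily_positions:
        if position is not None and position <= top_n:
            current += 1
            best = max(best, current)
        else:
            # Falling out of the top_n (or off the chart) ends the run.
            current = 0
    return best


# Hypothetical daily positions for one song in one country.
positions = [1, 2, 4, 3, 2, 11, 1]
for n in (3, 10):
    print(n, longest_streak(positions, n))
```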
Comprehensive ranking dataset of the top 100 YouTube channels from Sri Lanka. This dataset features 100 channels with detailed statistics including subscriber counts, total video views, video count, and global rankings. The leading channel has 10,600,000 subscribers and 4,717,732,423 total views. Each entry includes comprehensive metrics to analyze channel performance, growth trends, and competitive positioning. This dataset is regularly updated to reflect the latest YouTube channel statistics and ranking changes, providing valuable insights for content creators, marketers, and researchers analyzing YouTube ecosystem trends and channel performance benchmarks.
Instagram’s most popular post
As of April 2024, the most popular post on Instagram was a photo of Lionel Messi and his teammates after winning the 2022 FIFA World Cup with Argentina, posted by the account @leomessi. Messi's post, which racked up over 61 million likes within a day, knocked off the reigning post, which was 'Photo of an Egg'. Originally posted in January 2019, 'Photo of an Egg' surpassed the world’s most popular Instagram post at that time, which was a photo of Kylie Jenner’s daughter totaling 18 million likes.
After several cryptic posts published by the account, World Record Egg revealed itself to be a part of a mental health campaign aimed at the pressures of social media use.
Instagram’s most popular accounts
As of April 2024, the official Instagram account @instagram had the most followers of any account on the platform, with 672 million followers. Portuguese footballer Cristiano Ronaldo (@cristiano) was the most followed individual with 628 million followers, while Selena Gomez (@selenagomez) was the most followed woman on the platform with 429 million. Additionally, Inter Miami CF striker Lionel Messi (@leomessi) had a total of 502 million. Celebrities such as The Rock, Kylie Jenner, and Ariana Grande all had over 380 million followers each.
Instagram influencers
In the United States, the leading content category of Instagram influencers was lifestyle, with 15.25 percent of influencers creating lifestyle content in 2021. Music ranked in second place with 10.96 percent, followed by family with 8.24 percent. Having a large audience can be very lucrative: Instagram influencers in the United States, Canada and the United Kingdom with over 90,000 followers made around 1,221 US dollars per post.
Instagram around the globe
Instagram’s worldwide popularity continues to grow, and India is the leading country in terms of number of users, with over 362.9 million users as of January 2024. The United States had 169.65 million Instagram users and Brazil had 134.6 million users. The social media platform was also very popular in Indonesia and Turkey, with 100.9 million and 57.1 million users, respectively. As of January 2024, Instagram was the fourth most popular social network in the world, behind Facebook, YouTube and WhatsApp.
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description: The "Indian Languages Audio Dataset" is a collection of audio samples featuring a diverse set of 10 Indian languages. Each audio sample in this dataset is precisely 5 seconds in duration and is provided in MP3 format. It is important to note that this dataset is a subset of a larger collection known as the "Audio Dataset with 10 Indian Languages." The source of these audio samples is regional videos freely available on YouTube, and none of the audio samples or source videos are owned by the dataset creator.
Languages Included: 1. Bengali 2. Gujarati 3. Hindi 4. Kannada 5. Malayalam 6. Marathi 7. Punjabi 8. Tamil 9. Telugu 10. Urdu
This dataset offers a valuable resource for researchers, linguists, and machine learning enthusiasts who are interested in studying and analyzing the phonetics, accents, and linguistic characteristics of the Indian subcontinent. It is a representative sample of the linguistic diversity present in India, encompassing a wide array of languages and dialects. Researchers and developers are encouraged to explore this dataset to build applications or conduct research related to speech recognition, language identification, and other audio processing tasks.
Additionally, the dataset is not limited to these 10 languages and has the potential for expansion. Given the dynamic nature of language use in India, this dataset can serve as a foundation for future data collection efforts involving additional Indian languages and dialects.
Access to the "Indian Multilingual Audio Dataset - 10 Languages" is provided with the understanding that users will comply with applicable copyright and licensing restrictions. If users plan to extend this dataset or use it for commercial purposes, it is essential to seek proper permissions and adhere to relevant copyright and licensing regulations.
By utilizing this dataset responsibly and ethically, users can contribute to the advancement of language technology and research, ultimately benefiting language preservation, speech recognition, and cross-cultural communication.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Researchers from the Czech Republic have published a dataset for HTTPS traffic classification.
Since the data were captured mainly in a real backbone network, they omitted IP addresses and ports. The datasets consist of features calculated from bidirectional flows exported with the flow probe ipfixprobe. This exporter can export a sequence of packet lengths and times and a sequence of packet bursts and time. For more information, please visit the ipfixprobe repository (ipfixprobe).
During research, they divided HTTPS traffic into the following categories: L -- Live Video Streaming, P -- Video Player, M -- Music Player, U -- File Upload, D -- File Download, W -- Website, and other traffic.
They chose service representatives known for particular traffic types based on the Alexa Top 1M list and Moz's list of the 500 most popular websites for each category. They also used several popular websites that primarily focus on a Czech audience. The identified traffic classes and their representatives are provided below:
Live Video Stream: Twitch, Czech TV, YouTube Live
Video Player: DailyMotion, Stream.cz, Vimeo, YouTube
Music Player: AppleMusic, Spotify, SoundCloud
File Upload/Download: FileSender, OwnCloud, OneDrive, Google Drive
Website and Other Traffic: Websites from Alexa Top 1M list
This dataset contains audio, text, and data files of a Korean male voice actor (Lee Jaebeom), taken from a YouTube channel (https://www.youtube.com/channel/UC25eCV-Q1NZkexOZ0XoCajw/featured). The license therefore belongs to this channel.
There are short audio files segmented by subtitle, and a refined text file corresponding to each audio file. There are also data files (NumPy) obtained by embedding the audio and text files. The data can be used to train artificial intelligence models such as voice synthesis and voice recognition. To learn how to use this data, see https://github.com/zldzmfoq12/voice-synthesizer, which shows how to synthesize a person's voice.
The license belongs to the YouTube channel (https://www.youtube.com/channel/UC25eCV-Q1NZkexOZ0XoCajw/featured), so never use this data for commercial purposes.
How can I get neat texts (subtitles) corresponding to audio files?
VoxCeleb is an audio-visual dataset consisting of short clips of human speech, extracted from interview videos uploaded to YouTube. VoxCeleb contains speech from 7000 speakers spanning a wide range of different ethnicities, accents, professions and ages. All speaking face tracks are captured "in the wild", with background chatter, laughter, overlapping speech, pose variation and different lighting conditions. VoxCeleb consists of both audio and video. Each segment is at least 3 seconds long.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
[Image: People celebrating Students Day in Bulgarian chalga club The Face in Blagoevgrad.]
Bulgarian pop-folk (hereinafter referred to as chalga) is a dance genre, stemming from ethno-pop, with strong hints of Oriental rhythms and instrumentals. Chalga is one of many branches of Balkan folk throughout the peninsula (turbofolk in Serbia, manele in Romania, etc.). After the fall of communism in 1989 in Central and Eastern Europe, chalga rapidly found its place in everyday life.
Chalga relies on provocation, and tracks commonly contain sexually explicit lyrics. Because of this, it causes much controversy in society, and there is sparse scientific work in the field. Nevertheless, chalga is becoming an increasingly popular musical style. As such, we believe it must be undergoing development. Finding its 'evolution' constitutes the main scientific motivation behind this study.
Payner LTD is a Bulgarian record label and production studio, founded in 1990. It is currently considered the largest record label in the country, producing mainly in the Bulgarian folk and chalga genres. The company has an active presence in television, owning three channels: 'Planeta TV', 'Planeta Folk' and 'Planeta HD'.
Payner LTD is also active on the Internet, particularly on YouTube. Their main YouTube channel, 'PlanetaOfficial', publishes music content exclusively. 'PlanetaOfficial' can also be credited with holding the largest audience in Bulgaria: for the time being, it has 2.1 million subscribers and 5.0 billion total video views, dominating the national YouTube scene.
The top three YouTube accounts in Bulgaria, associated with chalga music, as of 4 Jan 2021, are:
- PlanetaOfficial (Payner LTD), 2.12m subscribers, 4962m total views,
- FEN TV, 0.76m subscribers, 699m total views,
- Diapason Records, 0.53m subscribers, 580m total views
Given these circumstances, 'PlanetaOfficial' was recognised as a pivotal source of data in the study.
Data were acquired from MILKER, software specifically designed for this purpose.
The following data in payner.csv contains Spotify information of 679 resolved tracks, out of 638 detected in PlanetaOfficial, in the period 2014-2020. Every row is a track, and contains:
- the unique Spotify ID of the song;
- pre-processed names of the first three artists in a song (if such are present), according to their order of mention;
- name of the track;
- datetime of the video upload in PlanetaOfficial;
- various Spotify audio features.
There are no missing values in this data. However, MILKER is not flawless. It includes tracks not associated with PlanetaOfficial, for example works by Bach and Beethoven. In this context, data purity is defined as the fraction of songs with corresponding video uploads by PlanetaOfficial.
After random sampling (n=100), it may be inferred that the purity of this dataset lies in the interval (0.91, 0.99) at 95% confidence.
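The description does not state how many of the 100 sampled tracks were pure or which interval method was used. As an illustration only, the sketch below computes a normal-approximation (Wald) interval for a proportion; the count of 95 pure tracks is an assumption chosen because it gives bounds that round to the reported values.

```python
import math


def wald_interval(successes, n, z=1.959964):
    """Normal-approximation confidence interval for a proportion (z for 95%)."""
    p = successes / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    # Clamp to [0, 1] since a proportion cannot leave that range.
    return max(0.0, p - half_width), min(1.0, p + half_width)


# Assumed count: 95 of the 100 sampled tracks had a PlanetaOfficial upload.
low, high = wald_interval(95, 100)
print(round(low, 2), round(high, 2))
```

For proportions this close to 1, a Wilson or exact interval is often preferred over the Wald interval, so the match here should not be read as confirming the method used.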