100+ datasets found
  1. YouTube Trending Videos Dataset

    • kaggle.com
    zip
    Updated Dec 19, 2023
    Cite
    The Devastator (2023). YouTube Trending Videos Dataset [Dataset]. https://www.kaggle.com/datasets/thedevastator/youtube-trending-videos-dataset
    Available download formats: zip (29769637 bytes)
    Dataset updated
    Dec 19, 2023
    Authors
    The Devastator
    Area covered
    YouTube
    Description

    YouTube Trending Videos Dataset

    Exploring YouTube Trending Videos

    By dskl [source]

    About this dataset

    Moreover, it reveals engagement metrics such as the number of views each video has received and the likes and dislikes it has garnered. Comment counts enable analysis of viewer interaction and response. The dataset also records whether comments or ratings are disabled for a particular video, allowing examination of how these settings affect engagement.

    By exploring this dataset in depth, marketers can identify trends in content popularity across different countries while accounting for timing, based on the published day of the week. It also opens up avenues for analyzing public sentiment towards specific videos based on like-to-dislike ratios and comment counts, which can inform marketing strategies.

    Overall, this dataset serves researchers, data analysts, and marketers who want a deeper understanding of trending video patterns, the metrics that influence content virality, and the factors that shape viewer sentiment, and who want to explore new possibilities in digital marketing that leverage YouTube's wide reach.

    How to use the dataset

    How to Use This Dataset: A Guide

    In this guide, we will walk you through the different columns in the dataset and provide insights on how you can explore the popularity and engagement of these trending videos. Let's dive in!

    Column Descriptions:

    • title: The title of the video.
    • channel_title: The title of the YouTube channel that published the video.
    • publish_date: The date when the video was published on YouTube.
    • time_frame: The duration of time (e.g., 1 day, 6 hours) that the video has been trending on YouTube.
    • published_day_of_week: The day of week (e.g., Monday) when the video was published.
    • publish_country: The country where the video was published.
    • tags: The tags or keywords associated with the video.
    • views: The number of views received by the video.
    • likes: The number of likes received by the video.
    • dislike: The number of dislikes received by the video.
    • comment_count: The number of comments on the video.

    Popular Video Insights:

    To gain insights into popular videos based on this dataset, you can focus your analysis using these columns:

    title, channel_title, publish_date, time_frame, and publish_country.

    By analyzing these attributes together with engagement metrics such as views, likes, dislikes, and comment_count, you can identify trends in what type of content is most popular, both globally and within specific countries.

    For instance:

    • You could analyze which channels are consistently publishing trending videos.
    • Explore whether certain types of titles or tags are more likely to attract views and engagement.
    • Determine if certain days of the week or time frames have a higher likelihood of trending videos being published.

    Engagement Insights:

    To explore user engagement with the trending videos, you can focus your analysis on these columns:

    likes, dislikes, comment_count

    By analyzing these attributes you can get insights into how users are interacting with the content. For example:

    • You could compare the like and dislike ratios to identify positively received videos versus those that are more controversial.
    • Analyze comment counts to understand how users are engaging with the content and whether comments being disabled affects overall
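
    The like/dislike comparison suggested above could be sketched roughly as follows. This is a minimal illustration on made-up rows, not the dataset's own code: the column names likes, dislike, and comment_count are taken from the column list in this description (which spells the dislikes column both as "dislike" and "dislikes"), so the real CSV header may differ.

```python
import pandas as pd

# Toy rows standing in for the real file; the values are invented and the
# column names are assumptions based on the column list in the description.
videos = pd.DataFrame({
    "title": ["A", "B", "C"],
    "likes": [900, 50, 0],
    "dislike": [100, 50, 0],
    "comment_count": [40, 300, 0],
})


def like_ratio(df: pd.DataFrame) -> pd.Series:
    """Share of ratings that are likes; NaN when a video has no ratings."""
    total = df["likes"] + df["dislike"]
    # where() turns zero totals into NaN so the division never hits 0/0.
    return df["likes"] / total.where(total > 0)


videos["like_ratio"] = like_ratio(videos)
# Videos with a ratio near 0.5 are the more controversial ones.
print(videos.sort_values("like_ratio", ascending=False))
```

    Leaving unrated videos as NaN rather than 0 keeps them from being mistaken for videos that were rated badly.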

    Research Ideas

    • Analyzing the popularity and engagement of trending videos: By analyzing the number of views, likes, dislikes, and comments, we can understand which types of videos are popular among YouTube users. We can also examine factors such as comment count and ratings disabled to see how viewers engage with trending videos.
    • Understanding video trends across different countries: By examining the publish country column, we can compare the popularity of trending videos in different countries. This can help content creators or marketers understand regional preferences and tailor their content strategy accordingly.
    • Studying the impact of video attributes on engagement: By exploring the relationship between video attributes (such as title, tags, publish day) and engagement metrics (views, likes), we can identify patterns or trends that influence a video's success on YouTube. This information can be...
  2. YouTube Dataset of all Data Science Channels🎓🧾

    • kaggle.com
    zip
    Updated Jun 21, 2024
    Cite
    Abhishek0032 (2024). YouTube Dataset of all Data Science Channels🎓🧾 [Dataset]. https://www.kaggle.com/datasets/abhishek0032/youtube-dataset-all-data-scienceanalyst-channels
    Available download formats: zip (732289 bytes)
    Dataset updated
    Jun 21, 2024
    Authors
    Abhishek0032
    Area covered
    YouTube
    Description

    Description: This dataset contains detailed information about videos from various YouTube channels that specialize in data science and analytics. It includes metrics such as views, likes, comments, and publication dates. The dataset consists of 22862 rows, providing a robust sample for analyzing trends in content engagement, popularity of topics over time, and comparison of channels' performance.

    Column Descriptors:

    • Channel_Name: The name of the YouTube channel.
    • Title: The title of the video.
    • Published_date: The date when the video was published.
    • Views: The number of views the video has received.
    • Like_count: The number of likes the video has received.
    • Comment_Count: The number of comments on the video.

    This dataset contains information from the following YouTube channels:

    ['sentdex', 'freeCodeCamp.org', 'CampusX', 'Darshil Parmar', 'Keith Galli', 'Alex The Analyst', 'Socratica', 'Krish Naik', 'StatQuest with Josh Starmer', 'Nicholas Renotte', 'Leila Gharani', 'Rob Mulla', 'Ryan Nolan Data', 'techTFQ', 'Dataquest', 'WsCube Tech', 'Chandoo', 'Luke Barousse', 'Andrej Karpathy', 'Thu Vu data analytics', 'Guy in a Cube', 'Tableau Tim', 'codebasics', 'DeepLearningAI', 'Rishabh Mishra', 'ExcelIsFun', 'Kevin Stratvert', 'Ken Jee', 'Kaggle', 'Tina Huang']

    This dataset can be used for various analyses, including but not limited to:

    Identifying the most popular videos and channels in the data science field.

    Understanding viewer engagement trends over time.

    Comparing the performance of different types of content across multiple channels.

    Performing a comparison between different channels to find the best-performing ones.

    Identifying the best videos to watch for specific topics in data science and analytics.

    Conducting a detailed analysis of your favorite YouTube channel to understand its content strategy and performance.

    Note: The data is current as of the date of extraction and may not reflect real-time changes on YouTube. For any analysis, be sure to consider the date when the data was last updated to maintain accuracy and relevance.
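
    A channel comparison of the kind listed above might look like the sketch below. The rows and numbers are invented for illustration; only the column names (Channel_Name, Views, Like_count) come from the column descriptors, and the summary column names are hypothetical.

```python
import pandas as pd

# Invented sample rows; only the column names follow the description.
videos = pd.DataFrame({
    "Channel_Name": ["sentdex", "sentdex", "Ken Jee", "Ken Jee", "Ken Jee"],
    "Views": [1000, 3000, 500, 700, 900],
    "Like_count": [100, 150, 50, 35, 90],
})

# One row per channel: video count, median views, and total likes.
summary = (
    videos.groupby("Channel_Name")
    .agg(
        n_videos=("Views", "size"),
        median_views=("Views", "median"),
        total_likes=("Like_count", "sum"),
    )
    .sort_values("median_views", ascending=False)
)
print(summary)
```

    The median is used here instead of the mean so that a single viral video does not dominate a channel's score; that is a design choice, not something the dataset prescribes.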

  3. Data from: YouTube Videos Datasets

    • brightdata.com
    .json, .csv, .xlsx
    Updated Dec 20, 2024
    Cite
    Bright Data (2024). YouTube Videos Datasets [Dataset]. https://brightdata.com/products/datasets/youtube/videos
    Available download formats: .json, .csv, .xlsx
    Dataset updated
    Dec 20, 2024
    Dataset authored and provided by
    Bright Data (https://brightdata.com/)
    License

    https://brightdata.com/license

    Area covered
    YouTube, Worldwide
    Description

    Use our YouTube Videos dataset to extract detailed information from public videos and filter by video title, views, upload date, or likes. Data points include video URL, title, description, thumbnail, upload date, view count, like count, comment count, tags, and more. You can purchase the entire dataset or a customized subset, tailored to your needs. Popular use cases for this dataset include trend analysis, content performance tracking, brand monitoring, and influencer campaign optimization.

  4. Dot CSV's YouTube Channel Statistics

    • vidiq.com
    Updated Apr 15, 2025
    Cite
    vidIQ (2025). Dot CSV's YouTube Channel Statistics [Dataset]. https://vidiq.com/youtube-stats/channel/UCy5znSnfMsDwaLlROnZ7Qbg/
    Dataset updated
    Apr 15, 2025
    Dataset authored and provided by
    vidIQ
    Time period covered
    Nov 1, 2025 - Nov 30, 2025
    Area covered
    ES, YouTube
    Variables measured
    subscribers, video count, video views, engagement rate, upload frequency, estimated earnings
    Description

    Comprehensive YouTube channel statistics for Dot CSV, featuring 906,000 subscribers and 54,938,209 total views. This dataset includes detailed performance metrics such as subscriber growth, video views, engagement rates, and estimated revenue. The channel operates in the Education category and is based in ES. Track 231 videos with daily and monthly performance data, including view counts, subscriber changes, and earnings estimates. Analyze growth trends, engagement patterns, and compare performance against similar channels in the same category.

  5. Ken Jee YouTube Data

    • kaggle.com
    zip
    Updated Jan 22, 2022
    Cite
    Ken Jee (2022). Ken Jee YouTube Data [Dataset]. https://www.kaggle.com/datasets/kenjee/ken-jee-youtube-data
    Available download formats: zip (6556461 bytes)
    Dataset updated
    Jan 22, 2022
    Authors
    Ken Jee
    Area covered
    YouTube
    Description

    Context

    I've been creating videos on YouTube since November of 2017 (https://www.youtube.com/c/KenJee1) with the mission of making data science accessible to more people. One of the best ways to do this is to tell stories and work on projects. This is my attempt at my first community project. I am making my YouTube data available for everyone to help better understand the growth of my YouTube community and think about ways that it could be improved! I would love for everyone in the community to feel like they had some hand in contributing to the channel.

    Announcement Video: https://youtu.be/YPph59-rTxA

    I will be sharing my favorite projects in a few of my videos (with permission of course), and would also like to give away a few small prizes to the top featured notebooks. I hope you have fun with the analysis, I'm interested in seeing what you find in the data!

    For those looking for a place to start, some things I'm thinking about are:

    • What are the themes of the comment data?
    • What types of video titles and thumbnails drive the most traffic?
    • Who is my core audience and what are they interested in?
    • What types of videos have led to the most growth?
    • What type of content are people engaging with the most or watching the longest?

    Some advanced projects could be:

    • Creating a chatbot to respond to common comments with videos where I have addressed a topic
    • Pulling sentiment from thumbnails and titles and comparing that with performance

    Data I would like to add over time:

    • Video descriptions
    • Video subtitles
    • Actual video data

    Content

    There are four files in this repo. The relevant data included in most of them is from Nov 2017 - Jan 2022. I gathered some of this data via the YouTube API and the rest from my specific analytics.

    1) Aggregated Metrics By Video - This has all the topline metrics from my channel from its start (around 2015 to Jan 22 2022). I didn't post my first video until around
    2) Aggregated Metrics By Video with Country and Subscriber Status - This has the same data as aggregated metrics by video, but it includes dimensions for which country people are viewing from and if the viewers are subscribed to the channel or not.
    3) Video Performance Over Time - This has the daily data from each of my videos.
    4) All Comments - This is all of my comment data gathered from the YouTube API. I have anonymized the users so don't worry about your name showing up!

    Acknowledgements

    This obviously wouldn't be possible without all of the wonderful people who watch and interact with my videos! I'm incredibly grateful for you all and I'm so happy I can share this project with you!

    License

    I collected this data from the YouTube API and through my own Google analytics. Thus, use of it must uphold the YouTube API's terms of service: https://developers.google.com/youtube/terms/api-services-terms-of-service

  6. Trending videos on Youtube

    • kaggle.com
    zip
    Updated Sep 20, 2022
    Cite
    anusha bellam (2022). Trending videos on Youtube [Dataset]. https://www.kaggle.com/datasets/anushabellam/trending-videos-on-youtube
    Available download formats: zip (29720 bytes)
    Dataset updated
    Sep 20, 2022
    Authors
    anusha bellam
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    YouTube
    Description

    Trending on YouTube

    Trending helps viewers see what’s happening on YouTube and in the world. Trending aims to surface videos and shorts that a wide range of viewers would find interesting. Some trends are predictable, like a new song from a popular artist or a new movie trailer. Others are surprising, like a viral video.

    Trending isn't personalized and displays the same list of trending videos to all viewers in the same country, which is why you may see videos in Trending that aren’t in the same language as your browser. However, in India, Trending displays a list of results for each of the 9 most common Indic languages.

    SOURCE The data has been scraped from "Mendeley.com". The source of this file is https://data.mendeley.com/datasets/7pkbvjtnxm/1/files/e7763107-45e9-4613-8c81-146e6a272266 The data was converted to a CSV file for use in Kaggle: ../input/youtube-vdos/youtube trending videos dataset.csv

    The data contains the following columns:

    • Position (int) - An index column which gives the position of the channel in youtube channel
    • Channel Id (String) - ID of the YouTube channel
    • Channel Title (String) - YouTube channel title
    • Video Id (String) - ID of the video in the YouTube channel
    • Published At (String) - Date the video was published
    • Video Title (String) - Title of the video
    • Video Description (String) - Description of the video (what the video is about)
    • Video Category Id (int) - Category of the video in the YouTube channel
    • Video Category Label (String) - Type of category the video belongs to
    • Duration (String) - Duration of the video
    • Duration Sec (int) - Duration of the video in seconds
    • Dimension (String) - Dimension of the video (2D, Hd)
    • Definition (String) - Defining the video
    • Caption (bool) - Boolean type caption (True or False)
    • Licensed Content (float)
    • View Count (int) - Number of people who viewed the video
    • Like Count (float) - Number of likes the channel got
    • Dislike Count (float) - Number of dislikes the channel got
    • Favorite Count (int) - Number of people who marked the video as a favourite
    • Comment Count (float) - Number of people who commented on the video

  7. Youtube dataset1.csv

    • figshare.com
    txt
    Updated Apr 6, 2024
    Cite
    Caleb M. Gibson (2024). Youtube dataset1.csv [Dataset]. http://doi.org/10.6084/m9.figshare.25546468.v1
    Available download formats: txt
    Dataset updated
    Apr 6, 2024
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Caleb M. Gibson
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    YouTube
    Description

    This is a simple sample dataset for individuals to work with while learning data visualization.

  8. Dataset of Video Comments of a Vision Video Classified by Their Relevance,...

    • data.niaid.nih.gov
    Updated Jul 19, 2024
    Cite
    Karras, Oliver; Kristo, Eklekta (2024). Dataset of Video Comments of a Vision Video Classified by Their Relevance, Polarity, Intention, and Topic [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4533301
    Dataset updated
    Jul 19, 2024
    Dataset provided by
    Leibniz University Hannover
    TIB - Leibniz Information Centre for Science and Technology
    Authors
    Karras, Oliver; Kristo, Eklekta
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains all comments (comments and replies) of the YouTube vision video "Tunnels" by "The Boring Company" fetched on 2020-10-13 using the YouTube API. The comments were classified manually by three persons. We performed a single-class labeling of the video comments regarding their relevance for requirements engineering (RE) (ham/spam) and their polarity (positive/neutral/negative). Furthermore, we performed a multi-class labeling of the comments regarding their intention (feature request and problem report) and their topic (efficiency and safety). While a comment can only be relevant or not relevant and have only one polarity, a comment can have one or more intentions and also one or more topics.

    For the replies, one person also classified them regarding their relevance for RE. However, the investigation of the replies is ongoing and future work.

    Remark: For 126 comments and 26 replies, we could not determine the date and time since they were no longer accessible on YouTube at the time this data set was created. In the case of a missing date and time, we inserted "NULL" in the corresponding cell.
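
    One way to handle the "NULL" placeholders when loading the data is sketched below. The three rows are invented and read from an in-memory CSV so the example runs on its own; the real data ships as Dataset.xlsx, where pandas.read_excel accepts the same na_values argument. The Date column name follows the field list in this description.

```python
import io

import pandas as pd

# Hypothetical three-row extract; IDs, dates, and like counts are invented.
raw = io.StringIO(
    "ID,Date,Likes\n"
    "c1,2020-10-01 12:30:00,5\n"
    "c2,NULL,0\n"
    "c3,2020-10-12 08:00:00,2\n"
)

# Treat the literal string "NULL" as a missing value while reading.
comments = pd.read_csv(raw, na_values=["NULL"])

# Missing entries become NaT, so date arithmetic and filtering still work.
comments["Date"] = pd.to_datetime(comments["Date"])

missing = int(comments["Date"].isna().sum())
print(f"{missing} row(s) without a timestamp")
```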

    This data set includes the following files:

    Dataset.xlsx contains the raw and labeled video comments and replies:

    For each comment, the data set contains:

    ID: An identification number generated by YouTube for the comment

    Date: The date and time of the creation of the comment

    Author: The username of the author of the comment

    Likes: The number of likes of the comment

    Replies: The number of replies to the comment

    Comment: The written comment

    Relevance: Label indicating the relevance of the comment for RE (ham = relevant, spam = irrelevant)

    Polarity: Label indicating the polarity of the comment

    Feature request: Label indicating that the comment requests a feature

    Problem report: Label indicating that the comment reports a problem

    Efficiency: Label indicating that the comment deals with the topic efficiency

    Safety: Label indicating that the comment deals with the topic safety

    For each reply, the data set contains:

    ID: The identification number of the comment to which the reply belongs

    Date: The date and time of the creation of the reply

    Author: The username of the author of the reply

    Likes: The number of likes of the reply

    Comment: The written reply

    Relevance: Label indicating the relevance of the reply for RE (ham = relevant, spam = irrelevant)

    Detailed analysis results.xlsx contains the detailed results of all ten times repeated 10-fold cross validation analyses for each of all considered combinations of machine learning algorithms and features

    Guide Sheet - Multi-class labeling.pdf describes the coding task, defines the categories, and lists examples to reduce inconsistencies and increase the quality of manual multi-class labeling

    Guide Sheet - Single-class labeling.pdf describes the coding task, defines the categories, and lists examples to reduce inconsistencies and increase the quality of manual single-class labeling

    Python scripts for analysis.zip contains the scripts (as jupyter notebooks) and prepared data (as csv-files) for the analyses

  9. YouTube Channel Performance Analytics

    • kaggle.com
    zip
    Updated Oct 25, 2024
    Cite
    L3WY (2024). YouTube Channel Performance Analytics [Dataset]. https://www.kaggle.com/datasets/positivealexey/youtube-channel-performance-analytics
    Available download formats: zip (41446 bytes)
    Dataset updated
    Oct 25, 2024
    Authors
    L3WY
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Area covered
    YouTube
    Description

    This dataset provides an in-depth look at YouTube video analytics, capturing key metrics related to video performance, audience engagement, revenue generation, and viewer behavior. Sourced from real video data, it highlights how variables like video duration, upload time, and ad impressions contribute to monetization and audience retention. This dataset is ideal for data analysts, content creators, and marketers aiming to uncover trends in viewer engagement, optimize content strategies, and maximize ad revenue. Inspired by the evolving landscape of digital content, it serves as a resource for understanding the impact of YouTube metrics on channel growth and content reach.

    Video Details: Columns like Video Duration, Video Publish Time, Days Since Publish, Day of Week.

    Revenue Metrics: Includes Revenue per 1000 Views (USD), Estimated Revenue (USD), Ad Impressions, and various ad revenue sources (e.g., AdSense, DoubleClick).

    Engagement Metrics: Metrics such as Views, Likes, Dislikes, Shares, Comments, Average View Duration, Average View Percentage (%), and Video Thumbnail CTR (%).

    Audience Data: Data on New Subscribers, Unsubscribes, Unique Viewers, Returning Viewers, and New Viewers.

    Monetization & Transaction Metrics: Details on Monetized Playbacks, Playback-Based CPM, YouTube Premium Revenue, and transactions like Orders and Total Sales Volume (USD).

  10. Most Viewed YouTube Music Videos

    • kaggle.com
    zip
    Updated Mar 6, 2025
    Cite
    Aryan Singh (2025). Most Viewed YouTube Music Videos [Dataset]. https://www.kaggle.com/datasets/asmonline/most-viewed-youtube-music-videos
    Available download formats: zip (80258 bytes)
    Dataset updated
    Mar 6, 2025
    Authors
    Aryan Singh
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Area covered
    YouTube
    Description

    This dataset provides a comprehensive list of the top 2500 most viewed music videos on YouTube, including video names and view counts. It serves as a valuable resource for analyzing music popularity trends, practicing data analysis techniques, or exploring the landscape of online music consumption.

    Dataset Structure

    File Format: CSV

    Columns:

    • Video: String, the name of the music video
    • Views: Integer, the number of views the video has received

    Rows: 2500

    The dataset is clean and ready to use, with no missing values or duplicates.

    Potential Uses

    • Analyze trends in music popularity
    • Identify top artists or genres by parsing video names
    • Practice data visualization and exploratory data analysis
    • Use as a foundation for more detailed studies by enriching with additional data

    Note on Video Names

    Video names are in the format commonly used on YouTube, often including the artist and song title, e.g., "Artist - Song Title (Official Video)". Users may need to parse these names to extract specific information.
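
    Parsing names of the shape quoted above could be sketched as follows. The regular expression and the helper name parse_video_name are illustrative assumptions, and real video titles will often deviate from this pattern, so unmatched names are returned as None rather than guessed at.

```python
import re

# Pattern for the common "Artist - Song Title (Official Video)" shape.
# The trailing parenthesised note is optional.
NAME_PATTERN = re.compile(
    r"^(?P<artist>.+?)\s+-\s+(?P<song>.+?)(?:\s*\((?P<note>[^)]*)\))?$"
)


def parse_video_name(name: str):
    """Split a video name into artist, song, and an optional note.

    Returns None when the name has no " - " separator.
    """
    match = NAME_PATTERN.match(name.strip())
    if match is None:
        return None
    return {
        "artist": match["artist"],
        "song": match["song"],
        "note": match["note"],  # None when there is no trailing (...)
    }


print(parse_video_name("Artist - Song Title (Official Video)"))
```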

  11. Data from: Italian YouTube Hate Speech Corpus

    • live.european-language-grid.eu
    binary format
    Updated Sep 30, 2021
    Cite
    (2021). Italian YouTube Hate Speech Corpus [Dataset]. https://live.european-language-grid.eu/catalogue/corpus/20158
    Available download formats: binary format
    Dataset updated
    Sep 30, 2021
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Area covered
    YouTube
    Description

    We present an Italian YouTube dataset manually annotated for hate speech types and targets. The comments to be annotated were sampled from the Italian YouTube comments on videos about the Covid-19 pandemic in the period from January 2020 to May 2020. Two sets were annotated: a training set with 59,870 comments (IMSyPP_IT_YouTube_comments_train.csv) and an evaluation set with 10,536 comments (IMSyPP_IT_YouTube_comments_evaluation.csv). The dataset was annotated by 8 annotators with each comment being annotated by two annotators. It was used to train a classification model for hate speech types detection that is publicly available at the following URL: https://huggingface.co/IMSyPP/hate_speech_it.

    The dataset consists of the following fields:

    • ID_Commento - YouTube ID of the comment
    • ID_Video - YouTube ID of the video under which the comment was posted
    • Testo - text of the comment
    • Tipo - type of hate speech
    • Target - the target of hate speech

    Additionally, we have included the Italian YouTube data (SR_YT_comments.csv) which was collected in the same period as the training data and was annotated using the aforementioned model. The automatically labeled data was used to analyze the relationship between hate speech and misinformation on Italian YouTube. The results of this analysis are presented in the associated paper.

    The analyzed data are represented with the following fields:

    • ID_Commento - YouTube ID of the comment
    • Label - label automatically assigned by the model
    • is_questionable - the type of channel from which the comment was collected; channels are categorized as spreading either reliable or questionable information

  12. Replication Data for: Cross-Partisan Discussions on YouTube: Conservatives...

    • dataverse.harvard.edu
    Updated Apr 27, 2021
    Cite
    Siqi Wu; Paul Resnick (2021). Replication Data for: Cross-Partisan Discussions on YouTube: Conservatives Talk to Liberals but Liberals Don't Talk to Conservatives [Dataset]. http://doi.org/10.7910/DVN/KF5JC5
    Available download formats: Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Apr 27, 2021
    Dataset provided by
    Harvard Dataverse
    Authors
    Siqi Wu; Paul Resnick
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Area covered
    YouTube
    Description

    The dataset was first introduced in the following paper: Siqi Wu and Paul Resnick. Cross-Partisan Discussions on YouTube: Conservatives Talk to Liberals but Liberals Don't Talk to Conservatives. In AAAI International Conference on Weblogs and Social Media (ICWSM), 2021.

    • us_partisan.csv: Metadata for 1,267 US partisan media on YouTube. The first row is the header. Fields include "title, url, channel_title, channel_id, leaning, type, source, channel_description"
    • video_meta.csv: Metadata for 274241 YouTube political videos from US partisan media. The first row is the header. Fields include "video_id, channel_id, media_leaning, media_type, num_view, num_comment, num_cmt_from_liberal, num_cmt_from_conservative, num_cmt_from_unknown"
    • user_comment_meta.csv.bz2: Metadata for 9,304,653 YouTube users who have commented on YouTube political videos. The first row is the header. Fields include "hashed_user_id, predicted_user_leaning, num_comment, num_cmt_on_left, num_cmt_on_right"
    • user_comment_trace.tsv.bz2: Comment trace for 9,304,653 YouTube users who have commented on YouTube political videos. The first row is the header. Fields include "hashed_user_id predicted_user_leaning comment_trace" (split by \t). "comment_trace" consists of "channel_id1,num_comment_on_this_channel1;channel_id2,num_comment_on_this_channel2;..." (split by ;)
    • trained_HAN_models.tar.bz2: Five trained HAN models for predicting user political leanings. Each model consists of a ".h5" model file and a ".tokenizer" tokenizer file.

    See this for how to use our pre-trained HAN models. See more details in this data description.
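
    The comment_trace layout described above (tab-separated fields, then ";"-separated "channel_id,count" pairs) could be parsed along these lines. The function names and the sample line are hypothetical, including the placeholder leaning value "L"; only the delimiters come from the description.

```python
def parse_comment_trace(trace: str) -> dict:
    """Turn "id1,n1;id2,n2;..." into {channel_id: comment_count}."""
    counts = {}
    for pair in trace.split(";"):
        if not pair:  # tolerate a trailing ";"
            continue
        # Split on the last comma so the count is always the final field.
        channel_id, num = pair.rsplit(",", 1)
        counts[channel_id] = int(num)
    return counts


def parse_trace_line(line: str):
    """Split one data row of user_comment_trace.tsv into its three fields."""
    hashed_user_id, leaning, trace = line.rstrip("\n").split("\t")
    return hashed_user_id, leaning, parse_comment_trace(trace)


# Invented sample row; real IDs and leaning labels will differ.
print(parse_trace_line("u1\tL\tUCa,3;UCb,12\n"))
```

    Since the file is bz2-compressed, it could be streamed line by line with the standard library's bz2.open(path, "rt") instead of being decompressed first; remember to skip the header row.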

  13. x264 performances on the Youtube UGC Dataset

    • data.niaid.nih.gov
    • data-staging.niaid.nih.gov
    Updated Aug 29, 2020
    Cite
    X (2020). x264 performances on the Youtube UGC Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3928252
    Dataset updated
    Aug 29, 2020
    Authors
    X
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    YouTube
    Description

    The res_ugc zip contains the results of a performance analysis of x264 for 201 configurations and 1397 videos of the YouTube User Generated Content (UGC) dataset (see https://media.withyoutube.com/).

    For each video (e.g. Animation_360P-3e40), there is a corresponding file (e.g. Animation_360P-3e40.csv) gathering the measurements for this video. The experimental protocol is detailed in the related paper (to link).

  14. 8 Nations' YouTube Data Videos Trends: 3200

    • kaggle.com
    zip
    Updated May 9, 2025
    Cite
    suat selvi (2025). 8 Nations' YouTube Data Videos Trends: 3200 [Dataset]. https://www.kaggle.com/datasets/suatselvi/8-nations-youtube-data-videos-trends-3200
    Available download formats: zip (484381 bytes)
    Dataset updated
    May 9, 2025
    Authors
    suat selvi
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Area covered
    YouTube
    Description

    youtube data analysis course videos 8 countries

    Overview

    This dataset includes ~3200 YouTube videos focused on data analysis from 8 countries (~400 videos per country). Featuring data from Turkey, USA, Russia, Italy, France, Germany, Japan, and Spain, each video provides 8 key features. Ideal for global data science trend analysis!

    Content

    • Countries: Turkey (TR), USA (US), Russia (RU), Italy (IT), France (FR), Germany (DE), Japan (JP), Spain (ES)
    • Features (8 columns, example):
      • title: Video title
      • views_count: Total views
      • comment_count: Total comments
      • likes_count: Total likes
      • dislike_count: Total dislikes
    • Additional Features (if added):
      • country_code: Country code (e.g., TR, US)
      • country_name: Full country name
      • like_view_ratio: Likes-to-views ratio
    • Size: ~3200 rows, 8+ columns
    • Format: CSV files
      • all_countries.csv: Combined dataset
      • Country-specific files (e.g., TR_videos.csv, US_videos.csv)
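The optional like_view_ratio column can be recomputed from likes_count and views_count, and then averaged per country. The sketch below uses only the standard library and a tiny made-up sample in place of all_countries.csv; the sample rows and the helper names `like_view_ratio` and `mean_ratio_by_country` are assumptions for illustration, not part of the dataset.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Made-up sample rows using the column names from the feature list;
# in practice the rows would come from all_countries.csv.
SAMPLE_CSV = """title,views_count,likes_count,country_code
Intro to pandas,1000,50,US
SQL basics,2000,40,US
Veri analizi,500,50,TR
"""


def like_view_ratio(row):
    """Likes divided by views; None when views are missing or zero."""
    views = float(row["views_count"] or 0)
    likes = float(row["likes_count"] or 0)
    return likes / views if views > 0 else None


def mean_ratio_by_country(rows):
    """Average like_view_ratio per country_code, skipping undefined ratios."""
    ratios = defaultdict(list)
    for row in rows:
        ratio = like_view_ratio(row)
        if ratio is not None:
            ratios[row["country_code"]].append(ratio)
    return {code: mean(values) for code, values in ratios.items()}


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
print(mean_ratio_by_country(rows))
```

Returning None for videos without views keeps them out of the averages instead of counting them as a ratio of zero.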

    Potential Use Cases

    • Compare engagement (views, likes) across Turkey, USA, Russia, and other nations.
    • Analyze trending data science topics using tags from different countries.
    • Study how publish_date impacts video popularity in each region.
    • Visualize country-specific trends with Seaborn or Matplotlib.

    Data Preparation

    • Cleaning: Missing values in likes and views filled with median/zero. NaN in tags set to "Unknown".
    • Standardization: publish_date formatted as YYYY-MM-DD.
    • Structure: Includes individual country files and a combined all_countries.csv.
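The cleaning step described above could look roughly like the following sketch. It assumes rows are dictionaries read from the CSV, that missing values appear as empty strings or None, and that the median is used where a column has any values (zero otherwise); the function name `fill_missing` is hypothetical and the author's actual procedure is not published here.

```python
from statistics import median


def fill_missing(rows, numeric_columns=("likes_count", "views_count")):
    """Return cleaned copies of the rows.

    Missing numeric values become the column median (0.0 if the column has
    no values at all); missing or empty tags become "Unknown".
    """
    medians = {}
    for column in numeric_columns:
        present = [float(r[column]) for r in rows if r.get(column) not in (None, "")]
        medians[column] = median(present) if present else 0.0

    cleaned = []
    for row in rows:
        fixed = dict(row)  # copy so the input rows are left untouched
        for column in numeric_columns:
            if fixed.get(column) in (None, ""):
                fixed[column] = medians[column]
            else:
                fixed[column] = float(fixed[column])
        if not fixed.get("tags"):
            fixed["tags"] = "Unknown"
        cleaned.append(fixed)
    return cleaned
```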

    Notes

    • Data collected from YouTube API (or specify your method).
    • Some metrics may be incomplete due to API limitations.
    • Feedback and suggestions are welcome!

    Get Started

    Load the dataset with Pandas:

    ```python
    import pandas as pd

    df = pd.read_csv('all_countries.csv')

    # Example: top 5 videos by views (column name as given in the feature list)
    print(df.sort_values('views_count', ascending=False).head())
    ```

  15. TED Talk | Youtube

    • kaggle.com
    zip
    Updated May 13, 2022
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Ashish Jangra (2022). TED Talk | Youtube [Dataset]. https://www.kaggle.com/datasets/ashishjangra27/ted-talk-youtube
    Explore at:
    zip(1177042 bytes)Available download formats
    Dataset updated
    May 13, 2022
    Authors
    Ashish Jangra
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Area covered
    YouTube
    Description

    Context

    This dataset is created for beginner students of Data Analysis who can explore the field with real-life data. Using TED talk data will help them to analyze the talks and they can also watch the talks of their favorite author with the help of the dataset as well.

    Content

    The dataset comes as two files:

    1. Raw dataset (TED.csv)
    2. Preprocessed dataset (TED_Preprocessed.csv)

    The important columns are described below.

    TED.csv contains 9 features for each talk available on TED's YouTube channel:

    • video_link - The youtube link of the video
    • thumbnail_link - The youtube thumbnail link of the video
    • duration - Duration of the video
    • title - Title of the video
    • views - Number of views on that video
    • likes - Number of likes on that video
    • comments - Number of comments on that video
    • date - Date of the video published on youtube
    • description - Video Description

    Acknowledgements

    The data has been scraped from the official TED Youtube channel and is available under the Creative Commons License.

    Inspiration

    TED is one of the best learning platforms, featuring leading people in their fields, and I have always wanted to learn from the best. I created this dataset so that learners can learn both data analytics and from the speakers of TED talks.

    This dataset can help you answer the following questions:

    • Finding the most popular TED talks
    • Finding the most popular TED talks speaker (in terms of number of talks)
    • Month-wise analysis of TED talk frequency
    • Year-wise analysis of TED talk frequency
    • Finding TED talks of your favorite author
    • Finding TED talks with the best view-to-like ratio
    • Finding TED talks based on tags (like climate)
    • Finding the most popular TED talks speaker (in terms of number of views)
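Two of the questions above, the month-wise frequency and the like ratio, can be sketched with the standard library. This is an illustration under assumptions: the `date` column is taken to be in ISO format (YYYY-MM-DD), `views` and `likes` are taken to be plain integers, and the ratio is read as likes per view; none of this is confirmed by the dataset description, and the function names are hypothetical.

```python
from collections import Counter
from datetime import datetime


def talks_per_month(rows, date_format="%Y-%m-%d"):
    """Count talks per calendar month (1-12) from the `date` column.

    The default date_format is an assumption; pass the real format if the
    file stores dates differently.
    """
    return Counter(datetime.strptime(r["date"], date_format).month for r in rows)


def best_like_ratio(rows, top=3):
    """Titles with the highest likes-per-view, ignoring talks with no views."""
    scored = [
        (int(r["likes"]) / int(r["views"]), r["title"])
        for r in rows
        if int(r["views"]) > 0
    ]
    return [title for _, title in sorted(scored, reverse=True)[:top]]
```

Rows can be read from TED.csv with `csv.DictReader`; the year-wise analysis follows the same pattern with `.year` in place of `.month`.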

    Enjoy Learning!

    Thumbnail reference image: https://tedxwinterpark.com/what-is-the-difference-between-ted-and-tedx/

  16. Data from: YouNICon: YouTube's CommuNIty of Conspiracy Videos

    • zenodo.org
    csv
    Updated Jan 16, 2023
    Cite
    Shao Yi Liaw; Fan Huang; Fabricio Benevenuto; Haewoon Kwak; Jisun An (2023). YouNICon: YouTube's CommuNIty of Conspiracy Videos [Dataset]. http://doi.org/10.5281/zenodo.7466263
    Explore at:
    csvAvailable download formats
    Dataset updated
    Jan 16, 2023
    Dataset provided by
    Zenodo http://zenodo.org/
    Authors
    Shao Yi Liaw; Fan Huang; Fabricio Benevenuto; Haewoon Kwak; Jisun An
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    YouTube
    Description

    This repository contains 6 files:

    1. all_videos.csv: contains all videos with the metadata
    2. comments_anon.csv: contains the comment id, video id, anonymised author id of the comment, and Perspective API scores of the comment text.
    3. conspiracy_label.csv: contains the video id and a label indicating whether the video contains conspiracy content.
    4. train_final.csv: training set for the model
    5. val_final.csv: validation set for the model
    6. test_final.csv: test set for the model
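Since conspiracy_label.csv and all_videos.csv share a video id, the labels can be attached to the video metadata with a simple join. The sketch below is a minimal illustration; the column names `video_id` and `label` and both function names are assumptions, because the actual headers are not listed in the description.

```python
import csv


def read_rows(path):
    """Read a CSV file with a header row into a list of dicts."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def attach_labels(videos, labels, key="video_id", label_column="label"):
    """Left-join labels onto videos by key; unlabeled videos get None.

    The default key and label_column names are hypothetical.
    """
    label_by_id = {row[key]: row[label_column] for row in labels}
    return [{**video, label_column: label_by_id.get(video[key])} for video in videos]
```

A call such as `attach_labels(read_rows("all_videos.csv"), read_rows("conspiracy_label.csv"))` would then give one record per video, with the key and column arguments adjusted to the real headers.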
  17. Data from: MeadoWatch: a long-term community-science database of wildflower...

    • data.niaid.nih.gov
    • search.dataone.org
    • +1more
    zip
    Updated Mar 22, 2022
    Cite
    Janneke HilleRisLambers; Ruben Manzanedo; Elli Theobald; Berry Brosi; Joshua Jenkins; Ava Kloss-Schmidt; Emilia Lia; Annie Schiffer; Jordana Sevigny; Anna Wilson; Yonit Yogev; Aji John (2022). MeadoWatch: a long-term community-science database of wildflower phenology in Mount Rainier National Park [Dataset]. http://doi.org/10.5061/dryad.g1jwstqs2
    Explore at:
    zipAvailable download formats
    Dataset updated
    Mar 22, 2022
    Dataset provided by
    ETH Zurich
    University of Washington
    Authors
    Janneke HilleRisLambers; Ruben Manzanedo; Elli Theobald; Berry Brosi; Joshua Jenkins; Ava Kloss-Schmidt; Emilia Lia; Annie Schiffer; Jordana Sevigny; Anna Wilson; Yonit Yogev; Aji John
    License

    https://spdx.org/licenses/CC0-1.0.html

    Area covered
    Mount Rainier
    Description

    We present a long-term and high-resolution phenological dataset from 17 wildflower species collected in Mt. Rainier National Park, as part of the MeadoWatch (MW) community science project. Since 2013, 500+ unique volunteers and scientists have gathered data on the timing of four key reproductive phenophases (budding, flowering, fruiting, and seeding) in 28 plots over two elevational gradients alongside popular park trails. Trained volunteers (87.2%) and UW scientists (12.8%) collected data 3-9 times/week during the growing season, using a standardized method. Taxonomic assessments were highly consistent between scientists and volunteers, with high accuracy and specificity across phenophases and species. Sensitivity, on the other hand, was lower than accuracy and specificity, suggesting that a few species might be challenging to reliably identify in community-science projects. To date, the MW database includes 42,000+ individual phenological observations from 17 species, between 2013 and 2019. However, MW is a living dataset that will be updated through continued contributions by volunteers, and made available for use by the wider ecological community.

    Methods

    Open-Data-MeadoWatch Project

    MeadoWatch coordinator and contact: Prof. Dr. Janneke Hille Ris Lambers, janneke.hillerislambers@usys.ethz.ch

    University of Washington - ETH Zürich

    • Prof. Dr. Elli Theobald
    • Dr. Meera Sethi
    • PhD Candidate Aji John
    • Dr. Rubén D. Manzanedo
    • Prof. Dr. Berry Brosi
    • Dr. Joshua Jenkins
    • Ms. Kloss-Schmidt
    • Dr. Emilia Lia
    • Ms. Annie Schiffer
    • Ms. Jordana Sevigny
    • Ms. Anna Wilson
    • Dr. Yonit Yogev

    Permanent website of MeadoWatch and more info: http://www.meadowatch.org/ https://www.youtube.com/channel/UCGBFTKxf8FIWswMDxBavpuQ/featured

    PUBLICATIONS LINKED TO THE MEADOWATCH DATABASE

    Manzanedo, R.D., John, A., Sethi, M.L., Theobald, E.J., Brosi, B., Jenkins, J., Kloss-Schmidt, A., Lia, E., Schiffer, A., Sevigny, J., Wilson, A., Yogev, Y., & Hille Ris Lambers, J. (Under review). MeadoWatch: a long-term community-science database of wildflower phenology in Mount Rainier National Park. Scientific Data

    Hille Ris Lambers, J., Cannistra, A. F., John, A., Lia, E., Manzanedo, R. D., Sethi, M., Sevigny, J., Theobald, E. J., Waugh, J. K. (2021) Climate change impacts on natural icons: do phenological shifts threaten the relationship between peak wildflowers and visitor satisfaction? Climate Change Ecology

    Breckheimer, I. K., Theobald, E. J., Cristea, N. C., Wilson, A. K., Lundquist, J. D., Rochefort, R. M., & HilleRisLambers, J. (2020). Crowd‐sourced data reveal social–ecological mismatches in phenology driven by climate. Frontiers in Ecology and the Environment, 18(2), 76-82.

    John, A., Ong, J., Theobald, E. J., Olden, J. D., Tan, A., & HilleRisLambers, J. (2020). Detecting Montane Flowering Phenology with CubeSat Imagery. Remote Sensing, 12(18), 2894.

    Wilson, A., Bacher, K., Breckheimer, I., Lundquist, J., Rochefort, R., Theobald, E., ... & HilleRisLambers, J. (2017). Monitoring wildflower phenology using traditional science, citizen science, and crowd sourcing. Park Sci, 33, 17-26.

    Repository content

    RAW DATA. Please check the 'Final data' folder:

    MW_PhenoDat_2013_2019_anonymized.csv -- Phenological records. Information on date, transect id, observer, and phenological state per flowering species.

    MW_SiteInfo_2013_2019_anonymized.csv -- Latitude and longitude location (WGS84), elevation, and forest type for each MW site

    MW_metadata.xlsx -- metadata information for all variables and files, as well as the equivalence between species names (4-letter code to full botanical name)

    MW_SDDall.csv -- snow disappearance data and minimum soil temperature data

    MW_Phenocurves.csv -- parameters and data to reconstruct the curves in HilleRisLambers et al. 2021 and in the data exploration tool

    MW_Volunteer_info_2013_2019_anonymized.csv -- contains information for the MW volunteers, including observer, first participation, first year of training and training type

  18. Reality TV on the web

    • data.europa.eu
    csv, plain text
    Updated Nov 4, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Institut national de l'audiovisuel (2025). Reality TV on the web [Dataset]. https://data.europa.eu/data/datasets/6005a8b69d3080fb8be30a2c
    Explore at:
    csv(7351), csv(749612), csv(264352), plain text(1947), csv(5642)Available download formats
    Dataset updated
    Nov 4, 2025
    Dataset authored and provided by
    Institut national de l'audiovisuel
    License

    https://www.etalab.gouv.fr/licence-ouverte-open-licence

    Description

    The legal deposit of the audiovisual web is the part of the legal deposit which the legislator has entrusted to the National Audiovisual Institute (Law No 2006-961 of 1 August 2006 on copyright and related rights in the information society, known as the “DADVSI Law”). It started in February 2009 with the collection of websites (audiovisual media services and online public communication services) and since 2014 includes social networks (including Twitter accounts) linked to French audiovisual.

    As part of the creation of audiovisual thematic corpuses by Ina documentalists, a study on the online presence of reality TV shows was conducted between November 2020 and January 2021.

    For each identified reality show as well as for its participants and production companies, social media accounts, Youtube channels, Wikipedia pages, websites and hashtags have been identified.

    The dataset contains the complete set of these web objects and, when available, documentary information and metrics (number of subscribers, number of subscriptions, etc.) as of January 2021. The publications and videos associated with these accounts, and the websites themselves, can only be consulted at INA, the custodian of this legal deposit.

    The dataset is divided into 4 data subsets:

    • “Depot_legal_du_web-_telealité-_emissions.csv” contains all web objects by reality show
    • “Depot_legal_du_web-_teleality-_participant.e.s.csv” contains all the web objects per participant
    • “Depot_legal_du_web-_teleality-_production.csv” contains all web objects per production company
    • “Depot_legal_du_web-_teleality-_dones_documentaires.csv” contains all documentary data and metrics by web object
    • “Depot_legal_du_web-_teleality-_dones_documentary_dictionnaire.csv” contains a dictionary of the previous file

    The re-use of personal data present in the data sets published by Ina constitutes the processing of personal data as defined by Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 known as the General Data Protection Regulation (the “GDPR”) and Law No 78-17 of 6 January 1978 on computing, files and freedoms as amended, together ‘the Data Protection Regulation’. The re-user is therefore subject to compliance with the legal framework resulting from the Data Protection Regulation in order to ensure that such re-use of personal data is lawful. In any event, Ina disclaims any liability for non-compliance by a re-user with the above-mentioned rules.

  19. Top 100 YouTube Channels

    • kaggle.com
    zip
    Updated Aug 15, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Omkar Mhamal (2025). Top 100 YouTube Channels [Dataset]. https://www.kaggle.com/datasets/omkarmhamal/top-100-youtube-channels
    Explore at:
    zip(2980 bytes)Available download formats
    Dataset updated
    Aug 15, 2025
    Authors
    Omkar Mhamal
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Area covered
    YouTube
    Description

    This dataset contains information about the 100 most-subscribed YouTube channels, including channel name, subscriber count (in millions), primary language, category, join date, and country/territory - scraped from Wikipedia.

    Source: "List of most-subscribed YouTube channels" on Wikipedia, retrieved August 2025

    Licensing: Content is under CC BY-SA 4.0. This dataset is shared under the same license as required.

    Usage Note: Subscriber counts and rankings are as of Wikipedia's update in July 2025 and may have changed since. Users are encouraged to verify current numbers as needed.

  20. Data from: R code and dataset to "Monetizing Spillover Effects in the...

    • data-staging.niaid.nih.gov
    • producciocientifica.uv.es
    • +1more
    Updated Jul 30, 2021
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Montoro-Pons, Juan D.; Caballer-Tarazona, María; Cuadrado-García, Manuel (2021). R code and dataset to "Monetizing Spillover Effects in the Creative Industries: the Impact of Live Music Performances on Youtube Searches" [Dataset]. https://data-staging.niaid.nih.gov/resources?id=zenodo_5091808
    Explore at:
    Dataset updated
    Jul 30, 2021
    Dataset provided by
    Universitat de València
    Authors
    Montoro-Pons, Juan D.; Caballer-Tarazona, María; Cuadrado-García, Manuel
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    YouTube
    Description

    Content:

    The script main_script.R includes code to run a regression discontinuity (RD) design and validation and falsification of estimated results

    The folder data contains two files:

    bands_2016_2019.csv: a dataset of performers with additional information for each one.

    festivals_2016_2019.csv: a dataset of video search activity (as retrieved from Google Trends) for performers in file bands_2016_2019.csv

    The folder source contains two additional R scripts:

    data_preparation.R: generates the long dataset used to estimate RD effects

    status_simulation.R: randomly assigns treatment status to performers and estimates RD effects. Note that this may take a long time to run. Parallel code is used: the number of cores has been set to 4.

    The folder simulation_results contains simulated data after running the script status_simulation.R.

Cite
The Devastator (2023). YouTube Trending Videos Dataset [Dataset]. https://www.kaggle.com/datasets/thedevastator/youtube-trending-videos-dataset
YouTube Trending Videos Dataset

Exploring YouTube Trending Videos

Explore at:
84 scholarly articles cite this dataset (View in Google Scholar)
zip(29769637 bytes)Available download formats
Dataset updated
Dec 19, 2023
Authors
The Devastator
Area covered
YouTube
Description

By dskl [source]

About this dataset

Moreover, it reveals engagement metrics such as the number of views, likes, and dislikes each video has received. Comment counts enable analysis of viewer interaction and response. The dataset also records whether comments or ratings are disabled for a video, which allows examining how these settings affect engagement.

By exploring this dataset in depth, marketers can identify trends in content popularity across countries while accounting for timing, such as the day of the week a video was published. It also supports analyzing public sentiment towards specific videos through like-to-dislike ratios and comment counts, which can inform marketing strategy.

Overall, this dataset is a useful resource for researchers, data analysts, and marketers who want a deeper understanding of trending video patterns, the metrics that influence content virality, and the factors that shape viewer sentiment, and who want to explore new possibilities in digital marketing through YouTube's wide reach.

How to use the dataset

How to Use This Dataset: A Guide

In this guide, we will walk you through the different columns in the dataset and provide insights on how you can explore the popularity and engagement of these trending videos. Let's dive in!

Column Descriptions:

  • title: The title of the video.
  • channel_title: The title of the YouTube channel that published the video.
  • publish_date: The date when the video was published on YouTube.
  • time_frame: The duration of time (e.g., 1 day, 6 hours) that the video has been trending on YouTube.
  • published_day_of_week: The day of week (e.g., Monday) when the video was published.
  • publish_country: The country where the video was published.
  • tags: The tags or keywords associated with the video.
  • views: The number of views the video has received.
  • likes: The number of likes the video has received.
  • dislike: The number of dislikes the video has received.
  • comment_count: The number of comments on the video.

Popular Video Insights:

To gain insights into popular videos based on this dataset, you can focus your analysis using these columns:

title, channel_title, publish_date, time_frame, and publish_country.

By analyzing these attributes together with engagement metrics such as views, likes, dislikes, and comment_count, you can identify trends in what type of content is most popular, both globally and within specific countries.

For instance:

  • You could analyze which channels are consistently publishing trending videos.
  • Explore whether certain types of titles or tags are more likely to attract views and engagement.
  • Determine if certain days of the week or time frames have a higher likelihood of trending videos being published.
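Questions such as which channels or which publishing days appear most often among trending videos reduce to counting values in one column. The following is a minimal sketch, assuming the rows have been read into dictionaries keyed by the column names listed earlier; the helper name `top_values` is hypothetical.

```python
from collections import Counter


def top_values(rows, column, n=5):
    """Most frequent non-empty values of `column`, with their counts."""
    return Counter(row[column] for row in rows if row.get(column)).most_common(n)
```

For example, `top_values(rows, "channel_title")` would list the channels with the most trending videos, and `top_values(rows, "published_day_of_week", n=7)` would rank the publishing days.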

Engagement Insights:

To explore user engagement with the trending videos, you can focus your analysis on these columns:

likes, dislikes, comment_count

By analyzing these attributes you can get insights into how users are interacting with the content. For example:

  • You could compare the like and dislike ratios to identify positively received videos versus those that are more controversial.
  • Analyze comment counts to understand how users are engaging with the content and whether comments being disabled affects overall

Research Ideas

  • Analyzing the popularity and engagement of trending videos: By analyzing the number of views, likes, dislikes, and comments, we can understand which types of videos are popular among YouTube users. We can also examine factors such as comment count and ratings disabled to see how viewers engage with trending videos.
  • Understanding video trends across different countries: By examining the publish country column, we can compare the popularity of trending videos in different countries. This can help content creators or marketers understand regional preferences and tailor their content strategy accordingly.
  • Studying the impact of video attributes on engagement: By exploring the relationship between video attributes (such as title, tags, publish day) and engagement metrics (views, likes), we can identify patterns or trends that influence a video's success on YouTube. This information can be...