87 datasets found
  1. Hours of video uploaded to YouTube every minute 2007-2022

    • statista.com
    Updated Apr 11, 2024
    Cite
    Statista (2024). Hours of video uploaded to YouTube every minute 2007-2022 [Dataset]. https://www.statista.com/statistics/259477/hours-of-video-uploaded-to-youtube-every-minute/
    Explore at:
    Dataset updated
    Apr 11, 2024
    Dataset authored and provided by
Statista (http://statista.com/)
    Time period covered
    Jun 2007 - Jun 2022
    Area covered
    Worldwide, YouTube
    Description

As of June 2022, more than 500 hours of video were uploaded to YouTube every minute. This equates to approximately 30,000 hours of newly uploaded content per hour. The amount of content on YouTube has increased dramatically as consumers' appetite for online video has grown. In fact, the number of video content hours uploaded every 60 seconds grew by around 40 percent between 2014 and 2020.

    YouTube global users

    Online video is one of the most popular digital activities worldwide, with 27 percent of internet users worldwide watching more than 17 hours of online videos on a weekly basis in 2023. It was estimated that in 2023 YouTube would reach approximately 900 million users worldwide. In 2022, the video platform was one of the leading media and entertainment brands worldwide, with a value of more than 86 billion U.S. dollars.

    YouTube video content consumption

The most viewed YouTube channels of all time have racked up billions of views and millions of subscribers, and cover a wide variety of topics ranging from music to cosmetics. The YouTube channel owner with the most video views is Indian music label T-Series, which counted 217.25 billion lifetime views. Other popular YouTubers are gaming personalities such as PewDiePie, DanTDM, and Markiplier.

  2. YouTube Trending Video Dataset (updated daily)

    • kaggle.com
    zip
    Updated Apr 15, 2024
    Cite
    Rishav Sharma (2024). YouTube Trending Video Dataset (updated daily) [Dataset]. https://www.kaggle.com/rsrishav/YouTube-trending-video-dataset
    Explore at:
zip (0 bytes); available download formats
    Dataset updated
    Apr 15, 2024
    Authors
    Rishav Sharma
    License

https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    YouTube
    Description

    This dataset is a daily record of the top trending YouTube videos and it will be updated daily.

    Context

YouTube maintains a list of the top trending videos on the platform. According to Variety magazine, “To determine the year’s top-trending videos, YouTube uses a combination of factors including measuring users’ interactions (number of views, shares, comments and likes). Note that they’re not the most-viewed videos overall for the calendar year”.

    Note that this dataset is a structurally improved version of this dataset.

    Content

This dataset includes several months (and counting) of data on daily trending YouTube videos. Data is included for the IN, US, GB, DE, CA, FR, RU, BR, MX, KR, and JP regions (India, USA, Great Britain, Germany, Canada, France, Russia, Brazil, Mexico, South Korea, and Japan, respectively), with up to 200 listed trending videos per day.

    Each region’s data is in a separate file. Data includes the video title, channel title, publish time, tags, views, likes and dislikes, description, and comment count.

    The data also includes a category_id field, which varies between regions. To retrieve the categories for a specific video, find it in the associated JSON. One such file is included for each of the 11 regions in the dataset.
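A short sketch of that category lookup, assuming the per-region JSON files follow the YouTube Data API videoCategories shape (an id plus a snippet.title per item); the inline sample below is illustrative, not taken from the dataset:

```python
import json

# Illustrative sample of the assumed category-JSON structure; in practice
# this would be read from the region's JSON file included in the dataset.
sample_json = """
{"items": [
  {"id": "10", "snippet": {"title": "Music"}},
  {"id": "24", "snippet": {"title": "Entertainment"}}
]}
"""

# Map category_id -> human-readable category name.
categories = {item["id"]: item["snippet"]["title"]
              for item in json.loads(sample_json)["items"]}

def category_name(category_id):
    """Resolve a video's category_id to its name for this region."""
    return categories.get(str(category_id), "Unknown")
```

The same mapping would be rebuilt per region, since category ids vary between the 11 region files.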

    For more information on specific columns in the dataset refer to the column metadata.

    Acknowledgements

    This dataset was collected using the YouTube API. This dataset is the updated version of Trending YouTube Video Statistics.

    Inspiration

Possible uses for this dataset could include:

    • Sentiment analysis in a variety of forms
    • Categorizing YouTube videos based on their comments and statistics
    • Training ML algorithms like RNNs to generate their own YouTube comments
    • Analyzing what factors affect how popular a YouTube video will be
    • Statistical analysis over time

    For further inspiration, see the kernels on this dataset!

  3. YouTube Dataset on Mobile Streaming for Internet Traffic Modeling, Network...

    • figshare.com
    txt
    Updated Apr 14, 2022
    Cite
    Frank Loh; Florian Wamser; Fabian Poignée; Stefan Geißler; Tobias Hoßfeld (2022). YouTube Dataset on Mobile Streaming for Internet Traffic Modeling, Network Management, and Streaming Analysis [Dataset]. http://doi.org/10.6084/m9.figshare.19096823.v2
    Explore at:
txt; available download formats
    Dataset updated
    Apr 14, 2022
    Dataset provided by
    figshare
    Authors
    Frank Loh; Florian Wamser; Fabian Poignée; Stefan Geißler; Tobias Hoßfeld
    License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    YouTube
    Description

Streaming is by far the predominant type of traffic in communication networks. With this public dataset, we provide 1,081 hours of time-synchronous video measurements at network, transport, and application layer with the native YouTube streaming client on mobile devices. The dataset includes 80 network scenarios with 171 different individual bandwidth settings, measured in 5,181 runs with limited bandwidth, 1,939 runs with emulated 3G/4G traces, and 4,022 runs with pre-defined bandwidth changes. This corresponds to 332 GB of video payload. We present the most relevant quality indicators for scientific use, i.e., initial playback delay, streaming video quality, adaptive video quality changes, video rebuffering events, and streaming phases.

  4. YouTube users worldwide 2020-2029

    • statista.com
    Updated Mar 3, 2025
    Cite
    Statista (2025). YouTube users worldwide 2020-2029 [Dataset]. https://www.statista.com/forecasts/1144088/youtube-users-in-the-world
    Explore at:
    Dataset updated
    Mar 3, 2025
    Dataset authored and provided by
Statista (http://statista.com/)
    Area covered
    Worldwide, YouTube
    Description

The global number of YouTube users was forecast to increase continuously between 2024 and 2029 by a total of 232.5 million users (+24.91 percent). After a ninth consecutive year of growth, the YouTube user base is estimated to reach 1.2 billion users, a new peak, in 2029. Notably, the number of YouTube users has increased continuously over the past years.

    User figures, shown here for the platform YouTube, have been estimated by taking into account company filings or press material, secondary research, app downloads, and traffic data. They refer to the average monthly active users over the period.

    The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic, and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations, and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information).

    Find more key insights for the number of YouTube users in regions like Africa and South America.
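As a quick cross-check of the quoted figures, an increase of 232.5 million users corresponding to +24.91 percent implies a 2024 base of roughly 933 million and a 2029 total of roughly 1.17 billion, consistent with the rounded 1.2 billion peak:

```python
# Back-of-the-envelope check of the forecast figures quoted above.
increase = 232.5              # million users added, 2024-2029
pct_growth = 0.2491           # +24.91 percent over the same period

base_2024 = increase / pct_growth        # implied 2024 user base (millions)
total_2029 = base_2024 + increase        # implied 2029 peak (millions)
```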

  5. Most Watched Youtube Videos

    • kaggle.com
    zip
    Updated Apr 19, 2024
    Cite
    Jatinthakur706 (2024). Most Watched Youtube Videos [Dataset]. https://www.kaggle.com/datasets/jatinthakur706/most-watched-youtube-videos
    Explore at:
zip (0 bytes); available download formats
    Dataset updated
    Apr 19, 2024
    Authors
    Jatinthakur706
    License

https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    YouTube
    Description

This dataset contains data on the most watched YouTube videos up to April 2024. It includes columns such as views, artist, and channel, and entries are ranked by number of views.

  6. YouTube users in India 2020-2029

    • statista.com
    Updated Mar 3, 2025
    Cite
    Statista (2025). YouTube users in India 2020-2029 [Dataset]. https://www.statista.com/forecasts/1146150/youtube-users-in-india
    Explore at:
    Dataset updated
    Mar 3, 2025
    Dataset authored and provided by
Statista (http://statista.com/)
    Area covered
    India
    Description

The number of YouTube users in India was forecast to increase continuously between 2024 and 2029 by a total of 222.2 million users (+34.88 percent). After a ninth consecutive year of growth, the YouTube user base is estimated to reach 859.26 million users, a new peak, in 2029. Notably, the number of YouTube users has increased continuously over the past years.

    User figures, shown here for the platform YouTube, have been estimated by taking into account company filings or press material, secondary research, app downloads, and traffic data. They refer to the average monthly active users over the period.

    The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic, and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations, and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information).

    Find more key insights for the number of YouTube users in countries like Sri Lanka and Nepal.

  7. Video-EEG Encoding-Decoding Dataset KU Leuven

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Jun 11, 2024
    Cite
Yuanyuan Yao; Axel Stebner; Tinne Tuytelaars; Simon Geirnaert; Alexander Bertrand (2024). Video-EEG Encoding-Decoding Dataset KU Leuven [Dataset]. http://doi.org/10.5281/zenodo.10512414
    Explore at:
zip; available download formats
    Dataset updated
    Jun 11, 2024
    Dataset provided by
Zenodo (http://zenodo.org/)
    Authors
Yuanyuan Yao; Axel Stebner; Tinne Tuytelaars; Simon Geirnaert; Alexander Bertrand
    License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Leuven
    Description

    If using this dataset, please cite the following paper and the current Zenodo repository.

    This dataset is described in detail in the following paper:

    Yao, Y., Stebner, A., Tuytelaars, T., Geirnaert, S., & Bertrand, A. (2024). Identifying temporal correlations between natural single-shot videos and EEG signals. Journal of Neural Engineering, 21(1), 016018. doi:10.1088/1741-2552/ad2333

    Introduction

    The research work leading to this dataset was conducted at the Department of Electrical Engineering (ESAT), KU Leuven.

    This dataset contains electroencephalogram (EEG) data collected from 19 young participants with normal or corrected-to-normal eyesight when they were watching a series of carefully selected YouTube videos. The videos were muted to avoid the confounds introduced by audio. For synchronization, a square box was encoded outside of the original frames and flashed every 30 seconds in the top right corner of the screen. A photosensor, detecting the light changes from this flashing box, was affixed to that region using black tape to ensure that the box did not distract participants. The EEG data was recorded using a BioSemi ActiveTwo system at a sample rate of 2048 Hz. Participants wore a 64-channel EEG cap, and 4 electrooculogram (EOG) sensors were positioned around the eyes to track eye movements.

The dataset includes a total of (19 subjects × 63 min) + (9 subjects × 24 min) ≈ 23.5 hours of data. Further details can be found in the following section.

    Content

    • YouTube Videos: Due to copyright constraints, the dataset includes links to the original YouTube videos along with precise timestamps for the segments used in the experiments.
    • Raw EEG Data: Organized by subject ID, the dataset contains EEG segments corresponding to the presented videos. Both EEGLAB .set files (containing metadata) and .fdt files (containing raw data) are provided, which can also be read by popular EEG analysis Python packages such as MNE.
      • The naming convention links each EEG segment to its corresponding video. E.g., the EEG segment 01_eeg corresponds to video 01_Dance_1, 03_eeg corresponds to video 03_Acrob_1, Mr_eeg corresponds to video Mr_Bean, etc.
      • The raw data have 68 channels. The first 64 channels are EEG data, and the last 4 channels are EOG data. The position coordinates of the standard BioSemi headcaps can be downloaded here: https://www.biosemi.com/download/Cap_coords_all.xls.
      • Due to minor synchronization ambiguities, different clocks in the PC and EEG recorder, and missing or extra video frames during video playback (rarely occurred), the length of the EEG data may not perfectly match the corresponding video data. The difference, typically within a few milliseconds, can be resolved by truncating the modality with the excess samples.
    • Signal Quality Information: A supplementary .txt file detailing potential bad channels. Users can opt to create their own criteria for identifying and handling bad channels.
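The millisecond-scale length mismatch described above can be resolved by truncating the longer modality. A minimal sketch, assuming the 2048 Hz EEG rate given above and an illustrative 30 fps video; the arrays here are synthetic, and in practice the EEG would be loaded from the .set/.fdt files (e.g. with MNE or EEGLAB):

```python
# Align EEG and video lengths by truncating the longer modality, as the
# dataset description suggests. FS matches the BioSemi rate stated above;
# the 30 fps video rate and the data below are illustrative assumptions.
FS = 2048  # EEG sample rate (Hz)

def align_lengths(eeg_samples, video_frames, fps=30.0, fs=FS):
    """Truncate whichever modality is longer so both cover the same duration."""
    eeg_dur = len(eeg_samples) / fs
    vid_dur = len(video_frames) / fps
    common = min(eeg_dur, vid_dur)
    return (eeg_samples[: int(round(common * fs))],
            video_frames[: int(round(common * fps))])

# Example: EEG runs ~10 ms longer than a 10 s video clip.
eeg = [0.0] * (10 * FS + 20)        # 20 extra samples ≈ 10 ms of excess EEG
frames = list(range(10 * 30))       # 10 s of 30 fps video
eeg_t, frames_t = align_lengths(eeg, frames)
```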

    The dataset is divided into two subsets: Single-shot and MrBean, based on the characteristics of the video stimuli.

    Single-shot Dataset

    The stimuli of this dataset consist of 13 single-shot videos (63 min in total), each depicting a single individual engaging in various activities such as dancing, mime, acrobatics, and magic shows. All the participants watched this video collection.

Video ID | Link | Start time (s) | End time (s)
    01_Dance_1 | https://youtu.be/uOUVE5rGmhM | 8.54 | 231.20
    03_Acrob_1 | https://youtu.be/DjihbYg6F2Y | 4.24 | 231.91
    04_Magic_1 | https://youtu.be/CvzMqIQLiXE | 3.68 | 348.17
    05_Dance_2 | https://youtu.be/f4DZp0OEkK4 | 5.05 | 227.99
    06_Mime_2 | https://youtu.be/u9wJUTnBdrs | 5.79 | 347.05
    07_Acrob_2 | https://youtu.be/kRqdxGPLajs | 183.61 | 519.27
    08_Magic_2 | https://youtu.be/FUv-Q6EgEFI | 3.36 | 270.62
    09_Dance_3 | https://youtu.be/LXO-jKksQkM | 5.61 | 294.17
    12_Magic_3 | https://youtu.be/S84AoWdTq3E | 1.76 | 426.36
    13_Dance_4 | https://youtu.be/0wc60tA1klw | 14.28 | 217.18
    14_Mime_3 | https://youtu.be/0Ala3ypPM3M | 21.87 | 386.84
    15_Dance_5 | https://youtu.be/mg6-SnUl0A0 | 15.14 | 233.85
    16_Mime_6 | https://youtu.be/8V7rhAJF6Gc | 31.64 | 388.61

    MrBean Dataset

    Additionally, 9 participants watched an extra 24-minute clip from the first episode of Mr. Bean, where multiple (moving) objects may exist and interact, and the camera viewpoint may change. The subject IDs and the signal quality files are inherited from the single-shot dataset.

Video ID | Link | Start time (s) | End time (s)
    Mr_Bean | https://www.youtube.com/watch?v=7Im2I6STbms | 39.77 | 1495.00

    Acknowledgement

    This research is funded by the Research Foundation - Flanders (FWO) project No G081722N, junior postdoctoral fellowship fundamental research of the FWO (for S. Geirnaert, No. 1242524N), the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation program (grant agreement No 802895), the Flemish Government (AI Research Program), and the PDM mandate from KU Leuven (for S. Geirnaert, No PDMT1/22/009).

    We also thank the participants for their time and effort in the experiments.

    Contact Information

    Executive researcher: Yuanyuan Yao, yuanyuan.yao@kuleuven.be

    Led by: Prof. Alexander Bertrand, alexander.bertrand@kuleuven.be

  8. Car crash dataset RUSSIA 2022-2023

    • kaggle.com
    Updated May 10, 2023
    + more versions
    Cite
    Sivoha (2023). Car crash dataset RUSSIA 2022-2023 [Dataset]. https://www.kaggle.com/datasets/sivoha/car-crash-dataset-russia-2022-2023
    Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    May 10, 2023
    Dataset provided by
Kaggle (http://kaggle.com/)
    Authors
    Sivoha
    License

https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    Russia
    Description

Car crash dataset RUSSIA 2022-2023 is a large driving-video dataset containing over 500 high-resolution videos of various driving scenarios. It was created to aid the development and testing of autonomous driving systems and related technologies. The videos were captured in Russia across a diverse set of locations, weather conditions, and lighting conditions, with each video lasting about 10 seconds. The videos are annotated with bounding boxes around objects such as different types of cars, pedestrians, and cyclists, as well as traffic signs and traffic lights. The dataset also includes metadata for each video.

    It is considered one of the few datasets from Russia on this topic. It was created by seven students from MIEM HSE, Moscow; the first version was published on May 9, 2023.

  9. A Labelled Dataset for Sentiment Analysis of Videos on YouTube, TikTok, and...

    • zenodo.org
    • data.niaid.nih.gov
    • +2more
    csv
    Updated Jul 20, 2024
    + more versions
    Cite
Nirmalya Thakur; Vanessa Su; Mingchen Shao; Kesha A. Patel; Hongseok Jeong; Victoria Knieling; Andrew Bian (2024). A Labelled Dataset for Sentiment Analysis of Videos on YouTube, TikTok, and other sources about the 2024 outbreak of Measles [Dataset]. http://doi.org/10.5281/zenodo.11711230
    Explore at:
csv; available download formats
    Dataset updated
    Jul 20, 2024
    Dataset provided by
Zenodo (http://zenodo.org/)
    Authors
Nirmalya Thakur; Vanessa Su; Mingchen Shao; Kesha A. Patel; Hongseok Jeong; Victoria Knieling; Andrew Bian
    License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jun 15, 2024
    Area covered
    YouTube
    Description

    Please cite the following paper when using this dataset:

    N. Thakur, V. Su, M. Shao, K. Patel, H. Jeong, V. Knieling, and A. Bian “A labelled dataset for sentiment analysis of videos on YouTube, TikTok, and other sources about the 2024 outbreak of measles,” Proceedings of the 26th International Conference on Human-Computer Interaction (HCII 2024), Washington, USA, 29 June - 4 July 2024. (Accepted as a Late Breaking Paper, Preprint Available at: https://doi.org/10.48550/arXiv.2406.07693)

    Abstract

    This dataset contains the data of 4011 videos about the ongoing outbreak of measles published on 264 websites on the internet between January 1, 2024, and May 31, 2024. These websites primarily include YouTube and TikTok, which account for 48.6% and 15.2% of the videos, respectively. The remainder of the websites include Instagram and Facebook as well as the websites of various global and local news organizations. For each of these videos, the URL of the video, title of the post, description of the post, and the date of publication of the video are presented as separate attributes in the dataset. After developing this dataset, sentiment analysis (using VADER), subjectivity analysis (using TextBlob), and fine-grain sentiment analysis (using DistilRoBERTa-base) of the video titles and video descriptions were performed. This included classifying each video title and video description into (i) one of the sentiment classes i.e. positive, negative, or neutral, (ii) one of the subjectivity classes i.e. highly opinionated, neutral opinionated, or least opinionated, and (iii) one of the fine-grain sentiment classes i.e. fear, surprise, joy, sadness, anger, disgust, or neutral. These results are presented as separate attributes in the dataset for the training and testing of machine learning algorithms for performing sentiment analysis or subjectivity analysis in this field as well as for other applications. The paper associated with this dataset (please see the above-mentioned citation) also presents a list of open research questions that may be investigated using this dataset.
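The labelling described above used VADER, TextBlob, and DistilRoBERTa-base. As a toy illustration of the lexicon-scoring idea behind the VADER step (not the authors' actual pipeline), a minimal classifier might look like this; the lexicon entries and thresholds are invented for illustration:

```python
# Toy lexicon-based sentiment scorer: sum word scores from a small lexicon
# and threshold the total into positive / negative / neutral. A stand-in
# sketch, not VADER itself; lexicon values here are made up.
LEXICON = {"outbreak": -1.0, "fear": -2.0, "safe": 1.5, "recovered": 1.0,
           "vaccine": 0.5, "deadly": -2.5}

def classify(title, pos_th=0.5, neg_th=-0.5):
    """Classify a video title into one of the three sentiment classes."""
    score = sum(LEXICON.get(w.strip(".,!?").lower(), 0.0)
                for w in title.split())
    if score >= pos_th:
        return "positive"
    if score <= neg_th:
        return "negative"
    return "neutral"
```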

  10. ITTV - A Dataset of Italian Television for Automatic Genre Classification

    • zenodo.org
    • data.niaid.nih.gov
    csv
    Updated Jun 13, 2023
    Cite
Paolo Sani; Alessandro Ilic Mezza; Augusto Sarti (2023). ITTV - A Dataset of Italian Television for Automatic Genre Classification [Dataset]. http://doi.org/10.5281/zenodo.8027327
    Explore at:
csv; available download formats
    Dataset updated
    Jun 13, 2023
    Dataset provided by
Zenodo (http://zenodo.org/)
    Authors
Paolo Sani; Alessandro Ilic Mezza; Augusto Sarti
    License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Italy
    Description

    ITTV is a publicly available dataset of Italian TV programs introduced in

    Alessandro Ilic Mezza, Paolo Sani, and Augusto Sarti, "Automatic TV Genre Classification Based on Visually-Conditioned Deep Audio Features," in 2023 31st European Signal Processing Conference (EUSIPCO), 2023.

    ITTV consists of 2625 manually annotated YouTube videos, totaling over 670 hours. Each clip is assigned one of seven classes:

    • Cartoons
    • Commercials
    • Football
    • Music
    • News
    • Talk Shows
    • Weather Forecast

    ITTV genre taxonomy is similar to that of the well-known RAI dataset described in

    Maurizio Montagnuolo and Alberto Messina, "Parallel neural networks for multimodal video genre classification,” Multimedia Tools and Applications, vol. 41, no. 1, pp. 125–159, 2009.

    The dataset contains genre annotations and metadata in CSV format. Please note that audio data is not provided.

    We provide the annotations for a balanced training (1575 clips) and validation (525 clips) split, as well as for a disjoint test set containing 525 installments from TV programs not included in the development set.
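Since the training and validation splits are balanced across the seven classes, the per-class counts follow directly; a quick check of the quoted figures:

```python
# Per-class counts implied by the balanced splits quoted above.
num_classes = 7
train_per_class = 1575 // num_classes    # training clips per class
val_per_class = 525 // num_classes       # validation clips per class
total_clips = 1575 + 525 + 525           # development + test clips
```

The total (2,625 clips) matches the dataset size stated above.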

As YouTube is continuously updated, some videos may become unavailable in the future. Although we intend to keep ITTV as up to date as possible, please note that some content may not be available at any given time.

    Some YouTube videos (especially from the Football class and, to a lesser extent, the Cartoons class) may only be available in some countries due to regional restrictions imposed by the content creator. All videos are known to be accessible from Italy (last accessed on Nov. 25th, 2022.)

    Please contact Alessandro Ilic Mezza for further questions (e-mail: alessandroilic.mezza@polimi.it).

  11. Youtube cookery channels viewers comments in Hinglish

    • zenodo.org
    csv
    Updated Jan 24, 2020
    + more versions
    Cite
Abhishek Kaushik; Gagandeep Kaur (2020). Youtube cookery channels viewers comments in Hinglish [Dataset]. http://doi.org/10.5281/zenodo.2841848
    Explore at:
csv; available download formats
    Dataset updated
    Jan 24, 2020
    Dataset provided by
Zenodo (http://zenodo.org/)
    Authors
Abhishek Kaushik; Gagandeep Kaur
    License

Open Data Commons Attribution License (ODC-By) v1.0: https://www.opendatacommons.org/licenses/by/1.0/
    License information was derived automatically

    Area covered
    YouTube
    Description

The data was collected from popular cookery YouTube channels in India, with a focus on gathering viewers' comments written in Hinglish. The datasets are taken from the top two Indian cooking channels: the Nisha Madhulika channel and Kabita’s Kitchen channel.

Comments in both datasets are divided into seven categories:

• Label 1: Gratitude
    • Label 2: About the recipe
    • Label 3: About the video
    • Label 4: Praising
    • Label 5: Hybrid
    • Label 6: Undefined
    • Label 7: Suggestions and queries

    All the labelling has been done manually.

    Nisha Madhulika dataset:

• Dataset characteristics: Multivariate
    • Number of instances: 4900
    • Area: Cooking
    • Attribute characteristics: Real
    • Number of attributes: 3
    • Date donated: March 2019
    • Associated tasks: Classification
    • Missing values: Null

    Kabita Kitchen dataset:

• Dataset characteristics: Multivariate
    • Number of instances: 4900
    • Area: Cooking
    • Attribute characteristics: Real
    • Number of attributes: 3
    • Date donated: March 2019
    • Associated tasks: Classification
    • Missing values: Null

There are two separate dataset files for each channel: a preprocessing file and a main file.

The preprocessing files were generated after performing preprocessing and exploratory data analysis on both datasets. Each preprocessing file includes:

    • Id
    • Comment text
    • Labels
    • Count of stop-words
    • Uppercase words
    • Hashtags
    • Word count
    • Char count
    • Average words
    • Numeric

    The main file includes:

    • Id
    • comment text
    • Labels
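A minimal sketch of reading the main file with the three fields listed above; the header spellings and the inline example rows are assumptions for illustration, not taken from the published CSVs:

```python
import collections
import csv
import io

# Inline stand-in for the channel's main CSV; in practice this would be
# open("main.csv"). Column names follow the field list above (assumed
# spellings); the comment rows are invented examples.
sample = """Id,Comment text,Labels
1,bahut tasty recipe mam,2
2,thank you mam,1
3,nice video,3
"""

with io.StringIO(sample) as f:
    rows = list(csv.DictReader(f))

# Distribution of comments over the seven label categories.
label_counts = collections.Counter(r["Labels"] for r in rows)
```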

    Please cite the paper

    https://www.mdpi.com/2504-2289/3/3/37

    MDPI and ACS Style

    Kaur, G.; Kaushik, A.; Sharma, S. Cooking Is Creating Emotion: A Study on Hinglish Sentiments of Youtube Cookery Channels Using Semi-Supervised Approach. Big Data Cogn. Comput. 2019, 3, 37.

  12. Replication Data for: Estimating the Ideology of YouTube Videos

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Dec 16, 2023
    Cite
    Lai, Angela; Brown, Megan A.; Bisbee, James; Tucker, Joshua A.; Nagler, Jonathan; Bonneau, Richard (2023). Replication Data for: Estimating the Ideology of YouTube Videos [Dataset]. http://doi.org/10.7910/DVN/WZZFTW
    Explore at:
    Dataset updated
    Dec 16, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Lai, Angela; Brown, Megan A.; Bisbee, James; Tucker, Joshua A.; Nagler, Jonathan; Bonneau, Richard
    Area covered
    YouTube
    Description

Abstract: We present a method for estimating the ideology of political YouTube videos. The subfield of estimating ideology as a latent variable has often focused on traditional actors such as legislators, while more recent work has used social media data to estimate the ideology of ordinary users, political elites, and media sources. We build on this work to estimate the ideology of a political YouTube video. First, we start with a matrix of political Reddit posts linking to YouTube videos and apply correspondence analysis to place those videos in an ideological space. Second, we train a language model with those estimated ideologies as training labels, enabling us to estimate the ideologies of videos not posted on Reddit. These predicted ideologies are then validated against human labels. We demonstrate the utility of this method by applying it to the watch histories of survey respondents to evaluate the prevalence of echo chambers on YouTube, in addition to the association between video ideology and viewer engagement. Our approach gives video-level scores based only on supplied text metadata, is scalable, and can be easily adjusted to account for changes in the ideological landscape.

    Keywords: ideology estimation, YouTube, latent variable

    This folder contains the replication materials for "Estimating the Ideology of Political YouTube Videos."
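A minimal sketch of the intuition behind the first step: videos inherit an ideology score from the communities that link to them. The subreddit names, seed scores, and simple averaging rule below are illustrative assumptions, not the paper's correspondence-analysis estimator:

```python
# Hypothetical seed ideology scores on a left (-1) to right (+1) axis.
subreddit_ideology = {"r/left_forum": -1.0,
                      "r/centrist_hub": 0.0,
                      "r/right_forum": 1.0}

# Hypothetical posting matrix: video -> subreddits that linked to it.
video_posts = {
    "vid_a": ["r/left_forum", "r/left_forum", "r/centrist_hub"],
    "vid_b": ["r/right_forum", "r/centrist_hub"],
}

def estimate(video):
    """Score a video as the mean ideology of the communities sharing it."""
    subs = video_posts[video]
    return sum(subreddit_ideology[s] for s in subs) / len(subs)
```

In the paper, correspondence analysis plays this role on the full Reddit-video matrix, and a language model then extends the scores to videos never posted on Reddit.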

  13. Video instructions for the data portal

    • americansamoa-data.nocache.eightyoptions.com.au
    • pacificdata.org
    • +14more
    zip
    Updated Apr 2, 2025
    + more versions
    Cite
    Secretariat of the Pacific Regional Environment Programme (2025). Video instructions for the data portal [Dataset]. https://americansamoa-data.nocache.eightyoptions.com.au/dataset/video-instructions-data-portal
    Explore at:
zip (40900538), zip, zip (35894926), zip (41752372); available download formats
    Dataset updated
    Apr 2, 2025
    Dataset provided by
    Secretariat of the Pacific Regional Environment Programme
    License

Public Domain Mark 1.0: https://creativecommons.org/publicdomain/mark/1.0/
    License information was derived automatically

    Area covered
    Pacific Region
    Description

    These instructional videos walk users through the portal and its different features.

  14. Data from: Youtube in brazilian academic libraries: who, how and for what is...

    • scielo.figshare.com
    jpeg
    Updated May 30, 2023
    Cite
    Enrique Muriel-Torrado; Marcio Gonçalves (2023). Youtube in brazilian academic libraries: who, how and for what is used [Dataset]. http://doi.org/10.6084/m9.figshare.5931073.v1
    Explore at:
jpeg; available download formats
    Dataset updated
    May 30, 2023
    Dataset provided by
    SciELO journals
    Authors
    Enrique Muriel-Torrado; Marcio Gonçalves
    License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    YouTube
    Description

ABSTRACT This research analyzes the use of YouTube as a platform for the activities of library and information science professionals in Brazilian academic libraries. It relates the audiovisual practices of university libraries to outreach activities and highlights the importance of the librarian as a content producer in the digital environment. The survey results serve as reference material for information scientists and managers of information units interested in sharing audiovisual information as a new way of relating to their users. Finally, based on the results, it is recommended to plan a communication strategy on social media platforms such as YouTube and to prepare relevant content to engage subscribers and users.

  15.

    first-impressions-v2

    • huggingface.co
    Cite
    Yeray, first-impressions-v2 [Dataset]. https://huggingface.co/datasets/yeray142/first-impressions-v2
    Explore at:
    Authors
    Yeray
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Dataset Card for First Impressions V2

    The First Impressions V2 dataset comprises 10,000 clips (average duration 15 seconds) extracted from more than 3,000 YouTube high-definition (HD) videos of people facing the camera and speaking in English. The videos are split into training, validation, and test sets with a 3:1:1 ratio. The people in the videos vary in gender, age, nationality, and ethnicity. Videos are labeled with personality-trait variables. Amazon Mechanical Turk (AMT) was… See the full description on the dataset page: https://huggingface.co/datasets/yeray142/first-impressions-v2.
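    For illustration only, the 3:1:1 ratio mentioned above corresponds to a 6,000 / 2,000 / 2,000 partition of the 10,000 clips; a minimal sketch (hypothetical helper, not the dataset's own tooling):

    ```python
    def split_3_1_1(items):
        """Partition a sequence into train/validation/test with a 3:1:1 ratio."""
        n = len(items)
        n_train, n_val = n * 3 // 5, n // 5
        return items[:n_train], items[n_train:n_train + n_val], items[n_train + n_val:]

    clip_ids = list(range(10_000))          # stand-in for the 10,000 clip IDs
    train, val, test = split_3_1_1(clip_ids)
    print(len(train), len(val), len(test))  # 6000 2000 2000
    ```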

  16.

    Data from: A systematic review of methods for studying consumer health...

    • datadryad.org
    • data.niaid.nih.gov
    • +1more
    zip
    Updated Sep 13, 2013
    Cite
    Margaret Sampson; Jordi Cumber; Claudia Li; Catherine M. Pound; Ann Fuller; Denise M. Harrison; Denise Harrison (2013). A systematic review of methods for studying consumer health YouTube videos, with implications for systematic reviews [Dataset]. http://doi.org/10.5061/dryad.4jh42
    Explore at:
    zip
    Available download formats
    Dataset updated
    Sep 13, 2013
    Dataset provided by
    Dryad
    Authors
    Margaret Sampson; Jordi Cumber; Claudia Li; Catherine M. Pound; Ann Fuller; Denise M. Harrison; Denise Harrison
    Time period covered
    2013
    Area covered
    YouTube
    Description

    Characteristics of published YouTube video reviews. Data extracted from published manuscripts; an Excel spreadsheet with two tabs, one for the original sample and another for newer manuscripts. The row titled PMID gives the PubMed ID number and identifies the article the data are taken from. File: DE on review methods V2.xls

  17.

    Data from: Effectiveness of Online Off-the-Job Training in Attracting...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jul 11, 2024
    Cite
    Kobayashi, Fumitaka (2024). Data from: Effectiveness of Online Off-the-Job Training in Attracting Participants and Video-On-Demand Streaming in Improving Work-Life Balance: A Study Focusing on Medical Technologists [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8260491
    Explore at:
    Dataset updated
    Jul 11, 2024
    Dataset provided by
    Ohmae, Kazuto
    Kobayashi, Fumitaka
    Lee, Sang-Tae
    Chagi, Yoshinari
    Yamaguchi, Naoko
    Ikejima, Takuya
    Abe, Noriyuki
    Tatsumi, Shigenobu
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The Nara Association of Medical Technologists introduced online off-the-job training (Off-JT) in FY2020 in response to the COVID-19 pandemic. This study evaluates the online Off-JT format against the traditional face-to-face format. First, we compared the two formats' ability to attract participants, based on the number of training sessions and attendees. Despite offering 40.8% fewer training sessions, the online format's average attendance was 105.4% higher (39.7 vs. 19.3 attendees per session). To enhance participant convenience, we offered a limited number of live and video-on-demand (VOD) sessions on YouTube and evaluated their usefulness through an online survey focusing on work-life balance (WLB). The survey showed that 81.9% (458/559) of respondents reported an improvement in WLB. The effect varied by viewing method: 84.1% (376/447) for VOD sessions versus 73.2% (82/112) for live sessions. We believe the online Off-JT's greater ability to attract participants is mainly due to the elimination of travel burdens through internet-connected devices. Combining live and VOD sessions on YouTube allowed participants to adjust their viewing time, leading to better allocation of free time and improved WLB. Online Off-JT and VOD delivery have been shown to enhance convenience for participants by removing geographical and time constraints, with positive overall effects.
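    The survey percentages quoted above follow directly from the reported counts; a quick check (illustrative only):

    ```python
    def pct(part, whole):
        """Percentage rounded to one decimal place."""
        return round(100 * part / whole, 1)

    print(pct(458, 559))  # 81.9 -> overall WLB improvement
    print(pct(376, 447))  # 84.1 -> VOD sessions
    print(pct(82, 112))   # 73.2 -> live sessions
    ```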

  18.

    Multi-language Video Subtitle Dataset

    • data.mendeley.com
    Updated Nov 29, 2021
    + more versions
    Cite
    Olarik Surinta (2021). Multi-language Video Subtitle Dataset [Dataset]. http://doi.org/10.17632/gj8d88h2g3.2
    Explore at:
    Dataset updated
    Nov 29, 2021
    Authors
    Olarik Surinta
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The video subtitle images were collected from 24 videos shared on Facebook and YouTube. The subtitle text covers Thai and English, comprising Thai characters, Roman characters, Thai numerals, Arabic numerals, and special characters, 157 characters in total.

    In the data-preprocessing step, we converted all 24 videos to images and obtained 2,700 images containing subtitle text. Each subtitle text image is 1280x720 pixels and is stored in JPG format. We then generated the ground truth for 4,224 subtitle images using the labelImg program and assigned a label to each subtitle image. Note that the number preceding the label is the order of the subtitle text image.
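    labelImg saves annotations in Pascal VOC XML by default; a minimal sketch of reading one such annotation (the file layout and label name below are assumptions for illustration, not taken from this dataset):

    ```python
    import xml.etree.ElementTree as ET

    # Assumed example of a labelImg / Pascal VOC annotation for one frame.
    sample_xml = """<annotation>
      <filename>frame_0001.jpg</filename>
      <size><width>1280</width><height>720</height><depth>3</depth></size>
      <object>
        <name>001_subtitle</name>
        <bndbox><xmin>210</xmin><ymin>600</ymin><xmax>1070</xmax><ymax>660</ymax></bndbox>
      </object>
    </annotation>"""

    def parse_voc(xml_text):
        """Return (label, (xmin, ymin, xmax, ymax)) for each annotated object."""
        root = ET.fromstring(xml_text)
        boxes = []
        for obj in root.iter("object"):
            b = obj.find("bndbox")
            boxes.append((obj.findtext("name"),
                          tuple(int(b.findtext(t)) for t in ("xmin", "ymin", "xmax", "ymax"))))
        return boxes

    print(parse_voc(sample_xml))  # [('001_subtitle', (210, 600, 1070, 660))]
    ```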

  19.

    RealVAD: A Real-world Dataset for Voice Activity Detection

    • data.niaid.nih.gov
    • explore.openaire.eu
    • +1more
    Updated Jul 3, 2020
    Cite
    Cigdem Beyan (2020). RealVAD: A Real-world Dataset for Voice Activity Detection [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3928150
    Explore at:
    Dataset updated
    Jul 3, 2020
    Dataset provided by
    Cigdem Beyan
    Vittorio Murino
    Muhammad Shahid
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    RealVAD: A Real-world Dataset for Voice Activity Detection

    The task of automatically detecting "who is speaking and when" is broadly known as Voice Activity Detection (VAD). Automatic VAD is an important task and the foundation of several domains, e.g., human-human, human-computer/robot/virtual-agent interaction analyses, and industrial applications.

    The RealVAD dataset is constructed from a YouTube video of a panel discussion lasting approximately 83 minutes. The audio is available from a single channel. A single static camera captures all panelists, the moderator, and the audience.

    Particular aspects of the RealVAD dataset are:

    It is composed of panelists with different nationalities (British, Dutch, French, German, Italian, American, Mexican, Colombian, Thai). This allows studying the effect of ethnic-origin variety on automatic VAD.

    There is a gender balance: four female and five male panelists.

    The panelists sit in two rows and may be gazing at the audience, other panelists, their laptop, the moderator, or anywhere in the room while speaking or not speaking. They were therefore captured not only from a frontal view but also from side views, varying with their instant posture and head orientation.

    The panelists move freely and perform various spontaneous actions (e.g., drinking water, checking their cell phones, using their laptops), resulting in different postures.

    The panelists' body parts are sometimes partially occluded by their own or others' body parts or belongings (e.g., a laptop).

    There are also natural changes in illumination and shadows cast on the wall behind the panelists in the back row.

    In particular, for the panelists sitting in the front row, there is sometimes background motion when the person(s) behind them move.

    The annotations include:

    Upper-body detections of the nine panelists in bounding-box form.

    Associated VAD ground truth (speaking, not-speaking) for the nine panelists.

    Acoustic features extracted from the video: MFCCs and raw filterbank energies.

    All information regarding the annotations is given in the ReadMe.txt and Acoustic Features README.txt files.
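    The speaking / not-speaking ground truth above is a per-frame binary label. As a purely illustrative sketch (assumed framing parameters and a toy energy threshold, not the dataset's own annotation or feature pipeline), frame-level labels of this shape can be produced from an audio signal like so:

    ```python
    import math

    def frame_energies(samples, frame_len=400, hop=160):
        """Short-time log energy per frame (assumed 25 ms / 10 ms at 16 kHz)."""
        energies = []
        for start in range(0, len(samples) - frame_len + 1, hop):
            frame = samples[start:start + frame_len]
            energies.append(math.log(sum(x * x for x in frame) + 1e-10))
        return energies

    def energy_vad(samples, threshold=-2.0, **kw):
        """Label each frame speaking (1) / not-speaking (0) by an energy threshold."""
        return [1 if e > threshold else 0 for e in frame_energies(samples, **kw)]

    # Toy signal: a loud first half followed by a near-silent second half.
    signal = [0.5] * 1600 + [0.001] * 1600
    labels = energy_vad(signal)  # 10 speaking frames, then 8 not-speaking frames
    ```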

    When using this dataset for your research, please cite the following paper in your publication:

    C. Beyan, M. Shahid and V. Murino, "RealVAD: A Real-world Dataset and A Method for Voice Activity Detection by Body Motion Analysis", in IEEE Transactions on Multimedia, 2020.

  20.

    Extended YouTube Faces (E-YTF) Dataset

    • paperswithcode.com
    • opendatalab.com
    Updated Aug 1, 2018
    Cite
    Claudio Ferrari; Stefano Berretti; Alberto del Bimbo (2018). Extended YouTube Faces (E-YTF) Dataset [Dataset]. https://paperswithcode.com/dataset/extended-youtube-faces-e-ytf
    Explore at:
    Dataset updated
    Aug 1, 2018
    Authors
    Claudio Ferrari; Stefano Berretti; Alberto del Bimbo
    Area covered
    YouTube
    Description

    The proposed Extended YouTube Faces (E-YTF) dataset is an extension of the well-known YouTube Faces (YTF) dataset and is specifically designed to further push the challenges of face recognition by addressing the problem of open-set face identification from heterogeneous data, i.e., still images vs. video.
