87 datasets found

Hours of video uploaded to YouTube every minute 2007-2022
statista.com
Updated Apr 11, 2024
Cite
Statista (2024). Hours of video uploaded to YouTube every minute 2007-2022 [Dataset]. https://www.statista.com/statistics/259477/hours-of-video-uploaded-to-youtube-every-minute/
Dataset updated
Apr 11, 2024
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
Jun 2007 - Jun 2022
Area covered
Worldwide, YouTube
Description
As of June 2022, more than 500 hours of video were uploaded to YouTube every minute. This equates to approximately 30,000 hours of newly uploaded content per hour. The amount of content on YouTube has increased dramatically as consumers' appetite for online video has grown. In fact, the number of video content hours uploaded every 60 seconds grew by around 40 percent between 2014 and 2020.
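
The per-hour figure follows directly from the per-minute rate. A minimal sketch of the arithmetic is given here; the per-day extrapolation is added purely as an illustration and is not a figure stated by the source.

```python
# Upload rate reported for June 2022 (a lower bound: "more than 500 hours").
HOURS_UPLOADED_PER_MINUTE = 500

# 60 minutes in an hour gives the per-hour figure quoted in the description.
hours_per_hour = HOURS_UPLOADED_PER_MINUTE * 60

# Illustrative extrapolation only; not stated in the source.
hours_per_day = hours_per_hour * 24

print(f"{hours_per_hour:,} hours of video uploaded per hour")
print(f"{hours_per_day:,} hours of video uploaded per day")
```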

YouTube global users

Online video is one of the most popular digital activities worldwide, with 27 percent of internet users worldwide watching more than 17 hours of online videos on a weekly basis in 2023. It was estimated that in 2023 YouTube would reach approximately 900 million users worldwide. In 2022, the video platform was one of the leading media and entertainment brands worldwide, with a value of more than 86 billion U.S. dollars.

YouTube video content consumption

The most viewed YouTube channels of all time have racked up billions of viewers, millions of subscribers and cover a wide variety of topics ranging from music to cosmetics. The YouTube channel owner with the most video views is Indian music label T-Series, which counted 217.25 billion lifetime views. Other popular YouTubers are gaming personalities such as PewDiePie, DanTDM and Markiplier.
YouTube Trending Video Dataset (updated daily)
kaggle.com
zip
Updated Apr 15, 2024
Cite
Rishav Sharma (2024). YouTube Trending Video Dataset (updated daily) [Dataset]. https://www.kaggle.com/rsrishav/YouTube-trending-video-dataset
Available download formats: zip
Dataset updated
Apr 15, 2024
Authors
Rishav Sharma
License
https://creativecommons.org/publicdomain/zero/1.0/
Area covered
YouTube
Description
This dataset is a daily record of the top trending YouTube videos and it will be updated daily.

Context

YouTube maintains a list of the top trending videos on the platform. According to Variety magazine, “To determine the year’s top-trending videos, YouTube uses a combination of factors including measuring users interactions (number of views, shares, comments and likes). Note that they’re not the most-viewed videos overall for the calendar year”.

Note that this dataset is a structurally improved version of this dataset.

Content

This dataset includes several months (and counting) of data on daily trending YouTube videos. Data is included for the IN, US, GB, DE, CA, FR, RU, BR, MX, KR, and JP regions (India, USA, Great Britain, Germany, Canada, France, Russia, Brazil, Mexico, South Korea, and Japan, respectively), with up to 200 listed trending videos per day.

Each region’s data is in a separate file. Data includes the video title, channel title, publish time, tags, views, likes and dislikes, description, and comment count.

The data also includes a category_id field, which varies between regions. To retrieve the categories for a specific video, find it in the associated JSON. One such file is included for each of the 11 regions in the dataset.
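
The category lookup described above can be sketched as follows. This is a minimal example that assumes each region's JSON file has a top-level `items` list whose entries carry an `id` and a `snippet.title`; the sample JSON, the video rows, and the helper name `build_category_lookup` are all made up for illustration and should be checked against the actual files.

```python
import json

# Assumed layout: a top-level "items" list, each entry with an "id" and a
# "snippet.title". This sample is invented for illustration only.
SAMPLE_CATEGORY_JSON = """
{"items": [
  {"id": "10", "snippet": {"title": "Music"}},
  {"id": "20", "snippet": {"title": "Gaming"}}
]}
"""

def build_category_lookup(raw_json: str) -> dict[int, str]:
    """Map each numeric category id to its human-readable title."""
    data = json.loads(raw_json)
    return {int(item["id"]): item["snippet"]["title"] for item in data["items"]}

lookup = build_category_lookup(SAMPLE_CATEGORY_JSON)

# Hypothetical rows as they might appear in one region's data file.
videos = [
    {"title": "Example video", "category_id": 10},
    {"title": "Another video", "category_id": 99},
]
for video in videos:
    # Fall back to "Unknown" when an id is missing from this region's JSON.
    name = lookup.get(video["category_id"], "Unknown")
    print(video["title"], "->", name)
```

Because category ids vary between regions, a separate lookup would be built from each region's own JSON file rather than shared across regions.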

For more information on specific columns in the dataset refer to the column metadata.

Acknowledgements

This dataset was collected using the YouTube API. This dataset is the updated version of Trending YouTube Video Statistics.

Inspiration

Possible uses for this dataset could include:
- Sentiment analysis in a variety of forms.
- Categorizing YouTube videos based on their comments and statistics.
- Training ML algorithms like RNNs to generate their own YouTube comments.
- Analyzing what factors affect how popular a YouTube video will be.
- Statistical analysis over time.

For further inspiration, see the kernels on this dataset!
YouTube Dataset on Mobile Streaming for Internet Traffic Modeling, Network...
figshare.com
txt
Updated Apr 14, 2022
Cite
Frank Loh; Florian Wamser; Fabian Poignée; Stefan Geißler; Tobias Hoßfeld (2022). YouTube Dataset on Mobile Streaming for Internet Traffic Modeling, Network Management, and Streaming Analysis [Dataset]. http://doi.org/10.6084/m9.figshare.19096823.v2
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.19096823.v2
Dataset updated
Apr 14, 2022
Dataset provided by
figshare
Authors
Frank Loh; Florian Wamser; Fabian Poignée; Stefan Geißler; Tobias Hoßfeld
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
YouTube
Description
Streaming is by far the predominant type of traffic in communication networks. With this public dataset, we provide 1,081 hours of time-synchronous video measurements at the network, transport, and application layers with the native YouTube streaming client on mobile devices. The dataset includes 80 network scenarios with 171 different individual bandwidth settings measured in 5,181 runs with limited bandwidth, 1,939 runs with emulated 3G/4G traces, and 4,022 runs with pre-defined bandwidth changes. This corresponds to 332 GB of video payload. We present the most relevant quality indicators for scientific use, i.e., initial playback delay, streaming video quality, adaptive video quality changes, video rebuffering events, and streaming phases.
YouTube users worldwide 2020-2029
statista.com
Updated Mar 3, 2025
Cite
Statista (2025). YouTube users worldwide 2020-2029 [Dataset]. https://www.statista.com/forecasts/1144088/youtube-users-in-the-world
Dataset updated
Mar 3, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
Worldwide, YouTube
Description
The global number of YouTube users was forecast to increase continuously between 2024 and 2029 by a total of 232.5 million users (+24.91 percent). After the ninth consecutive year of growth, the YouTube user base is estimated to reach 1.2 billion users, a new peak, in 2029. Notably, the number of YouTube users has increased continuously over the past years.

User figures for YouTube have been estimated by taking into account company filings or press material, secondary research, app downloads, and traffic data. They refer to the average monthly active users over the period.

The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information).

Find more key insights for the number of YouTube users in regions like Africa and South America.
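
The absolute and relative increases quoted in the forecast imply a 2024 baseline that the description does not state outright. A rough sketch of that back-calculation follows; the derived baseline and 2029 value are estimates from the two rounded figures, not numbers published by the source.

```python
# Figures quoted in the description.
INCREASE_MILLION = 232.5      # absolute growth, 2024 to 2029
INCREASE_FRACTION = 0.2491    # the same growth as a fraction of the 2024 base

# increase = base * fraction, so base = increase / fraction.
implied_2024_million = INCREASE_MILLION / INCREASE_FRACTION
implied_2029_million = implied_2024_million + INCREASE_MILLION

print(f"Implied 2024 base: {implied_2024_million:.1f} million users")
print(f"Implied 2029 value: {implied_2029_million / 1000:.2f} billion users")
```

The implied 2029 value rounds to the 1.2 billion users given in the text, so the quoted figures are mutually consistent.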
Most Watched Youtube Videos
kaggle.com
zip
Updated Apr 19, 2024
Cite
Jatinthakur706 (2024). Most Watched Youtube Videos [Dataset]. https://www.kaggle.com/datasets/jatinthakur706/most-watched-youtube-videos
Available download formats: zip
Dataset updated
Apr 19, 2024
Authors
Jatinthakur706
License
https://creativecommons.org/publicdomain/zero/1.0/
Area covered
YouTube
Description
This dataset contains data on the most watched YouTube videos up to April 2024. It contains columns such as views, artist, and channel. The data is ranked by number of views.
YouTube users in India 2020-2029
statista.com
Updated Mar 3, 2025
Cite
Statista (2025). YouTube users in India 2020-2029 [Dataset]. https://www.statista.com/forecasts/1146150/youtube-users-in-india
Dataset updated
Mar 3, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
India
Description
The number of YouTube users in India was forecast to increase continuously between 2024 and 2029 by a total of 222.2 million users (+34.88 percent). After the ninth consecutive year of growth, the YouTube user base is estimated to reach 859.26 million users, a new peak, in 2029. Notably, the number of YouTube users has increased continuously over the past years.

User figures for YouTube have been estimated by taking into account company filings or press material, secondary research, app downloads, and traffic data. They refer to the average monthly active users over the period.

The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information).

Find more key insights for the number of YouTube users in countries like Sri Lanka and Nepal.

Video-EEG Encoding-Decoding Dataset KU Leuven

zenodo.org
data.niaid.nih.gov

zip

Updated Jun 11, 2024

Cite

Yuanyuan Yao; Axel Stebner; Tinne Tuytelaars; Simon Geirnaert; Alexander Bertrand (2024). Video-EEG Encoding-Decoding Dataset KU Leuven [Dataset]. http://doi.org/10.5281/zenodo.10512414

Available download formats: zip

Unique identifier

https://doi.org/10.5281/zenodo.10512414

Dataset updated

Jun 11, 2024

Dataset provided by

Zenodo (http://zenodo.org/)

Authors

Yuanyuan Yao; Axel Stebner; Tinne Tuytelaars; Simon Geirnaert; Alexander Bertrand

License

Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Area covered

Leuven

Description

If using this dataset, please cite the following paper and the current Zenodo repository.

This dataset is described in detail in the following paper:

Yao, Y., Stebner, A., Tuytelaars, T., Geirnaert, S., & Bertrand, A. (2024). Identifying temporal correlations between natural single-shot videos and EEG signals. Journal of Neural Engineering, 21(1), 016018. doi:10.1088/1741-2552/ad2333

Introduction

The research work leading to this dataset was conducted at the Department of Electrical Engineering (ESAT), KU Leuven.

This dataset contains electroencephalogram (EEG) data collected from 19 young participants with normal or corrected-to-normal eyesight when they were watching a series of carefully selected YouTube videos. The videos were muted to avoid the confounds introduced by audio. For synchronization, a square box was encoded outside of the original frames and flashed every 30 seconds in the top right corner of the screen. A photosensor, detecting the light changes from this flashing box, was affixed to that region using black tape to ensure that the box did not distract participants. The EEG data was recorded using a BioSemi ActiveTwo system at a sample rate of 2048 Hz. Participants wore a 64-channel EEG cap, and 4 electrooculogram (EOG) sensors were positioned around the eyes to track eye movements.

The dataset includes a total of (19 subjects x 63 min + 9 subjects x 24 min) of data. Further details can be found in the following section.

Content

YouTube Videos: Due to copyright constraints, the dataset includes links to the original YouTube videos along with precise timestamps for the segments used in the experiments.
Raw EEG Data: Organized by subject ID, the dataset contains EEG segments corresponding to the presented videos. Both EEGLAB .set files (containing metadata) and .fdt files (containing raw data) are provided, which can also be read by popular EEG analysis Python packages such as MNE.
- The naming convention links each EEG segment to its corresponding video. E.g., the EEG segment 01_eeg corresponds to video 01_Dance_1, 03_eeg corresponds to video 03_Acrob_1, Mr_eeg corresponds to video Mr_Bean, etc.
- The raw data have 68 channels. The first 64 channels are EEG data, and the last 4 channels are EOG data. The position coordinates of the standard BioSemi headcaps can be downloaded here: https://www.biosemi.com/download/Cap_coords_all.xls.
- Due to minor synchronization ambiguities, different clocks in the PC and EEG recorder, and missing or extra video frames during video playback (rarely occurred), the length of the EEG data may not perfectly match the corresponding video data. The difference, typically within a few milliseconds, can be resolved by truncating the modality with the excess samples.
Signal Quality Information: A supplementary .txt file detailing potential bad channels. Users can opt to create their own criteria for identifying and handling bad channels.
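
The channel layout and the truncation step described in the list above can be sketched as follows. This is a simplified illustration on plain Python lists, not the authors' code: the helper names, the toy data, and the 30 fps video frame rate are assumptions, while the 2048 Hz sample rate and the 64 EEG + 4 EOG channel split come from the description.

```python
EEG_SAMPLE_RATE_HZ = 2048   # stated in the dataset description
N_EEG_CHANNELS = 64         # first 64 channels are EEG, the last 4 are EOG

def split_channels(raw):
    """Split a channels-first recording (68 rows) into EEG and EOG parts."""
    return raw[:N_EEG_CHANNELS], raw[N_EEG_CHANNELS:]

def truncate_to_common_duration(eeg, frames, frame_rate_hz):
    """Trim whichever modality is longer so both cover the same time span."""
    eeg_duration_s = len(eeg[0]) / EEG_SAMPLE_RATE_HZ
    video_duration_s = len(frames) / frame_rate_hz
    duration_s = min(eeg_duration_s, video_duration_s)
    keep_samples = int(duration_s * EEG_SAMPLE_RATE_HZ)
    keep_frames = int(duration_s * frame_rate_hz)
    return [channel[:keep_samples] for channel in eeg], frames[:keep_frames]

# Toy data: 68 channels with 10 samples more than 2 s of EEG, and exactly
# 2 s of video at a hypothetical 30 frames per second.
raw = [[0.0] * (2 * EEG_SAMPLE_RATE_HZ + 10) for _ in range(68)]
frames = list(range(60))

eeg, eog = split_channels(raw)
eeg_trimmed, frames_trimmed = truncate_to_common_duration(eeg, frames, 30)
print(len(eeg), len(eog), len(eeg_trimmed[0]), len(frames_trimmed))
```

In practice the recordings would be loaded with an EEG package such as MNE and handled as arrays; lists are used here only to keep the sketch dependency-free.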

The dataset is divided into two subsets: Single-shot and MrBean, based on the characteristics of the video stimuli.

Single-shot Dataset

The stimuli of this dataset consist of 13 single-shot videos (63 min in total), each depicting a single individual engaging in various activities such as dancing, mime, acrobatics, and magic shows. All the participants watched this video collection.

Video ID	Link	Start time (s)	End time (s)
01_Dance_1	https://youtu.be/uOUVE5rGmhM	8.54	231.20
03_Acrob_1	https://youtu.be/DjihbYg6F2Y	4.24	231.91
04_Magic_1	https://youtu.be/CvzMqIQLiXE	3.68	348.17
05_Dance_2	https://youtu.be/f4DZp0OEkK4	5.05	227.99
06_Mime_2	https://youtu.be/u9wJUTnBdrs	5.79	347.05
07_Acrob_2	https://youtu.be/kRqdxGPLajs	183.61	519.27
08_Magic_2	https://youtu.be/FUv-Q6EgEFI	3.36	270.62
09_Dance_3	https://youtu.be/LXO-jKksQkM	5.61	294.17
12_Magic_3	https://youtu.be/S84AoWdTq3E	1.76	426.36
13_Dance_4	https://youtu.be/0wc60tA1klw	14.28	217.18
14_Mime_3	https://youtu.be/0Ala3ypPM3M	21.87	386.84
15_Dance_5	https://youtu.be/mg6-SnUl0A0	15.14	233.85
16_Mime_6	https://youtu.be/8V7rhAJF6Gc	31.64	388.61
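
As a quick consistency check, the segment lengths can be recomputed from the start and end times in the table. The sketch below copies the table values into a list (the variable names are illustrative) and sums the durations, which should land close to the stated 63 minutes.

```python
# (video_id, start_s, end_s) copied from the table above.
SINGLE_SHOT_SEGMENTS = [
    ("01_Dance_1", 8.54, 231.20),
    ("03_Acrob_1", 4.24, 231.91),
    ("04_Magic_1", 3.68, 348.17),
    ("05_Dance_2", 5.05, 227.99),
    ("06_Mime_2", 5.79, 347.05),
    ("07_Acrob_2", 183.61, 519.27),
    ("08_Magic_2", 3.36, 270.62),
    ("09_Dance_3", 5.61, 294.17),
    ("12_Magic_3", 1.76, 426.36),
    ("13_Dance_4", 14.28, 217.18),
    ("14_Mime_3", 21.87, 386.84),
    ("15_Dance_5", 15.14, 233.85),
    ("16_Mime_6", 31.64, 388.61),
]

# Duration of each segment in seconds.
durations_s = {vid: end - start for vid, start, end in SINGLE_SHOT_SEGMENTS}
total_min = sum(durations_s.values()) / 60

print(f"{len(durations_s)} videos, {total_min:.1f} min in total")
```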

MrBean Dataset

Additionally, 9 participants watched an extra 24-minute clip from the first episode of Mr. Bean, where multiple (moving) objects may exist and interact, and the camera viewpoint may change. The subject IDs and the signal quality files are inherited from the single-shot dataset.

Video ID	Link	Start time (s)	End time (s)
Mr_Bean	https://www.youtube.com/watch?v=7Im2I6STbms	39.77	1495.00

Acknowledgement

This research is funded by the Research Foundation - Flanders (FWO) project No G081722N, junior postdoctoral fellowship fundamental research of the FWO (for S. Geirnaert, No. 1242524N), the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation program (grant agreement No 802895), the Flemish Government (AI Research Program), and the PDM mandate from KU Leuven (for S. Geirnaert, No PDMT1/22/009).

We also thank the participants for their time and effort in the experiments.

Contact Information

Executive researcher: Yuanyuan Yao, yuanyuan.yao@kuleuven.be

Led by: Prof. Alexander Bertrand, alexander.bertrand@kuleuven.be

Car crash dataset RUSSIA 2022-2023
kaggle.com
Updated May 10, 2023
Cite
Sivoha (2023). Car crash dataset RUSSIA 2022-2023 [Dataset]. https://www.kaggle.com/datasets/sivoha/car-crash-dataset-russia-2022-2023
Dataset updated
May 10, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Sivoha
License
https://creativecommons.org/publicdomain/zero/1.0/
Area covered
Russia
Description
Car crash dataset RUSSIA 2022-2023 is a large driving video dataset that contains over 500 high-resolution videos of various driving scenarios. The dataset was created to aid the development and testing of autonomous driving systems and other related technologies. It includes videos from Russia, captured in a diverse set of locations, weather conditions, and lighting conditions, with each video lasting about 10 seconds. The videos are annotated with bounding boxes around objects such as different types of cars, pedestrians, and cyclists, as well as traffic signs and traffic lights. Additionally, the dataset includes metadata for each video.

Car crash dataset RUSSIA 2022-2023 is considered to be one of the few datasets from Russia on this topic. It was created by 7 students from MIEM HSE, Moscow. The first version was published on 9 May 2023.
A Labelled Dataset for Sentiment Analysis of Videos on YouTube, TikTok, and...
zenodo.org
data.niaid.nih.gov
+2more
csv
Updated Jul 20, 2024
Cite
Nirmalya Thakur; Vanessa Su; Mingchen Shao; Kesha A. Patel; Hongseok Jeong; Victoria Knieling; Andrew Bian (2024). A Labelled Dataset for Sentiment Analysis of Videos on YouTube, TikTok, and other sources about the 2024 outbreak of Measles [Dataset]. http://doi.org/10.5281/zenodo.11711230
Available download formats: csv
Unique identifier
https://doi.org/10.5281/zenodo.11711230
Dataset updated
Jul 20, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Nirmalya Thakur; Vanessa Su; Mingchen Shao; Kesha A. Patel; Hongseok Jeong; Victoria Knieling; Andrew Bian
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Jun 15, 2024
Area covered
YouTube
Description
Please cite the following paper when using this dataset:

N. Thakur, V. Su, M. Shao, K. Patel, H. Jeong, V. Knieling, and A. Bian “A labelled dataset for sentiment analysis of videos on YouTube, TikTok, and other sources about the 2024 outbreak of measles,” Proceedings of the 26th International Conference on Human-Computer Interaction (HCII 2024), Washington, USA, 29 June - 4 July 2024. (Accepted as a Late Breaking Paper, Preprint Available at: https://doi.org/10.48550/arXiv.2406.07693)

Abstract

This dataset contains the data of 4011 videos about the ongoing outbreak of measles published on 264 websites on the internet between January 1, 2024, and May 31, 2024. These websites primarily include YouTube and TikTok, which account for 48.6% and 15.2% of the videos, respectively. The remainder of the websites include Instagram and Facebook as well as the websites of various global and local news organizations. For each of these videos, the URL of the video, title of the post, description of the post, and the date of publication of the video are presented as separate attributes in the dataset. After developing this dataset, sentiment analysis (using VADER), subjectivity analysis (using TextBlob), and fine-grain sentiment analysis (using DistilRoBERTa-base) of the video titles and video descriptions were performed. This included classifying each video title and video description into (i) one of the sentiment classes i.e. positive, negative, or neutral, (ii) one of the subjectivity classes i.e. highly opinionated, neutral opinionated, or least opinionated, and (iii) one of the fine-grain sentiment classes i.e. fear, surprise, joy, sadness, anger, disgust, or neutral. These results are presented as separate attributes in the dataset for the training and testing of machine learning algorithms for performing sentiment analysis or subjectivity analysis in this field as well as for other applications. The paper associated with this dataset (please see the above-mentioned citation) also presents a list of open research questions that may be investigated using this dataset.
ITTV - A Dataset of Italian Television for Automatic Genre Classification
zenodo.org
data.niaid.nih.gov
csv
Updated Jun 13, 2023
Cite
Paolo Sani; Alessandro Ilic Mezza; Augusto Sarti (2023). ITTV - A Dataset of Italian Television for Automatic Genre Classification [Dataset]. http://doi.org/10.5281/zenodo.8027327
Available download formats: csv
Unique identifier
https://doi.org/10.5281/zenodo.8027327
Dataset updated
Jun 13, 2023
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Paolo Sani; Alessandro Ilic Mezza; Augusto Sarti
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Italy
Description
ITTV is a publicly available dataset of Italian TV programs introduced in

Alessandro Ilic Mezza, Paolo Sani, and Augusto Sarti, "Automatic TV Genre Classification Based on Visually-Conditioned Deep Audio Features," in 2023 31st European Signal Processing Conference (EUSIPCO), 2023.

ITTV consists of 2625 manually annotated YouTube videos, totaling over 670 hours. Each clip is assigned one of seven classes:

Cartoons

Commercials

Football

Music

News

Talk Shows

Weather Forecast

ITTV genre taxonomy is similar to that of the well-known RAI dataset described in

Maurizio Montagnuolo and Alberto Messina, "Parallel neural networks for multimodal video genre classification,” Multimedia Tools and Applications, vol. 41, no. 1, pp. 125–159, 2009.

The dataset contains genre annotations and metadata in CSV format. Please note that audio data is not provided.

We provide the annotations for a balanced training (1575 clips) and validation (525 clips) split, as well as for a disjoint test set containing 525 installments from TV programs not included in the development set.

As YouTube continuously updates, some videos may not be available in the future. Although we intend to keep ITTV updated as best as possible, please note that some content may not be available at any given time.

Some YouTube videos (especially from the Football class and, to a lesser extent, the Cartoons class) may only be available in some countries due to regional restrictions imposed by the content creator. All videos are known to be accessible from Italy (last accessed on Nov. 25th, 2022.)

Please contact Alessandro Ilic Mezza for further questions (e-mail: alessandroilic.mezza@polimi.it).
Youtube cookery channels viewers comments in Hinglish
zenodo.org
csv
Updated Jan 24, 2020
Cite
Abhishek Kaushik; Gagandeep Kaur (2020). Youtube cookery channels viewers comments in Hinglish [Dataset]. http://doi.org/10.5281/zenodo.2841848
Available download formats: csv
Unique identifier
https://doi.org/10.5281/zenodo.2841848
Dataset updated
Jan 24, 2020
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Abhishek Kaushik; Gagandeep Kaur
License
Open Data Commons Attribution License (ODC-By) v1.0, https://www.opendatacommons.org/licenses/by/1.0/
License information was derived automatically
Area covered
YouTube
Description
The data was collected from well-known cookery YouTube channels in India. The main focus was to collect viewers' comments written in Hinglish. The datasets are taken from two top Indian cooking channels: Nisha Madhulika and Kabita's Kitchen.

The comments in both datasets are divided into seven categories:

Label 1- Gratitude

Label 2- About the recipe

Label 3- About the video

Label 4- Praising

Label 5- Hybrid

Label 6- Undefined

Label 7- Suggestions and queries

All the labelling has been done manually.

Nisha Madhulika dataset:

Dataset characteristics: Multivariate

Number of instances: 4900

Area: Cooking

Attribute characteristics: Real

Number of attributes: 3

Date donated: March, 2019

Associate tasks: Classification

Missing values: Null

Kabita Kitchen dataset:

Dataset characteristics: Multivariate

Number of instances: 4900

Area: Cooking

Attribute characteristics: Real

Number of attributes: 3

Date donated: March, 2019

Associate tasks: Classification

Missing values: Null

Each channel has two separate dataset files: a preprocessing file and a main file.

The preprocessing files were generated after preprocessing and exploratory data analysis on both datasets. Each preprocessing file includes:

Id

Comment text

Labels

Count of stop-words

Uppercase words

Hashtags

Word count

Char count

Average words

Numeric
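
The derived columns listed above are simple per-comment counts. The following sketch shows one way such features might be computed; it is not the authors' pipeline. The stop-word list is a tiny made-up sample, the column names are illustrative, and "average words" is interpreted here as average word length, which is an assumption.

```python
# Tiny illustrative stop-word list; the authors' actual list is not given.
STOP_WORDS = {"the", "is", "a", "an", "and", "to", "of", "for", "this"}

def comment_features(text: str) -> dict:
    """Compute simple count-based features for one comment."""
    words = text.split()
    return {
        "stopword_count": sum(w.lower().strip(".,!?") in STOP_WORDS for w in words),
        "uppercase_words": sum(w.isupper() for w in words),
        "hashtags": sum(w.startswith("#") for w in words),
        "word_count": len(words),
        "char_count": len(text),
        # Assumed meaning of "Average words": mean word length in characters.
        "avg_word_length": sum(len(w) for w in words) / len(words) if words else 0.0,
        "numeric": sum(w.isdigit() for w in words),
    }

features = comment_features("Thank you for this AMAZING recipe #yummy 10 out of 10")
for name, value in features.items():
    print(name, value)
```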

The main file includes:

Id

comment text

Labels

Please cite the paper

https://www.mdpi.com/2504-2289/3/3/37

MDPI and ACS Style

Kaur, G.; Kaushik, A.; Sharma, S. Cooking Is Creating Emotion: A Study on Hinglish Sentiments of Youtube Cookery Channels Using Semi-Supervised Approach. Big Data Cogn. Comput. 2019, 3, 37.
Replication Data for: Estimating the Ideology of YouTube Videos
search.dataone.org
dataverse.harvard.edu
Updated Dec 16, 2023
Cite
Lai, Angela; Brown, Megan A.; Bisbee, James; Tucker, Joshua A.; Nagler, Jonathan; Bonneau, Richard (2023). Replication Data for: Estimating the Ideology of YouTube Videos [Dataset]. http://doi.org/10.7910/DVN/WZZFTW
Unique identifier
https://doi.org/10.7910/DVN/WZZFTW
Dataset updated
Dec 16, 2023
Dataset provided by
Harvard Dataverse
Authors
Lai, Angela; Brown, Megan A.; Bisbee, James; Tucker, Joshua A.; Nagler, Jonathan; Bonneau, Richard
Area covered
YouTube
Description
Abstract: We present a method for estimating the ideology of political YouTube videos. The subfield of estimating ideology as a latent variable has often focused on traditional actors such as legislators while more recent work has used social media data to estimate the ideology of ordinary users, political elites, and media sources. We build on this work to estimate the ideology of a political YouTube video. First, we start with a matrix of political Reddit posts linking to YouTube videos and apply correspondence analysis to place those videos in an ideological space. Second, we train a language model with those estimated ideologies as training labels, enabling us to estimate the ideologies of videos not posted on Reddit. These predicted ideologies are then validated against human labels. We demonstrate the utility of this method by applying it to the watch histories of survey respondents to evaluate the prevalence of echo chambers on YouTube in addition to the association between video ideology and viewer engagement. Our approach gives video-level scores based only on supplied text metadata, is scalable, and can be easily adjusted to account for changes in the ideological landscape. Keywords: Ideology estimation, YouTube, latent variable This folder contains the replication materials for "Estimating the Ideology of Political YouTube Videos."
Video instructions for the data portal
americansamoa-data.nocache.eightyoptions.com.au
pacificdata.org
+14more
zip
Updated Apr 2, 2025
Cite
Secretariat of the Pacific Regional Environment Programme (2025). Video instructions for the data portal [Dataset]. https://americansamoa-data.nocache.eightyoptions.com.au/dataset/video-instructions-data-portal
Available download formats: zip (40900538), zip, zip (35894926), zip (41752372)
Dataset updated
Apr 2, 2025
Dataset provided by
Secretariat of the Pacific Regional Environment Programme
License
Public Domain Mark 1.0, https://creativecommons.org/publicdomain/mark/1.0/
License information was derived automatically
Area covered
Pacific Region
Description
These instructional videos walk users through the portal and its different features.
Data from: Youtube in brazilian academic libraries: who, how and for what is...
scielo.figshare.com
jpeg
Updated May 30, 2023
Cite
Enrique Muriel-Torrado; Marcio Gonçalves (2023). Youtube in brazilian academic libraries: who, how and for what is used [Dataset]. http://doi.org/10.6084/m9.figshare.5931073.v1
Available download formats: jpeg
Unique identifier
https://doi.org/10.6084/m9.figshare.5931073.v1
Dataset updated
May 30, 2023
Dataset provided by
SciELO journals
Authors
Enrique Muriel-Torrado; Marcio Gonçalves
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
YouTube
Description
ABSTRACT This research aims to analyze the use of YouTube as a useful platform for the activities of library and information science professionals in Brazilian academic libraries. It relates the audiovisual practices of university libraries to the activities they encourage, and focuses on the importance of the librarian as a content producer in the digital environment. The survey results serve as reference material for information scientists and managers of information units interested in sharing audiovisual information as a new way of relating to their users. Finally, based on the results, it is recommended to plan the communication strategy on social media platforms such as YouTube, and to prepare relevant content to engage with subscribers and users.
first-impressions-v2
huggingface.co
Cite
Yeray, first-impressions-v2 [Dataset]. https://huggingface.co/datasets/yeray142/first-impressions-v2
Authors
Yeray
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0), https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
Dataset Card for First Impressions V2

The First Impressions data set comprises 10,000 clips (average duration 15 s) extracted from more than 3,000 different YouTube high-definition (HD) videos of people facing and speaking in English to a camera. The videos are split into training, validation and test sets with a 3:1:1 ratio. People in the videos vary in gender, age, nationality, and ethnicity. Videos are labeled with personality trait variables. Amazon Mechanical Turk (AMT) was… See the full description on the dataset page: https://huggingface.co/datasets/yeray142/first-impressions-v2.
Data from: A systematic review of methods for studying consumer health...
datadryad.org
data.niaid.nih.gov
+1more
zip
Updated Sep 13, 2013
Cite
Margaret Sampson; Jordi Cumber; Claudia Li; Catherine M. Pound; Ann Fuller; Denise M. Harrison; Denise Harrison (2013). A systematic review of methods for studying consumer health YouTube videos, with implications for systematic reviews [Dataset]. http://doi.org/10.5061/dryad.4jh42
Available download formats: zip
Unique identifier
https://doi.org/10.5061/dryad.4jh42
Dataset updated
Sep 13, 2013
Dataset provided by
Dryad
Authors
Margaret Sampson; Jordi Cumber; Claudia Li; Catherine M. Pound; Ann Fuller; Denise M. Harrison; Denise Harrison
Time period covered
2013
Area covered
YouTube
Description
Characteristics of published YouTube video reviews: Data extracted from published manuscripts. Excel spreadsheet with two tabs, one for the original sample and another for newer manuscripts. The row titled PMID indicates the PubMed ID number and identifies the article that the data is taken from. File: DE on review methods V2.xls
Data from: Effectiveness of Online Off-the-Job Training in Attracting...
data.niaid.nih.gov
zenodo.org
Updated Jul 11, 2024
Kobayashi, Fumitaka (2024). Data from: Effectiveness of Online Off-the-Job Training in Attracting Participants and Video-On-Demand Streaming in Improving Work-Life Balance: A Study Focusing on Medical Technologists [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8260491
Dataset updated
Jul 11, 2024
Dataset provided by
Ohmae, Kazuto
Kobayashi, Fumitaka
Lee, Sang-Tae
Chagi, Yoshinari
Yamaguchi, Naoko
Ikejima, Takuya
Abe, Noriyuki
Tatsumi, Shigenobu
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The Nara Association of Medical Technologists introduced online Off-Job Training (Off-JT) in FY2020 in response to the COVID-19 pandemic. This study aims to evaluate the online Off-JT, which differs from the traditional face-to-face format. Firstly, we compared the online format's ability to attract participants with that of the face-to-face format, based on the number of training sessions and attendees. Despite having fewer training sessions (40.8% less), the online format had an average attendance 105.4% higher (39.7 vs. 19.3) than the face-to-face format. To enhance participant convenience, we offered a limited number of live and video-on-demand (VOD) sessions on YouTube, evaluating their usefulness through an online survey focusing on work-life balance (WLB). The survey results showed that 81.9% (458/559) of respondents reported an improvement in WLB. The effect on WLB improvement varied depending on the viewing method, with VOD sessions showing 84.1% (376/447) and live sessions showing 73.2% (82/112). We believe that the increased ability to attract participants in the online Off-JT is mainly due to the elimination of travel burdens through internet-connected devices. The combination of live and VOD sessions on YouTube allowed participants to adjust their viewing time, leading to better allocation of free time and improved WLB. The online Off-JT and VOD delivery have been shown to enhance convenience for participants by removing geographical and time constraints, resulting in positive effects.
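The survey percentages quoted in this description can be recomputed from the counts it gives. The following is a minimal sketch for checking that arithmetic; the helper name `pct` and the variable names are illustrative assumptions and do not come from the dataset or the study.

```python
def pct(part, whole):
    """Share of `part` in `whole`, as a percentage rounded to one decimal."""
    return round(100 * part / whole, 1)

# (respondents reporting improved WLB, total respondents), as quoted in the description
vod = (376, 447)
live = (82, 112)

# The overall figure is the sum of the two viewing methods
overall = (vod[0] + live[0], vod[1] + live[1])

rates = {
    "overall": pct(*overall),
    "vod": pct(*vod),
    "live": pct(*live),
}
print(overall, rates)
```

The two viewing-method counts add up to the overall 458/559, so the three quoted percentages are mutually consistent.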
Multi-language Video Subtitle Dataset
data.mendeley.com
Updated Nov 29, 2021
+ more versions
Olarik Surinta (2021). Multi-language Video Subtitle Dataset [Dataset]. http://doi.org/10.17632/gj8d88h2g3.2
Unique identifier
https://doi.org/10.17632/gj8d88h2g3.2
Dataset updated
Nov 29, 2021
Authors
Olarik Surinta
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The video subtitle images were collected from 24 videos shared on Facebook and YouTube. The subtitle text is in Thai and English and comprises Thai characters, Roman characters, Thai numerals, Arabic numerals, and special characters, 157 characters in total.

In the data-preprocessing step, we converted all 24 videos to images and obtained 2,700 images with subtitle text. Each subtitle text image was 1280x720 pixels and stored in JPG format. Further, we generated the ground truth from 4,224 subtitle images using the labelImg program. Labels were then assigned to each subtitle image. Note that the number before the label is the order of the subtitle text image.
RealVAD: A Real-world Dataset for Voice Activity Detection
data.niaid.nih.gov
explore.openaire.eu
+1 more
Updated Jul 3, 2020
Cigdem Beyan (2020). RealVAD: A Real-world Dataset for Voice Activity Detection [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3928150
Dataset updated
Jul 3, 2020
Dataset provided by
Cigdem Beyan
Vittorio Murino
Muhammad Shahid
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
RealVAD: A Real-world Dataset for Voice Activity Detection

The task of automatically detecting “Who is Speaking and When” is broadly known as Voice Activity Detection (VAD). Automatic VAD is an important task and the foundation of several domains, e.g., human-human and human-computer/robot/virtual-agent interaction analyses, and industrial applications.

The RealVAD dataset is constructed from a YouTube video of a panel discussion lasting approx. 83 minutes. The audio is available from a single channel. There is one static camera capturing all panelists, the moderator and the audience.

Particular aspects of the RealVAD dataset are:

It is composed of panelists with different nationalities (British, Dutch, French, German, Italian, American, Mexican, Colombian, Thai). This aspect allows studying the effect of ethnic origin variety on automatic VAD.

There is a gender balance such that there are four female and five male panelists.

The panelists are sitting in two rows and they can be gazing at the audience, other panelists, their laptop, the moderator or anywhere in the room while speaking or not speaking. Therefore, they were captured not only from a frontal view but also from a side view, varying with their instantaneous posture and head orientation.

The panelists are moving freely and are doing various spontaneous actions (e.g., drinking water, checking their cell phone, using their laptop, etc.), resulting in different postures.

The panelists’ body parts are sometimes partially occluded by their own or others' body parts or belongings (e.g., laptop).

There are also natural changes of illumination and shadow rising on the wall behind the panelists in the back row.

For the panelists sitting in the front row in particular, there is sometimes background motion when the person(s) behind them move.

The annotations include:

The upper body detection of nine panelists in bounding box form.

Associated VAD ground-truth (speaking, not-speaking) for nine panelists.

Acoustic features extracted from the video: MFCC and raw filterbank energies.

All information regarding the annotations is given in the ReadMe.txt and Acoustic Features README.txt files.

When using this dataset for your research, please cite the following paper in your publication:

C. Beyan, M. Shahid and V. Murino, "RealVAD: A Real-world Dataset and A Method for Voice Activity Detection by Body Motion Analysis", in IEEE Transactions on Multimedia, 2020.
Extended YouTube Faces (E-YTF) Dataset
paperswithcode.com
opendatalab.com
Updated Aug 1, 2018
Claudio Ferrari; Stefano Berretti; Alberto del Bimbo (2018). Extended YouTube Faces (E-YTF) Dataset [Dataset]. https://paperswithcode.com/dataset/extended-youtube-faces-e-ytf
Dataset updated
Aug 1, 2018
Authors
Claudio Ferrari; Stefano Berretti; Alberto del Bimbo
Area covered
YouTube
Description
The proposed Extended-YouTube Faces (E-YTF) is an extension of the famous YouTube Faces (YTF) dataset and is specifically designed to further push the challenges of face recognition by addressing the problem of open-set face identification from heterogeneous data i.e. still images vs video.

Statista (2024). Hours of video uploaded to YouTube every minute 2007-2022 [Dataset]. https://www.statista.com/statistics/259477/hours-of-video-uploaded-to-youtube-every-minute/

Hours of video uploaded to YouTube every minute 2007-2022

237 scholarly articles cite this dataset (View in Google Scholar)