This dataset contains statistics for a selection of YouTube videos, capturing metrics such as views, comments, likes, dislikes, and the timestamp when the data was recorded. The dataset provides insights into the popularity and engagement levels of these videos as of April 15, 2019. This data can be useful for analyzing trends in video performance, user engagement, and the impact of content over time.
File Description: This CSV file contains detailed statistics for a set of YouTube videos, including unique video identifiers and various engagement metrics. Each row represents a different video, and the columns provide specific data points related to the video's performance.
videostatsid: Unique identifier for each video statistics entry.
ytvideoid: Unique YouTube video identifier.
views: The total number of views the video has received.
comments: The total number of comments posted on the video.
likes: The total number of likes the video has received.
dislikes: The total number of dislikes the video has received.
timestamp: The date and time when the statistics were recorded, in the format YYYY-MM-DD HH:MM.
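A file with this schema can be loaded and sanity-checked with pandas; a minimal sketch, in which the sample rows and the derived `likes_per_1k_views` column are invented for illustration:

```python
import io
import pandas as pd

# Hypothetical two-row sample matching the documented schema;
# the real file's values and name will differ.
sample = io.StringIO(
    "videostatsid,ytvideoid,views,comments,likes,dislikes,timestamp\n"
    "1,dQw4w9WgXcQ,1000,10,100,5,2019-04-15 12:30\n"
    "2,abc123xyz45,2500,42,310,12,2019-04-15 12:31\n"
)

df = pd.read_csv(sample, parse_dates=["timestamp"])

# A simple engagement ratio: likes per 1,000 views
df["likes_per_1k_views"] = df["likes"] / df["views"] * 1000
print(df[["ytvideoid", "views", "likes_per_1k_views"]])
```

Parsing the timestamp column up front makes time-based grouping and trend analysis straightforward later on.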
The global number of YouTube users was forecast to increase continuously between 2024 and 2029 by a total of ***** million users (+***** percent). After a ninth consecutive year of growth, the YouTube user base is estimated to reach *** billion users, a new peak, in 2029. Notably, the number of YouTube users has increased continuously over recent years. User figures, shown here for the platform YouTube, have been estimated by taking into account company filings or press material, secondary research, app downloads, and traffic data. They refer to the average monthly active users over the period. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic, and technological environment in up to *** countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations, and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information). Find more key insights for the number of YouTube users in regions like Africa and South America.
The number of YouTube users in India was forecast to increase continuously between 2024 and 2029 by a total of ***** million users (+***** percent). After a ninth consecutive year of growth, the YouTube user base is estimated to reach ****** million users, a new peak, in 2029. Notably, the number of YouTube users has increased continuously over recent years. User figures, shown here for the platform YouTube, have been estimated by taking into account company filings or press material, secondary research, app downloads, and traffic data. They refer to the average monthly active users over the period. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic, and technological environment in up to *** countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations, and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information). Find more key insights for the number of YouTube users in countries like Sri Lanka and Nepal.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains all comments (comments and replies) of the YouTube vision video "Tunnels" by "The Boring Company", fetched on 2020-10-13 using the YouTube API. The comments were classified manually by three persons. We performed single-class labeling of the video comments regarding their relevance for requirements engineering (RE) (ham/spam) and their polarity (positive/neutral/negative). Furthermore, we performed multi-class labeling of the comments regarding their intention (feature request and problem report) and their topic (efficiency and safety). While a comment can only be relevant or not relevant and have only one polarity, it can have one or more intentions and one or more topics.
For the replies, one person also classified them regarding their relevance for RE. However, the investigation of the replies is ongoing and left for future work.
Remark: For 126 comments and 26 replies, we could not determine the date and time since they were no longer accessible on YouTube at the time this data set was created. In the case of a missing date and time, we inserted "NULL" in the corresponding cell.
This data set includes the following files:
Dataset.xlsx contains the raw and labeled video comments and replies:
For each comment, the data set contains:
ID: An identification number generated by YouTube for the comment
Date: The date and time of the creation of the comment
Author: The username of the author of the comment
Likes: The number of likes of the comment
Replies: The number of replies to the comment
Comment: The written comment
Relevance: Label indicating the relevance of the comment for RE (ham = relevant, spam = irrelevant)
Polarity: Label indicating the polarity of the comment
Feature request: Label indicating that the comment requests a feature
Problem report: Label indicating that the comment reports a problem
Efficiency: Label indicating that the comment deals with the topic efficiency
Safety: Label indicating that the comment deals with the topic safety
For each reply, the data set contains:
ID: The identification number of the comment to which the reply belongs
Date: The date and time of the creation of the reply
Author: The username of the author of the reply
Likes: The number of likes of the reply
Comment: The written reply
Relevance: Label indicating the relevance of the reply for RE (ham = relevant, spam = irrelevant)
Detailed analysis results.xlsx contains the detailed results of the ten-times repeated 10-fold cross-validation analyses for all considered combinations of machine learning algorithms and features
Guide Sheet - Multi-class labeling.pdf describes the coding task, defines the categories, and lists examples to reduce inconsistencies and increase the quality of manual multi-class labeling
Guide Sheet - Single-class labeling.pdf describes the coding task, defines the categories, and lists examples to reduce inconsistencies and increase the quality of manual single-class labeling
Python scripts for analysis.zip contains the scripts (as jupyter notebooks) and prepared data (as csv-files) for the analyses
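The labeled comments can be loaded with pandas (e.g., `pd.read_excel("Dataset.xlsx", na_values=["NULL"])` for the actual workbook); the sketch below uses an invented two-row CSV stand-in for the sheet, since the column names above are documented but the exact sheet layout is not:

```python
import io
import pandas as pd

# Invented stand-in rows following the documented columns.
# "NULL" marks comments whose date/time could no longer be retrieved.
sample = io.StringIO(
    "ID,Date,Author,Likes,Replies,Comment,Relevance\n"
    "Ugx1,2020-10-13 08:15,alice,3,1,Great idea!,spam\n"
    "Ugx2,NULL,bob,0,0,Tunnels need emergency exits,ham\n"
)

df = pd.read_csv(sample, na_values=["NULL"], parse_dates=["Date"])

# Comments with missing timestamps become NaT (not-a-time)
print(df["Date"].isna().sum())
```

Declaring `"NULL"` as a missing-value marker keeps the Date column as a proper datetime dtype instead of mixed strings.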
CC0 1.0 Universal (CC0 1.0): https://creativecommons.org/publicdomain/zero/1.0/
Late Night Talk Shows are a staple of American television culture, and with the shows establishing a digital presence in the form of YouTube channels, this culture has become more global. Some of the channels here have more than 20 million subscribers, which shows the amount of influence they hold on this platform.
The data is organized on a per-show channel basis and includes the most important information, such as video titles and the numeric counts of likes, dislikes, comments, and views (as of 13 June 2020).
All of this data is responsibly scraped from YouTube and I would like to acknowledge all the respective Talk Shows for making their content free for the public.
The main inspiration for this dataset is examining how a video title, or a particular celebrity appearing on the talk show, can affect the engagement rate of a video.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
We present an English YouTube dataset manually annotated for hate speech types and targets. The comments to be annotated were sampled from English YouTube comments on videos about the Covid-19 pandemic in the period from January 2020 to May 2020. Three sets were annotated: a training set with 51,655 comments (IMSyPP_EN_YouTube_comments_train.csv) and two evaluation sets, one annotated in-context (IMSyPP_EN_YouTube_comments_evaluation_context.csv) and one out-of-context (IMSyPP_EN_YouTube_comments_evaluation_no_context.csv), each based on the same 10,759 comments. The dataset was annotated by 10 annotators, with most (99.9%) of the comments annotated by two annotators. It was used to train a classification model for hate speech type detection that is publicly available at: https://huggingface.co/IMSyPP/hate_speech_en. The dataset consists of the following fields:
Video_ID - YouTube ID of the video under which the comment was posted
Comment_ID - YouTube ID of the comment
Text - text of the comment
Type - type of hate speech
Target - target of the hate speech
Annotator - code of the human annotator
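Since most comments carry labels from two annotators, inter-annotator agreement can be quantified with Cohen's kappa. A minimal sketch using scikit-learn; the six toy labels below are invented stand-ins, not values from the dataset:

```python
from sklearn.metrics import cohen_kappa_score

# Invented labels from two hypothetical annotators over six comments;
# the real dataset stores the annotator code alongside each label.
annotator_a = ["acceptable", "offensive", "acceptable",
               "violent", "offensive", "acceptable"]
annotator_b = ["acceptable", "offensive", "inappropriate",
               "violent", "offensive", "acceptable"]

kappa = cohen_kappa_score(annotator_a, annotator_b)
print(round(kappa, 2))  # → 0.76
```

Kappa corrects raw percent agreement for the agreement expected by chance, which matters when one label class dominates, as is typical for hate-speech data.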
During the first quarter of 2024, Huge YouTube accounts, which had over 50,000 followers, reported an engagement rate of approximately 6.2 percent on their short-format content. In comparison, engagement was considerably lower on long-format videos, which reported an engagement rate of 1.72 percent for Huge accounts. Medium YouTube accounts, which had a following of between 2,001 and 10,000 users, reported engagement rates of almost three percent on their Shorts, while long videos had an engagement rate of around 0.15 percent.
Open Data Commons Attribution License (ODC-By) v1.0: https://www.opendatacommons.org/licenses/by/1.0/
License information was derived automatically
The collection “Protests Belarus 2020” contains 101 videos (mp4) on protests in the second half of 2020 (mainly in the Minsk area) triggered by the re-election of Lukashenko and the treatment of the opposition during the presidential elections. We downloaded all data in September 2020 and made screenshots (pdf) of the websites so that the discussion and comments on the individual video posts can be followed. All data is processed in an MS Excel database with metadata.
We collect all videos that are 1) related to the event AND show actions of this event, 2) downloadable, and 3) findable with our search words during a particular period. We strictly aim at a systematic and objective selection and organized storage of protest-related videos. We identify particular event-related search words after intense research on the event. Following the snowball principle, we then start collecting videos with the help of these search words and try to download as much relevant content as possible. However, we cannot guarantee the completeness of protest videos for the particular event. We search for videos and include them in the collection until a certain degree of saturation has been reached. Due to copyright restrictions, we are only allowed to give access to the database of the collected video files, including the hyperlinks with their metadata, and not to the videos themselves.
The videos have been posted mainly by the participants of the events. Therefore, the material is only an extract and biased by the perspective of the single creator.
The collection is part of a larger and ongoing collection of videos on protest events in the post-Soviet region.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
RealVAD: A Real-world Dataset for Voice Activity Detection
The task of automatically detecting “Who is Speaking and When” is broadly named Voice Activity Detection (VAD). Automatic VAD is a very important task and the foundation of several domains, e.g., human-human, human-computer/robot/virtual-agent interaction analyses, and industrial applications.
The RealVAD dataset is constructed from a YouTube video of a panel discussion lasting approximately 83 minutes. The audio is available from a single channel. A single static camera captures all panelists, the moderator, and the audience.
Particular aspects of the RealVAD dataset are:
It is composed of panelists of different nationalities (British, Dutch, French, German, Italian, American, Mexican, Colombian, Thai). This aspect allows studying the effect of ethnic-origin variety on automatic VAD.
There is a gender balance such that there are four female and five male panelists.
The panelists sit in two rows, and they may be gazing at the audience, other panelists, their laptops, the moderator, or anywhere in the room while speaking or not speaking. Therefore, they were captured not only from a frontal view but also from side views, varying with their momentary posture and head orientation.
The panelists are moving freely and are doing various spontaneous actions (e.g., drinking water, checking their cell phone, using their laptop, etc.), resulting in different postures.
The panelists’ body parts are sometimes partially occluded by their own or another's body parts or belongings (e.g., a laptop).
There are also natural changes of illumination and shadows appearing on the wall behind the panelists in the back row.
For the panelists sitting in the front row in particular, there is sometimes background motion when the person(s) behind them move.
The annotations include:
Upper-body detections of the nine panelists in bounding-box form.
Associated VAD ground-truth (speaking, not-speaking) for nine panelists.
Acoustic features extracted from the video: MFCC and raw filterbank energies.
All information regarding the annotations is given in the ReadMe.txt and Acoustic Features README.txt files.
When using this dataset for your research, please cite the following paper in your publication:
C. Beyan, M. Shahid and V. Murino, "RealVAD: A Real-world Dataset and A Method for Voice Activity Detection by Body Motion Analysis", in IEEE Transactions on Multimedia, 2020.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
We present an Italian YouTube dataset manually annotated for hate speech types and targets. The comments to be annotated were sampled from Italian YouTube comments on videos about the Covid-19 pandemic in the period from January 2020 to May 2020. Two sets were annotated: a training set with 59,870 comments (IMSyPP_IT_YouTube_comments_train.csv) and an evaluation set with 10,536 comments (IMSyPP_IT_YouTube_comments_evaluation.csv). The dataset was annotated by 8 annotators, with each comment annotated by two annotators. It was used to train a classification model for hate speech type detection that is publicly available at: https://huggingface.co/IMSyPP/hate_speech_it. The dataset consists of the following fields:
ID_Commento - YouTube ID of the comment
ID_Video - YouTube ID of the video under which the comment was posted
Testo - text of the comment
Tipo - type of hate speech
Target - target of the hate speech
Additionally, we have included the Italian YouTube data (SR_YT_comments.csv), which was collected in the same period as the training data and annotated using the aforementioned model. The automatically labeled data was used to analyze the relationship between hate speech and misinformation on Italian YouTube; the results of this analysis are presented in the associated paper. The analyzed data are represented with the following fields:
ID_Commento - YouTube ID of the comment
Label - label automatically assigned by the model
is_questionable - the type of channel from which the comment was collected; channels are categorized as spreading either reliable or questionable information
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
If using this dataset, please cite the following paper and the current Zenodo repository.
This dataset is described in detail in the following paper:
[1] Yao, Y., Stebner, A., Tuytelaars, T., Geirnaert, S., & Bertrand, A. (2024). Identifying temporal correlations between natural single-shot videos and EEG signals. Journal of Neural Engineering, 21(1), 016018. doi:10.1088/1741-2552/ad2333
The associated code is available at: https://github.com/YYao-42/Identifying-Temporal-Correlations-Between-Natural-Single-shot-Videos-and-EEG-Signals?tab=readme-ov-file
Introduction
The research work leading to this dataset was conducted at the Department of Electrical Engineering (ESAT), KU Leuven.
This dataset contains electroencephalogram (EEG) data collected from 19 young participants with normal or corrected-to-normal eyesight when they were watching a series of carefully selected YouTube videos. The videos were muted to avoid the confounds introduced by audio. For synchronization, a square box was encoded outside of the original frames and flashed every 30 seconds in the top right corner of the screen. A photosensor, detecting the light changes from this flashing box, was affixed to that region using black tape to ensure that the box did not distract participants. The EEG data was recorded using a BioSemi ActiveTwo system at a sample rate of 2048 Hz. Participants wore a 64-channel EEG cap, and 4 electrooculogram (EOG) sensors were positioned around the eyes to track eye movements.
The dataset includes a total of (19 subjects x 63 min + 9 subjects x 24 min) of data. Further details can be found in the following section.
Content
YouTube Videos: Due to copyright constraints, the dataset includes links to the original YouTube videos along with precise timestamps for the segments used in the experiments. The features proposed in [1] have been extracted and can be downloaded here: https://drive.google.com/file/d/1J1tYrxVizrl1xP-W1imvlA_v-DPzZ2Qh/view?usp=sharing.
Raw EEG Data: Organized by subject ID, the dataset contains EEG segments corresponding to the presented videos. Both EEGLAB .set files (containing metadata) and .fdt files (containing raw data) are provided, which can also be read by popular EEG analysis Python packages such as MNE.
The naming convention links each EEG segment to its corresponding video. E.g., the EEG segment 01_eeg corresponds to video 01_Dance_1, 03_eeg corresponds to video 03_Acrob_1, Mr_eeg corresponds to video Mr_Bean, etc.
The raw data have 68 channels. The first 64 channels are EEG data, and the last 4 channels are EOG data. The position coordinates of the standard BioSemi headcaps can be downloaded here: https://www.biosemi.com/download/Cap_coords_all.xls.
Due to minor synchronization ambiguities, different clocks in the PC and EEG recorder, and missing or extra video frames during playback (which rarely occurred), the length of the EEG data may not perfectly match the corresponding video data. The difference, typically within a few milliseconds, can be resolved by truncating the modality with the excess samples.
Signal Quality Information: A supplementary .txt file detailing potential bad channels. Users can opt to create their own criteria for identifying and handling bad channels.
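The .set/.fdt pairs can be read with, e.g., `mne.io.read_raw_eeglab`. The truncation step for mismatched EEG/video lengths described above can be sketched with NumPy; the video frame rate and the toy durations below are invented for illustration:

```python
import numpy as np

FS_EEG = 2048  # EEG sample rate (Hz), per the recording setup above
FPS = 30       # assumed video frame rate; check the actual clips

# Toy stand-ins: 10.0 s of video frames vs. slightly longer EEG
n_frames = int(10.0 * FPS)
eeg = np.random.randn(68, int(10.004 * FS_EEG))  # 68 = 64 EEG + 4 EOG

# Truncate whichever modality has excess samples so both modalities
# cover the same duration (here: trim the EEG to the video length)
common_sec = min(eeg.shape[1] / FS_EEG, n_frames / FPS)
eeg_aligned = eeg[:, : int(common_sec * FS_EEG)]
print(eeg_aligned.shape)
```

Since the mismatch is only a few milliseconds, truncating the longer modality loses a negligible amount of data while restoring sample-level alignment.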
The dataset is divided into two subsets: Single-shot and MrBean, based on the characteristics of the video stimuli.
Single-shot Dataset
The stimuli of this dataset consist of 13 single-shot videos (63 min in total), each depicting a single individual engaging in various activities such as dancing, mime, acrobatics, and magic shows. All the participants watched this video collection.
Video ID Link Start time (s) End time (s)
01_Dance_1 https://youtu.be/uOUVE5rGmhM 8.54 231.20
03_Acrob_1 https://youtu.be/DjihbYg6F2Y 4.24 231.91
04_Magic_1 https://youtu.be/CvzMqIQLiXE 3.68 348.17
05_Dance_2 https://youtu.be/f4DZp0OEkK4 5.05 227.99
06_Mime_2 https://youtu.be/u9wJUTnBdrs 5.79 347.05
07_Acrob_2 https://youtu.be/kRqdxGPLajs 183.61 519.27
08_Magic_2 https://youtu.be/FUv-Q6EgEFI 3.36 270.62
09_Dance_3 https://youtu.be/LXO-jKksQkM 5.61 294.17
12_Magic_3 https://youtu.be/S84AoWdTq3E 1.76 426.36
13_Dance_4 https://youtu.be/0wc60tA1klw 14.28 217.18
14_Mime_3 https://youtu.be/0Ala3ypPM3M 21.87 386.84
15_Dance_5 https://youtu.be/mg6-SnUl0A0 15.14 233.85
16_Mime_6 https://youtu.be/8V7rhAJF6Gc 31.64 388.61
MrBean Dataset
Additionally, 9 participants watched an extra 24-minute clip from the first episode of Mr. Bean, where multiple (moving) objects may exist and interact, and the camera viewpoint may change. The subject IDs and the signal quality files are inherited from the single-shot dataset.
Video ID Link Start time (s) End time (s)
Mr_Bean https://www.youtube.com/watch?v=7Im2I6STbms 39.77 1495.00
Acknowledgement
This research is funded by the Research Foundation - Flanders (FWO) project No G081722N, junior postdoctoral fellowship fundamental research of the FWO (for S. Geirnaert, No. 1242524N), the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation program (grant agreement No 802895), the Flemish Government (AI Research Program), and the PDM mandate from KU Leuven (for S. Geirnaert, No PDMT1/22/009).
We also thank the participants for their time and effort in the experiments.
Contact Information
Executive researcher: Yuanyuan Yao, yuanyuan.yao@kuleuven.be
Led by: Prof. Alexander Bertrand, alexander.bertrand@kuleuven.be
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Rivas Brousse, I. & Rams, S. (2020). Dataset del Trabajo Fin de Grado "¿Qué puedo aprender en YouTube sobre microscopía escolar?". [Dataset] Zenodo. DOI: 10.5281/zenodo.4300668
In 2024, users engaged more with the videos they watched on YouTube than in the previous year. The average number of interactions on YouTube grew to 2.36 in the last measured year. This is an increase compared to 2023, when content hosted on YouTube received approximately 2.1 interactions (comments, likes, and shares) on average.
This dataset was compiled as a resource for analyzing viewer engagement, sentiment, and discussion trends on the Ben Shapiro YouTube channel over the specified period. It comprises user-generated comments extracted from the channel. The collection process involved first cataloging a comprehensive list of all videos published on the channel. These videos were then categorized into three distinct time frames, and from each time frame the ten videos that garnered the highest number of comments were identified for detailed comment extraction. Videos and their associated comments were extracted using YouTube Data Tools (Rieder, 2015). The dataset was finalized on September 12, 2022, and encompasses 711,909 comments ranging from September 1, 2020, to September 12, 2022. The dataset was uploaded to and analyzed in the 4CAT: Capture & Analysis Toolkit (Peeters & Hagen, 2022).
References:
Peeters, S., & Hagen, S. (2022). The 4CAT Capture and Analysis Toolkit: A Modular Tool for Transparent and Traceable Social Media Research. Computational Communication Research, 4(2), 571–589. https://doi.org/10.5117/CCR2022.2.007.HAGE
Rieder, B. (2015). YouTube Data Tools (1.11) [Computer software].
Open Data Commons Attribution License (ODC-By) v1.0: https://www.opendatacommons.org/licenses/by/1.0/
License information was derived automatically
The collection “Protests Georgia 2019” contains 50 videos (mp4) on protests in July 2019 (mainly in the Tbilisi area) triggered by the actions of the Russian politician Sergei Gavrilov, who visited Georgia in June 2019. We downloaded all data in August 2020 and made screenshots (pdf) of the websites so that the discussion and comments on the individual video posts can be followed. All data is processed in an MS Excel database with metadata.
We collect all videos that are 1) related to the event AND show actions of this event, 2) downloadable, and 3) findable with our search words during a particular period. We strictly aim at a systematic and objective selection and organized storage of protest-related videos. We identify particular event-related search words after intense research on the event. Following the snowball principle, we then start collecting videos with the help of these search words and try to download as much relevant content as possible. However, we cannot guarantee the completeness of protest videos for the particular event. We search for videos and include them in the collection until a certain degree of saturation has been reached. Due to copyright restrictions, we are only allowed to give access to the database of the collected video files, including the hyperlinks with their metadata, and not to the videos themselves.
The videos have been posted mainly by the participants of the events. Therefore, the material is only an extract and biased by the perspective of the single creator.
The collection is part of a larger and ongoing collection of videos on protest events in the post-Soviet region.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Youtube-Dataset for Language Identification in Speech Signals
- for scientific use only, for questions contact: jakob.abesser@idmt.fraunhofer.de
Reference
In case you use this dataset for your research, please cite
Alexandra Draghici, Jakob Abeßer & Hanna Lukashevich: A Study on Spoken Language Identification
using Deep Neural Networks, Proceedings of the Audio Mostly Conference 2020
Dataset
The YouTube News Collection is a collection of videos from various
YouTube news channels. We gathered data from channels such as BBC
News, France24, DW News, and Noticias Telemundo.
- 135,664 .npy files (NumPy matrices exported from Python)
- each .npy file contains a mel spectrogram (see below) of an audio file
- the subfolders "0" - "5" encode the language id:
0 - English
1 - French
2 - German
3 - Greek
4 - Italian
5 - Spanish
Audio Processing
- mono, sample rate 22.05 kHz
- mel spectrogram (librosa Python package)
- window size: 512 samples
- hop size: 441 samples (20 ms)
- 129 mel bands
- file-level spectrograms are normalized to a maximum of 1
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The YouTube-ASMR dataset contains URLs for over 900 hours of ASMR video clips with stereo/binaural audio produced by various YouTube artists. The following paper contains a detailed description of the dataset and how it was compiled:
K. Yang, B. Russell and J. Salamon, "Telling Left from Right: Learning Spatial Correspondence of Sight and Sound", IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Virtual Conference, June 2020.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Emergency evacuations. 126 publicly available videos in which people are, or should be, evacuating. Sources: YouTube and news sites. See detailed information about the videos and the data collection method in the Excel file vanderWal2020-details-onlinerepository.xlsx. A few videos could not be uploaded to figshare; please see the Excel file for the source to download them yourself, or request the complete set as a zip file (the zip file also could not be uploaded to figshare).
The file "2020.03.10 - Análise_Campanhas.xlsx" contains the names of all YouTube videos that were evaluated in the survey, along with their corresponding access links (for the paper; access occurred in March 2020) and the number of shares up to that date. The file "Conjoint Analysis.sav" contains the data collection used for the Conjoint Analysis (Study 2 of the article). If you want the same data in SAV, I can provide it. The file "Appendix.pdf" contains the images used in the data collection.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Overview
This data set consists of links to social network items for 34 different forensic events that took place between August 14th, 2018 and January 6th, 2021. The majority of the text and images are from Twitter (a minor part is from Flickr, Facebook, and Google+), and every video is from YouTube.
Data Collection
We used Social Tracker (https://github.com/MKLab-ITI/mmdemo-dockerized), along with the social media platforms' APIs, to gather most of the collections. For a minor part, we used Twint (https://github.com/twintproject/twint). In both cases, we provided keywords related to the event to retrieve the data.
It is important to mention that, in procedures like this one, usually only a small fraction of the collected data is in fact related to the event and useful for further forensic analysis.
Content
We have data from 34 events, and for each of them we provide the files:
items_full.csv: It contains links to any social media post that was collected.
images.csv: Lists the collected images. Some files include a field called "ItemUrl" that refers to the social network post (e.g., a tweet) that mentions that media item.
video.csv: URLs of YouTube videos that were gathered about the event.
video_tweet.csv: This file contains IDs of tweets and IDs of YouTube videos. A tweet whose ID is in this file has a video in its content; in turn, the link of a YouTube video whose ID is in this file was mentioned by at least one collected tweet. Only two collections have this file.
description.txt: Contains some standard information about the event, and possibly some comments about any specific issue related to it.
Most of the collections do not include all of the files above, due to changes in our collection procedure over the course of this work.
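Because collections vary in which files they include, a loader should probe for each file rather than assume all five exist. A minimal sketch (the per-event folder layout is an assumption, not specified by the authors):

```python
from pathlib import Path
import tempfile

# Files an event folder may contain, per the list above
EXPECTED = ["items_full.csv", "images.csv", "video.csv",
            "video_tweet.csv", "description.txt"]

def inventory(event_dir: Path) -> dict:
    """Report which of the expected files an event collection provides."""
    return {name: (event_dir / name).exists() for name in EXPECTED}

# Tiny fake event folder for demonstration (real folders differ)
root = Path(tempfile.mkdtemp())
(root / "video.csv").write_text("url\nhttps://youtu.be/example\n")

print(inventory(root))
```

Running such an inventory over all 34 event folders gives a quick completeness report before any per-file parsing begins.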
Events
We divided the events into six groups:
- Devastating fire is the main issue of the event, so most of the informative pictures show flames or burned constructions. (14 events)
- Most of the relevant images depict collapsed buildings, bridges, etc. (not caused by fire). (5 events)
- Likely images of guns and police officers; few or no destruction of the environment. (5 events)
- A plethora of people on the streets. Possibly some problem took place, but in most cases the demonstration is the actual event. (7 events)
- Traffic collision: pictures of damaged vehicles in an urban landscape, possibly including images with victims on the street. (1 event)
- Events that range from fierce rain to a tsunami; many pictures depict water. (2 events)
The events are listed in the file recod-ai-events-dataset-list.pdf.
Media Content
Due to the terms of use of the social networks, we do not make the collected texts, images, and videos publicly available. However, extra media content related to one or more events can be provided; please contact the authors.
Funding
DéjàVu thematic project, São Paulo Research Foundation (grants 2017/12646-3, 2018/18264-8 and 2020/02241-9)