The global number of YouTube users was forecast to increase continuously between 2024 and 2029 by a total of ***** million users (+***** percent). After the ninth consecutive year of growth, the YouTube user base is estimated to reach *** billion users, a new peak, in 2029. Notably, the number of YouTube users has increased continuously over the past years. User figures, shown here for the platform YouTube, have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to *** countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information). Find more key insights for the number of YouTube users in regions like Africa and South America.
https://brightdata.com/license
Use our YouTube profiles dataset to extract both business and non-business information from public channels and filter by channel name, views, creation date, or subscribers. Datapoints include URL, handle, banner image, profile image, name, subscribers, description, video count, create date, views, details, and more. You may purchase the entire dataset or a customized subset, depending on your needs. Popular use cases for this dataset include sentiment analysis, brand monitoring, influencer marketing, and more.
Open Data Commons Attribution License (ODC-By) v1.0https://www.opendatacommons.org/licenses/by/1.0/
License information was derived automatically
Studying how YouTube videos go viral or, more generally, how they evolve in terms of views, likes and subscriptions is a topic of interest in many disciplines. This dataset lets you study such phenomena using statistics about 1 million YouTube videos. The information was collected in 2013, when YouTube exposed these data publicly. YouTube has since removed this functionality, and the statistics are now available only to the owner of a video, which makes this dataset unique.
This dataset was generated with YOUStatAnalyzer, a tool I (Mattia Zeni) developed while working for CREATE-NET (www.create-net.org) within the framework of the CONGAS FP7 project (http://www.congas-project.eu). For the project we needed to collect and analyse the dynamics of YouTube video popularity. The dataset contains statistics of more than 1 million YouTube videos, chosen according to random keywords extracted from the WordNet library (http://wordnet.princeton.edu).
The motivation for developing the YOUStatAnalyzer data collection tool and creating this dataset is that an active research community works on the interplay among individual user preferences, social dynamics and advertising mechanisms, and a common problem is the lack of open large-scale datasets. No comparable tool existed at that time. YouTube has since removed the possibility to view these data on each video's page, making this dataset unique.
When using our dataset for research purposes, please cite it as:
@INPROCEEDINGS{YOUStatAnalyzer,
author={Mattia Zeni and Daniele Miorandi and Francesco {De Pellegrini}},
title = {{YOUStatAnalyzer}: a Tool for Analysing the Dynamics of {YouTube} Content Popularity},
booktitle = {Proc. 7th International Conference on Performance Evaluation Methodologies and Tools
(Valuetools, Torino, Italy, December 2013)},
address = {Torino, Italy},
year = {2013}
}
The dataset contains statistics and metadata of 1 million YouTube videos, collected in 2013. The videos were chosen according to random keywords extracted from the WordNet library (http://wordnet.princeton.edu).
The structure of a dataset entry is the following:
{
u'_id': u'9eToPjUnwmU',
u'title': u'Traitor Compilation # 1 (Trouble ...',
u'description': u'A traitor compilation by one are ...',
u'category': u'Games',
u'commentsNumber': u'6',
u'publishedDate': u'2012-10-09T23:42:12.000Z',
u'author': u'ServilityGaming',
u'duration': u'208',
u'type': u'video/3gpp',
u'relatedVideos': [u'acjHy7oPmls', u'EhW2LbCjm7c', u'UUKigFAQLMA', ...],
u'accessControl': {
u'comment': {u'permission': u'allowed'},
u'list': {u'permission': u'allowed'},
u'videoRespond': {u'permission': u'moderated'},
u'rate': {u'permission': u'allowed'},
u'syndicate': {u'permission': u'allowed'},
u'embed': {u'permission': u'allowed'},
u'commentVote': {u'permission': u'allowed'},
u'autoPlay': {u'permission': u'allowed'}
},
u'views': {
u'cumulative': {
u'data': [15.0, 25.0, 26.0, 26.0, ...]
},
u'daily': {
u'data': [15.0, 10.0, 1.0, 0.0, ...]
}
},
u'shares': {
u'cumulative': {
u'data': [0.0, 0.0, 0.0, 0.0, ...]
},
u'daily': {
u'data': [0.0, 0.0, 0.0, 0.0, ...]
}
},
u'watchtime': {
u'cumulative': {
u'data': [22.5666666667, 36.5166666667, 36.7, 36.7, ...]
},
u'daily': {
u'data': [22.5666666667, 13.95, 0.166666666667, 0.0, ...]
}
},
u'subscribers': {
u'cumulative': {
u'data': [0.0, 0.0, 0.0, 0.0, ...]
},
u'daily': {
u'data': [-1.0, 0.0, 0.0, 0.0, ...]
}
},
u'day': {
u'data': [1349740800000.0, 1349827200000.0, 1349913600000.0, 1350000000000.0, ...]
}
}
The structure above shows which fields an entry in the dataset has. They can be divided into 2 sections:
1) Video Information.
_id -> Corresponding to the video ID and to the unique identifier of an entry in the database.
title -> The video's title.
description -> The video's description.
category -> The YouTube category the video is inserted in.
commentsNumber -> The number of comments posted by users.
publishedDate -> The date the video has been published.
author -> The author of the video.
duration -> The video duration in seconds.
type -> The encoding type of the video.
relatedVideos -> A list of related videos.
accessControl -> A list of access policies for different aspects related to the video.
2) Video Statistics.
Each video can have 4 different statistics variables: views, shares, subscribers and watchtime. Recent videos have all of them, while older videos may have only the 'views' variable. Each variable has 2 dimensions, daily and cumulative.
`views -> number of views collected by the vi...
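As a rough illustration of how the statistics section of an entry might be read, the sketch below pairs each value of a daily series with its calendar day. It assumes that the `day` values are Unix timestamps in milliseconds (the first sample value corresponds to the `publishedDate` shown above) and that an older entry simply lacks the keys for variables it does not have. The helper names `day_labels`, `daily_series` and `cumulative_matches_daily` are hypothetical and are not part of YOUStatAnalyzer.

```python
from datetime import datetime, timezone
from itertools import accumulate

# Shortened copy of the sample entry above, written as a plain Python 3 dict.
entry = {
    "_id": "9eToPjUnwmU",
    "views": {
        "cumulative": {"data": [15.0, 25.0, 26.0, 26.0]},
        "daily": {"data": [15.0, 10.0, 1.0, 0.0]},
    },
    "day": {"data": [1349740800000.0, 1349827200000.0,
                     1349913600000.0, 1350000000000.0]},
}

def day_labels(entry):
    """Convert the 'day' timestamps (assumed to be milliseconds) to ISO dates."""
    return [
        datetime.fromtimestamp(ms / 1000, tz=timezone.utc).date().isoformat()
        for ms in entry["day"]["data"]
    ]

def daily_series(entry, variable):
    """Return (date, value) pairs for one variable, or [] if it is absent."""
    stats = entry.get(variable)
    if stats is None:
        return []
    return list(zip(day_labels(entry), stats["daily"]["data"]))

def cumulative_matches_daily(entry, variable, tol=1e-6):
    """Check whether the cumulative series is the running sum of the daily one."""
    daily = entry[variable]["daily"]["data"]
    cumulative = entry[variable]["cumulative"]["data"]
    return all(abs(a - b) <= tol for a, b in zip(accumulate(daily), cumulative))

print(daily_series(entry, "views"))
print(cumulative_matches_daily(entry, "views"))
```

In the sample entry the cumulative views equal the running sum of the daily views, but the sample subscribers series does not follow that pattern, so the check is better treated as a diagnostic than as a guaranteed property of the data.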
The number of YouTube users in India was forecast to increase continuously between 2024 and 2029 by a total of ***** million users (+***** percent). After the ninth consecutive year of growth, the YouTube user base is estimated to reach ****** million users, a new peak, in 2029. Notably, the number of YouTube users has increased continuously over the past years. User figures, shown here for the platform YouTube, have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to *** countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information). Find more key insights for the number of YouTube users in countries like Sri Lanka and Nepal.
https://creativecommons.org/publicdomain/zero/1.0/
YouTube social network and ground-truth communities
Dataset information
YouTube is a video-sharing web site that includes a social network. In the YouTube social network, users form friendships with each other, and users can create groups which other users can join. We consider such user-defined groups as ground-truth communities. This data is provided by Alan Mislove et al.
We regard each connected component in a group as a separate ground-truth community. We remove the ground-truth communities which have fewer than 3 nodes. We also provide the top 5,000 communities with highest quality, which are described in our paper. As for the network, we provide the largest connected component.
more info : https://snap.stanford.edu/data/com-Youtube.html
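The community construction described above (one ground-truth community per connected component of a group, dropping components with fewer than 3 nodes) can be sketched as follows. This is a minimal illustration under assumptions, not the code used to build the dataset: the function name `split_into_communities` and the in-memory edge and member lists are hypothetical, and reading the actual files is left out.

```python
from collections import defaultdict, deque

def split_into_communities(edges, group_members, min_size=3):
    """Split one user-defined group into its connected components.

    edges: iterable of (u, v) friendship pairs from the whole network.
    group_members: node ids that joined the group.
    Components with fewer than min_size nodes are dropped.
    """
    members = set(group_members)
    # Keep only friendships where both endpoints belong to the group.
    adjacency = defaultdict(set)
    for u, v in edges:
        if u in members and v in members:
            adjacency[u].add(v)
            adjacency[v].add(u)

    seen = set()
    communities = []
    for start in sorted(members):
        if start in seen:
            continue
        # Breadth-first search collects one connected component.
        seen.add(start)
        component = [start]
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    component.append(neighbour)
                    queue.append(neighbour)
        if len(component) >= min_size:
            communities.append(sorted(component))
    return communities

# Toy example: nodes 1-3 form one component, 4-5 are too few to keep.
toy_edges = [(1, 2), (2, 3), (4, 5), (3, 6)]
print(split_into_communities(toy_edges, [1, 2, 3, 4, 5]))
```

Node 6 is a friend of node 3 but not a member of the group, so it is ignored when the components are computed.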
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Over the past few years YouTube has become a popular site for broadcasting videos and for earning money by showcasing various skills in video form. For some people it has become a main source of income. Getting videos to trend among viewers is a major goal of every content creator. The popularity of a video and its reach depend heavily on YouTube's recommendation algorithm. This document is a dataset descriptor for the dataset collected over a time span of about 45 days during the Israel-Hamas War.
By VISHWANATH SESHAGIRI [source]
The YouTube Video and Channel Metadata dataset is a comprehensive collection of data related to YouTube videos and channels. It consists of various features and statistics that provide insights into the performance and engagement of videos, as well as the overall popularity and success of channels.
The dataset includes both direct features, such as total views, channel elapsed time, channel ID, video category ID, channel view count, likes per subscriber, dislikes per subscriber, comments per subscriber, and more. Additionally, there are indirect features derived from YouTube's API that provide additional metrics for analysis.
One important aspect covered in this dataset is the ratio between certain metrics. For example:
- The totalviews/channelelapsedtime ratio represents the average number of views a video has received relative to the elapsed time since the channel was created.
- The likes/dislikes ratio indicates the proportion of likes on a video compared to dislikes.
- The views/subscribers ratio showcases how engaged subscribers are by measuring the number of views relative to the number of subscribers.
Other metrics explored in this dataset include comments/views ratio (representing viewer engagement), dislikes/views ratio (measuring viewer sentiment), comments/subscriber ratio (indicating community participation), likes/subscriber ratio (reflecting audience loyalty), dislikes/subscriber ratio (highlighting dissatisfaction levels), total number of subscribers for a channel (subscriberCount), total views on a channel (channelViewCount), total number of comments on a channel (channelCommentCount), among others.
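To make the ratio features concrete, the sketch below derives several of them from raw counts. The input keys (`views`, `likes`, `dislikes`, `comments`, `subscribers`) and the function names are assumptions chosen for illustration; only the output names follow the ratios named above. A zero denominator yields `None` instead of raising an error, which is one possible convention among several.

```python
def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None

def engagement_ratios(video):
    """Compute ratio features from raw counts (input keys are hypothetical)."""
    return {
        "likes/dislikes": safe_ratio(video["likes"], video["dislikes"]),
        "views/subscribers": safe_ratio(video["views"], video["subscribers"]),
        "comments/views": safe_ratio(video["comments"], video["views"]),
        "dislikes/views": safe_ratio(video["dislikes"], video["views"]),
        "likes/subscriber": safe_ratio(video["likes"], video["subscribers"]),
        "comments/subscriber": safe_ratio(video["comments"], video["subscribers"]),
        "dislikes/subscriber": safe_ratio(video["dislikes"], video["subscribers"]),
    }

# Made-up counts, used only to show the arithmetic.
example = {"views": 1000, "likes": 50, "dislikes": 5,
           "comments": 20, "subscribers": 200}
for name, value in engagement_ratios(example).items():
    print(name, value)
```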
By analyzing these features and statistics within this dataset, researchers or data analysts can gain valuable insights into various aspects related to YouTube videos and channels. Furthermore, it may be possible to build statistical relationships between videos based on their performance characteristics or even develop topic trees based on similarities between different content categories. This dataset serves as an excellent resource for studying YouTube's ecosystem comprehensively.
For accessing additional resources related to this dataset or exploring code repositories associated with it, users can refer to the provided GitHub repository
Introduction:
Step 1: Understanding the Dataset
Start by familiarizing yourself with the columns in the dataset. Here are some key features to pay attention to:
- totalviews/channelelapsedtime: The ratio of total views of a video to the elapsed time of the channel.
- channelViewCount: The total number of views on the channel.
- likes/subscriber: The ratio of likes on a video to the number of subscribers of the channel.
- views/subscribers: The ratio of views on a video to the number of subscribers of the channel.
- subscriberCount: The total number of subscribers for a channel.
- dislikes/views: The ratio of dislikes on a video to its total views.
- comments/subscriber: The ratio of comments on a video to the number of subscribers of the channel.
Step 2: Determining Data Analysis Objectives
Define your objectives or research questions before diving into data analysis using this dataset. For example, you may want to explore relationships between viewership, engagement metrics, and various attributes such as category ID or elapsed time.
Step 3: Analyzing Relationships between Variables
Use statistical techniques like correlation analysis or visualization tools like scatter plots, bar graphs, or heatmaps to understand relationships between variables in this dataset.
For example:
- Plotting totalviews/channelelapsedtime against channelViewCount can help identify patterns between overall video popularity and channels' view count growth over time.
- Comparing likes/dislikes with comments/views can give insights into viewer engagement levels across different videos.
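The correlation analysis mentioned in Step 3 can be sketched with a small Pearson correlation function. The numbers below are invented for illustration and are not taken from the dataset; in practice the two lists would hold two columns such as likes/dislikes and comments/views.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / (spread_x * spread_y)

# Invented values standing in for two columns of the dataset.
likes_per_dislike = [10.0, 12.5, 8.0, 20.0, 15.0]
comments_per_view = [0.020, 0.024, 0.015, 0.041, 0.030]
print(round(pearson(likes_per_dislike, comments_per_view), 3))
```

A value near 1 or -1 suggests a strong linear relationship, while a value near 0 suggests little linear relationship; it says nothing about causation.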
Step 4: Building Machine Learning Models (Optional)
If your objective includes predictive analysis or building machine learning models, select relevant features as predictors and the target variable (e.g., totalviews/channelelapsedtime) for training and evaluation.
You can use various algorithms such as linear regression, decision trees, or neural networks to predict video performance or channel growth based on available attributes.
Step 5: Evaluating Model Performance
Assess the predictive model's performance using appropriate evaluation metrics like mean square...
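Steps 4 and 5 can be illustrated with the simplest of the models listed, a linear regression with a single predictor, evaluated with the mean squared error. This is a sketch under assumptions: the function names are hypothetical, the data are invented, and a real analysis would hold out separate data for evaluation rather than scoring on the training points.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    variance_x = sum((x - mean_x) ** 2 for x in xs)
    if variance_x == 0:
        raise ValueError("predictor is constant, slope is undefined")
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = covariance / variance_x
    intercept = mean_y - slope * mean_x
    return slope, intercept

def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# Invented predictor (e.g. views/subscribers) and target values.
predictor = [1.0, 2.0, 3.0, 4.0]
target = [3.1, 4.9, 7.2, 8.8]

slope, intercept = fit_line(predictor, target)
predictions = [slope * x + intercept for x in predictor]
print(slope, intercept, mean_squared_error(target, predictions))
```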
MIT Licensehttps://opensource.org/licenses/MIT
License information was derived automatically
YouTube is the world's largest video-sharing platform, launched in 2005. It allows users to upload, view, and share videos, and has grown to be a central hub for content creators across various fields, including entertainment, education, music, and more. With over 2 billion logged-in users monthly, YouTube has become an essential platform for digital content and marketing.
The Top 1000 YouTube Channels Dataset captures detailed information about the top-performing YouTube channels globally. This dataset includes the following columns:
This dataset is invaluable for analyzing trends, understanding content strategies, and benchmarking channel performances within the YouTube ecosystem.
Creative Commons YouTube
Description
YouTube is a large-scale video-sharing platform where users have the option of uploading content under a CC BY license. To collect high-quality speech-based textual content and combat the rampant license laundering on YouTube, we manually curated a set of over 2,000 YouTube channels that consistently release original openly licensed content containing speech. The resulting collection spans a wide range of genres, including lectures… See the full description on the dataset page: https://huggingface.co/datasets/common-pile/youtube.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The COVYT dataset contains speech samples from individuals who self-reported their COVID-19 infection on public social media platforms (YouTube, Xiaohongshu). These videos, as well as accompanying videos of the same people prior to infection, were mined in an attempt to gather publicly-available data for COVID-19 research. This release includes the links to the original videos along with the accompanying manual segmentation and diarisation that identifies the utterances of the target individuals. We are additionally releasing features derived from the segmented utterances. Finally, the dataset includes partitioning information according to 4 different cross-validation schemes. See the arxiv pre-print for more details: https://arxiv.org/abs/2206.11045
By VISHWANATH SESHAGIRI [source]
This dataset contains YouTube video and channel metadata to analyze the statistical relation between videos and form a topic tree. With 9 direct features, 13 more indirect features, it has all that you need to build a deep understanding of how videos are related – including information like total views per unit time, channel views, likes/subscribers ratio, comments/views ratio, dislikes/subscribers ratio etc. This data provides us with a unique opportunity to gain insights on topics such as subscriber count trends over time or calculating the impact of trends on subscriber engagement. We can develop powerful models that show us how different types of content drive viewership and identify the most popular styles or topics within YouTube's vast catalogue. Additionally this data offers an intriguing look into consumer behaviour as we can explore what drives people to watch specific videos at certain times or appreciate certain channels more than others - by analyzing things like likes per subscribers and dislikes per views ratios for example! Finally this dataset is completely open source with an easy-to-understand Github repo making it an invaluable resource for anyone looking to gain better insights into how their audience interacts with their content and how they might improve it in the future
How to Use This Dataset
In general, it is important to understand each parameter in the data set before proceeding with analysis. The parameters included are totalviews/channelelapsedtime, channelViewCount, likes/subscriber, views/subscribers, subscriberCount, dislikes/views, comments/subscriber, channelCommentCount, likes/dislikes, comments/views, dislikes/subscriber, totviews/totsubs, and views/elapsedtime.
To use this dataset for your own analysis:
1) Review each parameter’s meaning and purpose in our dataset;
2) Get familiar with basic descriptive statistics such as mean, median, mode, and range;
3) Create visualizations or tables based on subsets of our data;
4) Understand correlations between different sets of variables or parameters;
5) Generate meaningful conclusions about specific channels or topics based on organized graph hierarchies or tables;
6) Analyze trends over time for individual parameters as well as an aggregate reaction from all users when videos are released.
Predicting the Relative Popularity of Videos: This dataset can be used to build a statistical model that can predict the relative popularity of videos based on various factors such as total views, channel viewers, likes/dislikes ratio, and comments/views ratio. This model could then be used to make recommendations and predict which videos are likely to become popular or go viral.
Creating Topic Trees: The dataset can also be used to create topic trees or taxonomies by analyzing the content of videos and looking at what topics they cover. For example, one could analyze the most popular YouTube channels in a specific subject area, group together those that discuss similar topics, and then build an organized tree structure around those topics in order to better understand viewer interests in that area.
Viewer Engagement Analysis: This dataset could also be used for viewer engagement analysis by examining factors such as subscriber count, average time spent watching a video per user (elapsed time), and comments made per view, so as to gain insights into how engaged viewers are with specific content or channels on YouTube. From this information it would be possible to optimize content strategy accordingly in order to improve overall engagement rates across various types of video content and channel types.
If you use this dataset in your research, please credit the original authors.
License
Unknown License - Please check the dataset description for more information.
File: YouTubeDataset_withChannelElapsed.csv

| Column name | Description |
|:----------------------------------|:-------------------------------------------------------|
| totalviews/channelelapsedtime | Ratio of total views to channel elapsed time. (Ratio) |
| channelViewCount | Total number of views for the channel. (Integer) |
| likes/subscriber ...
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Allegedly, the online suicide game "Momo Challenge" dares players to perform self-harming tasks and, ultimately, commit suicide. Journalists have criticized the large number of YouTube videos in which YouTubers promote the challenge by passing on the phone numbers to their viewers. However, empirical knowledge about this and similar cyber threats is lacking. This data set was created to give insight into the reach of the Momo Challenge on YouTube, how users form communities around this video material, and to what extent it puts them at risk. It contains the results of a data crawl with NodeXL. Using the keywords ‘Momo Challenge English’, we ran the crawl for titles, descriptions, and tags at the turn of 2018/2019. We identified 487 videos, from which we manually removed those unrelated to the challenge. The remaining data set consists of 209 videos.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
News dissemination plays a vital role in supporting people to incorporate beneficial actions during public health emergencies, thereby significantly reducing the adverse influences of events. Based on big data from YouTube, this research study takes the declaration of COVID-19 National Public Health Emergency (PHE) as the event impact and employs a DiD model to investigate the effect of PHE on the news dissemination strength of relevant videos. The study findings indicate that the views, comments, and likes on relevant videos significantly increased during the COVID-19 public health emergency. Moreover, the public’s response to PHE has been rapid, with the highest growth in comments and views on videos observed within the first week of the public health emergency, followed by a gradual decline and returning to normal levels within four weeks. In addition, during the COVID-19 public health emergency, in the context of different types of media, lifestyle bloggers, local media, and institutional media demonstrated higher growth in the news dissemination strength of relevant videos as compared to news & political bloggers, foreign media, and personal media, respectively. Further, the audience attracted by related news tends to display a certain level of stickiness, therefore this audience may subscribe to these channels during public health emergencies, which confirms the incentive mechanisms of social media platforms to foster relevant news dissemination during public health emergencies. The proposed findings provide essential insights into effective news dissemination in potential future public health events.
Apache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
The dataset provides a comprehensive overview of leading YouTube channels, capturing key metrics such as subscriber counts, video views, and estimated annual earnings. It includes information on the channel's category, number of uploads, and geographical data like country and urban population. Additionally, socio-economic indicators such as gross tertiary education enrollment, unemployment rate, and development status of the channel's country are included. For instance, T-Series, the top-ranked channel, has 245 million subscribers and 228 billion video views, generating significant annual earnings. This dataset is invaluable for analyzing the dynamics of content creation on YouTube and understanding how geographical and economic factors influence channel success.
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The dataset was first introduced in the following paper: Siqi Wu and Paul Resnick. Cross-Partisan Discussions on YouTube: Conservatives Talk to Liberals but Liberals Don't Talk to Conservatives. In AAAI International Conference on Weblogs and Social Media (ICWSM), 2021.
us_partisan.csv: Metadata for 1,267 US partisan media on YouTube. The first row is the header. Fields include "title, url, channel_title, channel_id, leaning, type, source, channel_description".
video_meta.csv: Metadata for 274,241 YouTube political videos from US partisan media. The first row is the header. Fields include "video_id, channel_id, media_leaning, media_type, num_view, num_comment, num_cmt_from_liberal, num_cmt_from_conservative, num_cmt_from_unknown".
user_comment_meta.csv.bz2: Metadata for 9,304,653 YouTube users who have commented on YouTube political videos. The first row is the header. Fields include "hashed_user_id, predicted_user_leaning, num_comment, num_cmt_on_left, num_cmt_on_right".
user_comment_trace.tsv.bz2: Comment trace for 9,304,653 YouTube users who have commented on YouTube political videos. The first row is the header. Fields include "hashed_user_id predicted_user_leaning comment_trace" (split by \t). "comment_trace" consists of "channel_id1,num_comment_on_this_channel1;channel_id2,num_comment_on_this_channel2;..." (split by ;).
trained_HAN_models.tar.bz2: Five trained HAN models for predicting user political leanings. Each model consists of a ".h5" model file and a ".tokenizer" tokenizer file. See this for how to use our pre-trained HAN models. See more details in this data description.
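The comment-trace layout described above (three tab-separated fields, with the trace itself a semicolon-separated list of "channel_id,count" pairs) could be parsed as in the following sketch. The function name `parse_trace_line` and the sample line are hypothetical; in particular, the user id, leaning label and channel ids shown are made up, and decompressing the .bz2 file and skipping the header row are left out.

```python
def parse_trace_line(line):
    """Parse one data row of the comment trace file.

    Expected layout (from the description above):
    hashed_user_id <TAB> predicted_user_leaning <TAB> comment_trace
    where comment_trace is "channel_id,count;channel_id,count;...".
    """
    hashed_user_id, leaning, trace = line.rstrip("\n").split("\t")
    comments_per_channel = {}
    for item in trace.split(";"):
        if not item:
            continue  # tolerate a trailing semicolon
        # Split on the last comma so the count is always the final part.
        channel_id, count = item.rsplit(",", 1)
        comments_per_channel[channel_id] = int(count)
    return hashed_user_id, leaning, comments_per_channel

# Made-up row, for illustration only.
sample = "user123\tL\tUCaaa,3;UCbbb,1\n"
print(parse_trace_line(sample))
```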
Open Government Licence - Canada 2.0https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
Have you ever wanted to create your own maps, or integrate and visualize spatial datasets to examine changes in trends between locations and over time? Follow along with these training tutorials on QGIS, an open source geographic information system (GIS) and learn key concepts, procedures and skills for performing common GIS tasks – such as creating maps, as well as joining, overlaying and visualizing spatial datasets. These tutorials are geared towards new GIS users. We’ll start with foundational concepts, and build towards more advanced topics throughout – demonstrating how with a few relatively easy steps you can get quite a lot out of GIS. You can then extend these skills to datasets of thematic relevance to you in addressing tasks faced in your day-to-day work.
In 2021, YouTube's user base in Vietnam amounted to approximately ***** million users. The number of YouTube users in Vietnam is projected to reach ***** million users by 2025. User figures have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to *** countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information).
The increasing popularity and use of digital platforms and social media such as WhatsApp, Facebook, YouTube and Instagram are opening up new opportunities for children, young people and adults to pursue cultural interests or to stage themselves aesthetically. If we focus on young people between the ages of 12 and 19, a number of studies on media use show that YouTube in particular has become the leading medium for this age group. Given the growth in importance of this web video platform, questions arise about the receptive and productive content of experience and the significance of cultural content and practices. Furthermore, there are hardly any findings on the extent to which YouTube stimulates young people to engage in cultural activities and self-organized learning processes. The sample is composed of n=818 adolescents aged 12-19 years. The selection of the study units was based on a quota procedure. The adolescent target subjects were recruited via the IFAK interviewer staff according to predefined quotas for age, gender, region, place size class, type of school attended (for students), and occupation (for non-students). The characteristics "age and gender" and "region and place size" were crossed or combined with each other to produce as accurate a representation of the population as possible. The characteristic "migration background" was not used as a quota characteristic. The specifications for this are based on the latest data from the Federal Statistical Office and ma Radio 2018 II. The structural composition of the sample corresponds to the data for the population according to the characteristics mentioned. The study was conducted as a face-to-face oral survey. The answers of the young people were recorded by an interviewer on a laptop via a corresponding survey program. 111 face-to-face interviewers from the in-house interviewing staff, who have experience in interviewing children and adolescents, were used. 
The predefined questionnaire was binding for all interviewers with regard to the wording and sequence of questions. The maximum number of interviews per interviewer was n=10. Each interviewer received a detailed written briefing on the project at the beginning of the study.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
NCSRD-DS-5GDDos v2.0 Dataset
===
NCSRD-DS-5GDDos is a comprehensive dataset recorded in a real-world 5G testbed that aligns with the 3GPP specifications. The dataset captures Distributed Denial of Service (DDoS) attacks initiated by malicious connected users (UEs).
The setup comprises 3 cells with a total of 9 UEs connected to the same core network. The 5G network is implemented by the Amarisoft Callbox Mini solution (cell 2), and we further employ the Amarisoft Classic (cells 1 & 3), which also hosts the 5G core.
The setup uses a broad set of UE devices comprising smartphones (Huawei P40), microcomputers (Raspberry Pi 4 - Waveshare 5G Hat M2), industrial 5G routers (Industrial Waveshare 5G Router), a WiFi-6 mobile hotspot (DWR-2101 5G Wi-Fi 6 Mobile Hotspot) and a CPE box (Waveshare 5G CPE Box). All UEs are operated by subsidiary hosts, which are responsible for generating traffic at scheduled communication times.
All identifiers are artificially generated and neither represent nor are based on personal data. Because the vendor implementation uses the same IMSI for all UEs, we identify each UE through its ‘imeisv’ ID, which corresponds to the device in use.
This dataset captures attack data from a total of 5 malicious User Equipment (UE) devices that initiated various flooding attacks on a 5G network. Each record includes key identifiers such as the IMEISV (International Mobile Equipment Identity Software Version number) and IP address of the attacking UE, along with the device type. The file "summary_report.csv" summarizes this information. The traffic types used in the attacks include SYN flooding, UDP flooding, ICMP flooding, DNS flooding, and GTP-U flooding. The benign users stream YouTube and Skype traffic.
The dataset is recorded by a data collector that interfaces with the 5G network and gathers data on the UEs, gNBs and the Core Network. The data are recorded in InfluxDB and pre-processed into three separate tabular .csv files for more efficient processing: “amari_ue_data.csv”, “enb_counters.csv” and “mme_counters.csv”. In this version, we use an Amarisoft Classic (cells 1 & 3, Core Network) and an Amarisoft Mini (cell 2) (more information on the products can be found at https://www.amarisoft.com/).
The “amari_ue_data.csv” file provides information on the UEs regarding identification (“imeisv”, “5g_tmsi”, “rnti”), IP addressing, bearer information, tracking area and network information (“tac”, “ran_plmn”), and cell information (“ul_bitrate”, “dl_bitrate”, “cell_id”, retransmissions per user per cell “ul_retx”, as well as aggregated bit rates for each cell).
The “enb_counters.csv” file focuses on cell-level information, providing downlink and uplink bitrates, the usage ratio per user, and the CPU load of the gNB.
We provide separate “amari_ue_data.csv” and “enb_counters.csv” files generated from each gNB (Amarisoft Classic and Mini).
The “mme_counters.csv” file provides information on the Non-Access Stratum (NAS) of the 5G network and focuses on session status reports (e.g., number of PDU session establishments, paging, context setup). This part gives an overview of connection management throughout the recording session and provides information on features suggested by 3GPP for detecting abnormal user behavior.
We also provide a separate pre-processed dataset that merges the two "amari_ue_data_*.csv" files and labels the malicious and benign samples, which may be more convenient for interested data scientists.
Please refer to README.txt for the features included in each file.
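
As a rough illustration of how the per-UE file might be explored, the sketch below averages the uplink bitrate per “imeisv” and lists UEs above a cutoff. The column names “imeisv”, “cell_id”, “ul_bitrate” and “dl_bitrate” come from the description above; the sample rows, the exact column layout, the function names and the threshold are assumptions made for illustration only, and the threshold is not a detection method proposed by the dataset authors. README.txt remains the reference for the actual features.

```python
import csv
import io
from collections import defaultdict

# Synthetic rows for illustration only; real values come from amari_ue_data.csv.
SAMPLE_CSV = """imeisv,cell_id,ul_bitrate,dl_bitrate
0000000000000001,1,1200.0,300.0
0000000000000001,1,1800.0,500.0
0000000000000002,2,90000.0,100.0
0000000000000002,2,110000.0,200.0
"""


def mean_ul_bitrate_per_ue(csv_text):
    """Return {imeisv: mean ul_bitrate} for CSV text with a header row."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        ue = row["imeisv"]
        totals[ue] += float(row["ul_bitrate"])
        counts[ue] += 1
    return {ue: totals[ue] / counts[ue] for ue in totals}


def flag_high_uplink(means, threshold):
    """Return the sorted UEs whose mean uplink bitrate exceeds the threshold."""
    return sorted(ue for ue, value in means.items() if value > threshold)


means = mean_ul_bitrate_per_ue(SAMPLE_CSV)
# The threshold is an arbitrary, hypothetical cutoff chosen for this example.
suspects = flag_high_uplink(means, threshold=50000.0)
print(means)
print(suspects)
```

To run this against the real data, the sample string could be replaced by the contents of one of the provided files, for instance by reading it with `open(path, newline="").read()`.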
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Sports engage billions of followers worldwide and impact the
economy. Sports controversies often ignite passionate discussions
among fans, analysts, and players. With the rise of social
media, platforms like YouTube have become central to these discussions.
This study performs stance detection (opinion mining with the labels
for, against, and neutral) on comments from popular social media
platforms like YouTube about famous public sports controversies.
To our knowledge, this is the first study and hand-curated dataset of civic
engagement in controversial sports events spanning around 40 years.
LLMs (Llama and DeepSeek reasoning family) were used for the initial
stance annotation of comments and were later fine-tuned for a comparative performance analysis (30% boost in accuracy).
This dataset presents a collection of YouTube comments (around 43k) on famous
and controversial Public Sports Events.
We explore public sentiment (stance detection) on a total of 6 famous controversial
sports incidents by extracting and processing YouTube comments.
Stance detection is performed on those events by fine-tuning
models like Llama-3.1-8b and DeepSeek reasoning models (Llama-8b
distilled) on comments from events such as The Underarm Incident,
Jonny Bairstow’s Run-Out Incident, Ashwin’s Mankading Event, and the
Luis Suarez Handball Event.
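
The evaluation step implied above (comparing model-produced stances with hand-curated ones) could look roughly like the following sketch. The three labels come from the description; the function names, the prefix-matching rule and the example strings are hypothetical, and this is not the authors' annotation or fine-tuning pipeline.

```python
VALID_STANCES = ("for", "against", "neutral")


def normalize_stance(raw_output):
    """Map free-form model output to one of the three stance labels, or None.

    Uses a simple prefix match, which is an assumption for this sketch;
    real model output may need more careful parsing.
    """
    text = raw_output.strip().lower()
    for label in VALID_STANCES:
        if text.startswith(label):
            return label
    return None


def accuracy(predicted, gold):
    """Fraction of positions where the predicted label equals the gold label."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        return 0.0
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)


# Made-up model outputs and gold labels, for illustration only.
raw_outputs = ["For", "against.", "Neutral - no clear view", "unsure"]
gold_labels = ["for", "against", "against", "neutral"]

predictions = [normalize_stance(r) for r in raw_outputs]
print(predictions)
print(accuracy(predictions, gold_labels))
```

Unparseable outputs are mapped to `None` and therefore count as errors, which keeps the accuracy figure conservative.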