The proposed Extended-YouTube Faces (E-YTF) dataset is an extension of the well-known YouTube Faces (YTF) dataset and is specifically designed to push the challenges of face recognition further by addressing the problem of open-set face identification from heterogeneous data, i.e., still images vs. video.
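For readers unfamiliar with the open-set setting, the sketch below illustrates the protocol the description refers to: a probe face embedding is matched against a gallery of enrolled identities, and probes whose best similarity falls below a threshold are rejected as unknown. The embedding dimensionality, threshold value, and subject names are illustrative assumptions, not part of E-YTF.

```python
# Minimal open-set identification sketch (illustrative only, not E-YTF protocol
# code): accept the best gallery match only if it clears a similarity threshold.
import numpy as np

def identify_open_set(probe, gallery, threshold=0.5):
    """Return the best-matching identity, or 'unknown' if no gallery
    template is similar enough (the open-set rejection step)."""
    def cosine(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    best_id, best_sim = "unknown", threshold
    for identity, template in gallery.items():
        sim = cosine(probe, template)
        if sim > best_sim:
            best_id, best_sim = identity, sim
    return best_id

# Toy usage: still-image templates in the gallery, a noisy video-frame probe.
rng = np.random.default_rng(0)
gallery = {"subject_A": rng.normal(size=128), "subject_B": rng.normal(size=128)}
probe = gallery["subject_A"] + 0.1 * rng.normal(size=128)  # same identity, perturbed
print(identify_open_set(probe, gallery))  # -> subject_A
```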
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
There is surprisingly little empirical evidence supporting theoretical and anecdotal claims regarding the spontaneous production of prototypic facial expressions used in numerous emotion recognition studies. Proponents of innate prototypic expressions believe that this lack of evidence may be due to ethical restrictions against presenting powerful elicitors in the lab. The current popularity of internet platforms designed for public sharing of videos allows investigators to shed light on this debate by examining naturally-occurring facial expressions outside the laboratory. An Internet prank (“Scary Maze”) has provided a unique opportunity to observe children reacting to a consistent fear- and surprise-inducing stimulus: The unexpected presentation of a “scary face” during an online maze game. The purpose of this study was to examine children’s facial expressions in this naturalistic setting. Emotion ratings of non-facial behaviour (provided by untrained undergraduates) and anatomically-based facial codes were obtained from 60 videos of children (ages 4–7) found on YouTube. Emotion ratings were highest for fear and surprise. Correspondingly, children displayed more facial expressions of fear and surprise than of other emotions (e.g. anger, joy). These findings provide partial support for the ecological validity of fear and surprise expressions. Still, prototypic expressions were produced by fewer than half of the children.
FaceForensics is a video dataset consisting of more than 500,000 frames containing faces from 1004 videos that can be used to study image or video forgeries. All videos are downloaded from YouTube and are cut down to short continuous clips that contain mostly frontal faces. This dataset has two versions:
Source-to-Target: the authors reenact over 1000 videos with new facial expressions extracted from other videos; this version can be used, e.g., to train a classifier to detect fake images or videos (a sketch of this use case follows below).
Self-reenactment: the authors use Face2Face to reenact the facial expressions of videos with their own facial expressions as input to get pairs of videos; this version can be used, e.g., to train supervised generative refinement models.
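As a rough illustration of the fake-detection use case, here is a minimal sketch (not the authors' pipeline) that samples frames from an original clip and its reenacted counterpart and labels them real (0) or fake (1); the file paths are hypothetical placeholders.

```python
# Minimal sketch: build (frame, label) pairs for a real/fake classifier from
# an original clip and its reenacted counterpart. Paths are hypothetical.
import cv2  # OpenCV

def sample_frames(video_path, label, every_n=30):
    """Yield (frame, label) pairs, keeping one frame out of every `every_n`."""
    cap = cv2.VideoCapture(video_path)
    idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % every_n == 0:
            yield frame, label
        idx += 1
    cap.release()

# Hypothetical layout: original clips are real (0), reenacted clips fake (1).
data = list(sample_frames("original/clip_000.mp4", label=0))
data += list(sample_frames("manipulated/clip_000.mp4", label=1))
```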
The pavitemple/youtube-videos dataset is hosted on Hugging Face and was contributed by the HF Datasets community.
The markhneedham/youtube-comments dataset is hosted on Hugging Face and was contributed by the HF Datasets community.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
=====================================================================
NII Face Mask Dataset v1.0
=====================================================================
Authors: Trung-Nghia Le (1), Khanh-Duy Nguyen (2), Huy H. Nguyen (1), Junichi Yamagishi (1), Isao Echizen (1)
Affiliations: (1) National Institute of Informatics, Japan; (2) University of Information Technology-VNUHCM, Vietnam
National Institute of Informatics Copyright (c) 2021
Emails: {ltnghia, nhhuy, jyamagis, iechizen}@nii.ac.jp, {khanhd}@uit.edu.vn
Arxiv: https://arxiv.org/abs/2111.12888
NII Face Mask Dataset v1.0: https://zenodo.org/record/5761725
=============================== INTRODUCTION ===============================
The NII Face Mask Dataset is the first large-scale dataset targeting mask-wearing ratio estimation in street cameras. This dataset contains 581,108 face annotations extracted from 18,088 video frames (1920x1080 pixels) in 17 street-view videos obtained from Rambalac's YouTube channel.
The videos were taken in multiple places, at various times, before and during the COVID-19 pandemic. The total length of the videos is approximately 56 hours.
=============================== REFERENCES ===============================
If you publish using any of the data in this dataset, please cite the following papers:
@article{Nguyen202112888,
  title={Effectiveness of Detection-based and Regression-based Approaches for Estimating Mask-Wearing Ratio},
  author={Nguyen, Khanh-Duy and Nguyen, Huy H. and Le, Trung-Nghia and Yamagishi, Junichi and Echizen, Isao},
  archivePrefix={arXiv},
  arxivId={2111.12888},
  url={https://arxiv.org/abs/2111.12888},
  year={2021}
}
@INPROCEEDINGS{Nguyen2021EstMaskWearing,
  author={Nguyen, Khanh-Duy and Nguyen, Huy H. and Le, Trung-Nghia and Yamagishi, Junichi and Echizen, Isao},
  booktitle={2021 16th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2021)},
  title={Effectiveness of Detection-based and Regression-based Approaches for Estimating Mask-Wearing Ratio},
  year={2021},
  pages={1-8},
  url={https://ieeexplore.ieee.org/document/9667046},
  doi={10.1109/FG52635.2021.9667046}
}
======================== DATA STRUCTURE ==================================
./NFM
├── dataset
│   ├── train.csv: annotations for the train set.
│   └── test.csv: annotations for the test set.
└── README_v1.0.md
We use the same structure for the two CSV files (train.csv and test.csv). Both CSV files have the same columns:
<1st column>: video_id (the source video can be found by following the link: https://www.youtube.com/watch?v=<video_id>)
<2nd column>: frame_id (the index of a frame extracted from the source video)
<3rd column>: timestamp in milliseconds (the timestamp of a frame extracted from the source video)
<4th column>: label (for each annotated face, one of three labels was attached to a bounding box: 'Mask'/'No-Mask'/'Unknown')
<5th column>: left
<6th column>: top
<7th column>: right
<8th column>: bottom
The four coordinates (left, top, right, bottom) denote a face's bounding box.
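A minimal sketch of loading these CSVs with pandas follows, assuming the eight columns appear in the documented order; whether the files ship with a header row is an assumption to verify against the data itself.

```python
# Minimal sketch: read the annotations and derive per-frame mask-wearing
# ratios. Column names are assigned here from the layout documented above;
# header=None assumes the CSVs have no header row (verify against the files).
import pandas as pd

cols = ["video_id", "frame_id", "timestamp_ms",
        "label", "left", "top", "right", "bottom"]
train = pd.read_csv("NFM/dataset/train.csv", names=cols, header=None)

# Reconstruct the source URL and box size for each annotated face.
train["url"] = "https://www.youtube.com/watch?v=" + train["video_id"].astype(str)
train["box_w"] = train["right"] - train["left"]
train["box_h"] = train["bottom"] - train["top"]

# Mask-wearing ratio per frame, ignoring faces labeled 'Unknown'.
known = train[train["label"] != "Unknown"]
ratio = (known["label"] == "Mask").groupby(
    [known["video_id"], known["frame_id"]]).mean()
print(ratio.head())
```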
============================== COPYING ================================
This repository is made available under the Creative Commons Attribution License (CC-BY).
Regarding Creative Commons License: Attribution 4.0 International (CC BY 4.0), please see https://creativecommons.org/licenses/by/4.0/
THIS DATABASE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS DATABASE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE
====================== ACKNOWLEDGEMENTS ================================
This research was partly supported by JSPS KAKENHI Grants (JP16H06302, JP18H04120, JP21H04907, JP20K23355, JP21K18023), and JST CREST Grants (JPMJCR20D3, JPMJCR18A6), Japan.
This dataset is based on Rambalac's YouTube channel: https://www.youtube.com/c/Rambalac
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
📺 YouTube-Commons 📺
YouTube-Commons is a collection of audio transcripts of 2,063,066 videos shared on YouTube under a CC-By license.
Content
The collection comprises 22,709,724 original and automatically translated transcripts from 3,156,703 videos (721,136 individual channels). In total, this represents nearly 45 billion words (44,811,518,375). All the videos were shared on YouTube with a CC-BY license: the dataset provides all the necessary provenance information… See the full description on the dataset page: https://huggingface.co/datasets/PleIAs/YouTube-Commons.
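A minimal sketch of accessing the corpus with the Hugging Face datasets library; given the size of the collection, streaming avoids a full download. The split name and per-record fields are assumptions to verify on the dataset page.

```python
# Minimal sketch: stream YouTube-Commons rather than downloading ~45B words.
# The split name ("train") and record fields are assumptions to verify.
from datasets import load_dataset

ds = load_dataset("PleIAs/YouTube-Commons", split="train", streaming=True)
for record in ds:
    print(record.keys())  # inspect available fields (transcript, provenance, ...)
    break
```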
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
Most facial expression recognition (FER) systems rely on machine learning approaches that require large databases (DBs) for effective training. As these are not easily available, a good solution is to augment the DBs with appropriate techniques, which are typically based on either geometric transformations or deep-learning technologies (e.g., Generative Adversarial Networks (GANs)). Whereas the first category of techniques has been fairly well adopted in the past, studies that use GAN-based techniques are limited for FER systems. To advance in this respect, we evaluate the impact of GAN techniques by creating a new DB containing the generated synthetic images.
The face images contained in the KDEF DB serve as the basis for creating novel synthetic images: each KDEF image is combined with the facial features of one of two images (of Candie Kung and Cristina Saralegui) selected from the YouTube-Faces DB. The novel images differ from each other in particular concerning the eyes, the nose, and the mouth, whose characteristics are taken from the Candie and Cristina images.
The total number of novel synthetic images generated with the GAN is 980 (70 individuals from KDEF DB x 7 emotions x 2 subjects from YouTube-Faces DB).
The zip file "GAN_KDEF_Candie" contains the 490 images generated by combining the KDEF images with the Candie Kung image. The zip file "GAN_KDEF_Cristina" contains the 490 images generated by combining the KDEF images with the Cristina Saralegui image. The image IDs are the same as those used in the KDEF DB. The generated synthetic images have a resolution of 562x762 pixels.
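A minimal sketch of reading the synthetic images straight from the two archives, assuming they contain standard image files; Pillow is used for decoding, and the exact inner file names should be checked against the archives themselves.

```python
# Minimal sketch: iterate over the images inside one of the zip archives
# without extracting it. Archive path and inner layout are assumptions.
import zipfile
from io import BytesIO
from PIL import Image

def load_images(zip_path):
    """Yield (name, PIL.Image) for every image stored in the archive."""
    with zipfile.ZipFile(zip_path) as zf:
        for name in zf.namelist():
            if name.lower().endswith((".jpg", ".jpeg", ".png")):
                yield name, Image.open(BytesIO(zf.read(name)))

for name, img in load_images("GAN_KDEF_Candie.zip"):
    print(name, img.size)  # expected (562, 762) per the description above
    break
```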
If you make use of this dataset, please consider citing the following publication:
Porcu, S., Floris, A., & Atzori, L. (2020). Evaluation of Data Augmentation Techniques for Facial Expression Recognition Systems. Electronics, 9, 1892, doi: 10.3390/electronics9111892, url: https://www.mdpi.com/2079-9292/9/11/1892.
BibTex format:
@article{porcu2020evaluation,
  title={Evaluation of Data Augmentation Techniques for Facial Expression Recognition Systems},
  author={Porcu, Simone and Floris, Alessandro and Atzori, Luigi},
  journal={Electronics},
  volume={9},
  number={11},
  article-number={1892},
  year={2020},
  publisher={MDPI},
  doi={10.3390/electronics9111892}
}
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
Although the Platform Economy has been the subject of significant research efforts, much work remains to be done regarding content creation-based online platforms and the negative impact they can have on creators who make a living off them. This explorative study therefore shines a light on the challenges that creators face on YouTube and how they overcome them. Semi-structured interviews with the creators of eight different YouTube channels generated data that was then subjected to a two-stage coding method, initially employing open coding followed by selective coding. The obtained results show that the most significant hurdles are created by the platform itself via strict guidelines and policies as well as non-transparent processes, followed by monetization issues and a lack of competition. The creators use technical workarounds on the platform as well as self-organization through self-censorship and personal operationalization to deal with said issues while not relying on YouTube alone for their income. Instead, they use a multi-platform approach, harnessing revenue streams on Twitch or Patreon as well as sponsorships in addition to advertisement revenue from YouTube. Due to the lack of regulation, these platforms regulate themselves through automated algorithms, creating an environment in which professional creators are in constant conflict with the platform. Framing content creators as entrepreneurs carries implications for the platforms: they are no longer simply digital places where people interact but have become marketplaces that emulate entire industries, or something new altogether. As a result, future efforts should focus on further refining the definition of individual platforms to promote clarity and support regulatory efforts. In addition, this study should be replicated for other platforms such as Twitch or Patreon, as content creators may face similar issues there.
The MLB-YouTube dataset is a new, large-scale dataset consisting of 20 baseball games from the 2017 MLB post-season available on YouTube with over 42 hours of video footage. The dataset consists of two components: segmented videos for activity recognition and continuous videos for activity classification. It is quite challenging as it is created from TV broadcast baseball games where multiple different activities share the camera angle. Further, the motion/appearance difference between the various activities is quite small.
The BitiBytes123/youtube-dataset dataset is hosted on Hugging Face and was contributed by the HF Datasets community.
As of June 2021, Zozo Kempf's channel was the most popular Hungarian sports channel on YouTube, with 273 thousand subscribers. Face Team Vlogs ranked second with 160 thousand YouTube subscribers.
High-definition Talking Face Dataset (HDTF). The HDTF dataset was collected from YouTube videos published in the last two years and consists of about 16 hours of 720p to 1080p video. There are over 300 subjects and 10k different sentences in the HDTF dataset. HDTF has higher video resolution than previous in-the-wild datasets and more subjects/sentences than in-the-lab datasets.
According to a survey conducted in 2022, approximately ** percent of YouTube creators in the Middle East and North Africa (MENA) region indicated that quality editing is one of the challenges they face in building their YouTube channels. This was followed by gathering a film crew and growing subscribers and views.
https://www.marketresearchforecast.com/privacy-policy
The Multi-Channel Network (MCN) market is experiencing robust growth, driven by the increasing popularity of online video content and the need for creators to effectively monetize their channels. The market's expansion is fueled by several key factors. Firstly, the rising demand for professional production and editing tools, along with funding opportunities and comprehensive digital rights management solutions, empowers content creators to produce higher-quality videos and scale their operations. Secondly, cross-promotion strategies within MCNs offer creators significant reach and exposure, accelerating channel growth. Monetization assistance, a core service of MCNs, provides creators with crucial expertise in navigating advertising revenue, sponsorships, and merchandise sales, enhancing their profitability. Finally, the diversification of MCN services across various application sectors like BFSI (Banking, Financial Services, and Insurance), telecommunications, media & entertainment, and technology widens the market's potential and attracts diverse clientele. However, the market also faces certain challenges. Competition among MCNs is intense, requiring providers to constantly innovate and adapt to evolving creator needs and technological advancements. Furthermore, the regulatory landscape surrounding online content, including copyright and data privacy, poses ongoing hurdles for MCNs to navigate. Despite these challenges, the projected Compound Annual Growth Rate (CAGR) indicates a strong upward trajectory for the MCN market over the next decade. This sustained growth is likely to be driven by the continued expansion of online video consumption across various demographics and regions, along with the increasing professionalization of online content creation. The consolidation of smaller MCNs into larger entities is also anticipated, leading to a more concentrated market structure with a few dominant players.
MIT License: https://opensource.org/licenses/MIT
The ClarityClips/youtube-video-summarization dataset is hosted on Hugging Face and was contributed by the HF Datasets community.
The increasing popularity and use of digital platforms and social media such as WhatsApp, Facebook, YouTube and Instagram are opening up new opportunities for children, young people and adults to pursue cultural interests or to present themselves aesthetically. Focusing on young people between the ages of 12 and 19, a number of studies on media use show that YouTube in particular has become the leading medium for this age group. Given the growing importance of this web video platform, questions arise about receptive and productive experiences and about the significance of cultural content and practices. Furthermore, there are hardly any findings on the extent to which YouTube stimulates young people to engage in cultural activities and self-organized learning processes.
The sample is composed of n=818 adolescents aged 12-19 years. The selection of the study units was based on a quota procedure. The adolescent target subjects were recruited via the IFAK interviewer staff according to predefined quotas for age, gender, region, place size class, type of school attended (for students), and occupation (for non-students). The characteristics "age and gender" and "region and place size" were crossed or combined with each other to produce as accurate a representation of the population as possible. The characteristic "migration background" was not used as a quota characteristic. The specifications for this are based on the latest data from the Federal Statistical Office and ma Radio 2018 II. The structural composition of the sample corresponds to the data for the population according to the characteristics mentioned.
The study was conducted as a face-to-face oral survey. The answers of the young people were recorded by an interviewer on a laptop via a corresponding survey program. 111 face-to-face interviewers from the in-house interviewing staff, who have experience in interviewing children and adolescents, were used. The predefined questionnaire was binding for all interviewers with regard to the wording and sequence of questions. The maximum number of interviews per interviewer was n=10. Each interviewer received a detailed written briefing on the project at the beginning of the study.
FaceForensics++ is a forensics dataset consisting of 1000 original video sequences that have been manipulated with four automated face manipulation methods: Deepfakes, Face2Face, FaceSwap, and NeuralTextures. The data was sourced from 977 YouTube videos, and all videos contain a trackable, mostly frontal face without occlusions, which enables automated tampering methods to generate realistic forgeries.
https://www.datainsightsmarket.com/privacy-policy
The global network copyright market, valued at $48.8 billion in 2025, is projected to experience robust growth, driven by the expanding digital entertainment landscape and increasing demand for high-quality, legally licensed content. A Compound Annual Growth Rate (CAGR) of 14.5% from 2025 to 2033 indicates a significant market expansion, reaching an estimated value exceeding $180 billion by 2033. This growth is fueled by several key factors. The rise of streaming platforms like Netflix, YouTube, and Tencent, coupled with increased internet penetration and smartphone adoption globally, are major contributors. Furthermore, enhanced copyright protection measures and stricter enforcement against piracy are bolstering the market. The increasing investment in original content production by streaming services and the growing popularity of online gaming, which relies heavily on licensed music and visual assets, further contribute to market expansion. However, the market faces certain challenges. Fluctuations in licensing fees and negotiations between content creators and distributors can impact profitability. The evolving technological landscape, particularly the rise of artificial intelligence and its potential impact on copyright infringement, poses a significant risk. Furthermore, regional variations in copyright laws and enforcement create complexities for global players. Despite these challenges, the long-term outlook for the network copyright market remains positive, driven by the ongoing shift towards digital content consumption and the increasing importance of protecting intellectual property rights in the digital age. The competitive landscape, dominated by major players like Netflix, YouTube, and Tencent, is characterized by intense competition and continuous innovation in content delivery and protection technologies. Segmentation within the market likely reflects differences in content type (music, video, software), licensing models, and geographic regions.
https://www.datainsightsmarket.com/privacy-policy
The video production and marketing services market is experiencing robust growth, driven by the increasing adoption of video content across various industries. The market's expansion is fueled by several key factors. Firstly, the escalating demand for engaging and informative video content across platforms like YouTube, social media, and websites is significantly boosting the need for professional video production and marketing services. Businesses are recognizing video's effectiveness in enhancing brand awareness, driving customer engagement, and ultimately boosting sales conversions. Secondly, technological advancements in video production and editing software, coupled with the rising accessibility of high-quality video equipment, have lowered the barriers to entry for smaller businesses and individual creators, further fueling market growth. This is complemented by the continuous evolution of marketing techniques, including the rise of short-form video and targeted advertising strategies on social media, which are directly impacting the demand for specialized video marketing expertise. We estimate the 2025 market size to be around $15 billion, considering the significant investments in digital marketing and the prevalence of video in business strategies. Assuming a conservative CAGR of 15% (a reasonable estimate given the current market dynamics), this translates to substantial growth over the forecast period (2025-2033). However, the market also faces certain challenges. Competition is intensifying as more companies enter the field. Maintaining a competitive edge necessitates continuous innovation in production techniques, marketing strategies, and post-production technologies. Furthermore, the fluctuating costs of production, including equipment, software, and talent acquisition, can pose a challenge to profitability. Despite these restraints, the long-term outlook remains positive, driven by the enduring importance of video content in communication and marketing. Market segmentation, particularly by industry (manufacturing, education, finance) and video type (animated, live-action), offers opportunities for specialized service providers to focus their marketing efforts and gain a stronger foothold. The geographical distribution reveals significant opportunities in regions like Asia-Pacific and North America, which currently hold a significant portion of the market share and are expected to continue their robust growth trajectory. The significant number of companies listed indicates a highly competitive landscape, but also signifies the market's attractiveness and immense potential for those who can differentiate their services.