88 datasets found
  1. 100DOH (The 100 Days Of Hands)

    • opendatalab.com
    zip
    Updated Sep 28, 2023
    + more versions
    Cite
    University of Michigan (2023). 100DOH (The 100 Days Of Hands) [Dataset]. https://opendatalab.com/OpenDataLab/100DOH_The_100_Days_Of_Hands
    Explore at:
    zip (available download formats)
    Dataset updated
    Sep 28, 2023
    Dataset provided by
    University of Michigan
    Johns Hopkins University
    Description

    The 100 Days Of Hands Dataset (100DOH) is a large-scale video dataset containing hands and hand-object interactions. It consists of 27.3K YouTube videos from 11 categories (see details in the table below) with nearly 131 days of footage of everyday interaction. The focus of the dataset is hand contact, and it includes both first-person and third-person perspectives. The videos in 100DOH are unconstrained and content-rich, ranging from records of daily life to specific instructional videos. To enforce diversity, we keep no more than 20 videos from each uploader. All video links are provided.

    We searched for hands engaged in interaction implicitly rather than explicitly, using generic tags (e.g., DIY cookies home 2014, kitchen 2018 assembly cabinet) instead of specific actions.

  2. LatAm: digital video ads view-through rate 2018, by country

    • statista.com
    Updated Jul 7, 2025
    Cite
    Statista (2025). LatAm: digital video ads view-through rate 2018, by country [Dataset]. https://www.statista.com/statistics/994134/digital-video-ads-view-through-rate-latin-america/
    Explore at:
    Dataset updated
    Jul 7, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    LAC
    Description

    The graph shows the view-through rates (VTR) of digital video advertising in selected countries in Latin America in the second half of 2018. In Mexico, out of 100 people who loaded a digital video ad, ***** percent watched the video to the end.

  3. Top Youtube Artist

    • kaggle.com
    Updated Jan 12, 2023
    Cite
    Mrityunjay Pathak (2023). Top Youtube Artist [Dataset]. https://www.kaggle.com/datasets/themrityunjaypathak/top-youtube-artist
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jan 12, 2023
    Dataset provided by
    Kaggle
    Authors
    Mrityunjay Pathak
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    YouTube
    Description

    YouTube was created in 2005, with the first video, “Me at the Zoo”, uploaded on 23 April 2005. Since then, 1.3 billion people have set up YouTube accounts. In 2018, people watched nearly 5 billion videos each day and uploaded 300 hours of video to the site every minute.

    According to 2016 research undertaken by Pexeso, music accounts for only 4.3% of YouTube’s content, yet it generates 11% of the views. Clearly, a lot of people watch a comparatively small number of music videos. It should be no surprise, therefore, that the most-watched videos of all time on YouTube are predominantly music videos.

    On August 13, BTS became the most-viewed artist in YouTube history, accumulating over 26.7 billion views across all their official channels. This count includes all music videos and dance practice videos.

    Justin Bieber and Ed Sheeran now hold the records for second and third-highest views, with over 26 billion views each.

    Currently, BTS’s most viewed videos are their music videos for “Boy With Luv,” “Dynamite,” and “DNA,” which all have over 1.4 billion views.

    Headers of the Dataset:
    • Total = Total views (in millions) across all official channels
    • Avg = Current daily average of all videos combined
    • 100M = Number of videos with more than 100 million views
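
    Under the header definitions above, a small pandas sketch can surface the top artists by total views. This is only an illustrative sketch: the CSV file name is hypothetical, and the exact column labels should be checked against the file shipped with the Kaggle dataset.

    ```python
    import pandas as pd

    # Hypothetical file name; use the CSV included in the Kaggle dataset.
    df = pd.read_csv("top_youtube_artists.csv")

    # Per the header notes above: Total = total views (in millions) across all
    # official channels, Avg = current daily average, 100M = videos above 100M views.
    top10 = df.sort_values("Total", ascending=False).head(10)
    print(top10)
    ```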

  4. ChokePoint Dataset

    • zenodo.org
    • data.niaid.nih.gov
    txt, xz
    Updated Jan 24, 2020
    Cite
    Yongkang Wong; Shaokang Chen; Sandra Mau; Conrad Sanderson; Brian Lovell (2020). ChokePoint Dataset [Dataset]. http://doi.org/10.5281/zenodo.815657
    Explore at:
    xz, txt (available download formats)
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Yongkang Wong; Shaokang Chen; Sandra Mau; Conrad Sanderson; Brian Lovell
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    The ChokePoint dataset is designed for experiments in person identification/verification under real-world surveillance conditions using existing technologies. An array of three cameras was placed above several portals (natural choke points in terms of pedestrian traffic) to capture subjects walking through each portal in a natural way. While a person is walking through a portal, a sequence of face images (i.e., a face set) can be captured. Faces in such sets will have variations in illumination conditions, pose, and sharpness, as well as misalignment due to automatic face localisation/detection. Due to the three-camera configuration, one of the cameras is likely to capture a face set in which a subset of the faces is near-frontal.

    The dataset consists of 25 subjects (19 male and 6 female) in portal 1 and 29 subjects (23 male and 6 female) in portal 2. The recordings of portal 1 and portal 2 are one month apart. The dataset has a frame rate of 30 fps and an image resolution of 800×600 pixels. In total, the dataset consists of 48 video sequences and 64,204 face images. In all sequences, only one subject is present in the image at a time. The first 100 frames of each sequence are for background modelling, where no foreground objects are present.

    Each sequence was named according to the recording conditions (e.g., P2E_S1_C3), where P, S, and C stand for portal, sequence and camera, respectively. E and L indicate subjects either entering or leaving the portal. The numbers indicate the respective portal, sequence and camera label. For example, P2L_S1_C3 indicates that the recording was done in Portal 2, with people leaving the portal, captured by camera 3 in the first recorded sequence.
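
    The naming scheme is regular enough to parse programmatically. Below is a minimal sketch (not an official utility) that decodes a sequence name such as P2L_S1_C3 into portal, walking direction, sequence, and camera.

    ```python
    import re

    # Decode ChokePoint sequence names such as "P2E_S1_C3" or "P2L_S1_C3".
    # P = portal, E/L = entering/leaving, S = sequence, C = camera.
    PATTERN = re.compile(r"^P(?P<portal>\d+)(?P<direction>[EL])_S(?P<sequence>\d+)_C(?P<camera>\d+)$")

    def parse_sequence_name(name: str) -> dict:
        match = PATTERN.match(name)
        if match is None:
            raise ValueError(f"Not a valid ChokePoint sequence name: {name}")
        info = match.groupdict()
        info["direction"] = "entering" if info["direction"] == "E" else "leaving"
        return info

    print(parse_sequence_name("P2L_S1_C3"))
    # {'portal': '2', 'direction': 'leaving', 'sequence': '1', 'camera': '3'}
    ```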

    To pose a more challenging real-world surveillance problem, two sequences (P2E_S5 and P2L_S5) were recorded in a crowded scenario. In addition to the aforementioned variations, these sequences exhibit continuous occlusion, which presents challenges for identity tracking and face verification.

    This dataset can be applied, but not limited, to the following research areas:

    • person re-identification
    • image set matching
    • face quality measurement
    • face clustering
    • 3D face reconstruction
    • pedestrian/face tracking
    • background estimation and subtraction

    Please cite the following paper if you use the ChokePoint dataset in your work (papers, articles, reports, books, software, etc):

    • Y. Wong, S. Chen, S. Mau, C. Sanderson, B.C. Lovell
      Patch-based Probabilistic Image Quality Assessment for Face Selection and Improved Video-based Face Recognition
      IEEE Biometrics Workshop, Computer Vision and Pattern Recognition (CVPR) Workshops, pages 81-88, 2011.
      http://doi.org/10.1109/CVPRW.2011.5981881

  5. YouTube usage penetration in the United States 2020, by age group

    • statista.com
    Updated May 20, 2025
    Cite
    Statista (2025). YouTube usage penetration in the United States 2020, by age group [Dataset]. https://www.statista.com/statistics/296227/us-youtube-reach-age-gender/
    Explore at:
    Dataset updated
    May 20, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    United States
    Description

    In the third quarter of 2020, it was found that 77 percent of U.S. internet users aged 15 to 25 years accessed YouTube.

    YouTube in the United States
    With over 126 million unique monthly viewers, YouTube is by far the most popular online video property in the United States. The platform’s mobile presence is also significant, as YouTube consistently ranks as the most popular mobile app in the United States based on audience reach. The most popular YouTube partner channels consistently attract tens of millions of viewers, and the top YouTube partner channel in the United States as of March 2019 was music label Universal Music Group (UMG), with over 50 million unique viewers.

    Music on YouTube
    Music is one of the most popular types of content on YouTube, and as of 2019, half of the U.S. population used YouTube to listen to music on a weekly basis. Music videos frequently go viral and attract a large amount of attention upon release: it is not uncommon for popular releases to rack up 100 million video views within a week. Korean pop phenomenon BTS currently holds the record for the fastest viral video to reach 100 million YouTube streams; their August 2020 release “Dynamite” needed only one day to accomplish the feat.

  6. Confused student EEG brainwave data

    • kaggle.com
    zip
    Updated Aug 27, 2016
    Cite
    Haohan Wang (2016). Confused student EEG brainwave data [Dataset]. https://www.kaggle.com/wanghaohan/confused-eeg
    Explore at:
    zip (227480454 bytes; available download formats)
    Dataset updated
    Aug 27, 2016
    Authors
    Haohan Wang
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    We collected EEG signal data from 10 college students while they watched MOOC video clips. We extracted online education videos that are assumed not to be confusing for college students, such as introductions to basic algebra or geometry. We also prepared videos that are expected to confuse a typical college student who is not familiar with the topics, such as quantum mechanics and stem cell research. We prepared 20 videos, 10 in each category. Each video was about 2 minutes long; we chopped each two-minute clip from the middle of a topic to make the videos more confusing. The students wore a single-channel wireless MindSet that measured activity over the frontal lobe. The MindSet measures the voltage between an electrode resting on the forehead and two electrodes (one ground and one reference) each in contact with an ear. After each session, the student rated his/her confusion level on a scale of 1-7, where one corresponded to the least confusing and seven to the most confusing. These labels are further normalized into a binary label of whether the student is confused or not. This label is offered as self-labelled confusion, in addition to our predefined label of confusion.

    Data information:

    -----data.csv

    1. Column 1: Subject ID
    2. Column 2: Video ID
    3. Column 3: Attention (Proprietary measure of mental focus)
    4. Column 4: Meditation (Proprietary measure of calmness)
    5. Column 5: Raw (Raw EEG signal)
    6. Column 6: Delta (1-3 Hz of power spectrum)
    7. Column 7: Theta (4-7 Hz of power spectrum)
    8. Column 8: Alpha 1 (Lower 8-11 Hz of power spectrum)
    9. Column 9: Alpha 2 (Higher 8-11 Hz of power spectrum)
    10. Column 10: Beta 1 (Lower 12-29 Hz of power spectrum)
    11. Column 11: Beta 2 (Higher 12-29 Hz of power spectrum)
    12. Column 12: Gamma 1 (Lower 30-100 Hz of power spectrum)
    13. Column 13: Gamma 2 (Higher 30-100 Hz of power spectrum)
    14. Column 14: predefined label (whether the subject is expected to be confused)
    15. Column 15: user-defined label (whether the subject is actually confused)

    -----subject demographic

    1. Column 1: Subject ID
    2. Column 2: Age
    3. Column 3: Ethnicity (Categorized according to https://en.wikipedia.org/wiki/List_of_contemporary_ethnic_groups)
    4. Column 4: Gender

    -----video data

    Each video is roughly two minutes long; we removed the first 30 seconds and the last 30 seconds and only collected the EEG data during the middle one minute.

    Format

    These data are collected from ten students, each watching ten videos.

    Therefore, the 12,000+ rows can be seen as only 100 data points. Viewed this way, each data point consists of 120+ rows, sampled every 0.5 seconds (so each data point corresponds to a one-minute video). Signals with higher frequency are reported as the mean value over each 0.5-second interval.
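
    A minimal sketch of regrouping the rows into the 100 (subject, video) data points described above is shown below. The column names follow the column list in this description; the CSV's actual header strings may differ, so treat the renaming as an assumption.

    ```python
    import pandas as pd

    # Column names follow the description above; the file's real header may differ.
    cols = ["SubjectID", "VideoID", "Attention", "Meditation", "Raw",
            "Delta", "Theta", "Alpha1", "Alpha2", "Beta1", "Beta2",
            "Gamma1", "Gamma2", "PredefinedLabel", "UserDefinedLabel"]

    df = pd.read_csv("data.csv")
    df.columns = cols  # assumes the 15 columns appear in the order listed above

    # Each (subject, video) pair is one "data point": ~120 rows sampled every 0.5 s.
    rows_per_point = df.groupby(["SubjectID", "VideoID"]).size()
    print(len(rows_per_point), rows_per_point.median())  # expect ~100 groups of ~120 rows

    # One binary target per data point, e.g. the majority self-reported label.
    labels = df.groupby(["SubjectID", "VideoID"])["UserDefinedLabel"].mean().round()
    ```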

    Reference:

    Wang, H., Li, Y., Hu, X., Yang, Y., Meng, Z., & Chang, K. M. (2013, June). Using EEG to Improve Massive Open Online Courses Feedback Interaction. In AIED Workshops.

    Inspiration

    • This dataset is an extremely challenging data set to perform binary classification. 65% of prediction accuracy is quite decent according to our experience.

    • It is an interesting data set to carry out the variable selection (causal inference) task that may help further research. Past research has indicated that Theta signal is correlated with confusion level.

    • It is also an interesting data set for confounding factors correction model because we offer two labels (subject id and video id) that could profoundly confound the results.

    Contact

    haohanw@cs.cmu.edu

  7. Most viewed YouTube videos of all time 2025

    • statista.com
    • ai-chatbox.pro
    Updated Feb 17, 2025
    Cite
    Statista (2025). Most viewed YouTube videos of all time 2025 [Dataset]. https://www.statista.com/statistics/249396/top-youtube-videos-views/
    Explore at:
    Dataset updated
    Feb 17, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Feb 2025
    Area covered
    Worldwide, YouTube
    Description

    On June 17, 2016, Korean education brand Pinkfong released their video "Baby Shark Dance", and the rest is history. In January 2021, "Baby Shark Dance" became the first YouTube video to surpass 10 billion views, after snatching the crown of most-viewed YouTube video of all time from the former record holder "Despacito" one year before. "Baby Shark Dance" currently has over 15 billion lifetime views on YouTube.

    Music videos on YouTube
    “Baby Shark Dance” might be the current record holder in terms of total views, but Korean artist Psy’s “Gangnam Style” video held the top spot for longest (1,689 days, or 4.6 years) before ceding it to its successor. With figures like these, it comes as little surprise that the majority of the most popular videos on YouTube are music videos. Since 2010, all but one of the most-viewed videos on YouTube have been music videos, signifying the platform’s shift in focus from funny, viral videos to professionally produced content. As of 2022, about 40 percent of the U.S. digital music audience uses YouTube Music.

    Popular video content on YouTube
    Music fans are highly engaged audiences, and it is not uncommon for music videos to garner significant amounts of traffic within the first 24 hours of release. Other popular types of videos that generate lots of views upon release are movie trailers, especially for superhero movies related to the MCU (Marvel Cinematic Universe). The first official trailer for “Avengers: Endgame” generated 289 million views within the first 24 hours of release, while the movie trailer for “Spider-Man: No Way Home” generated over 355 million views on its first day, making it the most viral movie trailer.

  8. Brazil No of Employed Person: 100 to 249 Persons: IC: Film, Video...

    • ceicdata.com
    + more versions
    Cite
    CEICdata.com, Brazil No of Employed Person: 100 to 249 Persons: IC: Film, Video Production, TV Programs, Sound Recording & Music Publishing [Dataset]. https://www.ceicdata.com/en/brazil/enterprise-industry-no-of-employed-person-by-industry-100-to-249-persons/no-of-employed-person-100-to-249-persons-ic-film-video-production-tv-programs-sound-recording--music-publishing
    Explore at:
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Dec 1, 2006 - Dec 1, 2017
    Area covered
    Brazil
    Variables measured
    Employment
    Description

    Brazil Number of Employed Person: 100 to 249 Persons: IC: Film, Video Production, TV Programs, Sound Recording & Music Publishing data was reported at 3,143.000 Person in 2017. This records a decrease from the previous number of 3,582.000 Person for 2016. Brazil Number of Employed Person: 100 to 249 Persons: IC: Film, Video Production, TV Programs, Sound Recording & Music Publishing data is updated yearly, averaging 2,838.000 Person from Dec 2006 (Median) to 2017, with 12 observations. The data reached an all-time high of 4,058.000 Person in 2014 and a record low of 2,032.000 Person in 2011. Brazil Number of Employed Person: 100 to 249 Persons: IC: Film, Video Production, TV Programs, Sound Recording & Music Publishing data remains in active status in CEIC and is reported by the Brazilian Institute of Geography and Statistics. The data is categorized under Brazil Premium Database’s Business and Economic Survey – Table BR.SH019: Enterprise Industry: No of Employed Person: by Industry: 100 to 249 Persons.

  9. List Of Vital YouTube Statistics Marketers Should Not Ignore In 2023

    • enterpriseappstoday.com
    Updated Oct 10, 2023
    Cite
    EnterpriseAppsToday (2023). List Of Vital YouTube Statistics Marketers Should Not Ignore In 2023 [Dataset]. https://www.enterpriseappstoday.com/stats/youtube-statistics.html
    Explore at:
    Dataset updated
    Oct 10, 2023
    Dataset authored and provided by
    EnterpriseAppsToday
    License

    https://www.enterpriseappstoday.com/privacy-policy

    Time period covered
    2022 - 2032
    Area covered
    Global, YouTube
    Description

    Key YouTube Statistics (Editor’s Choice)
    • YouTube recorded 2.70 billion monthly active users in March 2023, which corresponds to 55.10% of worldwide active social media users.
    • There are currently more than 14 million daily active users on YouTube; in the United States, the platform is accessed by 62% of users.
    • YouTube is touted as the second largest search engine and the second most visited website after Google.
    • Revenue earned by YouTube in the first two quarters of 2023 is around $14.358 billion.
    • In 2023, YouTube Premium and YouTube Music recorded 80 million subscribers collectively worldwide.
    • YouTube consumers view more than a billion hours of video per day.
    • YouTube has more than 38 million active channels.
    • In the fourth quarter of 2021, YouTube ad revenue was $8.6 billion.
    • Around 3 million paid subscribers access YouTube TV.
    • YouTube Premium has around 1 billion paid users.
    • In 2023, YouTube was banned in countries such as China (excluding Macau and Hong Kong), Eritrea, Iran, North Korea, Turkmenistan, and South Sudan.
    • With 166 million downloads, the YouTube app is the second most downloaded entertainment application across the world, after Netflix.
    • With 91 million downloads, YouTube Kids is the sixth most downloaded entertainment app in the world.
    • Nearly 90% of digital consumers access YouTube in the US, making it the most popular social network for watching video content.
    • Over 70% of YouTube viewership takes place on its mobile application.
    • More than 70% of the YouTube video content people watch is suggested by its algorithm.
    • The average duration of a video on YouTube is 12 minutes.
    • An average YouTube user spends 20 minutes and 23 seconds on the platform daily.
    • Around 28% of YouTube videos published by popular channels are in the English language.
    • 77% of YouTube users watch comedy content on the platform.
    • With 247 million subscribers, T-Series is the most subscribed channel on YouTube.
    • Around 50 million users log on to YouTube every day.
    • YouTube's biggest concurrent-views record is 2.3 billion, set when SpaceX went live on the platform to unveil the Falcon Heavy rocket.
    • The majority of YouTube users in the US are in the 15 to 35 age group.
    • The male-to-female ratio of YouTube users is 11:9.
    • Apple Inc. was touted as the biggest advertiser on YouTube in 2020, spending $237.15 million.
    • YouTube produced total revenue of $19.7 billion in 2020.
    • As of 2021, the majority of YouTube users (467 million) are from India.
    • It is the most popular platform in the United States, with 74 percent of adults using it.
    • YouTube contributes nearly 25% of mobile traffic worldwide.
    • Daily live streaming on YouTube increased by 45% in total in 2020.
    • In India, around 225 million people are active on the platform each hour, as per 2021 statistics.

    YouTube Usage and Viewership Statistics

    #1. YouTube accounts for more than 2 billion monthly active users
    Around 2.7 billion users log on to YouTube each month, and the number of monthly active users is expected to grow even further.

    #2. Around 14.3 billion visits are made to the platform every month
    The number of YouTube visits is far higher than for Facebook, Amazon, and Instagram.

    #3. YouTube is accessible across 100 countries in 80 languages
    The platform is widely available across different communities and nations.

    #4. 53.9% of YouTube users are men and 46.1% are women
    As of 2023, 53.9% of users over 18 years are men and 46.1% are women; in absolute terms, 1.38 billion males and 1.18 billion females.

    Age Group | Male | Female
    18 to 24 | 8.5% | 6%
    25 to 34 | 11.6% | 8.6%
    35 to 44 | 9% | 7.5%
    45 to 54 | 6.2% | 5.7%
    55 to 64 | 4.4% | 4.5%
    Above 65 | 4.3% | 5.4%

    #5. 99% of YouTube users are active on other social media networks as well
    Fewer than 1% of YouTube users rely solely on the platform.

    #6. Users spend around 20 minutes and 23 seconds per day on YouTube on average
    That is a generous amount of time to spend on any social network platform.

    #7. YouTube is the second most visited site worldwide
    With more than 14 billion visits per month, YouTube is the second most visited site in the world; its parent company Google is the most visited site across the globe. YouTube is also the third most popular search term on Google.

    #8. 694,000 hours of video content are streamed on YouTube per minute
    YouTube has outweighed even Netflix in terms of streamed video content.

    #9. Over 81% of total internet users have accessed YouTube

    #10. Nearly 450 million hours of video content are uploaded to YouTube each hour
    More than 5 billion videos are watched on YouTube per day.

    #11. India has the maximum number of YouTube users

  10. GPJATK DATASET – Calibrated and synchronized multi-view video and motion...

    • zenodo.org
    bin, pdf
    Updated Apr 10, 2025
    Cite
    Bogdan Kwolek; Agnieszka Michalczuk; Tomasz Krzeszowski; Adam Świtoński; Henryk Josiński; Konrad Wojciechowski (2025). GPJATK DATASET – Calibrated and synchronized multi-view video and motion capture dataset for evaluation of gait recognition [Dataset]. http://doi.org/10.1007/s11042-019-07945-y
    Explore at:
    pdf, bin (available download formats)
    Dataset updated
    Apr 10, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Bogdan Kwolek; Agnieszka Michalczuk; Tomasz Krzeszowski; Adam Świtoński; Henryk Josiński; Konrad Wojciechowski
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description

    =======================
    Summary
    =======================
    GPJATK DATASET – MULTI-VIEW VIDEO AND MOTION CAPTURE DATASET
    The GPJATK dataset has been designed for research on vision-based 3D gait recognition. It can also be used for evaluation of the multi-view (where gallery gaits from multiple views are combined to recognize probe gait on a single view) and the cross-view (where probe gait and gallery gait are recorded from two different views) gait recognition algorithms. In addition to problems related to gait recognition, the dataset can also be used for research on algorithms for human motion tracking and articulated pose estimation. The GPJATK dataset is available only for scientific use.
    All documents and papers that use the dataset must acknowledge the use of the dataset by including a citation of the following paper:
    B. Kwolek, A. Michalczuk, T. Krzeszowski, A. Switonski, H. Josinski, and K. Wojciechowski, „Calibrated and synchronized multi-view video and motion capture dataset for evaluation of gait recognition,” Multimedia Tools and Applications, vol. 78, iss. 22, p. 32437–32465, 2019, doi:10.1007/s11042-019-07945-y

    =======================
    Data description
    =======================
    The GPJATK dataset contains data captured by 10 mocap cameras and four calibrated and synchronized video cameras. The 3D gait dataset consists of 166 data sequences presenting the gait of 32 people (10 women and 22 men). In 128 data sequences, each individual was dressed in his/her own clothes; in 24 data sequences, 6 of the performers (persons #26-#31) changed clothes; and in 14 data sequences, 7 of the performers had a backpack on their back. Each sequence consists of four videos with RGB images at a resolution of 960×540, recorded by synchronized and calibrated cameras at 25 frames per second, together with the corresponding MoCap data. The MoCap data were registered at 100 Hz by a Vicon system consisting of 10 MX-T40 cameras.
    During each recording session, the actor was asked to walk across the scene of size 6.5 m × 4.2 m along a line joining cameras C2 and C4 as well as along the diagonal of the scene. In a single recording session, every performer walked from right to left, then from left to right, and afterwards along the diagonal from the upper-right to the bottom-left and from the bottom-left to the upper-right corner of the scene. Some performers were also asked to attend additional recording sessions, i.e. after changing into another garment or after putting on a backpack.

    =======================
    Dataset structure
    =======================
    * Gait_Data - data for gait recognition containing 32 subjects. The data was obtained using both marker-less and marker-based motion capture systems.
    * Markerless motion tracking algorithm - dataset obtained using a markerless motion tracking algorithm
    * MoCap - dataset obtained using the Vicon motion capture system
    Each dataset contains:
    * Arff - motion data after smoothing, normalization, and MPCA in Weka ARFF format
    * AsfAmc - motion data saved in Acclaim ASF/AMC format
    * Csv - motion data saved in CSV format. Each row contains data for one frame and each column represents a different attribute. Angle attributes are in degrees and distance attributes are in millimeters (a minimal loading sketch follows this list).
    * Mat - Matlab .mat files
    * Sequences - 166 video sequences with 32 subjects. Each sequence consists of 4 video streams and MoCap data. Video is recorded with a frequency of 25 Hz, and MoCap data is recorded at 100 Hz. Both systems are synchronized.
    Each sequence contains:
    * Background - sequences with a background in AVI format
    * Calibration - camera calibration data (Tsai model)
    * Edges - images with detected edges
    * Videos - sequences in AVI format
    * MoCap - data from motion capture system in formats: c3d and Acclaim ASF/AMC
    * Silhouettes - images with person silhouettes
    * Matlab_scripts - Matlab scripts for generating .arff files
    It requires scripts:
    * Tensor Toolbox
    * Matlab Toolbox for Multilinear Principal Component Analysis (MPCA) by Haiping LU (https://www.mathworks.com/matlabcentral/fileexchange/26168-multilinear-principal-component-analysis--mpca-?s_tid=prof_contriblnk)
    * ListOfSequences.txt - file with information about video sequences (start frames, frames numbers, offsets)
    * ActorsData.txt - file with information about recorded persons (age, gender, height, width)
    * GPJATK_Release_Agreement.pdf - GPJATK dataset release agreement which must be accepted to use the database
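
    A minimal sketch of loading one of the Csv motion files is shown below, as referenced in the Csv item above. The file path and the way angle columns are identified are assumptions for illustration only; adapt them to the actual file names in the release.

    ```python
    import numpy as np
    import pandas as pd

    # Hypothetical path; pick any motion file from the Csv folder of the release.
    motion = pd.read_csv("Gait_Data/MoCap/Csv/P01_S01.csv")  # one row per frame

    # Per the description: angle attributes are in degrees, distances in millimeters.
    angle_cols = [c for c in motion.columns if "angle" in c.lower()]  # naming assumption
    motion[angle_cols] = np.deg2rad(motion[angle_cols])               # degrees -> radians

    # MoCap runs at 100 Hz and video at 25 Hz, so every 4th MoCap frame
    # aligns with a video frame.
    video_aligned = motion.iloc[::4].reset_index(drop=True)
    ```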

    =======================
    Project participants
    =======================
    Konrad Wojciechowski (Polish-Japanese Academy of Information Technology)
    Bogdan Kwolek

    =======================
    Acknowledgements
    =======================
    The recordings were made in the years 2012-2014 in the Human Motion Lab (Research and Development Center of the Polish-Japanese Academy of Information Technology) in Bytom as part of the projects: 1) „System with a library of modules for advanced analysis and an interactive synthesis of human motion” co-financed by the European Regional Development Fund under the Innovative Economy Operational Programme – Priority Axis 1; 2) OR00002111 financed by the National Centre for Research and Development (NCBiR).

    =======================
    Privacy statement
    =======================
    Data of human subjects is provided in coded form (without personal identifying information and with blurred faces to prevent identification).

    =======================
    Further information
    =======================
    For any questions, comments or other issues please contact Tomasz Krzeszowski

  11. Mall - Crowd Estimation

    • kaggle.com
    Updated Jan 17, 2021
    Cite
    Feras (2021). Mall - Crowd Estimation [Dataset]. https://www.kaggle.com/datasets/ferasoughali/mall-crowd-estimation/versions/3
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jan 17, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Feras
    Description

    This dataset is added for convenience.

    Data is provided in h5 files that contain images and labels.

    Images correspond to the original frames.

    Labels are density maps based on object positions in each frame.

    Density maps were scaled to 100 * the number of objects to make the network learn better.

    Density maps were produced by applying a convolution with a Gaussian kernel.

    The order of frames is maintained, and the dataset is split into 1500 frames for training and 500 frames for validation.

    The script used to generate this dataset can be found in this GitHub repo: https://github.com/NeuroSYS-pl/objects_counting_dmap
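
    A minimal sketch of recovering per-frame head counts from the density maps is shown below. It assumes the HDF5 layout used by the linked objects_counting_dmap repository ("images" and "labels" datasets); adjust the file and key names if your copy differs.

    ```python
    import h5py
    import numpy as np

    # Assumed layout: one dataset of frames and one of density maps per .h5 file.
    with h5py.File("train.h5", "r") as f:
        images = np.asarray(f["images"])   # original frames
        labels = np.asarray(f["labels"])   # density maps, scaled by 100

    # Each density map was scaled to 100 * object count, so the head count of
    # frame i is recovered by summing the map and dividing by 100.
    counts = labels.reshape(len(labels), -1).sum(axis=1) / 100.0
    print(images.shape, counts[:5])
    ```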

    Official link: https://personal.ie.cuhk.edu.hk/~ccloy/downloads_mall_dataset.html

    Details: The mall dataset was collected from a publicly accessible webcam for crowd counting and profiling research.

    Ground truth: Over 60,000 pedestrians were labelled in 2000 video frames. Data is annotated exhaustively by labeling the head position of every pedestrian in all frames.

    Video length: 2000 frames; frame size: 640×480; frame rate: < 2 Hz

    The dataset is intended for research purposes only and as such cannot be used commercially. Please cite the following publication(s) when this dataset is used in any academic and research reports.

    References From Semi-Supervised to Transfer Counting of Crowds C. C. Loy, S. Gong, and T. Xiang in Proceedings of IEEE International Conference on Computer Vision, pp. 2256-2263, 2013 (ICCV)

    Cumulative Attribute Space for Age and Crowd Density Estimation K. Chen, S. Gong, T. Xiang, and C. C. Loy in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pp. 2467-2474, 2013 (CVPR, Oral)

    Crowd Counting and Profiling: Methodology and Evaluation C. C. Loy, K. Chen, S. Gong, T. Xiang in S. Ali, K. Nishino, D. Manocha, and M. Shah (Eds.), Modeling, Simulation and Visual Analysis of Crowds, Springer, vol. 11, pp. 347-382, 2013

    Feature Mining for Localised Crowd Counting K. Chen, C. C. Loy, S. Gong, and T. Xiang British Machine Vision Conference, 2012 (BMVC)

  12. GAViD: Group Affect from ViDeos

    • zenodo.org
    csv, zip
    Updated Jun 5, 2025
    Cite
    Deepak Kumar; Puneet Kumar; Xiaobai Li; Balasubramanian Raman (2025). GAViD: Group Affect from ViDeos [Dataset]. http://doi.org/10.5281/zenodo.15448846
    Explore at:
    csv, zip (available download formats)
    Dataset updated
    Jun 5, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Deepak Kumar; Puneet Kumar; Xiaobai Li; Balasubramanian Raman
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jun 1, 2025
    Description

    Overview

    We introduce the Group Affect from ViDeos (GAViD) dataset, which comprises 5091 video clips with multimodal data (video, audio, and context), annotated with ternary valence and discrete emotion labels, and enriched with VideoGPT-generated contextual metadata and human-annotated action cues. We also present CAGNet, a baseline model for multimodal context-aware group affect recognition. CAGNet achieves 61.20% test accuracy on GAViD, comparable to state-of-the-art performance in the field.

    NOTE: For now, we are providing only the Train video clips. The corresponding paper is under review in the ACM Multimedia 2025 Dataset Track. After its publication, access to the Validation and Test sets will be granted upon request and approval, in accordance with the Responsible Use Policy.

    Dataset Description

    GAViD is a large-scale, in-the-wild multimodal dataset of 5091 samples, each annotated with the elements listed below. The following sections describe its key details and compilation procedure.

    1. Raw video clips of an average duration of five seconds,
    2. Audio aligned with the video clips,
    3. Contextual metadata (scene descriptions, event labels) generated by a multimodal LLM and human-verified,
    4. Group affect labels: ternary valence (positive, neutral, negative) and five discrete emotions (happy, sad, fear, anger, neutral),
    5. Emotion intensity ratings (high, medium, low),
    6. Interaction type labels (cooperative, hostile, neutral),
    7. Action cues (e.g. smiling, clapping, shouting, dancing, singing).

    Dataset details

    • Number of clips (samples) in GAViD: 5130
    • Number of samples with some problem: 39
    • Number of samples after filtering: 5,091
    • Duration per clip: 5 sec
    • Clip count per video: 1–35
    • Dataset split: Train: 3503; Val: 542; Test: 1046
    • Affect labels (class-wise distribution): Positive: 2600; Negative: 1189; Neutral: 1302
    • Emotion label distribution: Neutral: 1522; Happy: 2428; Anger: 884; Sad: 201; Fear: 56

    Keywords used to search for the raw videos on YouTube

    Positive | Positive | Negative | Negative | Neutral | Neutral
    Team Celebration | Happy | Protest | Angry Sport | Group Meeting | Panel Discussion
    Group Meeting | Video Conference | Heated Argument | Violent Protest | Parliament speech | People on street
    Get Together | Meeting | Emotional breakdown in Public | Aggressive Argument | People walking on street | Team brainstorming Session
    Celebration | Press Conference | Spiritual Gathering | Aggressive Group | Team Building Activities | Group Discussion
    Religious gathering | Talk Show | Street Race | Condolence | Group work session | Team Planning session
    Farewell | Group Performance | Group Fight | Wrestling | Students in Discussion | Wedding Group Dance
    People Dancing on Street | Street Comedy | MMA Fight | Violence | Roundtable Discussion | Oath
    Wedding Performance | Dhol masti | Boxing | Silent Protest | Mental health address | General Talk
    Couple group dance | Comedy show | People in the fight | Group Fight | Wedding Celebration | Festival Celebration

    Emotion Recognition Results using CAGNet

    Model | Val Acc. | Val F1 | Test Acc. | Test F1
    CAGNet | 62.55% | 0.454 | 60.33% | 0.448

    Components of the Dataset

    The dataset comprises two main components:

    • GAViD_train.csv file: contains the bin number used by Labelbox in the annotation process, video_id, group_emotion (Positive, Negative, Neutral), specific_emotion (happy, sad, fear, anger, neutral), emotion_intensity, interaction_type, action_cues, and a video description generated using the Video-ChatGPT model.
    • GAViD_Train_VideoClips.zip folder: contains the video clips of the train set. (For now, only the Train video clips are provided; Validation and Test video clips will be provided upon request.)

    Data Format and Fields of the CSV File

    The dataset is structured as the GAViD CSV file along with the corresponding videos in related folders (a minimal loading sketch follows the field list below). The CSV file includes the following fields:

    • Video_ID: Unique Identifier of a video
    • Group_Affect: Positive, Negative, Neutral
    • Descrete_Emotion: Happy, Sad, Fear, Anger, Neutral
    • Emotion_Intensity: High, Medium, Low
    • Interaction_Type: Cooperative, Hostile, Neutral
    • Action_Cues: e.g. Smiling, Clapping, Shouting, Dancing, Singing etc.
    • Context: Each video clip's summary generated from the Video-ChatGPT model.
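
    A minimal loading sketch using the fields above is given below. The column spellings follow this description (including "Descrete_Emotion") and may need adjusting to match the released CSV.

    ```python
    import pandas as pd

    # File name as given in "Components of the Dataset" above.
    df = pd.read_csv("GAViD_train.csv")

    print(df["Group_Affect"].value_counts())      # Positive / Negative / Neutral split
    print(df["Descrete_Emotion"].value_counts())  # Happy / Sad / Fear / Anger / Neutral

    # Example: cooperative, high-intensity positive group clips only.
    subset = df[(df["Group_Affect"] == "Positive")
                & (df["Interaction_Type"] == "Cooperative")
                & (df["Emotion_Intensity"] == "High")]
    print(len(subset))
    ```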

    Ethical considerations, data privacy and misuse prevention

    • Data Collection and Consent: The data collection and annotation strictly followed established ethical protocols in line with YouTube's Terms, which state “Public videos with a Creative Commons license may be reused”. We downloaded only public-domain videos licensed under Creative Commons (CC BY 4.0), which “allows others to share, copy and redistribute the material in any medium or format, and to adapt, remix, transform, and build upon it for any purpose, even commercially”.
    • Privacy: All content was reviewed to ensure no private or sensitive information is present. Faces are included only from public-domain videos as needed for group affect research; only group-level content is released, with no attempt or risk of individual identification. Other personally identifiable information, such as names, addresses, and contact details, was removed.

    Code and Citation

    • Code Repository: https://github.com/deepakkumar-iitr/GAViD/tree/main
    • Citing the Dataset: Users of the dataset should cite the corresponding paper described at the above GitHub Repository.

    License & Access

    • This dataset is released for academic research only and is free to researchers from educational or research institutions.

  13. YouTube Statistics By Content Ecosystem

    • market.biz
    Updated Jul 8, 2025
    Cite
    Market.biz (2025). YouTube Statistics By Content Ecosystem [Dataset]. https://market.biz/youtube-statistics/
    Explore at:
    Dataset updated
    Jul 8, 2025
    Dataset provided by
    Market.biz
    License

    https://market.biz/privacy-policy

    Time period covered
    2022 - 2032
    Area covered
    South America, Australia, Africa, Europe, North America, ASIA, YouTube
    Description

    Introduction

    YouTube Statistics: YouTube dominates the digital landscape with 2.70 billion monthly active users as of mid-2025, making it the second-largest search engine after Google and the second-largest social platform worldwide after Facebook.

    People watch more than 1 billion hours of video on YouTube; that is a million years' worth of attention span. With over 20 million new videos uploaded to the platform every day, the YouTube content ecosystem is practically endless. Short-form video lovers have not been ignored either.

    With an astonishing 70 billion views a day on YouTube Shorts, these viewers are generating a new level of interaction and engagement across the platform. Mobile dominates, of course: 63% of watch time happens on mobile devices. With over 100 million subscribers to YouTube Premium and YouTube Music on top of the free tier, YouTube is indeed a premium entertainment platform.

  14. Gender Detection & Classification - Face Dataset

    • kaggle.com
    Updated Oct 31, 2023
    Cite
    Training Data (2023). Gender Detection & Classification - Face Dataset [Dataset]. https://www.kaggle.com/datasets/trainingdatapro/gender-detection-and-classification-image-dataset
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Oct 31, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Training Data
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description

    Gender Detection & Classification - face recognition dataset

    The dataset is created on the basis of the Face Mask Detection dataset

    Dataset Description:

    The dataset comprises a collection of photos of people, organized into folders labeled "women" and "men." Each folder contains a significant number of images to facilitate training and testing of gender detection algorithms or models.

    The dataset contains a variety of images capturing female and male individuals from diverse backgrounds, age groups, and ethnicities.

    This labeled dataset can be utilized as training data for machine learning models, computer vision applications, and gender detection algorithms.

    💴 For Commercial Usage: the full version of the dataset includes 376,000+ photos of people. Leave a request on TrainingData to buy the dataset.

    Metadata for the full dataset:

    • assignment_id - unique identifier of the media file
    • worker_id - unique identifier of the person
    • age - age of the person
    • true_gender - gender of the person
    • country - country of the person
    • ethnicity - ethnicity of the person
    • photo_1_extension, photo_2_extension, photo_3_extension, photo_4_extension - photo extensions in the dataset
    • photo_1_resolution, photo_2_resolution, photo_3_resolution, photo_4_resolution - photo resolutions in the dataset

    OTHER BIOMETRIC DATASETS:

    💴 Buy the Dataset: This is just an example of the data. Leave a request on https://trainingdata.pro/datasets to learn about the price and buy the dataset

    Content

    The dataset is split into train and test folders. Each folder includes:
    • women and men folders - images of people of the corresponding gender,
    • a .csv file - information about the images and people in the dataset

    File with the extension .csv (a minimal loading sketch follows this list):

    • file: link to access the file,
    • gender: gender of a person in the photo (woman/man),
    • split: classification on train and test
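
    A minimal sketch of reading this metadata file is shown below; the CSV file name is hypothetical, and the field names follow the list above.

    ```python
    import pandas as pd

    # Hypothetical file name; use the .csv shipped inside the train/ (or test/) folder.
    df = pd.read_csv("train/train.csv")

    print(df["gender"].value_counts())                    # woman / man balance
    train_files = df.loc[df["split"] == "train", "file"]  # image links in the train split
    ```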

    TrainingData provides high-quality data annotation tailored to your needs

    keywords: biometric system, biometric system attacks, biometric dataset, face recognition database, face recognition dataset, face detection dataset, facial analysis, gender detection, supervised learning dataset, gender classification dataset, gender recognition dataset

  15. Countries with the most YouTube users 2025

    • statista.com
    • ai-chatbox.pro
    Updated Feb 17, 2025
    Cite
    Statista (2025). Countries with the most YouTube users 2025 [Dataset]. https://www.statista.com/statistics/280685/number-of-monthly-unique-youtube-users/
    Explore at:
    Dataset updated
    Feb 17, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Feb 2025
    Area covered
    Worldwide, YouTube
    Description

    As of February 2025, India was the country with the largest YouTube audience by far, with approximately 491 million users engaging with the popular social video platform. The United States followed, with around 253 million YouTube viewers. Brazil came in third, with 144 million users watching content on YouTube. The United Kingdom saw around 54.8 million internet users engaging with the platform in the examined period.

    What country has the highest percentage of YouTube users?
    In July 2024, the United Arab Emirates was the country with the highest YouTube penetration worldwide, as around 94 percent of the country's digital population engaged with the service. In 2024, YouTube counted around 100 million paid subscribers for its YouTube Music and YouTube Premium services.

    YouTube mobile markets
    In 2024, YouTube was among the most popular social media platforms worldwide. In terms of revenue, the YouTube app generated approximately 28 million U.S. dollars in the United States in January 2024, as well as 19 million U.S. dollars in Japan.

  16. Video Analytics Market Report by Component (Software, Services), Deployment...

    • imarcgroup.com
    pdf, excel, csv, ppt
    Cite
    IMARC Group, Video Analytics Market Report by Component (Software, Services), Deployment Type (On-Premises, Cloud), Application (Incident Detection, Intrusion Management, People/Crowd Counting, Traffic Monitoring, Automatic Number Plate Recognition, Facial Recognition, and Others), Architecture Type (Edge-Based, Server-Based), Organization Size (Small and Medium Enterprise, Large Enterprise), End User (BFSI, Retail, Critical Infrastructure, Traffic Management, Transportation and Logistics, Hospitality and Entertainment, Defense and Security, and Others), and Region 2025-2033 [Dataset]. https://www.imarcgroup.com/video-analytics-market
    Explore at:
    pdf, excel, csv, ppt (available download formats)
    Dataset provided by
    Imarc Group
    Authors
    IMARC Group
    License

    https://www.imarcgroup.com/privacy-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    The global video analytics market size reached USD 8.5 Billion in 2024. Looking forward, IMARC Group expects the market to reach USD 31.0 Billion by 2033, exhibiting a growth rate (CAGR) of 15.4% during 2025-2033. The market is experiencing steady growth driven by the growing need for enhanced surveillance across various industries to prevent unauthorized access, advancements in artificial intelligence (AI) and machine learning (ML) technologies, and increasing focus on cost-effective solutions.
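
    As a quick arithmetic check (not taken from the report itself), the stated CAGR is consistent with the 2024 base value and the 2033 forecast:

    ```python
    # Growing from USD 8.5 Billion in 2024 to USD 31.0 Billion in 2033 spans 9 years.
    start, end, years = 8.5, 31.0, 2033 - 2024
    cagr = (end / start) ** (1 / years) - 1
    print(f"{cagr:.1%}")  # ~15.5%, in line with the stated 15.4% CAGR
    ```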

    Report Attribute | Key Statistics
    Base Year | 2024
    Forecast Years | 2025-2033
    Historical Years | 2019-2024
    Market Size in 2024 | USD 8.5 Billion
    Market Forecast in 2033 | USD 31.0 Billion
    Market Growth Rate (2025-2033) | 15.4%

    IMARC Group provides an analysis of the key trends in each segment of the market, along with forecasts at the global, regional, and country levels for 2025-2033. Our report has categorized the market based on component, deployment type, application, architecture type, organization size, and end user.

  17. Data from: Quantifying the Size and Geographic Extent of CCTV's Impact on...

    • icpsr.umich.edu
    • datasets.ai
    • +1more
    Updated Aug 25, 2017
    Cite
    Ratcliffe, Jerry; Groff, Elizabeth (2017). Quantifying the Size and Geographic Extent of CCTV's Impact on Reducing Crime in Philadelphia, Pennsylvania, 2003-2013 [Dataset]. http://doi.org/10.3886/ICPSR35514.v1
    Explore at:
    Dataset updated
    Aug 25, 2017
    Dataset provided by
    Inter-university Consortium for Political and Social Research (https://www.icpsr.umich.edu/web/pages/)
    Authors
    Ratcliffe, Jerry; Groff, Elizabeth
    License

    https://www.icpsr.umich.edu/web/ICPSR/studies/35514/terms

    Time period covered
    Jan 2003 - Dec 2013
    Area covered
    Philadelphia, Pennsylvania
    Description

    These data are part of NACJD's Fast Track Release and are distributed as they were received from the data depositor. The files have been zipped by NACJD for release, but not checked or processed except for the removal of direct identifiers. Users should refer to the accompanying readme file for a brief description of the files available with this collection and consult the investigator(s) if further information is needed. This study was designed to investigate whether the presence of CCTV cameras can reduce crime by studying the cameras and crime statistics of a controlled area. The viewsheds of over 100 CCTV cameras within the city of Philadelphia, Pennsylvania were defined and grouped into 13 clusters, and camera locations were digitally mapped. Crime data from 2003-2013 was collected from areas that were visible to the selected cameras, as well as data from control and displacement areas using an incident reporting database that records the location of crime events. Demographic information was also collected from the mapped areas, such as population density, household information, and data on the specific camera(s) in the area. This study also investigated the perception of CCTV cameras, and interviewed members of the public regarding topics such as what they thought the camera could see, who was watching the camera feed, and if they were concerned about being filmed.

  18. replicAnt - Plum2023 - Detection & Tracking Datasets and Trained Networks

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Apr 21, 2023
    + more versions
    Cite
    Fabian Plum; René Bulla; Hendrik Beck; Natalie Imirzian; David Labonte (2023). replicAnt - Plum2023 - Detection & Tracking Datasets and Trained Networks [Dataset]. http://doi.org/10.5281/zenodo.7849417
    Explore at:
    zip (available download formats)
    Dataset updated
    Apr 21, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Fabian Plum; René Bulla; Hendrik Beck; Natalie Imirzian; David Labonte
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains all recorded and hand-annotated data, all synthetically generated data, and representative trained networks used for the detection and tracking experiments in the manuscript "replicAnt - generating annotated images of animals in complex environments using Unreal Engine". Unless stated otherwise, all 3D animal models used in the synthetically generated data were created with the open-source photogrammetry platform scAnt (peerj.com/articles/11155/). All synthetic data was generated with the associated replicAnt project available from https://github.com/evo-biomech/replicAnt.

    Abstract:

    Deep learning-based computer vision methods are transforming animal behavioural research. Transfer learning has enabled work in non-model species, but still requires hand-annotation of example footage, and is only performant in well-defined conditions. To overcome these limitations, we created replicAnt, a configurable pipeline implemented in Unreal Engine 5 and Python, designed to generate large and variable training datasets on consumer-grade hardware instead. replicAnt places 3D animal models into complex, procedurally generated environments, from which automatically annotated images can be exported. We demonstrate that synthetic data generated with replicAnt can significantly reduce the hand-annotation required to achieve benchmark performance in common applications such as animal detection, tracking, pose-estimation, and semantic segmentation; and that it increases the subject-specificity and domain-invariance of the trained networks, so conferring robustness. In some applications, replicAnt may even remove the need for hand-annotation altogether. It thus represents a significant step towards porting deep learning-based computer vision tools to the field.

    Benchmark data

    Two video datasets were curated to quantify detection performance; one in laboratory and one in field conditions. The laboratory dataset consists of top-down recordings of foraging trails of Atta vollenweideri (Forel 1893) leaf-cutter ants. The colony was collected in Uruguay in 2014, and housed in a climate chamber at 25°C and 60% humidity. A recording box was built from clear acrylic, and placed between the colony nest and a box external to the climate chamber, which functioned as feeding site. Bramble leaves were placed in the feeding area prior to each recording session, and ants had access to the recording area at will. The recorded area was 104 mm wide and 200 mm long. An OAK-D camera (OpenCV AI Kit: OAK-D, Luxonis Holding Corporation) was positioned centrally 195 mm above the ground. While keeping the camera position constant, lighting, exposure, and background conditions were varied to create recordings with variable appearance: The “base” case is an evenly lit and well exposed scene with scattered leaf fragments on an otherwise plain white backdrop. A “bright” and “dark” case are characterised by systematic over- or underexposure, respectively, which introduces motion blur, colour-clipped appendages, and extensive flickering and compression artefacts. In a separate well exposed recording, the clear acrylic backdrop was substituted with a printout of a highly textured forest ground to create a “noisy” case. Last, we decreased the camera distance to 100 mm at constant focal distance, effectively doubling the magnification, and yielding a “close” case, distinguished by out-of-focus workers. All recordings were captured at 25 frames per second (fps).

    The field datasets consists of video recordings of Gnathamitermes sp. desert termites, filmed close to the nest entrance in the desert of Maricopa County, Arizona, using a Nikon D850 and a Nikkor 18-105 mm lens on a tripod at camera distances between 20 cm to 40 cm. All video recordings were well exposed, and captured at 23.976 fps.

    Each video was trimmed to the first 1000 frames, and contains between 36 and 103 individuals. In total, 5000 and 1000 frames were hand-annotated for the laboratory- and field-dataset, respectively: each visible individual was assigned a constant size bounding box, with a centre coinciding approximately with the geometric centre of the thorax in top-down view. The size of the bounding boxes was chosen such that they were large enough to completely enclose the largest individuals, and was automatically adjusted near the image borders. A custom-written Blender Add-on aided hand-annotation: the Add-on is a semi-automated multi animal tracker, which leverages blender’s internal contrast-based motion tracker, but also include track refinement options, and CSV export functionality. Comprehensive documentation of this tool and Jupyter notebooks for track visualisation and benchmarking is provided on the replicAnt and BlenderMotionExport GitHub repositories.

    Synthetic data generation

    Two synthetic datasets, each with a population size of 100, were generated from 3D models of Atta vollenweideri leaf-cutter ants. All 3D models were created with the scAnt photogrammetry workflow. A “group” population was based on three distinct 3D models of an ant minor (1.1 mg), a media (9.8 mg), and a major (50.1 mg) (see 10.5281/zenodo.7849059). To approximately simulate the size distribution of A. vollenweideri colonies, these models make up 20%, 60%, and 20% of the simulated population, respectively. A 33% within-class scale variation was used, with default hue, contrast, and brightness subject material variation. A “single” population was generated using the major model only, with 90% scale variation, but equal material variation settings.
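
    The sampling logic behind such a population can be illustrated as follows. This is a sketch only: replicAnt performs the actual sampling inside Unreal Engine 5, and the weights and scale range below simply restate the “group” settings described above.

    import random

    CASTES = ["minor", "media", "major"]   # the three scAnt-derived models
    WEIGHTS = [0.2, 0.6, 0.2]              # approximate colony size distribution

    def sample_population(n=100, scale_variation=0.33, seed=0):
        rng = random.Random(seed)
        population = []
        for _ in range(n):
            caste = rng.choices(CASTES, weights=WEIGHTS, k=1)[0]
            scale = 1.0 + rng.uniform(-scale_variation, scale_variation)
            population.append({"caste": caste, "scale": round(scale, 3)})
        return population

    print(sample_population(n=5))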

    A Gnathamitermes sp. synthetic dataset was generated from two hand-sculpted models: a worker and a soldier, which made up 80% and 20% of the simulated population of 100 individuals, respectively, with default hue, contrast, and brightness subject material variation. Both 3D models were created in Blender v3.1, using reference photographs.

    Each of the three synthetic datasets contains 10,000 images, rendered at a resolution of 1024 by 1024 px, using the default generator settings as documented in the Generator_example level file (see documentation on GitHub). To assess how the training dataset size affects performance, we trained networks on 100 (“small”), 1,000 (“medium”), and 10,000 (“large”) subsets of the “group” dataset. Generating 10,000 samples at the specified resolution took approximately 10 hours per dataset on a consumer-grade laptop (6-core 4 GHz CPU, 16 GB RAM, RTX 2070 Super).
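
    The subsets can be drawn by simple random sampling of image/annotation pairs; the sketch below assumes one numeric id per rendered image and is not part of the published pipeline.

    import random

    def make_subset(image_ids, size, seed=42):
        """Sample a fixed-size training subset without replacement."""
        rng = random.Random(seed)
        return sorted(rng.sample(image_ids, size))

    all_ids = list(range(10_000))   # one id per rendered image
    subsets = {name: make_subset(all_ids, n)
               for name, n in [("small", 100), ("medium", 1_000), ("large", 10_000)]}
    print({name: len(ids) for name, ids in subsets.items()})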


    Additionally, five datasets containing both real and synthetic images were curated. These “mixed” datasets combine image samples from the synthetic “group” dataset with image samples from the real “base” case. The ratio of real to synthetic images across the five datasets varied from 10/1 to 1/100.
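
    A mixed dataset at a given real-to-synthetic ratio could be assembled as sketched below; the snippet is illustrative only, and the id lists and ratios are placeholders rather than the files used in this study.

    import random

    def mix_datasets(real_ids, synthetic_ids, ratio_real, ratio_synth, seed=1):
        """Combine real and synthetic samples at a ratio of ratio_real : ratio_synth."""
        rng = random.Random(seed)
        n_units = min(len(real_ids) // ratio_real, len(synthetic_ids) // ratio_synth)
        mixed = (rng.sample(real_ids, n_units * ratio_real)
                 + rng.sample(synthetic_ids, n_units * ratio_synth))
        rng.shuffle(mixed)
        return mixed

    # Example: one real image per 100 synthetic images (the 1/100 case)
    mixed = mix_datasets(real_ids=[f"real_{i}" for i in range(5000)],
                         synthetic_ids=[f"synth_{i}" for i in range(10_000)],
                         ratio_real=1, ratio_synth=100)
    print(len(mixed))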

    Funding

    This study received funding from Imperial College’s President’s PhD Scholarship (to Fabian Plum), and is part of a project that has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation program (Grant agreement No. 851705, to David Labonte). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

  19. R

    EgoHands Object Detection Dataset - specific

    • public.roboflow.com
    zip
    Updated Apr 22, 2022
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    IU Computer Vision Lab (2022). EgoHands Object Detection Dataset - specific [Dataset]. https://public.roboflow.com/object-detection/hands/1
    Explore at:
    zipAvailable download formats
    Dataset updated
    Apr 22, 2022
    Dataset authored and provided by
    IU Computer Vision Lab
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Bounding Boxes of hands
    Description

    [Image: EgoHands Dataset]

    About this dataset

    The EgoHands dataset is a collection of 4800 annotated images of human hands from a first-person view originally collected and labeled by Sven Bambach, Stefan Lee, David Crandall, and Chen Yu of Indiana University.

    The dataset was captured via frames extracted from video recorded through head-mounted cameras on a Google Glass headset while performing four activities: building a puzzle, playing chess, playing Jenga, and playing cards. There are 100 labeled frames for each of the 48 video clips.

    Our modifications

    The original EgoHands dataset was labeled with polygons for segmentation and released in a Matlab binary format. We converted it to an object detection dataset using a modified version of this script from @molyswu and have archived it in many popular formats for use with your computer vision models.
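
    Conceptually, the conversion amounts to taking the axis-aligned extent of each segmentation polygon. The sketch below illustrates the idea; it is not the @molyswu script itself, and the example polygon is made up.

    def polygon_to_bbox(polygon):
        """Convert a segmentation polygon (list of (x, y) vertices) to a bounding box."""
        xs = [x for x, _ in polygon]
        ys = [y for _, y in polygon]
        return min(xs), min(ys), max(xs), max(ys)

    hand_polygon = [(412, 310), (455, 290), (498, 334), (470, 389), (420, 371)]
    print(polygon_to_bbox(hand_polygon))   # -> (412, 290, 498, 389)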

    After converting to bounding boxes for object detection, we noticed that there were several dozen unlabeled hands. We added these by hand and improved several hundred of the other labels that did not fully encompass the hands (usually to include omitted fingertips, knuckles, or thumbs). In total, 344 images' annotations were edited manually.

    We chose a new random train/test split of 80% training, 10% validation, and 10% testing. Notably, this is not the same split as in the original EgoHands paper.
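
    An 80/10/10 split of the 4,800 images can be reproduced in spirit (though not exactly, since the random seed used by Roboflow is not published) with a few lines of Python:

    import random

    def split_dataset(ids, train=0.8, valid=0.1, seed=0):
        """Shuffle ids and cut them into train/validation/test partitions."""
        rng = random.Random(seed)
        ids = ids[:]
        rng.shuffle(ids)
        n_train = int(len(ids) * train)
        n_valid = int(len(ids) * valid)
        return ids[:n_train], ids[n_train:n_train + n_valid], ids[n_train + n_valid:]

    train, valid, test = split_dataset(list(range(4800)))
    print(len(train), len(valid), len(test))   # 3840 480 480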

    There are two versions of the converted dataset available:

    • specific is labeled with four classes: myleft, myright, yourleft, and yourright, indicating which hand of which person (the viewer or the opponent across the table) is contained in the bounding box.
    • generic contains the same boxes, but with a single hand class.

    Using this dataset

    The authors have graciously allowed Roboflow to re-host this derivative dataset. It is released under a Creative Commons Attribution 4.0 (CC BY 4.0) license. You may use it for academic or commercial purposes, but must cite the original paper.

    Please use the following BibTeX:

    @inproceedings{egohands2015iccv,
      title     = {Lending A Hand: Detecting Hands and Recognizing Activities in Complex Egocentric Interactions},
      author    = {Sven Bambach and Stefan Lee and David Crandall and Chen Yu},
      booktitle = {IEEE International Conference on Computer Vision (ICCV)},
      year      = {2015}
    }

  20. m

    PkSLMNM: Pakistan Sign Language Manual and Non-Manual Gestures Dataset

    • data.mendeley.com
    Updated May 26, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Sameena Javaid (2022). PkSLMNM: Pakistan Sign Language Manual and Non-Manual Gestures Dataset [Dataset]. http://doi.org/10.17632/m3m9924p3v.1
    Explore at:
    Dataset updated
    May 26, 2022
    Authors
    Sameena Javaid
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Pakistan
    Description

    Sign language is a non-verbal form of communication used by people with impaired hearing and speech. Signers also use facial actions to provide sign language prosody, similar to intonation in spoken languages. Sign Language Recognition (SLR) is typically based on hand signs alone; however, facial expressions and body language also play an important role in communication, and this has not been analysed to its fullest potential. Here, we present a dataset that comprises manual (hand signs) and non-manual (facial expressions and body movements) gestures of Pakistan Sign Language (PSL). It contains videos of 7 basic affective expressions performed by 100 healthy individuals, provided in the easily accessible .MP4 format, which can be used to train and test robust models for real-time video applications. The data can also support facial feature detection, classification of subjects by gender and age, and insights into an individual’s interest and emotional state.
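
    For users working with the .MP4 recordings, frames can be extracted with OpenCV as sketched below; the file path is hypothetical and the sampling parameters are arbitrary choices, not part of the dataset specification.

    import cv2  # opencv-python

    def load_frames(video_path, every_nth=5, max_frames=64):
        """Read every n-th frame of a gesture video and return RGB arrays."""
        cap = cv2.VideoCapture(video_path)
        frames, index = [], 0
        while len(frames) < max_frames:
            ok, frame = cap.read()
            if not ok:
                break
            if index % every_nth == 0:
                frames.append(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
            index += 1
        cap.release()
        return frames

    frames = load_frames("PkSLMNM/happy/subject_001.mp4")   # hypothetical path
    print(len(frames))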
