100+ datasets found
  1. Person Identification 2 Dataset

    • universe.roboflow.com
    zip
    Updated Jun 15, 2023
    + more versions
    Cite
    Custom YOLO Dataset (2023). Person Identification 2 Dataset [Dataset]. https://universe.roboflow.com/custom-yolo-dataset/person-identification-dataset-2
    Explore at:
    Available download formats: zip
    Dataset updated
    Jun 15, 2023
    Dataset authored and provided by
    Custom YOLO Dataset
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Person Bounding Boxes
    Description

    Person Identification Dataset 2

    ## Overview
    
    Person Identification Dataset 2 is a dataset for object detection tasks - it contains Person annotations for 335 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
    ## License
    
    This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  2. Dataset for non-targeted urinary biomarkers

    • catalog.data.gov
    Updated Nov 12, 2020
    Cite
    U.S. EPA Office of Research and Development (ORD) (2020). Dataset for non-targeted urinary biomarkers [Dataset]. https://catalog.data.gov/dataset/dataset-for-non-targeted-urinary-biomarkers
    Explore at:
    Dataset updated
    Nov 12, 2020
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Description

    This dataset contains a summary of compounds found in human urine samples. This dataset is not publicly accessible because: EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). This dataset contains information about human research subjects. Because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed. It can be accessed through the following means: The original dataset contains identification information for the sample subjects and all of their descriptors including age, gender, race, and medical screening information. The analyzed data cannot be made publicly available. Format: This dataset contains a summary of compounds found in human urine samples. This dataset is associated with the following publication: O’Lenick, C., J. Pleil, M. Stiegel, J. Sobus, and A. Wallace. Detection and analysis of endogenous polar volatile organic compounds (PVOCs) in urine for human exposome research. BIOMARKERS. Taylor & Francis, Inc., Philadelphia, PA, USA, 24(3): 240-248, (2019).

  3. Person Find Dataset

    • universe.roboflow.com
    zip
    Updated Sep 6, 2023
    + more versions
    Cite
    kaosit (2023). Person Find Dataset [Dataset]. https://universe.roboflow.com/kaosit/person-find/dataset/2
    Explore at:
    Available download formats: zip
    Dataset updated
    Sep 6, 2023
    Dataset authored and provided by
    kaosit
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Person Bounding Boxes
    Description

    Person Find

    ## Overview
    
    Person Find is a dataset for object detection tasks - it contains Person annotations for 200 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
    ## License
    
    This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  4. Omaha Lead Study data

    • catalog.data.gov
    • datasets.ai
    • +1 more
    Updated Apr 21, 2022
    Cite
    U.S. EPA Office of Research and Development (ORD) (2022). Omaha Lead Study data [Dataset]. https://catalog.data.gov/dataset/omaha-lead-study-data
    Explore at:
    Dataset updated
    Apr 21, 2022
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Area covered
    Omaha
    Description

    We linked information on soil lead levels (SLL) at residential properties with children’s blood lead levels (BLLs), grouping children based on whether they had pre- and/or post-remediation BLLs. Our data include PII and we have a data use agreement that was negotiated between the Douglas County Health Department and the U.S. Environmental Protection Agency. This agreement states that, “Upon completion of this work described herein, all Restricted Data records shall be destroyed or returned … within 30 days of the completion of the work.” In addition, the Institutional Review Board (IRB) protocol (UNC-IRB No. 15-1629) further outlines how the confidentiality of the data will be protected during analysis. This dataset is not publicly accessible because: EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). This dataset contains information about human research subjects. Because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed. It can be accessed through the following means: Please contact Ellen Kirrane at kirrane.ellen@epa.gov. Format: Data is in tabular format. This dataset is associated with the following publication: Ye, D., J. Brown, D. Umbach, J. Adams, W. Thayer, M. Follansbee, and E. Kirrane. Estimating the effects of soil remediation on children’s blood lead near a former lead smelter in Omaha Nebraska, U.S. ENVIRONMENTAL HEALTH PERSPECTIVES. National Institute of Environmental Health Sciences (NIEHS), Research Triangle Park, NC, USA, 130(3): 037008 1-17, (2022).

  5. CT-FAN: A Multilingual dataset for Fake News Detection

    • data.niaid.nih.gov
    Updated Oct 23, 2022
    Cite
    Gautam Kishore Shahi (2022). CT-FAN: A Multilingual dataset for Fake News Detection [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4714516
    Explore at:
    Dataset updated
    Oct 23, 2022
    Dataset provided by
    Thomas Mandl
    Gautam Kishore Shahi
    Julia Maria Struß
    Juliane Köhler
    Michael Wiegand
    Melanie Siegel
    Description

    By downloading the data, you agree with the terms & conditions mentioned below:

    Data Access: The data in the research collection may only be used for research purposes. Portions of the data are copyrighted and have commercial value as data, so you must be careful to use them only for research purposes.

    Summaries, analyses and interpretations of the linguistic properties of the information may be derived and published, provided it is impossible to reconstruct the information from these summaries. You may not try identifying the individuals whose texts are included in this dataset. You may not try to identify the original entry on the fact-checking site. You are not permitted to publish any portion of the dataset besides summary statistics or share it with anyone else.

    We grant you the right to access the collection's content as described in this agreement. You may not otherwise make unauthorised commercial use of, reproduce, prepare derivative works, distribute copies, perform, or publicly display the collection or parts of it. You are responsible for keeping and storing the data in a way that others cannot access. The data is provided free of charge.

    Citation

    Please cite our work as

    @InProceedings{clef-checkthat:2022:task3, author = {K{\"o}hler, Juliane and Shahi, Gautam Kishore and Stru{\ss}, Julia Maria and Wiegand, Michael and Siegel, Melanie and Mandl, Thomas}, title = "Overview of the {CLEF}-2022 {CheckThat}! Lab Task 3 on Fake News Detection", year = {2022}, booktitle = "Working Notes of CLEF 2022---Conference and Labs of the Evaluation Forum", series = {CLEF~'2022}, address = {Bologna, Italy},}

    @article{shahi2021overview, title={Overview of the CLEF-2021 CheckThat! lab task 3 on fake news detection}, author={Shahi, Gautam Kishore and Stru{\ss}, Julia Maria and Mandl, Thomas}, journal={Working Notes of CLEF}, year={2021} }

    Problem Definition: Given the text of a news article, determine whether the main claim made in the article is true, partially true, false, or other (e.g., claims in dispute) and detect the topical domain of the article. This task will run in English and German.

    Task 3: Multi-class fake news detection of news articles (English). Sub-task A treats fake news detection as a four-class classification problem. Given the text of a news article, determine whether the main claim made in the article is true, partially true, false, or other. The training data will be released in batches and comprises roughly 1,264 English-language articles with their respective labels. Our definitions for the categories are as follows:

    False - The main claim made in an article is untrue.

    Partially False - The main claim of an article is a mixture of true and false information. The article contains partially true and partially false information but cannot be considered 100% true. It includes all articles in categories like partially false, partially true, mostly true, miscaptioned, misleading etc., as defined by different fact-checking services.

    True - This rating indicates that the primary elements of the main claim are demonstrably true.

    Other - An article that cannot be categorised as true, false, or partially false due to a lack of evidence about its claims. This category includes articles in dispute and unproven articles.
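
The category definitions above can be sketched as a small label-normalisation helper. This is only an illustration, not the organisers' code: the function name `normalise_rating` and the rule that any unrecognised verdict falls back to "other" are assumptions; only the verdicts grouped under "partially false" are taken from the definition above.

```python
# Verdicts that the task description groups under "partially false".
PARTIALLY_FALSE = {
    "partially false",
    "partially true",
    "mostly true",
    "miscaptioned",
    "misleading",
}


def normalise_rating(verdict: str) -> str:
    """Map a fact-checker verdict to one of the four task classes.

    Hypothetical helper: the fallback to "other" for unknown verdicts
    is an assumption, not something the task description specifies.
    """
    v = verdict.strip().lower()
    if v in PARTIALLY_FALSE:
        return "partially false"
    if v in {"true", "false"}:
        return v
    return "other"


if __name__ == "__main__":
    for raw in ["Mostly True", " false ", "unproven"]:
        print(raw, "->", normalise_rating(raw))
```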

    Cross-Lingual Task (German)

    Along with the multi-class task for the English language, we have introduced a task for a low-resource language. We will provide the test data in German. The idea of the task is to use the English data and the concept of transfer to build a classification model for the German language.

    Input Data

    The data will be provided in the format of Id, title, text, rating, the domain; the description of the columns is as follows:

    ID - Unique identifier of the news article

    Title - Title of the news article

    text - Text mentioned inside the news article

    our rating - Class of the news article as false, partially false, true, other

    Output data format

    public_id - Unique identifier of the news article

    predicted_rating - Predicted class

    Sample File

    public_id, predicted_rating
    1, false
    2, true
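
A submission file in the output format above could be produced along these lines. This is a minimal sketch: the function name `format_predictions` is hypothetical, and the exact delimiter spacing and line endings expected by the organisers are assumptions (the sketch writes plain comma-separated rows without a space after the comma).

```python
import csv
import io

# The four classes named in the task description.
VALID_RATINGS = ("false", "partially false", "true", "other")


def format_predictions(predictions):
    """Render (public_id, predicted_rating) pairs as CSV text.

    Raises ValueError if a rating is not one of the four task classes.
    """
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["public_id", "predicted_rating"])
    for public_id, rating in predictions:
        if rating not in VALID_RATINGS:
            raise ValueError(f"unknown rating: {rating!r}")
        writer.writerow([public_id, rating])
    return buf.getvalue()


if __name__ == "__main__":
    print(format_predictions([(1, "false"), (2, "true")]), end="")
```

Validating the rating before writing keeps a mistyped label from silently reaching the submission file.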

    IMPORTANT!

    We have used the data from 2010 to 2022, and the content of fake news is mixed up with several topics like elections, COVID-19 etc.

    Baseline: For this task, we have created a baseline system. The baseline system can be found at https://zenodo.org/record/6362498

    Related Work

    Shahi GK. AMUSED: An Annotation Framework of Multi-modal Social Media Data. arXiv preprint arXiv:2010.00502. 2020 Oct 1. https://arxiv.org/pdf/2010.00502.pdf

    G. K. Shahi and D. Nandini, “FakeCovid – a multilingual cross-domain fact check news dataset for covid-19,” in workshop Proceedings of the 14th International AAAI Conference on Web and Social Media, 2020. http://workshop-proceedings.icwsm.org/abstract?id=2020_14

    Shahi, G. K., Dirkson, A., & Majchrzak, T. A. (2021). An exploratory study of covid-19 misinformation on twitter. Online Social Networks and Media, 22, 100104. doi: 10.1016/j.osnem.2020.100104

    Shahi, G. K., Struß, J. M., & Mandl, T. (2021). Overview of the CLEF-2021 CheckThat! lab task 3 on fake news detection. Working Notes of CLEF.

    Nakov, P., Da San Martino, G., Elsayed, T., Barrón-Cedeño, A., Míguez, R., Shaar, S., ... & Mandl, T. (2021, March). The CLEF-2021 CheckThat! lab on detecting check-worthy claims, previously fact-checked claims, and fake news. In European Conference on Information Retrieval (pp. 639-649). Springer, Cham.

    Nakov, P., Da San Martino, G., Elsayed, T., Barrón-Cedeño, A., Míguez, R., Shaar, S., ... & Kartal, Y. S. (2021, September). Overview of the CLEF–2021 CheckThat! Lab on Detecting Check-Worthy Claims, Previously Fact-Checked Claims, and Fake News. In International Conference of the Cross-Language Evaluation Forum for European Languages (pp. 264-291). Springer, Cham.

  6. Data for: The Bystander Affect Detection (BAD) Dataset for Failure Detection...

    • data.qdr.syr.edu
    pdf, tsv, txt, zip
    Updated Sep 25, 2023
    Cite
    Alexandra Bremers; Xuanyu Fang; Natalie Friedman; Wendy Ju (2023). Data for: The Bystander Affect Detection (BAD) Dataset for Failure Detection in HRI [Dataset]. http://doi.org/10.5064/F6TAWBGS
    Explore at:
    Available download formats: zip(66872585), zip(67359564), zip(49981372), zip(45063165), zip(35942055), tsv(5431), zip(63732190), zip(32108293), zip(33064251), zip(49848937), zip(38858151), zip(137880775), zip(90804192), zip(36477139), zip(38068214), zip(36039067), zip(37592931), zip(34234760), zip(63445623), zip(38092264), zip(45582594), zip(50915158), zip(111033502), zip(32955394), zip(30549219), zip(39991378), zip(166237686), zip(50351519), zip(62744513), zip(46810648), zip(34379478), zip(35492684), zip(22036189), pdf(197935), zip(66187509), zip(40085473), zip(40798037), pdf(113804), zip(12931695), zip(31593404), zip(26677367), zip(35547615), tsv(244631), zip(35954889), txt(7329), zip(74593629), zip(52574377), zip(55483165), zip(31323914), zip(43519637), zip(42743107), zip(55790691), zip(50499507), zip(76761027), zip(38063092), zip(55654900), zip(30504764), zip(48203736), zip(40422817)
    Dataset updated
    Sep 25, 2023
    Dataset provided by
    Qualitative Data Repository
    Authors
    Alexandra Bremers; Xuanyu Fang; Natalie Friedman; Wendy Ju
    License

    https://qdr.syr.edu/policies/qdr-restricted-access-conditions

    Description

    Project Overview

    For a robot to repair its own error, it must first know it has made a mistake. One way that people detect errors is from the implicit reactions from bystanders – their confusion, smirks, or giggles clue us in that something unexpected occurred. To enable robots to detect and act on bystander responses to task failures, we developed a novel method to elicit bystander responses to human and robot errors.

    Data Overview

    This project introduces the Bystander Affect Detection (BAD) dataset – a dataset of videos of bystander reactions to videos of failures. This dataset includes 2,452 human reactions to failure, collected in contexts that approximate “in-the-wild” data collection – including natural variances in webcam quality, lighting, and background. The BAD dataset may be requested for use in related research projects. As the dataset contains facial video data of participants, access can be requested along with the presentation of a research protocol and data use agreement that protects participants.

    Data Collection Overview and Access Conditions

    Using 46 different stimulus videos featuring a variety of human and machine task failures, we collected a total of 2,452 webcam videos of human reactions from 54 participants. Recruitment happened through the online behavioral research platform Prolific (https://www.prolific.co/about), where the options were selected to recruit a gender-balanced sample across all countries available. Participants had to use a laptop or desktop. Compensation was set at the Prolific rate of $12/hr, which came down to about $8 per participant for about 40 minutes of participation. Participants agreed that their data can be shared for future research projects and the data were approved to be shared publicly by IRB review. However, considering the fact that this is a machine-learning dataset containing identifiable crowdsourced human subjects data, the research team has decided that potential secondary users of the data must meet the following criteria for the access request to be granted:

    1. Agreement to three usage terms:
       - I will not redistribute the contents of the BAD Dataset
       - I will not use videos for purposes outside of human interaction research (broadly defined as any project that aims to study or develop improvements to human interactions with technology to result in a better user experience)
       - I will not use the videos to identify, defame, or otherwise negatively impact the health, welfare, employment or reputation of human participants
    2. A description of what you want to use the BAD dataset for, indicating any applicable human subjects protection measures that are in place. (For instance, "Me and my fellow researchers at University of X, lab of Y, will use the BAD dataset to train a model to detect when our Nao robot interrupts people at awkward times. The PI is Professor Z. Our protocol was approved under IRB #.")
    3. A copy of the IRB record or ethics approval document, confirming the research protocol and institutional approval.

    Data Analysis

    To test the viability of the collected data, we used the Bystander Reaction Dataset as input to a deep-learning model, BADNet, to predict failure occurrence. We tested different data labeling methods and learned how they affect model performance, achieving precisions above 90%.

    Shared Data Organization

    This data project consists of 54 zipped folders of recorded video data organized by participant, totaling 2,452 videos. The accompanying documentation includes a file containing the text of the consent form used for the research project, an inventory of the stimulus videos used, aggregate survey data, this data narrative, and an administrative readme file.

    Special Notes

    The data were approved to be shared publicly by IRB review. However, considering the fact that this is a machine-learning dataset containing identifiable crowdsourced human subjects data, the research team has decided that potential secondary users of the data must meet specific criteria before they qualify for access. Please consult the Terms tab below for more details and follow the instructions there if interested in requesting access.

  7. Data associated with Wallis et al. 2024

    • gimi9.com
    • catalog.data.gov
    Cite
    Data associated with Wallis et al. 2024 [Dataset]. https://gimi9.com/dataset/data-gov_data-associated-with-wallis-et-al-2024
    Explore at:
    Description

    Metadata supporting Wallis et al. 2024 in Environment International. This dataset is not publicly accessible because: EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). This dataset contains information about human research subjects. Because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed. It can be accessed through the following means: Data from the National Children's Study must be accessed through the National Institutes of Health, National Institute of Child Health and Human Development's Data and Specimen Hub (DASH) at https://dash.nichd.nih.gov/. Format: Participant demographic, lifestyle, residence, occupational, and other types of data from questionnaire and observational survey instruments are in .csv and .xlsx files. PFAS measurements in serum and house dust in .csv files. This dataset is associated with the following publication: Wallis, D., K. Miller, N. Deluca, K. Thomas, C. Fuller, J. McCord, E. Cohen-Hubal, and J. Minucci. Understanding prenatal household exposures to per- and polyfluorylalkyl substances using paired Biological and dust measurements with sociodemographic and housing variables. ENVIRONMENT INTERNATIONAL. Elsevier B.V., Amsterdam, NETHERLANDS, 194(December): 109157, (2024).

  8. People Id Dataset

    • universe.roboflow.com
    zip
    Updated Jul 24, 2023
    Cite
    persons (2023). People Id Dataset [Dataset]. https://universe.roboflow.com/persons/people-id/model/3
    Explore at:
    Available download formats: zip
    Dataset updated
    Jul 24, 2023
    Dataset authored and provided by
    persons
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    People Bounding Boxes
    Description

    Here are a few use cases for this project:

    1. Retail Shop Analysis: The "People ID" computer vision model can be used to analyze customer behavior in a retail store. For example, it can track the movement of customers, identify their browsing and purchasing patterns, and help analyze peak hours and areas of the store that receive heavy footfall.

    2. Crowd Management: In large events like concerts, festivals, or sports gatherings, the model can be used to identify and count attendees, and to monitor crowd movement patterns for safety and security purposes.

    3. Security Surveillance: Use the model to enhance CCTV surveillance in public spaces or high-security areas. The technology can help in identifying unauthorized persons or unusual behaviors, contributing to safety and security.

    4. Health Compliance Monitoring: The model can be used in hospitals, offices, or public spaces to ensure compliance with health regulations like mask compliance during a pandemic, or tracking people's movements to trace potential infection transmission.

    5. Smart Home Applications: In smart home applications, the model can be used to identify specific family members for personalized experiences or detect unfamiliar faces for security reasons.

  9. System identification dataset

    • figshare.com
    zip
    Updated Oct 15, 2019
    Cite
    Mohsen Kharazihai Isfahani; Maryam Zekri; Hamid Reza Marateb; Miguel Angel Mañanas (2019). System identification dataset [Dataset]. http://doi.org/10.6084/m9.figshare.7891103.v1
    Explore at:
    Available download formats: zip
    Dataset updated
    Oct 15, 2019
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Mohsen Kharazihai Isfahani; Maryam Zekri; Hamid Reza Marateb; Miguel Angel Mañanas
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The dataset contains two sub-folders: (1) load sharing and (2) function approximation.

    1. Load sharing ("data.zip"): Each folder corresponds to a subject. The monopolar, single differential and double differential data are saved in the corresponding sub-folders 'mono', 'sd' and 'dd', respectively. In each sub-folder, the data are saved as '30.mat', '50.mat' or '70.mat', corresponding to 30%, 50% or 70% MVC isometric flexion-extension. The recording protocol can be found in the Word file 'report.doc' in this folder, in the subsection "experimental recording". Structure of the '.mat' files (they all have the same structure): Raw_Torque is the measured torque in ADC numbers; the structure 'TAB_ARV' holds the EMG envelopes for 'BB', 'BR', 'TM' and 'TL' (read the report for the methods and acronyms).

    2. Function approximation ("fun_approx.zip"): Multiple benchmark examples including a piecewise single variable function, five nonlinear dynamic plants with various nonlinear structures, the chaotic Mackey Glass time series (with different signal to noise ratio (SNR) and various chaotic degree) and the real-world Box-Jenkins gas furnace system are considered to verify the effectiveness of the proposed FJWNN model. The description ("info.pdf") and the entire simulated data as well as the results of our method on the training and test sets (in Excel files) were provided.
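
The load-sharing folder layout described above can be sketched as a path-building helper. This is an illustration under assumptions: the function name `expected_mat_files`, the extraction root `data` and the subject folder name `subject01` are hypothetical, since the description does not say how subject folders are named.

```python
from pathlib import Path

# Sub-folder names and MVC levels as listed in the dataset description.
RECORDINGS = ("mono", "sd", "dd")
MVC_LEVELS = (30, 50, 70)


def expected_mat_files(root, subject):
    """List the .mat files the description implies for one subject folder.

    `root` and `subject` are placeholders for wherever "data.zip" was
    extracted; the real subject folder names are not documented here.
    """
    base = Path(root) / subject
    return [
        base / recording / f"{level}.mat"
        for recording in RECORDINGS
        for level in MVC_LEVELS
    ]


if __name__ == "__main__":
    for path in expected_mat_files("data", "subject01"):
        print(path.as_posix())
```

Each listed file could then be opened with a MATLAB-file reader to access `Raw_Torque` and `TAB_ARV`; which reader works depends on the MAT-file version, which the description does not state.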

  10. Focus group data for factors in homeowners’ willingness to adopt IA systems...

    • gimi9.com
    Updated Oct 10, 2023
    + more versions
    Cite
    (2023). Focus group data for factors in homeowners’ willingness to adopt IA systems [Dataset]. https://gimi9.com/dataset/data-gov_focus-group-data-for-factors-in-homeowners-willingness-to-adopt-ia-systems
    Explore at:
    Dataset updated
    Oct 10, 2023
    Description

    Dataset is a Microsoft Word file that contains the transcripts of five focus groups of adopters and prospective adopters of innovative/alternative septic systems. Focus group participants were asked a series of questions to better understand why they would or would not choose to adopt the alternative and innovative septic systems. This dataset is not publicly accessible because: EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). This dataset contains information about human research subjects. Because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed. It can be accessed through the following means: Contact Kate Mulvaney, mulvaney.kate@epa.gov. Format: This is EPA-owned data, but SDMP was generated post clearance due to pre-RAPID/SciHub integration. Dataset is a Microsoft Word file that contains the transcripts of five focus groups of adopters and prospective adopters of innovative/alternative septic systems. Focus group participants were asked a series of questions to better understand why they would or would not choose to adopt the alternative and innovative septic systems. This dataset is associated with the following publication: Rudman, A., K. Mulvaney, N. Merrill, and K. Canfield. Factors in homeowners’ willingness to adopt nitrogen-reducing innovative/alternative septic systems. Frontiers in Marine Science. Frontiers, Lausanne, SWITZERLAND, 10: 1069599, (2023).

  11. Data from: Drinking Water Sources, Quality, and Associated Health Outcomes...

    • s.cnmilf.com
    • catalog.data.gov
    Updated Apr 20, 2025
    + more versions
    Cite
    U.S. EPA Office of Research and Development (ORD) (2025). Drinking Water Sources, Quality, and Associated Health Outcomes in Appalachian Virginia: A Risk Characterization Study in Two Counties [Dataset]. https://s.cnmilf.com/user74170196/https/catalog.data.gov/dataset/drinking-water-sources-quality-and-associated-health-outcomes-in-appalachian-virginia-a-ri-1eaa5
    Explore at:
    Dataset updated
    Apr 20, 2025
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Area covered
    Appalachia
    Description

    Data set contains sensitive PII and cannot be released publicly. This dataset is not publicly accessible because: EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). This dataset contains information about human research subjects. Because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed. It can be accessed through the following means: contact Tim Wade (wade.tim@epa.gov). Format: Data include personally identifiable information collected from living individuals. This dataset is associated with the following publication: Wade, T., A. Cohen, M. Raseduzzaman, B. O'Connell, T. Brown, T. Mami, L. Krometis, A. Hubbard, P. Scheuerman, M. Edwards, A. Darling, B. Pennala, S. Price, B. Lytton, E. Whettstone, S. Pholwat, S. Griffin, J. Kobylanski, and A. Egorov. Drinking Water Sources, Quality, and Associated Health Outcomes in Appalachian Virginia: A Risk Characterization Study in Two Counties. ENVIRONMENTAL HEALTH. BioMed Central Ltd, London, UK, 260: 114390, (2024).

  12. Dataset of firefighters absorption of PAHs and benzene during training...

    • cloud.csiss.gmu.edu
    • catalog.data.gov
    • +1 more
    Updated Mar 6, 2021
    Cite
    United States (2021). Dataset of firefighters absorption of PAHs and benzene during training exercises [Dataset]. http://doi.org/10.23719/1503338
    Explore at:
    Dataset updated
    Mar 6, 2021
    Dataset provided by
    United States
    License

    https://pasteur.epa.gov/license/sciencehub-license.html

    Description

    The dataset contains concentrations of toxicants in breath and urine collected from study participants. This dataset is not publicly accessible because: EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). This dataset contains information about human research subjects. Because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed. It can be accessed through the following means: By contacting CDC/NIOSH. Format: The dataset contains concentrations of toxicants in breath and urine collected from study participants.

    This dataset is associated with the following publication: Fent, K., C. Toennis, D. Sammons, S. Robertson, S. Bertke, A. Calafat, J. Pleil, A. Wallace, S. Kerber, D. Smith, and G. Horn. Firefighters' and instructors’ absorption of PAHs and benzene during training exercises. INTERNATIONAL JOURNAL OF HYGIENE AND ENVIRONMENTAL HEALTH. Elsevier B.V., Amsterdam, NETHERLANDS, 222(7): 991-1000, (2019).

  13. Tell City, IN Population Breakdown by Gender and Age

    • neilsberg.com
    csv, json
    Updated Sep 14, 2023
    + more versions
    Cite
    Neilsberg Research (2023). Tell City, IN Population Breakdown by Gender and Age [Dataset]. https://www.neilsberg.com/research/datasets/67b29206-3d85-11ee-9abe-0aa64bf2eeb2/
    Explore at:
    Available download formats: json, csv
    Dataset updated
    Sep 14, 2023
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Tell City
    Variables measured
    Male and Female Population Under 5 Years, Male and Female Population over 85 years, Male and Female Population Between 5 and 9 years, Male and Female Population Between 10 and 14 years, Male and Female Population Between 15 and 19 years, Male and Female Population Between 20 and 24 years, Male and Female Population Between 25 and 29 years, Male and Female Population Between 30 and 34 years, Male and Female Population Between 35 and 39 years, Male and Female Population Between 40 and 44 years, and 8 more
    Measurement technique
    The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates. To measure the three variables, namely (a) Population (Male), (b) Population (Female), and (c) Gender Ratio (Males per 100 Females), we initially analyzed and categorized the data for each of the gender classifications (biological sex) reported by the US Census Bureau across 18 age groups, ranging from under 5 years to 85 years and above. These age groups are described above in the variables section. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the population of Tell City by gender across 18 age groups. It lists the male and female population in each age group along with the gender ratio for Tell City. The dataset can be used to understand the population distribution of Tell City by gender and age. For example, using this dataset, we can identify the largest age group for both men and women in Tell City. It can also be used to see how the gender ratio changes from the youngest to the most senior age group, and to compare the male-to-female ratio across age groups for Tell City.

    Key observations

    Largest age group (population): Male # 25-29 years (373) | Female # 55-59 years (402). Source: U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.

    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.

    Age groups:

    • Under 5 years
    • 5 to 9 years
    • 10 to 14 years
    • 15 to 19 years
    • 20 to 24 years
    • 25 to 29 years
    • 30 to 34 years
    • 35 to 39 years
    • 40 to 44 years
    • 45 to 49 years
    • 50 to 54 years
    • 55 to 59 years
    • 60 to 64 years
    • 65 to 69 years
    • 70 to 74 years
    • 75 to 79 years
    • 80 to 84 years
    • 85 years and over

    Scope of gender:

    Please note that the American Community Survey asks a question about the respondent's current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data for biological sex, not gender. Respondents are expected to answer either Male or Female. Our research and this dataset mirror the data reported as Male and Female for gender distribution analysis.

    Variables / Data Columns

    • Age Group: This column displays the age group for the Tell City population analysis. There are 18 expected values, defined above in the age groups section.
    • Population (Male): This column shows the male population in Tell City.
    • Population (Female): This column shows the female population in Tell City.
    • Gender Ratio: Also known as the sex ratio, this column displays the number of males per 100 females in Tell City for each age group.
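The Gender Ratio column described above is simple arithmetic: males per 100 females. A minimal sketch of that calculation follows. The function name and the sample counts are hypothetical and are not taken from the dataset; returning `None` for an age group with no females is an assumption made here, not something the source specifies.

```python
from typing import Optional


def gender_ratio(males: int, females: int) -> Optional[float]:
    """Return the number of males per 100 females.

    Returns None when the female count is zero, because the ratio is
    undefined in that case (an assumption made for this sketch).
    """
    if females == 0:
        return None
    return 100.0 * males / females


# Hypothetical counts for one age group, for illustration only.
example = gender_ratio(120, 150)
print(example)
```

A value below 100 means the age group has fewer males than females; a value above 100 means the opposite.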

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research's aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is part of the main dataset for Tell City Population by Gender.

  14. p

    Count Yourself In Workforce Survey - Dataset - CKAN

    • ckan0.cf.opendata.inter.prod-toronto.ca
    Updated Sep 18, 2020
    + more versions
    Cite
    (2020). Count Yourself In Workforce Survey - Dataset - CKAN [Dataset]. https://ckan0.cf.opendata.inter.prod-toronto.ca/dataset/count-yourself-in-workforce-survey
    Explore at:
    Dataset updated
    Sep 18, 2020
    Description

    The CYI Survey invites employees to voluntarily disclose how they self-identify based on questions related to Indigenous identity, Black identity, gender, race/ethnicity, sexual orientation and if they identify as a person with a disability. The data displays the diversity within the workforce at the City of Toronto. The goal of the survey is to track progress towards realizing the City's Motto "Diversity Our Strength", and to continuously monitor and socialize diversity data across the City, in order to help inform decision-making and address gaps in representation across all levels at the City.

    About the Datasets

    The following datasets were collected through the City's CYI Workforce survey between 2013 and 2024. The data has been reported in aggregate formats that do not allow for the identification of individual employees.

    First Nations, Inuit, and Métis Data

    The City is working with an external working group of First Nations, Inuit, and Métis (FNIM) advisors to develop a framework for the collection and use of FNIM data. While this framework is in development, Indigenous data from CYI surveys conducted in 2022, 2023, and 2024 will not be made available until Ownership, Control, Access, and Possession (OCAP) and United Nations Declaration on the Rights of Indigenous Peoples (UNDRIP) principles have been applied. However, Indigenous data from 2018, 2019, 2020 and 2021 is still available. For questions related to the implications or considerations of the framework’s development, please contact dataequity@toronto.ca

  15. CMAQ_DATA

    • catalog.data.gov
    • s.cnmilf.com
    Updated Nov 12, 2020
    Cite
    U.S. EPA Office of Research and Development (ORD) (2020). CMAQ_DATA [Dataset]. https://catalog.data.gov/dataset/cmaq-data
    Explore at:
    Dataset updated
    Nov 12, 2020
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Description

    Data is CMAQ (Community Multiscale Air Quality) air quality modeling data contained in 12 km grids (covering the eastern US) and 36 km grids (covering the entire US) for the years 2004, 2005, and 2006. Each CMAQ grid contains a concentration value for an air pollutant (fine particulate matter, PM2.5), and this concentration value can be used to determine the impact of air pollution concentrations on hospital emergency department admissions for asthma, and hospital inpatient admissions for asthma, myocardial infarction (MI), and heart failure (HF) in Baltimore, Maryland. CMAQ data is in both .IOAPI (Input/Output Applications Programming Interface) format and .csv (comma-separated value) format.

    Health data was used in this analysis, but the health data cannot be released because it contains personally identifiable information (PII) on living individuals, and is protected by the Privacy Act of 1974 (as amended), the Health Insurance Portability and Accountability Act (HIPAA) of 1996 (as amended), and is exempt from Freedom of Information Act (FOIA) requests. The health dataset contains information about human research subjects, and access to it was limited by the Institutional Review Board (IRB) decision of 19 February 2014 (Protocol #13-76), and updated on 8 December 2016.

    This dataset is not publicly accessible because: EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). This dataset contains information about human research subjects. Because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed.

    It can be accessed through the following means: The following folder has been set aside to access this (CMAQ_Data-only) data: ftp://newftp.epa.gov/EPADataCommons/ORD/NERL_SED/EHCAB/Hall_ORD-028187/.

    Format: CMAQ data is in both .IOAPI and .csv formats, as described above.

    This dataset is associated with the following publication: Braggio, J., E. Hall, S. Weber, and A. Huff. Contribution of Satellite-Derived Aerosol Optical Depth PM2.5 Bayesian Concentration Surfaces to Respiratory-Cardiovascular Chronic Disease Hospitalizations in Baltimore, Maryland. ATMOSPHERE. MDPI AG, Basel, SWITZERLAND, 11(2): 209, (2020).

  16. Education Industry Data | Global Education Sector Professionals | Verified...

    • datarade.ai
    Updated Oct 27, 2021
    + more versions
    Cite
    Success.ai (2021). Education Industry Data | Global Education Sector Professionals | Verified LinkedIn Profiles from 700M+ Dataset | Best Price Guarantee [Dataset]. https://datarade.ai/data-products/education-industry-data-global-education-sector-professiona-success-ai
    Explore at:
    Available download formats: .bin, .json, .xml, .csv, .xls, .sql, .txt
    Dataset updated
    Oct 27, 2021
    Dataset provided by
    Area covered
    Brazil, Taiwan, Ascension and Tristan da Cunha, Gabon, Kiribati, Samoa, Jersey, Palestine, Mongolia, Wallis and Futuna
    Description

    Success.ai’s Education Industry Data provides access to comprehensive profiles of global professionals in the education sector. Sourced from over 700 million verified LinkedIn profiles, this dataset includes actionable insights and verified contact details for teachers, school administrators, university leaders, and other decision-makers. Whether your goal is to collaborate with educational institutions, market innovative solutions, or recruit top talent, Success.ai ensures your efforts are supported by accurate, enriched, and continuously updated data.

    Why Choose Success.ai’s Education Industry Data?

    1. Comprehensive Professional Profiles: Access verified LinkedIn profiles of teachers, school principals, university administrators, curriculum developers, and education consultants. AI-validated profiles ensure 99% accuracy, reducing bounce rates and enabling effective communication.
    2. Global Coverage Across Education Sectors: Includes professionals from public schools, private institutions, higher education, and educational NGOs. Covers markets across North America, Europe, APAC, South America, and Africa for a truly global reach.
    3. Continuously Updated Dataset: Real-time updates reflect changes in roles, organizations, and industry trends, ensuring your outreach remains relevant and effective.
    4. Tailored for Educational Insights: Enriched profiles include work histories, academic expertise, subject specializations, and leadership roles for a deeper understanding of the education sector.

    Data Highlights:

    • 700M+ Verified LinkedIn Profiles: Access a global network of education professionals.
    • 100M+ Work Emails: Direct communication with teachers, administrators, and decision-makers.
    • Enriched Professional Histories: Gain insights into career trajectories, institutional affiliations, and areas of expertise.
    • Industry-Specific Segmentation: Target professionals in K-12 education, higher education, vocational training, and educational technology.

    Key Features of the Dataset:

    1. Education Sector Profiles: Identify and connect with teachers, professors, academic deans, school counselors, and education technologists. Engage with individuals shaping curricula, institutional policies, and student success initiatives.
    2. Detailed Institutional Insights: Leverage data on school sizes, student demographics, geographic locations, and areas of focus. Tailor outreach to align with institutional goals and challenges.
    3. Advanced Filters for Precision Targeting: Refine searches by region, subject specialty, institution type, or leadership role. Customize campaigns to address specific needs, such as professional development or technology adoption.
    4. AI-Driven Enrichment: Enhanced datasets include actionable details for personalized messaging and targeted engagement. Highlight educational milestones, professional certifications, and key achievements.

    Strategic Use Cases:

    1. Product Marketing and Outreach: Promote educational technology, learning platforms, or training resources to teachers and administrators. Engage with decision-makers driving procurement and curriculum development.
    2. Collaboration and Partnerships: Identify institutions for collaborations on research, workshops, or pilot programs. Build relationships with educators and administrators passionate about innovative teaching methods.
    3. Talent Acquisition and Recruitment: Target HR professionals and academic leaders seeking faculty, administrative staff, or educational consultants. Support hiring efforts for institutions looking to attract top talent in the education sector.
    4. Market Research and Strategy: Analyze trends in education systems, curriculum development, and technology integration to inform business decisions. Use insights to adapt products and services to evolving educational needs.

    Why Choose Success.ai?

    1. Best Price Guarantee: Access industry-leading Education Industry Data at unmatched pricing for cost-effective campaigns and strategies.
    2. Seamless Integration: Easily integrate verified data into CRMs, recruitment platforms, or marketing systems using downloadable formats or APIs.
    3. AI-Validated Accuracy: Depend on 99% accurate data to reduce wasted outreach and maximize engagement rates.
    4. Customizable Solutions: Tailor datasets to specific educational fields, geographic regions, or institutional types to meet your objectives.

    Strategic APIs for Enhanced Campaigns:

    1. Data Enrichment API: Enrich existing records with verified education professional profiles to enhance engagement and targeting.
    2. Lead Generation API: Automate lead generation for a consistent pipeline of qualified professionals in the education sector.

    Success.ai’s Education Industry Data enables you to connect with educators, administrators, and decision-makers transforming global...

  17. S

    Data from: DroneRFb-DIR: An RF Signal Dataset for Non-cooperative Drone...

    • scidb.cn
    Updated Mar 18, 2025
    Cite
    dian zi yu xin xi xue bao (2025). DroneRFb-DIR: An RF Signal Dataset for Non-cooperative Drone Individual Identification [Dataset]. http://doi.org/10.57760/sciencedb.20454
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Mar 18, 2025
    Dataset provided by
    Science Data Bank
    Authors
    dian zi yu xin xi xue bao
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    RF drone detection is one of the critical methods for managing non-cooperative drones, and drone individual recognition (DIR) based on RF signals is a key procedure for drone detection. Given the current lack of available DIR datasets, we provide an open-source dataset named DroneRFb-DIR for RF-based drone individual recognition. In particular, we use a Software-Defined Radio (SDR) device to capture the RF signals exchanged between flying drones and their remote controllers, covering six types of drones with three different individuals of each type, as well as background signals in urban scenarios for reference. The captured signals are stored as raw I/Q data. Each type of data contains over 40 segments, with each segment comprising over 4 million sample points. The RF sampling band is set between 2.4 GHz and 2.48 GHz, covering flight control signals, video transmission signals, and interference signals from surrounding devices. The dataset has been annotated with detailed entity identifiers and Line-of-Sight or Non-Line-of-Sight scene labels.
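The description says the signals are stored as raw I/Q data but does not give the on-disk layout. The sketch below shows one common way to turn such data into complex samples, assuming interleaved little-endian 32-bit floats (I, Q, I, Q, ...). That layout and the function names `decode_iq` and `mean_power` are assumptions for illustration, so check the dataset's own documentation before relying on them.

```python
import struct
from typing import List


def decode_iq(raw: bytes) -> List[complex]:
    """Decode interleaved little-endian float32 I/Q pairs into complex samples.

    The interleaved float32 layout is an assumption; the listing does not
    state how the dataset's segments are encoded.
    """
    # iter_unpack raises struct.error if len(raw) is not a multiple of 8 bytes.
    return [complex(i, q) for i, q in struct.iter_unpack("<ff", raw)]


def mean_power(samples: List[complex]) -> float:
    """Average of |s|^2 over all samples (linear scale, arbitrary units)."""
    return sum(abs(s) ** 2 for s in samples) / len(samples)


# Two synthetic samples stand in for a real segment of several million points.
raw_bytes = struct.pack("<4f", 1.0, 0.0, 0.0, -1.0)
samples = decode_iq(raw_bytes)
print(samples, mean_power(samples))
```

For segments of several million points, a memory-mapped array would be more practical than a Python list; the standard library is used here only to keep the sketch self-contained.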

  18. n

    Build better LibGuides: A dataset of Political Science, Public Affairs, and...

    • data.niaid.nih.gov
    • search.dataone.org
    • +1 more
    zip
    Updated May 30, 2024
    Cite
    Annelise Sklar (2024). Build better LibGuides: A dataset of Political Science, Public Affairs, and International Studies LibGuides [Dataset]. http://doi.org/10.5061/dryad.prr4xgxvk
    Explore at:
    Available download formats: zip
    Dataset updated
    May 30, 2024
    Dataset provided by
    University of California, San Diego
    Authors
    Annelise Sklar
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    The dataset that accompanies the "Build Better LibGuides" chapter of Teaching Information Literacy in Political Science, Public Affairs, and International Studies. This dataset was created to compare current practices in Political Science, Public Affairs, and International Studies (PSPAIS) LibGuides with recommended best practices using a sample that represents a variety of academic institutions. Members of the ACRL Politics, Policy, and International Relations Section (PPIRS) were identified as the librarians most likely to be actively engaged with these specific subjects, so the dataset was scoped by identifying the institutions associated with the most active PPIRS members and then locating the LibGuides in these and related disciplines. The resulting dataset includes 101 guides at 46 institutions, for a total of 887 LibGuide tabs.

    Methods

    This dataset was created to compare current practices in Political Science, Public Affairs, and International Studies (PSPAIS) LibGuides with recommended best practices using a sample that represents a variety of academic institutions. Members of the ACRL Politics, Policy, and International Relations Section (PPIRS) were identified as the librarians most likely to be actively engaged with these specific subjects, so the dataset was scoped by identifying the institutions associated with the most active PPIRS members and then locating the LibGuides in these and related disciplines.

    Specifically, a student assistant collected the names and institutional affiliations of each member serving on a PPIRS committee as of July 1, 2021, 2022, and 2023. The student then removed the individual librarian names from the list and located the links to the Political Science or Government; Public Policy, Public Affairs, or Public Administration; and International Studies or International Relations LibGuides at each institution. The chapter author then confirmed and, in a few cases, added to the student's work and copied and pasted the tab names from each guide (which conveniently were also hyperlinked) into a Google Sheet. The resulting dataset included 101 guides at 46 institutions, for a total of 887 LibGuide tabs.

    A Google Apps script was used to extract the hyperlinks from the collected tab names and then a Python script was used to scrape the names of links included on each of the tabs. LibGuides from two institutions returned errors during the link name scraping process and were excluded in this part of the analysis.
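The description mentions a Python script that scraped the names of links on each tab, but the script itself is not shown. As a rough illustration of that kind of step, and not the author's actual code, the sketch below collects the visible text of every hyperlink in an HTML string using only the standard library. The class name `LinkNameParser`, the helper `extract_link_names`, and the sample HTML are hypothetical.

```python
from html.parser import HTMLParser
from typing import List, Optional, Tuple


class LinkNameParser(HTMLParser):
    """Collect (href, visible text) pairs for every <a href=...> element."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[Tuple[str, str]] = []
        self._href: Optional[str] = None  # href of the link currently open
        self._text: List[str] = []        # text fragments inside that link

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an open link with an href.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_link_names(html: str) -> List[Tuple[str, str]]:
    parser = LinkNameParser()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical fragment standing in for one downloaded LibGuide tab.
sample = '<ul><li><a href="https://example.org/a">JSTOR</a></li></ul>'
print(extract_link_names(sample))
```

Fetching each tab's HTML, and handling the request errors the description mentions for two institutions, is left out of the sketch.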

  19. h

    OMOP dataset: Hospital COVID patients: severity, acuity, therapies, outcomes...

    • healthdatagateway.org
    unknown
    Cite
    This publication uses data from PIONEER, an ethically approved database and analytical environment (East Midlands Derby Research Ethics 20/EM/0158), OMOP dataset: Hospital COVID patients: severity, acuity, therapies, outcomes [Dataset]. https://healthdatagateway.org/dataset/139
    Explore at:
    Available download formats: unknown
    Dataset authored and provided by
    This publication uses data from PIONEER, an ethically approved database and analytical environment (East Midlands Derby Research Ethics 20/EM/0158)
    License

    https://www.pioneerdatahub.co.uk/data/data-request-process/

    Description

    OMOP dataset: Hospital COVID patients: severity, acuity, therapies, outcomes Dataset number 2.0

    Coronavirus disease 2019 (COVID-19) was identified in January 2020. Currently, there have been more than 6 million cases & more than 1.5 million deaths worldwide. Some individuals experience severe manifestations of infection, including viral pneumonia, adult respiratory distress syndrome (ARDS) & death. There is a pressing need for tools to stratify patients, to identify those at greatest risk. Acuity scores are composite scores which help identify patients who are more unwell to support & prioritise clinical care. There are no validated acuity scores for COVID-19 & it is unclear whether standard tools are accurate enough to provide this support. This secondary care COVID OMOP dataset contains granular demographic, morbidity, serial acuity and outcome data to inform risk prediction tools in COVID-19.

    PIONEER geography: The West Midlands (WM) has a population of 5.9 million & includes a diverse ethnic & socio-economic mix. There is a higher than average percentage of minority ethnic groups. WM has a large number of elderly residents but is the youngest population in the UK. Each day >100,000 people are treated in hospital, see their GP or are cared for by the NHS. The West Midlands was one of the hardest hit regions for COVID admissions in both wave 1 & 2.

    EHR. University Hospitals Birmingham NHS Foundation Trust (UHB) is one of the largest NHS Trusts in England, providing direct acute services & specialist care across four hospital sites, with 2.2 million patient episodes per year, 2750 beds & 100 ITU beds. UHB runs a fully electronic healthcare record (EHR) (PICS; Birmingham Systems), a shared primary & secondary care record (Your Care Connected) & a patient portal “My Health”. UHB has cared for >5000 COVID admissions to date. This is a subset of data in OMOP format.

    Scope: All COVID swab confirmed hospitalised patients to UHB from January – August 2020. The dataset includes highly granular patient demographics & co-morbidities taken from ICD-10 & SNOMED-CT codes. Serial, structured data pertaining to care process (timings, staff grades, specialty review, wards), presenting complaint, acuity, all physiology readings (pulse, blood pressure, respiratory rate, oxygen saturations), all blood results, microbiology, all prescribed & administered treatments (fluids, antibiotics, inotropes, vasopressors, organ support), all outcomes.

    Available supplementary data: Health data preceding & following admission event. Matched “non-COVID” controls; ambulance, 111, 999 data, synthetic data. Further OMOP data available as an additional service.

    Available supplementary support: Analytics, Model build, validation & refinement; A.I.; Data partner support for ETL (extract, transform & load) process, Clinical expertise, Patient & end-user access, Purchaser access, Regulatory requirements, Data-driven trials, “fast screen” services.

  20. F

    African Facial Images Dataset | Selfie & ID Card Images

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Cite
    FutureBee AI (2022). African Facial Images Dataset | Selfie & ID Card Images [Dataset]. https://www.futurebeeai.com/dataset/image-dataset/facial-images-selfie-id-african
    Explore at:
    Available download formats: wav
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreement

    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    Welcome to the African Human Facial Images Dataset, curated to advance facial recognition technology and support the development of secure biometric identity systems, KYC verification processes, and AI-driven computer vision applications. This dataset is designed to serve as a robust foundation for real-world face matching and recognition use cases.

    Facial Image Data

    The dataset contains over 2,000 facial image sets of African individuals. Each set includes:

    Selfie Images: 5 high-quality selfie images taken under different conditions
    ID Card Images: 2 clear facial images extracted from different government-issued ID cards

    Diversity & Representation

    Geographic Diversity: Participants represent African countries including Kenya, Malawi, Nigeria, Ethiopia, Benin, Somalia, Uganda, and more
    Demographics: Individuals aged 18 to 70 years with a 60:40 male-to-female ratio
    File Formats: Images are provided in JPEG and HEIC formats for compatibility and quality retention

    Image Quality & Capture Conditions

    All images were captured with real-world variability to enhance dataset robustness:

    Lighting: Captured under diverse lighting setups to simulate real environments
    Backgrounds: A wide variety of indoor and outdoor backgrounds
    Device Quality: Captured using modern smartphones to ensure high resolution and clarity

    Metadata

    Each participant’s data is accompanied by rich metadata to support AI model training, including:

    Unique participant ID
    Image file names
    Age at the time of capture
    Gender
    Country of origin
    Demographic details
    File format information

    This metadata enables targeted filtering and training across diverse scenarios.

    Use Cases & Applications

    This dataset is ideal for a wide range of AI and biometric applications:

    Facial Recognition: Train accurate and generalizable face matching models
    KYC & Identity Verification: Enhance onboarding and compliance systems in fintech and government services
    Biometric Identification: Build secure facial recognition systems for access control and identity authentication
    Age Prediction: Train models to estimate age from facial features
    Generative AI: Provide reference data for synthetic face generation or augmentation tasks

    Secure & Ethical Collection

    Data Security: All images were securely stored and processed on FutureBeeAI’s proprietary platform
    Ethical Compliance: Data collection was conducted in full alignment with privacy laws and ethical standards
    Informed Consent: Every participant provided written consent, with full awareness of the intended uses of the data

    Dataset Updates & Customization

    To meet evolving AI demands, this dataset is regularly updated and can be customized.

Cite
Custom YOLO Dataset (2023). Person Identification 2 Dataset [Dataset]. https://universe.roboflow.com/custom-yolo-dataset/person-identification-dataset-2

Person Identification 2 Dataset


Explore at:
252 scholarly articles cite this dataset (View in Google Scholar)
Available download formats: zip
Dataset updated
Jun 15, 2023
Dataset authored and provided by
Custom YOLO Dataset
License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Variables measured
Person Bounding Boxes
Description

Person Identification Dataset 2

## Overview

Person Identification Dataset 2 is a dataset for object detection tasks - it contains Person annotations for 335 images.

## Getting Started

You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.

  ## License

  This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/CC BY 4.0).