100+ datasets found
  1. The dataset contains PII

    • catalog.data.gov
    • gimi9.com
    Updated Nov 12, 2020
    Cite
    U.S. EPA Office of Research and Development (ORD) (2020). The dataset contains PII [Dataset]. https://catalog.data.gov/dataset/the-dataset-contains-pii
    Explore at:
    Dataset updated
    Nov 12, 2020
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Description

    These data are interview transcripts with individuals who are users of the Smoke Sense app. This dataset is not publicly accessible because: EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). This dataset contains information about human research subjects. Because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed. It can be accessed through the following means: This data is available on request to approved individuals. Format: This data contains PII. These are interview transcripts. This dataset is associated with the following publication: Hano, M., L. Wei, B. Hubbell, and A. Rappold. Scaling Up: Citizen Science Engagement and Impacts Beyond the Individual. Citizen Science: Theory and Practice. Ubiquity Press, London, UK, 5(1): 1-13, (2020).

  2. Wikipedia notable people

    • kaggle.com
    zip
    Updated Jun 15, 2023
    Cite
    Konrad Banachewicz (2023). Wikipedia notable people [Dataset]. https://www.kaggle.com/datasets/konradb/wikipedia-notable-people
    Explore at:
    Available download formats: zip (268529204 bytes)
    Dataset updated
    Jun 15, 2023
    Authors
    Konrad Banachewicz
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    From the original paper:

    A new strand of literature aims at building the most comprehensive and accurate database of notable individuals. We collect a massive amount of data from various editions of Wikipedia and Wikidata. Using deduplication techniques over these partially overlapping sources, we cross-verify each retrieved information. For some variables, Wikipedia adds 15% more information when missing in Wikidata. We find very few errors in the part of the database that contains the most documented individuals but nontrivial error rates in the bottom of the notability distribution, due to sparse information and classification errors or ambiguity. Our strategy results in a cross-verified database of 2.29 million individuals (an elite of 1/43,000 of human being having ever lived), including a third who are not present in the English edition of Wikipedia.

  3. company-dataset

    • huggingface.co
    Updated Jul 28, 2025
    Cite
    Andrea Altomani (2025). company-dataset [Dataset]. https://huggingface.co/datasets/andreaaltomani/company-dataset
    Explore at:
    Dataset updated
    Jul 28, 2025
    Authors
    Andrea Altomani
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Free dataset containing about 24M companies. Originally compiled by People Data Labs and released under a free license. The data schema and more information on the dataset are at: https://docs.peopledatalabs.com/docs/free-company-dataset. This version was downloaded on 2025-07-28.

  4. black-people-liveness-detection-video-dataset

    • huggingface.co
    Updated Apr 11, 2024
    + more versions
    Cite
    Unique Data (2024). black-people-liveness-detection-video-dataset [Dataset]. https://huggingface.co/datasets/UniqueData/black-people-liveness-detection-video-dataset
    Explore at:
    Dataset updated
    Apr 11, 2024
    Authors
    Unique Data
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description

    Biometric Attack Dataset, Black People

    A similar dataset that includes all ethnicities: Anti Spoofing Real Dataset

    The dataset for face anti-spoofing and face recognition includes images and videos of black people. The dataset helps in enhancing the performance of the model by providing a wider range of data for a specific ethnic group. The videos were gathered by capturing faces of genuine individuals presenting spoofs, using facial presentations. Our dataset proposes… See the full description on the dataset page: https://huggingface.co/datasets/UniqueData/black-people-liveness-detection-video-dataset.

  5. YOLO HighVis and Person Detection Dataset

    • kaggle.com
    Updated Jun 5, 2024
    Cite
    Tudor Hirtopanu (2024). YOLO HighVis and Person Detection Dataset [Dataset]. https://www.kaggle.com/datasets/tudorhirtopanu/yolo-highvis-and-person-detection-dataset
    Explore at:
    Croissant: Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jun 5, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Tudor Hirtopanu
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset contains over 24,000 labels across almost 8,000 images, intended for training a YOLOv8 model to detect people and hi-vis jackets. It contains 2 directories, images and labels. Each image has a txt label file in which each line gives the class ID of the detected object followed by the normalised coordinates of the bounding box.
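    The label layout described above can be read with a few lines of Python. This is a minimal sketch, assuming each line follows the common YOLO detection convention of `class_id x_center y_center width height` with the four box values normalised to the 0-1 range; the description only says "normalised coordinates", so the field order is an assumption, and the names `YoloBox`, `parse_yolo_line`, and `to_pixel_corners` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class YoloBox:
    class_id: int
    x_center: float  # assumed: normalised box centre, 0-1
    y_center: float
    width: float     # assumed: normalised box size, 0-1
    height: float


def parse_yolo_line(line: str) -> YoloBox:
    """Parse one label line: a class ID followed by four normalised values."""
    parts = line.split()
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}: {line!r}")
    return YoloBox(int(parts[0]), *(float(p) for p in parts[1:]))


def to_pixel_corners(box: YoloBox, img_w: int, img_h: int) -> tuple[float, float, float, float]:
    """Convert a normalised centre/size box to (x_min, y_min, x_max, y_max) in pixels."""
    x_min = (box.x_center - box.width / 2) * img_w
    y_min = (box.y_center - box.height / 2) * img_h
    x_max = (box.x_center + box.width / 2) * img_w
    y_max = (box.y_center + box.height / 2) * img_h
    return x_min, y_min, x_max, y_max
```

    Which class ID corresponds to a person and which to a hi-vis jacket is not stated in the description, so check the dataset page before relying on a mapping.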

    This dataset was used to train a YOLO model for tracking people exclusively wearing a high-vis jacket. You may find the project here - https://github.com/tudorhirtopanu/YOLO-HiVis

  6. Dataset for Targeted GC-MS Analysis of Firefighters' Exhaled Breath

    • catalog.data.gov
    • datasets.ai
    • +1more
    Updated Nov 12, 2020
    Cite
    U.S. EPA Office of Research and Development (ORD) (2020). Dataset for Targeted GC-MS Analysis of Firefighters' Exhaled Breath [Dataset]. https://catalog.data.gov/dataset/dataset-for-targeted-gc-ms-analysis-of-firefighters-exhaled-breath
    Explore at:
    Dataset updated
    Nov 12, 2020
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Description

    This dataset includes a table of the VOC concentrations detected in firefighter breath samples. QQ-plots for benzene, toluene, and ethylbenzene levels in breath samples as well as box-and-whisker plots of pre-, post-, and 1 h post-exposure breath levels of VOCs for firefighters participating in attack, search, and outside ventilation positions are provided. Graphs detailing the responses of individuals to pre-, post-, and 1 h post-exposure concentrations of benzene, toluene, and ethylbenzene are shown. This dataset is not publicly accessible because: EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). This dataset contains information about human research subjects. Because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed. It can be accessed through the following means: The original dataset contains identification information for the firefighters who participated in the controlled structure burns. The analyzed tables and graphs can be made publicly available. Format: The original dataset contains identification information for the firefighters who participated in the controlled structure burns. The analyzed tables and graphs can be made publicly available. This dataset is associated with the following publication: Wallace, A., J. Pleil, K. Oliver, D. Whitaker, S. Mentese, K. Fent, and G. Horn. Targeted GC-MS analysis of firefighters’ exhaled breath: Exploring biomarker response at the individual level. JOURNAL OF OCCUPATIONAL AND ENVIRONMENTAL HYGIENE. Taylor & Francis, Inc., Philadelphia, PA, USA, 16(5): 355-366, (2019).

  7. Individuals and Households Program - Valid Registrations

    • catalog.data.gov
    Updated Jun 7, 2025
    Cite
    FEMA/Response and Recovery/Recovery Directorate (2025). Individuals and Households Program - Valid Registrations [Dataset]. https://catalog.data.gov/dataset/individuals-and-households-program-valid-registrations-nemis
    Explore at:
    Dataset updated
    Jun 7, 2025
    Dataset provided by
    Federal Emergency Management Agency (http://www.fema.gov/)
    Description

    This dataset contains FEMA applicant-level data for the Individuals and Households Program (IHP). All PII has been removed. The location is represented by county, city, and zip code. This dataset contains Individual Assistance (IA) applications from DR1439 (declared in 2002) to those declared over 30 days ago. The full data set is refreshed on an annual basis and refreshed weekly to update disasters declared in the last 18 months. This dataset includes all major disasters and includes only valid registrants (applied in a declared county, within the registration period, having damage due to the incident and damage within the incident period). Information about individual data elements and descriptions is listed in the metadata information within the dataset.

    Valid registrants may be eligible for IA assistance, which is intended to meet basic needs and supplement disaster recovery efforts. IA assistance is not intended to return disaster-damaged property to its pre-disaster condition. Disaster damage to secondary or vacation homes does not qualify for IHP assistance.

    Data comes from FEMA's National Emergency Management Information System (NEMIS) with raw, unedited, self-reported content and is subject to a small percentage of human error.

    Any financial information is derived from NEMIS and not FEMA's official financial systems. Due to differences in reporting periods, status of obligations and application of business rules, this financial information may differ slightly from official publication on public websites such as usaspending.gov. This dataset is not intended to be used for any official federal reporting.

    Citation: The Agency’s preferred citation for datasets (API usage or file downloads) can be found on the OpenFEMA Terms and Conditions page, Citing Data section: https://www.fema.gov/about/openfema/terms-conditions.

    Due to the size of this file, tools other than a spreadsheet may be required to analyze, visualize, and manipulate the data. MS Excel will not be able to process files this large without data loss. It is recommended that a database (e.g., MS Access, MySQL, PostgreSQL, etc.) be used to store and manipulate data. Other programming tools such as R, Apache Spark, and Python can also be used to analyze and visualize data. Further, basic Linux/Unix tools can be used to manipulate, search, and modify large files.

    If you have media inquiries about this dataset, please email the FEMA News Desk at FEMA-News-Desk@fema.dhs.gov or call (202) 646-3272. For inquiries about FEMA's data and Open Government program, please email the OpenFEMA team at OpenFEMA@fema.dhs.gov.

    This dataset is scheduled to be superseded by Valid Registrations Version 2 by early CY 2024.
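    The recommendation to use programming tools rather than a spreadsheet for a file of this size can be illustrated with a streaming pass in Python. This is a minimal sketch, assuming the data has been downloaded as a CSV file; the column name `county` and the in-memory sample rows are hypothetical stand-ins, and the real column names are listed in the dataset's metadata.

```python
import csv
import io
from collections import Counter
from typing import Iterable, TextIO


def count_by_column(rows: Iterable[dict], column: str) -> Counter:
    """Tally the values of one column, consuming rows one at a time."""
    counts: Counter = Counter()
    for row in rows:
        counts[row.get(column) or "(missing)"] += 1
    return counts


def count_csv_file(handle: TextIO, column: str) -> Counter:
    """Stream a CSV file handle; only the current row is held in memory."""
    return count_by_column(csv.DictReader(handle), column)


# Hypothetical sample standing in for the real download. For an actual file,
# pass open(path, newline="") instead of the StringIO object.
sample = io.StringIO("county,state\nHarris,TX\nHarris,TX\nOrleans,LA\n")
print(count_csv_file(sample, "county").most_common())
```

    Because `csv.DictReader` yields one row at a time, memory use stays flat regardless of file size, which is the property a spreadsheet lacks here.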

  8. Data from: MobileWell400+: A Large-Scale Multivariate Longitudinal Mobile...

    • produccioncientifica.ucm.es
    Updated 2024
    Cite
    Banos, Oresti; Damas, Miguel; Goicoechea, Carmen; Perakakis, Pandelis; Pomares, Hector; Rodriguez-Leon, Ciro; Sanabria, Daniel; Villalonga, Claudia (2024). MobileWell400+: A Large-Scale Multivariate Longitudinal Mobile Dataset for Investigating Individual and Collective Well-Being [Dataset]. https://produccioncientifica.ucm.es/documentos/668fc499b9e7c03b01be2372
    Explore at:
    Dataset updated
    2024
    Authors
    Banos, Oresti; Damas, Miguel; Goicoechea, Carmen; Perakakis, Pandelis; Pomares, Hector; Rodriguez-Leon, Ciro; Sanabria, Daniel; Villalonga, Claudia
    Description

    This study engaged 409 participants over a period spanning from July 10 to August 8, 2023, ensuring representation across various demographic factors: 221 females, 186 males, 2 non-binary, year of birth between 1951 and 2005, with varied annual incomes and from 15 Spanish regions. The MobileWell400+ dataset, openly accessible, encompasses a wide array of data collected via the participants' mobile phone, including demographic, emotional, social, behavioral, and well-being data. Methodologically, the project presents a promising avenue for uncovering new social, behavioral, and emotional indicators, supplementing existing literature. Notably, artificial intelligence is considered to be instrumental in analysing these data, discerning patterns, and forecasting trends, thereby advancing our comprehension of individual and population well-being. Ethical standards were upheld, with participants providing informed consent.

    The following is a non-exhaustive list of collected data:

    Data continuously collected through the participants' smartphone sensors: physical activity (resting, walking, driving, cycling, etc.), name of detected WiFi networks, connectivity type (WiFi, mobile, none), ambient light, ambient noise, and status of the device screen (on, off, locked, unlocked).

    Data corresponding to an initial survey prompted via the smartphone, with information related to demographic data, effects and COVID vaccination, average hours of physical activity, and answers to a series of questions to measure mental health, many of them taken from internationally recognised psychological and well-being scales (PANAS, PHQ, GAD, BRS and AAQ), social isolation (TILS) and economic inequality perception.

    Data corresponding to daily surveys prompted via the smartphone, where variables related to mood (valence, activation, energy and emotional events) and social interaction (quantity and quality) are measured.

    Data corresponding to weekly surveys prompted via the smartphone, covering overall health, hours of physical activity per week, loneliness, and questions related to well-being.

    Data corresponding to a final survey prompted via the smartphone, consisting of similar questions to the ones asked in the initial survey, namely psychological and well-being items (PANAS, PHQ, GAD, BRS and AAQ), social isolation (TILS) and economic inequality perception questions.

    For a more detailed description of the study please refer to MobileWell400+StudyDescription.pdf.

    For a more detailed description of the collected data, variables and data files please refer to MobileWell400+FilesDescription.pdf.

  9. Dataset of currency and individuals using the Internet of countries

    • workwithdata.com
    Updated May 8, 2025
    Cite
    Work With Data (2025). Dataset of currency and individuals using the Internet of countries [Dataset]. https://www.workwithdata.com/datasets/countries?col=country%2Ccurrency%2Cinternet_pct
    Explore at:
    Dataset updated
    May 8, 2025
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is about countries. It has 194 rows. It features 3 columns: country, currency, and individuals using the Internet. It is 100% filled with non-null values.

  10. Dataset of individuals using the Internet of countries per year in Chile...

    • workwithdata.com
    Updated Apr 9, 2025
    Cite
    Work With Data (2025). Dataset of individuals using the Internet of countries per year in Chile (Historical) [Dataset]. https://www.workwithdata.com/datasets/countries-yearly?col=country%2Cdate%2Cinternet_pct&f=1&fcol0=country&fop0=%3D&fval0=Chile
    Explore at:
    Dataset updated
    Apr 9, 2025
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Chile
    Description

    This dataset is about countries per year in Chile. It has 64 rows. It features 3 columns: country, date, and individuals using the Internet.

  11. National Survey of College Graduates

    • catalog.data.gov
    Updated Mar 5, 2022
    + more versions
    Cite
    National Center for Science and Engineering Statistics (2022). National Survey of College Graduates [Dataset]. https://catalog.data.gov/dataset/national-survey-of-college-graduates
    Explore at:
    Dataset updated
    Mar 5, 2022
    Dataset provided by
    National Center for Science and Engineering Statistics (http://ncses.nsf.gov/)
    Description

    The National Survey of College Graduates is a repeated cross-sectional biennial survey that provides data on the nation's college graduates, with a focus on those in the science and engineering workforce. This survey is a unique source for examining the relationship of degree field and occupation in addition to other characteristics of college-educated individuals, including work activities, salary, and demographic information.

  12. Dataset of individuals using the Internet of countries per year in Ireland...

    • workwithdata.com
    Updated Apr 9, 2025
    + more versions
    Cite
    Work With Data (2025). Dataset of individuals using the Internet of countries per year in Ireland (Historical) [Dataset]. https://www.workwithdata.com/datasets/countries-yearly?col=country%2Cdate%2Cinternet_pct&f=1&fcol0=country&fop0=%3D&fval0=Ireland
    Explore at:
    Dataset updated
    Apr 9, 2025
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Ireland
    Description

    This dataset is about countries per year in Ireland. It has 64 rows. It features 3 columns: country, date, and individuals using the Internet.

  13. Data from: YJMob100K: City-Scale and Longitudinal Dataset of Anonymized...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Apr 21, 2024
    Cite
    Yabe, Takahiro; Tsubouchi, Kota; Shimizu, Toru; Sekimoto, Yoshihide; Sezaki, Kaoru; Moro, Esteban; Pentland, Alex (2024). YJMob100K: City-Scale and Longitudinal Dataset of Anonymized Human Mobility Trajectories [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8111992
    Explore at:
    Dataset updated
    Apr 21, 2024
    Dataset provided by
    MIT
    Yahoo Japan Corporation
    University of Tokyo
    Authors
    Yabe, Takahiro; Tsubouchi, Kota; Shimizu, Toru; Sekimoto, Yoshihide; Sezaki, Kaoru; Moro, Esteban; Pentland, Alex
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The YJMob100K human mobility datasets (YJMob100K_dataset1.csv.gz and YJMob100K_dataset2.csv.gz) contain the movement of a total of 100,000 individuals across a 75-day period, discretized into 30-minute intervals and 500-meter grid cells. The first dataset contains the movement of 80,000 individuals across a 75-day business-as-usual period, while the second dataset contains the movement of 20,000 individuals across a 75-day period (including the last 15 days during an emergency) with unusual behavior.

    While the name or location of the city is not disclosed, the participants are provided with points-of-interest (POIs; e.g., restaurants, parks) data for each grid cell (~85 dimensional vector) as supplementary information (cell_POIcat.csv.gz). The list of 85 POI categories can be found in POI_datacategories.csv.

    For details of the dataset, see Data Descriptor:

    Yabe, T., Tsubouchi, K., Shimizu, T., Sekimoto, Y., Sezaki, K., Moro, E., & Pentland, A. (2024). YJMob100K: City-scale and longitudinal dataset of anonymized human mobility trajectories. Scientific Data, 11(1), 397. https://www.nature.com/articles/s41597-024-03237-9

    --- Details about the Human Mobility Prediction Challenge 2023 (ended November 13, 2023) ---

    The challenge takes place in a mid-sized and highly populated metropolitan area, somewhere in Japan. The area is divided into 500 meters x 500 meters grid cells, resulting in a 200 x 200 grid cell space.

    The human mobility datasets (task1_dataset.csv.gz and task2_dataset.csv.gz) contain the movement of a total of 100,000 individuals across a 90 day period, discretized into 30-minute intervals and 500 meter grid cells. The first dataset contains the movement of a 75 day business-as-usual period, while the second dataset contains the movement of a 75 day period during an emergency with unusual behavior.
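    The discretization described above (30-minute intervals, 500-meter cells, a 200 x 200 grid) can be sketched as follows. This is an illustration only, assuming raw positions in meters measured from one corner of the area, elapsed time in minutes from the start of the observation period, and zero-based indices; the published files are already discretized and may use a different origin or index base. The function name `discretize` and the constants are hypothetical.

```python
CELL_SIZE_M = 500      # grid cell edge length, from the description
GRID_CELLS = 200       # cells per side, from the description
SLOT_MINUTES = 30      # time interval length, from the description
MINUTES_PER_DAY = 24 * 60


def discretize(minutes_since_start: int, x_m: float, y_m: float) -> tuple[int, int, int, int]:
    """Map elapsed minutes and a position in meters to (day, slot, cell_x, cell_y).

    All indices are zero-based here; that is an assumption, not a property
    confirmed by the dataset description.
    """
    day, minute_of_day = divmod(minutes_since_start, MINUTES_PER_DAY)
    slot = minute_of_day // SLOT_MINUTES
    cell_x = int(x_m // CELL_SIZE_M)
    cell_y = int(y_m // CELL_SIZE_M)
    if not (0 <= cell_x < GRID_CELLS and 0 <= cell_y < GRID_CELLS):
        raise ValueError("position falls outside the 200 x 200 grid")
    return day, slot, cell_x, cell_y
```

    With 30-minute intervals each day has 48 time slots, and a 200 x 200 grid of 500-meter cells covers an area 100 km on each side.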

    There are 2 tasks in the Human Mobility Prediction Challenge.

    In Task 1, participants are provided with the full time series data (75 days) for 80,000 individuals, and partial (only 60 days) time series movement data for the remaining 20,000 individuals (task1_dataset.csv.gz). Given the provided data, Task 1 of the challenge is to predict the movement patterns of those 20,000 individuals during days 60-74. Task 2 is a similar task but uses a smaller dataset of 25,000 individuals in total, 2,500 of whom have their locations during days 60-74 masked and need to be predicted (task2_dataset.csv.gz).

    While the name or location of the city is not disclosed, the participants are provided with points-of-interest (POIs; e.g., restaurants, parks) data for each grid cell (~85 dimensional vector) as supplementary information (which is optional for use in the challenge) (cell_POIcat.csv.gz).

    For more details, see https://connection.mit.edu/humob-challenge-2023

  14. Dataset of books called Individuals : an essay in descriptive metaphysics

    • workwithdata.com
    Updated Apr 17, 2025
    Cite
    Work With Data (2025). Dataset of books called Individuals : an essay in descriptive metaphysics [Dataset]. https://www.workwithdata.com/datasets/books?f=1&fcol0=book&fop0=%3D&fval0=Individuals+%3A+an+essay+in+descriptive+metaphysics
    Explore at:
    Dataset updated
    Apr 17, 2025
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is about books. It has 2 rows and is filtered where the book is Individuals : an essay in descriptive metaphysics. It features 7 columns including author, publication date, language, and book publisher.

  15. Dataset of individuals using the Internet of countries per year in Angola...

    • workwithdata.com
    Updated Apr 9, 2025
    + more versions
    Cite
    Work With Data (2025). Dataset of individuals using the Internet of countries per year in Angola (Historical) [Dataset]. https://www.workwithdata.com/datasets/countries-yearly?col=country%2Cdate%2Cinternet_pct&f=1&fcol0=country&fop0=%3D&fval0=Angola
    Explore at:
    Dataset updated
    Apr 9, 2025
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Angola
    Description

    This dataset is about countries per year in Angola. It has 64 rows. It features 3 columns: country, date, and individuals using the Internet.

  16. Mental Health Dataset

    • kaggle.com
    zip
    Updated Oct 22, 2024
    Cite
    Bhadra Mohit (2024). Mental Health Dataset [Dataset]. https://www.kaggle.com/datasets/bhadramohit/mental-health-dataset
    Explore at:
    Available download formats: zip (13276 bytes)
    Dataset updated
    Oct 22, 2024
    Authors
    Bhadra Mohit
    License

    https://cdla.io/sharing-1-0/

    Description

    Comprehensive Mental Health Insights: A Diverse Dataset of 1000 Individuals Across Professions, Countries, and Lifestyles

    This dataset provides a rich collection of anonymized mental health data for 1000 individuals, representing a wide range of ages, genders, occupations, and countries. It aims to shed light on the various factors affecting mental health, offering valuable insights into stress levels, sleep patterns, work-life balance, and physical activity.

    Key Features: Demographics: The dataset includes individuals from various countries such as the USA, India, the UK, Canada, and Australia. Each entry captures key demographic information such as age, gender, and occupation (e.g., IT, Healthcare, Education, Engineering).

    Mental Health Conditions: The dataset contains data on whether the individuals have reported any mental health issues (Yes/No), along with the severity of these conditions categorized into Low, Medium, or High.

    Consultation History: For individuals with mental health conditions, the dataset notes whether they have consulted a mental health professional.

    Stress Levels: Each individual’s stress level is classified as Low, Medium, or High, providing insights into how different factors such as work hours or sleep may correlate with mental well-being.

    Lifestyle Factors: The dataset includes information on sleep duration, work hours per week, and weekly physical activity hours, offering a detailed picture of how lifestyle factors contribute to mental health.

    This dataset can be used for research, analysis, or machine learning models to predict mental health trends, uncover correlations between work-life balance and mental well-being, and explore the impact of stress and physical activity on mental health.

  17. Native American Multi-Year Facial Image Dataset

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Cite
    FutureBee AI (2022). Native American Multi-Year Facial Image Dataset [Dataset]. https://www.futurebeeai.com/dataset/image-dataset/facial-images-historical-native-american
    Explore at:
    Available download formats: wav
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreement

    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    Welcome to the Native American Multi-Year Facial Image Dataset, thoughtfully curated to support the development of advanced facial recognition systems, biometric identification models, KYC verification tools, and other computer vision applications. This dataset is ideal for training AI models to recognize individuals over time, track facial changes, and enhance age progression capabilities.

    Facial Image Data

    This dataset includes over 5,000 high-quality facial images, organized into individual participant sets, each containing:

    • Historical Images: 22 facial images per participant captured across a span of 10 years
    • Enrollment Image: One recent high-resolution facial image for reference or ground truth

    Diversity & Representation

    • Geographic Coverage: Participants from the USA, Canada, Mexico, and other Native American regions
    • Demographics: Individuals aged 18 to 70 years, with a gender distribution of 60% male and 40% female
    • File Formats: All images are available in JPEG and HEIC formats

    Image Quality & Capture Conditions

    To ensure model generalization and practical usability, images in this dataset reflect real-world diversity:

    • Lighting Conditions: Images captured under various natural and artificial lighting setups
    • Backgrounds: A wide range of indoor and outdoor backgrounds
    • Device Quality: Captured using modern, high-resolution mobile devices for consistency and clarity

    Metadata

    Each participant’s dataset is accompanied by rich metadata to support advanced model training and analysis, including:

    • Unique participant ID
    • File name
    • Age at the time of image capture
    • Gender
    • Country of origin
    • Demographic profile
    • File format

    Use Cases & Applications

    This dataset is highly valuable for a wide range of AI and computer vision applications:

    • Facial Recognition Systems: Train models for high-accuracy face matching across time
    • KYC & Identity Verification: Improve time-spanning verification for banks, insurance, and government services
    • Biometric Security Solutions: Build reliable identity authentication models
    • Age Progression & Estimation Models: Train AI to predict aging patterns or estimate age from facial features
    • Generative AI: Support creation and validation of synthetic age progression or longitudinal face generation

    Secure & Ethical Collection

    • Platform: All data was securely collected and processed through FutureBeeAI’s proprietary systems
    • Ethical Compliance: Full participant consent obtained with transparent communication of use cases
    • Privacy-Protected: No personally identifiable information is included; all data is anonymized and handled with care

    Dataset Updates & Customization

    To keep pace with evolving AI needs, this dataset is regularly updated and customizable. Custom data collection options include:


  18. Dataset of GDP and individuals using the Internet of countries per year in...

    • workwithdata.com
    Updated Apr 9, 2025
    + more versions
    Work With Data (2025). Dataset of GDP and individuals using the Internet of countries per year in Spain (Historical) [Dataset]. https://www.workwithdata.com/datasets/countries-yearly?col=country%2Cdate%2Cgdp%2Cinternet_pct&f=1&fcol0=country&fop0=%3D&fval0=Spain
    Dataset updated
    Apr 9, 2025
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Spain
    Description

    This dataset covers yearly country-level data for Spain. It has 64 rows and 4 columns: country, date, GDP, and individuals using the Internet.

  19. People Data | Authoritative Database

    • lseg.com
    Updated Oct 14, 2025
    LSEG (2025). People Data | Authoritative Database [Dataset]. https://www.lseg.com/en/data-analytics/financial-data/company-data/company-profile-information/people-data
    Available download formats: csv, python, user interface, xml
    Dataset updated
    Oct 14, 2025
    Dataset provided by
    London Stock Exchange Group http://www.londonstockexchangegroup.com/
    Authors
    LSEG
    License

    https://www.lseg.com/en/policies/website-disclaimer

    Description

    People Data provides comprehensive information on individuals and lets you link each individual to organizations and roles.

  20. LAMARCK DNAmAge and Ozone Dataset

    • catalog.data.gov
    • datasets.ai
    • +1more
    Updated Jul 29, 2024
    U.S. EPA Office of Research and Development (ORD) (2024). LAMARCK DNAmAge and Ozone Dataset [Dataset]. https://catalog.data.gov/dataset/lamarck-dnamage-and-ozone-dataset
    Dataset updated
    Jul 29, 2024
    Dataset provided by
    United States Environmental Protection Agency http://www.epa.gov/
    Description

    This dataset contains data from the LAMARCK controlled exposure study, including DNA methylation assessments done before and 24 hours after each exposure, subclinical health outcome measures, exposure details, and demographic information on the individual participants in the study. This dataset is not publicly accessible because: EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). This dataset contains information about human research subjects. Because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed. It can be accessed through the following means: The dataset can be accessed by contacting Dr. Cavin Ward-Caviness (ward-caviness.cavin@epa.gov). Format: The data are tabular and contain information on DNA methylation assessment, lung function, inflammation, controlled exposure conditions, and demographics of the LAMARCK participants. DNA methylation age has also been calculated based on the DNA methylation assessment data. This dataset is associated with the following publication: Weston, W., M. Bind, W. Cascio, R. Devlin, D. Diaz Sanchez, and C. Ward-Caviness. Accelerated aging and altered sub-clinical response to ozone exposure in young, healthy adults. Environmental Epigenetics. Oxford University Press, Cary, NC, USA, 10(1): dvae007, (2024).

U.S. EPA Office of Research and Development (ORD) (2020). The dataset contains PII [Dataset]. https://catalog.data.gov/dataset/the-dataset-contains-pii
The dataset contains PII

3 scholarly articles cite this dataset (View in Google Scholar)
