2 datasets found
  1. BOSQUE Test set

    • dataverse.harvard.edu
    • search.dataone.org
    Updated Jul 10, 2025
    Cite
    Alejandra Jaramillo Arboleda; Maria Juliana Sanchez Zapata; LILI JOHANA RUEDA JAIME; Andrés Morales-Forero; Samuel Bassetto (2025). BOSQUE Test set [Dataset]. http://doi.org/10.7910/DVN/AQEPIN
    Dataset updated
    Jul 10, 2025
    Dataset provided by
    Harvard Dataverse
    Authors
    Alejandra Jaramillo Arboleda; Maria Juliana Sanchez Zapata; LILI JOHANA RUEDA JAIME; Andrés Morales-Forero; Samuel Bassetto
    License

    CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    BOSQUE Test Set: A Dermoscopic Image Dataset from Colombian Patients with Diverse Skin Phototypes

    Description: The BOSQUE Test Set is a curated dataset of 151 dermoscopic images of pigmented skin lesions, collected from dermatology consultations and outreach campaigns in Bogotá, Colombia. Each image is accompanied by expert-verified metadata including histological diagnosis, patient demographic details, anatomical site, and skin phototype. The dataset is intended to support machine learning research in dermatology with a particular focus on skin tone diversity and fairness in diagnostic algorithms.

    The dataset was developed under the guidance of Universidad El Bosque, whose name inspired the acronym BOSQUE. It responds to the global underrepresentation of darker skin phototypes in existing dermoscopic image collections such as HAM10000, and aims to improve diagnostic equity through inclusive data curation.

    Key Features
    • 151 dermoscopic images acquired in real-world clinical settings
    • Captured using polarized light dermatoscopes (DermLite 4 + iPhone)
    • Inclusive population:
        • Sex: 97 Female, 54 Male
        • Age groups: from 0–29 to 90+, categorized into clinically relevant bins
        • Fitzpatrick skin phototypes: ranging from II to VI
            • Type II (fair, burns easily): 11 patients
            • Type III (light brown, mild burns): 94 patients
            • Type IV (moderate brown, rarely burns): 34 patients
            • Type V (dark brown, very rarely burns): 7 patients
            • Type VI (deeply pigmented, never burns): 5 patients
    • Lesion characteristics:
        • Nature: benign or malignant (histopathologically confirmed)
        • Size: categorized as ≤5mm, 6–10mm, 11–20mm, >20mm
        • Evolution time: grouped into <1y, 1y, 2y, 3–4y, 5–9y, and 10y+ categories
        • Anatomical site: head/neck, trunk, limbs, or acral areas
    • Histopathological diagnosis: 7-class ISIC-style labels (akiec, bcc, bkl, df, mel, nv, vasc)
    • Clinical label: melanocytic vs. non-melanocytic (from clinical diagnosis)
    • Clinical context: includes personal history of NMSC and use of photosensitizing drugs
    • Image naming: pseudonymized file names encode diagnosis label and image ID
    • Ethics: all data anonymized and collected under IRB-approved protocol in Colombia

    Included Files
    • BOSQUE_test_set.zip: Folder containing 151 dermoscopic image files (JPG)
    • BOSQUE_metadata.csv: Metadata for each image, including:
        • Patient sex, age group, skin phototype
        • Anatomical site of the lesion
        • Lesion nature (benign/malignant)
        • Lesion size and evolution time (binned)
        • Histological diagnosis (7-class)
        • Clinical label (melanocytic / non-melanocytic)

    Use Cases
    This dataset is intended for:
    • Benchmarking AI models for dermoscopic image classification
    • Fairness analysis across skin tones, sex, and age groups
    • Medical education and clinical training on diverse skin phototypes
    • Comparison against HAM10000 or ISIC datasets in research

    Ethical Statement
    All patients provided informed consent for the capture and use of clinical and dermoscopic images, the collection of relevant clinical metadata, and the performance of skin biopsies for diagnostic confirmation. The study protocol was reviewed and approved by the Institutional Ethics Committee at Subred Integrada de Servicios de Salud Norte E.S.E and Universidad El Bosque (Bogotá, Colombia). All data were anonymized in compliance with Colombian health data privacy regulations and international ethical standards (e.g., Declaration of Helsinki). No personally identifiable information is included in the metadata or image files. Access to data was restricted to authorized investigators, and patients were informed about the research and educational use of their anonymized data.

    Suggested Citation
    [Author(s)]. (2025). BOSQUE Test Set: A Dermoscopic Image Dataset from Colombian Patients with Diverse Skin Phototypes [Data set]. Harvard Dataverse. https://doi.org/xxxxx
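
    The fairness-analysis use case named in the description could be approached along the following lines. This is only a minimal sketch: the description does not give the column names or value encodings of BOSQUE_metadata.csv, so `image_id`, `phototype`, and `nature` are assumptions, the sample rows are made up for illustration, and the predictions stand in for the output of some hypothetical benign/malignant classifier.

```python
import csv
import io
from collections import defaultdict

# Hypothetical excerpt of BOSQUE_metadata.csv. The real column names and
# value encodings are not stated in the description and are assumed here.
SAMPLE_METADATA = """image_id,phototype,nature
img_001,III,benign
img_002,III,malignant
img_003,V,malignant
img_004,V,benign
img_005,VI,malignant
"""

# Hypothetical classifier outputs keyed by image ID (illustrative only).
SAMPLE_PREDICTIONS = {
    "img_001": "benign",
    "img_002": "malignant",
    "img_003": "benign",
    "img_004": "benign",
    "img_005": "malignant",
}


def per_group_metrics(metadata_csv, predictions, group_col="phototype"):
    """Return accuracy and malignant-class sensitivity for each group."""
    counts = defaultdict(lambda: {"n": 0, "correct": 0, "pos": 0, "true_pos": 0})
    for row in csv.DictReader(io.StringIO(metadata_csv)):
        stats = counts[row[group_col]]
        predicted = predictions[row["image_id"]]
        stats["n"] += 1
        stats["correct"] += predicted == row["nature"]
        if row["nature"] == "malignant":
            stats["pos"] += 1
            stats["true_pos"] += predicted == "malignant"
    results = {}
    for group, s in counts.items():
        results[group] = {
            "n": s["n"],
            "accuracy": s["correct"] / s["n"],
            # Sensitivity is undefined when a group has no malignant cases.
            "sensitivity": s["true_pos"] / s["pos"] if s["pos"] else None,
        }
    return results


if __name__ == "__main__":
    metrics = per_group_metrics(SAMPLE_METADATA, SAMPLE_PREDICTIONS)
    for group, values in sorted(metrics.items()):
        print(group, values)
```

    Reporting `n` next to each metric matters for this dataset: with only 7 type V and 5 type VI patients, per-group estimates for the darkest phototypes rest on very few images and should be read with that in mind.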

  2. Players in the NFL in 2023, by ethnicity

    • statista.com
    Updated Nov 26, 2025
    Cite
    Statista (2025). Players in the NFL in 2023, by ethnicity [Dataset]. https://www.statista.com/statistics/1167935/racial-diversity-nfl-players/
    Dataset updated
    Nov 26, 2025
    Dataset authored and provided by
    Statista http://statista.com/
    Time period covered
    2023
    Area covered
    United States
    Description

    In 2023, Black or African American athletes made up the greatest share of players by ethnic group in the National Football League (NFL), constituting just over ** percent of players. Despite the large population of Hispanic or Latino people within the United States, this group is substantially underrepresented in the NFL, with only *** percent of players identifying as such.

    National Football League
    The National Football League (NFL) is a professional American football league that was established in 1920 and now consists of 32 clubs divided into two conferences, the National Football Conference (NFC) and the American Football Conference (AFC). The season culminates in the Super Bowl, the NFL's annual championship game. The Super Bowl has grown into one of the world's largest single-day sporting events, attracting high television ratings and generating billions of dollars in consumer spending.

    NFL revenues
    The NFL is one of the most profitable sports leagues in the world, generating **** billion U.S. dollars in 2022. The total revenue of all ** NFL teams has increased steadily over the past 15 years, apart from a significant drop in 2020 that was largely the result of coronavirus (COVID-19) containment measures. That drop illustrates one of the primary impacts of COVID-19 on professional sports leagues.

    NFL franchises
    As a result of this profitability in non-pandemic times, NFL franchises carry extremely high market values. The Dallas Cowboys were by far the most valuable franchise in the NFL, with a market value of **** billion U.S. dollars in 2023. The high value of NFL franchises is clear when compared with those of the NBA, MLB, and NHL. Franchises within the NFL had an average market value of approximately *** billion U.S. dollars in 2023.


2 scholarly articles cite this dataset (View in Google Scholar)