14 datasets found
  1. International Skin Imaging Collaboration (ISIC) Archive

    • registry.opendata.aws
    Updated Aug 12, 2025
    Cite
    International Skin Imaging Collaboration (ISIC) (2025). International Skin Imaging Collaboration (ISIC) Archive [Dataset]. https://registry.opendata.aws/isic-archive/
    Explore at:
    Dataset updated
    Aug 12, 2025
    Dataset provided by
    International Skin Imaging Collaboration (ISIC)
    License

    CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    A public-access archive of skin lesion images, supporting teaching, research, and the development and evaluation of diagnostic algorithms.

  2. DERM12345

    • api.isic-archive.com
    Updated 2025
    Cite
    Imperial College London (2025). DERM12345 [Dataset]. http://doi.org/10.34970/705541
    Explore at:
    Dataset updated
    2025
    Dataset provided by
    ISIC Archive
    datacite
    Authors
    Imperial College London
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Skin lesion datasets provide essential information for understanding various skin conditions and developing effective diagnostic tools. They aid the artificial intelligence-based early detection of skin cancer, facilitate treatment planning, and contribute to medical education and research. Published large datasets only partially cover the subclassifications of skin lesions. This limitation highlights the need for more expansive and varied datasets to reduce false predictions and help improve the failure analysis for skin lesions. This study presents a diverse dataset comprising 12,345 dermatoscopic images with 40 subclasses of skin lesions, collected in Turkiye, which comprises different skin types in the transition zone between Europe and Asia. Each subgroup contains high-resolution images and expert annotations, providing a strong and reliable basis for future research. The detailed analysis of each subgroup provided in this study facilitates targeted research and deepens the understanding of skin lesions. This dataset distinguishes itself through a diverse structure with its 5 super classes, 15 main classes, 40 subclasses and 12,345 high-resolution dermatoscopic images.

    Yilmaz, A., Yasar, S.P., Gencoglan, G. et al. DERM12345: A Large, Multisource Dermatoscopic Skin Lesion Dataset with 40 Subclasses. Sci Data 11, 1302 (2024). https://doi.org/10.1038/s41597-024-04104-3

  3. MILK10k

    • api.isic-archive.com
    Updated 2025
    Cite
    MILK study team (2025). MILK10k [Dataset]. http://doi.org/10.34970/648456
    Explore at:
    Dataset updated
    2025
    Dataset provided by
    ISIC Archive
    datacite
    Authors
    MILK study team
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    MILK10k consists of 10,480 images: a paired clinical close-up and dermatoscopic image for each of 5,240 lesions. The dataset’s metadata include age (in 5-year intervals), sex, anatomic site, skin tone, diagnosis, method of ground truth establishment (histopathology or other means), and, if a dermatoscopic image of the same lesion was previously included in ISIC, its corresponding ISIC identifier. Skin tone is categorized into six levels, ranging from very dark (0) to very light (5), intentionally distinct from the Fitzpatrick skin types to avoid confusion. Most patients had skin tones in the middle ranges. Of the 5,240 lesions, 95.7% were biopsied or excised, with histopathology serving as the gold standard for diagnosis. Diagnoses were mapped to both the ISIC-Dx diagnostic scheme and a simplified classification based on the ISIC2018/2019 challenge and HAM10000 diagnostic categories. The dataset includes 11 broad diagnostic categories:

    1. Basal cell carcinoma (bcc)
    2. Melanocytic nevus (nv)
    3. Benign keratinocytic lesion (bkl)
    4. Squamous cell carcinoma/keratoacanthoma (sccka)
    5. Melanoma (mel)
    6. Actinic keratosis/intraepidermal carcinoma (akiec)
    7. Dermatofibroma (df)
    8. Inflammatory and infectious conditions (inf)
    9. Vascular lesions and hemorrhage (vasc)
    10. Other benign proliferations including collision tumors (ben_oth)
    11. Other malignant proliferations including collision tumors (mal_oth)

    Additionally, we provide the most specific ISIC-Dx diagnosis and its parent branch in the ISIC-Dx diagnostic tree. In cases where a dermatoscopic image of the same lesion was already included in the ISIC archive, its ISIC identifier is reported in the metadata. Furthermore, all images have been annotated using the MONET framework, with probabilities for the following concept term groups included in the metadata:

    1. Ulceration, crust
    2. Hair
    3. Vasculature, vessels
    4. Erythema
    5. Pigmentation
    6. Gel, water drop, fluid, dermoscopy liquid
    7. Skin markings, pen ink, purple pen

    In addition to MILK10k, we have curated a smaller benchmark dataset, called MILK10k Benchmark, derived from the same sources and covering the same diagnostic categories. This dataset is available as part of a live challenge within the ISIC framework and can be accessed on ISIC.

    Images were provided by the following institutions:

    • Department of Dermatology, Medical University of Vienna, Vienna, Austria
    • Medicine Faculty Department of Dermatology, Ankara University, Ankara, Turkey
    • Mayne Academy of General Practice, Medical School, The University of Queensland, Australia
    • Dermatology Service, Memorial Sloan Kettering Cancer Center, New York, USA
    • Independent Researcher, 1000 Skopje, North Macedonia
  4. Skin Cancer: HAM10000 Dataset

    • datasetninja.com
    Updated Jan 21, 2024
    Cite
    Tschandl, Philipp; Cliff Rosendahl; Harald Kittler (2024). Skin Cancer: HAM10000 Dataset [Dataset]. https://datasetninja.com/skin-cancer-ham10000
    Explore at:
    Dataset updated
    Jan 21, 2024
    Dataset provided by
    Dataset Ninja
    Authors
    Tschandl, Philipp; Cliff Rosendahl; Harald Kittler
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    To address the challenges of training neural networks for automated diagnosis of pigmented skin lesions, the authors introduced the HAM10000 ("Human Against Machine with 10000 training images") dataset. This dataset aimed to overcome the limitations of small-sized and homogeneous dermatoscopic image datasets by providing a diverse and extensive collection. To achieve this, they collected dermatoscopic images from various populations using different modalities, which necessitated employing distinct acquisition and cleaning methods. The authors also designed semi-automatic workflows that incorporated specialized neural networks to enhance the dataset's quality. The resulting HAM10000 dataset comprised 10,015 dermatoscopic images, which were made available for academic machine learning applications through the ISIC archive. This dataset served as a benchmark for machine learning experiments and comparisons with human experts.

  5. Skin Disease Detection Dataset (HAM10000 + ISIC)

    • kaggle.com
    Updated May 20, 2025
    Cite
    nour12347653 (2025). Skin Disease Detection Dataset (HAM10000 + ISIC) [Dataset]. https://www.kaggle.com/datasets/nour12347653/skin-disease-detection-dataset-ham10000-isic
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
    Dataset updated
    May 20, 2025
    Dataset provided by
    Kaggle http://kaggle.com/
    Authors
    nour12347653
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    This dataset is a cleaned and preprocessed combination of the HAM10000 and ISIC Archive dermoscopic image datasets, intended for training and evaluating deep learning models for skin lesion classification.

    It is structured to support multi-class image classification and has been carefully processed to maintain high quality and class balance.

    Classes Included:

    "melanocytic nevi": "Melanocytic Nevus",
    "nv": "Melanocytic Nevus",
    "melanoma": "Melanoma",
    "mel": "Melanoma",
    "benign keratosis": "Benign Keratosis",
    "bkl": "Benign Keratosis",
    "basal cell carcinoma": "Basal Cell Carcinoma",
    "bcc": "Basal Cell Carcinoma",
    "actinic keratosis": "Actinic Keratosis",
    "akiec": "Actinic Keratosis",
    "dermatofibroma": "Dermatofibroma",
    "df": "Dermatofibroma",
    "vascular lesions": "Vascular Lesion",
    "vasc": "Vascular Lesion",
    "warts/molluscum": "Warts/Molluscum"

    Preprocessing Notes

    • All images resized to 224x224 for CNN compatibility
    • Labels unified and cleaned across both datasets
    • Invalid or corrupted entries removed
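The class mapping listed under "Classes Included" suggests how labels from the two source datasets were unified. The sketch below illustrates one way such a mapping could be applied; the function name `unify_label` and the whitespace/case normalization are assumptions for illustration, not the dataset author's published cleaning code.

```python
# Raw label -> unified class name, copied from the dataset description.
LABEL_MAP = {
    "melanocytic nevi": "Melanocytic Nevus",
    "nv": "Melanocytic Nevus",
    "melanoma": "Melanoma",
    "mel": "Melanoma",
    "benign keratosis": "Benign Keratosis",
    "bkl": "Benign Keratosis",
    "basal cell carcinoma": "Basal Cell Carcinoma",
    "bcc": "Basal Cell Carcinoma",
    "actinic keratosis": "Actinic Keratosis",
    "akiec": "Actinic Keratosis",
    "dermatofibroma": "Dermatofibroma",
    "df": "Dermatofibroma",
    "vascular lesions": "Vascular Lesion",
    "vasc": "Vascular Lesion",
    "warts/molluscum": "Warts/Molluscum",
}


def unify_label(raw_label):
    """Map a raw label from either source dataset to its unified class name.

    Stripping whitespace and lowercasing is an assumed normalization step.
    """
    key = raw_label.strip().lower()
    if key not in LABEL_MAP:
        raise KeyError(f"unmapped label: {raw_label!r}")
    return LABEL_MAP[key]
```

Raising on unmapped labels, rather than passing them through, makes it easier to spot entries that would otherwise silently create extra classes.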
  6. Skin Lesions

    • kaggle.com
    Updated Nov 11, 2023
    Cite
    Anwar Hawash (2023). Skin Lesions [Dataset]. https://www.kaggle.com/datasets/anwarhawash/skin-lesions
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
    Dataset updated
    Nov 11, 2023
    Dataset provided by
    Kaggle
    Authors
    Anwar Hawash
    License

    MIT License https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Diagnostic Categories:

    • Melanoma
    • Melanocytic nevus
    • Basal cell carcinoma
    • Actinic keratosis
    • Benign keratosis (solar lentigo / seborrheic keratosis / lichen planus-like keratosis)
    • Dermatofibroma
    • Vascular lesion
    • Squamous cell carcinoma

    Original Data Source

    • Original Challenge: https://challenge.isic-archive.com/data/#2019

      [1] Tschandl P., Rosendahl C. & Kittler H. The HAM10000 dataset, a large collection of multi-source dermatoscopic images of common pigmented skin lesions. Sci. Data 5, 180161 doi:10.1038/sdata.2018.161 (2018)

      [2] Noel C. F. Codella, David Gutman, M. Emre Celebi, Brian Helba, Michael A. Marchetti, Stephen W. Dusza, Aadi Kalloo, Konstantinos Liopyris, Nabin Mishra, Harald Kittler, Allan Halpern: "Skin Lesion Analysis Toward Melanoma Detection: A Challenge at the 2017 International Symposium on Biomedical Imaging (ISBI), Hosted by the International Skin Imaging Collaboration (ISIC)", 2017; arXiv:1710.05006.

      [3] Marc Combalia, Noel C. F. Codella, Veronica Rotemberg, Brian Helba, Veronica Vilaplana, Ofer Reiter, Allan C. Halpern, Susana Puig, Josep Malvehy: "BCN20000: Dermoscopic Lesions in the Wild", 2019; arXiv:1908.02288.

    Copyright and Attribution

    If you use this dataset in your research, please credit the authors

    what-are-the-different-types-of-skin-cancer?

    https://www.everydayhealth.com/skin-cancer/what-are-the-different-types-of-skin-cancer/

  7. iToBoS 2024 - Skin Lesion Detection with 3D-TBP

    • figshare.com
    • api.isic-archive.com
    png
    Updated May 12, 2025
    Cite
    Anup Saha (2025). iToBoS 2024 - Skin Lesion Detection with 3D-TBP [Dataset]. http://doi.org/10.6084/m9.figshare.28452545.v6
    Explore at:
    Available download formats: png
    Dataset updated
    May 12, 2025
    Dataset provided by
    figshare
    Authors
    Anup Saha
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The iToBoS dataset: skin region images extracted from 3D total body photographs for lesion detection

    The early detection of skin cancer is critical for improving patient outcomes. Traditionally, dermatologists rely on dermoscopy to examine pigmented skin lesions. While this non-invasive technique enhances diagnostic accuracy, its effectiveness is highly dependent on the clinician’s expertise. Additionally, capturing dermoscopic images for every suspicious lesion is a labor-intensive process. Given these challenges, there is an increasing need for computer-aided diagnosis (CAD) systems that utilize conventional cameras. Such systems can support general physicians and other non-specialist practitioners in identifying potentially malignant lesions, improving early detection and intervention. Moreover, they facilitate longitudinal tracking of lesions, aiding researchers in studying disease progression and treatment efficacy.

    This dataset provides high-resolution skin patch images extracted from 3D total body photographs to support the development of advanced machine learning models for lesion detection. It serves as a valuable resource for researchers working on automated skin lesion analysis, particularly in the context of total body photography (TBP).

    Dataset Description:

    The iToBoS dataset consists of 16,954 high-resolution images of skin regions obtained from anonymized 3D avatars of patients. These avatars were generated using the Canfield VECTRA WB360 system, a cutting-edge imaging technology that captures comprehensive, full-body skin images using 92 fixed cameras arranged in 46 stereo pairs with xenon flash lighting. The images were collected from patients at two clinical sites: the Clinical Hospital of Barcelona (Spain) and the University of Queensland (Australia).

    The dataset provides diverse anatomical locations, including the torso, arms, and legs, with each image having an average resolution of 1012x827 pixels and a 45-pixel overlap between adjacent images. The images are extracted from 3D avatars while ensuring compliance with GDPR regulations by automatically removing patient facial features. Each image is accompanied by metadata, including patient age range, body location, and sun damage score, allowing for in-depth analysis and stratification.

    Significance of the Dataset:

    • Facilitates Automated Skin Lesion Detection: The dataset supports the development of AI-based lesion detection models that can improve early diagnosis of skin cancer, particularly in regions with limited access to dermatological expertise.
    • Supports Total Body Photography Research: Leveraging 3D TBP for lesion detection is an emerging field, and this dataset provides a benchmark for further exploration.
    • Enhances Machine Learning Applications: The dataset serves as a benchmark for developing state-of-the-art computer vision and deep learning models for detection of skin lesions.

  8. ISIC 2019 TFRecords 256x256

    • kaggle.com
    zip
    Updated Jul 10, 2020
    Cite
    Chris Deotte (2020). ISIC 2019 TFRecords 256x256 [Dataset]. https://www.kaggle.com/cdeotte/ISIC2019-256x256
    Explore at:
    Available download formats: zip (440785630 bytes)
    Dataset updated
    Jul 10, 2020
    Authors
    Chris Deotte
    License

    CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/

    Description

    These TFRecords contain both the image data and the tabular data (metadata) for the 2019 ISIC Melanoma Classification Competition. The images are 256x256x3 JPEGs. The original JPEGs were center-square cropped and then resized using cv2.resize with interpolation = cv2.INTER_AREA.
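The center square crop mentioned here amounts to taking the largest square centered in the original image before resizing. As a rough sketch of that first step (the helper name `center_square_box` is hypothetical, and the dataset author's exact cropping code is not shown in the description):

```python
def center_square_box(width, height):
    """Return (left, top, right, bottom) of the largest centered square.

    The returned box could then be used to slice an image array, e.g.
    img[top:bottom, left:right], before resizing it to 256x256 with
    cv2.resize(..., interpolation=cv2.INTER_AREA) as the description states.
    """
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return left, top, left + side, top + side
```

For a 1024x768 image this yields the box (128, 0, 896, 768), a 768x768 region centered horizontally.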

    The odd-numbered TFRecords (1, 3, 5, 7, ...) contain images whose original size was 1024x1024 before the crop and resize; the even-numbered TFRecords (0, 2, 4, 6, ...) contain images of other original sizes. They are split this way in case you want to exclude the odd-numbered TFRecords, whose images some say look different from the 2020 competition data.
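Selecting only the even-numbered TFRecords can be done by file name. The sketch below assumes that each file name contains its record index as the first run of digits (for example a hypothetical `train00-100.tfrec`); the actual file naming in the dataset may differ, so the pattern should be checked against the real files.

```python
import re


def split_by_parity(filenames):
    """Split TFRecord file names into (even, odd) lists by record index.

    Assumes the first run of digits in each name is the record index;
    names without any digits are skipped.
    """
    even, odd = [], []
    for name in filenames:
        match = re.search(r"\d+", name)
        if match is None:
            continue
        if int(match.group()) % 2 == 0:
            even.append(name)
        else:
            odd.append(name)
    return even, odd
```

Passing only the `even` list to a training pipeline would exclude the images that were originally 1024x1024.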

    TFRecords with Image and Tabular Data

    The train TFRecords have the following fields

     feature = {
       'image': _bytes_feature,
       'image_name': _bytes_feature,
       'patient_id': _int64_feature,
       'sex': _int64_feature,
       'age_approx': _int64_feature,
       'anatom_site_general_challenge': _int64_feature,
       'diagnosis': _int64_feature,
       'target': _int64_feature,
       'width': _int64_feature,
       'height': _int64_feature
     }
    

    The width and height features are the original image dimensions before the center square crop and resize.

    The feature target=1 if diagnosis=MEL (melanoma) and target=0 otherwise. The image_name is a string. The patient_id is set to -1 because we don't know it. The sex has been label encoded to int with

    0: 'male'
    1: 'female'
    

    The age_approx originally had 437 NaNs, but these have been imputed with the mean. The anatom_site_general_challenge has been label encoded to

    -1: NaN
    0: 'head/neck' 
    1: 'upper extremity'
    2: 'lower extremity'
    3: 'torso'
    4: 'palms/soles'
    5: 'oral/genital'
    

    The diagnosis has been label encoded to

    9: 'MEL'
    10: 'NV'
    11: 'BCC'
    12: 'AK'
    13: 'BKL'
    14: 'DF'
    15: 'VASC'
    16: 'SCC'
    17: 'UNK'
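The integer encodings listed above can be turned back into readable labels once a record has been parsed. This is a minimal sketch assuming a record is already available as a plain dict of Python ints (the TFRecord parsing step itself is omitted); `decode_record` is a hypothetical helper name.

```python
# Encodings copied from the dataset description.
SEX = {0: "male", 1: "female"}
ANATOM_SITE = {
    -1: None,  # NaN in the original data
    0: "head/neck",
    1: "upper extremity",
    2: "lower extremity",
    3: "torso",
    4: "palms/soles",
    5: "oral/genital",
}
DIAGNOSIS = {
    9: "MEL", 10: "NV", 11: "BCC", 12: "AK", 13: "BKL",
    14: "DF", 15: "VASC", 16: "SCC", 17: "UNK",
}


def decode_record(record):
    """Decode the label-encoded fields of one parsed record (dict of ints)."""
    diagnosis = DIAGNOSIS[record["diagnosis"]]
    return {
        "sex": SEX[record["sex"]],
        "anatom_site_general_challenge": ANATOM_SITE[record["anatom_site_general_challenge"]],
        "diagnosis": diagnosis,
        # target is 1 only for melanoma, as stated in the description.
        "target": int(diagnosis == "MEL"),
    }
```

Recomputing `target` from the decoded diagnosis, rather than trusting the stored value, gives a cheap consistency check against the `target` field in the record.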
    
  9. Data from: The HAM10000 dataset, a large collection of multi-source...

    • dataverse.harvard.edu
    tsv, zip
    Updated Jan 29, 2021
    Cite
    Harvard Dataverse (2021). The HAM10000 dataset, a large collection of multi-source dermatoscopic images of common pigmented skin lesions [Dataset]. http://doi.org/10.7910/DVN/DBW86T
    Explore at:
    Available download formats: tsv (830369), zip (10808743)
    Dataset updated
    Jan 29, 2021
    Dataset provided by
    Harvard Dataverse
    Description

    Training of neural networks for automated diagnosis of pigmented skin lesions is hampered by the small size and lack of diversity of available datasets of dermatoscopic images. We tackle this problem by releasing the HAM10000 ("Human Against Machine with 10000 training images") dataset. We collected dermatoscopic images from different populations, acquired and stored by different modalities. The final dataset consists of 10015 dermatoscopic images which can serve as a training set for academic machine learning purposes. Cases include a representative collection of all important diagnostic categories in the realm of pigmented lesions: Actinic keratoses and intraepithelial carcinoma / Bowen's disease (akiec), basal cell carcinoma (bcc), benign keratosis-like lesions (solar lentigines / seborrheic keratoses and lichen-planus like keratoses, bkl), dermatofibroma (df), melanoma (mel), melanocytic nevi (nv) and vascular lesions (angiomas, angiokeratomas, pyogenic granulomas and hemorrhage, vasc). More than 50% of lesions are confirmed through histopathology (histo); the ground truth for the rest of the cases is either follow-up examination (follow_up), expert consensus (consensus), or confirmation by in-vivo confocal microscopy (confocal). The dataset includes lesions with multiple images, which can be tracked by the lesion_id column within the HAM10000_metadata file.

    Due to upload size limitations, images are stored in two files:

    • HAM10000_images_part1.zip (5000 JPEG files)
    • HAM10000_images_part2.zip (5015 JPEG files)

    Additional data for evaluation purposes

    The HAM10000 dataset served as the training set for the ISIC 2018 challenge (Task 3). The test-set images are available herein as ISIC2018_Task3_Test_Images.zip (1511 images); the official validation set is available through the challenge website https://challenge2018.isic-archive.com/. The ISIC-Archive also provides a "Live challenge" submission site for continuous evaluation of automated classifiers on the official validation and test sets.

    Comparison to physicians

    Test-set evaluations of the ISIC 2018 challenge were compared to physicians on an international scale, where the majority of challenge participants outperformed expert readers: Tschandl P. et al., Lancet Oncol 2019

    Human-computer collaboration

    The test-set images were also used in a study comparing different methods and scenarios of human-computer collaboration: Tschandl P. et al., Nature Medicine 2020

    The following corresponding metadata is available herein:

    • ISIC2018_Task3_Test_NatureMedicine_AI_Interaction_Benefit.csv: Human ratings for test images with and without interaction with a ResNet34 CNN (Malignancy Probability, Multi-Class probability, CBIR) or Human-Crowd Multi-Class probabilities. This data was collected for and analyzed in Tschandl P. et al., Nature Medicine 2020; please refer to this publication when using the data.
    • HAM10000_segmentations_lesion_tschandl.zip: To evaluate regions of CNN activations in Tschandl P. et al., Nature Medicine 2020 (please refer to this publication when using the data), a single dermatologist (Tschandl P) created binary segmentation masks for all 10015 images from the HAM10000 dataset. Masks were initialized with the segmentation network as described by Tschandl et al., Computers in Biology and Medicine 2019, and subsequently verified, corrected or replaced via the free-hand selection tool in FIJI.
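Because the dataset includes lesions with multiple images (tracked by lesion_id), a train/validation split made per image can place views of the same lesion on both sides. One common way to avoid this is to split per lesion. The sketch below assumes metadata rows are dicts with a `lesion_id` key; the `image_id` key in the usage example, the function name, and the 20% validation fraction are illustrative assumptions rather than anything the dataset prescribes.

```python
import random
from collections import defaultdict


def split_by_lesion(rows, val_fraction=0.2, seed=0):
    """Split metadata rows so all images of one lesion land on the same side."""
    by_lesion = defaultdict(list)
    for row in rows:
        by_lesion[row["lesion_id"]].append(row)

    # Sort first so the shuffle is reproducible for a given seed.
    lesion_ids = sorted(by_lesion)
    random.Random(seed).shuffle(lesion_ids)

    n_val = int(len(lesion_ids) * val_fraction)
    val = [row for lid in lesion_ids[:n_val] for row in by_lesion[lid]]
    train = [row for lid in lesion_ids[n_val:] for row in by_lesion[lid]]
    return train, val


# Usage with made-up rows: 5 lesions, 2 images each.
rows = [
    {"lesion_id": f"L{i}", "image_id": f"IMG{i}_{j}"}
    for i in range(5)
    for j in range(2)
]
train_rows, val_rows = split_by_lesion(rows)
```

With these ten made-up rows, one lesion (two images) goes to validation and the other four lesions (eight images) to training, with no lesion shared between the two.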

  10. HAM10000 Lesion Segmentations

    • kaggle.com
    Updated Jul 2, 2020
    Cite
    chdlr (2020). HAM10000 Lesion Segmentations [Dataset]. https://www.kaggle.com/datasets/tschandl/ham10000-lesion-segmentations/
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
    Dataset updated
    Jul 2, 2020
    Dataset provided by
    Kaggle http://kaggle.com/
    Authors
    chdlr
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Context

    Dermatoscopic images usually depict a single skin lesion, but large-scale datasets with segmentations of the affected areas were not available until now. Challenge segmentation data often suffered from being either too coarse or too noisy. This dataset provides 10015 binary segmentation masks based on FCN-created segmentations and hand-drawn lines, which together with the HAM10000 diagnosis metadata can be used for object detection or semantic segmentation.

    Content

    This dataset contains binary segmentation masks as PNG files for all HAM10000 dataset images. Each mask segments the lesion area as evaluated by a single dermatologist (me). The masks were initialized with an FCN lesion segmentation model; afterwards I went through all of them and either approved them or corrected / redrew them with the free-hand selection tool in FIJI.
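For object detection, a binary mask is often reduced to a bounding box around the lesion. The following sketch shows that reduction on a mask held as a nested list of pixel values (nonzero meaning lesion); loading the PNG into such a structure is left out, and the helper name `mask_bounding_box` is hypothetical.

```python
def mask_bounding_box(mask):
    """Return (x_min, y_min, x_max, y_max) of nonzero pixels, inclusive.

    `mask` is a list of rows, each a list of pixel values.
    Returns None when the mask contains no lesion pixels.
    """
    xs, ys = [], []
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            if value:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return min(xs), min(ys), max(xs), max(ys)
```

For large masks an array library would be faster; the nested-list version is used here only to keep the example dependency-free.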

    You can find the HAM10000 dataset images at the following places:

    • Harvard Dataverse: https://doi.org/10.7910/DVN/DBW86T
    • ISIC Archive Gallery: https://www.isic-archive.com
    • Kaggle Dataset Kernel (downsampled): https://www.kaggle.com/kmader/skin-cancer-mnist-ham10000

    Acknowledgements

    If you use this data, please cite/refer to the publication I made these segmentation masks for...

    ...and the original source of the images:

  11. PROVe-AI

    • api.isic-archive.com
    Updated 2022
    Cite
    Memorial Sloan Kettering Cancer Center (2022). PROVe-AI [Dataset]. http://doi.org/10.34970/576276
    Explore at:
    Dataset updated
    2022
    Dataset provided by
    DataCite https://www.datacite.org/
    ISIC Archive
    Authors
    Memorial Sloan Kettering Cancer Center
    License

    CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    We conducted a prospective, observational clinical validation study to assess the diagnostic accuracy of the AI algorithm (ADAE) in predicting melanoma from dermoscopy skin lesion images. Patients who had consented for a skin biopsy to exclude melanoma were eligible. All lesions underwent biopsy.

  12. ISIC_WSM

    • opendatalab.com
    zip
    Updated Jun 13, 2024
    Cite
    University of Siena (2024). ISIC_WSM [Dataset]. https://opendatalab.com/OpenDataLab/ISIC_WSM
    Explore at:
    Available download formats: zip
    Dataset updated
    Jun 13, 2024
    Dataset provided by
    University of Siena
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The ISIC_WSM dataset provides pixel-level supervision for a subset of images (43885) from the ISIC archive; the original images can be downloaded separately from the ISIC website. The supervision is obtained from the available bounding boxes of the COCO-Text dataset by exploiting a weakly supervised algorithm. See the paper for more details.

  13. NLP_SKIN_DATA_PS_DD

    • kaggle.com
    Updated Jul 4, 2025
    Cite
    HARINI SHREE R (2025). NLP_SKIN_DATA_PS_DD [Dataset]. http://doi.org/10.34740/kaggle/dsv/12368953
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
    Dataset updated
    Jul 4, 2025
    Dataset provided by
    Kaggle http://kaggle.com/
    Authors
    HARINI SHREE R
    License

    CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/

    Description

    📄 Context

    Skin diseases are among the most common health concerns worldwide, ranging from benign lesions like keratosis to serious conditions such as melanoma. Early and accurate diagnosis plays a vital role in preventing disease progression and improving patient outcomes. This dataset aims to assist in developing AI-driven dermatology tools by providing structured information on various skin diseases, their definitions, patient-described symptoms, and associated clinical images.

    🔍 Sources

    The dataset is compiled from a combination of:

    • Publicly available dermatological image repositories, such as the ISIC (International Skin Imaging Collaboration) archive, which contains labeled dermoscopic images of skin lesions.
    • Clinical literature and dermatology textbooks, used to write concise disease definitions.
    • Simulated patient statements, reflecting typical ways in which patients describe their skin conditions during clinical consultations. These were generated based on clinical case studies and patient interviews found in dermatology research papers.
    • Synthetic aggregation: File names refer to images associated with each disease class, meant for easy integration with machine learning pipelines.

    🌟 Inspiration

    This dataset was inspired by the growing need for:

    • Explainable AI (XAI) in dermatology: Making machine learning models more understandable to clinicians and patients.
    • Bridging the gap between clinical terminology and patient language: Helping AI models learn how real patients describe their symptoms, enhancing the usability of teledermatology tools.
    • Supporting education and research: Assisting medical students, researchers, and AI developers in understanding skin diseases in both clinical and layman contexts.
    • Enabling multi-modal learning: Combining text descriptions, disease definitions, and images to train more robust models that can reason across data types.

    📄 Column Descriptions

    • Disease Class - The name of the skin disease type (e.g., Actinic Keratosis, Melanoma, Benign Keratosis, etc.). There are 9 unique classes.
    • Disease Definition - A clinical description explaining the nature and characteristics of the disease.
    • Major Statement - Simulated patient descriptions or questions that reflect how individuals typically describe their symptoms.
    • File Name - The corresponding image file name related to the disease case.

  14. Dermatology Image and Text Dataset for AI-Powered Diagnosis and RAG-Based...

    • ieee-dataport.org
    Updated May 1, 2025
    Cite
    Emre Olca (2025). Dermatology Image and Text Dataset for AI-Powered Diagnosis and RAG-Based Medical Support [Dataset]. https://ieee-dataport.org/documents/dermatology-image-and-text-dataset-ai-powered-diagnosis-and-rag-based-medical-support
    Explore at:
    Dataset updated
    May 1, 2025
    Authors
    Emre Olca
    Description

    100 high-resolution


International Skin Imaging Collaboration (ISIC) Archive

267 scholarly articles cite this dataset (View in Google Scholar)
