17 datasets found
  1. Skin Cancer - The HAM10000 dataset

    • kaggle.com
    Updated Jul 1, 2024
    Cite
    Élio Cordeiro Pereira (2024). Skin Cancer - The HAM10000 dataset [Dataset]. https://www.kaggle.com/datasets/eliocordeiropereira/skin-cancer-the-ham10000-dataset/code
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jul 1, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Élio Cordeiro Pereira
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    The Original Dataset

    The source dataset and its full description may be accessed through the Harvard Dataverse, and should be cited as

    Tschandl, Philipp, 2018, "The HAM10000 dataset, a large collection of multi-source dermatoscopic images of common pigmented skin lesions", https://doi.org/10.7910/DVN/DBW86T, Harvard Dataverse, V4, UNF:6:KCZFcBLiFE5ObWcTc2ZBOA== [fileUNF]

    The Current Dataset

    The dataset uploaded here does not contain all of the source material. It omits the file ISIC2018_Task3_Test_NatureMedicine_AI_Interaction_Benefit.tab, which contains data on a study involving human-computer collaboration, and the folder HAM10000_segmentations_lesion_tschandl, which contains binary segmentation masks of the training images. Still, in contrast to most of the HAM10000 datasets published on Kaggle, it includes the test dataset that was curated for the ISIC 2018 challenge (Task 3).

    Description

    Files and folders

    The uploaded dataset comprises 3 folders and 2 files, described in the table below.

    Content                             | Type   | Description
    HAM10000_images_part_1              | folder | Part 1 of a set of training pictures
    HAM10000_images_part_2              | folder | Part 2 of a set of training pictures
    ISIC2018_Task3_Test_Images          | folder | Set of test pictures
    HAM10000_metadata.csv               | file   | Metadata associated with the training data
    ISIC2018_Task3_Test_GroundTruth.csv | file   | Metadata associated with the test data



    The training dataset (HAM10000_images_part_1 and HAM10000_images_part_2) is called "HAM10000", meaning "Human Against Machine with 10000 training images" (actually 10015 images). It is a large collection of multi-source dermatoscopic RGB images (JPG) of common pigmented skin lesions. The test dataset (ISIC2018_Task3_Test_Images) contains 511 images. The files HAM10000_metadata.csv and ISIC2018_Task3_Test_GroundTruth.csv contain the respective metadata (data about the data), which include the labels along with other features.
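    Because the training pictures are split across two folders, a small lookup from image ID to file path can be handy. The sketch below is one possible way to build it with the standard library only. The folder and file names are the ones listed above; the root path, the lowercase ".jpg" extension and the helper names index_images and attach_paths are assumptions made for illustration.

```python
import csv
from pathlib import Path

# Folder names come from the description above; the dataset root is an assumption.
TRAIN_DIRS = ("HAM10000_images_part_1", "HAM10000_images_part_2")


def index_images(root, folders=TRAIN_DIRS):
    """Map each image ID (the file name without extension) to its JPG path."""
    index = {}
    for folder in folders:
        # Assumes lowercase ".jpg" file names; adjust the pattern if they differ.
        for path in sorted(Path(root, folder).glob("*.jpg")):
            index[path.stem] = path
    return index


def attach_paths(metadata_csv, index):
    """Read the metadata CSV and pair every row with its image path (None if absent)."""
    with open(metadata_csv, newline="", encoding="utf-8") as handle:
        return [(row, index.get(row["image_id"])) for row in csv.DictReader(handle)]
```

    A call such as attach_paths("HAM10000_metadata.csv", index_images(".")) would then yield (row, path) pairs. Rows whose picture is missing keep a path of None, so they can be reported rather than silently dropped.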

    Columns of the metadata files

    The structure of the metadata files follows the template in the table below.

    Column       | Type   | Description
    lesion_id    | String | ID of the lesion case
    image_id     | String | ID of an image (also the name of the respective JPG file) associated with that case
    dx           | String | Label of that case
    dx_type      | String | Method used for diagnosing that case
    age          | Float  | Age of the person associated with that case
    sex          | String | Sex of the person associated with that case
    localization | String | Location of the lesion on the person's body
    dataset      | String | Reference from which the data was taken
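    Since image_id identifies an image "associated with" a lesion case, one lesion may have several images, and a random split by image could place the same lesion in both training and validation data. The sketch below shows one way to parse rows in the column layout above and split them by lesion_id. The sample rows are made up, the empty age cell is an assumption about how missing values might appear, and read_rows and split_by_lesion are hypothetical helper names.

```python
import csv
import io
import random
from collections import defaultdict

# Made-up sample rows in the column layout described above (not real records).
SAMPLE = """lesion_id,image_id,dx,dx_type,age,sex,localization,dataset
LESION_A,IMG_1,nv,histo,45.0,male,back,source_x
LESION_A,IMG_2,nv,histo,45.0,male,back,source_x
LESION_B,IMG_3,mel,histo,,female,face,source_y
LESION_C,IMG_4,bkl,consensus,70.0,female,trunk,source_x
"""


def read_rows(handle):
    """Parse metadata rows, converting age to float (None when the cell is empty)."""
    rows = []
    for row in csv.DictReader(handle):
        row["age"] = float(row["age"]) if row["age"] else None
        rows.append(row)
    return rows


def split_by_lesion(rows, val_fraction=0.25, seed=0):
    """Split rows so that all images of one lesion land on the same side."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["lesion_id"]].append(row)
    lesion_ids = sorted(groups)
    random.Random(seed).shuffle(lesion_ids)
    n_val = max(1, round(len(lesion_ids) * val_fraction))
    val = [r for lid in lesion_ids[:n_val] for r in groups[lid]]
    train = [r for lid in lesion_ids[n_val:] for r in groups[lid]]
    return train, val


rows = read_rows(io.StringIO(SAMPLE))
train, val = split_by_lesion(rows)
print(len(train), "training rows,", len(val), "validation rows")
```

    Shuffling lesion IDs rather than rows keeps the two sides disjoint at the lesion level, at the cost of a validation share that only approximates val_fraction when lesions have different numbers of images.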



    Values of the metadata dx column (the classes)

    The values that the column dx may take are tabulated below.

    Value | Description
    akiec | Actinic keratoses and intraepithelial carcinoma (also called "Bowen's disease") - an early form of skin cancer
    bcc   | Basal cell carcinoma - the most common type of skin cancer
    bkl   | Benign keratosis-like lesions (solar lentigines / seborrheic keratoses and lichen-planus like keratoses) - common and benign
    df    | Dermatofibroma - common and benign
    mel   | Melanoma - a type of skin cancer arising from melanocytes (the melanin-producing cells)
    nv    | Melanocytic nevus - the medical term for a mole (benign)
    vasc  | Vascular lesions (angiomas, angiokeratomas, pyogenic granulomas and hemorrhage) (benign)
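    For model training, the seven dx codes usually have to be turned into integer labels. The sketch below is one possible encoding: the class codes are the ones tabulated above, while the integer order, the coarse cancer/benign grouping (read off the table's wording as an illustrative simplification) and the helper names are assumptions.

```python
from collections import Counter

# Class codes in the order tabulated above; the integer indices are an arbitrary choice.
DX_CLASSES = ("akiec", "bcc", "bkl", "df", "mel", "nv", "vasc")
DX_TO_INDEX = {code: index for index, code in enumerate(DX_CLASSES)}

# Coarse grouping read off the descriptions above (an illustrative simplification):
# akiec, bcc and mel are described as skin cancer, the other four as benign.
CANCER_CODES = frozenset({"akiec", "bcc", "mel"})


def encode(dx):
    """Return the integer index of a dx code; unknown codes raise ValueError."""
    if dx not in DX_TO_INDEX:
        raise ValueError(f"unknown dx value: {dx!r}")
    return DX_TO_INDEX[dx]


def is_cancer(dx):
    """True for the codes the table describes as skin cancer."""
    encode(dx)  # validates the code first
    return dx in CANCER_CODES


def class_counts(labels):
    """Count how often each known class code occurs, keeping zero counts."""
    counts = Counter(labels)
    return {code: counts.get(code, 0) for code in DX_CLASSES}


labels = ["nv", "nv", "mel", "bcc", "nv"]  # made-up labels for illustration
print(class_counts(labels))
print([encode(dx) for dx in labels])  # [5, 5, 4, 1, 5]
```

    Keeping zero counts in class_counts makes it easy to spot classes that are absent from a given split.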



    Values of the metadata dx_type column (the diagnosis methods)

    The table below presents the values of the column dx_type.

    Value     | Description
    histo     | Histopathology
    follow_up | Follow-up examination
    consensus | Expert consensus
    confocal  | In-vivo confocal microscopy
  2. HAM10000 (Resized and Pickled)

    • kaggle.com
    Updated Mar 30, 2023
    Cite
    khaganmv (2023). HAM10000 (Resized and Pickled) [Dataset]. https://www.kaggle.com/datasets/khaganmv/ham10000-resized-and-pickled
    Dataset updated
    Mar 30, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    khaganmv
    Description

    This is the Skin Cancer MNIST: HAM10000 dataset, but resized and pickled. The images are resized from 600x450 pixels down to 40x30 pixels and then pickled using Python's pickle library.

    pickle5 is required to unpickle the dataset, so make sure to add !pip3 install pickle5 and import pickle5 as pickle to your notebook.
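    The pickle5 package is a backport of pickle protocol 5 for Python versions before 3.8. From Python 3.8 on, the standard pickle module reads protocol 5 itself. A loader that works in both cases might look like the sketch below; the function name load_pickled is hypothetical, and the file to pass in depends on the dataset's actual file names, which are not listed here.

```python
try:
    import pickle5 as pickle  # backport, only needed on Python < 3.8
except ImportError:
    import pickle  # the standard library handles protocol 5 from Python 3.8 on


def load_pickled(path):
    """Unpickle one file. Unpickling can run arbitrary code, so only load trusted files."""
    with open(path, "rb") as handle:
        return pickle.load(handle)
```

    With this fallback, the same notebook cell runs whether or not pickle5 is installed.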

  3. Ham10000 Skin Cancer Detection Dataset

    • universe.roboflow.com
    zip
    Updated Jul 3, 2025
    Cite
    Reis (2025). Ham10000 Skin Cancer Detection Dataset [Dataset]. https://universe.roboflow.com/reis-fetxi/ham10000-skin-cancer-detection
    Available download formats: zip
    Dataset updated
    Jul 3, 2025
    Dataset authored and provided by
    Reis
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Skin Diseases Bounding Boxes
    Description

    About the HAM10000-SKIN-CANCER-DETECTION Dataset

    This dataset is an adapted version of the original "Skin Cancer MNIST" (HAM10000), converted from a classification task to a skin lesion detection task. It contains:
    - 10,000 images of human skin lesions, manually annotated with bounding boxes.
    - A division into 7 main classes of skin lesions:
    1. Actinic keratoses and intraepithelial carcinoma/Bowen disease (akiec): premalignant lesions.
    2. Basal cell carcinoma (bcc): a type of skin cancer with a good prognosis.
    3. Benign lesions of the keratosis type (bkl): include solar lentigo, seborrheic keratosis and lichenoid keratosis.
    4. Dermatofibroma (df): common benign lesions.
    5. Melanoma (mel): a malignant lesion with high clinical priority.
    6. Melanocytic nevi (nv): very common benign melanocytic lesions.
    7. Vascular lesions (vasc): include angiomas, angiokeratomas, pyogenic granulomas and hemorrhages.

  4. HAM10000 Lesion Segmentations

    • kaggle.com
    Updated Jul 2, 2020
    Cite
    chdlr (2020). HAM10000 Lesion Segmentations [Dataset]. https://www.kaggle.com/tschandl/ham10000-lesion-segmentations/code
    Dataset updated
    Jul 2, 2020
    Dataset provided by
    Kaggle
    Authors
    chdlr
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Context

    Dermatoscopic images usually depict a single skin lesion, but large-scale datasets with segmentations of the affected areas have not been available until now. Challenge segmentation data often suffered from being either too coarse or too noisy. This dataset provides 10015 binary segmentation masks based on FCN-created segmentations and hand-drawn lines, which together with the HAM10000 diagnosis metadata can be used for object detection or semantic segmentation.

    Content

    This dataset contains binary segmentation masks as PNG files of all HAM10000 dataset images. The masks delineate the lesion area as evaluated by a single dermatologist (me). They were initialized with an FCN lesion segmentation model, after which I went through all of them and either approved them or corrected / redrew them with the free-hand selection tool in FIJI.
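    To use such masks for object detection, each binary mask has to be reduced to a bounding box. The sketch below shows the idea on a tiny hand-written mask stored as nested lists; reading the actual PNG files would need an image library, which is left out here. The function name mask_to_bbox and the inclusive (x_min, y_min, x_max, y_max) convention are assumptions for illustration.

```python
def mask_to_bbox(mask):
    """Return (x_min, y_min, x_max, y_max) of the non-zero pixels, or None if empty.

    `mask` is a list of rows, each a list of 0/1 (or 0/255) values.
    Coordinates are inclusive pixel indices.
    """
    ys = [y for y, row in enumerate(mask) if any(row)]
    if not ys:
        return None
    xs = [x for row in mask for x, value in enumerate(row) if value]
    return min(xs), ys[0], max(xs), ys[-1]


demo = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
print(mask_to_bbox(demo))  # (1, 1, 3, 2)
```

    Returning None for an empty mask lets the caller decide whether to skip the image or treat it as an error.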

    You can find the HAM10000 dataset images at the following places:
    - Harvard Dataverse: https://doi.org/10.7910/DVN/DBW86T
    - ISIC Archive Gallery: https://www.isic-archive.com
    - Kaggle Dataset Kernel (downsampled): https://www.kaggle.com/kmader/skin-cancer-mnist-ham10000

    Acknowledgements

    If you use this data, please cite/refer to the publication I made these segmentation masks for...

    ...and the original source of the images:

  5. ham1ok

    • huggingface.co
    Updated Apr 21, 2024
    + more versions
    Cite
    karol adel (2024). ham1ok [Dataset]. https://huggingface.co/datasets/karoladelk/ham1ok
    Dataset updated
    Apr 21, 2024
    Authors
    karol adel
    Description

    The HAM10000 dataset, a large collection of multi-source dermatoscopic images of common pigmented skin lesions

    Original Paper and Dataset here Kaggle dataset here

    Introduction to datasets

    Training of neural networks for automated diagnosis of pigmented skin lesions is hampered by the small size and lack of diversity of available datasets of dermatoscopic images. We tackle this problem by releasing the HAM10000 ("Human Against Machine with 10000 training images") dataset.… See the full description on the dataset page: https://huggingface.co/datasets/karoladelk/ham1ok.

  6. HAM10000-image

    • kaggle.com
    Updated Jul 31, 2023
    Cite
    Vu Ngoc Binh (2023). HAM10000-image [Dataset]. https://www.kaggle.com/datasets/vungocbinh/ham10000-image/data
    Dataset updated
    Jul 31, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Vu Ngoc Binh
    Description

    Dataset

    This dataset was created by Vu Ngoc Binh

    Contents

  7. ham10000

    • kaggle.com
    Updated May 11, 2025
    Cite
    Hijibiji_Hijibiji (2025). ham10000 [Dataset]. https://www.kaggle.com/datasets/hijibijihijibiji/ham10000
    Dataset updated
    May 11, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Hijibiji_Hijibiji
    Description

    Dataset

    This dataset was created by Hijibiji_Hijibiji

    Contents

  8. Final Augumented Data

    • kaggle.com
    Updated Apr 14, 2024
    Cite
    gowtham_mandla (2024). Final Augumented Data [Dataset]. https://www.kaggle.com/datasets/mandlagowtham/final-augumented-data/suggestions?status=pending&yourSuggestions=true
    Dataset updated
    Apr 14, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    gowtham_mandla
    Description

    Dataset

    This dataset was created by gowtham_mandla

    Contents

  9. HAM10000-single-dir

    • kaggle.com
    Updated Feb 26, 2024
    Cite
    Iulia-Georgiana Talpalariu (2024). HAM10000-single-dir [Dataset]. https://www.kaggle.com/datasets/iuliali/ham10000-single-dir/discussion
    Dataset updated
    Feb 26, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Iulia-Georgiana Talpalariu
    Description

    Dataset

    This dataset was created by Iulia-Georgiana Talpalariu

    Contents

  10. BBox Ham10000

    • kaggle.com
    Updated May 3, 2024
    Cite
    OUAHABI Benhenni (2024). BBox Ham10000 [Dataset]. https://www.kaggle.com/datasets/ouahabibenhenni/bbox-ham10000
    Dataset updated
    May 3, 2024
    Dataset provided by
    Kaggle
    Authors
    OUAHABI Benhenni
    Description

    Dataset

    This dataset was created by OUAHABI Benhenni

    Contents

  11. test_ham10000

    • kaggle.com
    Updated Oct 25, 2024
    Cite
    Dai Nguyen 2 (2024). test_ham10000 [Dataset]. https://www.kaggle.com/datasets/dainguyen2/test-ham10000/code
    Dataset updated
    Oct 25, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Dai Nguyen 2
    Description

    Dataset

    This dataset was created by Dai Nguyen 2

    Contents

  12. HAM10000_urfu

    • kaggle.com
    Updated Apr 24, 2023
    Cite
    AntonRL124c (2023). HAM10000_urfu [Dataset]. https://www.kaggle.com/datasets/antonrl124c/ham10000-urfu/discussion
    Dataset updated
    Apr 24, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    AntonRL124c
    Description

    Dataset

    This dataset was created by AntonRL124c

    Contents

  13. HAM10000_official

    • kaggle.com
    Updated Jun 27, 2023
    Cite
    VIVEK NARAYAN 21114108 (2023). HAM10000_official [Dataset]. https://www.kaggle.com/datasets/viveknarayan21114108/ham10000-official
    Dataset updated
    Jun 27, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    VIVEK NARAYAN 21114108
    Description

    Dataset

    This dataset was created by VIVEK NARAYAN 21114108

    Contents

  14. HAM 10000 Segmented + Augmented

    • kaggle.com
    Updated Sep 25, 2024
    Cite
    Aranya Saha (2024). HAM 10000 Segmented + Augmented [Dataset]. https://www.kaggle.com/datasets/aranyasaha/ham-10000-segmented-augmented/versions/1
    Dataset updated
    Sep 25, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Aranya Saha
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset

    This dataset was created by Aranya Saha

    Released under MIT

    Contents

  15. U-NET Image Segmentation HAM10000

    • kaggle.com
    Updated Nov 24, 2024
    Cite
    Jihaad Pangestu (2024). U-NET Image Segmentation HAM10000 [Dataset]. https://www.kaggle.com/datasets/jihaadpangestu/u-net-image-segmentation-ham10000/data
    Dataset updated
    Nov 24, 2024
    Dataset provided by
    Kaggle
    Authors
    Jihaad Pangestu
    Description

    Dataset

    This dataset was created by Jihaad Pangestu

    Contents

  16. Synthetic Skin Disease Dataset/Real and Synthetic

    • kaggle.com
    Updated Aug 7, 2024
    Cite
    DevDope (2024). Synthetic Skin Disease Dataset/Real and Synthetic [Dataset]. https://www.kaggle.com/datasets/devdope/synthetic-skin-disease-datasetreal-and-synthetic/code
    Dataset updated
    Aug 7, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    DevDope
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description

    Dataset Title

    Skin Disease GAN-Generated and Original Images Lightweight Dataset

    General Description

    This dataset is a collection of skin disease images generated using a Generative Adversarial Network (GAN) approach. Specifically, a GAN was utilized with Stable Diffusion as the generator and a transformer-based discriminator to create realistic images of various skin diseases. The GAN approach enhances the accuracy and realism of the generated images, making this dataset a valuable resource for machine learning and computer vision applications in dermatology.

    Creation Process

    To create this dataset, a series of Low-Rank Adaptations (LoRAs) were generated for each disease category. These LoRAs were trained on the base dataset with 60 epochs and 30,000 steps using OneTrainer. Images were then generated for the following disease categories:

    • Herpes
    • Measles
    • Chickenpox
    • Monkeypox

    Due to the availability of ample public images, Melanoma was excluded from the generation process. The Fooocus API served as the generator within the GAN framework, creating images based on the LoRAs.

    To ensure quality and accuracy, a transformer-based discriminator was employed to verify the generated images, classifying them into the correct disease categories.

    Sources

    The original base dataset used to create this GAN-based dataset includes reputable sources such as:

    • 2019 HAM10000 Challenge
    • Kaggle
    • Google Images
    • Dermnet NZ
    • Bing Images
    • Yandex
    • Hellenic Atlas
    • Dermatological Atlas

    The LoRAs and their recommended weights for generating images are available for download on our CivitAi profile. You can refer to this profile for detailed instructions and access to the LoRAs used in this dataset.

    Dataset Contents

    Generated Images: High-quality images of skin diseases generated via GAN with Stable Diffusion, using transformer-based discrimination for accurate classification.

    Categories

    • Herpes
    • Measles
    • Chickenpox
    • Monkeypox

    Each image corresponds to one of these four categories, providing a reliable set of generated data for training and evaluation. Melanoma was excluded from generation due to the abundance of public data.

    Suggested Use Cases

    This dataset is suitable for:

    • Image Classification and Augmentation Tasks: Training and evaluating models in skin disease classification, with additional augmentation from generated images.
    • Research in Dermatology and GAN Techniques: Investigating the effectiveness of GANs for generating medical images, as well as exploring the use of transformer-based discrimination.
    • Educational Projects in AI and Medicine: Offering insights into image generation for diagnostic purposes, combining GANs and Stable Diffusion with transformers for medical datasets.

    Citation

    Garcia-Espinosa, E., Ruiz-Castilla, J. S., & Garcia-Lamont, F. (2025). Generative AI and Transformers in Advanced Skin Lesion Classification applied on a mobile device. International Journal of Combinatorial Optimization Problems and Informatics, 16(2), 158–175. https://doi.org/10.61467/2007.1558.2025.v16i2.1078


    Espinosa, E.G., Castilla, J.S.R., Lamont, F.G. (2025). Skin Disease Pre-diagnosis with Novel Visual Transformers. In: Figueroa-García, J.C., Hernández, G., Suero Pérez, D.F., Gaona García, E.E. (eds) Applied Computer Sciences in Engineering. WEA 2024. Communications in Computer and Information Science, vol 2222. Springer, Cham. https://doi.org/10.1007/978-3-031-74595-9_10

  17. ham_10000_dataset_test_train

    • kaggle.com
    Updated Nov 1, 2024
    Cite
    Manodeep Ray (2024). ham_10000_dataset_test_train [Dataset]. https://www.kaggle.com/datasets/raymanodeep/ham-10000-dataset-test-train/discussion
    Dataset updated
    Nov 1, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Manodeep Ray
    License

    CC0: Public Domain (https://creativecommons.org/publicdomain/zero/1.0/)

    Description

    Dataset

    This dataset was created by Manodeep Ray

    Released under CC0: Public Domain

    Contents

Skin Cancer - The HAM10000 dataset

Multi-source dermatoscopic images of common pigmented skin lesions

7 scholarly articles cite this dataset.