6 datasets found
  1. Malaria Bounding Boxes

    P. vivax (malaria) infected human blood smears

    • kaggle.com
    zip
    Updated May 9, 2019
    Cite
    K Scott Mader (2019). Malaria Bounding Boxes [Dataset]. https://www.kaggle.com/kmader/malaria-bounding-boxes
    Available download formats
    zip (4517591256 bytes)
    Dataset updated
    May 9, 2019
    Authors
    K Scott Mader
    License

    Attribution-NonCommercial-ShareAlike 3.0 (CC BY-NC-SA 3.0): https://creativecommons.org/licenses/by-nc-sa/3.0/
    License information was derived automatically

    Description

    Context

    Malaria is a disease caused by Plasmodium parasites that remains a major threat in global health, affecting 200 million people and causing 400,000 deaths a year. The main species of malaria that affect humans are Plasmodium falciparum and Plasmodium vivax.

    For malaria as well as other microbial infections, manual inspection of thick and thin blood smears by trained microscopists remains the gold standard for parasite detection and stage determination because of its low reagent and instrument cost and high flexibility. Despite manual inspection being extremely low throughput and susceptible to human bias, automatic counting software remains largely unused because of the wide range of variations in brightfield microscopy images. However, a robust automatic counting and cell classification solution would provide enormous benefits due to faster and more accurate quantitative results without human variability; researchers and medical professionals could better characterize stage-specific drug targets and better quantify patient reactions to drugs.

    Previous attempts to automate the process of identifying and quantifying malaria have not gained major traction, partly due to the difficulty of replication, comparison, and extension. Authors also rarely make their image sets available, which precludes replication of results and assessment of potential improvements. The lack of both a standard set of images and a standard set of metrics for reporting results has impeded the field.

    Content

    Images are in .png or .jpg format. There are 3 sets of images consisting of 1364 images (~80,000 cells) with different researchers having prepared each one: from Brazil (Stefanie Lopes), from Southeast Asia (Benoit Malleret), and time course (Gabriel Rangel). Blood smears were stained with Giemsa reagent.

    Labels

    The data consist of two classes of uninfected cells (RBCs and leukocytes) and four classes of infected cells (gametocytes, rings, trophozoites, and schizonts). Annotators were permitted to mark some cells as difficult if they were not clearly in one of the cell classes. The data are heavily imbalanced: uninfected RBCs make up over 95% of all cells, far outnumbering uninfected leukocytes and infected cells.

    A class label and set of bounding box coordinates were given for each cell. For all data sets, infected cells were given a class label by Stefanie Lopes, malaria researcher at the Dr. Heitor Vieira Dourado Tropical Medicine Foundation hospital, indicating stage of development or marked as difficult.
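
    The class imbalance described above is often handled by weighting classes inversely to their frequency. The following is a minimal sketch of that idea, not part of the dataset or its tooling. The label strings and the flat list of labels are assumptions for illustration; the actual annotation files may use different names and a different layout.

```python
from collections import Counter


def class_fractions(labels):
    """Return the fraction of cells that carry each class label."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


def inverse_frequency_weights(labels):
    """Return weights proportional to 1 / class frequency.

    With k classes and N cells, a class with n cells gets N / (k * n),
    so a perfectly balanced data set would give every class weight 1.0.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    k = len(counts)
    return {name: total / (k * n) for name, n in counts.items()}


# Hypothetical labels: 96 uninfected RBCs and a handful of other cells.
labels = ["RBC"] * 96 + ["ring"] * 2 + ["trophozoite"] + ["leukocyte"]

fractions = class_fractions(labels)
weights = inverse_frequency_weights(labels)
```

    In this made-up sample the rare classes receive much larger weights than the RBC class, which is the intended effect when training a classifier on heavily skewed data.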

    Acknowledgements

    Original data available from the Broad Institute Repository at https://data.broadinstitute.org/bbbc/BBBC041/

    These images were contributed by Jane Hung of MIT and the Broad Institute in Cambridge, MA.

    There is also a GitHub repository that lists malaria parasite imaging datasets (blood smears): https://github.com/tobsecret/Awesome_Malaria_Parasite_Imaging_Datasets

    Published results using this image set: These datasets will be evaluated in a publication to be submitted.

    Recommended citation: "We used image set BBBC041v1, available from the Broad Bioimage Benchmark Collection [Ljosa et al., Nature Methods, 2012]."

    Copyright: The images and ground truth are licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License by Jane Hung.

  2. AI4Life-MDC25-Nuclei

    • zenodo.org
    zip
    Updated Jun 11, 2025
    Cite
    Guillaume Jacquemet; Teresa Zulueta-Coarasa; Vebjorn Ljosa; Katherine L Sokolnicki; Anne E Carpenter (2025). AI4Life-MDC25-Nuclei [Dataset]. http://doi.org/10.5281/zenodo.15624741
    Available download formats
    zip
    Dataset updated
    Jun 11, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Guillaume Jacquemet; Teresa Zulueta-Coarasa; Vebjorn Ljosa; Katherine L Sokolnicki; Anne E Carpenter
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset was used as a training set for the "AI4Life Microscopy Supervised Denoising Challenge 2025". It is based on subsets of two datasets:

    For more details about the original datasets, please consult the links above.

    For the details about this subset and its organization, please see https://ai4life-mdc25.grand-challenge.org/leaderboard-3/


    AI4Life has received funding from the European Union’s Horizon Europe research and innovation programme under grant agreement number 101057970. Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union or the European Research Council Executive Agency. Neither the European Union nor the granting authority can be held responsible for them.
  3. Cell Image Library Dataset CIL:32061

    • datamed.org
    Updated May 22, 2011
    Cite
    (2011). Cell Image Library Dataset CIL:32061 [Dataset]. https://datamed.org/display-item.php?repository=0024&id=595151425152c64c3b0ac0c8&query=positive
    Dataset updated
    May 22, 2011
    Variables measured
    fluorescence emission
    Description

    This image set is from a Transfluor assay in which an orphan GPCR (G-protein coupled receptor) is stably integrated into a U2OS cell line expressing beta-arrestin GFP (green); nuclei are also shown (red). After a one-hour incubation with a compound, the cells were quenched with fixative (formaldehyde) and the plate was read on a Cellomics ArrayScan HCS Reader using the GPCR Bioapplication. Negative results show a more uniform distribution of GFP throughout the cytoplasm, and positive results show a peri-nuclear localization. Recommended attribution: 'We used the SBS Roche Transfluor image set provided by Ilya Ravkin and available from the Broad Bioimage Benchmark Collection (www.broad.mit.edu/bbbc).' Analysis of the data set can be downloaded from http://www.broadinstitute.org/bbbc/sbs_roche_transfluor.html

  4. Jeffrey Skerker, Meg Rush, Roger Wiegand, Tom Morgan (2011) CIL:27223, Homo...

    • cellimagelibrary.org
    zip
    Updated May 9, 2011
    Cite
    CIL (2011). Jeffrey Skerker, Meg Rush, Roger Wiegand, Tom Morgan (2011) CIL:27223, Homo sapiens, blood cell, red blood cell. CIL. Dataset [Dataset]. http://doi.org/10.7295/W9CIL27223
    Available download formats
    zip
    Dataset updated
    May 9, 2011
    Dataset provided by
    CIL
    License

    http://creativecommons.org/choose/publicdomain-3?title=&copyright_holder=

    Description

    This image set consists of five differential interference contrast (DIC) images of red blood cells (approximately 8 microns in diameter). Each image in the set contains a DIC image and a thresholded image. There are two options when using this data set to quantify an algorithm's effectiveness. The simplest method is to compute the percentage of pixels that your algorithm and the thresholded image have in common. However, a wide class of algorithms for handling DIC images do not explicitly segment the image, but rather transform the image so that it can later be thresholded. These images were obtained from http://www.broadinstitute.org/bbbc/human_rbc_dic.html
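
    The "percentage of pixels in common" measure mentioned above can be sketched as follows. This is one plausible reading, assumed here: both masks are binary, have the same shape, and a pixel counts as shared when the two masks agree on it, whether foreground or background. The function name and the nested-list mask representation are illustrative choices, not part of the data set.

```python
def pixel_agreement(predicted, reference):
    """Return the percentage of pixels on which two binary masks agree.

    Both masks are sequences of rows; any truthy value counts as foreground.
    """
    if len(predicted) != len(reference):
        raise ValueError("masks must have the same number of rows")
    total = 0
    same = 0
    for row_p, row_r in zip(predicted, reference):
        if len(row_p) != len(row_r):
            raise ValueError("mask rows must have the same length")
        for p, r in zip(row_p, row_r):
            total += 1
            same += bool(p) == bool(r)
    if total == 0:
        raise ValueError("masks must contain at least one pixel")
    return 100.0 * same / total


# Tiny made-up masks: the two differ in exactly one of six pixels.
algorithm_mask = [[0, 1, 1],
                  [0, 0, 1]]
thresholded_mask = [[0, 1, 0],
                    [0, 0, 1]]
score = pixel_agreement(algorithm_mask, thresholded_mask)
```

    Because background pixels usually dominate, this score can look high even for a poor segmentation, so an overlap measure restricted to foreground pixels may be a useful complement.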

    DIC images attributed to Skerker, Rush, and Wiegand; the thresholded images attributed to Morgan.

  5. High-Content Screening with C.Elegans

    • kaggle.com
    Updated Apr 19, 2017
    Cite
    K Scott Mader (2017). High-Content Screening with C.Elegans [Dataset]. https://www.kaggle.com/kmader/high-content-screening-celegans/code
    Dataset updated
    Apr 19, 2017
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    K Scott Mader
    Description

    About

    These images are controls selected from a screen to find novel anti-infectives using the roundworm C. elegans. The animals were exposed to the pathogen Enterococcus faecalis and were either left untreated or treated with ampicillin, a known antibiotic against the pathogen. The untreated (negative control) worms display predominantly the "dead" phenotype: worms appear rod-like in shape and slightly uneven in texture. The treated (ampicillin, positive control) worms display predominantly the "live" phenotype: worms appear curved in shape and smooth in texture. For more information, please see Moy et al. (ACS Chem Biol, 2009) [http://dx.doi.org/10.1021/cb900084v]

    Images

    One image per channel (Channel 1 = brightfield; channel 2 = GFP) was acquired at MGH on a Discovery-1 automated microscope (Molecular Devices). Original image size is 696 x 520 pixels. Images are available in 16-bit TIF.

    Ground Truth

    The 384 images are from a plate of positive and negative controls. The images are named using this format: <plate>_<wellrow>_<wellcolumn>_<wavelength>_<fileid>.tif. Columns 1-12 are positive controls treated with ampicillin. Columns 13-24 are untreated negative controls.
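
    The naming scheme and column rule above can be turned into a small parser. The sketch below assumes file names follow the stated pattern literally, with the last four fields free of underscores; the real files may deviate from it, and the example file names are made up. `WellImage` and `parse_image_name` are hypothetical names.

```python
from pathlib import PurePath
from typing import NamedTuple


class WellImage(NamedTuple):
    plate: str
    well_row: str
    well_column: int
    wavelength: str
    file_id: str
    control: str  # "positive" (ampicillin) or "negative" (untreated)


def parse_image_name(filename):
    """Split <plate>_<wellrow>_<wellcolumn>_<wavelength>_<fileid>.tif into fields."""
    stem = PurePath(filename).stem
    # Split from the right so a plate name containing underscores stays intact.
    parts = stem.rsplit("_", 4)
    if len(parts) != 5:
        raise ValueError(f"unexpected file name: {filename!r}")
    plate, well_row, column_text, wavelength, file_id = parts
    well_column = int(column_text)
    if 1 <= well_column <= 12:
        control = "positive"
    elif 13 <= well_column <= 24:
        control = "negative"
    else:
        raise ValueError(f"well column out of range: {well_column}")
    return WellImage(plate, well_row, well_column, wavelength, file_id, control)


# Made-up file names that follow the documented pattern.
treated = parse_image_name("plate1_A_01_w1_abc123.tif")
untreated = parse_image_name("my_plate_P_24_w2_xyz.tif")
```

    Deriving the control label from the column number gives a per-image ground truth for the "live" versus "dead" phenotype without needing a separate label file.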

    We also provide human-corrected binary images of foreground/background segmentation. To address the problem of correctly segmenting individual worms even when they overlap or cluster, we provide one binary foreground/background segmentation ground truth image for each worm.

    Acknowledgements

    The data have been reposted from the original data taken from the Broad Institute. Please acknowledge the original source if this is used in other works. The original data can be found and downloaded here: https://data.broadinstitute.org/bbbc/BBBC010/

    These images were originally acquired for a screen in Fred Ausubel's lab at MGH. Please contact aconery AT molbio.mgh.harvard.edu for more information.

    Original Publication: http://dx.doi.org/10.1038/nmeth.1984

    Inspiration

  6. Ilya Ravkin (2011) CIL:32107, Homo sapiens, osteocarcenoma. CIL. Dataset

    • cellimagelibrary.org
    zip
    Updated May 21, 2011
    Cite
    CIL (2011). Ilya Ravkin (2011) CIL:32107, Homo sapiens, osteocarcenoma. CIL. Dataset [Dataset]. http://doi.org/10.7295/W9CIL32107
    Available download formats
    zip
    Dataset updated
    May 21, 2011
    Dataset provided by
    CIL
    License

    Attribution 3.0 (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/
    License information was derived automatically

    Description

    This image set is from a Transfluor assay in which an orphan GPCR (G-protein coupled receptor) is stably integrated into a U2OS cell line expressing beta-arrestin GFP (green); nuclei are also shown (red). After a one-hour incubation with a compound, the cells were quenched with fixative (formaldehyde) and the plate was read on a Cellomics ArrayScan HCS Reader using the GPCR Bioapplication. Negative results show a more uniform distribution of GFP throughout the cytoplasm, and positive results show a peri-nuclear localization. Recommended attribution: "We used the SBS Roche Transfluor image set provided by Ilya Ravkin and available from the Broad Bioimage Benchmark Collection (www.broad.mit.edu/bbbc)." Analysis of the data set can be downloaded from http://www.broadinstitute.org/bbbc/sbs_roche_transfluor.html

