Attribution-NonCommercial-ShareAlike 3.0 (CC BY-NC-SA 3.0) https://creativecommons.org/licenses/by-nc-sa/3.0/
License information was derived automatically
Malaria is a disease caused by Plasmodium parasites that remains a major threat to global health, affecting 200 million people and causing 400,000 deaths a year. The main Plasmodium species that affect humans are Plasmodium falciparum and Plasmodium vivax.
For malaria as well as other microbial infections, manual inspection of thick and thin blood smears by trained microscopists remains the gold standard for parasite detection and stage determination because of its low reagent and instrument cost and high flexibility. Despite manual inspection being extremely low throughput and susceptible to human bias, automatic counting software remains largely unused because of the wide range of variations in brightfield microscopy images. However, a robust automatic counting and cell classification solution would provide enormous benefits due to faster and more accurate quantitative results without human variability; researchers and medical professionals could better characterize stage-specific drug targets and better quantify patient reactions to drugs.
Previous attempts to automate the process of identifying and quantifying malaria have not gained major traction, partly due to the difficulty of replication, comparison, and extension. Authors also rarely make their image sets available, which precludes replication of results and assessment of potential improvements. The lack of a standard set of images and of a standard set of metrics for reporting results has impeded the field.
Images are in .png or .jpg format. There are 3 sets of images, totaling 1364 images (~80,000 cells), each prepared by a different researcher: from Brazil (Stefanie Lopes), from Southeast Asia (Benoit Malleret), and a time course (Gabriel Rangel). Blood smears were stained with Giemsa reagent.
The data consist of two classes of uninfected cells (RBCs and leukocytes) and four classes of infected cells (gametocytes, rings, trophozoites, and schizonts). Annotators were permitted to mark a cell as difficult if it did not clearly belong to one of the cell classes. The data are heavily imbalanced: uninfected RBCs make up over 95% of all cells, far outnumbering uninfected leukocytes and infected cells.
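One common way to compensate for an imbalance like this when training a classifier is to weight each class inversely to its frequency. The following is a minimal sketch of that idea, not a method described by the dataset authors. The function name `inverse_frequency_weights`, the label strings, and the toy label counts are all hypothetical and are not taken from the dataset.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return a weight per class so that rare classes count more.

    Uses the common "balanced" heuristic: total / (n_classes * class_count).
    """
    counts = Counter(labels)
    total = len(labels)
    n_classes = len(counts)
    return {cls: total / (n_classes * n) for cls, n in counts.items()}


# Hypothetical toy labels (NOT real dataset counts): 96% uninfected RBCs.
labels = (
    ["red blood cell"] * 96
    + ["ring"] * 2
    + ["trophozoite"] * 1
    + ["leukocyte"] * 1
)
weights = inverse_frequency_weights(labels)

for cls, weight in sorted(weights.items(), key=lambda item: item[1]):
    print(f"{cls}: {weight:.3f}")
```

With weights like these, a misclassified rare cell contributes far more to a weighted loss than a misclassified RBC; whether that is appropriate depends on the evaluation metric chosen.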
A class label and a set of bounding box coordinates were given for each cell. For all data sets, infected cells were labeled by Stefanie Lopes, malaria researcher at the Dr. Heitor Vieira Dourado Tropical Medicine Foundation hospital, with either their stage of development or the label "difficult".
Original data available from the Broad Institute Repository at https://data.broadinstitute.org/bbbc/BBBC041/
These images were contributed by Jane Hung of MIT and the Broad Institute in Cambridge, MA.
There is also a GitHub repository that lists malaria parasite imaging datasets (blood smears): https://github.com/tobsecret/Awesome_Malaria_Parasite_Imaging_Datasets
Published results using this image set: These datasets will be evaluated in a publication to be submitted.
Recommended citation: "We used image set BBBC041v1, available from the Broad Bioimage Benchmark Collection [Ljosa et al., Nature Methods, 2012]."
Copyright: The images and ground truth are licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License by Jane Hung.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset was used as a training set for the "AI4Life Microscopy Supervised Denoising Challenge 2025". It is based on subsets of two datasets:
This image set is of a Transfluor assay in which an orphan GPCR (G-protein coupled receptor) is stably integrated into a beta-arrestin GFP (green) expressing U2OS cell line, with the nucleus also shown (red). After one hour of incubation with a compound, the cells were quenched with fixative (formaldehyde) and the plate was read on a Cellomics ArrayScan HCS Reader using the GPCR Bioapplication. Negative results show a more uniform distribution of GFP throughout the cytoplasm, and positive results show a peri-nuclear localization. Recommended attribution: 'We used the SBS Roche Transfluor image set provided by Ilya Ravkin and available from the Broad Bioimage Benchmark Collection (www.broad.mit.edu/bbbc).' Analysis of the data set can be downloaded from http://www.broadinstitute.org/bbbc/sbs_roche_transfluor.html
http://creativecommons.org/choose/publicdomain-3?title=&copyright_holder=
This image set consists of five differential interference contrast (DIC) images of red blood cells (approximately 8 microns in diameter). Each image in the set contains a DIC image and a thresholded image. There are two options when using this data set to quantify an algorithm's effectiveness. The simplest method is to compute the percentage of pixels that your algorithm's output and the thresholded image have in common. However, a wide class of algorithms for handling DIC images do not explicitly segment the image, but rather transform the image so that it can later be thresholded. These images were obtained from http://www.broadinstitute.org/bbbc/human_rbc_dic.html
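The simplest evaluation described above, the percentage of pixels on which an algorithm's output and the thresholded image agree, can be sketched as follows. This is an illustrative sketch only: the function name `pixel_agreement` and the tiny example masks are hypothetical, and the masks are assumed to be already loaded as equally sized 2D sequences of binary values.

```python
def pixel_agreement(predicted, reference):
    """Return the percentage of pixels on which two binary masks agree."""
    if len(predicted) != len(reference):
        raise ValueError("masks must have the same number of rows")
    matches = 0
    total = 0
    for pred_row, ref_row in zip(predicted, reference):
        if len(pred_row) != len(ref_row):
            raise ValueError("mask rows must have the same length")
        for pred_pixel, ref_pixel in zip(pred_row, ref_row):
            # Compare as booleans so 255-valued masks work like 1-valued ones.
            matches += int(bool(pred_pixel) == bool(ref_pixel))
            total += 1
    if total == 0:
        raise ValueError("masks are empty")
    return 100.0 * matches / total


# Hypothetical 2 x 3 masks: they differ in exactly one of six pixels.
predicted = [[0, 1, 1], [0, 0, 1]]
reference = [[0, 1, 0], [0, 0, 1]]
score = pixel_agreement(predicted, reference)
print(f"{score:.1f}% of pixels agree")
```

Note that plain pixel agreement can look high when the background dominates the image, so it is best read alongside a foreground-sensitive measure.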
DIC images attributed to Skerker, Rush, and Wiegand; the thresholded images attributed to Morgan.
These images are controls selected from a screen to find novel anti-infectives using the roundworm C. elegans. The animals were exposed to the pathogen Enterococcus faecalis and either left untreated or treated with ampicillin, an antibiotic known to be effective against the pathogen. The untreated (negative control) worms display predominantly the "dead" phenotype: worms appear rod-like in shape and slightly uneven in texture. The treated (ampicillin, positive control) worms display predominantly the "live" phenotype: worms appear curved in shape and smooth in texture. For more information, please see Moy et al. (ACS Chem Biol, 2009) [http://dx.doi.org/10.1021/cb900084v]
One image per channel (Channel 1 = brightfield; channel 2 = GFP) was acquired at MGH on a Discovery-1 automated microscope (Molecular Devices). Original image size is 696 x 520 pixels. Images are available in 16-bit TIF.
The 384 images are from a plate of positive and negative controls. The images are named using this format:
<plate>_<wellrow>_<wellcolumn>_<wavelength>_<fileid>.tif
Columns 1-12 are positive controls treated with ampicillin. Columns 13-24 are untreated negative controls.
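The naming format and column layout above can be turned into a small parser that recovers the well position and control type from a file name. This is a sketch under stated assumptions: the names `WellImage` and `parse_image_name` and the example file name are hypothetical, the well column is assumed to be an integer, and the name is split from the right on the assumption that only the plate identifier may itself contain underscores.

```python
from pathlib import PurePath
from typing import NamedTuple


class WellImage(NamedTuple):
    plate: str
    well_row: str
    well_column: int
    wavelength: str
    file_id: str
    control: str


def parse_image_name(filename):
    """Parse <plate>_<wellrow>_<wellcolumn>_<wavelength>_<fileid>.tif."""
    stem = PurePath(filename).stem
    # Split from the right so underscores inside the plate field are kept.
    parts = stem.rsplit("_", 4)
    if len(parts) != 5:
        raise ValueError(f"unexpected file name: {filename!r}")
    plate, well_row, well_column, wavelength, file_id = parts
    column = int(well_column)
    if 1 <= column <= 12:
        control = "positive (ampicillin)"
    elif 13 <= column <= 24:
        control = "negative (untreated)"
    else:
        raise ValueError(f"well column out of range: {column}")
    return WellImage(plate, well_row, column, wavelength, file_id, control)


# Hypothetical file name following the documented format.
info = parse_image_name("plateA_B_20_w1_abc123.tif")
print(info)
```

Grouping images by `(well_row, well_column)` then pairs the two channels of each well, and the `control` field gives the expected phenotype label.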
We also provide human-corrected binary images of foreground/background segmentation. To address the problem of correctly segmenting individual worms even when they overlap or cluster, we provide one binary foreground/background segmentation ground truth image for each worm:
The data have been reposted from the original data taken from the Broad Institute. Please acknowledge the original source if this is used in other works. The original data can be found and downloaded here: https://data.broadinstitute.org/bbbc/BBBC010/
These images were originally acquired for a screen in Fred Ausubel's lab at MGH. Please contact aconery AT molbio.mgh.harvard.edu for more information.
Original Publication: http://dx.doi.org/10.1038/nmeth.1984
Attribution 3.0 (CC BY 3.0) https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
This image set is of a Transfluor assay in which an orphan GPCR (G-protein coupled receptor) is stably integrated into a beta-arrestin GFP (green) expressing U2OS cell line, with the nucleus also shown (red). After one hour of incubation with a compound, the cells were quenched with fixative (formaldehyde) and the plate was read on a Cellomics ArrayScan HCS Reader using the GPCR Bioapplication. Negative results show a more uniform distribution of GFP throughout the cytoplasm, and positive results show a peri-nuclear localization. Recommended attribution: "We used the SBS Roche Transfluor image set provided by Ilya Ravkin and available from the Broad Bioimage Benchmark Collection (www.broad.mit.edu/bbbc)." Analysis of the data set can be downloaded from http://www.broadinstitute.org/bbbc/sbs_roche_transfluor.html