100+ datasets found
  1. Data from: A DICOM dataset for evaluation of medical image de-identification...

    • dev.cancerimagingarchive.net
    • cancerimagingarchive.net
    csv, dicom, n/a
    Updated Jan 31, 2021
    Cite
    The Cancer Imaging Archive (2021). A DICOM dataset for evaluation of medical image de-identification [Dataset]. http://doi.org/10.7937/s17z-r072
    Explore at:
    Available download formats: dicom, n/a, csv
    Dataset updated
    Jan 31, 2021
    Dataset authored and provided by
    The Cancer Imaging Archive
    License

    https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/

    Time period covered
    Apr 7, 2021
    Dataset funded by
    National Cancer Institute (http://www.cancer.gov/)
    Description

    Open access or shared research data must comply with Health Insurance Portability and Accountability Act (HIPAA) patient privacy regulations. These regulations require the de-identification of datasets before they can be placed in the public domain. The process of image de-identification is time-consuming, requires significant human resources, and is prone to human error. Automated image de-identification algorithms have been developed, but the research community requires some method of evaluation before such tools can be widely accepted. This evaluation requires a robust dataset that can be used as part of an evaluation process for de-identification algorithms.

    We developed a DICOM dataset that can be used to evaluate the performance of de-identification algorithms. DICOM image information objects were selected from datasets published in TCIA. Synthetic Protected Health Information (PHI) was generated and inserted into selected DICOM data elements to mimic typical clinical imaging exams. The evaluation dataset was de-identified by a TCIA curation team using standard TCIA tools and procedures. We are publishing the evaluation dataset (containing synthetic PHI) and de-identified evaluation dataset (result of TCIA curation) in advance of a potential competition, sponsored by the National Cancer Institute (NCI), for de-identification algorithm evaluation, and de-identification of medical image datasets. The evaluation dataset published here is a subset of a larger evaluation dataset that was created under contract for the National Cancer Institute. This subset is being published to allow researchers to test their de-identification algorithms and promote standardized procedures for validating automated de-identification.
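One way to picture how such an evaluation dataset could be used: because the inserted PHI is synthetic and therefore known in advance, a de-identified header can be scanned for any of those strings that survived. The sketch below is only an illustration under stated assumptions. It models a header as a plain dictionary instead of reading real DICOM files, and the function name `find_phi_leaks`, the element names, and all values are hypothetical, not part of the TCIA tooling.

```python
from typing import Dict, List, Set


def find_phi_leaks(deidentified: Dict[str, str], synthetic_phi: Set[str]) -> List[str]:
    """Return names of data elements whose value still contains a known PHI string.

    `deidentified` maps a data element name to its value after de-identification;
    `synthetic_phi` holds the synthetic PHI strings that were inserted beforehand.
    """
    leaks = []
    for name, value in deidentified.items():
        lowered = value.lower()
        # Case-insensitive substring match against every known PHI string.
        if any(phi.lower() in lowered for phi in synthetic_phi):
            leaks.append(name)
    return sorted(leaks)


# Hypothetical values, for illustration only.
synthetic_phi = {"Jane Doe", "1970-01-01"}
deidentified = {
    "PatientName": "ANON-0001",
    "PatientBirthDate": "",
    "StudyDescription": "CT chest for Jane Doe",  # PHI left in free text
}

leaks = find_phi_leaks(deidentified, synthetic_phi)
print(leaks)
```

A real evaluation would also need to cover PHI burned into pixel data and private elements, which a header-only check like this cannot see.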

  2. DICOM SR of clinical data and measurement for breast cancer collections to...

    • cancerimagingarchive.net
    dicom, n/a
    Cite
    The Cancer Imaging Archive, DICOM SR of clinical data and measurement for breast cancer collections to TCIA [Dataset]. http://doi.org/10.7937/TCIA.2019.wgllssg1
    Explore at:
    Available download formats: dicom, n/a
    Dataset authored and provided by
    The Cancer Imaging Archive
    License

    https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/

    Time period covered
    May 26, 2020
    Dataset funded by
    National Cancer Institute (http://www.cancer.gov/)
    Description

    The Data Integration & Imaging Informatics (DI-Cubed) project explored the issue of lack of standardized data capture at the point of data creation, as reflected in the non-image data accompanying various TCIA breast cancer collections. The work addressed the desire for semantic interoperability between various NCI initiatives by aligning on common clinical metadata elements and supporting use cases that connect clinical, imaging, and genomics data. Accordingly, clinical and measurement data were imported into I2B2 and cross-mapped to industry standard concepts for names and values, including those derived from BRIDG, CDISC SDTM, and DICOM Structured Reporting models, and using NCI Thesaurus, SNOMED CT, and LOINC controlled terminology. A subset of the standardized data was then exported from I2B2 to CSV and thence converted to DICOM SR according to the DICOM Breast Imaging Report template [1], which supports description of patient characteristics, histopathology, receptor status, and clinical findings including measurements. The purpose was not to advocate DICOM SR as an appropriate format for interchange or storage of such information for query purposes, but rather to demonstrate that standard concepts harmonized across multiple collections could be transformed into an existing standard report representation. The DICOM SR can be stored and used together with the images in repositories such as TCIA and in image viewers that support rendering of DICOM SR content. During the project, various deficiencies in the DICOM Breast Imaging Report template were identified with respect to describing breast MR studies, laterality of findings versus procedures, more recently developed receptor types, and patient characteristics and status. These were addressed via DICOM CP 1838, finalized in Jan 2019, and this subset reflects those changes.

    [1] DICOM Breast Imaging Report Templates, available from: http://dicom.nema.org/medical/dicom/current/output/chtml/part16/sect_BreastImagingReportTemplates.html

  3. DICOM files for "RECIST and Volumetric CT Measurements of Injected-Water...

    • catalog.data.gov
    • gimi9.com
    • +1 more
    Updated Jul 29, 2022
    Cite
    National Institute of Standards and Technology (2022). DICOM files for "RECIST and Volumetric CT Measurements of Injected-Water Phantoms" by ZH Levine, HH Chen-Mayer, AP Peskin, and AL Pintar. [Dataset]. https://catalog.data.gov/dataset/dicom-files-for-recist-and-volumetric-ct-measurements-of-injected-water-phantoms-by-zh-lev-4283f
    Explore at:
    Dataset updated
    Jul 29, 2022
    Dataset provided by
    National Institute of Standards and Technology (http://www.nist.gov/)
    Description

    DICOM files are given for 11 CT scans which were used in a research article. Each scan contains about 1250 slices with 512x512 grayscale images, each in its own directory. The low-numbered slices contain the diapers in the order D5 ... D1, followed by the vials of powder in the order V11 ... V2. The symbols correspond to the injected masses, which are given in the paper and repeated here in a file called "mass.txt". There is also a file "README.txt" which describes the directory structure.

  4. Automatic Synthesis of Anthropomorphic Pulmonary CT Phantoms - DICOM...

    • data.niaid.nih.gov
    Updated Jan 24, 2020
    + more versions
    Cite
    Daniel Jimenez-Carretero (2020). Automatic Synthesis of Anthropomorphic Pulmonary CT Phantoms - DICOM Database [Dataset]. https://data.niaid.nih.gov/resources?id=ZENODO_32740
    Explore at:
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Daniel Jimenez-Carretero
    Mario Diaz Cacio
    Raul San Jose Estepar
    Maria J. Ledesma-Carbayo
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    This dataset contains DICOM versions of the 24 anthropomorphic pulmonary CT phantoms accompanying the manuscript "Automatic Synthesis of Anthropomorphic Pulmonary CT Phantoms" submitted to PLoS ONE.

    NRRD versions can be found in http://dx.doi.org/10.5281/zenodo.20766 (doi:10.5281/zenodo.20766).

  5. Anonymized Image Data for DWI and T2W MRI Registration Quality Assurance

    • figshare.com
    zip
    Updated Nov 9, 2022
    Cite
    Kareem Wahid; Mohamed Naser (2022). Anonymized Image Data for DWI and T2W MRI Registration Quality Assurance [Dataset]. http://doi.org/10.6084/m9.figshare.17162435.v1
    Explore at:
    Available download formats: zip
    Dataset updated
    Nov 9, 2022
    Dataset provided by
    figshare
    Authors
    Kareem Wahid; Mohamed Naser
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The following are pre-radiotherapy T2W and DWI MRI sequences in Digital Imaging and Communications in Medicine (DICOM) format for 20 patients curated from the MD Anderson Databases (NCT03145077).

    For each image set (T2W image and DWI image), ground truth segmentations for the left and right submandibular glands, left and right parotid glands, cervical spinal cord, brainstem, and primary gross tumor volume were manually generated by a trained physician expert (radiologist with > 5 years of experience in HNC). In a subset of five cases, segmentations for all structures in both sequences were also manually generated by three additional separate observers (two physicians and one medical student). All segmentations were generated in Velocity AI (v.3.0.1; Varian Medical Systems; Palo Alto, CA, USA) in DICOM RT structure format.

    DICOM data was anonymized using an in-house Python script that implements the RSNA CRP DICOM Anonymizer software. All files have had any DICOM header info and metadata containing PHI removed or replaced with dummy entries.

  6. dicom-brain-dataset

    • huggingface.co
    Updated Feb 20, 2024
    Cite
    Training Data (2024). dicom-brain-dataset [Dataset]. https://huggingface.co/datasets/TrainingDataPro/dicom-brain-dataset
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
    Dataset updated
    Feb 20, 2024
    Authors
    Training Data
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description

    Brain MRI Dataset, Normal Brain Dataset, Anomaly Classification & Detection

    The dataset consists of .dcm files containing MRI scans of the brain of a person with a normal brain. The images are labeled by doctors and accompanied by a report in PDF format. The dataset includes 7 studies, made from different angles, which provide a comprehensive understanding of normal brain structure and are useful in training brain anomaly classification algorithms.

      MRI study angles… See the full description on the dataset page: https://huggingface.co/datasets/TrainingDataPro/dicom-brain-dataset.
    
  7. Rt-Cloud Sample Project 'AmygActivation' DICOM Data

    • data.niaid.nih.gov
    • zenodo.org
    Updated May 28, 2020
    Cite
    Nastase, Samuel A. (2020). Rt-Cloud Sample Project 'AmygActivation' DICOM Data [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3862770
    Explore at:
    Dataset updated
    May 28, 2020
    Dataset provided by
    Mennen, Anne C.
    Nastase, Samuel A.
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This upload contains the same data as published in our previous Zenodo dataset upload. Unlike our previous upload, this version contains data after transferring the DICOMs directly from the Siemens Skyra 3T to our Linux machine (as done in real-time experiments). The purpose of this separate upload is to serve as sample data for our real-time cloud software, for a specific sample project. The brain data are contributed by author S.A.N. and are authorized for non-anonymized distribution.

  8. Dicom data, MRI Alpine dingo and domestic dog brain

    • figshare.com
    zip
    Updated Feb 22, 2023
    Cite
    Laura A. B. Wilson (2023). Dicom data, MRI Alpine dingo and domestic dog brain [Dataset]. http://doi.org/10.6084/m9.figshare.20514693.v2
    Explore at:
    Available download formats: zip
    Dataset updated
    Feb 22, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Laura A. B. Wilson
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Data associated with manuscript:
    The Australasian dingo archetype: De novo chromosome-length genome assembly, DNA methylome, and cranial morphology

    Raw DICOM data, Alpine dingo brain (zip) and domestic dog brain (zip). Brains were scanned using high-resolution magnetic resonance imaging (MRI). A Bruker Biospec 94/20 9.4T high-field pre-clinical MRI system located at the Biological Resources Imaging Laboratory, University of New South Wales (UNSW), was used to acquire MRI data of a fixed dingo and domestic dog brain. The system was equipped with microimaging gradients with a maximum gradient strength of 660 mT/m and a 72 mm quadrature volume coil. Images were acquired in transverse and coronal orientation using optimized 2D and 3D Fast Spin Echo (FSE) and Gradient Echo (MGE) methods. Image resolution was 200x200x500 and 300x300 microns isotropic for the 3D and 2D pulse sequences, respectively.

  9. Cancer Imaging Archive (TCIA)

    • neuinfo.org
    • scicrunch.org
    • +1 more
    Updated Oct 16, 2019
    + more versions
    Cite
    (2019). Cancer Imaging Archive (TCIA) [Dataset]. http://doi.org/10.25504/FAIRsharing.jrfd8y
    Explore at:
    Dataset updated
    Oct 16, 2019
    Description

    Archive of medical images of cancer accessible for public download. All images are stored in DICOM file format and organized as Collections, typically patients related by a common disease (e.g. lung cancer), image modality (MRI, CT, etc.), or research focus. Neuroimaging data sets include clinical outcomes, pathology, and genomics in addition to DICOM images. Data submission proposals are welcome.

  10. Data from: Unpaired MR-CT Brain Dataset for Unsupervised Image Translation

    • data.mendeley.com
    Updated Mar 1, 2022
    + more versions
    Cite
    Omar Al-Kadi (2022). Unpaired MR-CT Brain Dataset for Unsupervised Image Translation [Dataset]. http://doi.org/10.17632/z4wc364g79.1
    Explore at:
    Dataset updated
    Mar 1, 2022
    Authors
    Omar Al-Kadi
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The Magnetic Resonance - Computed Tomography (MR-CT) Jordan University Hospital (JUH) dataset was collected after receiving Institutional Review Board (IRB) approval from the hospital, and consent forms were obtained from all patients. All procedures were carried out in accordance with The Code of Ethics of the World Medical Association (Declaration of Helsinki).

    The dataset consists of 2D image slices extracted using the RadiAnt DICOM viewer software. The extracted images are transformed to DICOM image data format with a resolution of 256x256 pixels. There are a total of 179 2D axial image slices referring to 20 patient volumes (90 MR and 89 CT 2D axial image slices). The dataset contains MR and CT brain tumour images with corresponding segmentation masks. The MR images of each patient were acquired with a 5.00 mm T Siemens Verio 3T using a T2-weighted without contrast agent, 3 Fat sat pulses (FS), 2500-4000 TR, 20-30 TE, and 90/180 flip angle. The CT images were acquired with a Siemens Somatom scanner with 2.46 mGy.cm dose length, 130 kV voltage, 113-327 mAs tube current, topogram acquisition protocol, 64 dual source, one projection, and a slice thickness of 7.0 mm. Smooth and sharp filters have been applied to the CT images. The MR scans have a resolution of 0.7x0.6x5 mm^3, while the CT scans have a resolution of 0.6x0.6x7 mm^3.

    More information and the application of the dataset can be found in the following research paper:

    Alaa Abu-Srhan; Israa Almallahi; Mohammad Abushariah; Waleed Mahafza; Omar S. Al-Kadi. Paired-Unpaired Unsupervised Attention Guided GAN with Transfer Learning for Bidirectional Brain MR-CT Synthesis. Comput. Biol. Med. 136, 2021. doi: https://doi.org/10.1016/j.compbiomed.2021.104763.

  11. Metadata record for: A DICOM dataset for evaluation of medical image...

    • springernature.figshare.com
    txt
    Updated May 31, 2023
    Cite
    Scientific Data Curation Team (2023). Metadata record for: A DICOM dataset for evaluation of medical image de-identification [Dataset]. http://doi.org/10.6084/m9.figshare.14802774.v1
    Explore at:
    Available download formats: txt
    Dataset updated
    May 31, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Scientific Data Curation Team
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This dataset contains key characteristics about the data described in the Data Descriptor A DICOM dataset for evaluation of medical image de-identification. Contents:

        1. human-readable metadata summary table in CSV format
        2. machine-readable metadata file in JSON format
    
  12. Data from: MIMIC-CXR Database

    • physionet.org
    Updated Jul 23, 2024
    Cite
    Alistair Johnson; Tom Pollard; Roger Mark; Seth Berkowitz; Steven Horng (2024). MIMIC-CXR Database [Dataset]. http://doi.org/10.13026/4jqj-jw95
    Explore at:
    Dataset updated
    Jul 23, 2024
    Authors
    Alistair Johnson; Tom Pollard; Roger Mark; Seth Berkowitz; Steven Horng
    License

    https://github.com/MIT-LCP/license-and-dua/tree/master/drafts

    Description

    The MIMIC Chest X-ray (MIMIC-CXR) Database v2.0.0 is a large publicly available dataset of chest radiographs in DICOM format with free-text radiology reports. The dataset contains 377,110 images corresponding to 227,835 radiographic studies performed at the Beth Israel Deaconess Medical Center in Boston, MA. The dataset is de-identified to satisfy the US Health Insurance Portability and Accountability Act of 1996 (HIPAA) Safe Harbor requirements. Protected health information (PHI) has been removed. The dataset is intended to support a wide body of research in medicine including image understanding, natural language processing, and decision support.

  13. DICOM converted images for the NLM-Visible-Human-Project collection

    • zenodo.org
    bin
    Updated Jun 6, 2025
    Cite
    David Clunie; William Clifford; David Pot; Ulrike Wagner; Keyvan Farahani; Erika Kim; Andrey Fedorov (2025). DICOM converted images for the NLM-Visible-Human-Project collection [Dataset]. http://doi.org/10.5281/zenodo.12690050
    Explore at:
    Available download formats: bin
    Dataset updated
    Jun 6, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    David Clunie; William Clifford; David Pot; Ulrike Wagner; Keyvan Farahani; Erika Kim; Andrey Fedorov
    License

    https://www.nlm.nih.gov/databases/download/terms_and_conditions.html

    Description

    This dataset corresponds to a collection of images and/or image-derived data available from National Cancer Institute Imaging Data Commons (IDC) [1]. This dataset was converted into DICOM representation and ingested by the IDC team. You can explore and visualize the corresponding images using IDC Portal here: NLM-Visible-Human-Project. You can use the manifests included in this Zenodo record to download the content of the collection following the Download instructions below.

    Collection description

    The NLM Visible Human Project [2] has created publicly-available complete, anatomically detailed, three-dimensional representations of a human male body and a human female body. Specifically, the VHP provides a public-domain library of cross-sectional cryosection, CT, and MRI images obtained from one male cadaver and one female cadaver. The Visible Man data set was publicly released in 1994 and the Visible Woman in 1995.

    The data sets were designed to serve as (1) a reference for the study of human anatomy, (2) public-domain data for testing medical imaging algorithms, and (3) a test bed and model for the construction of network-accessible image libraries. The VHP data sets have been applied to a wide range of educational, diagnostic, treatment planning, virtual reality, artistic, mathematical, and industrial uses. About 4,000 licensees from 66 countries were authorized to access the datasets. As of 2019, a license is no longer required to access the VHP datasets.

    Courtesy of the U.S. National Library of Medicine. Release of this collection by IDC does not indicate or imply that NLM has endorsed its products/services/applications. Please see the Visible Human Project information page to learn more about the images and to obtain any supporting metadata for this collection. Note that this collection may not reflect the most current/accurate data available from NLM.

    Citation guidelines can be found on the National Library of Medicine Terms and Conditions information page.

    Files included

    A manifest file's name indicates the IDC data release in which a version of collection data was first introduced. For example, collection_id-idc_v8-aws.s5cmd corresponds to the contents of the collection_id collection introduced in IDC data release v8. If there is a subsequent version of this Zenodo page, it will indicate when a subsequent version of the corresponding collection was introduced.

    1. nlm_visible_human_project-idc_v15-aws.s5cmd: manifest of files available for download from public IDC Amazon Web Services buckets
    2. nlm_visible_human_project-idc_v15-gcs.s5cmd: manifest of files available for download from public IDC Google Cloud Storage buckets
    3. nlm_visible_human_project-idc_v15-dcf.dcf: Gen3 manifest (for details see https://learn.canceridc.dev/data/organization-of-data/guids-and-uuids)

    Note that manifest files that end in -aws.s5cmd reference files stored in Amazon Web Services (AWS) buckets, while -gcs.s5cmd reference files in Google Cloud Storage. The actual files are identical and are mirrored between AWS and GCP.
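The naming convention described above (collection id, IDC release, storage suffix) can be unpacked mechanically. The following is a minimal sketch, assuming the `collection_id-idc_vN-aws.s5cmd` / `-gcs.s5cmd` pattern holds for every .s5cmd manifest; the helper `parse_manifest_name` and the `ManifestName` type are hypothetical names introduced for illustration and are not part of any IDC tool.

```python
import re
from typing import NamedTuple, Optional


class ManifestName(NamedTuple):
    collection_id: str
    idc_release: int
    storage: str  # "aws" or "gcs"


# Pattern assumed from the naming convention described in the text.
_PATTERN = re.compile(
    r"^(?P<collection>.+)-idc_v(?P<release>\d+)-(?P<storage>aws|gcs)\.s5cmd$"
)


def parse_manifest_name(filename: str) -> Optional[ManifestName]:
    """Split an .s5cmd manifest file name into its parts, or return None."""
    match = _PATTERN.match(filename)
    if match is None:
        return None
    return ManifestName(
        match["collection"], int(match["release"]), match["storage"]
    )


parsed = parse_manifest_name("nlm_visible_human_project-idc_v15-aws.s5cmd")
print(parsed)
```

The Gen3 `.dcf` manifest does not follow the `.s5cmd` pattern, so this helper returns `None` for it.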

    Download instructions

    Each of the manifests includes instructions in the header on how to download the included files.

    To download the files using .s5cmd manifests:

    1. install idc-index package: pip install --upgrade idc-index
    2. download the files referenced by manifests included in this dataset by passing the .s5cmd manifest file: idc download manifest.s5cmd.

    To download the files using .dcf manifest, see manifest header.

    Acknowledgments

    Imaging Data Commons team has been funded in whole or in part with Federal funds from the National Cancer Institute, National Institutes of Health, under Task Order No. HHSN26110071 under Contract No. HHSN261201500003l.

    References

    [1] Fedorov, A., Longabaugh, W. J. R., Pot, D., Clunie, D. A., Pieper, S. D., Gibbs, D. L., Bridge, C., Herrmann, M. D., Homeyer, A., Lewis, R., Aerts, H. J. W., Krishnaswamy, D., Thiriveedhi, V. K., Ciausu, C., Schacherer, D. P., Bontempi, D., Pihl, T., Wagner, U., Farahani, K., Kim, E. & Kikinis, R. National Cancer Institute Imaging Data Commons: Toward Transparency, Reproducibility, and Scalability in Imaging Artificial Intelligence. RadioGraphics (2023). https://doi.org/10.1148/rg.230180

    [2] Spitzer, V., Ackerman, M. J., Scherzinger, A. L. & Whitlock, D. The visible human male: a technical report. J. Am. Med. Inform. Assoc. 3, 118–130 (1996). https://doi.org/10.1136/jamia.1996.96236280

  14. Data from The Lung Image Database Consortium (LIDC) and Image Database...

    • cancerimagingarchive.net
    • dev.cancerimagingarchive.net
    dicom, n/a, xls, xlsx +1
    Cite
    The Cancer Imaging Archive, Data from The Lung Image Database Consortium (LIDC) and Image Database Resource Initiative (IDRI): A completed reference database of lung nodules on CT scans [Dataset]. http://doi.org/10.7937/K9/TCIA.2015.LO9QL9SX
    Explore at:
    Available download formats: xlsx, xls, n/a, xml and zip, dicom
    Dataset authored and provided by
    The Cancer Imaging Archive
    License

    https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/

    Time period covered
    Sep 21, 2020
    Dataset funded by
    National Cancer Institute (http://www.cancer.gov/)
    Description

    The Lung Image Database Consortium image collection (LIDC-IDRI) consists of diagnostic and lung cancer screening thoracic computed tomography (CT) scans with marked-up annotated lesions. It is a web-accessible international resource for development, training, and evaluation of computer-assisted diagnostic (CAD) methods for lung cancer detection and diagnosis. Initiated by the National Cancer Institute (NCI), further advanced by the Foundation for the National Institutes of Health (FNIH), and accompanied by the Food and Drug Administration (FDA) through active participation, this public-private partnership demonstrates the success of a consortium founded on a consensus-based process.

    Seven academic centers and eight medical imaging companies collaborated to create this data set which contains 1018 cases. Each subject includes images from a clinical thoracic CT scan and an associated XML file that records the results of a two-phase image annotation process performed by four experienced thoracic radiologists. In the initial blinded-read phase, each radiologist independently reviewed each CT scan and marked lesions belonging to one of three categories ("nodule > or =3 mm," "nodule <3 mm," and "non-nodule > or =3 mm"). In the subsequent unblinded-read phase, each radiologist independently reviewed their own marks along with the anonymized marks of the three other radiologists to render a final opinion. The goal of this process was to identify as completely as possible all lung nodules in each CT scan without requiring forced consensus.
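The three marking categories from the blinded-read phase amount to a simple rule on lesion type and size. The sketch below is only an illustration of that rule, assuming a single diameter in millimetres per lesion; the function `lidc_category` is a hypothetical helper and is not taken from the LIDC-IDRI tooling or from pylidc. Lesions that are neither nodules nor at least 3 mm fall outside the three categories, so the function returns `None` for them.

```python
from typing import Optional


def lidc_category(is_nodule: bool, diameter_mm: float) -> Optional[str]:
    """Map a lesion to one of the three categories named in the description."""
    if is_nodule:
        return "nodule >= 3 mm" if diameter_mm >= 3.0 else "nodule < 3 mm"
    # Non-nodules are only marked when they are at least 3 mm.
    return "non-nodule >= 3 mm" if diameter_mm >= 3.0 else None


# Hypothetical lesions, for illustration only.
categories = [
    lidc_category(True, 5.2),
    lidc_category(True, 2.0),
    lidc_category(False, 4.1),
    lidc_category(False, 1.5),
]
print(categories)
```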

    Note: The TCIA team strongly encourages users to review pylidc and the Standardized representation of the TCIA LIDC-IDRI annotations using DICOM (DICOM-LIDC-IDRI-Nodules) of the annotations/segmentations included in this dataset before developing custom tools to analyze the XML version.

  15. DICOM converted annotations for the Prostate-MRI-US-Biopsy collection

    • data.niaid.nih.gov
    • zenodo.org
    Updated Nov 3, 2023
    Cite
    Ciausu, Cosmin (2023). DICOM converted annotations for the Prostate-MRI-US-Biopsy collection [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10069910
    Explore at:
    Dataset updated
    Nov 3, 2023
    Dataset provided by
    Fedorov, Andrey
    Ciausu, Cosmin
    Clunie, David
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contributes DICOM-converted annotations to the publicly available National Cancer Institute Imaging Data Commons [1] Prostate-MRI-US-Biopsy collection (https://portal.imaging.datacommons.cancer.gov/explore/filters/?collection_id=Community&collection_id=prostate_mri_us_biopsy). The Prostate-MRI-US-Biopsy collection was initially released by The Cancer Imaging Archive (TCIA) [2,3,4]. While the images in this collection are stored in the standard DICOM format, the collection is also accompanied by 1017 semi-automatic segmentations of the prostate and 1317 manual segmentations of target lesions in the STL format. Although STL is a common and practical format for 3D printing, it is not interoperable with many visualization and analysis tools commonly used in medical imaging research and does not provide any standard means to communicate metadata, among other limitations.

    This dataset contains segmentations of the prostate and target lesions harmonized into DICOM representation. Specifically, we created DICOM Encapsulated 3D Manufacturing Model objects (M3D modality) that include the original STL content enriched with DICOM metadata. Furthermore, we created an alternative encoding of the surface segmentations by rasterizing them and saving the result as a DICOM Segmentation object (SEG modality). As a result, the contributed DICOM objects can be stored in any DICOM server that supports those objects (including Google Healthcare DICOM stores), and the DICOM Segmentations can be visualized using off-the-shelf tools, such as OHIF Viewer.

    Conversion from STL to DICOM M3D modality was performed using the PixelMed toolkit (https://www.pixelmed.com/dicomtoolkit.html). Conversion from STL to DICOM SEG was done in two steps: we used Slicer (https://www.slicer.org/) to rasterize the surface segmentations to the matrix of the segmented image, and the results were then converted to DICOM SEGs using dcmqi (https://github.com/QIICR/dcmqi) [5]. Resulting objects were validated using dicom3tools dciodvfy (https://www.dclunie.com/dicom3tools.html). Details describing the conversion process, as well as details on how to access the encapsulated STL content from the DICOM M3D files, are provided in this GitHub repository: https://github.com/ImagingDataCommons/prostate_mri_us_biopsy_dcm_conversion.

    Specific files included in the record are:

    1. Prostate-MRI-US-Biopsy-DICOM-Annotations.zip: DICOM M3D and SEG files, organized into the folder hierarchy following this pattern: Prostate-MRI-US-Biopsy/%PatientID/%StudyInstanceUID/%SeriesNumber-%Modality-%SeriesDescription.dcm
    2. referenced_images_sorted-idc_file_manifest.s5cmd: IDC manifest for downloading the T2W MRI images corresponding to the annotations. To download the files in this manifest, first install s5cmd (https://github.com/peak/s5cmd), and run the following command: s5cmd --no-sign-request --endpoint-url https://s3.amazonaws.com run referenced_images_sorted-idc_file_manifest.s5cmd. Files will be organized in the Prostate-MRI-US-Biopsy/%PatientID/%StudyInstanceUID/ folder hierarchy upon download.

    References

    [1] Fedorov, A., Longabaugh, W. J. R., Pot, D., Clunie, D. A., Pieper, S., Aerts, H. J. W. L., Homeyer, A., Lewis, R., Akbarzadeh, A., Bontempi, D., Clifford, W., Herrmann, M. D., Höfener, H., Octaviano, I., Osborne, C., Paquette, S., Petts, J., Punzo, D., Reyes, M., Schacherer, D. P., Tian, M., White, G., Ziegler, E., Shmulevich, I., Pihl, T., Wagner, U., Farahani, K. & Kikinis, R. NCI Imaging Data Commons. Cancer Res. 81, 4188–4193 (2021). DOI: 10.1158/0008-5472.CAN-21-0950

    [2] Natarajan, S., Priester, A., Margolis, D., Huang, J., & Marks, L. (2020). Prostate MRI and Ultrasound With Pathology and Coordinates of Tracked Biopsy (Prostate-MRI-US-Biopsy) (version 2) [Data set]. The Cancer Imaging Archive. DOI: 10.7937/TCIA.2020.A61IOC1A

    [3] Sonn GA, Natarajan S, Margolis DJ, MacAiran M, Lieu P, Huang J, Dorey FJ, Marks LS. Targeted biopsy in the detection of prostate cancer using an office based magnetic resonance ultrasound fusion device. Journal of Urology 189, no. 1 (2013): 86-91. DOI: 10.1016/j.juro.2012.08.095

    [4] Clark K, Vendt B, Smith K, Freymann J, Kirby J, Koppel P, Moore S, Phillips S, Maffitt D, Pringle M, Tarbox L, Prior F. The Cancer Imaging Archive (TCIA): Maintaining and Operating a Public Information Repository, Journal of Digital Imaging, Volume 26, Number 6, December, 2013, pp 1045-1057. DOI: 10.1007/s10278-013-9622-7

    [5] Herz, C., Fillion-Robin, J.-C., Onken, M., Riesmeier, J., Lasso, A., Pinter, C., Fichtinger, G., Pieper, S., Clunie, D., Kikinis, R. & Fedorov, A. dcmqi: An Open Source Library for Standardized Communication of Quantitative Image Analysis Results Using DICOM. Cancer Res. 77, e87–e90 (2017). DOI: 10.1158/0008-5472.CAN-17-0336

  16. Data from: MIMIC-CXR-JPG - chest radiographs with structured labels

    • physionet.org
    Updated Mar 12, 2024
    Cite
    Alistair Johnson; Matthew Lungren; Yifan Peng; Zhiyong Lu; Roger Mark; Seth Berkowitz; Steven Horng (2024). MIMIC-CXR-JPG - chest radiographs with structured labels [Dataset]. http://doi.org/10.13026/jsn5-t979
    Dataset updated
    Mar 12, 2024
    Authors
    Alistair Johnson; Matthew Lungren; Yifan Peng; Zhiyong Lu; Roger Mark; Seth Berkowitz; Steven Horng
    License

    https://github.com/MIT-LCP/license-and-dua/tree/master/drafts

    Description

    The MIMIC Chest X-ray JPG (MIMIC-CXR-JPG) Database v2.0.0 is a large publicly available dataset of chest radiographs in JPG format with structured labels derived from free-text radiology reports. The MIMIC-CXR-JPG dataset is wholly derived from MIMIC-CXR, providing JPG format files derived from the DICOM images and structured labels derived from the free-text reports. The aim of MIMIC-CXR-JPG is to provide a convenient processed version of MIMIC-CXR, as well as to provide a standard reference for data splits and image labels. The dataset contains 377,110 JPG format images and structured labels derived from the 227,827 free-text radiology reports associated with these images. The dataset is de-identified to satisfy the US Health Insurance Portability and Accountability Act of 1996 (HIPAA) Safe Harbor requirements. Protected health information (PHI) has been removed. The dataset is intended to support a wide body of research in medicine including image understanding, natural language processing, and decision support.

  17. Soil images in DICOM format including Python programs for data...

    • search.dataone.org
    • datadryad.org
    Updated Apr 24, 2025
    Cite
    Ralf Wieland (2025). Soil images in DICOM format including Python programs for data transformation, 3D analysis, CNN training, CNN analysis [Dataset]. http://doi.org/10.5061/dryad.66t1g1k0c
    Dataset updated
    Apr 24, 2025
    Dataset provided by
    Dryad Digital Repository
    Authors
    Ralf Wieland
    Time period covered
    Jan 1, 2020
    Description

    The study 'Use of Deep Learning for structural analysis of CT-images of soil samples' used a set of soil sample data (CT images). All the data and programs used are open source and were created with open source software. Every processing step is performed by Python programs, which are included in the dataset.

  18. National Cancer Institute Imaging Data Commons (IDC) Collections

    • registry.opendata.aws
    Updated May 10, 2023
    Cite
    Imaging Data Commons (IDC)(https://imaging.datacommons.cancer.gov) team (2023). National Cancer Institute Imaging Data Commons (IDC) Collections [Dataset]. https://registry.opendata.aws/nci-imaging-data-commons/
    Dataset updated
    May 10, 2023
    Dataset provided by
    Imaging Data Commons (IDC) (https://imaging.datacommons.cancer.gov) team
    Description

    Imaging Data Commons (IDC) is a repository within the Cancer Research Data Commons (CRDC) that manages imaging data and enables its integration with the other components of CRDC. IDC hosts a growing number of imaging collections contributed either by funded US National Cancer Institute (NCI) data collection activities or by individual researchers. Image data hosted by IDC is stored in DICOM format.

  19. AI-derived annotations for the NLST and NSCLC-Radiomics computed tomography...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jan 22, 2024
    Cite
    Hugo Aerts (2024). AI-derived annotations for the NLST and NSCLC-Radiomics computed tomography imaging collections [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7473970
    Dataset updated
    Jan 22, 2024
    Dataset provided by
    Deepa Krishnaswamy
    Andrey Fedorov
    David Clunie
    Dennis Bontempi
    Hugo Aerts
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Public imaging datasets are critical for the development and evaluation of automated tools in cancer imaging. Unfortunately, many of the available datasets do not provide annotations of tumors or organs-at-risk, crucial for the assessment of these tools. This is due to the fact that annotation of medical images is time consuming and requires domain expertise. It has been demonstrated that artificial intelligence (AI) based annotation tools can achieve acceptable performance and thus can be used to automate the annotation of large datasets. As part of the effort to enrich the public data available within NCI Imaging Data Commons (IDC) (https://imaging.datacommons.cancer.gov/) [1], we introduce this dataset that consists of such AI-generated annotations for two publicly available medical imaging collections of Computed Tomography (CT) images of the chest. For detailed information concerning this dataset, please refer to our publication here [2].

    We use publicly available pre-trained AI tools to enhance CT lung cancer collections that are unlabeled or partially labeled. The first tool is the nnU-Net deep learning framework [3] for volumetric segmentation of organs, where we use a pretrained model (Task D18 using the SegTHOR dataset) for labeling volumetric regions in the image corresponding to the heart, trachea, aorta and esophagus. These are the major organs-at-risk for radiation therapy for lung cancer. We further enhance these annotations by computing 3D shape radiomics features using the pyradiomics package [4]. The second tool is a pretrained model for per-slice automatic labeling of anatomic landmarks and imaged body part regions in axial CT volumes [5].

    We focus on enhancing two publicly available collections, the Non-small Cell Lung Cancer Radiomics (NSCLC-Radiomics collection) [6,7], and the National Lung Screening Trial (NLST collection) [8,9]. The CT data for these collections are available both in The Cancer Imaging Archive (TCIA) [10] and in NCI Imaging Data Commons (IDC). Further, the NSCLC-Radiomics collection includes expert-generated manual annotations of several chest organs, allowing us to quantify performance of the AI tools in that subset of data.

    IDC relies on the DICOM standard to achieve FAIR [10] sharing of data and interoperability. Generated annotations are saved as DICOM Segmentation objects (volumetric segmentations of regions of interest) created using dcmqi [12], and DICOM Structured Report (SR) objects (per-slice annotations of the body part imaged, anatomical landmarks and radiomics features) created using dcmqi and highdicom [13]. 3D shape radiomics features and corresponding DICOM SR objects are also provided for the manual segmentations available in the NSCLC-Radiomics collection.

    The dataset is available in IDC, and is accompanied by our publication here [2]. This pre-print details how the data were generated, and how the resulting DICOM objects can be interpreted and used in tools. Additionally, for further information about how to interact with and explore the dataset, please refer to our repository and accompanying Google Colaboratory notebook.

    The annotations are organized as follows. For NSCLC-Radiomics, three nnU-Net models were evaluated ('2d-tta', '3d_lowres-tta' and '3d_fullres-tta'), each with its own folder. Within each folder, the PatientID and the StudyInstanceUID are subdirectories, and within these the DICOM Segmentation object and the DICOM SR for the 3D shape features are stored. Separate directories for the DICOM SR body part regression regions ('sr_regions') and landmarks ('sr_landmarks') are also provided, with the same folder structure as above. Lastly, the DICOM SR objects for the existing manual annotations are provided in the 'sr_gt' directory. For NSCLC-Radiomics, each patient has a single StudyInstanceUID. The DICOM Segmentation and SR objects are named according to the SeriesInstanceUID of the original CT files.

    nsclc
        2d-tta
            PatientID
                StudyInstanceUID
                    ReferencedSeriesInstanceUID_SEG.dcm
                    ReferencedSeriesInstanceUID_features_SR.dcm
        3d_lowres-tta
            PatientID
                StudyInstanceUID
                    ReferencedSeriesInstanceUID_SEG.dcm
                    ReferencedSeriesInstanceUID_features_SR.dcm
        3d_fullres-tta
            PatientID
                StudyInstanceUID
                    ReferencedSeriesInstanceUID_SEG.dcm
                    ReferencedSeriesInstanceUID_features_SR.dcm
        sr_regions
            PatientID
                StudyInstanceUID
                    ReferencedSeriesInstanceUID_regions_SR.dcm
        sr_landmarks
            PatientID
                StudyInstanceUID
                    ReferencedSeriesInstanceUID_landmarks_SR.dcm
        sr_gt
            PatientID
                StudyInstanceUID
                    ReferencedSeriesInstanceUID_features_SR.dcm

    For NLST, the '3d_fullres-tta' model was evaluated. The data is organized the same as above, where within each folder the PatientID and the StudyInstanceUID are subdirectories. For the NLST collection, some patients may have more than one StudyInstanceUID subdirectory. Separate directories for the DICOM SR body part regions ('sr_regions') and landmarks ('sr_landmarks') are also provided. The DICOM Segmentation and SR objects are named according to the SeriesInstanceUID of the original CT files.

    nlst
        3d_fullres-tta
            PatientID
                StudyInstanceUID
                    ReferencedSeriesInstanceUID_SEG.dcm
                    ReferencedSeriesInstanceUID_features_SR.dcm
        sr_regions
            PatientID
                StudyInstanceUID
                    ReferencedSeriesInstanceUID_regions_SR.dcm
        sr_landmarks
            PatientID
                StudyInstanceUID
                    ReferencedSeriesInstanceUID_landmarks_SR.dcm
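    The folder layouts described above for both collections can be sketched as a small path builder. This is only an illustration: the function name annotation_paths and the assumption that 'nsclc' and 'nlst' are top-level folders containing the model and SR directories as siblings are ours, inferred from the listing, and should be checked against the actual download.

```python
from pathlib import PurePosixPath

# Segmentation model folders named in the description, per collection.
SEG_MODELS = {
    "nsclc": ("2d-tta", "3d_lowres-tta", "3d_fullres-tta"),
    "nlst": ("3d_fullres-tta",),
}


def annotation_paths(collection, patient_id, study_uid, series_uid):
    """Return the relative paths expected for one original CT series.

    Hypothetical helper: the layout is inferred from the description,
    with files named after the SeriesInstanceUID of the CT series.
    """
    if collection not in SEG_MODELS:
        raise ValueError(f"unknown collection: {collection!r}")

    root = PurePosixPath(collection)
    subdir = PurePosixPath(patient_id) / study_uid
    paths = []

    # Each model folder holds a SEG object and a shape-features SR.
    for model in SEG_MODELS[collection]:
        paths.append(root / model / subdir / f"{series_uid}_SEG.dcm")
        paths.append(root / model / subdir / f"{series_uid}_features_SR.dcm")

    # Body part regions and landmarks SRs live in their own folders.
    paths.append(root / "sr_regions" / subdir / f"{series_uid}_regions_SR.dcm")
    paths.append(root / "sr_landmarks" / subdir / f"{series_uid}_landmarks_SR.dcm")

    # Only NSCLC-Radiomics has SRs for the manual (ground truth) annotations.
    if collection == "nsclc":
        paths.append(root / "sr_gt" / subdir / f"{series_uid}_features_SR.dcm")

    return [str(p) for p in paths]
```

    Calling annotation_paths("nlst", patient_id, study_uid, series_uid) lists the four files expected for an NLST series; for NSCLC-Radiomics the same call with "nsclc" lists nine, covering the three models, the two SR folders, and 'sr_gt'.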

    The query used for NSCLC-Radiomics is here, and a list of corresponding SeriesInstanceUIDs (along with PatientIDs and StudyInstanceUIDs) is here. The query used for NLST is here, and a list of corresponding SeriesInstanceUIDs (along with PatientIDs and StudyInstanceUIDs) is here. The two csv files that describe the series analyzed, nsclc_series_analyzed.csv and nlst_series_analyzed.csv, are also available as uploads to this repository.

    Version updates:

    Version 2: For the regions SR and landmarks SR, changed to use a distinct TrackingUniqueIdentifier for each MeasurementGroup. Also instead of using TargetRegion, changed to use FindingSite. Additionally for the landmarks SR, the TopographicalModifier was made a child of FindingSite instead of a sibling.

    Version 3: Added the two csv files that describe which series were analyzed

    Version 4: Modified the landmarks SR as the TopographicalModifier for the Kidney landmark (bottom) does not describe the landmark correctly. The Kidney landmark is the "first slice where both kidneys can be seen well." Instead, removed the use of the TopographicalModifier for that landmark. For the features SR, modified the units code for the Flatness and Elongation, as we incorrectly used mm units instead of no units.

  20. DICOM converted Slide Microscopy images for the TCGA-BRCA collection

    • zenodo.org
    bin
    Updated Aug 20, 2024
    Cite
    David Clunie; William Clifford; David Pot; Ulrike Wagner; Keyvan Farahani; Erika Kim; Andrey Fedorov (2024). DICOM converted Slide Microscopy images for the TCGA-BRCA collection [Dataset]. http://doi.org/10.5281/zenodo.12689963
    Available download formats: bin
    Dataset updated
    Aug 20, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    David Clunie; William Clifford; David Pot; Ulrike Wagner; Keyvan Farahani; Erika Kim; Andrey Fedorov
    License

    Attribution 3.0 (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/
    License information was derived automatically

    Description

    This dataset corresponds to a collection of images and/or image-derived data available from National Cancer Institute Imaging Data Commons (IDC) [1]. This dataset was converted into DICOM representation and ingested by the IDC team. You can explore and visualize the corresponding images using IDC Portal here: TCGA-BRCA. You can use the manifests included in this Zenodo record to download the content of the collection following the Download instructions below.

    Collection description

    The Cancer Imaging Program (CIP) is working directly with primary investigators from institutes participating in TCGA to obtain and load images relating to the genomic, clinical, and pathological data being stored within the TCGA Data Portal. Currently this MR multi-sequence image collection of breast invasive carcinoma patients can be matched by each unique case identifier with the extensive gene and expression data of the same case from The Cancer Genome Atlas Data Portal to research the link between clinical phenome and tissue genome.

    Please see the TCGA-BRCA page to learn more about the images and to obtain any supporting metadata for this collection.

    Files included

    A manifest file's name indicates the IDC data release in which a version of collection data was first introduced. For example, collection_id-idc_v8-aws.s5cmd corresponds to the contents of the collection_id collection introduced in IDC data release v8. If there is a subsequent version of this Zenodo page, it will indicate when a subsequent version of the corresponding collection was introduced.

    1. tcga_brca-idc_v8-aws.s5cmd: manifest of files available for download from public IDC Amazon Web Services buckets
    2. tcga_brca-idc_v8-gcs.s5cmd: manifest of files available for download from public IDC Google Cloud Storage buckets
    3. tcga_brca-idc_v8-dcf.dcf: Gen3 manifest (for details see https://learn.canceridc.dev/data/organization-of-data/guids-and-uuids)

    Note that manifest files that end in -aws.s5cmd reference files stored in Amazon Web Services (AWS) buckets, while -gcs.s5cmd reference files in Google Cloud Storage. The actual files are identical and are mirrored between AWS and GCP.
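    As a rough illustration of the naming convention above, a manifest file name can be split into its parts with a regular expression. The function name parse_manifest_name and the returned keys are hypothetical, and the pattern covers only the three name shapes listed in this record.

```python
import re

# Pattern for names such as "tcga_brca-idc_v8-aws.s5cmd" (assumed from the
# three manifests listed above; other name shapes are not handled).
MANIFEST_RE = re.compile(
    r"^(?P<collection>.+)-idc_v(?P<release>\d+)-(?P<target>aws|gcs|dcf)"
    r"\.(?P<ext>s5cmd|dcf)$"
)


def parse_manifest_name(name):
    """Split a manifest file name into collection, IDC release and target."""
    match = MANIFEST_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognized manifest name: {name!r}")
    return {
        "collection_id": match["collection"],
        "idc_release": int(match["release"]),  # release that introduced this version
        "target": match["target"],             # aws, gcs, or dcf (Gen3)
        "extension": match["ext"],
    }
```

    For example, parse_manifest_name("tcga_brca-idc_v8-gcs.s5cmd") identifies the tcga_brca collection as introduced in IDC release 8, with files referenced in Google Cloud Storage.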

    Download instructions

    Each of the manifests includes instructions in its header on how to download the included files.

    To download the files using .s5cmd manifests:

    1. install idc-index package: pip install --upgrade idc-index
    2. download the files referenced by manifests included in this dataset by passing the .s5cmd manifest file: idc download manifest.s5cmd.

    To download the files using .dcf manifest, see manifest header.
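    For scripted use, the .s5cmd download step above could be wrapped as follows. This is a minimal sketch: it assumes the idc command-line tool installed by the idc-index package is on the PATH, and the helper names build_download_command and download_manifest are our own, not part of idc-index.

```python
import shutil
import subprocess


def build_download_command(manifest_path):
    """Build the 'idc download <manifest>' command given in the instructions."""
    return ["idc", "download", manifest_path]


def download_manifest(manifest_path):
    """Run the download; requires 'pip install --upgrade idc-index' first."""
    if shutil.which("idc") is None:
        raise RuntimeError("idc command not found; install the idc-index package")
    # check=True raises CalledProcessError if the download command fails.
    subprocess.run(build_download_command(manifest_path), check=True)
```

    Calling download_manifest("tcga_brca-idc_v8-aws.s5cmd") would then fetch the files referenced by the AWS manifest into the current working directory, subject to the tool's defaults.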

    Acknowledgments

    Imaging Data Commons team has been funded in whole or in part with Federal funds from the National Cancer Institute, National Institutes of Health, under Task Order No. HHSN26110071 under Contract No. HHSN261201500003l.

    References

    [1] Fedorov, A., Longabaugh, W. J. R., Pot, D., Clunie, D. A., Pieper, S. D., Gibbs, D. L., Bridge, C., Herrmann, M. D., Homeyer, A., Lewis, R., Aerts, H. J. W., Krishnaswamy, D., Thiriveedhi, V. K., Ciausu, C., Schacherer, D. P., Bontempi, D., Pihl, T., Wagner, U., Farahani, K., Kim, E. & Kikinis, R. National Cancer Institute Imaging Data Commons: Toward Transparency, Reproducibility, and Scalability in Imaging Artificial Intelligence. RadioGraphics (2023). https://doi.org/10.1148/rg.230180

The Cancer Imaging Archive (2021). A DICOM dataset for evaluation of medical image de-identification [Dataset]. http://doi.org/10.7937/s17z-r072

Data from: A DICOM dataset for evaluation of medical image de-identification

Pseudo-PHI-DICOM-Data

4 scholarly articles cite this dataset (View in Google Scholar)
