100+ datasets found
  1. The Cancer Genome Atlas Breast Invasive Carcinoma Collection

    • cancerimagingarchive.net
    dicom, n/a
    Updated May 29, 2020
    Cite
    The Cancer Imaging Archive (2020). The Cancer Genome Atlas Breast Invasive Carcinoma Collection [Dataset]. http://doi.org/10.7937/K9/TCIA.2016.AB2NAZRP
    Available download formats: n/a, dicom
    Dataset updated
    May 29, 2020
    Dataset authored and provided by
    The Cancer Imaging Archive
    License

    https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/

    Time period covered
    May 29, 2020
    Dataset funded by
    National Cancer Institute (http://www.cancer.gov/)
    Description

    The Cancer Genome Atlas Breast Invasive Carcinoma (TCGA-BRCA) data collection is part of a larger effort to build a research community focused on connecting cancer phenotypes to genotypes by providing clinical images matched to subjects from The Cancer Genome Atlas (TCGA). Clinical, genetic, and pathological data resides in the Genomic Data Commons (GDC) Data Portal while the radiological data is stored on The Cancer Imaging Archive (TCIA).

    Matched TCGA patient identifiers allow researchers to explore the TCGA/TCIA databases for correlations between tissue genotype, radiological phenotype and patient outcomes. Tissues for TCGA were collected from many sites all over the world in order to reach their accrual targets, usually around 500 specimens per cancer type. For this reason the image data sets are also extremely heterogeneous in terms of scanner modalities, manufacturers and acquisition protocols. In most cases the images were acquired as part of routine care and not as part of a controlled research study or clinical trial.

    CIP TCGA Radiology Initiative

    Imaging Source Site (ISS) Groups are being populated and governed by participants from institutions that have provided imaging data to the archive for a given cancer type. Modeled after TCGA analysis groups, ISS groups are given the opportunity to publish a marker paper for a given cancer type per the guidelines in the table above. This opportunity will generate increased participation in building these multi-institutional data sets as they become an open community resource. Learn more about the TCGA Breast Phenotype Research Group.

  2. TCGA BRCA cancer dataset

    • zenodo.org
    • portalinvestigacion.udc.gal
    • +1 more
    bin
    Updated Dec 11, 2020
    Cite
    Carlos Fernandez-Lozano (2020). TCGA BRCA cancer dataset [Dataset]. http://doi.org/10.5281/zenodo.4309168
    Available download formats: bin
    Dataset updated
    Dec 11, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Carlos Fernandez-Lozano
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Following the same steps that we used in the previous course, we downloaded the TCGA-BRCA data using R and Bioconductor, in particular the TCGAbiolinks package. We downloaded transcriptome profiling data (gene expression quantification) where the experimental strategy is RNA-Seq and the workflow type is HTSeq-FPKM-UQ, keeping only primary solid tumor data of the affymetrix GPL86 profile, together with clinical data.

  3. TCGA Breast Phenotype Research Group Data sets

    • cancerimagingarchive.net
    • dev.cancerimagingarchive.net
    n/a, xls, zip
    Updated Sep 4, 2018
    Cite
    The Cancer Imaging Archive (2018). TCGA Breast Phenotype Research Group Data sets [Dataset]. http://doi.org/10.7937/K9/TCIA.2014.8SIPIY6G
    Available download formats: xls, n/a, zip
    Dataset updated
    Sep 4, 2018
    Dataset authored and provided by
    The Cancer Imaging Archive
    License

    https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/

    Time period covered
    Sep 4, 2018
    Dataset funded by
    National Cancer Institute (http://www.cancer.gov/)
    Description

    At the time of our study, 108 cases with breast MRI data were available in The Cancer Genome Atlas Breast Invasive Carcinoma (TCGA-BRCA) collection. To minimize variations in image quality across the multi-institutional cases, we included only breast MRI studies acquired on GE 1.5 Tesla scanners (GE Medical Systems, Milwaukee, Wisconsin, USA), yielding a total of 93 cases. We then excluded cases that had missing images in the dynamic sequence (1 patient) or that, at the time, did not have gene expression analysis available in the TCGA Data Portal (8 patients). Applying these criteria yielded a dataset of 84 breast cancer patients, with MRIs from four institutions: Memorial Sloan Kettering Cancer Center, the Mayo Clinic, the University of Pittsburgh Medical Center, and the Roswell Park Cancer Institute. The resulting cases contributed by each institution were 9 (date range 1999-2002), 5 (1999-2003), 46 (1999-2004), and 24 (1999-2002), respectively. The dataset of biopsy-proven invasive breast cancers included 74 (88%) ductal, 8 (10%) lobular, and 2 (2%) mixed. Of these, 73 (87%) were ER+, 67 (80%) were PR+, and 19 (23%) were HER2+. Various types of analyses were conducted using the combined imaging, genomic, and clinical data. Those analyses are described within several manuscripts created by the group (cited below). Additional information about the methodology behind the Radiologist Annotations file can be found on the TCGA Breast Image Feature Scoring Project page.
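
    The case counts in this description can be cross-checked with a few lines of arithmetic. The sketch below is not part of the dataset; it only re-derives the quoted totals and rounded percentages from the numbers stated above, and the variable names are illustrative.

```python
# Counts quoted in the description above (no data is read from the archive).
ge_15t_cases = 93            # breast MRI studies acquired on GE 1.5 Tesla scanners
missing_dynamic_images = 1   # excluded: missing images in the dynamic sequence
no_gene_expression = 8       # excluded: no gene expression analysis available

final_cohort = ge_15t_cases - missing_dynamic_images - no_gene_expression

# Cases contributed by each institution, in the order listed in the text.
per_institution = {
    "Memorial Sloan Kettering Cancer Center": 9,
    "Mayo Clinic": 5,
    "University of Pittsburgh Medical Center": 46,
    "Roswell Park Cancer Institute": 24,
}

histology = {"ductal": 74, "lobular": 8, "mixed": 2}
receptor_positive = {"ER+": 73, "PR+": 67, "HER2+": 19}


def pct(count, total):
    """Percentage of total, rounded to the nearest integer."""
    return round(100 * count / total)


histology_pct = {k: pct(v, final_cohort) for k, v in histology.items()}
receptor_pct = {k: pct(v, final_cohort) for k, v in receptor_positive.items()}
```

    The institution counts and the histology counts each sum to the final cohort size, and the rounded percentages agree with those quoted in the text.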

  4. DICOM converted Slide Microscopy images for the TCGA-BRCA collection

    • zenodo.org
    bin
    Updated Aug 20, 2024
    Cite
    David Clunie; William Clifford; David Pot; Ulrike Wagner; Keyvan Farahani; Erika Kim; Andrey Fedorov (2024). DICOM converted Slide Microscopy images for the TCGA-BRCA collection [Dataset]. http://doi.org/10.5281/zenodo.12689963
    Available download formats: bin
    Dataset updated
    Aug 20, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    David Clunie; William Clifford; David Pot; Ulrike Wagner; Keyvan Farahani; Erika Kim; Andrey Fedorov
    License

    Attribution 3.0 (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/
    License information was derived automatically

    Description

    This dataset corresponds to a collection of images and/or image-derived data available from National Cancer Institute Imaging Data Commons (IDC) [1]. This dataset was converted into DICOM representation and ingested by the IDC team. You can explore and visualize the corresponding images using IDC Portal here: TCGA-BRCA. You can use the manifests included in this Zenodo record to download the content of the collection following the Download instructions below.

    Collection description

    The Cancer Imaging Program (CIP) is working directly with primary investigators from institutes participating in TCGA to obtain and load images relating to the genomic, clinical, and pathological data being stored within the TCGA Data Portal. Currently this MR multi-sequence image collection of breast invasive carcinoma patients can be matched by each unique case identifier with the extensive gene and expression data of the same case from The Cancer Genome Atlas Data Portal to research the link between clinical phenome and tissue genome.

    Please see the TCGA-BRCA page to learn more about the images and to obtain any supporting metadata for this collection.

    Files included

    A manifest file's name indicates the IDC data release in which a version of collection data was first introduced. For example, collection_id-idc_v8-aws.s5cmd corresponds to the contents of the collection_id collection introduced in IDC data release v8. If there is a subsequent version of this Zenodo page, it will indicate when a subsequent version of the corresponding collection was introduced.

    1. tcga_brca-idc_v8-aws.s5cmd: manifest of files available for download from public IDC Amazon Web Services buckets
    2. tcga_brca-idc_v8-gcs.s5cmd: manifest of files available for download from public IDC Google Cloud Storage buckets
    3. tcga_brca-idc_v8-dcf.dcf: Gen3 manifest (for details see https://learn.canceridc.dev/data/organization-of-data/guids-and-uuids)

    Note that manifest files that end in -aws.s5cmd reference files stored in Amazon Web Services (AWS) buckets, while -gcs.s5cmd reference files in Google Cloud Storage. The actual files are identical and are mirrored between AWS and GCP.
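
    The naming convention described above can be unpacked programmatically. The following is a minimal sketch that assumes every manifest follows the pattern collection_id-idc_vN-target.extension exactly as in the three files listed; the ManifestName type and parse_manifest_name function are hypothetical helpers, not part of any IDC tooling.

```python
import re
from typing import NamedTuple


class ManifestName(NamedTuple):
    collection_id: str   # e.g. "tcga_brca"
    idc_release: int     # IDC data release in which this version first appeared
    target: str          # "aws", "gcs", or "dcf"


# Assumed pattern, inferred only from the file names listed in this record.
_MANIFEST_RE = re.compile(
    r"^(?P<collection>.+)-idc_v(?P<release>\d+)-(?P<target>aws|gcs|dcf)\.(?:s5cmd|dcf)$"
)


def parse_manifest_name(filename):
    """Split a manifest file name into collection id, IDC release, and target."""
    match = _MANIFEST_RE.match(filename)
    if match is None:
        raise ValueError(f"not a recognized manifest name: {filename!r}")
    return ManifestName(
        collection_id=match["collection"],
        idc_release=int(match["release"]),
        target=match["target"],
    )
```

    For example, parse_manifest_name("tcga_brca-idc_v8-gcs.s5cmd") identifies the tcga_brca collection, IDC release 8, and the Google Cloud Storage manifest.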

    Download instructions

    Each of the manifests includes instructions in the header on how to download the included files.

    To download the files using .s5cmd manifests:

    1. install idc-index package: pip install --upgrade idc-index
    2. download the files referenced by manifests included in this dataset by passing the .s5cmd manifest file: idc download manifest.s5cmd.

    To download the files using .dcf manifest, see manifest header.
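
    For scripted use, the idc download command shown in step 2 can be wrapped from Python. This is only a sketch: it assumes idc-index has already been installed as in step 1 so that the idc executable is on the PATH, and the helper names build_download_command and download_manifest are hypothetical.

```python
import shutil
import subprocess


def build_download_command(manifest_path):
    """Return the argument list for `idc download <manifest>` (as in step 2)."""
    if not manifest_path.endswith(".s5cmd"):
        raise ValueError("expected a .s5cmd manifest file")
    return ["idc", "download", manifest_path]


def download_manifest(manifest_path):
    """Run the download; requires the `idc` command installed by idc-index."""
    if shutil.which("idc") is None:
        raise RuntimeError("idc not found; run: pip install --upgrade idc-index")
    # check=True raises CalledProcessError if the command exits with an error.
    subprocess.run(build_download_command(manifest_path), check=True)
```

    Calling download_manifest("tcga_brca-idc_v8-aws.s5cmd") would then fetch the files referenced by the AWS manifest into the current working directory, subject to the instructions in the manifest header.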

    Acknowledgments

    Imaging Data Commons team has been funded in whole or in part with Federal funds from the National Cancer Institute, National Institutes of Health, under Task Order No. HHSN26110071 under Contract No. HHSN261201500003l.

    References

    [1] Fedorov, A., Longabaugh, W. J. R., Pot, D., Clunie, D. A., Pieper, S. D., Gibbs, D. L., Bridge, C., Herrmann, M. D., Homeyer, A., Lewis, R., Aerts, H. J. W., Krishnaswamy, D., Thiriveedhi, V. K., Ciausu, C., Schacherer, D. P., Bontempi, D., Pihl, T., Wagner, U., Farahani, K., Kim, E. & Kikinis, R. National Cancer Institute Imaging Data Commons: Toward Transparency, Reproducibility, and Scalability in Imaging Artificial Intelligence. RadioGraphics (2023). https://doi.org/10.1148/rg.230180

  5. The subgroup data (BPS-LumA and WPS-LumA) of the 415 TCGA luminal-A breast...

    • plos.figshare.com
    xlsx
    Updated Jun 9, 2023
    Cite
    Seunghyun Wang; Doheon Lee (2023). The subgroup data (BPS-LumA and WPS-LumA) of the 415 TCGA luminal-A breast cancer samples. [Dataset]. http://doi.org/10.1371/journal.pcbi.1011197.s007
    Available download formats: xlsx
    Dataset updated
    Jun 9, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Seunghyun Wang; Doheon Lee
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The subgroup data (BPS-LumA and WPS-LumA) of the 415 TCGA luminal-A breast cancer samples.

  6. raw data of TCGA-BRCA

    • figshare.com
    rar
    Updated Dec 26, 2020
    Cite
    lanxin mu (2020). raw data of TCGA-BRCA [Dataset]. http://doi.org/10.6084/m9.figshare.13489356.v1
    Available download formats: rar
    Dataset updated
    Dec 26, 2020
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    lanxin mu
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Raw data downloaded from TCGA.

  7. TCGA-BRCA dataset image form

    • kaggle.com
    Updated May 23, 2025
    Cite
    RAJIB BAG_1 (2025). TCGA-BRCA dataset image form [Dataset]. https://www.kaggle.com/datasets/rajibbag1/tcga-brca-dataset-image-form/discussion
    Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
    Dataset updated
    May 23, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    RAJIB BAG_1
    Description

    This dataset was created by RAJIB BAG_1

  8. TCGA BRCA Diagnostic Slide (19)

    • kaggle.com
    Updated Jan 8, 2025
    Cite
    Arya Z.E. (2025). TCGA BRCA Diagnostic Slide (19) [Dataset]. https://www.kaggle.com/datasets/aryaze/tcga-brca-diagnostic-slide-19
    Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
    Dataset updated
    Jan 8, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Arya Z.E.
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset was created by Arya Z.E.

    Released under CC0: Public Domain

  9. TCGA BRCA Diagnostic Slide (24)

    • kaggle.com
    Updated Jan 8, 2025
    Cite
    Arya Z.E. (2025). TCGA BRCA Diagnostic Slide (24) [Dataset]. https://www.kaggle.com/datasets/aryaze/tcga-brca-diagnostic-slide-24/discussion
    Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
    Dataset updated
    Jan 8, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Arya Z.E.
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset was created by Arya Z.E.

    Released under CC0: Public Domain

  10. Raw data and code

    • figshare.com
    txt
    Updated Jul 12, 2023
    Cite
    Cao Jingyu (2023). Raw data and code [Dataset]. http://doi.org/10.6084/m9.figshare.23665038.v1
    Available download formats: txt
    Dataset updated
    Jul 12, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Cao Jingyu
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    • Supplemental file 1: raw data downloaded from the TCGA database for the TCGA-BRCA dataset
    • Supplemental file 2: merged raw data downloaded from the GEO database for the BRCA dataset
    • Supplemental file 3: the intersection of the anoikis-related genes from MSigDB and GeneCards
    • data1.R: R code for the LASSO prognostic model
    • GEOcombine_all.R: R code for combining GSE42568, GSE20685, and GSE102484
    • GSEA.GSVA.R: R code for GSEA and GSVA analysis
    • m.R: R code for the grouping information of TCGA-BRCA
    • boxplot.R: R code for the boxplot of TCGA-BRCA

  11. PIVOT - BRCA (light)

    • zenodo.org
    • data.niaid.nih.gov
    application/gzip
    Updated Jan 25, 2022
    Cite
    Malvika Sudhakar; Raghunathan Rengaswamy; Karthik Raman (2022). PIVOT - BRCA (light) [Dataset]. http://doi.org/10.5281/zenodo.5898117
    Available download formats: application/gzip
    Dataset updated
    Jan 25, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Malvika Sudhakar; Raghunathan Rengaswamy; Karthik Raman
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Pre-processed TCGA BRCA data used for PIVOT analysis.

  12. TCGA BRCA Diagnostic Slide (5)

    • kaggle.com
    Updated Jan 5, 2025
    Cite
    Arya Z.E. (2025). TCGA BRCA Diagnostic Slide (5) [Dataset]. https://www.kaggle.com/datasets/aryaze/tcga-brca-diagnostic-slide-5/code
    Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
    Dataset updated
    Jan 5, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Arya Z.E.
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    This dataset was created by Arya Z.E.

    Released under MIT

  13. BRCA1- and BRCA2- mutation associated transcriptome landscapes in breast and...

    • data.niaid.nih.gov
    Updated Dec 17, 2020
    Cite
    Siras Hakobyan (2020). BRCA1- and BRCA2- mutation associated transcriptome landscapes in breast and ovarian cancers: ml-SOM results [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4326451
    Dataset updated
    Dec 17, 2020
    Dataset provided by
    Hans Binder
    Arsen Arakelyan
    Arman Simonyan
    Siras Hakobyan
    Lilit Nersisyan
    Maria Nikoghosyan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is the submission accompanying raw result files for multiple-layer SOM (ml-SOM) analysis for the paper "Transcriptome patterns of BRCA1- and BRCA2- mutated breast and ovarian cancers".

    The dataset contains the results of the ml-SOM analysis of RNA-sequencing data from TCGA-OV (ovarian cancer) and TCGA-BRCA (breast cancer) projects.

    The dataset is organized as follows:

    Folder "12.BC.40 - Results" - ml-SOM analysis of TCGA-BRCA (breast cancer) dataset

    Folder "12.OV.40 - Results" - ml-SOM analysis of TCGA-OV (ovarian cancer) dataset

    File "12.BC.40.RData" - R data file that contains ml-SOM environment for breast cancer

    File "12.OV.40.RData" - R data file that contains ml-SOM environment for ovarian cancer

    For detailed instructions on browsing the results and their interpretation please refer to the oposSOM package manual [1], as well as original publications [2-4].

    References

    [1] Henry Loeffler-Wirth, Hoang Thanh Le and Martin Kalcher. oposSOM: Comprehensive analysis of transcriptome data. DOI: 10.18129/B9.bioc.oposSOM

    [2] Löffler-Wirth H, Kalcher M, Binder H. oposSOM: R-package for high-dimensional portraying of genome-wide expression landscapes on Bioconductor. Bioinformatics. 2015 Oct 1;31(19):3225-7. DOI: 10.1093/bioinformatics/btv342. Epub 2015 Jun 10.

    [3] Wirth H, von Bergen M, Binder H. Mining SOM expression portraits: feature selection and integrating concepts of molecular function. BioData Min. 2012 Oct 8;5(1):18. DOI: 10.1186/1756-0381-5-18.

    [4] Wirth H, Löffler M, von Bergen M, Binder H. Expression cartography of human tissues using self-organizing maps. BMC Bioinformatics. 2011 Jul 27;12:306. DOI: 10.1186/1471-2105-12-306.

  14. HER2 and trastuzumab treatment response H&E slides with tumor ROI...

    • cancerimagingarchive.net
    • dev.cancerimagingarchive.net
    n/a, xlsx +1
    Cite
    The Cancer Imaging Archive, HER2 and trastuzumab treatment response H&E slides with tumor ROI annotations [Dataset]. http://doi.org/10.7937/E65C-AM96
    Available download formats: n/a, xlsx, xml and svs
    Dataset authored and provided by
    The Cancer Imaging Archive
    License

    https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/

    Time period covered
    Aug 1, 2022
    Dataset funded by
    National Cancer Institute (http://www.cancer.gov/)
    Description

    The current standard of care for many patients with HER2-positive breast cancer is neoadjuvant chemotherapy in combination with anti-HER2 agents, based on HER2 amplification as detected by in situ hybridization (ISH) or protein immunohistochemistry (IHC). However, hematoxylin & eosin (H&E) tumor stains are more commonly available, and accurate prediction of HER2 status and anti-HER2 treatment response from H&E would reduce costs and increase the speed of treatment selection. Computational algorithms for H&E have been effective in predicting a variety of cancer features and clinical outcomes, including moderate success in predicting HER2 status. We trained a CNN classifier on 188 H&E whole slide images (WSIs) manually annotated for tumor regions of interest (ROIs) by our pathology team. Our classifier achieved an area under the curve (AUC) of 0.90 in cross-validation of slide-level HER2 status and 0.81 on an independent TCGA test set. Moreover, we trained our classifier on pre-treatment samples from 187 HER2+ patients that subsequently received trastuzumab therapy. Our classifier achieved an AUC of 0.80 in a five-fold cross validation. Our work provides an H&E-based algorithm that can predict HER2 status and trastuzumab response in breast cancer at an accuracy that may benefit clinical evaluations. Here, we are providing the datasets used in the study to facilitate development of other HER2+ diagnosis and trastuzumab response applications.

    Data annotation

    Digital slides were annotated by circling areas of invasive carcinoma (Regions of Interest, ROIs). The manual annotation of ROIs significantly enhances the prediction accuracy and reduces the need for extensively large datasets. Regions of necrosis, in situ carcinoma, or benign stroma and epithelium were excluded. The images were annotated with ROIs associated with HER2+/- tumor area (TA) by a senior breast pathologist. The annotations marked tumor boundaries and were drawn in Aperio ImageScope software. The annotations were exported from the Aperio software in Extensible Markup Language (XML) format, including X and Y coordinates corresponding to the annotated regions. We used these coordinates for each slide image to tile these regions separately from the rest of the image, labeled as HER2+ or HER2- class.
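
    As an illustration of how such exported coordinates might be read, the sketch below parses ROI polygons from an annotation XML string. It assumes the commonly seen ImageScope layout, in which each Region element holds Vertex elements with X and Y attributes; the files in this dataset should be inspected before relying on that layout. The sample XML and the function names are made up for the example.

```python
import xml.etree.ElementTree as ET

# Tiny made-up annotation in the assumed Region/Vertices/Vertex layout.
SAMPLE_XML = """<Annotations>
  <Annotation Id="1">
    <Regions>
      <Region Id="1">
        <Vertices>
          <Vertex X="100.0" Y="200.0"/>
          <Vertex X="150.0" Y="200.0"/>
          <Vertex X="150.0" Y="260.0"/>
        </Vertices>
      </Region>
    </Regions>
  </Annotation>
</Annotations>"""


def read_roi_polygons(xml_text):
    """Return one list of (x, y) vertex tuples per annotated Region."""
    root = ET.fromstring(xml_text)
    polygons = []
    for region in root.iter("Region"):
        points = [
            (float(vertex.get("X")), float(vertex.get("Y")))
            for vertex in region.iter("Vertex")
        ]
        if points:
            polygons.append(points)
    return polygons


def bounding_box(polygon):
    """Axis-aligned box (min_x, min_y, max_x, max_y) enclosing a polygon."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return (min(xs), min(ys), max(xs), max(ys))


polygons = read_roi_polygons(SAMPLE_XML)
boxes = [bounding_box(p) for p in polygons]
```

    A bounding box per ROI is one simple starting point for deciding which tiles of a whole slide image overlap an annotated tumor region; the study's own tiling procedure is not described in enough detail here to reproduce it.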

    Data set descriptions

    Yale HER2 cohort:

    This dataset presents H&E slides from 192 cases of HER2 positive and negative invasive breast carcinomas from the Yale Pathology electronic database. All tissues and data were retrieved under permission from the Yale Human Investigation Committee protocol #9505008219 to DLR. HER2 positive cases were defined as those with a 3+ score by immunohistochemistry (IHC) or an equivocal (2+) IHC score with subsequent amplification by fluorescence in situ hybridization (FISH), as defined by American Society of Clinical Oncology/College of American Pathologists (ASCO/CAP) clinical practice guidelines. H&E slides generated at Yale School of Medicine include 93 HER2+ and 99 HER2- slides. The slides were scanned at Yale Pathology Tissue Services and underwent a slide quality check before they went into the scanner. The tissue samples were scanned on a Vectra Polaris scanner (Perkin-Elmer) using bright-field whole-slide scanning at 20× magnification in the Rimm lab at Brady Memorial Laboratory.

    Yale trastuzumab response cohort:

    The 85 response cohort cases were also identified by a retrospective search of the Yale Pathology electronic database. Cases included those patients with a pre-treatment breast core biopsy with HER2 positive invasive breast carcinoma who then received neoadjuvant targeted therapy with trastuzumab +/- pertuzumab prior to definitive surgery. HER2 positivity was defined as previously described for the HER2 negative/positive cohort. The response to targeted therapy was obtained from the pathology reports of the surgical resection specimens and dichotomized into responders or non-responders. Those with a complete pathologic response, defined as no residual invasive carcinoma, lymphovascular invasion, or metastatic carcinoma, were designated as responders (n=36). Cases with only residual in situ carcinoma were included in the responder category. Those cases with any amount of residual invasive carcinoma, lymphovascular invasion, or metastatic carcinoma were categorized as non-responders (n=49).

    TCGA HER2 cohort:

    A total of 668 TCGA-BRCA HER2+/- samples with available HER2 status were downloaded from the GDC portal (see "Additional Resources for this Dataset" below). Slides were visually inspected by our pathology team to exclude low quality samples with tissue folding or those that appeared to be from frozen tissue. A total of 182 samples (90 HER2- and 92 HER2+) were retained for use as an independent test set. Information about which specific samples were retained can be found in the TCGA_BRCA Filtered folder of the dataset.
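
    The class counts quoted for the three cohorts can be tallied as a quick consistency check. This sketch uses only the numbers stated in the cohort descriptions above; the dictionary layout and key names are illustrative and do not reflect how the dataset itself is organized.

```python
# Per-class counts as quoted in the three cohort descriptions.
cohorts = {
    "Yale HER2": {"HER2+": 93, "HER2-": 99},
    "Yale trastuzumab response": {"responder": 36, "non-responder": 49},
    "TCGA HER2 (retained)": {"HER2-": 90, "HER2+": 92},
}

cohort_totals = {name: sum(classes.values()) for name, classes in cohorts.items()}

# TCGA samples downloaded before the visual quality inspection.
tcga_downloaded = 668
tcga_excluded = tcga_downloaded - cohort_totals["TCGA HER2 (retained)"]
tcga_retained_fraction = cohort_totals["TCGA HER2 (retained)"] / tcga_downloaded
```

    The totals match the 192, 85, and 182 cases stated in the text, which implies that 486 of the 668 downloaded TCGA samples were excluded during quality inspection.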

  15. TCGA-BRCA.mutect2_snv.tsv

    • figshare.com
    txt
    Updated Jun 1, 2022
    Cite
    Jarrett Lio (2022). TCGA-BRCA.mutect2_snv.tsv [Dataset]. http://doi.org/10.6084/m9.figshare.19948121.v1
    Available download formats: txt
    Dataset updated
    Jun 1, 2022
    Dataset provided by
    figshare
    Authors
    Jarrett Lio
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This is the SNP data downloaded from the Xena public database.

  16. SDTM datasets of clinical data and measurements for selected cancer...

    • dev.cancerimagingarchive.net
    csv, n/a, xpt
    Cite
    The Cancer Imaging Archive, SDTM datasets of clinical data and measurements for selected cancer collections to TCIA [Dataset]. http://doi.org/10.7937/TCIA.2019.zfv154m9
    Available download formats: n/a, xpt, csv
    Dataset authored and provided by
    The Cancer Imaging Archive
    License

    https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/

    Time period covered
    Jun 21, 2019
    Dataset funded by
    National Cancer Institute (http://www.cancer.gov/)
    Description

    The Data Integration & Imaging Informatics (DI-Cubed) project explored the issue of lack of standardized data capture at the point of data creation, as reflected in the non-image data accompanying 4 TCIA breast cancer collections (Multi-center breast DCE-MRI data and segmentations from patients in the I-SPY 1/ACRIN 6657 trials (ISPY1), BREAST-DIAGNOSIS, Single site breast DCE-MRI data and segmentations from patients undergoing neoadjuvant chemotherapy (Breast-MRI-NACT-Pilot), The Cancer Genome Atlas Breast Invasive Carcinoma Collection (TCGA-BRCA)) and the Ivy Glioblastoma Atlas Project (IvyGAP) brain cancer collection. The work addressed the desire for semantic interoperability between various NCI initiatives by aligning on common clinical metadata elements and supporting use cases that connect clinical, imaging, and genomics data. Accordingly, clinical and measurement data imported into I2B2 were cross-mapped to industry standard concepts for names and values including those derived from BRIDG, CDISC SDTM, DICOM Structured Reporting models and using NCI Thesaurus, SNOMED CT and LOINC controlled terminology. A subset of the standardized data was then exported from I2B2 in SDTM compliant SAS transport files. The SDTM data was derived from data taken from both the curated TCIA spreadsheets as well as tumor measurements and dates from the TCIA Restful API. Due to the nature of the available data not all SDTM conformance rules were applicable or adhered to. These Study Data Tabulation Model format (SDTM) datasets were validated using Pinnacle 21 CDISC validation software. The validation software reviews datasets according to their degree of conformance to rules developed for the purposes of FDA submissions of electronic data. Iterative refinements were made to the datasets based upon group discussions and feedback from the validation tool. Export datasets for the following SDTM domains were generated:

    • DM (Demographics)
    • DS (Disposition)
    • MI (Microscopic Findings)
    • PR (Procedures)
    • SS (Subject Status)
    • TU (Tumor/Lesion Identification)
    • TR (Tumor/Lesion Results)

  17. Figure 1. High ESRP1 expression correlates with poor prognosis in estrogen...

    • search.sourcedata.io
    zip
    Updated Jan 21, 2019
    Cite
    Yesim Gokmen-Polar; Yaseswini Neelamraju; Chirayu, P Goswami; Xiaoping Gu; Gouthami Nallamothu; Edytha Vieth; Sarath Chandra Janga; Yuan Gu; Michael Ryan; Sunil, S Badve (2019). : Figure 1-F [Dataset]. https://search.sourcedata.io/panel/65185
    Available download formats: zip
    Dataset updated
    Jan 21, 2019
    Authors
    Yesim Gokmen-Polar; Yaseswini Neelamraju; Chirayu, P Goswami; Xiaoping Gu; Gouthami Nallamothu; Edytha Vieth; Sarath Chandra Janga; Yuan Gu; Michael Ryan; Sunil, S Badve
    License

    Attribution 2.0 (CC BY 2.0): https://creativecommons.org/licenses/by/2.0/
    License information was derived automatically

    Variables measured
    ESRP1, human
    Description

    Panel F: The Cancer Genome Atlas Breast Invasive Carcinoma (TCGA-BRCA) RNA-seq dataset; Kaplan-Meier curves of overall survival (OS) in ER- breast cancer. Red: high ESRP1 expression; black: low ESRP1 expression. A log-rank test was used to calculate P=0.19 (n=100, number of events=17). List of tagged entities: ESRP1 (ncbigene:54845), human (taxonomy:9606), gene expression assay (bao:BAO_0002785), RNA sequencing (obi:OBI_0001177), survival curve (obi:OBI_0000889), BRCA, Breast Invasive Carcinoma, ER- breast cancer

  18. Clinicopathologic features of BRCA from the TCGA.

    • plos.figshare.com
    bin
    Updated Aug 16, 2023
    Cite
    Jiatong Ding; Chenxi Li; Kexin Shu; Wanying Chen; Chenxi Cai; Xin Zhang; Wenxiong Zhang (2023). Clinicopathologic features of BRCA from the TCGA. [Dataset]. http://doi.org/10.1371/journal.pone.0289960.t001
    Available download formats: bin
    Dataset updated
    Aug 16, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Jiatong Ding; Chenxi Li; Kexin Shu; Wanying Chen; Chenxi Cai; Xin Zhang; Wenxiong Zhang
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Background: Patients with systemic lupus erythematosus (SLE) have a lower risk of breast cancer (BRCA) than the general population. In this study, we explored the underlying molecular mechanism that is dysregulated in both diseases.

    Methods: Weighted gene coexpression network analysis (WGCNA) was executed with the SLE and BRCA datasets from the Gene Expression Omnibus (GEO) website and identified the potential role of membrane metalloendopeptidase (MME) in both diseases. Then, Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses of related proteins and miRNAs were performed to investigate the potential molecular pathways.

    Results: WGCNA revealed that MME was positively related to SLE but negatively related to BRCA. In BRCA, MME expression was significantly decreased in tumor tissues, especially in luminal B and infiltrating ductal carcinoma subtypes. Receiver operating characteristic (ROC) analysis identified MME as a valuable diagnostic biomarker of BRCA, with an area under the curve (AUC) value equal to 0.984 (95% confidence interval = 0.976–0.992). KEGG enrichment analysis suggested that MME-related proteins and targeted miRNAs may reduce the incidence of BRCA in SLE patients via the PI3K/AKT/FOXO signaling pathway. Low MME expression was associated with favorable relapse-free survival (RFS) but no other clinical outcomes and may contribute to resistance to chemotherapy in BRCA, with an AUC equal to 0.527 (P value < 0.05).

    Conclusions: In summary, MME expression was significantly decreased in BRCA but positively correlated with SLE, and it might reduce the incidence of BRCA in SLE patients via the PI3K/AKT/FOXO signaling pathway.

  19. Metadata record for the article: Signaling of MK2 Sustains Robust AP1...

    • springernature.figshare.com
    xlsx
    Updated Feb 5, 2024
    Cite
    Haoming Chen; Ravi Padia; Tao Li; Yue Li; Bin Li; Lingtao Jin; Shuang Huang (2024). Metadata record for the article: Signaling of MK2 Sustains Robust AP1 Activity for Triple Negative Breast Cancer Tumorigenesis through Direct Phosphorylation of JAB1 [Dataset]. http://doi.org/10.6084/m9.figshare.14681250.v1
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Feb 5, 2024
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Haoming Chen; Ravi Padia; Tao Li; Yue Li; Bin Li; Lingtao Jin; Shuang Huang
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Summary

    This metadata record provides details of the data supporting the claims of the related article: “Signaling of MK2 Sustains Robust AP1 Activity for Triple Negative Breast Cancer Tumorigenesis through Direct Phosphorylation of JAB1”.

    The related study showed that p38MAPK signalling pathway regulation of activator protein 1 (AP1) activity involves both MAPKAPK2 (MK2) and JAB1, a known JUN binding protein.

    Type of data: signalling pathway activity

    Subject of data: antibodies; Eukaryotic cell lines (ATCC); Mus musculus (Foxn1nu, female, Catalog# 007850, The Jackson laboratory)

    Data access

    The following files underlying the figures in the related manuscript are openly available with this data record:

    • Fig.1 data.xlsx (Fig.1a,1b of the related article)

    • Fig.2 data.xlsx (Fig.2c,2d and 2e)

    • Fig.3 data.xlsx (Fig.3e and 3f)

    • Fig.4 data.xlsx (Fig.4f,4g and 4f)

    • Fig.5 data.xlsx (Fig.5b,5c and 5d)

    • Fig.6 data.xlsx

    • Fig.7 data.xlsx (Fig. 7e and 7f)

    • Fig.8 data.xlsx (Fig.8a and 8c)

    All other data supporting the related study can be found in the supplementary information file of the related article, and the corresponding author can make any materials available upon request. Un-cropped gels and western blots for Fig. to Fig.5 were included in Supplementary Materials (Fig.S11).

    JAB1 expression data for different breast cancer subtypes were downloaded from https://tcga.xenahubs.net/download/TCGA.BRCA.sampleMap/HiSeqV2.gz and https://tcga.xenahubs.net/download/TCGA.BRCA.sampleMap/BRCA_clinicalMatrix.gz. For analysis of p38MAPK activity in breast cancer, Reverse Phase Protein Array (RPPA) z-scores and corresponding clinical data from TCGA Breast Cancer Invasive Carcinoma, PanCancer Atlas were first downloaded through cBioPortal (https://www.cbioportal.org/).

    Corresponding author(s) for this study

    Shuang Huang, Department of Anatomy and Cell Biology, University of Florida College of Medicine, Gainesville, FL 32610. E-mail: shuanghuang@ufl.edu

    Study approval

    University of Florida Institutional Animal Care and Use Committee.

  20. Pan-Cancer-Nuclei-Seg-DICOM: DICOM converted Dataset of Segmented Nuclei in...

    • zenodo.org
    • explore.openaire.eu
    bin
    Updated Oct 1, 2024
    Cite
    Christopher Bridge; Markus Herrmann; David Clunie; Andrey Fedorov (2024). Pan-Cancer-Nuclei-Seg-DICOM: DICOM converted Dataset of Segmented Nuclei in Hematoxylin and Eosin Stained Histopathology Images [Dataset]. http://doi.org/10.5281/zenodo.11099005
    Explore at:
    Available download formats: bin
    Dataset updated
    Oct 1, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Christopher Bridge; Markus Herrmann; David Clunie; Andrey Fedorov
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset corresponds to a collection of images and/or image-derived data available from National Cancer Institute Imaging Data Commons (IDC) [1]. This dataset was converted into DICOM representation and ingested by the IDC team. You can explore and visualize the corresponding images using IDC Portal here: Pan-Cancer-Nuclei-Seg-DICOM. You can use the manifests included in this Zenodo record to download the content of the collection following the Download instructions below.

    Collection description

    This collection contains automatic nucleus segmentation data of 5,060 whole slide tissue images of 10 cancer types previously published in [2] (https://doi.org/10.7937/TCIA.2019.4A4DKP9U), stored in DICOM Bulk Annotation format. Nuclei annotations are stored as closed polygons along with the area of each nucleus. The annotations correspond to digital pathology images from the TCGA-BLCA, TCGA-BRCA, TCGA-CESC, TCGA-COAD, TCGA-GBM, TCGA-LUAD, TCGA-LUSC, TCGA-PAAD, TCGA-PRAD, TCGA-READ, TCGA-SKCM, TCGA-STAD, TCGA-UCEC, TCGA-UVM collections available in NCI Imaging Data Commons.
    To learn how these files are organized and how to access the content programmatically, see this documentation page: https://highdicom.readthedocs.io/en/latest/ann.html.
    Conversion of the nuclei segmentations from the original CSV format into DICOM ANN format was done using the code available in 10.5281/zenodo.10632181.

    Files included

    A manifest file's name indicates the IDC data release in which a version of collection data was first introduced. For example, pan_cancer_nuclei_seg_dicom-collection_id-idc_v19-aws.s5cmd corresponds to the annotations for the images in the collection_id collection introduced in IDC data release v19. If there is a subsequent version of this Zenodo page, it will indicate when a subsequent version of the corresponding collection was introduced.

    For each of the collections, the following manifest files are provided:

    1. pan_cancer_nuclei_seg_dicom-: manifest of files available for download from public IDC Amazon Web Services buckets
    2. pan_cancer_nuclei_seg_dicom-: manifest of files available for download from public IDC Google Cloud Storage buckets
    3. pan_cancer_nuclei_seg_dicom-: Gen3 manifest (for details see https://learn.canceridc.dev/data/organization-of-data/guids-and-uuids)

    Note that manifest files that end in -aws.s5cmd reference files stored in Amazon Web Services (AWS) buckets, while those that end in -gcs.s5cmd reference files in Google Cloud Storage. The actual files are identical and are mirrored between AWS and GCP.
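
The naming convention described above can be sketched as a small parser. This is only an illustration: the pattern is inferred from the single example file name given in this record (pan_cancer_nuclei_seg_dicom-collection_id-idc_v19-aws.s5cmd), and the names `ManifestName` and `parse_manifest_name` are hypothetical helpers, not part of any IDC tool.

```python
import re
from typing import NamedTuple, Optional


class ManifestName(NamedTuple):
    """Fields encoded in an .s5cmd manifest file name (hypothetical helper type)."""
    collection_id: str
    idc_release: int
    cloud: str  # "aws" or "gcs"


# Assumed layout, inferred from the one example in the record:
#   pan_cancer_nuclei_seg_dicom-<collection_id>-idc_v<release>-<aws|gcs>.s5cmd
_MANIFEST_RE = re.compile(
    r"^pan_cancer_nuclei_seg_dicom-"
    r"(?P<collection_id>.+)-idc_v(?P<release>\d+)-(?P<cloud>aws|gcs)\.s5cmd$"
)


def parse_manifest_name(filename: str) -> Optional[ManifestName]:
    """Return the parsed fields, or None if the name does not fit the assumed layout."""
    match = _MANIFEST_RE.match(filename)
    if match is None:
        return None
    return ManifestName(
        collection_id=match.group("collection_id"),
        idc_release=int(match.group("release")),
        cloud=match.group("cloud"),
    )


example = parse_manifest_name(
    "pan_cancer_nuclei_seg_dicom-collection_id-idc_v19-aws.s5cmd"
)
print(example)
```

Names that do not match, such as a Gen3 manifest with a different extension, yield None rather than raising, so a caller can filter a directory listing without special cases.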

    Download instructions

    Each of the manifests includes instructions in the header on how to download the included files.

    To download the files using .s5cmd manifests:

    1. install idc-index package: pip install --upgrade idc-index
    2. download the files referenced by manifests included in this dataset by passing the .s5cmd manifest file: idc download manifest.s5cmd

    To download the files using .dcf manifest, see manifest header.
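
For scripted use, the two .s5cmd steps above could be wrapped roughly as follows. This is a sketch under assumptions: it only assembles and runs the two commands quoted in the instructions (`pip install --upgrade idc-index` and `idc download manifest.s5cmd`), it invokes pip as `python -m pip` so that the current interpreter is used, and the function names `build_download_commands` and `run_download` are hypothetical.

```python
import subprocess
import sys
from typing import List


def build_download_commands(manifest_path: str) -> List[List[str]]:
    """Assemble the two documented steps as argument lists (nothing is executed here)."""
    return [
        # Step 1: install or upgrade the idc-index package.
        [sys.executable, "-m", "pip", "install", "--upgrade", "idc-index"],
        # Step 2: pass the .s5cmd manifest to the idc command-line tool.
        ["idc", "download", manifest_path],
    ]


def run_download(manifest_path: str) -> None:
    """Run both steps in order, stopping at the first failure."""
    for command in build_download_commands(manifest_path):
        subprocess.run(command, check=True)


# Example (requires network access; not run automatically):
# run_download("manifest.s5cmd")
```

Keeping command construction separate from execution makes it possible to inspect or log the commands before anything is installed or downloaded.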

    Acknowledgments

    Imaging Data Commons team has been funded in whole or in part with Federal funds from the National Cancer Institute, National Institutes of Health, under Task Order No. HHSN26110071 under Contract No. HHSN261201500003l.

    References

    [1] Fedorov, A., Longabaugh, W. J. R., Pot, D., Clunie, D. A., Pieper, S. D., Gibbs, D. L., Bridge, C., Herrmann, M. D., Homeyer, A., Lewis, R., Aerts, H. J. W. L., Krishnaswamy, D., Thiriveedhi, V. K., Ciausu, C., Schacherer, D. P., Bontempi, D., Pihl, T., Wagner, U., Farahani, K., Kim, E. & Kikinis, R. National cancer institute imaging data commons: Toward transparency, reproducibility, and scalability in imaging artificial intelligence. Radiographics 43, (2023).
    [2] Hou, L., Gupta, R., Van Arnam, J. S., Zhang, Y., Sivalenka, K., Samaras, D., Kurc, T., & Saltz, J. H. (2019). Dataset of Segmented Nuclei in Hematoxylin and Eosin Stained Histopathology Images of 10 Cancer Types [Data set]. The Cancer Imaging Archive. https://doi.org/10.7937/TCIA.2019.4A4DKP9U
