100+ datasets found
  1. The Cancer Genome Atlas Breast Invasive Carcinoma Collection

    • cancerimagingarchive.net
    dicom, n/a
    Updated Feb 2, 2014
    Cite
    The Cancer Imaging Archive (2014). The Cancer Genome Atlas Breast Invasive Carcinoma Collection [Dataset]. http://doi.org/10.7937/K9/TCIA.2016.AB2NAZRP
    Explore at:
    Available download formats: n/a, dicom
    Dataset updated
    Feb 2, 2014
    Dataset authored and provided by
    The Cancer Imaging Archive
    License

    https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/

    Time period covered
    May 29, 2020
    Dataset funded by
    National Cancer Institute (http://www.cancer.gov/)
    Description

    The Cancer Genome Atlas Breast Invasive Carcinoma (TCGA-BRCA) data collection is part of a larger effort to build a research community focused on connecting cancer phenotypes to genotypes by providing clinical images matched to subjects from The Cancer Genome Atlas (TCGA). Clinical, genetic, and pathological data resides in the Genomic Data Commons (GDC) Data Portal while the radiological data is stored on The Cancer Imaging Archive (TCIA).

    Matched TCGA patient identifiers allow researchers to explore the TCGA/TCIA databases for correlations between tissue genotype, radiological phenotype and patient outcomes. Tissues for TCGA were collected from many sites all over the world in order to reach their accrual targets, usually around 500 specimens per cancer type. For this reason the image data sets are also extremely heterogeneous in terms of scanner modalities, manufacturers and acquisition protocols. In most cases the images were acquired as part of routine care and not as part of a controlled research study or clinical trial.

    CIP TCGA Radiology Initiative

    Imaging Source Site (ISS) Groups are being populated and governed by participants from institutions that have provided imaging data to the archive for a given cancer type. Modeled after TCGA analysis groups, ISS groups are given the opportunity to publish a marker paper for a given cancer type per the guidelines in the table above. This opportunity will generate increased participation in building these multi-institutional data sets as they become an open community resource. Learn more about the TCGA Breast Phenotype Research Group.

  2. TCGA BRCA cancer dataset

    • zenodo.org
    • portalinvestigacion.udc.gal
    • +1 more
    bin
    Updated Dec 11, 2020
    Cite
    Carlos Fernandez-Lozano (2020). TCGA BRCA cancer dataset [Dataset]. http://doi.org/10.5281/zenodo.4309168
    Explore at:
    Available download formats: bin
    Dataset updated
    Dec 11, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Carlos Fernandez-Lozano
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Following the same steps used in the previous course, we downloaded the TCGA-BRCA data using R and Bioconductor, in particular the TCGAbiolinks package. We downloaded transcriptome profiling data (gene expression quantification) where the experimental strategy is RNA-Seq and the workflow type is HTSeq-FPKM-UQ, keeping only primary solid tumor data of the affymetrix GPL86 profile, together with clinical data.

  3. TCGA-BRCA - Dataset - LDM

    • service.tib.eu
    Updated Dec 16, 2024
    Cite
    (2024). TCGA-BRCA - Dataset - LDM [Dataset]. https://service.tib.eu/ldmservice/dataset/tcga-brca
    Explore at:
    Dataset updated
    Dec 16, 2024
    Description

    This dataset was used in the paper for whole slide image (WSI) classification, a digital pathology task. It consists of histopathology and cytopathology images and is used to evaluate the performance of the proposed WSI classification method.

  4. TCGA Breast Phenotype Research Group Data sets

    • cancerimagingarchive.net
    n/a, xls, zip
    Updated Sep 4, 2018
    Cite
    The Cancer Imaging Archive (2018). TCGA Breast Phenotype Research Group Data sets [Dataset]. http://doi.org/10.7937/K9/TCIA.2014.8SIPIY6G
    Explore at:
    Available download formats: xls, n/a, zip
    Dataset updated
    Sep 4, 2018
    Dataset authored and provided by
    The Cancer Imaging Archive
    License

    https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/

    Time period covered
    Sep 4, 2018
    Dataset funded by
    National Cancer Institute (http://www.cancer.gov/)
    Description

    At the time of our study, 108 cases with breast MRI data were available in The Cancer Genome Atlas Breast Invasive Carcinoma (TCGA-BRCA) collection. To minimize variations in image quality across the multi-institutional cases, we included only breast MRI studies acquired on GE 1.5 Tesla scanners (GE Medical Systems, Milwaukee, Wisconsin, USA), yielding a total of 93 cases. We then excluded cases that had missing images in the dynamic sequence (1 patient) or that did not, at the time, have gene expression analysis available in the TCGA Data Portal (8 patients). These criteria resulted in a dataset of 84 breast cancer patients, with MRIs from four institutions: Memorial Sloan Kettering Cancer Center, the Mayo Clinic, the University of Pittsburgh Medical Center, and the Roswell Park Cancer Institute. The cases contributed by each institution numbered 9 (date range 1999-2002), 5 (1999-2003), 46 (1999-2004), and 24 (1999-2002), respectively. The dataset of biopsy-proven invasive breast cancers included 74 (88%) ductal, 8 (10%) lobular, and 2 (2%) mixed. Of these, 73 (87%) were ER+, 67 (80%) were PR+, and 19 (23%) were HER2+. Various types of analyses were conducted using the combined imaging, genomic, and clinical data. Those analyses are described in several manuscripts created by the group (cited below). Additional information about the methodology behind the Radiologist Annotations file can be found on the TCGA Breast Image Feature Scoring Project page.
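
    The cohort arithmetic in this description can be checked with a few lines of Python. This is only a bookkeeping sketch: the counts are copied from the text, while the variable names and the short institution labels are illustrative assumptions.

```python
# Cohort flow as reported in the description above.
ge_1_5t = 93               # cases left after restricting to GE 1.5 T scanners
missing_dynamic = 1        # excluded: missing images in the dynamic sequence
no_expression = 8          # excluded: no gene expression analysis available

final_cohort = ge_1_5t - missing_dynamic - no_expression

# Per-institution counts in the order listed (labels are shorthand, not official).
per_site = {"MSKCC": 9, "Mayo Clinic": 5, "UPMC": 46, "Roswell Park": 24}
histology = {"ductal": 74, "lobular": 8, "mixed": 2}
receptors = {"ER+": 73, "PR+": 67, "HER2+": 19}


def percent(count, total):
    """Share of the cohort, rounded to the nearest whole percent."""
    return round(100 * count / total)


# The site totals and the histology totals should both equal the final cohort.
assert final_cohort == sum(per_site.values()) == sum(histology.values())

histology_pct = {name: percent(n, final_cohort) for name, n in histology.items()}
receptor_pct = {name: percent(n, final_cohort) for name, n in receptors.items()}
```

    The rounded percentages computed this way match the figures quoted in the description.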

  5. TCGA-BRCA.mutect2_snv.tsv

    • figshare.com
    txt
    Updated Jun 1, 2022
    Cite
    Jarrett Lio (2022). TCGA-BRCA.mutect2_snv.tsv [Dataset]. http://doi.org/10.6084/m9.figshare.19948121.v1
    Explore at:
    Available download formats: txt
    Dataset updated
    Jun 1, 2022
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Jarrett Lio
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This is the SNP data downloaded from the Xena public database.

  6. sample-tcga-brca

    • huggingface.co
    Updated Nov 19, 2025
    Cite
    Vishvesh Trivedi (2025). sample-tcga-brca [Dataset]. https://huggingface.co/datasets/NerdyVisky/sample-tcga-brca
    Explore at:
    Dataset updated
    Nov 19, 2025
    Authors
    Vishvesh Trivedi
    Description

    NerdyVisky/sample-tcga-brca dataset hosted on Hugging Face and contributed by the HF Datasets community

  7. 447 TRIPLE NEGATIVE BREAST CANCER MERGED SAMPLES FROM METABRIC AND TCGA

    • springernature.figshare.com
    • datasetcatalog.nlm.nih.gov
    bin
    Updated Nov 17, 2023
    Cite
    SAMINA GUL; Jianyu Pang; Hongjun Yuan; Yongzhi Chen; Qian yu; Wang Hui; WENRU TANG (2023). 447 TRIPLE NEGATIVE BREAST CANCER MERGED SAMPLES FROM METABRIC AND TCGA [Dataset]. http://doi.org/10.6084/m9.figshare.23831913.v1
    Explore at:
    Available download formats: bin
    Dataset updated
    Nov 17, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    SAMINA GUL; Jianyu Pang; Hongjun Yuan; Yongzhi Chen; Qian yu; Wang Hui; WENRU TANG
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    We extracted 320 samples of the TNBC subtype from METABRIC (Breast Cancer; 2509 samples in total), merged them with 127 TNBC samples from TCGA-BRCA, and used the merged 447 samples for validation.

  8. TCGA-BRCA updated image form dataset

    • kaggle.com
    zip
    Updated May 24, 2025
    Cite
    Joy Dhar (2025). TCGA-BRCA updated image form dataset [Dataset]. https://www.kaggle.com/datasets/dharjoy/tcga-brca-updated-image-form-dataset
    Explore at:
    Available download formats: zip (888630513 bytes)
    Dataset updated
    May 24, 2025
    Authors
    Joy Dhar
    Description

    This dataset was created by Joy Dhar.

  9. TCGA-BRCA-30-samples

    • huggingface.co
    Updated Dec 20, 2025
    Cite
    Vishvesh Trivedi (2025). TCGA-BRCA-30-samples [Dataset]. https://huggingface.co/datasets/NerdyVisky/TCGA-BRCA-30-samples
    Explore at:
    Dataset updated
    Dec 20, 2025
    Authors
    Vishvesh Trivedi
    Description

    NerdyVisky/TCGA-BRCA-30-samples dataset hosted on Hugging Face and contributed by the HF Datasets community

  10. BRCA samples somatic mutation data

    • figshare.com
    application/gzip
    Updated Jan 19, 2016
    Cite
    Endre Sebestyén (2016). BRCA samples somatic mutation data [Dataset]. http://doi.org/10.6084/m9.figshare.1061908.v1
    Explore at:
    Available download formats: application/gzip
    Dataset updated
    Jan 19, 2016
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Endre Sebestyén
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    TCGA BRCA samples somatic mutation data in BED format.
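
    The listing does not describe the columns of this file, so the following is a minimal sketch that assumes the standard BED layout: tab-separated fields whose first three are chromosome, 0-based start, and exclusive end. The helper names `read_bed` and `open_bed` and the example records are hypothetical.

```python
import gzip
import io


def read_bed(handle):
    """Yield (chrom, start, end, extra_fields) from a BED-formatted text stream."""
    for line in handle:
        line = line.rstrip("\n")
        # Skip blank lines and the optional header lines BED files may carry.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        yield chrom, start, end, fields[3:]


def open_bed(path):
    """Open a plain or gzip-compressed BED file as text (hypothetical helper)."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


# Tiny synthetic example; these records are made up for illustration.
example = io.StringIO(
    "track name=demo\n"
    "chr17\t100\t101\tsampleA\n"
    "chr13\t250\t252\tsampleB\n"
)
records = list(read_bed(example))

# BED intervals are 0-based and half-open, so the length is end - start.
lengths = [end - start for _, start, end, _ in records]
```

    For the downloaded archive, `open_bed` could be pointed at the extracted or gzip-compressed file, with the extra fields interpreted once the actual column layout is known.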

  11. TCGA BRCA Cancer Type Survival Dataset

    • kaggle.com
    zip
    Updated Jun 13, 2024
    Cite
    sajju (2024). TCGA BRCA Cancer Type Survival Dataset [Dataset]. https://www.kaggle.com/sajjuafridi/tcga-brca-cancer-type-survival-dataset
    Explore at:
    Available download formats: zip (69008197 bytes)
    Dataset updated
    Jun 13, 2024
    Authors
    sajju
    Description

    This dataset was created by sajju.

    Released under Other (specified in description).

  12. BRCA non-paired sample gene level read counts

    • figshare.com
    • datasetcatalog.nlm.nih.gov
    application/gzip
    Updated Jan 19, 2016
    Cite
    Endre Sebestyén (2016). BRCA non-paired sample gene level read counts [Dataset]. http://doi.org/10.6084/m9.figshare.1061900.v1
    Explore at:
    Available download formats: application/gzip
    Dataset updated
    Jan 19, 2016
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Endre Sebestyén
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    TCGA BRCA non-paired sample gene level read counts from Level 3 RNASeq-v2 data.

  13. TCGA-BRCA:survival analysis

    • kaggle.com
    zip
    Updated Mar 31, 2025
    Cite
    Juan Malagón (2025). TCGA-BRCA:survival analysis [Dataset]. https://www.kaggle.com/datasets/jmalagontorres/tcga-brca-survival-analysis/code
    Explore at:
    Available download formats: zip (133161552490 bytes)
    Dataset updated
    Mar 31, 2025
    Authors
    Juan Malagón
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Introduction

    This dataset consists of 1097 breast cancer patient cases and is designed for survival analysis using both histopathological and clinical information. The combination of these data sources allows for the exploration of disease progression patterns and the development of predictive models.

    Histopathological Data

    The dataset includes a folder containing histopathological image patches extracted from whole-slide imaging (WSI) scans.

    Optical magnification: x20

    Patch size: 1000 x 1000 pixels

    Region selection: Only patches containing tissue are included, discarding areas without relevant information

    Image-Derived Data

    For each patient, a CSV file is provided with extracted information from the histopathological patches:

    Histograms: Representation of the pixel intensity distribution in each image

    Cell count: Number of cells present in the selected patches

    Clinical Data

    A second CSV file contains clinical information about the patients, which is essential for survival analysis. The included variables are:

    Time until death: The time elapsed until the patient’s death

    Vital status: Indicates whether the patient is deceased or still alive

    Other clinical variables: Factors that may influence survival and help contextualize the histopathological data

    Dataset Objective

    The primary objective of this dataset is to facilitate the development of survival models that integrate histopathological and clinical information. This will help identify patterns in breast cancer progression and enhance predictive capabilities for estimating patient survival time.

    This dataset is ideal for exploring machine learning methods applied to digital pathology and survival analysis in oncology.
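
    As an illustration of how the two clinical variables described above (time until death and vital status) feed a survival analysis, here is a minimal Kaplan-Meier sketch in plain Python. It is not the dataset author's method; the function name, the 0/1 event encoding, and the toy cohort are assumptions made for the example.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct death time.

    times  -- follow-up time per patient (for example, in days)
    events -- 1 if the patient died at that time, 0 if censored (still alive)
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    exits = Counter(times)  # every patient leaves the risk set at their time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Made-up toy cohort, not taken from the dataset.
toy_times = [5, 8, 8, 12, 15]
toy_events = [1, 1, 0, 1, 0]
toy_curve = kaplan_meier(toy_times, toy_events)
```

    On the toy cohort the estimate steps down at times 5, 8, and 12 to roughly 0.8, 0.6, and 0.3. Patients censored at a given time are still counted as at risk at that time, which is the usual convention.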

  14. Preliminary Mitosis Detection Results for TCGA-BRCA Dataset

    • data.niaid.nih.gov
    • data-staging.niaid.nih.gov
    Updated Feb 21, 2024
    Cite
    Jahanifar, Mostafa (2024). Preliminary Mitosis Detection Results for TCGA-BRCA Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10245706
    Explore at:
    Dataset updated
    Feb 21, 2024
    Authors
    Jahanifar, Mostafa
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset provides mitosis detection results obtained with the "Mitosis Detection, Fast and Slow" (MDFS) algorithm (arXiv:2208.12587, "Mitosis Detection, Fast and Slow: Robust and Efficient Detection of Mitotic Figures") on the TCGA-BRCA dataset.

    The MDFS algorithm exemplifies a robust and efficient two-stage process for mitosis detection. Initially, potential mitotic figures are identified and later refined. The proposed model for the preliminary identification of candidates, the EUNet, stands out for its swift and accurate performance, largely due to its structural design. EUNet operates by outlining candidate areas at a lower resolution, significantly expediting the detection process. In the second phase, the initially identified candidates undergo further refinement using a more intricate classifier network, namely the EfficientNet-B7. The MDFS algorithm was originally developed for the MIDOG challenges.

    Viewing in QuPath

    The dataset at hand comprises GeoJSON files in two categories: mitosis and proxy (mimicker -- the candidates that are unlikely to be mitosis based on our algorithm). Users can open and visualize each category overlaid on the Whole Slide Image (WSI) using QuPath. Simply drag and drop the annotation file onto the opened image in the program. Additionally, users can employ the provided Python snippet to read the annotation into a Python dictionary or a Numpy array.

    Loading in Python

    To load the GeoJSON files in Python, users can use the following code:

    import json

    import numpy as np
    import pandas as pd


    def load_geojson(filename):
        """Load one annotation file; return slide properties and detections."""
        # Load the GeoJSON file
        with open(filename, 'r') as f:
            data = json.load(f)

        # Extract the slide-level properties (a dictionary)
        slide_properties = data["properties"]

        # Convert the points to an (N, 3) numpy array of x, y, score;
        # the reshape keeps the shape valid when a file has no features
        points_np = np.array(
            [(feat['geometry']['coordinates'][0],
              feat['geometry']['coordinates'][1],
              feat['properties']['score'])
             for feat in data['features']],
            dtype=float,
        ).reshape(-1, 3)

        # Convert the points to a pandas DataFrame
        points_df = pd.DataFrame(points_np, columns=['x', 'y', 'score'])

        return slide_properties, points_np, points_df


    # Use the function to load mitosis data
    mitosis_properties, mitosis_points_np, mitosis_points_df = load_geojson('mitosis.geojson')

    # Use the function to load mimickers data
    mimickers_properties, mimickers_points_np, mimickers_points_df = load_geojson('mimickers.geojson')

    Properties

    Each WSI in the dataset includes the candidate's centroid, bounding box, hotspot location, hotspot mitotic count, and hotspot mitotic score. The structures of the mitosis and mimicker property dictionaries are as follows:

    Mitosis property dictionary structure:

    mitosis_properties = {
        'slide_id': slide_id,
        'slide_height': img_h,
        'slide_width': img_w,
        'wsi_mitosis_count': num_mitosis,
        'mitosis_threshold': 0.5,
        'hotspot_rect': {'x1': hotspot[0], 'y1': hotspot[1], 'x2': hotspot[2], 'y2': hotspot[3]},
        'hotspot_mitosis_count': mitosis_count,
        'hotspot_mitosis_score': mitosis_score,
    }

    Proxy figure (mimicker) property dictionary structure:

    mimicker_properties = {
        'slide_id': slide_id,
        'slide_height': img_h,
        'slide_width': img_w,
        'wsi_mimicker_count': num_mimicker,
        'mitosis_threshold': 0.5,
    }
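
    To show how the property dictionary and the point features might be used together, the sketch below selects the detections that fall inside the hotspot rectangle. It assumes that `hotspot_rect` and the point coordinates share one pixel coordinate frame, that the rectangle bounds are inclusive, and that each feature carries a `score` property as in the loader above; the helper names and the toy slide are hypothetical.

```python
def points_in_hotspot(geojson):
    """Return (x, y, score) for detections inside the slide's hotspot rectangle.

    Assumes slide-level fields under "properties" and one point feature per
    detection with a "score" property, as in the structures described above.
    """
    props = geojson["properties"]
    rect = props["hotspot_rect"]
    threshold = props["mitosis_threshold"]
    selected = []
    for feat in geojson["features"]:
        x, y = feat["geometry"]["coordinates"][:2]
        score = feat["properties"]["score"]
        inside = rect["x1"] <= x <= rect["x2"] and rect["y1"] <= y <= rect["y2"]
        if inside and score >= threshold:
            selected.append((x, y, score))
    return selected


def make_point(x, y, score):
    """Build a minimal GeoJSON point feature (illustrative helper)."""
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [x, y]},
        "properties": {"score": score},
    }


# Synthetic slide with made-up values, shaped like the mitosis dictionary above.
toy_slide = {
    "type": "FeatureCollection",
    "properties": {
        "slide_id": "toy-slide",
        "mitosis_threshold": 0.5,
        "hotspot_rect": {"x1": 100, "y1": 100, "x2": 200, "y2": 200},
    },
    "features": [
        make_point(150, 150, 0.9),   # inside the hotspot, above threshold
        make_point(120, 180, 0.4),   # inside the hotspot, below threshold
        make_point(500, 500, 0.95),  # outside the hotspot
    ],
}
hits = points_in_hotspot(toy_slide)
```

    For a real file, the dictionary returned by `json.load` can be passed in directly. Whether the resulting count agrees with `hotspot_mitosis_count` depends on how the authors defined the hotspot, which is not stated here.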

    Disclaimer:

    It should be noted that we did not conduct a comprehensive review of all mitotic figures within each WSI, and we do not purport these to be free of errors. Nonetheless, a pathologist examined the resultant hotspot regions of interest from 757 WSIs within the TCGA-BRCA Mitosis Dataset, where we found strong correlations between pathologist and MDFS mitotic counts (r = 0.8, p < 0.001). Furthermore, MDFS-derived mitosis scores are shown to be as prognostic as pathologist-assigned mitosis scores [1]. This examination was also aimed at verifying the quality of the selections, ensuring that they were not primarily driven by excessive false detections or artifacts and that they were in a plausible location in the tumor landscape.

    [1] Ibrahim, Asmaa, et al. "Artificial Intelligence-Based Mitosis Scoring in Breast Cancer: Clinical Application." Modern Pathology 37.3 (2024): 100416.

  15. BRCA non-paired sample isoform level read counts

    • figshare.com
    application/gzip
    Updated Jan 19, 2016
    + more versions
    Cite
    Endre Sebestyén (2016). BRCA non-paired sample isoform level read counts [Dataset]. http://doi.org/10.6084/m9.figshare.1059133.v1
    Explore at:
    Available download formats: application/gzip
    Dataset updated
    Jan 19, 2016
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Endre Sebestyén
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    TCGA BRCA non-paired sample isoform level read counts from Level 3 RNASeq-v2 data.

  16. TCGA-BRCA dataset image form

    • kaggle.com
    zip
    Updated May 23, 2025
    Cite
    RAJIB BAG_1 (2025). TCGA-BRCA dataset image form [Dataset]. https://www.kaggle.com/datasets/rajibbag1/tcga-brca-dataset-image-form
    Explore at:
    Available download formats: zip (2148266 bytes)
    Dataset updated
    May 23, 2025
    Authors
    RAJIB BAG_1
    Description

    This dataset was created by RAJIB BAG_1.

  17. TCGA BRCA Diagnostic Slide (26)

    • kaggle.com
    zip
    Updated Jan 8, 2025
    + more versions
    Cite
    Arya Z.E. (2025). TCGA BRCA Diagnostic Slide (26) [Dataset]. https://www.kaggle.com/datasets/aryaze/tcga-brca-diagnostic-slide-26/code
    Explore at:
    Available download formats: zip (37570678718 bytes)
    Dataset updated
    Jan 8, 2025
    Authors
    Arya Z.E.
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset was created by Arya Z.E.

    Released under CC0: Public Domain.

  18. Additional file 6: Table S5. of Identification of mRNA isoform switching in...

    • springernature.figshare.com
    xlsx
    Updated Jun 2, 2023
    + more versions
    Cite
    Wei Zhao; Katherine Hoadley; Joel Parker; Charles Perou (2023). Additional file 6: Table S5. of Identification of mRNA isoform switching in breast cancer [Dataset]. http://doi.org/10.6084/m9.figshare.c.3636521_D8.v1
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Jun 2, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Wei Zhao; Katherine Hoadley; Joel Parker; Charles Perou
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    470 pairs of isoforms with switching events in the TCGA BRCA dataset. For each isoform pair, two clusters of samples with differential isoform expression patterns were identified by K-means clustering and summarized on the complete data set (K-means cluster sample size (overall)) and on the subset in which both isoforms were detected (K-means cluster sample size (detected)). (XLSX 63 kb)
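
    The clustering step mentioned above can be illustrated with a small two-cluster K-means (Lloyd's algorithm) in plain Python. This is a sketch under assumptions, not the authors' pipeline: it treats each sample as a pair of expression values (isoform A, isoform B), uses a deterministic initialization chosen for reproducibility, and runs on made-up numbers.

```python
from collections import Counter


def two_means(points, iterations=20):
    """Cluster 2-D points into two groups with Lloyd's algorithm (k = 2)."""
    # Deterministic start: the two samples with the most extreme a - b difference.
    ordered = sorted(points, key=lambda p: p[0] - p[1])
    centers = [ordered[0], ordered[-1]]
    labels = []
    for _ in range(iterations):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min((0, 1), key=lambda k: (p[0] - centers[k][0]) ** 2
                + (p[1] - centers[k][1]) ** 2)
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        new_centers = []
        for k in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centers.append((
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                ))
            else:
                new_centers.append(centers[k])
        if new_centers == centers:
            break  # assignments are stable
        centers = new_centers
    return labels, centers


# Made-up expression pairs: three samples favor isoform A, three favor isoform B.
toy_samples = [(9.0, 1.0), (8.5, 1.5), (9.5, 0.5), (1.0, 9.0), (1.5, 8.0), (0.5, 9.5)]
toy_labels, toy_centers = two_means(toy_samples)
cluster_sizes = Counter(toy_labels)
```

    The per-cluster counts in `cluster_sizes` correspond in spirit to the "K-means cluster sample size" columns, although the exact features and settings used for the table are not given in this listing.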

  19. Table_3_Integrative Analysis of DNA Methylation and Gene Expression to...

    • datasetcatalog.nlm.nih.gov
    Updated Dec 7, 2020
    + more versions
    Cite
    Wei, Minjie; Jiang, Longyang; Zhang, Ming; Wang, Yilin; Wang, Yan; Gao, Hua; Li, Xueping; Zhao, Lin (2020). Table_3_Integrative Analysis of DNA Methylation and Gene Expression to Determine Specific Diagnostic Biomarkers and Prognostic Biomarkers of Breast Cancer.XLSX [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000539955
    Explore at:
    Dataset updated
    Dec 7, 2020
    Authors
    Wei, Minjie; Jiang, Longyang; Zhang, Ming; Wang, Yilin; Wang, Yan; Gao, Hua; Li, Xueping; Zhao, Lin
    Description

    Background: DNA methylation is a common event in the early development of various tumors, including breast cancer (BRCA), and has been studied as a potential tumor biomarker. Although previous studies have reported a cluster of aberrant promoter methylation changes in BRCA, none of these research groups have proved the specificity of these DNA methylation changes. Here we aimed to identify specific DNA methylation signatures in BRCA which can be used as diagnostic and prognostic markers.

    Methods: Differentially methylated sites were identified using the Cancer Genome Atlas (TCGA) BRCA data set. We screened for BRCA-differential methylation by comparing methylation profiles of BRCA patients, healthy breast biopsies and blood samples. These differentially methylated sites were compared to nine main cancer samples to identify BRCA-specific methylated sites. A BayesNet model was built to distinguish BRCA patients from healthy donors. The model was validated using three independent Gene Expression Omnibus (GEO) data sets. In addition, we carried out Cox regression analysis to identify DNA methylation markers that are significantly related to the overall survival (OS) rate of BRCA patients and verified them in the validation cohort.

    Results: We identified seven differentially methylated sites (DMSs) that were highly correlated with cell cycle as potential specific diagnostic biomarkers for BRCA patients. The combination of 7 DMSs achieved ~94% sensitivity in predicting BRCA, ~95% specificity comparing healthy vs. cancer samples, and ~88% specificity in excluding other cancers. The 7 DMSs were highly correlated with cell cycle. We also identified 6 methylation sites that are highly correlated with the OS of BRCA patients and can be used to accurately predict the survival of BRCA patients (training cohort: likelihood ratio = 70.25, p = 3.633 × 10^-13, area under the curve (AUC) = 0.784; validation cohort: AUC = 0.734). Stratification analysis by age, clinical stage, tumor type, and chemotherapy retained statistical significance.

    Conclusion: In summary, our study demonstrated the role of methylation profiles in the diagnosis and prognosis of BRCA. This signature is superior to currently published methylation markers for diagnosis and prognosis for BRCA patients. It can be used as a promising biomarker for early diagnosis and prognosis of BRCA.

  20. Breast Invasive Carcinoma (TCGA, Firehose Legacy)

    • datacatalog.mskcc.org
    Updated Sep 16, 2020
    Cite
    Broad Institute (2020). Breast Invasive Carcinoma (TCGA, Firehose Legacy) [Dataset]. https://datacatalog.mskcc.org/dataset/10468
    Explore at:
    Dataset updated
    Sep 16, 2020
    Dataset provided by
    Broad Institute
    MSK Library
    Description

    TCGA Breast Invasive Carcinoma. Source data from GDAC Firehose. Previously known as TCGA Provisional. This dataset contains summary data visualizations and clinical data from a broad sampling of 1,108 carcinomas from 1,101 patients. The data was gathered as part of the Broad Institute of MIT and Harvard Firehose initiative, a cancer analysis pipeline. The clinical data includes mutation count, information about mutated genes, patient demographics, sample type, disease code, Adjuvant Postoperative Pharmaceutical Therapy Administered Indicator, American Joint Committee on Cancer Metastasis Stage Code, American Joint Committee on Cancer Publication Version Type, American Joint Committee on Cancer Tumor Stage Code, Brachytherapy first reference point administered total dose, Cent17 Copy Number, and the Days to Sample Collection. The dataset includes Next-Generation Clustered Heat Maps (NG-CHM) viewable via an embedded NG-CHM Heat Map Viewer, provided by MD Anderson Cancer Center, which provides a graphical environment for exploration of clustered or non-clustered heat map data. The data set also includes copy-number segment data downloadable as .seg files and viewable via the Integrative Genomics Viewer.

The Cancer Imaging Archive (2014). The Cancer Genome Atlas Breast Invasive Carcinoma Collection [Dataset]. http://doi.org/10.7937/K9/TCIA.2016.AB2NAZRP

The Cancer Genome Atlas Breast Invasive Carcinoma Collection

TCGA-BRCA

Explore at:
89 scholarly articles cite this dataset (View in Google Scholar)