10 datasets found
  1. Genomic Data Commons Data Portal (GDC Data Portal)

    • rrid.site
    • neuinfo.org
    • +2 more
    Updated Mar 12, 2025
    Cite
    (2025). Genomic Data Commons Data Portal (GDC Data Portal) [Dataset]. http://identifiers.org/RRID:SCR_014514
    Dataset updated
    Mar 12, 2025
    Description

    A unified data repository of the National Cancer Institute (NCI)'s Genomic Data Commons (GDC) that enables data sharing across cancer genomic studies in support of precision medicine. The GDC supports several cancer genome programs at the NCI Center for Cancer Genomics (CCG), including The Cancer Genome Atlas (TCGA), Therapeutically Applicable Research to Generate Effective Treatments (TARGET), and the Cancer Genome Characterization Initiative (CGCI). The GDC Data Portal provides a platform for efficiently querying and downloading high quality and complete data. The GDC also provides a GDC Data Transfer Tool and a GDC API for programmatic access.
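    As a rough illustration of the programmatic access mentioned above, the sketch below builds (but does not send) a search URL for the GDC API. The base URL is taken from the metadata elsewhere in this listing; the `files` endpoint, the filter fields, and the helper name `build_files_query` are assumptions made for illustration and should be checked against the GDC API documentation before use.

```python
import json
from urllib.parse import urlencode

# Base URL as it appears in this listing's metadata; everything below it is an assumption.
GDC_API = "https://api.gdc.cancer.gov"


def build_files_query(project_id, data_type, size=10):
    """Return a GET URL that searches GDC file records by project and data type.

    Hypothetical helper: the endpoint name and field names are assumed, not
    taken from this listing.
    """
    filters = {
        "op": "and",
        "content": [
            {"op": "in", "content": {"field": "cases.project.project_id", "value": [project_id]}},
            {"op": "in", "content": {"field": "data_type", "value": [data_type]}},
        ],
    }
    params = {
        "filters": json.dumps(filters),          # filters are passed as a JSON string
        "fields": "file_id,file_name,data_type",  # limit the returned columns
        "format": "JSON",
        "size": str(size),                        # number of records per page
    }
    return f"{GDC_API}/files?{urlencode(params)}"


url = build_files_query("TCGA-COAD", "Gene Expression Quantification")
```

    The resulting URL can be passed to any HTTP client; no request is made by the sketch itself.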

  2. Historical NCI Genomic Data Commons data (09-14-2017)

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jan 24, 2020
    Cite
    Seim, Inge (2020). Historical NCI Genomic Data Commons data (09-14-2017) [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_1186944
    Dataset updated
    Jan 24, 2020
    Dataset authored and provided by
    Seim, Inge
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    Historical NCI Genomic Data Commons data (v09-14-2017). Clinical ('phenotype') and gene expression (HTSeq FPKM-UQ).

    TCGA-COAD.GDC_phenotype.tsv

    dataset: phenotype - Phenotype

    cohort: GDC TCGA Colon Cancer (COAD)
    dataset ID: TCGA-COAD/Xena_Matrices/TCGA-COAD.GDC_phenotype.tsv
    download: https://gdc.xenahubs.net/download/TCGA-COAD/Xena_Matrices/TCGA-COAD.GDC_phenotype.tsv.gz
    samples: 570
    version: 11-27-2017
    hub: https://gdc.xenahubs.net
    type of data: phenotype
    author: Genomic Data Commons
    raw data: https://docs.gdc.cancer.gov/Data/Release_Notes/Data_Release_Notes/#data-release-90
    raw data: https://api.gdc.cancer.gov/data/
    input data format: ROWs (samples) x COLUMNs (identifiers) (i.e. clinicalMatrix)
    size: 570 samples x 151 identifiers

    TCGA-COAD.htseq_fpkm-uq.tsv

    dataset: gene expression RNAseq - HTSeq - FPKM-UQ

    cohort: GDC TCGA Colon Cancer (COAD)
    dataset ID: TCGA-COAD/Xena_Matrices/TCGA-COAD.htseq_fpkm-uq.tsv
    download: https://gdc.xenahubs.net/download/TCGA-COAD/Xena_Matrices/TCGA-COAD.htseq_fpkm-uq.tsv.gz
    samples: 512
    version: 09-14-2017
    hub: https://gdc.xenahubs.net
    type of data: gene expression RNAseq
    unit: log2(fpkm-uq+1)
    platform: Illumina
    ID/Gene Mapping: https://gdc.xenahubs.net/download/probeMaps/gencode.v22.annotation.gene.probeMap.gz
    author: Genomic Data Commons
    raw data: https://docs.gdc.cancer.gov/Data/Release_Notes/Data_Release_Notes/#data-release-80
    raw data: https://api.gdc.cancer.gov/data/
    wrangling: Data from the same sample but from different vials/portions/analytes/aliquots are averaged; data from different samples are combined into a genomicMatrix; all data are then log2(x+1) transformed.
    input data format: ROWs (identifiers) x COLUMNs (samples) (i.e. genomicMatrix)
    size: 60,484 identifiers x 512 samples
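
    The wrangling described for the expression matrix (average replicate measurements per sample, combine samples into one identifiers-by-samples matrix, then apply log2(x+1)) could be sketched roughly as follows. This is not the hub's actual pipeline; the function name `average_and_log2` and the input layout (pairs of sample ID and per-gene values) are assumptions made for illustration.

```python
import math
from collections import defaultdict


def average_and_log2(aliquots):
    """Build a gene-by-sample matrix from per-aliquot measurements.

    aliquots: iterable of (sample_id, {gene_id: value}) pairs, where several
    pairs may share a sample_id (hypothetical input layout).
    Returns {gene_id: {sample_id: log2(mean + 1)}}.
    """
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(lambda: defaultdict(int))
    for sample_id, values in aliquots:
        for gene_id, value in values.items():
            sums[sample_id][gene_id] += value
            counts[sample_id][gene_id] += 1

    matrix = defaultdict(dict)  # rows are identifiers, columns are samples
    for sample_id, gene_sums in sums.items():
        for gene_id, total in gene_sums.items():
            mean = total / counts[sample_id][gene_id]      # average replicates first
            matrix[gene_id][sample_id] = math.log2(mean + 1)  # then transform
    return dict(matrix)
```

    Averaging happens before the log transform here, following the order in which the steps are listed above.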

  3. List of all reprocessed vs. reprocessed differentially expressed genes...

    • plos.figshare.com
    csv
    Updated Mar 4, 2025
    Cite
    Ling-Hong Hung; Bryce Fukuda; Robert Schmitz; Varik Hoang; Wes Lloyd; Ka Yee Yeung (2025). List of all reprocessed vs. reprocessed differentially expressed genes (DEGs) comparing tumor data from the GDC and normal data from the GTEx. [Dataset]. http://doi.org/10.1371/journal.pone.0318676.s004
    Available download formats: csv
    Dataset updated
    Mar 4, 2025
    Dataset provided by
    PLOS http://plos.org/
    Authors
    Ling-Hong Hung; Bryce Fukuda; Robert Schmitz; Varik Hoang; Wes Lloyd; Ka Yee Yeung
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Reprocessed counts were generated using our GDC RNA-seq workflow implementation. NA rank changes indicate the DEG cannot be found in the other DEG list. (CSV)
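
    One way to read the "NA rank change" convention: a gene's rank change is the difference between its positions in two ranked DEG lists, and it is undefined when the gene is missing from the other list. The sketch below assumes that reading; the function name, the sign convention (rank in the other list minus rank in this list), and the use of `None` for NA are illustrative choices, not taken from the dataset.

```python
def rank_changes(deg_list_a, deg_list_b):
    """Map each gene in deg_list_a to its rank change relative to deg_list_b.

    Ranks are 1-based list positions. A gene absent from deg_list_b gets None,
    standing in for the NA value described in the dataset.
    """
    rank_b = {gene: i for i, gene in enumerate(deg_list_b, start=1)}
    changes = {}
    for rank_a, gene in enumerate(deg_list_a, start=1):
        changes[gene] = rank_b[gene] - rank_a if gene in rank_b else None
    return changes


# Placeholder gene names, not real results.
example = rank_changes(["geneA", "geneB", "geneC"], ["geneB", "geneA", "geneD"])
```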

  4. Comparison of the top 10 differentially expressed genes inferred from...

    • figshare.com
    xls
    Updated Mar 4, 2025
    Cite
    Ling-Hong Hung; Bryce Fukuda; Robert Schmitz; Varik Hoang; Wes Lloyd; Ka Yee Yeung (2025). Comparison of the top 10 differentially expressed genes inferred from concatenation of published counts (“published vs published”) versus those inferred from harmonized uniform GDC re-processing (“reprocessed vs reprocessed”). [Dataset]. http://doi.org/10.1371/journal.pone.0318676.t002
    Available download formats: xls
    Dataset updated
    Mar 4, 2025
    Dataset provided by
    PLOS ONE
    Authors
    Ling-Hong Hung; Bryce Fukuda; Robert Schmitz; Varik Hoang; Wes Lloyd; Ka Yee Yeung
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Comparison of the top 10 differentially expressed genes inferred from concatenation of published counts (“published vs published”) versus those inferred from harmonized uniform GDC re-processing (“reprocessed vs reprocessed”).

  5. Comparison of counts resulting from running our GDC RNA-seq workflow...

    • figshare.com
    xlsx
    Updated Mar 4, 2025
    Cite
    Ling-Hong Hung; Bryce Fukuda; Robert Schmitz; Varik Hoang; Wes Lloyd; Ka Yee Yeung (2025). Comparison of counts resulting from running our GDC RNA-seq workflow implementation (reprocessed counts) to GDC published counts. [Dataset]. http://doi.org/10.1371/journal.pone.0318676.s005
    Available download formats: xlsx
    Dataset updated
    Mar 4, 2025
    Dataset provided by
    PLOS ONE
    Authors
    Ling-Hong Hung; Bryce Fukuda; Robert Schmitz; Varik Hoang; Wes Lloyd; Ka Yee Yeung
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    There are three sheets in this spreadsheet file, corresponding to each of the three samples (TCGA-AB-2821, TCGA-AB-2828, TCGA-AB-2839). Correlation and RMSD between the reprocessed counts and published counts are included in each sheet. (XLSX)
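
    For readers unfamiliar with the two agreement measures named above, here is a minimal sketch of how correlation and RMSD between two count vectors can be computed. The description does not say which correlation coefficient the authors used; Pearson's is assumed here, and both function names are illustrative.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def rmsd(x, y):
    """Root-mean-square deviation between two equal-length sequences."""
    if len(x) != len(y) or not x:
        raise ValueError("need two non-empty sequences of equal length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))
```

    The two measures answer different questions: a correlation near 1 shows that the counts rise and fall together, while a small RMSD shows that the values themselves are close.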

  6. Sample datasets used in [Gil et al 2017] for AAAI 2017

    • data.niaid.nih.gov
    Updated Jan 24, 2020
    Cite
    Adusumilli, Ravali (2020). Sample datasets used in [Gil et al 2017] for AAAI 2017 [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_180716
    Dataset updated
    Jan 24, 2020
    Dataset authored and provided by
    Adusumilli, Ravali
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This file includes pointers to the 42 patient IDs and the zip file names of the 84 genomic and proteomic datasets used for the paper "Gil, Y, Garijo, D, Ratnakar, V, Mayani, R, Adusumilli, A, Srivastava, R, Boyce, H, Mallick, P. Towards Continuous Scientific Data Analysis and Hypothesis Evolution", accepted at AAAI 2017.

    The datasets themselves are not published because of their size and access conditions. They can be retrieved with the provided IDs from the TCGA (https://gdc-portal.nci.nih.gov/legacy-archive/search/f) and CPTAC (https://cptac-data-portal.georgetown.edu/cptac/s/S022) archives.

    These patient IDs are a subset of the nearly 90 samples used in "Zhang, B., Wang, J., Wang, X., Zhu, J., Liu, Q., et al. Proteogenomic characterization of human colon and rectal cancer. Nature 513, 382–387", and were chosen to test the system described in the AAAI 2017 paper. More samples were not included in the analysis because of time constraints.

  7. List of 625 false positive genes resulted from comparing GTEx published...

    • figshare.com
    csv
    Updated Mar 4, 2025
    Cite
    Ling-Hong Hung; Bryce Fukuda; Robert Schmitz; Varik Hoang; Wes Lloyd; Ka Yee Yeung (2025). List of 625 false positive genes resulted from comparing GTEx published counts versus GTEx reprocessed counts. [Dataset]. http://doi.org/10.1371/journal.pone.0318676.s002
    Available download formats: csv
    Dataset updated
    Mar 4, 2025
    Dataset provided by
    PLOS ONE
    Authors
    Ling-Hong Hung; Bryce Fukuda; Robert Schmitz; Varik Hoang; Wes Lloyd; Ka Yee Yeung
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    List of 625 false positive genes resulted from comparing GTEx published counts versus GTEx reprocessed counts.

  8. The Cancer Genome Atlas Liver Hepatocellular Carcinoma Collection

    • cancerimagingarchive.net
    dicom, n/a
    Updated May 29, 2020
    Cite
    The Cancer Imaging Archive (2020). The Cancer Genome Atlas Liver Hepatocellular Carcinoma Collection [Dataset]. http://doi.org/10.7937/K9/TCIA.2016.IMMQW8UQ
    Available download formats: n/a, dicom
    Dataset updated
    May 29, 2020
    Dataset authored and provided by
    The Cancer Imaging Archive
    License

    https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/

    Time period covered
    May 29, 2020
    Dataset funded by
    National Cancer Institute http://www.cancer.gov/
    Description

    The Cancer Genome Atlas Liver Hepatocellular Carcinoma (TCGA-LIHC) data collection is part of a larger effort to build a research community focused on connecting cancer phenotypes to genotypes by providing clinical images matched to subjects from The Cancer Genome Atlas (TCGA). Clinical, genetic, and pathological data reside in the Genomic Data Commons (GDC) Data Portal, while the radiological data are stored on The Cancer Imaging Archive (TCIA).

    Matched TCGA patient identifiers allow researchers to explore the TCGA/TCIA databases for correlations between tissue genotype, radiological phenotype and patient outcomes. Tissues for TCGA were collected from many sites all over the world in order to reach their accrual targets, usually around 500 specimens per cancer type. For this reason the image data sets are also extremely heterogeneous in terms of scanner modalities, manufacturers and acquisition protocols. In most cases the images were acquired as part of routine care and not as part of a controlled research study or clinical trial.

    CIP TCGA Radiology Initiative

    Imaging Source Site (ISS) Groups are being populated and governed by participants from institutions that have provided imaging data to the archive for a given cancer type. Modeled after TCGA analysis groups, ISS groups are given the opportunity to publish a marker paper for a given cancer type per the guidelines in the table above. This opportunity will generate increased participation in building these multi-institutional data sets as they become an open community resource. Learn more about the CIP TCGA Radiology Initiative.

  9. Table_1_Characteristics of Familial Lung Cancer in Yunnan-Guizhou Plateau of...

    • figshare.com
    docx
    Updated Jun 3, 2023
    Cite
    Xiaojie Ding; Ying Chen; Jiapeng Yang; Guangjian Li; Huatao Niu; Rui He; Jie Zhao; Huanqi Ning (2023). Table_1_Characteristics of Familial Lung Cancer in Yunnan-Guizhou Plateau of China.docx [Dataset]. http://doi.org/10.3389/fonc.2018.00637.s002
    Available download formats: docx
    Dataset updated
    Jun 3, 2023
    Dataset provided by
    Frontiers
    Authors
    Xiaojie Ding; Ying Chen; Jiapeng Yang; Guangjian Li; Huatao Niu; Rui He; Jie Zhao; Huanqi Ning
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Yunnan, China
    Description

    Background: Lung cancer has inherited susceptibility and shows familial aggregation; the characteristics of familial lung cancer exhibit population heterogeneity. Despite previous studies, familial lung cancer in China's Yunnan-Guizhou plateau remains understudied.

    Methods: Between 2015 and 2017, 1,023 lung cancer patients (residents of Yunnan-Guizhou plateau) were enrolled with no limitation on other parameters; 152 subjects had familial lung cancer. Clinicopathologic parameters were analyzed and compared; 4,754 lung cancer patients from NCI-GDC were used to represent a general population.

    Results: Familial lung cancer (FLC) subjects showed unique characters: early-onset; increased rate of female, adenocarcinoma, stage IV and other cancer history; unbalance in anatomic sites; all ruling out significant difference in smoking status. Unbalanced distribution of co-existing diseases or symptoms was also discovered. FLC patients were more likely to develop benign lesions (polyps, nodules, cysts) early in life, especially early-growth of multiple pulmonary nodules at higher frequency. Typical diseases with family history like diabetes and hypertension were also increased in the FLC population. Compared to GDC data, our subject population was younger: the age peak of our FLC group was in 50–59; our sporadic group had an age peak around 60; while GDC patients' age peak was in 60–69. Importantly, the biggest difference happened in age 40–49: our FLC group and sporadic group had 3 times and 2 times higher ratio than the GDC population, respectively. Moreover, the age peaks of our FLC males and FLC females were both in 50–59, while our sporadic females had the age peak in 50–59, much earlier than sporadic males (around 60–69), reflecting gender-specific or age-specific characters in our subject population.

    Conclusions: Familial lung cancer in China's Yunnan-Guizhou plateau showed unique clinicopathologic characters; differences were found in gender, age, histologic type, TNM stage and co-existing diseases or symptoms. Identification of hereditary factors which lead to increased lung cancer risk will be a challenge of both scientific and clinical significance.

  10. A comparison for the precision values of the top 15 ranked genes related to...

    • figshare.com
    xls
    Updated Jun 9, 2023
    + more versions
    Cite
    Wonjun Choi; Hyunju Lee (2023). A comparison for the precision values of the top 15 ranked genes related to each cancer type by each centrality measure and against NCI’s GDC and by each approach. [Dataset]. http://doi.org/10.1371/journal.pone.0258626.t007
    Available download formats: xls
    Dataset updated
    Jun 9, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Wonjun Choi; Hyunju Lee
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    A comparison for the precision values of the top 15 ranked genes related to each cancer type by each centrality measure and against NCI’s GDC and by each approach.
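
    The description does not define how the precision of the top 15 ranked genes was calculated. A common reading is precision at k: the fraction of the k highest-ranked genes that also appear in a reference set (here, presumably genes the GDC associates with the cancer type). The sketch below assumes that definition; the function name and inputs are illustrative.

```python
def precision_at_k(ranked_genes, reference_genes, k=15):
    """Fraction of the top-k ranked genes that are present in reference_genes.

    Divides by k even if fewer than k genes were ranked, so a short ranking
    is penalised rather than rewarded (an assumed convention).
    """
    if k <= 0:
        raise ValueError("k must be positive")
    reference = set(reference_genes)
    hits = sum(1 for gene in ranked_genes[:k] if gene in reference)
    return hits / k
```

    Each centrality measure or approach would supply its own `ranked_genes`, while `reference_genes` stays fixed for a given cancer type.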


Genomic Data Commons Data Portal (GDC Data Portal)

RRID:SCR_014514, Genomic Data Commons Data Portal (GDC Data Portal) (RRID:SCR_014514), Genomic Data Commons Data Portal, GDC Data Portal

71 scholarly articles cite this dataset.