10 datasets found

Genomic Data Commons Data Portal (GDC Data Portal)
rrid.site
neuinfo.org
Updated Mar 12, 2025
Cite
(2025). Genomic Data Commons Data Portal (GDC Data Portal) [Dataset]. http://identifiers.org/RRID:SCR_014514
Unique identifier
https://identifiers.org/RRID:SCR_014514
Dataset updated
Mar 12, 2025
Description
A unified data repository of the National Cancer Institute (NCI)'s Genomic Data Commons (GDC) that enables data sharing across cancer genomic studies in support of precision medicine. The GDC supports several cancer genome programs at the NCI Center for Cancer Genomics (CCG), including The Cancer Genome Atlas (TCGA), Therapeutically Applicable Research to Generate Effective Treatments (TARGET), and the Cancer Genome Characterization Initiative (CGCI). The GDC Data Portal provides a platform for efficiently querying and downloading high quality and complete data. The GDC also provides a GDC Data Transfer Tool and a GDC API for programmatic access.
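As a rough illustration of what programmatic access could look like, the sketch below builds a request URL for listing projects. It assumes the public base URL https://api.gdc.cancer.gov (the host that appears elsewhere in this listing) and its `projects` endpoint with `size`, `fields`, and `format` query parameters; the helper name and the chosen fields are illustrative only, and sending the request is left to the caller.

```python
from urllib.parse import urlencode

# Assumption: public GDC API base URL; the host also appears in this listing.
GDC_API_BASE = "https://api.gdc.cancer.gov"


def build_projects_url(size=10, fields=("project_id", "name")):
    """Build a URL for listing GDC projects (hypothetical helper, no network I/O)."""
    query = urlencode(
        {
            "size": size,                # number of records to return
            "fields": ",".join(fields),  # comma-separated list of fields
            "format": "json",            # response format
        }
    )
    return f"{GDC_API_BASE}/projects?{query}"


if __name__ == "__main__":
    print(build_projects_url(size=5))
```

The URL can then be passed to any HTTP client; keeping URL construction separate from the request makes the query easy to inspect before any data is fetched.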
Historical NCI Genomic Data Commons data (09-14-2017)
data.niaid.nih.gov
zenodo.org
Updated Jan 24, 2020
Cite
Seim, Inge (2020). Historical NCI Genomic Data Commons data (09-14-2017) [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_1186944
Dataset updated
Jan 24, 2020
Dataset authored and provided by
Seim, Inge
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
Historical NCI Genomic Data Commons data (v09-14-2017). Clinical ('phenotype') and gene expression (HTSeq FPKM-UQ).

TCGA-COAD.GDC_phenotype.tsv

dataset: phenotype - Phenotype

cohort: GDC TCGA Colon Cancer (COAD)
dataset ID: TCGA-COAD/Xena_Matrices/TCGA-COAD.GDC_phenotype.tsv
download: https://gdc.xenahubs.net/download/TCGA-COAD/Xena_Matrices/TCGA-COAD.GDC_phenotype.tsv.gz
samples: 570
version: 11-27-2017
hub: https://gdc.xenahubs.net
type of data: phenotype
author: Genomic Data Commons
raw data: https://docs.gdc.cancer.gov/Data/Release_Notes/Data_Release_Notes/#data-release-90
raw data: https://api.gdc.cancer.gov/data/
input data format: ROWs (samples) x COLUMNs (identifiers) (i.e. clinicalMatrix)
570 samples X 151 identifiers

TCGA-COAD.htseq_fpkm-uq.tsv

dataset: gene expression RNAseq - HTSeq - FPKM-UQ

cohort: GDC TCGA Colon Cancer (COAD)
dataset ID: TCGA-COAD/Xena_Matrices/TCGA-COAD.htseq_fpkm-uq.tsv
download: https://gdc.xenahubs.net/download/TCGA-COAD/Xena_Matrices/TCGA-COAD.htseq_fpkm-uq.tsv.gz
samples: 512
version: 09-14-2017
hub: https://gdc.xenahubs.net
type of data: gene expression RNAseq
unit: log2(fpkm-uq+1)
platform: Illumina
ID/Gene Mapping: https://gdc.xenahubs.net/download/probeMaps/gencode.v22.annotation.gene.probeMap.gz
author: Genomic Data Commons
raw data: https://docs.gdc.cancer.gov/Data/Release_Notes/Data_Release_Notes/#data-release-80
raw data: https://api.gdc.cancer.gov/data/
wrangling: Data from the same sample but from different vials/portions/analytes/aliquots is averaged; data from different samples is combined into genomicMatrix; all data is then log2(x+1) transformed.
input data format: ROWs (identifiers) x COLUMNs (samples) (i.e. genomicMatrix)
60,484 identifiers X 512 samples
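
The wrangling step described above (average values from the same sample, then apply log2(x+1)) can be sketched for a single gene as follows. This is a minimal illustration, not the pipeline used to produce the file; the function name and the sample IDs in the usage example are hypothetical.

```python
import math
from collections import defaultdict


def average_then_log2(aliquot_values):
    """Average FPKM-UQ values per sample, then apply log2(x + 1).

    aliquot_values: iterable of (sample_id, fpkm_uq) pairs for one gene,
    where the same sample_id may appear several times (one per aliquot).
    """
    by_sample = defaultdict(list)
    for sample_id, value in aliquot_values:
        by_sample[sample_id].append(value)
    # Average first, transform second, matching the order in the description.
    return {
        sample_id: math.log2(sum(values) / len(values) + 1)
        for sample_id, values in by_sample.items()
    }


if __name__ == "__main__":
    # Hypothetical sample IDs and values, for illustration only.
    print(average_then_log2([("S1", 1.0), ("S1", 5.0), ("S2", 0.0)]))
```

Averaging before the transform matters: the log of a mean generally differs from the mean of logs, so reversing the order would give different values.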
List of all reprocessed vs. reprocessed differentially expressed genes...
plos.figshare.com
csv
Updated Mar 4, 2025
Cite
Ling-Hong Hung; Bryce Fukuda; Robert Schmitz; Varik Hoang; Wes Lloyd; Ka Yee Yeung (2025). List of all reprocessed vs. reprocessed differentially expressed genes (DEGs) comparing tumor data from the GDC and normal data from the GTEx. [Dataset]. http://doi.org/10.1371/journal.pone.0318676.s004
Available download formats: csv
Unique identifier
https://doi.org/10.1371/journal.pone.0318676.s004
Dataset updated
Mar 4, 2025
Dataset provided by
PLOS http://plos.org/
Authors
Ling-Hong Hung; Bryce Fukuda; Robert Schmitz; Varik Hoang; Wes Lloyd; Ka Yee Yeung
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Reprocessed counts were generated using our GDC RNA-seq workflow implementation. NA rank changes indicate the DEG cannot be found in the other DEG list. (CSV)
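One way to read the "NA rank change" convention is sketched below. It assumes that a rank change is the difference between a gene's position in the second DEG list and its position in the first, which the description does not state; `None` stands in for NA, and the gene names in the usage example are placeholders.

```python
def rank_changes(list_a, list_b):
    """Rank change of each gene in list_a relative to list_b.

    Both lists are ordered from most to least significant. A gene that is
    absent from list_b gets None, standing in for NA.
    Assumption: change = rank in list_b minus rank in list_a.
    """
    rank_b = {gene: rank for rank, gene in enumerate(list_b, start=1)}
    changes = {}
    for rank_a, gene in enumerate(list_a, start=1):
        changes[gene] = rank_b[gene] - rank_a if gene in rank_b else None
    return changes


if __name__ == "__main__":
    # Placeholder gene names, for illustration only.
    print(rank_changes(["G1", "G2", "G3"], ["G2", "G1", "G4"]))
```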
Comparison of the top 10 differentially expressed genes inferred from...
figshare.com
xls
Updated Mar 4, 2025
Cite
Ling-Hong Hung; Bryce Fukuda; Robert Schmitz; Varik Hoang; Wes Lloyd; Ka Yee Yeung (2025). Comparison of the top 10 differentially expressed genes inferred from concatenation of published counts (“published vs published”) versus those inferred from harmonized uniform GDC re-processing (“reprocessed vs reprocessed”). [Dataset]. http://doi.org/10.1371/journal.pone.0318676.t002
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0318676.t002
Dataset updated
Mar 4, 2025
Dataset provided by
PLOS ONE
Authors
Ling-Hong Hung; Bryce Fukuda; Robert Schmitz; Varik Hoang; Wes Lloyd; Ka Yee Yeung
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Comparison of the top 10 differentially expressed genes inferred from concatenation of published counts (“published vs published”) versus those inferred from harmonized uniform GDC re-processing (“reprocessed vs reprocessed”).
Comparison of counts resulting from running our GDC RNA-seq workflow...
figshare.com
xlsx
Updated Mar 4, 2025
Cite
Ling-Hong Hung; Bryce Fukuda; Robert Schmitz; Varik Hoang; Wes Lloyd; Ka Yee Yeung (2025). Comparison of counts resulting from running our GDC RNA-seq workflow implementation (reprocessed counts) to GDC published counts. [Dataset]. http://doi.org/10.1371/journal.pone.0318676.s005
Available download formats: xlsx
Unique identifier
https://doi.org/10.1371/journal.pone.0318676.s005
Dataset updated
Mar 4, 2025
Dataset provided by
PLOS ONE
Authors
Ling-Hong Hung; Bryce Fukuda; Robert Schmitz; Varik Hoang; Wes Lloyd; Ka Yee Yeung
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
There are three sheets in this spreadsheet file, corresponding to each of the three samples (TCGA-AB-2821, TCGA-AB-2828, TCGA-AB-2839). Correlation and RMSD between the reprocessed counts and published counts are included in each sheet. (XLSX)
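For readers unfamiliar with the two summary statistics, the sketch below computes a correlation and an RMSD between two count vectors. It assumes Pearson correlation, which the description does not specify, and uses made-up numbers rather than values from the spreadsheet.

```python
import math


def pearson_and_rmsd(reprocessed, published):
    """Return (Pearson r, RMSD) for two equal-length sequences of counts."""
    if len(reprocessed) != len(published) or not reprocessed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(reprocessed)
    mean_x = sum(reprocessed) / n
    mean_y = sum(published) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(reprocessed, published))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in reprocessed))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in published))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for constant input")
    r = cov / (norm_x * norm_y)
    # Root-mean-square deviation between paired values.
    rmsd = math.sqrt(sum((x - y) ** 2 for x, y in zip(reprocessed, published)) / n)
    return r, rmsd


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    print(pearson_and_rmsd([1, 2, 3, 4], [2, 4, 6, 8]))
```

The two numbers answer different questions: correlation measures whether the counts move together, while RMSD measures how far apart they are in absolute terms, so a perfectly correlated pair can still have a large RMSD.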
Sample datasets used in [Gil et al 2017] for AAAI 2017
data.niaid.nih.gov
Updated Jan 24, 2020
Cite
Adusumilli, Ravali (2020). Sample datasets used in [Gil et al 2017] for AAAI 2017 [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_180716
Dataset updated
Jan 24, 2020
Dataset authored and provided by
Adusumilli, Ravali
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This file includes pointers to the 42 patient IDs and the zip file names of the 84 genomic and proteomic datasets used for the paper "Gil, Y, Garijo, D, Ratnakar, V, Mayani, R, Adusumilli, A, Srivastava, R, Boyce, H, Mallick, P. Towards Continuous Scientific Data Analysis and Hypothesis Evolution", accepted at AAAI 2017.

The datasets themselves are not published due to their size and access conditions. They can be retrieved with the provided IDs from the TCGA (https://gdc-portal.nci.nih.gov/legacy-archive/search/f) and CPTAC (https://cptac-data-portal.georgetown.edu/cptac/s/S022) archives.

These patient IDs are a subset of the nearly 90 samples used in "Zhang, B., Wang, J., Wang, X., Zhu, J., Liu, Q., et al. Proteogenomic characterization of human colon and rectal cancer. Nature 513, 382–387", and were used to test the system described in the AAAI 2017 paper. More samples were not included in the analysis due to time constraints.
List of 625 false positive genes resulted from comparing GTEx published...
figshare.com
csv
Updated Mar 4, 2025
Cite
Ling-Hong Hung; Bryce Fukuda; Robert Schmitz; Varik Hoang; Wes Lloyd; Ka Yee Yeung (2025). List of 625 false positive genes resulted from comparing GTEx published counts versus GTEx reprocessed counts. [Dataset]. http://doi.org/10.1371/journal.pone.0318676.s002
Available download formats: csv
Unique identifier
https://doi.org/10.1371/journal.pone.0318676.s002
Dataset updated
Mar 4, 2025
Dataset provided by
PLOS ONE
Authors
Ling-Hong Hung; Bryce Fukuda; Robert Schmitz; Varik Hoang; Wes Lloyd; Ka Yee Yeung
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
List of 625 false-positive genes resulting from comparing GTEx published counts versus GTEx reprocessed counts.
The Cancer Genome Atlas Liver Hepatocellular Carcinoma Collection
cancerimagingarchive.net
dicom, n/a
Updated May 29, 2020
Cite
The Cancer Imaging Archive (2020). The Cancer Genome Atlas Liver Hepatocellular Carcinoma Collection [Dataset]. http://doi.org/10.7937/K9/TCIA.2016.IMMQW8UQ
Available download formats: n/a, dicom
Unique identifier
https://doi.org/10.7937/K9/TCIA.2016.IMMQW8UQ
Dataset updated
May 29, 2020
Dataset authored and provided by
The Cancer Imaging Archive
License
https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/
Time period covered
May 29, 2020
Dataset funded by
National Cancer Institute http://www.cancer.gov/
Description
The Cancer Genome Atlas Liver Hepatocellular Carcinoma (TCGA-LIHC) data collection is part of a larger effort to build a research community focused on connecting cancer phenotypes to genotypes by providing clinical images matched to subjects from The Cancer Genome Atlas (TCGA). Clinical, genetic, and pathological data reside in the Genomic Data Commons (GDC) Data Portal, while the radiological data are stored on The Cancer Imaging Archive (TCIA).
Matched TCGA patient identifiers allow researchers to explore the TCGA/TCIA databases for correlations between tissue genotype, radiological phenotype and patient outcomes. Tissues for TCGA were collected from many sites all over the world in order to reach their accrual targets, usually around 500 specimens per cancer type. For this reason the image data sets are also extremely heterogeneous in terms of scanner modalities, manufacturers and acquisition protocols. In most cases the images were acquired as part of routine care and not as part of a controlled research study or clinical trial.
CIP TCGA Radiology Initiative
Imaging Source Site (ISS) Groups are being populated and governed by participants from institutions that have provided imaging data to the archive for a given cancer type. Modeled after TCGA analysis groups, ISS groups are given the opportunity to publish a marker paper for a given cancer type per the guidelines in the table above. This opportunity will generate increased participation in building these multi-institutional data sets as they become an open community resource. Learn more about the CIP TCGA Radiology Initiative.
Table_1_Characteristics of Familial Lung Cancer in Yunnan-Guizhou Plateau of...
figshare.com
docx
Updated Jun 3, 2023
Cite
Xiaojie Ding; Ying Chen; Jiapeng Yang; Guangjian Li; Huatao Niu; Rui He; Jie Zhao; Huanqi Ning (2023). Table_1_Characteristics of Familial Lung Cancer in Yunnan-Guizhou Plateau of China.docx [Dataset]. http://doi.org/10.3389/fonc.2018.00637.s002
Available download formats: docx
Unique identifier
https://doi.org/10.3389/fonc.2018.00637.s002
Dataset updated
Jun 3, 2023
Dataset provided by
Frontiers
Authors
Xiaojie Ding; Ying Chen; Jiapeng Yang; Guangjian Li; Huatao Niu; Rui He; Jie Zhao; Huanqi Ning
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Yunnan, China
Description
Background: Lung cancer has inherited susceptibility and shows familial aggregation; the characteristics of familial lung cancer exhibit population heterogeneity. Despite previous studies, familial lung cancer in China's Yunnan-Guizhou plateau remains understudied.
Methods: Between 2015 and 2017, 1,023 lung cancer patients (residents of Yunnan-Guizhou plateau) were enrolled with no limitation on other parameters; 152 subjects had familial lung cancer. Clinicopathologic parameters were analyzed and compared; 4,754 lung cancer patients from NCI-GDC were used to represent a general population.
Results: Familial lung cancer (FLC) subjects showed unique characters: early-onset; increased rate of female, adenocarcinoma, stage IV and other cancer history; unbalance in anatomic sites; all ruling out significant difference in smoking status. Unbalanced distribution of co-existing diseases or symptoms was also discovered. FLC patients were more likely to develop benign lesions (polyps, nodules, cysts) early in life, especially early-growth of multiple pulmonary nodules at higher frequency. Typical diseases with family history like diabetes and hypertension were also increased in FLC population. Compared to GDC data, our subject population was younger: the age peak of our FLC group was in 50–59; our sporadic group had an age peak around 60; while GDC patients' age peak was in 60–69. Importantly, the biggest difference happened in age 40–49: our FLC group and sporadic group had 3 times and 2 times higher ratio than GDC population, respectively. Moreover, the age peaks of our FLC males and FLC females were both in 50–59; while our sporadic females had the age peak in 50–59, much earlier than sporadic males (around 60–69); reflecting gender-specific or age-specific characters in our subject population.
Conclusions: Familial lung cancer in China's Yunnan-Guizhou plateau showed unique clinicopathologic characters; differences were found in gender, age, histologic type, TNM stage and co-existing diseases or symptoms. Identification of hereditary factors which lead to increased lung cancer risk will be a challenge of both scientific and clinical significance.
A comparison for the precision values of the top 15 ranked genes related to...
figshare.com
xls
Updated Jun 9, 2023
Cite
Wonjun Choi; Hyunju Lee (2023). A comparison for the precision values of the top 15 ranked genes related to each cancer type by each centrality measure and against NCI’s GDC and by each approach. [Dataset]. http://doi.org/10.1371/journal.pone.0258626.t007
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0258626.t007
Dataset updated
Jun 9, 2023
Dataset provided by
PLOS ONE
Authors
Wonjun Choi; Hyunju Lee
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
A comparison for the precision values of the top 15 ranked genes related to each cancer type by each centrality measure and against NCI’s GDC and by each approach.
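A precision value for a top-ranked gene list is commonly computed as the fraction of the top k genes that appear in a reference set. The sketch below assumes that definition, with a cutoff of 15 to match the description; how the reference set was derived from the GDC is not stated here, so the inputs in the usage example are placeholders.

```python
def precision_at_k(ranked_genes, reference_genes, k=15):
    """Fraction of the top-k ranked genes that appear in reference_genes.

    Assumption: if fewer than k genes are ranked, the denominator is the
    number of genes actually available rather than k.
    """
    top = list(ranked_genes)[:k]
    if not top:
        return 0.0
    reference = set(reference_genes)
    hits = sum(1 for gene in top if gene in reference)
    return hits / len(top)


if __name__ == "__main__":
    # Placeholder gene names, for illustration only.
    print(precision_at_k(["A", "B", "C", "D"], {"A", "C", "X"}, k=4))
```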
Genomic Data Commons Data Portal (GDC Data Portal)

RRID:SCR_014514, Genomic Data Commons Data Portal (GDC Data Portal) (RRID:SCR_014514), Genomic Data Commons Data Portal, GDC Data Portal

71 scholarly articles cite this dataset (View in Google Scholar)