Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This DepMap release contains data from CRISPR knockout screens from project Achilles, as well as genomic characterization data from the CCLE project. For more information, please see README.txt.
Portal for identifying genetic and pharmacologic dependencies and the biomarkers that predict them. It provides access to the datasets, visualizations, and analysis tools used by the Cancer Dependency Map (DepMap) Project at the Broad Institute. The project aims to systematically identify gene and small-molecule dependencies and to determine the markers that predict sensitivity. All data generated by the DepMap Project are made available to the public under the CC BY 4.0 license on a quarterly basis, pre-publication.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This DepMap dataset contains data from CRISPR knockout screens from project Achilles, as well as genomic characterization data from the CCLE project. For more information about the pipelines, please refer to the DepMap portal, available at https://depmap.org/portal/. Note that the data has been processed internally in alignment with DepMap 23Q2-v97. The primary dataset files are located in the Main_Data/ folder. For additional details, please refer to the manuscript.
Information about the dataset files:
1) pancan_rnaseq_freeze.tsv.gz: Publicly available gene expression data for the TCGA Pan-cancer dataset. File: PanCanAtlas EBPlusPlusAdjustPANCAN_IlluminaHiSeq_RNASeqV2.geneExp.tsv was processed using script process_sample_freeze.py by Gregory Way et al as described in https://github.com/greenelab/pancancer/ data processing and initialization steps. [http://api.gdc.cancer.gov/data/3586c0da-64d0-4b74-a449-5ff4d9136611] [https://doi.org/10.1016/j.celrep.2018.03.046]
2) pancan_mutation_freeze.tsv.gz: Publicly available Mutational information for TCGA Pan-cancer dataset. File: mc3.v0.2.8.PUBLIC.maf.gz was processed using script process_sample_freeze.py by Gregory Way et al as described in https://github.com/greenelab/pancancer/ data processing and initialization steps. [http://api.gdc.cancer.gov/data/1c8cfe5f-e52d-41ba-94da-f15ea1337efc] [https://doi.org/10.1016/j.celrep.2018.03.046]
3) pancan_GISTIC_threshold.tsv.gz: Publicly available Gene-level copy number information of the TCGA Pan-cancer dataset. This file is processed using script process_copynumber.py by Gregory Way et al as described in https://github.com/greenelab/pancancer/ data processing and initialization steps. The files copy_number_loss_status.tsv.gz and copy_number_gain_status.tsv.gz generated from this data are used as inputs in our Galaxy pipeline. [https://xenabrowser.net/datapages/?cohort=TCGA%20Pan-Cancer%20(PANCAN)&removeHub=https%3A%2F%2Fxena.treehouse.gi.ucsc.edu%3A443] [https://doi.org/10.1016/j.celrep.2018.03.046]
4) mutation_burden_freeze.tsv.gz: Publicly available Mutational information for TCGA Pan-cancer dataset mc3.v0.2.8.PUBLIC.maf.gz was processed using script process_sample_freeze.py by Gregory Way et al as described in https://github.com/greenelab/pancancer/ data processing and initialization steps. [https://github.com/greenelab/pancancer/] [http://api.gdc.cancer.gov/data/1c8cfe5f-e52d-41ba-94da-f15ea1337efc] [https://doi.org/10.1016/j.celrep.2018.03.046]
5) sample_freeze.tsv or sample_freeze_version4_modify.tsv: The file lists the frozen samples as determined by TCGA PanCancer Atlas consortium along with raw RNAseq and mutation data. These were previously determined and included for all downstream analysis. All other datasets were processed and subset according to the frozen samples. [https://github.com/greenelab/pancancer/]
6) cosmic_cancer_classification.tsv: Compendium of OG and TSG used for the analysis. Added additional genes from the cosmic database to volgelstein_cancer_classification.tsv [https://github.com/greenelab/pancancer/]
7) CCLE_DepMap_18Q1_maf_20180207.txt.gz: Publicly available Mutational data for CCLE cell lines from Broad Institute Cancer Cell Line Encyclopedia (CCLE) / DepMap Portal. [https://depmap.org/portal/download/api/download/external?file_name=ccle%2FCCLE_DepMap_18Q1_maf_20180207.txt]
8) ccle_rnaseq_genes_rpkm_20180929_mod.tsv.gz: Publicly available Expression data for 1019 cell lines (RPKM) from Broad Institute Cancer Cell Line Encyclopedia (CCLE) / DepMap Portal. [https://depmap.org/portal/download/api/download/external?file_name=ccle%2Fccle_2019%2FCCLE_RNAseq_genes_rpkm_20180929.gct.gz]
9) CCLE_MUT_CNA_AMP_DEL_binary_Revealer.tsv: Publicly available merged Mutational and copy number alterations that include gene amplifications and deletions for the CCLE cell lines. This data is represented in the binary format and provided by the Broad Institute Cancer Cell Line Encyclopedia (CCLE) / DepMap Portal. [https://data.broadinstitute.org/ccle_legacy_data/binary_calls_for_copy_number_and_mutation_data/CCLE_MUT_CNA_AMP_DEL_binary_Revealer.gct]
10) GDSC_cell_lines_EXP_CCLE_names.tsv.gz: Publicly available RMA normalized expression data for Genomics of Drug Sensitivity in Cancer (GDSC) cell-lines. File gdsc_cell_line_RMA_proc_basalExp.csv was downloaded. This data was subsetted to 389 cell lines that are common among CCLE and GDSC. All the GDSC cell line names were replaced with CCLE cell line names for further processing. [https://www.cancerrxgene.org/gdsc1000/GDSC1000_WebResources//Data/preprocessed/Cell_line_RMA_proc_basalExp.txt.zip]
11) GDSC_CCLE_common_mut_cnv_binary.tsv.gz: A subset of merged Mutational and copy number alterations that include gene amplifications and deletions for common cell lines between GDSC and CCLE. This file is generated using CCLE_MUT_CNA_AMP_DEL_binary_Revealer.tsv and a list of common cell lines.
12) gdsc1_ccle_pharm_fitted_dose_data.txt.gz: Pharmacological data for GDSC1 cell lines. [ftp://ftp.sanger.ac.uk/pub/project/cancerrxgene/releases/current_release/GDSC1_fitted_dose_response_15Oct19.xlsx]
13) gdsc2_ccle_pharm_fitted_dose_data.txt.gz: Pharmacological data for GDSC2 cell lines. [ftp://ftp.sanger.ac.uk/pub/project/cancerrxgene/releases/current_release/GDSC2_fitted_dose_response_15Oct19.xlsx]
14) compounds_of_interest.txt: list of pharmacological compounds tested for our analysis, taken from ftp://ftp.sanger.ac.uk/pub4/cancerrxgen...
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Gene effect scores measured by DepMap, obtained from the DepMap portal.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Information about the dataset files:
1) pancan_rnaseq_freeze.tsv.gz: Publicly available gene expression data for the TCGA Pan-cancer dataset. File: PanCanAtlas EBPlusPlusAdjustPANCAN_IlluminaHiSeq_RNASeqV2.geneExp.tsv was processed using script process_sample_freeze.py by Gregory Way et al as described in https://github.com/greenelab/pancancer/ data processing and initialization steps. [http://api.gdc.cancer.gov/data/3586c0da-64d0-4b74-a449-5ff4d9136611] [https://doi.org/10.1016/j.celrep.2018.03.046]
2) pancan_mutation_freeze.tsv.gz: Publicly available Mutational information for TCGA Pan-cancer dataset. File: mc3.v0.2.8.PUBLIC.maf.gz was processed using script process_sample_freeze.py by Gregory Way et al as described in https://github.com/greenelab/pancancer/ data processing and initialization steps. [http://api.gdc.cancer.gov/data/1c8cfe5f-e52d-41ba-94da-f15ea1337efc] [https://doi.org/10.1016/j.celrep.2018.03.046]
3) pancan_GISTIC_threshold.tsv.gz: Publicly available Gene-level copy number information of the TCGA Pan-cancer dataset. This file is processed using script process_copynumber.py by Gregory Way et al as described in https://github.com/greenelab/pancancer/ data processing and initialization steps. The files copy_number_loss_status.tsv.gz and copy_number_gain_status.tsv.gz generated from this data are used as inputs in our Galaxy pipeline. [https://xenabrowser.net/datapages/?cohort=TCGA%20Pan-Cancer%20(PANCAN)&removeHub=https%3A%2F%2Fxena.treehouse.gi.ucsc.edu%3A443] [https://doi.org/10.1016/j.celrep.2018.03.046]
4) mutation_burden_freeze.tsv.gz: Publicly available Mutational information for TCGA Pan-cancer dataset mc3.v0.2.8.PUBLIC.maf.gz was processed using script process_sample_freeze.py by Gregory Way et al as described in https://github.com/greenelab/pancancer/ data processing and initialization steps. [https://github.com/greenelab/pancancer/] [http://api.gdc.cancer.gov/data/1c8cfe5f-e52d-41ba-94da-f15ea1337efc] [https://doi.org/10.1016/j.celrep.2018.03.046]
5) sample_freeze.tsv or sample_freeze_version4_modify.tsv: The file lists the frozen samples as determined by TCGA PanCancer Atlas consortium along with raw RNAseq and mutation data. These were previously determined and included for all downstream analysis. All other datasets were processed and subset according to the frozen samples. [https://github.com/greenelab/pancancer/]
6) vogelstein_cancergenes.tsv: compendium of OG and TSG used for the analysis. [https://github.com/greenelab/pancancer/]
7) CCLE_DepMap_18Q1_maf_20180207.txt.gz: Publicly available Mutational data for CCLE cell lines from Broad Institute Cancer Cell Line Encyclopedia (CCLE) / DepMap Portal. [https://depmap.org/portal/download/api/download/external?file_name=ccle%2FCCLE_DepMap_18Q1_maf_20180207.txt]
8) ccle_rnaseq_genes_rpkm_20180929.gct.gz: Publicly available Expression data for 1019 cell lines (RPKM) from Broad Institute Cancer Cell Line Encyclopedia (CCLE) / DepMap Portal. [https://depmap.org/portal/download/api/download/external?file_name=ccle%2Fccle_2019%2FCCLE_RNAseq_genes_rpkm_20180929.gct.gz]
9) CCLE_MUT_CNA_AMP_DEL_binary_Revealer.gct: Publicly available merged Mutational and copy number alterations that include gene amplifications and deletions for the CCLE cell lines. This data is represented in the binary format and provided by the Broad Institute Cancer Cell Line Encyclopedia (CCLE) / DepMap Portal. [https://data.broadinstitute.org/ccle_legacy_data/binary_calls_for_copy_number_and_mutation_data/CCLE_MUT_CNA_AMP_DEL_binary_Revealer.gct]
10) GDSC_cell_lines_EXP_CCLE_names.csv.gz: Publicly available RMA normalized expression data for Genomics of Drug Sensitivity in Cancer (GDSC) cell-lines. File gdsc_cell_line_RMA_proc_basalExp.csv was downloaded. This data was subsetted to 389 cell lines that are common among CCLE and GDSC. All the GDSC cell line names were replaced with CCLE cell line names for further processing. [https://www.cancerrxgene.org/gdsc1000/GDSC1000_WebResources//Data/preprocessed/Cell_line_RMA_proc_basalExp.txt.zip]
11) GDSC_CCLE_common_mut_cnv_binary.csv.gz: A subset of merged Mutational and copy number alterations that include gene amplifications and deletions for common cell lines between GDSC and CCLE. This file is generated using CCLE_MUT_CNA_AMP_DEL_binary_Revealer.gct and a list of common cell lines.
12) gdsc1_ccle_pharm_fitted_dose_data.txt.gz: Pharmacological data for GDSC1 cell lines. [ftp://ftp.sanger.ac.uk/pub/project/cancerrxgene/releases/current_release/GDSC1_fitted_dose_response_15Oct19.xlsx]
13) gdsc2_ccle_pharm_fitted_dose_data.txt.gz: Pharmacological data for GDSC2 cell lines. [ftp://ftp.sanger.ac.uk/pub/project/cancerrxgene/releases/current_release/GDSC2_fitted_dose_response_15Oct19.xlsx]
14) compounds.csv: list of pharmacological compounds tested for our analysis
15) tcga_dictonary.tsv: list of cancer types used in the analysis.
16) seg_based_scores.tsv: Measurement of total copy number burden, Percent of genome altered by copy number alterations. This file was used as part of the Pancancer analysis by Gregory Way et al as described in https://github.com/greenelab/pancancer/ data processing and initialization steps. [https://github.com/greenelab/pancancer/]
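Items 10 and 11 above describe restricting GDSC data to the cell lines shared with CCLE and relabeling them with CCLE names. The following is a minimal sketch of that idea in plain Python, not the authors' script: the function name `subset_and_rename`, the toy expression values, and the two-entry name map are all illustrative assumptions.

```python
# Hypothetical helper (not from the original pipeline): keep only the cell
# lines that have a CCLE counterpart and relabel them with the CCLE name.
def subset_and_rename(expression, gdsc_to_ccle):
    """expression: {gdsc_name: {gene: value}}; gdsc_to_ccle: {gdsc_name: ccle_name}."""
    return {
        gdsc_to_ccle[name]: values
        for name, values in expression.items()
        if name in gdsc_to_ccle
    }

# Toy data: values and the GDSC-only entry are made up for illustration.
gdsc_expression = {
    "A549": {"TP53": 5.1, "KRAS": 7.3},
    "HT-29": {"TP53": 4.8, "KRAS": 6.9},
    "GDSC-ONLY": {"TP53": 3.2, "KRAS": 5.5},
}
name_map = {"A549": "A549_LUNG", "HT-29": "HT29_LARGE_INTESTINE"}

common = subset_and_rename(gdsc_expression, name_map)
print(sorted(common))
```

Cell lines without an entry in the name map are dropped, which mirrors the "common cell lines" subset described in the list.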
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Cancer cell line genetic dependencies estimated using the DEMETER2 model. DEMETER2 is applied to the combination of three large-scale RNAi screening datasets: the Broad Institute Project Achilles, Novartis Project DRIVE, and the Marcotte et al. breast cell line dataset. This dataset is derived from https://doi.org/10.6084/m9.figshare.6025238.v4 and formatted to more closely match quarterly releases. Specifically, the main matrices have been transposed so that they are indexed by cell line, cell lines are identified by DepMap IDs, and common dependencies and dependency probabilities have been estimated. Only select files are included. Refer to the original dataset for all files. See README for more information about files. Visit the Cancer Dependency Map portal at https://depmap.org to explore related datasets. Email questions to depmap@broadinstitute.org.
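The reformatting described above (transposing a gene-indexed matrix so that it is indexed by cell line, and identifying cell lines by DepMap IDs) can be sketched as follows. This is only an illustration under stated assumptions: the function name, the scores, and the name-to-ID mapping are placeholders, not values from the dataset.

```python
# Hypothetical sketch: turn {gene: {cell_line_name: score}} into
# {depmap_id: {gene: score}}. The IDs below are placeholders.
def transpose_and_reindex(gene_by_line, line_to_id):
    line_by_gene = {}
    for gene, row in gene_by_line.items():
        for line, score in row.items():
            # Look up the DepMap-style ID and file the score under it.
            line_by_gene.setdefault(line_to_id[line], {})[gene] = score
    return line_by_gene

gene_indexed = {
    "GENE1": {"LINE_A": -0.9, "LINE_B": 0.1},
    "GENE2": {"LINE_A": -0.2, "LINE_B": -1.1},
}
placeholder_ids = {"LINE_A": "ACH-900001", "LINE_B": "ACH-900002"}

cell_line_indexed = transpose_and_reindex(gene_indexed, placeholder_ids)
print(cell_line_indexed["ACH-900001"])
```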
This benchmark data was used to train and evaluate the models presented in the paper: A. Partin and P. Vasanthakumari et al. "Benchmarking community drug response prediction models: datasets, models, tools, and metrics for cross-dataset generalization analysis"
The benchmark data for Cross-Study Analysis (CSA) include four kinds of data: cell line response data, cell line multi-omics data, drug feature data, and data partitions. The figure below illustrates the curation, processing, and assembly of benchmark data, and a unified schema for data curation. Cell line response data were extracted from five large-scale cell line drug screening studies: the Cancer Cell Line Encyclopedia (CCLE), the Cancer Therapeutics Response Portal version 2 (CTRPv2), the Genomics of Drug Sensitivity in Cancer version 1 (GDSC1), the Genomics of Drug Sensitivity in Cancer version 2 (GDSC2), and the Genentech Cell Line Screening Initiative (GCSI). We extracted their multi-dose viability data and used a unified dose response fitting pipeline to calculate multiple dose-independent response metrics as shown in the figure below, such as the area under the dose response curve (AUC) and the half-maximal inhibitory concentration (IC50). The multi-omics data of cell lines were extracted from the Dependency Map (DepMap) portal of CCLE, including gene expressions, DNA mutations, DNA methylation, gene copy numbers, protein expressions measured by reverse phase protein array (RPPA), and miRNA expressions. Preprocessing included discretizing gene copy numbers and mapping between different gene identifier systems. Drug information was retrieved from PubChem. Based on the drug SMILES (Simplified Molecular Input Line Entry Specification) strings, we calculated their molecular fingerprints and descriptors using the Mordred and RDKit Python packages. Data partition files were generated using the IMPROVE benchmark data preparation pipeline. They indicate, for each modeling analysis run, which samples should be included in the training, validation, and testing sets, for building and evaluating the drug response prediction (DRP) models.
The Table below shows the numbers of cell lines, drugs, and experiments in each dataset. Across the five datasets, there are 785 unique cell lines and 749 unique drugs. All cell lines have gene expression, mutation, DNA methylation, and copy number data available. 760 of the cell lines have RPPA protein expressions, and 781 of them have miRNA expressions.
Further description is provided here: https://jdacs4c-improve.github.io/docs/content/app_drp_benchmark.html
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
CTRPv2 PharmacoSet (PSet) generated by ORCESTRA. Metadata can be found on ORCESTRA at: http://orcestra.ca/10.5281/zenodo.3905470
Disclaimer
The CTRPv2 data were generated by the Broad Institute CTD^2 Center and originally released via the Cancer Therapeutics Response Portal (CTRP). The Haibe-Kains Lab has reprocessed and re-annotated the data to maximize overlap with other pharmacogenomic datasets.
Data Usage Policy
The CTD^2 releases data in accordance with their data release policy. The DepMap data, including the CTRPv2 data, are provided under Creative Commons Attribution 4.0 license.
Contact depmap@broadinstitute.org for more information.
Please cite the following when using these data
Seashore-Ludlow B, Rees MG, Cheah JH, Cokol M, Price EV, Coletti ME, Jones V, Bodycombe NE, Soule CK, Gould J, Alexander B, Li A, Montgomery P, Wawer MJ, Kuru N, Kotz JD, Hon CS, Munoz B, Liefeld T, Dančík V, Bittker JA, Palmer M, Bradner JE, Shamji AF, Clemons PA, Schreiber SL. Harnessing Connectivity in a Large-Scale Small-Molecule Sensitivity Dataset. Cancer Discov. 2015 Nov;5(11):1210-23. doi: 10.1158/2159-8290.CD-15-0235. Epub 2015 Oct 19. PMID: 26482930; PMCID: PMC4631646.
Rees MG, Seashore-Ludlow B, Cheah JH, Adams DJ, Price EV, Gill S, Javaid S, Coletti ME, Jones VL, Bodycombe NE, Soule CK, Alexander B, Li A, Montgomery P, Kotz JD, Hon CS, Munoz B, Liefeld T, Dančík V, Haber DA, Clish CB, Bittker JA, Palmer M, Wagner BK, Clemons PA, Shamji AF, Schreiber SL. Correlating chemical sensitivity and basal gene expression reveals mechanism of action. Nat Chem Biol. 2016 Feb;12(2):109-16. doi: 10.1038/nchembio.1986. Epub 2015 Dec 14. PMID: 26656090; PMCID: PMC4718762.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Disclaimer
The CCLE data were generated and shared by the Broad Institute of Harvard and MIT as part of the Cancer Cell Line Encyclopedia project. The Haibe-Kains Lab has reprocessed and re-annotated the data to maximize overlap with other pharmacogenomic datasets.
Data Usage Policy
CCLE publishes its data under the Terms and Conditions linked here. The DepMap data, including the CCLE data, are provided under Creative Commons Attribution 4.0 license.
Contact depmap@broadinstitute.org for more information.
Please cite the following when using these data
Cancer Cell Line Encyclopedia Consortium, and Genomics of Drug Sensitivity in Cancer Consortium. 2015. Pharmacogenomic Agreement between Two Cancer Cell Line Data Sets. Nature 528 (7580):84–87. https://doi.org/10.1038/nature15736.
Jordi Barretina, Giordano Caponigro, Nicolas Stransky, Kavitha Venkatesan, William R. Sellers, Robert Schlegel, Levi A. Garraway, et al. 2012. The Cancer Cell Line Encyclopedia Enables Predictive Modelling of Anticancer Drug Sensitivity. Nature 483 (7391):603–7. https://doi.org/10.1038/nature11003.
For omics data:
Mahmoud Ghandi, Franklin W. Huang, Judit Jané-Valbuena, Gregory V. Kryukov, ... Todd R. Golub, Levi A. Garraway & William R. Sellers. 2019. Next-generation characterization of the Cancer Cell Line Encyclopedia. Nature 569, 503–508 (2019).
For metabolomics:
Haoxin Li, Shaoyang Ning, Mahmoud Ghandi, Gregory V. Kryukov, Shuba Gopal, ... Levi A. Garraway & William R. Sellers. The landscape of cancer cell line metabolism. Nature Medicine 25, 850-860 (2019).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
N/D is obtained from the DepMap portal (https://depmap.org/portal/) and is defined as the ratio of the cell death number (N) to the total number of colon cancer cell lines (D) used in the experimental test. “No.Drugs” denotes the number of drugs retrieved from DrugBank (https://go.drugbank.com/) that modulate each gene. “SE.Score” denotes the side effect score, calculated using average adverse events. “—” indicates that data were not available in the databases. “*” indicates that ADSS2 and CTPS1 represent the duplicate enzyme pairs (ADSS1 and ADSS2) and (CTPS1 and CTPS2), respectively.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Development of MATLAB code to generate a context-specific metabolic model for the A2780 cell line, based on transcriptomic data retrieved from the CCLE_expression_full_22Q2 database (available at: https://depmap.org/portal/download/all/).
In parallel, implementation of Python code to construct a custom metabolic model tailored to specific experimental conditions derived from measurements obtained using Nuclear Magnetic Resonance (NMR) spectroscopy.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
CCLE RNA-seq expression data downloaded from the DepMap portal, in log2(TPM+1) units. Gene annotations were updated to the latest HUGO gene symbols, and the result was saved as an R data object.
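For readers unfamiliar with the log2(TPM+1) scale, a short sketch of the transform and its inverse is given here. The function names are illustrative and the dataset itself ships as an R object, so this Python snippet only demonstrates the arithmetic.

```python
import math

def log2_tpm_plus_one(tpm):
    """Map a TPM value to the log2(TPM + 1) scale; 0 TPM maps to 0."""
    return math.log2(tpm + 1.0)

def to_tpm(log_value):
    """Invert the transform to recover TPM from a log2(TPM + 1) value."""
    return 2.0 ** log_value - 1.0

# A gene at 3 TPM has a transformed value of log2(4).
print(log2_tpm_plus_one(3.0))
```

The +1 pseudocount keeps unexpressed genes (TPM of 0) at a finite value of 0 instead of negative infinity.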
Attribution 2.0 (CC BY 2.0) https://creativecommons.org/licenses/by/2.0/
License information was derived automatically
(B) Analysis of the DepMap portal (www.depmap.org) reveals the MYCN-amplified cell line SK-N-DZ as hypersensitive to the GPX4 inhibitors RSL3 and ML210. List of tagged entities: SK-N-DZ (cellosaurus:CVCL_1701), cell viability assay (bao:BAO_0003009)
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The relationship between cancer cell line viability and NPM1 dependency, determined from genome-wide CRISPR-Cas9 screens (https://depmap.org/portal/interactive). A score of zero indicates that a gene is not essential. Common essential genes have a median score of -1. A lower score indicates that a given cell line is more likely to depend on the gene.
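To make the score scale concrete (0 for a non-essential gene, -1 for the median of common essential genes), the sketch below summarizes one gene's scores across cell lines. The cell line names and scores are invented, and the -0.5 cutoff is an illustrative assumption, not a threshold stated in this description.

```python
from statistics import median

def summarize_gene_effect(scores, cutoff=-0.5):
    """scores: {cell_line: gene_effect}. `cutoff` is an assumed threshold."""
    # Lines scoring below the cutoff are treated as dependent on the gene.
    dependent = sorted(line for line, s in scores.items() if s < cutoff)
    return {
        "median": median(scores.values()),
        "dependent_lines": dependent,
        "fraction_dependent": len(dependent) / len(scores),
    }

# Invented scores for a single gene in four hypothetical cell lines.
toy_scores = {"LINE_A": -1.2, "LINE_B": -0.8, "LINE_C": -0.1, "LINE_D": 0.05}
summary = summarize_gene_effect(toy_scores)
print(summary["dependent_lines"], summary["fraction_dependent"])
```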
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data used for Project: "BIT: Bayesian Identification of Transcriptional Regulators from Epigenomics-Based Query Region Sets"
BIT package is available on GitHub: GitHub
We also provide an online web portal: BIT Portal
Please consult the manual for instructions on loading the reference data.
Please note that the preprocessed reference database must be loaded before running any function in BIT.
File Description:
hg38_200.tar.gz: Pre-processed TR ChIP-seq reference datasets for genome hg38 with bin width 200.
hg38_500.tar.gz: Pre-processed TR ChIP-seq reference datasets for genome hg38 with bin width 500.
hg38_1000.tar.gz: Pre-processed TR ChIP-seq reference datasets for genome hg38 with bin width 1000.
mm10_200.tar.gz: Pre-processed TR ChIP-seq reference datasets for genome mm10 with bin width 200.
mm10_500.tar.gz: Pre-processed TR ChIP-seq reference datasets for genome mm10 with bin width 500.
mm10_1000.tar.gz: Pre-processed TR ChIP-seq reference datasets for genome mm10 with bin width 1000.
Input_Data.tar.gz: contains the input data for the four application cases, including differentially accessible regions (DARs) from bulk and single-cell perturbation experiments, cancer-type-specific accessible regions, and cell-type-specific accessible regions.
Figure_Data_v2.tar.gz: the updated figure data folder, which includes the data used to generate the manuscript’s plots, as well as the output from the benchmarking methods.
Figure.R: R code to replicate the figures, used together with Figure_Data_v2.tar.gz.
DepMap data can be accessed via the DepMap Consortium: DepMap
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Chronos dependency scores for 17081 genes from the DepMap project (https://depmap.org/portal/download/all/, Public 22Q1) for 24 human liver cancer cell lines in CCLE.
Mutational activation of the KRAS oncogene is a major genetic driver of pancreatic ductal adenocarcinoma (PDAC) growth. KRAS-dependent PDAC growth is mediated primarily through persistent activation of the RAF-MEK-ERK mitogen-activated protein kinase (MAPK) cascade, one of the most extensively studied cancer signaling networks. While substrates of RAF and MEK kinases are highly restricted, over 1,000 substrates have been attributed to ERK1/2. In this study, we used the highly selective ERK1/2 inhibitor, SCH772984, and proteomic and phosphoproteomic analyses to extend the repertoire of ERK-dependent phosphosites and phosphoproteins in PDAC. We validated the specificity of SCH772984 in our cell lines using multiplexed inhibitor beads coupled with mass spectrometry (MIB/MS). We then performed phosphoproteomics and global proteomics in a panel of PDAC cell lines and identified 5,117 ERK-dependent phosphosites on 2,252 proteins, of which 88% and 67%, respectively, were not previously associated with ERK. We then utilized our recently annotated serine/threonine kinome motif database to dissect the phosphoproteome and reveal an expansive ERK-regulated kinase network. We found that ERK- and immediate downstream kinase RSK-substrate motifs predominated after one hour of ERK inhibition, whereas cell cycle regulatory cyclin-dependent kinase motifs predominated by 24 h, reflecting a highly dynamic ERK-dependent phosphoproteome. We also found compensatory activation of HIPK, CLK, PKN, PAK, and DYRK family kinases. Finally, using the genome-wide CRISPR-Cas9 dataset in the Cancer Dependency Map portal (DepMap), we determined that approximately 18% of ERK-dependent phosphoproteins are essential for pancreatic cancer growth, and these are enriched in nuclear proteins. Together, our findings provide a system-wide profile of the mechanistic basis for ERK-driven pancreatic cancer growth.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The relationship between NPM1 dependency and cell line metastatic propensity, determined from genome-wide CRISPR-Cas9 screens (https://depmap.org/portal/interactive). A score of zero indicates that a gene is not essential. Common essential genes have a median score of -1. A lower score indicates that a given cell line is more likely to depend on the gene.
This benchmark dataset was created and used to train and evaluate models presented in the paper: A. Partin, P. Vasanthakumari et al., "Benchmarking community drug response prediction models: datasets, models, tools, and metrics for cross-dataset generalization analysis."
This dataset includes four main components: cell line drug response data, cell line multi-omics data, drug feature data, and predefined data partitions for modeling. Drug response data were curated from five pharmacogenomic studies (CCLE, CTRPv2, GDSC1, GDSC2, GCSI), and processed using a unified pipeline for response fitting, omics harmonization, and drug representation.
Multi-dose viability data were extracted, and a unified dose response fitting pipeline was used to calculate multiple dose-independent response metrics, such as the area under the dose response curve (AUC) and the half-maximal inhibitory concentration (IC50).
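The two metrics named above can be illustrated with a small sketch. It assumes a three-parameter Hill curve and a normalized AUC over a fixed log10 concentration window; the description does not state which curve model or integration range the benchmark pipeline uses, so the function names, parameters, and defaults here are all assumptions.

```python
import math

def hill(conc, ec50, slope, e_inf=0.0):
    """Viable fraction at concentration `conc` (same units as `ec50`)."""
    return e_inf + (1.0 - e_inf) / (1.0 + (conc / ec50) ** slope)

def ic50(ec50, slope, e_inf=0.0):
    """Concentration at which the fitted curve crosses 50% viability."""
    if e_inf >= 0.5:
        return math.inf  # the curve never drops to 50% viability
    return ec50 * (0.5 / (0.5 - e_inf)) ** (1.0 / slope)

def auc(ec50, slope, e_inf=0.0, log_lo=-3.0, log_hi=1.0, steps=400):
    """Trapezoidal area under the curve over log10(conc), scaled to [0, 1]."""
    width = (log_hi - log_lo) / steps
    xs = [log_lo + i * width for i in range(steps + 1)]
    ys = [hill(10.0 ** x, ec50, slope, e_inf) for x in xs]
    area = sum((ys[i] + ys[i + 1]) / 2.0 * width for i in range(steps))
    return area / (log_hi - log_lo)

# A more potent drug (lower EC50) yields a lower AUC on the same window.
print(auc(0.01, 1.0) < auc(1.0, 1.0))
```

Both quantities are functions of the fitted curve alone, not of any single tested dose, which is what makes them dose-independent summaries.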
The multi-omics data of cell lines were extracted from the Dependency Map (DepMap) portal of CCLE, including gene expressions, DNA mutations, DNA methylation, gene copy numbers, protein expressions measured by reverse phase protein array (RPPA), and miRNA expressions. Data preprocessing was performed, such as discretizing gene copy numbers and mapping between different gene identifier systems.
Drug information was retrieved from PubChem. Based on the drug SMILES strings, we calculated their molecular fingerprints and descriptors using the Mordred and RDKit Python packages.
Data partition files were generated using the IMPROVE benchmark data preparation pipeline. They indicate, for each modeling analysis run, which samples should be included in the training, validation, and testing sets, for building and evaluating the drug response prediction (DRP) models.
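As a rough illustration of what a partition encodes, the sketch below assigns sample identifiers to training, validation, and testing sets with a seeded shuffle. This is not the IMPROVE pipeline: the function name, the 80/10/10 proportions, and the purely random assignment are assumptions made for the example.

```python
import random

def make_partition(sample_ids, val_frac=0.1, test_frac=0.1, seed=0):
    """Return disjoint train/val/test ID lists (hypothetical helper)."""
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)  # seeded, so the split is reproducible
    n_test = int(len(ids) * test_frac)
    n_val = int(len(ids) * val_frac)
    return {
        "test": ids[:n_test],
        "val": ids[n_test:n_test + n_val],
        "train": ids[n_test + n_val:],
    }

split = make_partition(range(100))
print({name: len(ids) for name, ids in split.items()})
```

Storing the partitions as files, as the dataset does, lets every model be trained and evaluated on exactly the same samples.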
More detailed information about the dataset and its construction can be found at https://jdacs4c-improve.github.io/docs/content/app_drp_benchmark.html
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This DepMap release contains data from CRISPR knockout screens from project Achilles, as well as genomic characterization data from the CCLE project. For more information, please see README.txt.