Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains the results of Avana library CRISPR-Cas9 genome-scale knockout (prefixed with Achilles) as well as mutation, copy number and gene expression data (prefixed with CCLE) for cancer cell lines as part of the Broad Institute’s Cancer Dependency Map project. We have repackaged our fileset to include all quarterly-updating datasets produced by DepMap. The Avana CRISPR-Cas9 genome-scale knockout data has expanded to include 689 cell lines, the RNAseq data includes 1249 cell lines, and the copy number data includes 1682 cell lines. Please see the README files for details regarding updates to the data processing pipeline procedures. As our screening efforts continue, we will be releasing additional cancer dependency data on a quarterly basis for unrestricted use. For the latest datasets available, further analyses, and to subscribe to our mailing list, visit https://depmap.org. Descriptions of the experimental methods and the CERES algorithm are published in http://dx.doi.org/10.1038/ng.3984. Some cell lines were processed using copy number data based on the Sanger Institute whole exome sequencing data (COSMIC: http://cancer.sanger.ac.uk.cell_lines, EGA accession number: EGAD00001001039) reprocessed using CCLE pipelines. A detailed description of the pipelines and tool versions for CCLE expression can be found here: https://github.com/broadinstitute/gtex-pipeline/blob/v9/TOPMed_RNAseq_pipeline.md.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Information about the dataset files:
1) pancan_rnaseq_freeze.tsv.gz: Publicly available gene expression data for the TCGA Pan-cancer dataset. File: PanCanAtlas EBPlusPlusAdjustPANCAN_IlluminaHiSeq_RNASeqV2.geneExp.tsv was processed using script process_sample_freeze.py by Gregory Way et al as described in https://github.com/greenelab/pancancer/ data processing and initialization steps. [http://api.gdc.cancer.gov/data/3586c0da-64d0-4b74-a449-5ff4d9136611] [https://doi.org/10.1016/j.celrep.2018.03.046]
2) pancan_mutation_freeze.tsv.gz: Publicly available Mutational information for TCGA Pan-cancer dataset. File: mc3.v0.2.8.PUBLIC.maf.gz was processed using script process_sample_freeze.py by Gregory Way et al as described in https://github.com/greenelab/pancancer/ data processing and initialization steps. [http://api.gdc.cancer.gov/data/1c8cfe5f-e52d-41ba-94da-f15ea1337efc] [https://doi.org/10.1016/j.celrep.2018.03.046]
3) pancan_GISTIC_threshold.tsv.gz: Publicly available gene-level copy number information of the TCGA Pan-cancer dataset. This file is processed using script process_copynumber.py by Gregory Way et al as described in https://github.com/greenelab/pancancer/ data processing and initialization steps. The files copy_number_loss_status.tsv.gz and copy_number_gain_status.tsv.gz generated from this data are used as inputs in our Galaxy pipeline. [https://xenabrowser.net/datapages/?cohort=TCGA%20Pan-Cancer%20(PANCAN)&removeHub=https%3A%2F%2Fxena.treehouse.gi.ucsc.edu%3A443] [https://doi.org/10.1016/j.celrep.2018.03.046]
4) mutation_burden_freeze.tsv.gz: Publicly available Mutational information for TCGA Pan-cancer dataset. File: mc3.v0.2.8.PUBLIC.maf.gz was processed using script process_sample_freeze.py by Gregory Way et al as described in https://github.com/greenelab/pancancer/ data processing and initialization steps. [https://github.com/greenelab/pancancer/] [http://api.gdc.cancer.gov/data/1c8cfe5f-e52d-41ba-94da-f15ea1337efc] [https://doi.org/10.1016/j.celrep.2018.03.046]
5) sample_freeze.tsv or sample_freeze_version4_modify.tsv: The file lists the frozen samples as determined by TCGA PanCancer Atlas consortium along with raw RNAseq and mutation data. These were previously determined and included for all downstream analysis. All other datasets were processed and subset according to the frozen samples. [https://github.com/greenelab/pancancer/]
6) vogelstein_cancergenes.tsv: compendium of OG and TSG used for the analysis. [https://github.com/greenelab/pancancer/]
7) CCLE_DepMap_18Q1_maf_20180207.txt.gz: Publicly available Mutational data for CCLE cell lines from Broad Institute Cancer Cell Line Encyclopedia (CCLE) / DepMap Portal. [https://depmap.org/portal/download/api/download/external?file_name=ccle%2FCCLE_DepMap_18Q1_maf_20180207.txt]
8) ccle_rnaseq_genes_rpkm_20180929.gct.gz: Publicly available Expression data for 1019 cell lines (RPKM) from Broad Institute Cancer Cell Line Encyclopedia (CCLE) / DepMap Portal. [https://depmap.org/portal/download/api/download/external?file_name=ccle%2Fccle_2019%2FCCLE_RNAseq_genes_rpkm_20180929.gct.gz]
9) CCLE_MUT_CNA_AMP_DEL_binary_Revealer.gct: Publicly available merged Mutational and copy number alterations that include gene amplifications and deletions for the CCLE cell lines. This data is represented in the binary format and provided by the Broad Institute Cancer Cell Line Encyclopedia (CCLE) / DepMap Portal. [https://data.broadinstitute.org/ccle_legacy_data/binary_calls_for_copy_number_and_mutation_data/CCLE_MUT_CNA_AMP_DEL_binary_Revealer.gct]
10) GDSC_cell_lines_EXP_CCLE_names.csv.gz: Publicly available RMA normalized expression data for Genomics of Drug Sensitivity in Cancer (GDSC) cell lines. File gdsc_cell_line_RMA_proc_basalExp.csv was downloaded. This data was subset to the 389 cell lines that are common to CCLE and GDSC. All the GDSC cell line names were replaced with CCLE cell line names for further processing. [https://www.cancerrxgene.org/gdsc1000/GDSC1000_WebResources//Data/preprocessed/Cell_line_RMA_proc_basalExp.txt.zip]
11) GDSC_CCLE_common_mut_cnv_binary.csv.gz: A subset of merged Mutational and copy number alterations that include gene amplifications and deletions for common cell lines between GDSC and CCLE. This file is generated using CCLE_MUT_CNA_AMP_DEL_binary_Revealer.gct and a list of common cell lines.
12) gdsc1_ccle_pharm_fitted_dose_data.txt.gz: Pharmacological data for GDSC1 cell lines. [ftp://ftp.sanger.ac.uk/pub/project/cancerrxgene/releases/current_release/GDSC1_fitted_dose_response_15Oct19.xlsx]
13) gdsc2_ccle_pharm_fitted_dose_data.txt.gz: Pharmacological data for GDSC2 cell lines. [ftp://ftp.sanger.ac.uk/pub/project/cancerrxgene/releases/current_release/GDSC2_fitted_dose_response_15Oct19.xlsx]
14) compounds.csv: list of pharmacological compounds tested for our analysis
15) tcga_dictonary.tsv: list of cancer types used in the analysis.
16) seg_based_scores.tsv: Measurement of total copy number burden, Percent of genome altered by copy number alterations. This file was used as part of the Pancancer analysis by Gregory Way et al as described in https://github.com/greenelab/pancancer/ data processing and initialization steps. [https://github.com/greenelab/pancancer/]
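Items 10 and 11 above describe restricting GDSC data to the cell lines shared with CCLE and renaming them to CCLE names. The following is a minimal sketch of that step in plain Python. The function name, the dictionary layout, the name mapping, and the expression values are all hypothetical and chosen only for illustration; they are not taken from the actual files.

```python
def subset_to_common(gdsc_expr, gdsc_to_ccle):
    """Keep only cell lines present in the name mapping and rename them.

    gdsc_expr:    {gdsc_cell_line_name: {gene: expression_value}}
    gdsc_to_ccle: {gdsc_cell_line_name: ccle_cell_line_name} for shared lines
    """
    return {
        gdsc_to_ccle[name]: profile
        for name, profile in gdsc_expr.items()
        if name in gdsc_to_ccle
    }


# Hypothetical toy data (values invented for illustration only).
gdsc_expr = {
    "A549": {"TP53": 7.1, "KRAS": 9.3},
    "HT-29": {"TP53": 6.4, "KRAS": 8.8},
    "GDSC-ONLY": {"TP53": 5.0, "KRAS": 7.7},
}
gdsc_to_ccle = {"A549": "A549_LUNG", "HT-29": "HT29_LARGE_INTESTINE"}

common = subset_to_common(gdsc_expr, gdsc_to_ccle)
print(sorted(common))
```

The same mapping could then be applied to the binary mutation and copy number matrix so that both tables share one set of cell line identifiers.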
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This DepMap release contains data from CRISPR knockout screens from project Achilles, as well as genomic characterization data from the CCLE project. For more information, please see README.txt.
This benchmark dataset was created and used to train and evaluate models presented in the paper: A. Partin, P. Vasanthakumari et al., "Benchmarking community drug response prediction models: datasets, models, tools, and metrics for cross-dataset generalization analysis."
This dataset includes four main components: cell line drug response data, cell line multi-omics data, drug feature data, and predefined data partitions for modeling. Drug response data were curated from five pharmacogenomic studies (CCLE, CTRPv2, GDSC1, GDSC2, gCSI) and processed using a unified pipeline for response fitting, omics harmonization, and drug representation.
Multi-dose viability data were extracted, and a unified dose response fitting pipeline was used to calculate multiple dose-independent response metrics, such as the area under the dose response curve (AUC) and the half-maximal inhibitory concentration (IC50).
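As a rough illustration of how dose-independent metrics follow from a fitted curve, the sketch below assumes a three-parameter Hill model with already-fitted parameters. The model form, the parameter names (ec50, slope, e_inf), and the dose range are assumptions made for this example; the benchmark's actual fitting pipeline may differ.

```python
import math


def hill_viability(dose, ec50, slope, e_inf=0.0):
    """Hill curve: viability falls from 1 at low dose toward e_inf at high dose."""
    return e_inf + (1.0 - e_inf) / (1.0 + (dose / ec50) ** slope)


def ic50(ec50, slope, e_inf=0.0):
    """Dose giving 50% viability, or None if the curve never drops to 0.5."""
    if e_inf >= 0.5:
        return None
    # Solve e_inf + (1 - e_inf) / (1 + (d / ec50) ** slope) = 0.5 for d.
    ratio = (1.0 - e_inf) / (0.5 - e_inf) - 1.0
    return ec50 * ratio ** (1.0 / slope)


def auc(ec50, slope, e_inf=0.0, lo=1e-3, hi=10.0, steps=200):
    """Mean viability over log10(dose) in [lo, hi], using the trapezoid rule."""
    x_lo, x_hi = math.log10(lo), math.log10(hi)
    xs = [x_lo + i * (x_hi - x_lo) / steps for i in range(steps + 1)]
    ys = [hill_viability(10.0 ** x, ec50, slope, e_inf) for x in xs]
    area = sum((ys[i] + ys[i + 1]) / 2.0 * (xs[i + 1] - xs[i]) for i in range(steps))
    return area / (x_hi - x_lo)  # normalized so the value lies in [0, 1]


print(ic50(ec50=0.1, slope=1.0), auc(ec50=0.1, slope=1.0))
```

Normalizing the area by the width of the dose range keeps AUC comparable across studies that tested different concentration ranges, which is one reason AUC is often preferred over IC50 when the curve does not cross 50% viability.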
The multi-omics data of cell lines were extracted from the Dependency Map (DepMap) portal of CCLE, including gene expressions, DNA mutations, DNA methylation, gene copy numbers, protein expressions measured by reverse phase protein array (RPPA), and miRNA expressions. Data preprocessing was performed, such as discretizing gene copy numbers and mapping between different gene identifier systems.
Drug information was retrieved from PubChem. Based on the drug SMILES strings, we calculated their molecular fingerprints and descriptors using the Mordred and RDKit Python packages.
Data partition files were generated using the IMPROVE benchmark data preparation pipeline. They indicate, for each modeling analysis run, which samples should be included in the training, validation, and testing sets, for building and evaluating the drug response prediction (DRP) models.
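The sketch below shows one way such partition files could be applied to a response table. It assumes, purely for illustration, that each split file is plain text with one zero-based row index per line; the helper names and the toy rows are hypothetical, and the real file layout should be checked against the benchmark documentation.

```python
def read_split_ids(text):
    """Parse the content of a split file: one integer row index per line."""
    return [int(line) for line in text.splitlines() if line.strip()]


def select_rows(rows, ids):
    """Return the rows of the response table named by a list of indices."""
    return [rows[i] for i in ids]


# Hypothetical response table: (cell line, drug, AUC) tuples.
response = [
    ("CL_1", "Drug_A", 0.91),
    ("CL_1", "Drug_B", 0.42),
    ("CL_2", "Drug_A", 0.77),
    ("CL_3", "Drug_B", 0.35),
    ("CL_3", "Drug_A", 0.88),
]

# Hypothetical split file contents for one analysis run.
train_ids = read_split_ids("0\n1\n2\n")
val_ids = read_split_ids("3\n")
test_ids = read_split_ids("4\n")

# The three sets should not share any sample.
assert not (set(train_ids) & set(val_ids) or set(train_ids) & set(test_ids)
            or set(val_ids) & set(test_ids))

train, val, test = (select_rows(response, ids) for ids in (train_ids, val_ids, test_ids))
print(len(train), len(val), len(test))
```

Using predefined indices rather than re-sampling at training time means every model in a comparison sees exactly the same training, validation, and testing samples.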
More detailed information about the dataset and its construction can be found at https://jdacs4c-improve.github.io/docs/content/app_drp_benchmark.html
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Cancer cell line genetic dependencies estimated using the DEMETER2 model. DEMETER2 is applied to three large-scale RNAi screening datasets: the Broad Institute Project Achilles, Novartis Project DRIVE, and the Marcotte et al. breast cell line dataset. The model is also applied to generate a combined dataset of gene dependencies covering a total of 712 unique cancer cell lines. For more information visit https://depmap.org/R2-D2/. Visit the Cancer Dependency Map portal at https://depmap.org to explore related datasets. Email questions to depmap@broadinstitute.org. This dataset includes gene dependencies estimated using the DEMETER2 model, the raw input datasets used to fit the models, as well as associated metadata. See Readme file for more details about the dataset contents and version history.
Version history (see README for more details):
v1: Initial data release
v2: - Removed small number of non-human genes (e.g. GFP, RFP) from shRNA-to-gene mapping - Updated cell line names to be consistent with DepMap names, according to the following map (old -> new):
v3: Added estimated seed effect matrices
v4: Added RNAseq and mutation data files used in analysis for manuscript
v5: Fixed minor bug with Marcotte LFC data that caused hairpins targeting multiple genes to appear multiple times in the LFC matrix. This created bias in the seed effect estimates for those hairpins, causing very minor differences to the resulting model parameters.
v6: Added tables with shRNA quality metrics for Achilles and DRIVE data
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset supports the "Gene expression has more power for predicting in vitro cancer cell vulnerabilities than genomics" preprint by Dempster et al. To generate the figure panels seen in the preprint using these data, use FigurePanelGeneration.ipynb. This study includes five datasets (citations and details in manuscript).
Achilles: The Broad Institute's DepMap public 19Q4 CRISPR knockout screens processed with CERES
Score: The Wellcome Sanger Institute's Project Score CRISPR knockout screens processed with CERES
RNAi: The DEMETER2-processed combined dataset which includes RNAi data from Achilles, DRIVE, and Marcotte breast screens
PRISM: The PRISM pooled in vitro repurposing primary screen of compounds
GDSC17: Cancer in vitro drug screens performed by Sanger
The file of most interest to a biologist is Summary.csv. If you are interested in trying machine learning, the files Features.hdf5 and Target.hdf5 contain the data munged in a convenient form for standard supervised machine learning algorithms.
Some large files are in the binary format hdf5 for efficiency in space and read-in. These files each contain three named hdf5 datasets. "dim_0" holds the row/index names as an array of strings, "dim_1" holds the column names as an array of strings, and "data" holds the matrix contents as a 2D array of floats.
In Python, these files can be read in with:
    import h5py
    import numpy as np
    import pandas as pd

    def read_hdf5(filename):
        # The context manager closes the file even if reading fails.
        with h5py.File(filename, 'r') as src:
            # Row and column labels are stored as byte strings.
            dim_0 = [x.decode('utf8') for x in src['dim_0']]
            dim_1 = [x.decode('utf8') for x in src['dim_1']]
            data = np.array(src['data'])
        return pd.DataFrame(index=dim_0, columns=dim_1, data=data)
Files (not every dataset will have every type of file listed below):
AllFeaturePredictions.hdf5: Matrix of cell lines by perturbations, with values indicating the predicted viability using a model with all feature types.
ENAdditionScore.csv: A matrix of perturbations by number of features. Values indicate an elastic net model performance (Pearson correlation of concatenated out-of-sample predictions with the values given in Target.hdf5) using only the top X features, where X is the column header.
FeatureDropScore.csv: Perturbations and predictive performance for a model using all single gene expression features EXCEPT those that had greater than 0.1 feature importance in a model trained with all single gene expression features.
Features.hdf5: A very large matrix of all cell lines by all used CCLE cell features. Continuous features were z-scored. Cell lines missing mutation or expression data were dropped. Remaining NA values were imputed to zero. Feature types are indicated by the column name suffixes:
    _Exp: expression
    _Hot: hotspot mutation
    _Dam: damaging mutation
    _OtherMut: other mutation
    _CN: copy number
    _GSEA: ssGSEA score for an MSigDB gene set
    _MethTSS: Methylation of transcription start sites
    _MethCpG: Methylation of CpG islands
    _Fusion: Gene fusions
    _Cell: cell tissue properties
NormLRT.csv: The normLRT score for the given perturbation.
RFAdditionScore.csv: Similar to ENAdditionScore, but using a random forest model.
Summary.csv: A dataframe containing predictive model results. Columns:
    model: Specifies the collection of features used (Expression, Mutation, Exp+CN, etc)
    gene: The perturbation (column in Target.hdf5) examined. Actually a compound for the PRISM and GDSC17 datasets.
    overall_pearson: Pearson correlation of concatenated out-of-sample predictions with the values given in Target.hdf5
    feature: The Nth most important feature, found by retraining the model with all cell lines (N = 0-9)
    feature_importance: The feature importance as assessed by sklearn's RandomForestRegressor
Target.hdf5: A matrix of cell lines by perturbations, with entries indicating post-perturbation viability scores. Note that the scales of the viability effects are different for different datasets. See manuscript methods for details.
PerturbationInfo.csv: Additional drug annotations for the PRISM and GDSC17 datasets.
ApproximateCFE.hdf5: A set of Cancer Functional Event cell features based on CCLE data, adapted from Iorio et al. 2016 (10.1016/j.cell.2016.06.017).
DepMapSampleInfo.csv: Sample info from DepMap_public_19Q4 data, reproduced here as a convenience.
GeneRelationships.csv: A list of genes and their related (partner) genes, with the type of relationship (self, protein-protein interaction, CORUM complex membership, paralog).
OncoKB_oncogenes.csv: A list of genes that have non-expression-based alterations listed as likely oncogenic or oncogenic by OncoKB as of 9 May 2018.
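The suffix convention described for Features.hdf5 makes it straightforward to pick out one feature type by column name. The sketch below works on a plain list of column names; the helper name and the example column names are hypothetical, and real column names in the file may carry additional identifiers before the suffix.

```python
# Suffixes as listed in the Features.hdf5 description.
FEATURE_SUFFIXES = (
    "_Exp", "_Hot", "_Dam", "_OtherMut", "_CN",
    "_GSEA", "_MethTSS", "_MethCpG", "_Fusion", "_Cell",
)


def columns_of_type(columns, suffix):
    """Return the column names that end with the given feature-type suffix."""
    if suffix not in FEATURE_SUFFIXES:
        raise ValueError(f"unknown feature suffix: {suffix}")
    return [c for c in columns if c.endswith(suffix)]


# Hypothetical column names, for illustration only.
columns = ["SOX10_Exp", "TP53_Hot", "TP53_Dam", "MYC_CN", "KRAS_Exp"]
print(columns_of_type(columns, "_Exp"))
```

With a DataFrame loaded from Features.hdf5, the returned list could be used to select a sub-matrix, for example to train a model on expression features only.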
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
We developed a computational method (Celligner) that identifies and removes systematic differences between cell lines and tumor gene expression profiles, allowing for direct integration of existing large-scale cancer cell line and tumor datasets. Celligner performs this computational alignment across cancer types in a completely unsupervised fashion, without relying on prior annotations of cancer types, tumor sample purity, or contaminating cell expression profiles. We applied Celligner to produce a global alignment of 12,236 tumor samples from TCGA, TARGET, and Treehouse datasets and 1,249 cell lines from DepMap. This dataset includes Celligner-aligned data, a matrix of correlations between cell lines and tumors, associated cell line and tumor metadata, and other outputs from the Celligner method. See Readme file for more details about the dataset contents and version history.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This DepMap Release contains new cell models and data from Whole Genome/Exome Sequencing (Copy Number and Mutation), RNA Sequencing (Expression and Fusions), and Genome-wide CRISPR knockout screens. Also included are updated metadata and mapping files for information about cell models and data relationships, respectively. Each release may contain improvements to the pipelines that generate this data, so you may notice changes from the last release. For more information, please see README.txt.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The files contained here are the data files used for analysis and figure generation. Code notebooks within the code folder point to these specific data files. Not all data files used are uploaded to this repository, to avoid redistribution of other published work (specifically HumanNet files, CCLE/DepMap CERES, clinical files - TCGA/OHSU/TARGET data, and the Cancer Gene Census from COSMIC).
Descriptions of data files contained in folder:
AML_age.txt - Curated AML cell line data and age of derived patient.
Avana_Corrected_FC_2020_Q4.txt - CRISPRcleanR-corrected fold-change data of the 2020q4 Avana release.
Avana_NORM_MIXEM_FC_2020_Q4.txt - mu and sigma calculations from the mixture model (k=2) for each screen's null distribution from Avana 2020q4.
avana_output_update_2020_Q4 - Primary data file used to complete figure analysis. Data file contains DepMap cell line ID, Entrez ID, gene name, mean log2FC, CCLE expression, binary classification of mutation status, mixed z-score of gene, binary classification of COSMIC TSG status, binary classification of non-essential gene status, mean log2FC ranking, and hit_mix, which represents PSG classification for each gene-cell line pair from the Avana 2020q4 distribution.
bf_avana_2020q4_CRISPRcleanR_corrected.noNA - CRISPRcleanR-corrected BAGEL scores for the Avana 2020q4 distribution.
data_not_redistributed.xlsx - Description and sources of data not uploaded to figshare to avoid redistribution of other published data.
dPCC-AML-qualFilt-varFilt.txt - Filtered dPCC correlations related to figure 3.
fisher_edges_mix_hits_tsg.txt - Text file of all PSG gene pairs, Fisher's test p-value, and total count of gene observations as a hit (count not used for analysis).
fisher_net_mix_Z_fdr_0.001.txt - FDR < 0.001 filtered network of all PSG gene pairs, Fisher's test p-value, and total count of gene observations as a hit (count not used for analysis). Main network used for analyses.
genes-significant-dPCC-with-chp1-cluster-zSTD-filter.txt - Genes filtered and selected for dPCC heatmap analysis of figure 3e.
Human_net_cutoff_results_updated.txt - HumanNet comparisons and cutoffs used for supplemental figure 4b.
Hunet_comparison_update.Rdata - HumanNet comparisons and cutoffs used for supplemental figure 4a.
JACKS_result_gene_JACKS_results.txt - CRISPRcleanR-corrected JACKS scores for the Avana 2020q4 distribution.
log_normalmixEM.txt - Log file of mixture model iterations of Avana 2020q4.
matrix-GMMZ-qualFilter-varFilter-9055genes-659cells-17aml.txt - Used for selecting appropriate AML cells for dPCC analysis in figure 3e.
metabolite_error.txt - Metabolite variance measurements used in determining viable metabolites for analysis. Metabolites that had measurements below error were not used.
Mix_Z_pr_values_updated.txt - Precision-recall measurements and associated mixed z-scores of PR cutoffs. Used to determine FDR cutoff measurements.
NEGv1.txt - Non-essential genes from BAGEL.
PTEN_CN.txt - PTEN copy number values from CCLE.
Sanger_Corrected_FC.txt - CRISPRcleanR-corrected fold-change data of the Sanger 2019 release.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Additional file 17: Table S12. Characterising [AT]n motif expansion in cancer cell lines. A. All MSI DepMap cancer cell lines with WGS data and three MSS DepMap cell lines with WGS data for comparison. Cell_line_stripped_name: cell line names provided by DepMap without special characters or spaces. disease: cancer type assigned by DepMap. MSI_status: inferred microsatellite status by Ghandi et al. (2019). KMT2D_group: KMT2D group assigned to a cell line. SRA_project_ID: NCBI SRA project ID for raw WGS data. SRA_Run_ID: NCBI SRA run ID for raw WGS data. Instrument: sequencer used to perform WGS. B. Profiles of [AT]n motifs generated using ExpansionHunter Denovo. Cell_line_name: cell line names provided by DepMap without special characters or spaces. disease: cancer type assigned by DepMap. SRA_id: NCBI SRA run ID. group: MSI status and KMT2D group of cell line. Contig: chromosome location of motif. Start: start site (bp) of motif. End: end site (bp) of motif. Motif: motif type. Num_anc_irrs: number of anchored in-repeat reads (i.e. one read of a pair maps within and the other outside of a repeat region; see Methods). norm_num_anc_irrs: normalised number of anchored in-repeat reads. Het_str_size: estimated motif repeat size.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is the result of 318 cancer cell lines screened with the genome-wide KY1.0/1.1 CRISPR KO library by the Sanger Institute, processed with the Achilles pipeline (except QC). The publication describing the experiment is "Prioritization of cancer therapeutic targets using CRISPR-Cas9 screens," DOI 10.1038/s41586-019-1103-9. Read counts were downloaded from https://score.depmap.sanger.ac.uk/downloads on 8 May 2019. Only cell lines annotated by the authors as passing both QC steps in Supplementary Table 1 were retained. Additionally, only cell lines for which the Broad has copy number data as of 10 May 2019 were retained. For more details on included files, see README.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The PRISM Repurposing release contains small molecule viability datasets generated using the Broad Repurposing Library and the PRISM multiplexed cell-line viability assay.
The primary PRISM Repurposing dataset contains the results of pooled-cell line chemical-perturbation viability screens for 5,274 compounds screened against 578 or 562 cell lines.
The secondary PRISM Repurposing dataset contains the results of pooled-cell line chemical-perturbation viability screens for 1,448 compounds screened against 489 cell lines in an 8-step, 4-fold dilution, starting from 10 uM. Technical redos for 147 previously screened oncology compounds have been added as of 11/22/2019. This data is identified using a screen_id of "MTS010". It is recommended to use MTS010 when available. MTS010 was not included in the analysis for Corsello et al. 2019, DOI:10.1101/730119.
Data processing steps are described in the README files. Further descriptions of methods are available at https://doi.org/10.1101/730119.
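For orientation, the dose series implied by the secondary screen description can be worked out directly. The sketch below assumes that "8-step" means eight doses in total, with 10 uM as the highest dose; the function name is hypothetical and the README remains the authority on the exact concentrations used.

```python
def dilution_series(top_dose_um=10.0, fold=4.0, steps=8):
    """Doses in uM, from the top dose downward, each `fold` times lower than the last."""
    return [top_dose_um / fold ** i for i in range(steps)]


doses = dilution_series()
for dose in doses:
    print(f"{dose:.7f} uM")
```

Under these assumptions the series spans 10 uM down to roughly 0.0006 uM, a range of 4 to the 7th power (16,384-fold).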
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Background: Ferroptosis is a form of programmed cell death (PCD) that has been implicated in cancer progression, although the specific mechanism is not known. Here, we used the latest DepMap release CRISPR data to identify the essential ferroptosis-related genes (FRGs) in glioma and their role in patient outcomes.
Methods: RNA-seq and clinical information on glioma cases were obtained from the Chinese Glioma Genome Atlas (CGGA) and The Cancer Genome Atlas (TCGA). FRGs were obtained from the FerrDb database. CRISPR-screened essential genes (CSEGs) in glioma cell lines were downloaded from the DepMap portal. A series of bioinformatic and machine learning approaches were combined to establish FRG signatures to predict overall survival (OS) in glioma patients. In addition, pathway analysis was used to identify the functional roles of FRGs. Somatic mutation, immune cell infiltration, and immune checkpoint gene expression were analyzed within the risk subgroups. Finally, compounds for reversing high-risk gene signatures were predicted using the GDSC and L1000 datasets.
Results: Seven FRGs (ISCU, NFS1, MTOR, EIF2S1, HSPA5, AURKA, RPL8) were included in the model, and the model was found to have good prognostic value (p < 0.001) in both training and validation groups. The risk score was found to be an independent prognostic factor and the model had good efficacy. Subgroup analysis using clinical parameters demonstrated the general applicability of the model. The nomogram indicated that the model could effectively predict 12-, 36-, and 60-month OS and progression-free interval (PFI). The results showed the presence of more aggressive phenotypes (lower numbers of IDH mutations, higher numbers of EGFR and PTEN mutations, greater infiltration of immune suppressive cells, and higher expression of immune checkpoint inhibitors) in the high-risk group. The enriched signaling pathways were closely related to the cell cycle and DNA damage repair. Drug predictions showed that patients with higher risk scores may benefit from treatment with RTK pathway inhibitors, including compounds that inhibit RTKs directly or indirectly by targeting downstream PI3K or MAPK pathways.
Conclusion: In summary, the proposed cancer essential FRG signature predicts survival and treatment response in glioma.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0)https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
RBM39 is an essential component of the spliceosome, playing a critical role in maintaining mRNA integrity. Its depletion significantly exacerbates RNA splicing defects and demonstrates potent anticancer activity. To identify key effectors following RBM39 depletion, we employed a multiomics approach to directly compare two structurally distinct compounds, CB039 and Indisulam. Through proteomic analysis, RNA sequencing, and DepMap dependency assessment, CEP192 emerged as a crucial gene, exhibiting dependency in 96% of the 1,100 analyzed cancer cell lines. In eight cancer cell lines, treatment with both CB039 and Indisulam consistently induced CEP192 exon 42 skipping and reduced CEP192 protein levels. Mechanistically, either CB039 treatment or RNA interference-mediated CEP192 knockdown led to a significant increase in spindle disorganization, as well as chromosome condensation and failed segregation. In conclusion, our characterization of the downstream effects of RBM39 depletion provides novel insights into the therapeutic potential of RBM39 degraders.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data used in analyses from Krill-Burger et al., “Partial inhibition improves identification of cancer vulnerabilities when CRISPR-Cas9 knockout is pan-lethal”. These raw data files include annotations from the Cancer Dependency Map (DepMap) and publicly available datasets that have been pre-processed to have consistent cell line and gene/compound identifiers. See ReadMe.pdf for a detailed description of each file.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Background: Epigenetics regulates gene expression without altering the DNA sequence. An epigenetics-targeted chemotherapeutic approach can be used to overcome treatment resistance and low response rates in HCC. However, comprehensive review of genomic data to determine the role of epigenetics in the tumor microenvironment (TME) and the immune cell-infiltration characteristics in HCC is still insufficient.
Methods: The association between epigenetic-related genes (ERGs), inflammatory response-related genes (IRRGs) and CRISPR genes was determined by merging genomic and CRISPR data. Further, characteristics of immune-cell infiltration in the tumor microenvironment were evaluated.
Results: Nine differentially expressed genes (ANP32B, ASF1A, BCORL1, BMI1, BUB1, CBX2, CBX3, CDK1, and CDK5) were shown to be independent prognostic factors based on lasso regression in the TCGA-LIHC and ICGC databases. In addition, the results showed significant differences in expression of PDCD-1 (PD-1) and CTLA4 between the high- and low-epigenetic score groups. The CTRP and PRISM-derived drug response data yielded four CTRP-derived compounds (SB-743921, GSK461364, gemcitabine, and paclitaxel) and two PRISM-derived compounds (dolastatin-10 and LY2606368). Patients with high ERGs benefited more from immune checkpoint inhibitor (ICI) therapy than patients with low ERGs. In addition, the high ERGs subgroup had a higher T cell exclusion score, while the low ERGs subgroup had a higher T cell dysfunction score. However, there was no difference in microsatellite instability (MSI) score between the two subgroups. Further, genome-wide CRISPR-based loss-of-function screening data derived from DepMap were analyzed to determine key genes leading to HCC development and progression. In total, 640 genes were identified as essential for survival in HCC cell lines. The protein-protein interaction (PPI) network demonstrated that the IRRG PSEN1 was linked to most ERGs and CRISPR genes such as CDK1, TOP2A, CBX2 and CBX3.
Conclusion: Epigenetic alterations of cancer-related genes in the tumor microenvironment play a major role in carcinogenesis. This study showed that epigenetic-related novel biomarkers could be useful in predicting prognosis, clinical diagnosis, and management in HCC.