16 datasets found
  1. public_20q3

    • figshare.com
    txt
    Updated May 31, 2023
    + more versions
    Cite: Broad DepMap (2023). public_20q3 [Dataset]. http://doi.org/10.6084/m9.figshare.12931238.v1
    Available download formats: txt
    Dataset updated: May 31, 2023
    Dataset provided by: Figshare (http://figshare.com/)
    Authors: Broad DepMap
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/ (license information was derived automatically)

    Description

    This dataset contains the results of Avana library CRISPR-Cas9 genome-scale knockout (prefixed with Achilles) as well as mutation, copy number and gene expression data (prefixed with CCLE) for cancer cell lines as part of the Broad Institute’s Cancer Dependency Map project. We have repackaged our fileset to include all quarterly-updating datasets produced by DepMap. The Avana CRISPR-Cas9 genome-scale knockout data has expanded to include 689 cell lines, the RNAseq data includes 1249 cell lines, and the copy number data includes 1682 cell lines. Please see the README files for details regarding updates to data processing pipeline procedures. As our screening efforts continue, we will be releasing additional cancer dependency data on a quarterly basis for unrestricted use. For the latest datasets available, further analyses, and to subscribe to our mailing list, visit https://depmap.org. Descriptions of the experimental methods and the CERES algorithm are published in http://dx.doi.org/10.1038/ng.3984. Some cell lines were processed using copy number data based on the Sanger Institute whole exome sequencing data (COSMIC: http://cancer.sanger.ac.uk.cell_lines, EGA accession number: EGAD00001001039) reprocessed using CCLE pipelines. A detailed description of the pipelines and tool versions for CCLE expression can be found here: https://github.com/broadinstitute/gtex-pipeline/blob/v9/TOPMed_RNAseq_pipeline.md.

  2. Pan-cancer Aberrant Pathway Activity Analysis (PAPAA)

    • zenodo.org
    • explore.openaire.eu
    application/gzip, csv +1
    Updated Dec 5, 2020
    Cite: DANIEL BLANKENBERG; VIJAY NAGAMPALLI (2020). Pan-cancer Aberrant Pathway Activity Analysis (PAPAA) [Dataset]. http://doi.org/10.5281/zenodo.3629709
    Available download formats: application/gzip, tsv, csv
    Dataset updated: Dec 5, 2020
    Dataset provided by: Zenodo (http://zenodo.org/)
    Authors: DANIEL BLANKENBERG; VIJAY NAGAMPALLI
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/ (license information was derived automatically)

    Description

    Information about the dataset files:

    1) pancan_rnaseq_freeze.tsv.gz: Publicly available gene expression data for the TCGA Pan-cancer dataset. File: PanCanAtlas EBPlusPlusAdjustPANCAN_IlluminaHiSeq_RNASeqV2.geneExp.tsv was processed using script process_sample_freeze.py by Gregory Way et al as described in https://github.com/greenelab/pancancer/ data processing and initialization steps. [http://api.gdc.cancer.gov/data/3586c0da-64d0-4b74-a449-5ff4d9136611] [https://doi.org/10.1016/j.celrep.2018.03.046]

    2) pancan_mutation_freeze.tsv.gz: Publicly available Mutational information for TCGA Pan-cancer dataset. File: mc3.v0.2.8.PUBLIC.maf.gz was processed using script process_sample_freeze.py by Gregory Way et al as described in https://github.com/greenelab/pancancer/ data processing and initialization steps. [http://api.gdc.cancer.gov/data/1c8cfe5f-e52d-41ba-94da-f15ea1337efc] [https://doi.org/10.1016/j.celrep.2018.03.046]

    3) pancan_GISTIC_threshold.tsv.gz: Publicly available gene-level copy number information of the TCGA Pan-cancer dataset. This file was processed using script process_copynumber.py by Gregory Way et al as described in https://github.com/greenelab/pancancer/ data processing and initialization steps. The files copy_number_loss_status.tsv.gz and copy_number_gain_status.tsv.gz generated from this data are used as inputs in our Galaxy pipeline. [https://xenabrowser.net/datapages/?cohort=TCGA%20Pan-Cancer%20(PANCAN)&removeHub=https%3A%2F%2Fxena.treehouse.gi.ucsc.edu%3A443] [https://doi.org/10.1016/j.celrep.2018.03.046]

    4) mutation_burden_freeze.tsv.gz: Publicly available Mutational information for TCGA Pan-cancer dataset. File: mc3.v0.2.8.PUBLIC.maf.gz was processed using script process_sample_freeze.py by Gregory Way et al as described in https://github.com/greenelab/pancancer/ data processing and initialization steps. [https://github.com/greenelab/pancancer/] [http://api.gdc.cancer.gov/data/1c8cfe5f-e52d-41ba-94da-f15ea1337efc] [https://doi.org/10.1016/j.celrep.2018.03.046]

    5) sample_freeze.tsv or sample_freeze_version4_modify.tsv: The file lists the frozen samples as determined by TCGA PanCancer Atlas consortium along with raw RNAseq and mutation data. These were previously determined and included for all downstream analysis. All other datasets were processed and subset according to the frozen samples. [https://github.com/greenelab/pancancer/]

    6) vogelstein_cancergenes.tsv: compendium of OG and TSG used for the analysis. [https://github.com/greenelab/pancancer/]

    7) CCLE_DepMap_18Q1_maf_20180207.txt.gz: Publicly available Mutational data for CCLE cell lines from Broad Institute Cancer Cell Line Encyclopedia (CCLE) / DepMap Portal. [https://depmap.org/portal/download/api/download/external?file_name=ccle%2FCCLE_DepMap_18Q1_maf_20180207.txt]

    8) ccle_rnaseq_genes_rpkm_20180929.gct.gz: Publicly available Expression data for 1019 cell lines (RPKM) from Broad Institute Cancer Cell Line Encyclopedia (CCLE) / DepMap Portal. [https://depmap.org/portal/download/api/download/external?file_name=ccle%2Fccle_2019%2FCCLE_RNAseq_genes_rpkm_20180929.gct.gz]

    9) CCLE_MUT_CNA_AMP_DEL_binary_Revealer.gct: Publicly available merged Mutational and copy number alterations that include gene amplifications and deletions for the CCLE cell lines. This data is represented in the binary format and provided by the Broad Institute Cancer Cell Line Encyclopedia (CCLE) / DepMap Portal. [https://data.broadinstitute.org/ccle_legacy_data/binary_calls_for_copy_number_and_mutation_data/CCLE_MUT_CNA_AMP_DEL_binary_Revealer.gct]

    10) GDSC_cell_lines_EXP_CCLE_names.csv.gz: Publicly available RMA normalized expression data for Genomics of Drug Sensitivity in Cancer (GDSC) cell lines. File gdsc_cell_line_RMA_proc_basalExp.csv was downloaded. This data was subset to the 389 cell lines that are common to CCLE and GDSC. All the GDSC cell line names were replaced with CCLE cell line names for further processing. [https://www.cancerrxgene.org/gdsc1000/GDSC1000_WebResources//Data/preprocessed/Cell_line_RMA_proc_basalExp.txt.zip]

    11) GDSC_CCLE_common_mut_cnv_binary.csv.gz: A subset of merged Mutational and copy number alterations that include gene amplifications and deletions for common cell lines between GDSC and CCLE. This file is generated using CCLE_MUT_CNA_AMP_DEL_binary_Revealer.gct and a list of common cell lines.

    12) gdsc1_ccle_pharm_fitted_dose_data.txt.gz: Pharmacological data for GDSC1 cell lines. [ftp://ftp.sanger.ac.uk/pub/project/cancerrxgene/releases/current_release/GDSC1_fitted_dose_response_15Oct19.xlsx]

    13) gdsc2_ccle_pharm_fitted_dose_data.txt.gz: Pharmacological data for GDSC2 cell lines. [ftp://ftp.sanger.ac.uk/pub/project/cancerrxgene/releases/current_release/GDSC2_fitted_dose_response_15Oct19.xlsx]

    14) compounds.csv: list of pharmacological compounds tested for our analysis

    15) tcga_dictonary.tsv: list of cancer types used in the analysis.

    16) seg_based_scores.tsv: Measurement of total copy number burden, Percent of genome altered by copy number alterations. This file was used as part of the Pancancer analysis by Gregory Way et al as described in https://github.com/greenelab/pancancer/ data processing and initialization steps. [https://github.com/greenelab/pancancer/]
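    Items 10 and 11 above describe restricting GDSC data to the cell lines shared with CCLE and renaming them to CCLE names. The following is only a minimal sketch of that kind of subsetting, not the authors' script: the function name, the name mapping, and the toy values are all hypothetical.

```python
def subset_to_common(gdsc_to_ccle, gdsc_expression, ccle_lines):
    """Keep GDSC profiles whose mapped CCLE name is present in ccle_lines.

    The result is keyed by CCLE name, mirroring the renaming step described
    in the text. All inputs are plain dicts/sets (illustrative only).
    """
    common = {}
    for gdsc_name, profile in gdsc_expression.items():
        ccle_name = gdsc_to_ccle.get(gdsc_name)  # None if no mapping exists
        if ccle_name is not None and ccle_name in ccle_lines:
            common[ccle_name] = profile
    return common


# Toy inputs (hypothetical mapping and expression values).
gdsc_to_ccle = {"A549": "A549_LUNG", "HT-29": "HT29_LARGE_INTESTINE"}
gdsc_expression = {
    "A549": [5.1, 7.3],
    "HT-29": [6.0, 2.2],
    "ONLY-GDSC": [1.0, 1.0],  # no CCLE counterpart, so it is dropped
}
ccle_lines = {"A549_LUNG", "HT29_LARGE_INTESTINE", "OTHER_BREAST"}

common = subset_to_common(gdsc_to_ccle, gdsc_expression, ccle_lines)
print(sorted(common))
```

    Keying the result by the CCLE name keeps the renaming and the subsetting in one pass, so downstream joins against CCLE-derived files need no second lookup.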

  3. DepMap 23Q4 Public

    • plus.figshare.com
    txt
    Updated Dec 19, 2023
    + more versions
    Cite: Broad DepMap (2023). DepMap 23Q4 Public [Dataset]. http://doi.org/10.25452/figshare.plus.24667905.v2
    Available download formats: txt
    Dataset updated: Dec 19, 2023
    Dataset provided by: Figshare+
    Authors: Broad DepMap
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/ (license information was derived automatically)

    Description

    This DepMap release contains data from CRISPR knockout screens from project Achilles, as well as genomic characterization data from the CCLE project. For more information, please see README.txt.

  4. Cross-Study Benchmark Dataset for Monotherapy Drug Response Prediction

    • zenodo.org
    zip
    Updated Apr 22, 2025
    Cite: Alexander Partin (2025). Cross-Study Benchmark Dataset for Monotherapy Drug Response Prediction [Dataset]. http://doi.org/10.5281/zenodo.15258883
    Available download formats: zip
    Dataset updated: Apr 22, 2025
    Dataset provided by: Zenodo (http://zenodo.org/)
    Authors: Alexander Partin
    Description

    This benchmark dataset was created and used to train and evaluate models presented in the paper: A. Partin, P. Vasanthakumari et al., "Benchmarking community drug response prediction models: datasets, models, tools, and metrics for cross-dataset generalization analysis."

    This dataset includes four main components: cell line drug response data, cell line multi-omics data, drug feature data, and predefined data partitions for modeling. Drug response data were curated from five pharmacogenomic studies (CCLE, CTRPv2, GDSC1, GDSC2, GCSI), and processed using a unified pipeline for response fitting, omics harmonization, and drug representation.

    Multi-dose viability data were extracted, and a unified dose response fitting pipeline was used to calculate multiple dose-independent response metrics, such as the area under the dose response curve (AUC) and the half-maximal inhibitory concentration (IC50).
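    To make the two metrics named above concrete, here is a minimal sketch that derives IC50 and a normalized AUC from an already-fitted dose response curve. It assumes a Hill-type curve with parameters `ec50`, `hill`, and `e_inf`; that model, the parameter names, and the trapezoidal integration are assumptions for illustration and are not taken from the benchmark's fitting pipeline.

```python
import math


def hill_viability(dose, ec50, hill, e_inf):
    """Viability in [e_inf, 1] under an assumed Hill-type curve."""
    return e_inf + (1.0 - e_inf) / (1.0 + (dose / ec50) ** hill)


def ic50(ec50, hill, e_inf):
    """Dose at which viability reaches 0.5; infinite if the curve never gets there."""
    if e_inf >= 0.5:
        return math.inf
    return ec50 * (0.5 / (0.5 - e_inf)) ** (1.0 / hill)


def normalized_auc(ec50, hill, e_inf, log10_lo, log10_hi, n=200):
    """Trapezoidal area under the curve over log10(dose), divided by the range width."""
    step = (log10_hi - log10_lo) / n
    ys = [
        hill_viability(10 ** (log10_lo + i * step), ec50, hill, e_inf)
        for i in range(n + 1)
    ]
    area = step * (sum(ys) - 0.5 * (ys[0] + ys[-1]))
    return area / (log10_hi - log10_lo)


# Illustrative parameters only: a curve centred in the dose range.
print(ic50(ec50=1.0, hill=1.0, e_inf=0.0))
print(normalized_auc(ec50=1.0, hill=1.0, e_inf=0.0, log10_lo=-3, log10_hi=3))
```

    Because both values are computed from the fitted curve rather than from individual dose points, they can be compared across studies that used different dose grids, which is the sense in which such metrics are dose-independent.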

    The multi-omics data of cell lines were extracted from the Dependency Map (DepMap) portal of CCLE, including gene expressions, DNA mutations, DNA methylation, gene copy numbers, protein expressions measured by reverse phase protein array (RPPA), and miRNA expressions. Data preprocessing was performed, such as discretizing gene copy numbers and mapping between different gene identifier systems.

    Drug information was retrieved from PubChem. Based on the drug SMILES strings, we calculated their molecular fingerprints and descriptors using the Mordred and RDKit Python packages.

    Data partition files were generated using the IMPROVE benchmark data preparation pipeline. They indicate, for each modeling analysis run, which samples should be included in the training, validation, and testing sets, for building and evaluating the drug response prediction (DRP) models.

    More detailed information about the dataset and its construction can be found at https://jdacs4c-improve.github.io/docs/content/app_drp_benchmark.html

  5. DEMETER2 data

    • figshare.com
    txt
    Updated Apr 9, 2020
    + more versions
    Cite: Cancer Data Science (2020). DEMETER2 data [Dataset]. http://doi.org/10.6084/m9.figshare.6025238.v6
    Available download formats: txt
    Dataset updated: Apr 9, 2020
    Dataset provided by: Figshare (http://figshare.com/)
    Authors: Cancer Data Science
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/ (license information was derived automatically)

    Description

    Cancer cell line genetic dependencies estimated using the DEMETER2 model. DEMETER2 is applied to three large-scale RNAi screening datasets: the Broad Institute Project Achilles, Novartis Project DRIVE, and the Marcotte et al. breast cell line dataset. The model is also applied to generate a combined dataset of gene dependencies covering a total of 712 unique cancer cell lines. For more information visit https://depmap.org/R2-D2/. Visit the Cancer Dependency Map portal at https://depmap.org to explore related datasets. Email questions to depmap@broadinstitute.org.

    This dataset includes gene dependencies estimated using the DEMETER2 model, the raw input datasets used to fit the models, as well as associated metadata. See Readme file for more details about the dataset contents and version history.

    Version history (see README for more details):
    v1: Initial data release
    v2: - Removed small number of non-human genes (e.g. GFP, RFP) from shRNA-to-gene mapping
        - Updated cell line names to be consistent with DepMap names, according to the following map (old -> new):
    v3: Added estimated seed effect matrices
    v4: Added RNAseq and mutation data files used in analysis for manuscript
    v5: Fixed minor bug with Marcotte LFC data that caused hairpins targeting multiple genes to appear multiple times in the LFC matrix. This created bias in the seed effect estimates for those hairpins, causing very minor differences to the resulting model parameters.
    v6: Added tables with shRNA quality metrics for Achilles and DRIVE data

  6. Expression vs genomics for predicting dependencies

    • figshare.com
    hdf
    Updated May 17, 2024
    Cite: Broad DepMap (2024). Expression vs genomics for predicting dependencies [Dataset]. http://doi.org/10.6084/m9.figshare.25843450.v1
    Available download formats: hdf
    Dataset updated: May 17, 2024
    Dataset provided by: figshare
    Authors: Broad DepMap
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/ (license information was derived automatically)

    Description

    This dataset supports the "Gene expression has more power for predicting in vitro cancer cell vulnerabilities than genomics" preprint by Dempster et al. To generate the figure panels seen in the preprint using these data, use FigurePanelGeneration.ipynb. This study includes five datasets (citations and details in manuscript).

    Achilles: the Broad Institute's DepMap public 19Q4 CRISPR knockout screens processed with CERES
    Score: The Sanger Wellcome Institute's Project Score CRISPR knockout screens processed with CERES
    RNAi: The DEMETER2-processed combined dataset which includes RNAi data from Achilles, DRIVE, and Marcotte breast screens.
    PRISM: The PRISM pooled in vitro repurposing primary screen of compounds
    GDSC17: Cancer drug in vitro drug screens performed by Sanger

    The files of most interest to a biologist are Summary.csv. If you are interested in trying machine learning, the files Features.hdf5 and Target.hdf5 contain the data munged in a convenient form for standard supervised machine learning algorithms.

    Some large files are in the binary format hdf5 for efficiency in space and read-in. These files each contain three named hdf5 datasets. "dim_0" holds the row/index names as an array of strings, "dim_1" holds the column names as an array of strings, and "data" holds the matrix contents as a 2D array of floats.

    In python, these files can be read in with:

        import h5py
        import numpy as np  # needed for np.array below
        import pandas as pd

        def read_hdf5(filename):
            src = h5py.File(filename, 'r')
            try:
                # Row and column labels are stored as byte strings
                dim_0 = [x.decode('utf8') for x in src['dim_0']]
                dim_1 = [x.decode('utf8') for x in src['dim_1']]
                # Matrix contents as a 2D float array
                data = np.array(src['data'])
                return pd.DataFrame(index=dim_0, columns=dim_1, data=data)
            finally:
                src.close()

    Files (not every dataset will have every type of file listed below):

    AllFeaturePredictions.hdf5: Matrix of cell lines by perturbations, with values indicating the predicted viability using a model with all feature types.
    ENAdditionScore.csv: A matrix of perturbations by number of features. Values indicate an elastic net model performance (Pearson correlation of concatenated out-of-sample predictions with the values given in Target.hdf5) using only the top X features, where X is the column header.
    FeatureDropScore.csv: Perturbations and predictive performance for a model using all single gene expression features EXCEPT those that had greater than 0.1 feature importance in a model trained with all single gene expression features.
    Features.hdf5: A very large matrix of all cell lines by all used CCLE cell features. Continuous features were zscored. Cell lines missing mutation or expression data were dropped. Remaining NA values were imputed to zero. Feature types are indicated by the column suffixes:
        _Exp: expression
        _Hot: hotspot mutation
        _Dam: damaging mutation
        _OtherMut: other mutation
        _CN: copy number
        _GSEA: ssGSEA score for an MSigDB gene set
        _MethTSS: Methylation of transcription start sites
        _MethCpG: Methylation of CpG islands
        _Fusion: Gene fusions
        _Cell: cell tissue properties
    NormLRT.csv: the normLRT score for the given perturbation
    RFAdditionScore.csv: similar to ENAdditionScore, but using a random forest model.
    Summary.csv: A dataframe containing predictive model results. Columns:
        model: Specifies the collection of features used (Expression, Mutation, Exp+CN, etc)
        gene: The perturbation (column in Target.hdf5) examined. Actually a compound for the PRISM and GDSC17 datasets.
        overall_pearson: Pearson correlation of concatenated out-of-sample predictions with the values given in Target.hdf5
        feature: the Nth most important feature, found by retraining the model with all cell lines (N = 0-9)
        feature_importance: the feature importance as assessed by sklearn's RandomForestRegressor
    Target.hdf5: A matrix of cell lines by perturbations, with entries indicating post-perturbation viability scores. Note that the scales of the viability effects are different for different datasets. See manuscript methods for details.
    PerturbationInfo.csv: Additional drug annotations for the PRISM and GDSC17 datasets
    ApproximateCFE.hdf5: A set of Cancer Functional Event cell features based on CCLE data, adapted from Iorio et al. 2016 (10.1016/j.cell.2016.06.017)
    DepMapSampleInfo.csv: sample info from DepMap_public_19Q4 data, reproduced here as a convenience.
    GeneRelationships.csv: A list of genes and their related (partner) genes, with the type of relationship (self, protein-protein interaction, CORUM complex membership, paralog).
    OncoKB_oncogenes.csv: A list of genes that have non-expression-based alterations listed as likely oncogenic or oncogenic by OncoKB as of 9 May 2018.
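    Since the feature type of each Features.hdf5 column is encoded in its suffix, a common first step is to split the column names by type. The sketch below does that on plain strings; the helper name and the example column names are hypothetical, and the exact naming of the part before the suffix in the real file is not specified here.

```python
# Suffixes as listed in the dataset description for Features.hdf5.
FEATURE_SUFFIXES = [
    "_Exp", "_Hot", "_Dam", "_OtherMut", "_CN",
    "_GSEA", "_MethTSS", "_MethCpG", "_Fusion", "_Cell",
]


def group_by_feature_type(columns):
    """Map each feature type (suffix without the underscore) to its columns.

    Columns that match none of the listed suffixes are collected under
    the key "unknown" rather than silently dropped.
    """
    groups = {}
    for col in columns:
        for suffix in FEATURE_SUFFIXES:
            if col.endswith(suffix):
                groups.setdefault(suffix[1:], []).append(col)
                break
        else:
            groups.setdefault("unknown", []).append(col)
    return groups


# Hypothetical column names for illustration.
example_columns = ["KRAS_Exp", "TP53_Hot", "MYC_CN", "unlabelled"]
groups = group_by_feature_type(example_columns)
print(groups)
```

    With a DataFrame loaded by a reader such as the one above, something like `df[groups["Exp"]]` would then select only the expression features.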

  7. Celligner data

    • figshare.com
    bin
    Updated Dec 7, 2020
    Cite: Cancer Data Science (2020). Celligner data [Dataset]. http://doi.org/10.6084/m9.figshare.11965269.v5
    Available download formats: bin
    Dataset updated: Dec 7, 2020
    Dataset provided by: Figshare (http://figshare.com/)
    Authors: Cancer Data Science
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/ (license information was derived automatically)

    Description

    We developed a computational method (Celligner) that identifies and removes systematic differences between cell lines and tumor gene expression profiles, allowing for direct integration of existing large-scale cancer cell line and tumor datasets. Celligner performs this computational alignment across cancer types in a completely unsupervised fashion, without relying on prior annotations of cancer types, tumor sample purity, or contaminating cell expression profiles. We applied Celligner to produce a global alignment of 12,236 tumor samples from TCGA, TARGET, and Treehouse datasets and 1,249 cell lines from DepMap. This dataset includes Celligner-aligned data, a matrix of correlations between cell lines and tumors, associated cell line and tumor metadata, and other outputs from the Celligner method. See Readme file for more details about the dataset contents and version history.

  8. DepMap 24Q4 Public

    • plus.figshare.com
    bin
    Updated Dec 10, 2024
    + more versions
    Cite: Broad DepMap (2024). DepMap 24Q4 Public [Dataset]. http://doi.org/10.25452/figshare.plus.27993248.v1
    Available download formats: bin
    Dataset updated: Dec 10, 2024
    Dataset provided by: Figshare+
    Authors: Broad DepMap
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/ (license information was derived automatically)

    Description

    This DepMap Release contains new cell models and data from Whole Genome/Exome Sequencing (Copy Number and Mutation), RNA Sequencing (Expression and Fusions), Genome-wide CRISPR knockout screens. Also included are updated metadata and mapping files for information about cell models and data relationships, respectively. Each release may contain improvements to our pipelines that generate this data so you may notice changes from the last release. For more information, please see README.txt.

  9. Data Files - Discovery of putative tumor suppressors from CRISPR screens...

    • figshare.com
    txt
    Updated Oct 15, 2021
    Cite: Walter Lenoir; Traver Hart (2021). Data Files - Discovery of putative tumor suppressors from CRISPR screens reveals rewired lipid metabolism in acute myeloid leukemia cells [Dataset]. http://doi.org/10.6084/m9.figshare.16746040.v1
    Available download formats: txt
    Dataset updated: Oct 15, 2021
    Dataset provided by: figshare
    Authors: Walter Lenoir; Traver Hart
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/ (license information was derived automatically)

    Description

    The files here are the data files used for analysis and figure generation. Code notebooks within the code folder will point to these specific data files. Not all data files used are uploaded to this specific repository to avoid redistribution of other published work (specifically HumanNet files, CCLE/DepMap CERES, clinical files - TCGA/OHSU/TARGET data, and the Cancer Gene Census from COSMIC).

    Descriptions of data files contained in folder:

    AML_age.txt - curated AML cell line data and age of derived patient.
    Avana_Corrected_FC_2020_Q4.txt - Crispr cleanR corrected fold-change data of the 2020q4 Avana release.
    Avana_NORM_MIXEM_FC_2020_Q4.txt - mu and sigma calculations Mixed model (k=2) for each screen's null distribution from Avana 2020q4.
    avana_output_update_2020_Q4 - Primary data file used to complete figure analysis. Data file contains depmap cell line id, entrez id, gene name, mean log2FC, CCLE expression, binary classification of mutation status, mixed z-score of gene, binary classification of cosmic TSG status, binary classification of non essential gene status, mean log2FC ranking, and hit_mix, which represents PSG classification for each gene-cell line pair from the Avana 2020q4 distribution.
    bf_avana_2020q4_CRISPRcleanR_corrected.noNA - Crispr cleanR corrected bagel scores for the Avana 2020q4 distribution.
    data_not_redistributed.xlsx - description and sources of data not uploaded to figshare to avoid redistribution of other published data.
    dPCC-AML-qualFilt-varFilt.txt - filtered dPCC correlations related to figure 3.
    fisher_edges_mix_hits_tsg.txt - Text file of all PSG gene pairs, and fishers test pvalue, and total count of gene observations as a hit (count not used for analysis).
    fisher_net_mix_Z_fdr_0.001.txt - FDR < 0.001 filtered network of all PSG gene pairs, and fishers test pvalue, and total count of gene observations as a hit (count not used for analysis). Main network used for analyses.
    genes-significant-dPCC-with-chp1-cluster-zSTD-filter.txt - Genes filtered and selected for dPCC heatmap analysis of figure 3e.
    Human_net_cutoff_results_updated.txt - Human net comparisons and cutoffs used for supplemental figure 4b.
    Hunet_comparison_update.Rdata - Human net comparisons and cutoffs used for supplemental figure 4a.
    JACKS_result_gene_JACKS_results.txt - Crispr cleanR corrected JACKS scores for Avana 2020q4 distribution.
    log_normalmixEM.txt - log file of mixture model iterations of avana2020q4.
    matrix-GMMZ-qualFilter-varFilter-9055genes-659cells-17aml.txt - Selecting appropriate AML cells for dpcc analysis in figure 3e.
    metabolite_error.txt - Metabolite variance measurements used in determining viable metabolites for analysis. Metabolites that had measurements below error were not used.
    Mix_Z_pr_values_updated.txt - precision recall measurements and associated mixed z-scores of pr cutoffs. Used to determine FDR cutoff measurements.
    NEGv1.txt - Non essential genes from bagel.
    PTEN_CN.txt - PTEN copy number values from CCLE.
    Sanger_Corrected_FC.txt - Crispr cleanR corrected fold-change data of the Sanger 2019 release.

  10. Additional file 17 of Mapping in silico genetic networks of the KMT2D tumour...

    • figshare.com
    xlsx
    Updated Nov 23, 2024
    + more versions
    Cite: Yuka Takemon; Erin D. Pleasance; Alessia Gagliardi; Christopher S. Hughes; Veronika Csizmok; Kathleen Wee; Diane L. Trinh; Ryan D. Huff; Andrew J. Mungall; Richard A. Moore; Eric Chuah; Karen L. Mungall; Eleanor Lewis; Jessica Nelson; Howard J. Lim; Daniel J. Renouf; Steven JM. Jones; Janessa Laskin; Marco A. Marra (2024). Additional file 17 of Mapping in silico genetic networks of the KMT2D tumour suppressor gene to uncover novel functional associations and cancer cell vulnerabilities [Dataset]. http://doi.org/10.6084/m9.figshare.27894380.v1
    Available download formats: xlsx
    Dataset updated: Nov 23, 2024
    Dataset provided by: figshare
    Authors: Yuka Takemon; Erin D. Pleasance; Alessia Gagliardi; Christopher S. Hughes; Veronika Csizmok; Kathleen Wee; Diane L. Trinh; Ryan D. Huff; Andrew J. Mungall; Richard A. Moore; Eric Chuah; Karen L. Mungall; Eleanor Lewis; Jessica Nelson; Howard J. Lim; Daniel J. Renouf; Steven JM. Jones; Janessa Laskin; Marco A. Marra
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/ (license information was derived automatically)

    Description

    Additional file 17: Table S12. Characterising [AT]n motif expansion in cancer cell lines. A. All MSI DepMap cancer cell lines with WGS data and three MSS DepMap cell lines with WGS data for comparison. Cell_line_stripped_name: cell line names provided by DepMap without special characters or spaces. disease: cancer type assigned by DepMap. MSI_status: inferred microsatellite status by Ghandi et al. (2019). KMT2D_group: KMT2D group assigned to a cell line. SRA_project_ID: NCBI SRA project ID for raw WGS data. SRA_Run_ID: NCBI SRA run ID for raw WGS data. Instrument: sequencer used to perform WGS. B. Profiles of [AT]n motifs generated using ExpansionHunter Denovo. Cell_line_name: cell line names provided by DepMap without special characters or spaces. disease: cancer type assigned by DepMap. SRA_id: NCBI SRA run ID. group: MSI status and KMT2D group of cell line. Contig: chromosome location of motif. Start: start site (bp) of motif. End: end site (bp) of motif. Motif: motif type. Num_anc_irrs: number of anchored in-repeat reads (i.e. one read pair maps within and the other outside of a repeat region; see Methods). norm_num_anc_irrs: normalised number of anchored in-repeat reads. Het_str_size: estimated motif repeat size.

  11. Project SCORE processed with CERES

    • figshare.com
    txt
    Updated Jun 4, 2023
    Cite: Broad DepMap (2023). Project SCORE processed with CERES [Dataset]. http://doi.org/10.6084/m9.figshare.9116732.v2
    Available download formats: txt
    Dataset updated: Jun 4, 2023
    Dataset provided by: Figshare (http://figshare.com/)
    Authors: Broad DepMap
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/ (license information was derived automatically)

    Description

    This dataset is the result of 318 cancer cell lines screened with the genome-wide KY1.0/1.1 CRISPR KO library by the Sanger Institute, processed with the Achilles pipeline (except QC). The publication describing the experiment is "Prioritization of cancer therapeutic targets using CRISPR-Cas9 screens," DOI 10.1038/s41586-019-1103-9. Readcounts were downloaded from https://score.depmap.sanger.ac.uk/downloads on 8 May 2019. Only cell lines annotated by the authors as passing both QC steps in Supplementary Table 1 were retained. Additionally, only cell lines for which the Broad has copy number data as of 10 May 2019 were retained. For more details on included files, see README.

  12. PRISM Repurposing 20Q2 Dataset

    • figshare.com
    txt
    Updated Aug 23, 2022
    + more versions
    Cite: Broad DepMap; Steven Corsello; Mustafa Kocak; Todd Golub (2022). PRISM Repurposing 20Q2 Dataset [Dataset]. http://doi.org/10.6084/m9.figshare.20564034.v1
    Available download formats: txt
    Dataset updated: Aug 23, 2022
    Dataset provided by: figshare
    Authors: Broad DepMap; Steven Corsello; Mustafa Kocak; Todd Golub
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/ (license information was derived automatically)

    Description

    The PRISM Repurposing release contains small molecule viability datasets generated using the Broad Repurposing Library and the PRISM multiplexed cell-line viability assay. The primary PRISM Repurposing dataset contains the results of pooled-cell line chemical-perturbation viability screens for 5,274 compounds screened against 578 or 562 cell lines. The secondary PRISM Repurposing dataset contains the results of pooled-cell line chemical-perturbation viability screens for 1,448 compounds screened against 489 cell lines in an 8-step, 4-fold dilution, starting from 10uM. Technical redos for 147 previously screened oncology compounds have been added as of 11/22/2019. This data is identified using a screen_id of "MTS010". It is recommended to use MTS010 when available. MTS010 was not included in the analysis for Corsello et al. 2019, DOI:10.1101/730119. Data processing steps are described in the README files. Further descriptions of methods are available at https://doi.org/10.1101/730119.
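    The secondary screen's dose layout (8 steps, 4-fold dilution, starting from 10uM) can be written out explicitly. This small sketch assumes the 10 uM top dose counts as the first of the 8 steps; the description does not state this, and the function name is illustrative.

```python
def dilution_series(top_dose, fold, steps):
    """Return `steps` doses, starting at top_dose and dividing by `fold` each step."""
    return [top_dose / fold ** i for i in range(steps)]


# Doses in micromolar, highest first (assumes the top dose is step 1 of 8).
doses_um = dilution_series(top_dose=10.0, fold=4, steps=8)
for dose in doses_um:
    print(f"{dose:.6f} uM")
```

    Under that assumption the lowest dose is 10 / 4**7 uM, so the series spans a bit more than four orders of magnitude.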

  13. Table_2_A Novel Prognostic Signature Based on Glioma Essential...

    • frontiersin.figshare.com
    xlsx
    Updated Jun 4, 2023
    + more versions
    Cite
    Debo Yun; Xuya Wang; Wenbo Wang; Xiao Ren; Jiabo Li; Xisen Wang; Jianshen Liang; Jie Liu; Jikang Fan; Xiude Ren; Hao Zhang; Guanjie Shang; Jingzhang Sun; Lei Chen; Tao Li; Chen Zhang; Shengping Yu; Xuejun Yang (2023). Table_2_A Novel Prognostic Signature Based on Glioma Essential Ferroptosis-Related Genes Predicts Clinical Outcomes and Indicates Treatment in Glioma.xlsx [Dataset]. http://doi.org/10.3389/fonc.2022.897702.s002
    Dataset updated
    Jun 4, 2023
    Dataset provided by
    Frontiers
    Authors
    Debo Yun; Xuya Wang; Wenbo Wang; Xiao Ren; Jiabo Li; Xisen Wang; Jianshen Liang; Jie Liu; Jikang Fan; Xiude Ren; Hao Zhang; Guanjie Shang; Jingzhang Sun; Lei Chen; Tao Li; Chen Zhang; Shengping Yu; Xuejun Yang
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Background: Ferroptosis is a form of programmed cell death (PCD) that has been implicated in cancer progression, although the specific mechanism is not known. Here, we used the latest DepMap release CRISPR data to identify the essential ferroptosis-related genes (FRGs) in glioma and their role in patient outcomes.

    Methods: RNA-seq and clinical information on glioma cases were obtained from the Chinese Glioma Genome Atlas (CGGA) and The Cancer Genome Atlas (TCGA). FRGs were obtained from the FerrDb database. CRISPR-screened essential genes (CSEGs) in glioma cell lines were downloaded from the DepMap portal. A series of bioinformatic and machine learning approaches were combined to establish FRG signatures to predict overall survival (OS) in glioma patients. In addition, pathway analysis was used to identify the functional roles of FRGs. Somatic mutation, immune cell infiltration, and immune checkpoint gene expression were analyzed within the risk subgroups. Finally, compounds for reversing high-risk gene signatures were predicted using the GDSC and L1000 datasets.

    Results: Seven FRGs (ISCU, NFS1, MTOR, EIF2S1, HSPA5, AURKA, RPL8) were included in the model, which had good prognostic value (p < 0.001) in both training and validation groups. The risk score was found to be an independent prognostic factor and the model had good efficacy. Subgroup analysis using clinical parameters demonstrated the general applicability of the model. The nomogram indicated that the model could effectively predict 12-, 36-, and 60-month OS and progression-free interval (PFI). The results showed the presence of more aggressive phenotypes (lower numbers of IDH mutations, higher numbers of EGFR and PTEN mutations, greater infiltration of immune suppressive cells, and higher expression of immune checkpoint inhibitors) in the high-risk group. The enriched signaling pathways were closely related to the cell cycle and DNA damage repair. Drug predictions showed that patients with higher risk scores may benefit from treatment with RTK pathway inhibitors, including compounds that inhibit RTKs directly or indirectly by targeting downstream PI3K or MAPK pathways.

    Conclusion: In summary, the proposed cancer essential FRG signature predicts survival and treatment response in glioma.

  14. Data from: Synthesis of an RBM39 Degrader That Downregulates CEP192 and...

    • acs.figshare.com
    csv
    Updated Mar 20, 2025
    Xilin Lyu; Xiancheng Wang; Dongze Lin; Yuhan Lu; Chenxu Wang; Ziqin Yan; Zhiyi Wang; Ying Cheng; Jing Cheng; Xuelian Ren; Yi Su; Shijie Zhang; Yi Chen; He Huang; Yujun Zhao (2025). Synthesis of an RBM39 Degrader That Downregulates CEP192 and Induces Disorganized Spindle Structures [Dataset]. http://doi.org/10.1021/acs.jmedchem.5c00534.s002
    Dataset updated
    Mar 20, 2025
    Dataset provided by
    ACS Publications
    Authors
    Xilin Lyu; Xiancheng Wang; Dongze Lin; Yuhan Lu; Chenxu Wang; Ziqin Yan; Zhiyi Wang; Ying Cheng; Jing Cheng; Xuelian Ren; Yi Su; Shijie Zhang; Yi Chen; He Huang; Yujun Zhao
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    RBM39 is an essential component of the spliceosome, playing a critical role in maintaining mRNA integrity. Its depletion significantly exacerbates RNA splicing defects and demonstrates potent anticancer activity. To identify key effectors following RBM39 depletion, we employed a multiomics approach to directly compare two structurally distinct compounds, CB039 and Indisulam. Through proteomic analysis, RNA sequencing, and DepMap dependency assessment, CEP192 emerged as a crucial gene, exhibiting dependency in 96% of the 1,100 analyzed cancer cell lines. In eight cancer cell lines, treatment with both CB039 and Indisulam consistently induced CEP192 exon 42 skipping and reduced CEP192 protein levels. Mechanistically, either CB039 treatment or RNA interference-mediated CEP192 knockdown led to a significant increase in spindle disorganization, as well as chromosome condensation and failed segregation. In conclusion, our characterization of the downstream effects of RBM39 depletion provides novel insights into the therapeutic potential of RBM39 degraders.

  15. Raw data (CRISPR vs. RNAi)

    • figshare.com
    pdf
    Updated Feb 25, 2022
    John Michael Krill-Burger (2022). Raw data (CRISPR vs. RNAi) [Dataset]. http://doi.org/10.6084/m9.figshare.16735132.v1
    Dataset updated
    Feb 25, 2022
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    John Michael Krill-Burger
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data used in analyses from Krill-Burger et al., “Partial inhibition improves identification of cancer vulnerabilities when CRISPR-Cas9 knockout is pan-lethal”. These raw data files include annotations from the Cancer Dependency Map (DepMap) and publicly available datasets that have been pre-processed to have consistent cell line and gene/compound identifiers. See ReadMe.pdf for a detailed description of each file.

  16. DataSheet_1_Epigenetic and Immune-Cell Infiltration Changes in the Tumor...

    • frontiersin.figshare.com
    bin
    Updated Jun 5, 2023
    Zeng-Hong Wu; Dong-Liang Yang; Liang Wang; Jia Liu (2023). DataSheet_1_Epigenetic and Immune-Cell Infiltration Changes in the Tumor Microenvironment in Hepatocellular Carcinoma.docx [Dataset]. http://doi.org/10.3389/fimmu.2021.793343.s001
    Dataset updated
    Jun 5, 2023
    Dataset provided by
    Frontiers
    Authors
    Zeng-Hong Wu; Dong-Liang Yang; Liang Wang; Jia Liu
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Background: Epigenetic mechanisms regulate gene expression without altering the DNA sequence. Epigenetics-targeted chemotherapeutic approaches can be used to overcome treatment resistance and low response rates in HCC. However, comprehensive reviews of genomic data to determine the role of epigenetics in the tumor microenvironment (TME) and the immune-cell infiltration characteristics in HCC are still insufficient.

    Methods: The association between epigenetic-related genes (ERGs), inflammatory response-related genes (IRRGs) and CRISPR genes was determined by merging genomic and CRISPR data. Further, characteristics of immune-cell infiltration in the tumor microenvironment were evaluated.

    Results: Nine differentially expressed genes (ANP32B, ASF1A, BCORL1, BMI1, BUB1, CBX2, CBX3, CDK1, and CDK5) were shown to be independent prognostic factors based on lasso regression in the TCGA-LIHC and ICGC databases. In addition, the results showed significant differences in expression of PDCD-1 (PD-1) and CTLA4 between the high- and low-epigenetic score groups. The CTRP and PRISM-derived drug response data yielded four CTRP-derived compounds (SB-743921, GSK461364, gemcitabine, and paclitaxel) and two PRISM-derived compounds (dolastatin-10 and LY2606368). Patients with high ERGs benefited more from immune checkpoint inhibitor (ICI) therapy than patients with low ERGs. In addition, the high ERGs subgroup had a higher T cell exclusion score, while the low ERGs subgroup had a higher T cell dysfunction score. However, there was no difference in microsatellite instability (MSI) score between the two subgroups. Further, genome-wide CRISPR-based loss-of-function screening data derived from DepMap were used to determine key genes leading to HCC development and progression. In total, 640 genes were identified as essential for survival in HCC cell lines. The protein-protein interaction (PPI) network demonstrated that the IRRG PSEN1 was linked to most ERGs and CRISPR genes such as CDK1, TOP2A, CBX2 and CBX3.

    Conclusion: Epigenetic alterations of cancer-related genes in the tumor microenvironment play a major role in carcinogenesis. This study showed that epigenetic-related novel biomarkers could be useful in predicting prognosis, clinical diagnosis, and management in HCC.
