6 datasets found
  1. DepMap 19Q4 Public

    • figshare.com
    txt
    Updated May 31, 2023
    Cite
    Broad DepMap (2023). DepMap 19Q4 Public [Dataset]. http://doi.org/10.6084/m9.figshare.11384241.v3
    Available download formats: txt
    Dataset updated
    May 31, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Broad DepMap
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains the results of Avana library CRISPR-Cas9 genome-scale knockout screens (files prefixed with Achilles) as well as mutation, copy number, and gene expression data (files prefixed with CCLE) for cancer cell lines, as part of the Broad Institute’s Cancer Dependency Map project. We have repackaged our fileset to include all quarterly-updating datasets produced by DepMap.

    The Avana CRISPR-Cas9 genome-scale knockout data has expanded to include 689 cell lines, the RNAseq data includes 1249 cell lines, and the copy number data includes 1682 cell lines. Please see the README files for details regarding updates to the data processing pipeline procedures.

    As our screening efforts continue, we will be releasing additional cancer dependency data on a quarterly basis for unrestricted use. For the latest datasets available, further analyses, and to subscribe to our mailing list, visit https://depmap.org.

    Descriptions of the experimental methods and the CERES algorithm are published in http://dx.doi.org/10.1038/ng.3984. Some cell lines were processed using copy number data based on the Sanger Institute whole exome sequencing data (COSMIC: http://cancer.sanger.ac.uk.cell_lines, EGA accession number: EGAD00001001039) reprocessed using CCLE pipelines. A detailed description of the pipelines and tool versions for CCLE expression can be found here: https://github.com/broadinstitute/gtex-pipeline/blob/v9/TOPMed_RNAseq_pipeline.md.

    ## V2 Changes
    CCLE_fusions.csv and CCLE_fusions_unfiltered.csv were swapped in v1; they are correct now.

    ## V3 Changes
    UACC62_SKIN_CJ1_RESISTANT has been removed from Public 19Q4 Achilles files due to an issue with fingerprinting. Values for this cell line have been set to NA in the following files: Achilles_gene_effect.csv, Achilles_gene_effect_unscaled.csv, Achilles_gene_dependency.csv, Achilles_logfold_change.csv, Achilles_raw_readcounts.csv.
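Because the V3 note says one cell line's values were blanked out rather than deleted, a reader of the Achilles files may want to skip rows that contain no usable values. The sketch below is one way to do that with the standard library. It assumes (the description does not say this) that rows are cell lines, the first column is a cell line identifier, the remaining columns are genes, and a missing value appears as an empty field or the string NA. The function name, the column names, and the toy values are hypothetical.

```python
import csv
import io


def load_gene_effect(handle):
    """Read a gene-effect CSV into {cell_line: {gene: score}}, skipping all-NA rows."""
    reader = csv.reader(handle)
    header = next(reader)
    genes = header[1:]  # assumed layout: first column holds the cell line ID
    effects = {}
    for row in reader:
        cell_line, raw_values = row[0], row[1:]
        scores = {}
        for gene, value in zip(genes, raw_values):
            if value.strip() not in ("", "NA"):
                scores[gene] = float(value)
        if scores:  # a fully blanked-out cell line has no usable values
            effects[cell_line] = scores
    return effects


# Toy stand-in for a file such as Achilles_gene_effect.csv (values are made up).
toy_csv = io.StringIO(
    "cell_line,GENE_X,GENE_Y\n"
    "LINE_A,-0.8,-0.3\n"
    "LINE_B,NA,NA\n"
    "LINE_C,0.1,NA\n"
)
effects = load_gene_effect(toy_csv)
print(sorted(effects))
```

Keeping partially missing rows (LINE_C here) while dropping fully missing ones is a design choice for this sketch; a stricter analysis might drop any row with a missing value.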

  2. Data from: Covalent disruptor of YAP-TEAD association suppresses defective...

    • data.niaid.nih.gov
    • datadryad.org
    zip
    Updated Nov 7, 2022
    Cite
    Mengyang Fan; Wenchao Lu; Jianwei Che; Nicholas Kwiatkowski; Yang Gao; Hyuk-Soo Seo; Scott Ficarro; Prafulla Gokhale; Yao Liu; Ezekiel Geffken; Jimit Lakhani; Kijun Song; Miljan Kuljanin; Wenzhi Ji; Jie Jiang; Zhixiang He; Jason Tse; Andrew Boghossian; Matthew Rees; Melissa Ronan; Jennifer Roth; Joseph Mancias; Jarrod Marto; Sirano Dhe-Paganon; Tinghu Zhang; Nathanael Gray (2022). Covalent disruptor of YAP-TEAD association suppresses defective Hippo signaling [Dataset]. http://doi.org/10.5061/dryad.rxwdbrvbn
    Available download formats: zip
    Dataset updated
    Nov 7, 2022
    Dataset provided by
    Stanford University
    Chinese Academy of Sciences
    Broad Institute
    Dana-Farber Cancer Institute
    Authors
    Mengyang Fan; Wenchao Lu; Jianwei Che; Nicholas Kwiatkowski; Yang Gao; Hyuk-Soo Seo; Scott Ficarro; Prafulla Gokhale; Yao Liu; Ezekiel Geffken; Jimit Lakhani; Kijun Song; Miljan Kuljanin; Wenzhi Ji; Jie Jiang; Zhixiang He; Jason Tse; Andrew Boghossian; Matthew Rees; Melissa Ronan; Jennifer Roth; Joseph Mancias; Jarrod Marto; Sirano Dhe-Paganon; Tinghu Zhang; Nathanael Gray
    License

    CC0 1.0: https://spdx.org/licenses/CC0-1.0.html

    Description

    The transcription factor TEAD, together with its coactivator YAP/TAZ, is a key transcriptional modulator of the Hippo pathway. Activation of TEAD transcription by YAP has been implicated in a number of malignancies, and this complex represents a promising target for drug discovery. Here, we employed a covalent fragment screening approach followed by structure-based design to develop an irreversible TEAD inhibitor, MYF-03-69. Using a range of in vitro and cell-based assays, we demonstrated that, through covalent binding to the TEAD palmitate pocket, MYF-03-69 disrupts YAP-TEAD association, suppresses TEAD transcriptional activity, and inhibits cell growth of Hippo signaling-defective malignant pleural mesothelioma (MPM). Further, a cell viability screen with a panel of 903 cancer cell lines indicated a high correlation between TEAD-YAP dependency and sensitivity to MYF-03-69. To validate MYF-03-69 as a potent and selective pan-TEAD inhibitor, we interrogated its proteome-wide selectivity profile on cysteine labeling using a streamlined cysteine activity-based protein profiling (SLC-ABPP) approach and generated the spreadsheet "Supplementary_Dataset_1._Proteome-wide_selectivity_profile_of_MYF-03-69_on_cysteines_labeling_using_SLC-ABPP_approach". We employed the cysteine-reactive desthiobiotin iodoacetamide (DBIA) probe, which was reported to map more than 8,000 cysteines, and performed a competition study on NCI-H226 cells pretreated with 0.5, 2, 10, or 25 µM MYF-03-69 for 3 hours in triplicate. The cysteines that were conjugated >50% (competition ratio CR>2) compared to the DMSO control were analyzed and assigned to protein targets. In the DMSO control group, although DBIA mapped 12,498 cysteines in total, the TEAD PBP cysteines were not detected. This might be due to low TEAD1-4 protein abundance and/or the inability of the PBP cysteines to be labeled, given that they are mostly modified by palmitate under physiological conditions.
    Among 12,498 mapped cysteines, only 7 were significantly labeled (i.e., exhibited >50% conjugation, or CR>2) by 25 µM MYF-03-69, and all of these sites exhibited dose-dependent engagement. To study the whole-transcriptome perturbation by the TEAD inhibitor MYF-03-69, mRNA sequencing was performed in NCI-H226 cells treated with 0.1 μM, 0.5 μM, and 2 μM MYF-03-69, generating the spreadsheet "Supplementary_Dataset_2._List_of_differentially_expressed_genes_under_MYF-03-69_treatments". The genes that were differentially expressed with statistical significance (fold change > 1.5 and adjusted p-value < 0.05) are listed in this dataset. To investigate whether TEAD inhibition by MYF-03-69 was selectively lethal to YAP/TEAD-dependent cancers, PRISM screening across a broad panel of cell lineages was performed, generating the spreadsheet "Supplementary_Dataset_3". 903 cancer cell lines were treated with the TEAD inhibitor MYF-03-69 for 5 days. Viability was measured at 8 doses (3-fold dilutions from 10 μM), and a dose-response curve was fitted for each cell line. Area under the curve (AUC) was calculated as a measure of compound effect on cell viability. CERES scores of YAP1 or TEADs from the CRISPR (Avana) Public 21Q1 dataset (DepMap) are listed in the spreadsheet and were used to estimate gene dependency. The CERES score of the most dependent TEAD isoform was used to represent TEAD dependency. Using the PRISM screen dataset for MYF-03-69, we investigated whether TEAD inhibition recapitulates the outcome of genetic knockout of YAP or TEADs and generated the spreadsheet "Supplementary_Dataset_4". Correlation analysis was performed between compound PRISM sensitivity (log2.AUC of each cell line) and dependency on a given gene (CRISPR knockout score for each cell line, from the DepMap Public 20Q4 Achilles_gene_effect.csv dataset) across the PRISM cell line panel. The Pearson correlation coefficients and associated p-values were computed.
    Positive correlations correspond to dependency correlating with increased sensitivity. The q-values (corrected significance values accounting for the false discovery rate) are computed from p-values using the Benjamini-Hochberg algorithm. Associations with q-values above 0.1 are filtered out. This correlation analysis reveals that the dependency scores of TEAD1 and YAP1 from the genomic knockout dataset (DepMap portal) showed the highest correlation with the compound PRISM sensitivity profile. This is followed by TP53BP2, a gene that is also involved in the Hippo pathway as an activator of TAZ.

    Methods

    For "Supplementary_Dataset_1._Proteome-wide_selectivity_profile_of_MYF-03-69_on_cysteines_labeling_using_SLC-ABPP_approach", the data were collected on NCI-H226 cells using the same methods reported in the reference paper (Kuljanin, M.; Mitchell, D. C.; Schweppe, D. K.; Gikandi, A. S.; Nusinow, D. P.; Bulloch, N. J.; Vinogradova, E. V.; Wilson, D. L.; Kool, E. T.; Mancias, J. D.; Cravatt, B. F.; Gygi, S. P., Reimagining high-throughput profiling of reactive cysteines for cell-based screening of large electrophile libraries. Nature Biotechnology 2021, 39, 630-641). The competition ratio (CR) was calculated as described in that reference paper. For "Supplementary_Dataset_2._List_of_differentially_expressed_genes_under_MYF-03-69_treatments", the data were collected on NCI-H226 cells treated with MYF-03-69 at the indicated concentrations for 6 hours (n=3). RNA was extracted using the RNeasy Plus Mini Kit (Qiagen, cat. no. 74134) according to the manufacturer's instructions. Libraries were then prepared using Roche Kapa mRNA HyperPrep strand-specific sample preparation kits from 200 ng of purified total RNA according to the manufacturer’s protocol on a Beckman Coulter Biomek i7.
    The finished dsDNA libraries were quantified by Qubit fluorometer and Agilent TapeStation 4200. Uniquely dual-indexed libraries were pooled in an equimolar ratio and shallowly sequenced on an Illumina MiSeq to further evaluate library quality and pool balance. The final pool was sequenced on an Illumina NovaSeq 6000 targeting 40 million 100 bp read pairs per library at the Dana-Farber Cancer Institute Molecular Biology Core Facilities. Sequenced reads were aligned to the UCSC hg19 reference genome assembly and gene counts were quantified using STAR (v2.7.3a). Differential gene expression testing was performed with DESeq2 (v1.22.1). RNAseq analysis was performed using the VIPER Snakemake pipeline. KEGG pathway enrichment analysis was performed through the Metascape web portal. For "Supplementary_Dataset_3", the data were collected using the methods reported in the reference paper "Discovering the anticancer potential of non-oncology drugs by systematic viability profiling" (Nature Cancer). Briefly, up to 931 barcoded cell lines in pools of 20-25 were thawed and plated into 384-well plates (1250 cells/well for adherent cell pools, 2000 cells/well for suspension or mixed suspension/adherent cell pools) containing compound (top concentration: 10 µM, 8-point, threefold dilution). All conditions were tested in triplicate. Cells were lysed after 5 days of treatment, and mRNA-based Luminex detection of barcode abundance from lysates was carried out as in the reference paper above. Luminex median fluorescence intensity (MFI) data were input to a standardized R pipeline (https://github.com/broadinstitute/prism_data_processing) to generate viability estimates relative to vehicle treatment for each cell line and treatment condition, and to fit dose-response curves from viability data. CERES scores of YAP1 or TEADs from the CRISPR (Avana) Public 21Q1 dataset (DepMap) were downloaded from the DepMap portal (DepMap Data Downloads) and listed with the viability data.
    For "Supplementary_Dataset_4", the data are the correlation analysis results for "Supplementary_Dataset_3", computed with the R pipeline mentioned above (https://github.com/broadinstitute/prism_data_processing).
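The description above mentions Pearson correlation, Benjamini-Hochberg q-values, and a q-value cutoff of 0.1. The following is a minimal, standard-library sketch of those three steps, written for illustration only. It is not the R pipeline the authors used, the gene labels and p-values are invented, and the p-values are supplied directly because computing them from a correlation coefficient requires a t-distribution that is outside the standard library.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg q-values in the same order as the input p-values."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each q-value is the minimum
    # of p * m / rank over all ranks at or above its own.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


# Hypothetical per-gene p-values from a sensitivity-vs-dependency correlation.
p_by_gene = {"GENE_A": 0.001, "GENE_B": 0.004, "GENE_C": 0.03, "GENE_D": 0.5}
genes = list(p_by_gene)
q_values = benjamini_hochberg([p_by_gene[g] for g in genes])

# Keep associations whose q-value does not exceed the 0.1 cutoff.
kept = {g: q for g, q in zip(genes, q_values) if q <= 0.1}
print(kept)
```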

  3. cross-dataset-drp-paper

    • zenodo.org
    zip
    Updated Apr 22, 2025
    Cite
    A. Partin (2025). cross-dataset-drp-paper [Dataset]. http://doi.org/10.5281/zenodo.15258451
    Available download formats: zip
    Dataset updated
    Apr 22, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    A. Partin
    Description

    This benchmark data was used to train and evaluate the models presented in the paper: A. Partin and P. Vasanthakumari et al. "Benchmarking community drug response prediction models: datasets, models, tools, and metrics for cross-dataset generalization analysis"

    The benchmark data for Cross-Study Analysis (CSA) include four kinds of data: cell line response data, cell line multi-omics data, drug feature data, and data partitions. The figure below illustrates the curation, processing, and assembly of benchmark data, and a unified schema for data curation. Cell line response data were extracted from five large-scale cell line drug screening studies: the Cancer Cell Line Encyclopedia (CCLE), the Cancer Therapeutics Response Portal version 2 (CTRPv2), the Genomics of Drug Sensitivity in Cancer version 1 (GDSC1), the Genomics of Drug Sensitivity in Cancer version 2 (GDSC2), and the Genentech Cell Line Screening Initiative (GCSI). We extracted their multi-dose viability data and used a unified dose response fitting pipeline to calculate multiple dose-independent response metrics, as shown in the figure below, such as the area under the dose response curve (AUC) and the half-maximal inhibitory concentration (IC50). The multi-omics data of cell lines were extracted from the Dependency Map (DepMap) portal of CCLE, including gene expressions, DNA mutations, DNA methylation, gene copy numbers, protein expressions measured by reverse phase protein array (RPPA), and miRNA expressions. Data preprocessing was performed, such as discretizing gene copy numbers and mapping between different gene identifier systems. Drug information was retrieved from PubChem. Based on the drug SMILES (Simplified Molecular Input Line Entry Specification) strings, we calculated their molecular fingerprints and descriptors using the Mordred and RDKit Python packages. Data partition files were generated using the IMPROVE benchmark data preparation pipeline. They indicate, for each modeling analysis run, which samples should be included in the training, validation, and testing sets, for building and evaluating the drug response prediction (DRP) models.

    The table below shows the numbers of cell lines, drugs, and experiments in each dataset. Across the five datasets, there are 785 unique cell lines and 749 unique drugs. All cell lines have gene expression, mutation, DNA methylation, and copy number data available. 760 of the cell lines have RPPA protein expressions, and 781 of them have miRNA expressions.

    Further description is provided here: https://jdacs4c-improve.github.io/docs/content/app_drp_benchmark.html

  4. Webster Supplemental Output

    • figshare.com
    txt
    Updated Jan 19, 2022
    Cite
    Joshua Pan (2022). Webster Supplemental Output [Dataset]. http://doi.org/10.6084/m9.figshare.14963561.v2
    Available download formats: txt
    Dataset updated
    Jan 19, 2022
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Joshua Pan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Webster Supplemental Output

    This repository contains data to support Pan et al., "Sparse dictionary learning recovers pleiotropy from human cell fitness screens". There are four groups of data:

    ## Tables (xlsx)
    These are the Excel files to support manuscript submission. Each file contains a Readme as the first sheet with data descriptions.
    * Table S1: Webster output from genotoxic fitness screen data: dictionary matrix, loadings matrix, and annotations.
    * Table S2: UMAP embedding coordinates for genotoxic fitness screen data.
    * Table S3: Webster output from Cancer Dependency Map data: dictionary matrix, loadings matrix, and annotations.
    * Table S4: UMAP embedding coordinates for Cancer Dependency Map data.
    * Table S5: Mass spectrometry peptide counts for immunoprecipitations.
    * Table S6: The maximum subcellular localization score for each of the functions learned from Cancer Dependency Map data.
    * Table S7: Compound-to-function loadings for annotated compounds from PRISM primary and secondary screens.

    ## depmap (tsv)
    These are flat files that are the basis for the Tables above, and represent the raw input and outputs of Webster.
    * depmap_cell_line_info: Annotations for each cell line.
    * depmap_dictionary: Webster dictionary matrix inferred from fitness data.
    * depmap_fn_annot_gprofiler: Annotations derived from gProfiler using gene loadings on each function.
    * depmap_fn_biomarkers: Random forest modeling results using cell line features to predict the fitness effect of each function in the dictionary.
    * depmap_fn_manual_name: Manual name for each function, derived from above resources.
    * depmap_fn_subcell_raw_matrix: Matrix cross product between Go et al. localization scores and our gene-to-function loadings.
    * depmap_fn_subcell: The subcell localization information used for coloring functions in the global embedding.
    * depmap_gene_loadings: Webster gene-to-function loadings matrix, inferred from fitness data.
    * depmap_gene_meta: Gene-centric information and useful links.
    * depmap_input: Pre-processed Cancer Dependency Map (DepMap) data that is the input to Webster.
    * depmap_umap: Embedding coordinates.

    ## genotoxic (tsv)
    Same structure as above, but for Webster input from the smaller genotoxic fitness dataset (Olivieri et al 2020).
    * genotoxic_dictionary
    * genotoxic_gene_loadings
    * genotoxic_gene_meta
    * genotoxic_input
    * genotoxic_umap

    ## prism (tsv)
    Results of projecting PRISM screening data (Corsello et al 2020) into a latent space inferred from DepMap data.
    * prism_embedding: Same as depmap_umap above, except with the addition of selected compounds into the embedding, as well as compound meta information useful for labeling the plot.
    * prism_primary_imputed: Input for projection into the Webster latent space. This is preprocessed and filtered for high-variance, well annotated compounds.
    * prism_primary_meta: Compound annotations for primary screen data.
    * prism_primary_omp: Compound-to-function loadings learned by Orthogonal Matching Pursuit.
    * prism_primary_proj_results: Summary statistics for projection results.
    * prism_secondary_imputed: Input for projection into the Webster latent space. This is preprocessed and filtered for high-variance, well annotated compounds, treated at many doses.
    * prism_secondary_meta: Compound annotations for secondary screen data.
    * prism_secondary_omp: Compound-to-function loadings learned by Orthogonal Matching Pursuit.
    * prism_secondary_proj_results: Summary statistics for projection results.
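The prism omp files listed above are described as loadings learned by Orthogonal Matching Pursuit (OMP). For readers unfamiliar with the technique, here is a small pure-Python sketch of generic OMP: it greedily picks the dictionary atom most correlated with the current residual, then refits all selected coefficients by least squares. This is a textbook illustration and not the Webster code. It assumes unit-norm atoms, and the function names and toy vectors are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def solve(matrix, rhs):
    """Solve a small square linear system by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def omp(atoms, signal, n_nonzero):
    """Approximate signal as a combination of at most n_nonzero atoms.

    Returns {atom_index: coefficient}. Atoms are assumed to have unit norm.
    """
    residual = list(signal)
    support, coefs = [], []
    for _ in range(n_nonzero):
        # Score every unused atom by its absolute correlation with the residual.
        scores = [-1.0 if k in support else abs(dot(atom, residual))
                  for k, atom in enumerate(atoms)]
        support.append(max(range(len(atoms)), key=lambda k: scores[k]))
        # Refit all selected coefficients jointly via the normal equations.
        gram = [[dot(atoms[i], atoms[j]) for j in support] for i in support]
        target = [dot(atoms[i], signal) for i in support]
        coefs = solve(gram, target)
        residual = [s - sum(c * atoms[k][t] for c, k in zip(coefs, support))
                    for t, s in enumerate(signal)]
    return dict(zip(support, coefs))


# Toy dictionary of three unit-norm atoms; the signal is twice the second atom.
atoms = [[1.0, 0.0], [0.6, 0.8], [0.0, 1.0]]
signal = [1.2, 1.6]
loadings = omp(atoms, signal, n_nonzero=1)
print(loadings)
```

In the setting described above, each atom would play the role of a learned function's fitness profile and the signal would be a compound's sensitivity profile, though the exact preprocessing used by the authors is not shown here.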

  5. Raw data (CRISPR vs. RNAi)

    • figshare.com
    pdf
    Updated Feb 25, 2022
    Cite
    John Michael Krill-Burger (2022). Raw data (CRISPR vs. RNAi) [Dataset]. http://doi.org/10.6084/m9.figshare.16735132.v1
    Available download formats: pdf
    Dataset updated
    Feb 25, 2022
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    John Michael Krill-Burger
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data used in analyses from Krill-Burger et al., “Partial inhibition improves identification of cancer vulnerabilities when CRISPR-Cas9 knockout is pan-lethal”. These raw data files include annotations from the Cancer Dependency Map (DepMap) and publicly available datasets that have been pre-processed to have consistent cell line and gene/compound identifiers. See ReadMe.pdf for a detailed description of each file.

  6. Cross-Study Benchmark Dataset for Monotherapy Drug Response Prediction

    • zenodo.org
    zip
    Updated Apr 22, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Alexander Partin (2025). Cross-Study Benchmark Dataset for Monotherapy Drug Response Prediction [Dataset]. http://doi.org/10.5281/zenodo.15258883
    Available download formats: zip
    Dataset updated
    Apr 22, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Alexander Partin
    Description

    This benchmark dataset was created and used to train and evaluate models presented in the paper: A. Partin, P. Vasanthakumari et al., "Benchmarking community drug response prediction models: datasets, models, tools, and metrics for cross-dataset generalization analysis."

    This dataset includes four main components: cell line drug response data, cell line multi-omics data, drug feature data, and predefined data partitions for modeling. Drug response data were curated from five pharmacogenomic studies (CCLE, CTRPv2, GDSC1, GDSC2, GCSI), and processed using a unified pipeline for response fitting, omics harmonization, and drug representation.

    Multi-dose viability data were extracted, and a unified dose response fitting pipeline was used to calculate multiple dose-independent response metrics, such as the area under the dose response curve (AUC) and the half-maximal inhibitory concentration (IC50).
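To make the two response metrics named above concrete, the sketch below computes a normalized AUC and an interpolated IC50 from a multi-dose viability curve. It is illustrative only and is not the benchmark's fitting pipeline: the Hill curve parameters, the eight-point threefold dose series, and the choice to normalize AUC by the log-dose range are all assumptions made for this example.

```python
import math


def hill(conc, ec50, slope, e_inf=0.0):
    """Hill-type viability curve: 1.0 at zero dose, approaching e_inf at high dose."""
    return e_inf + (1.0 - e_inf) / (1.0 + (conc / ec50) ** slope)


def normalized_auc(concs, viabilities):
    """Trapezoidal area under viability vs log10(dose), divided by the log-dose range."""
    logs = [math.log10(c) for c in concs]
    area = sum((logs[i + 1] - logs[i]) * (viabilities[i] + viabilities[i + 1]) / 2.0
               for i in range(len(logs) - 1))
    return area / (logs[-1] - logs[0])


def ic50(concs, viabilities):
    """Dose at which viability crosses 0.5, by log-linear interpolation.

    Returns None when the curve never drops to 0.5 within the tested range.
    """
    for i in range(len(concs) - 1):
        v0, v1 = viabilities[i], viabilities[i + 1]
        if v0 >= 0.5 >= v1 and v0 != v1:
            frac = (v0 - 0.5) / (v0 - v1)
            l0, l1 = math.log10(concs[i]), math.log10(concs[i + 1])
            return 10 ** (l0 + frac * (l1 - l0))
    return None


# Hypothetical dose series: eight doses in ascending order, threefold steps up to 10.
concs = [10.0 / 3 ** k for k in range(7, -1, -1)]
viab = [hill(c, ec50=1.0, slope=1.0) for c in concs]

auc = normalized_auc(concs, viab)
half_max = ic50(concs, viab)
print(round(auc, 3), round(half_max, 3))
```

A lower normalized AUC indicates a stronger response, and an IC50 of None flags curves where the compound never halves viability in the tested range, which is one reason AUC is often preferred as a dose-independent summary.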

    The multi-omics data of cell lines were extracted from the Dependency Map (DepMap) portal of CCLE, including gene expressions, DNA mutations, DNA methylation, gene copy numbers, protein expressions measured by reverse phase protein array (RPPA), and miRNA expressions. Data preprocessing was performed, such as discretizing gene copy numbers and mapping between different gene identifier systems.

    Drug information was retrieved from PubChem. Based on the drug SMILES strings, we calculated their molecular fingerprints and descriptors using the Mordred and RDKit Python packages.

    Data partition files were generated using the IMPROVE benchmark data preparation pipeline. They indicate, for each modeling analysis run, which samples should be included in the training, validation, and testing sets, for building and evaluating the drug response prediction (DRP) models.
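As a rough picture of what a data partition conveys, the sketch below builds disjoint training, validation, and testing index lists for one hypothetical run. It does not reproduce the IMPROVE pipeline or its file format; the function name, the 80/10/10 proportions, and the seeded shuffle are assumptions chosen for illustration.

```python
import random


def make_partition(n_samples, val_frac=0.1, test_frac=0.1, seed=0):
    """Split sample indices 0..n_samples-1 into disjoint train/val/test lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded so a run can be reproduced
    n_val = int(n_samples * val_frac)
    n_test = int(n_samples * test_frac)
    return {
        "val": sorted(indices[:n_val]),
        "test": sorted(indices[n_val:n_val + n_test]),
        "train": sorted(indices[n_val + n_test:]),
    }


# One hypothetical modeling run over 20 response samples.
split = make_partition(20, seed=1)
print({name: len(ids) for name, ids in split.items()})
```

Storing indices rather than copies of the data lets every model in a benchmark train and test on exactly the same samples, which is what makes results comparable across models.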

    More detailed information about the dataset and its construction can be found at https://jdacs4c-improve.github.io/docs/content/app_drp_benchmark.html


DepMap 19Q4 Public: 21 scholarly articles cite this dataset.