Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Results of transcript sequencing for AtT-20FlpIn cells. mRNA was isolated from AtT-20FlpIn cells using standard procedures, and next-generation sequencing was performed by Macrogen (https://dna.macrogen.com/). A report outlining the workflow and data analysis methods is available from the authors on request.
The deposited data are in an Excel file that includes the gene symbol, transcript ID from the reference mouse genome, protein ID and transcript abundance. The AtT-20FlpIn cells were generated by Dr Santiago and have been used as the 'wild type' cells for generating cell lines stably expressing GPCRs and ion channels for most of the molecular pharmacology projects in the Molecular Pharmacodynamics group.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The SUM human breast cancer cell lines have been used by many labs around the world to develop extensive data sets derived from comparative genomic hybridization analysis, gene expression profiling, whole exome sequencing, and reverse phase protein array analysis. In a previous study, the authors of this paper performed genome-scale shRNA essentiality screens on the entire SUM line panel, as well as on MCF10A cells, MCF-7 cells, and MCF-7LTED cells. In this study, the authors have developed the SUM Breast Cancer Cell Line Knowledge Base (SLKBase) to make all of these omics data sets available to users of the SUM lines, and to allow users to mine the data and analyse them with respect to biological pathways enriched by the data in each cell line.
Data access: All the datasets supporting the findings of this study are publicly available in the SLKBase platform here: https://sumlineknowledgebase.com/. RPPA data, drug sensitivity data, apelisib response data, and dose response data are also part of this figshare data record (https://doi.org/10.6084/m9.figshare.12497630).
Study aims and methodology: This web-based knowledge base provides users with data and information on the derivation of each of the cell lines, provides narrative summaries of the genomics and cell biology of each breast cancer cell line, and provides protocols for the proper maintenance of the cells. The database includes a series of data mining tools that allow rapid identification of the functional oncogene signatures for each line, the enrichment of any KEGG pathway with screen hit and gene expression data for each of the lines, and a rapid analysis of protein and phospho-protein expression for the cell lines. A gene search tool that returns all of the functional genome and functional druggable data for any gene for the entire cell line panel is included. Additionally, the authors have expanded the database to include functional genomic data for an additional 29 commonly used breast cancer cell lines.
The three overarching goals in the original development of the SLKBase are: 1) to provide a rich source of information for anyone working with any of the SUM breast cancer cell lines, 2) to give researchers ready access to the large genomic data sets that have been developed with these cells, and 3) to allow researchers to perform orthogonal analyses of the various genomics data sets that we and others have obtained from the SUM lines. For more information on the development and contents of the database, please read the related article.
Datasets supporting the paper: The data mining tools accessed the following datasets to generate the figures and tables, and these datasets are downloadable from the Data Download centre on the SLKBase:
Exome sequencing data: SLKBase.exome_.seq_.sum_.xlsx
Gene amplification and expression data for the SUM cell lines: SUM44amplificationdata.xls, SUM52.xls, SUM149.xls, SUM159.xls, SUM185.xls, SUM190.xls, SUM225.xls, SUM229.xls, SUM1315.xls
Cellecta shRNA screen data for the SUM cell lines: SUM44Celectadata.csv, SUM52Cellectadata.csv, SUM102Cellectadata.csv, SUM149Cellectadata.csv, SUM159Cellectadata.csv, SUM185Cellectadata.csv, SUM190Cellectadata.csv, SUM225Cellectadata.csv, SUM229Cellectadata.csv, SUM1315hits.hit.csv, MCF10A.hits_.csv
Breast cancer cell line data included in this data record (these datasets were used to generate figures 1, 2 and 7 in the article):
Proteomics data from the Reverse Phase Protein Array (RPPA) assay analysis: Ethier.SUMline.RPPA.xlsx
Drug sensitivity data: NAVITOCLAX.drugsensitivity.Zscores.xlsx
Apelisib response data: Apelisib all lines (2).xlsx
Dose response data: 092614 Dose Response CP 52s.11.15.xlsx
All the files are either in .xlsx or .csv file format.
This data package contains expression profiles for proteins in normal and cancer tissues. It also contains data on sequence-based RNA levels in human tissues and cell lines.
https://www.proteinatlas.org/about/licence
The Cell Atlas provides high-resolution insights into the expression and spatio-temporal distribution of proteins within human cells. Using a panel of 64 cell lines to represent various cell populations in different organs and tissues of the human body, the mRNA expression of all human genes is characterized by deep RNA-sequencing. The subcellular distribution of each protein is investigated in a subset of cell lines selected based on corresponding gene expression. The protein localization data is derived from antibody-based profiling by immunofluorescence confocal microscopy, and classified into 32 different organelles and fine subcellular structures. The Cell Atlas currently covers 12390 genes (63%) for which there are available antibodies. It offers a database for exploring details of individual genes and proteins of interest, as well as systematically analyzing transcriptomes and proteomes in broader contexts, in order to increase our understanding of human cells.
DLD-1 and MOLT-4 cell lines were cultured in a rotating cell culture system (RCCS) to simulate microgravity, and the mRNA expression profile was compared with that of static controls. Cells were grown in 10 mL rotating vessels in an RCCS (test) and in 60 mm Petri dishes (control). Two replicates each of test (microgravity) and control (static) samples from DLD-1 and MOLT-4 were analyzed by microarray. Simulated microgravity markedly affected the solid tumor cell line DLD-1, which showed a higher percentage of dysregulated genes than the hematological tumor cell line MOLT-4. Microgravity affects the cell cycle of DLD-1 cells and disturbs expression of cell cycle regulatory gene networks. Multiple microRNA host genes were dysregulated; notably, the tumor suppressor microRNA mir-22 is highly upregulated in DLD-1.
Journal article published in PLOS One, Vol 20, Issue 5, e0320862, 2025; DOI: https://doi.org/10.1371/journal.pone.0320862; PMC12064016. The datasets generated and analyzed during the current study are provided in Supplemental S1 File. The RNA-seq data is Protein Atlas Version 23 from the Human Protein Atlas website (https://www.proteinatlas.org/about/download, “RNA HPA cell line gene data” released 2023.06.19). All FASTQ files and aligned counts for the U.S. EPA TempO-seq data have been deposited into NCBI Gene Expression Omnibus under the accession number GSE288929 and are publicly available at: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE288929. The R code is available through FigShare at: https://doi.org/10.23645/epacomptox.27341970.v1. This dataset is associated with the following publication: Word, L., C. Willis, R. Judson, L. Everett, S. Davidson-Fritz, D. Haggard, B. Chambers, J. Rogers, J. Bundy, I. Shah, N. Sipes, and J. Harrill. TempO-seq and RNA-seq Gene Expression Levels are Highly Correlated for Most Genes: A Comparison Using 39 Human Cell Lines. PLOS ONE. Public Library of Science, San Francisco, CA, USA, 20(5): e0320862, (2025).
https://www.proteinatlas.org/about/licence
The subcellular resource of the Human Protein Atlas provides high-resolution insights into the expression and spatiotemporal distribution of proteins encoded by 13534 genes (67% of the human protein-coding genes), as well as predictions for an additional 3491 secreted or membrane proteins, covering a total of 17025 genes (84% of the human protein-coding genes). For each gene, the subcellular distribution of the protein has been investigated by immunofluorescence (ICC-IF) and confocal microscopy in up to three different standard cell lines, selected from a panel of 41 cell lines used in the subcellular resource. For some genes, the protein has also been stained in up to three ciliated cell lines and/or in human sperm cells. Upon image analysis, the subcellular localization of the protein has been classified into one or more of 49 different organelles and subcellular structures. In addition, the resource includes an annotation of genes that display single-cell variation in protein expression levels and/or subcellular distribution, as well as an extended analysis of cell cycle dependency of such variations. The subcellular resource offers a database for detailed exploration of individual genes and proteins of interest, as well as for systematic analysis of proteomes in a broader context. More information about the content of the resource, as well as the generation and analysis of the data, can be found in the Methods summary. Learn about:
The subcellular distribution of proteins in human cell lines. The subcellular distribution of proteins in human sperm. The proteomes of different organelles and subcellular structures. Single-cell variability in the expression levels and/or localizations of proteins.
Expression data from pancreatic cancer cell lines and the non-neoplastic pancreatic cell line HPDE. To identify genes epigenetically silenced and regulated in pancreatic cancer, we compared the gene expression profiles of 6 pancreatic cancer cell lines (panc215, A32-1, A38-5, panc2.5, panc2.8, and panc3.014) to the non-neoplastic pancreas cell line, HPDE. We also compared the baseline gene expression of the pancreatic cancer cell lines to expression patterns after treatment with 5-aza-dC alone, TSA alone, and a combination of 5-aza-dC/TSA.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
(A) Targeted inhibitors from the CCLE database. (B) Targeted inhibitors from the Sanger COSMIC database. (C) Selected targeted inhibitors in our lab showing significant differences in sensitivity (except XAV939) towards the two clusters of cell lines. Y-axis is the IC50 values in log10 scale. P-value is computed by Mann Whitney U-test. Horizontal bars are medians for sample distributions.
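The comparison named in the caption can be sketched as follows. This is a minimal standard-library illustration, not the authors' analysis code: the IC50 values are made up, the function names are hypothetical, and the p-value uses a plain normal approximation (no tie or continuity correction), which differs from an exact test for small samples.

```python
import math

def average_ranks(values):
    """Return 1-based ranks of `values`, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def mann_whitney_u(x, y):
    """Return (U statistic for x, two-sided p-value by normal approximation)."""
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u1 - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))
    return u1, p

# Hypothetical IC50 values for two clusters of cell lines, compared on the log10 scale.
cluster_a = [math.log10(v) for v in [0.1, 0.5, 1.0, 2.0]]
cluster_b = [math.log10(v) for v in [5.0, 10.0, 20.0, 50.0]]
u_stat, p_value = mann_whitney_u(cluster_a, cluster_b)
```

Because the test is rank-based, the log10 transform changes the plotted scale but not the resulting U statistic.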
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Gene expression (counts) scRNA-seq of co-cultured cancer- and immune cells treated with trifluridine and DMSO control assayed at two time-points (12h and 72h).
HCT116 cells were seeded in 6-well Nunc plates (50,000 cells/3mL/well) and precultured for 24 h before PBMCs were added at a 1:8 ratio. Co-cultures were treated with DMSO vehicle (0.1%) or FTD (3mM) for 12 h or 72 h. Dead cell removal with the MACS Dead Cell Removal Kit (Miltenyi Biotec, Gladbach, DEU) was performed according to the manufacturer’s instructions on cells treated for 72 h to increase the viability of the samples before RNA-sequencing. Samples treated for 12 h were not subjected to dead cell removal because their viability was already sufficient. All samples were washed in PBS with 0.04% BSA (2x1mL). Chromium Next GEM Single Cell 3’ library preparation and RNA-sequencing were performed by the SNP&SEQ Technology Platform (National Genomics Infrastructure (NGI), Science for Life Laboratory, Uppsala University, Sweden).
This data set contains processed data using Cell Ranger toolkit version 5.0.1 provided by 10x Genomics, for demultiplexing, aligning reads to the human reference genome GRCh38, and generating gene-cell unique molecular identifiers
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Data set dimensions: 57 rows x 54357 columns
This data set contains RMA normalized log expression values for 54356 genes identified by their ENSEMBL ID (columns 1-54356) for 57 cancer cell lines, together with their respective proliferation rates (column 54357).
Gene expression was obtained from the GSE29682 GEO HuEx 1.0 ST microarray data.
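Given the layout described above (expression values in all but the last column, proliferation rate in the last column), the table can be split into features and target as sketched below. This is a minimal sketch: the CSV layout with a header row, the column names and the values are assumptions made for illustration, not taken from the deposited file.

```python
import csv
import io

def split_expression_and_rate(rows):
    """Split each row into expression values (all but the last column)
    and the proliferation rate (last column)."""
    features, rates = [], []
    for row in rows:
        *expression, rate = (float(value) for value in row)
        features.append(expression)
        rates.append(rate)
    return features, rates

# Tiny in-memory stand-in for the 57 x 54357 table (hypothetical IDs and values).
demo_csv = (
    "ENSG_A,ENSG_B,ENSG_C,proliferation_rate\n"
    "7.1,5.3,9.8,0.031\n"
    "6.4,5.9,10.2,0.027\n"
)
reader = csv.reader(io.StringIO(demo_csv))
header = next(reader)          # gene IDs followed by the rate column name
X, y = split_expression_and_rate(reader)
```

For the real file, the same function would be applied to a reader opened on the downloaded table, after checking that its delimiter and header match these assumptions.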
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Information about the dataset files:
1) pancan_rnaseq_freeze.tsv.gz: Publicly available gene expression data for the TCGA Pan-cancer dataset. File: PanCanAtlas EBPlusPlusAdjustPANCAN_IlluminaHiSeq_RNASeqV2.geneExp.tsv was processed using script process_sample_freeze.py by Gregory Way et al as described in https://github.com/greenelab/pancancer/ data processing and initialization steps. [http://api.gdc.cancer.gov/data/3586c0da-64d0-4b74-a449-5ff4d9136611] [https://doi.org/10.1016/j.celrep.2018.03.046]
2) pancan_mutation_freeze.tsv.gz: Publicly available Mutational information for TCGA Pan-cancer dataset. File: mc3.v0.2.8.PUBLIC.maf.gz was processed using script process_sample_freeze.py by Gregory Way et al as described in https://github.com/greenelab/pancancer/ data processing and initialization steps. [http://api.gdc.cancer.gov/data/1c8cfe5f-e52d-41ba-94da-f15ea1337efc] [https://doi.org/10.1016/j.celrep.2018.03.046]
3) pancan_GISTIC_threshold.tsv.gz: Publicly available gene-level copy number information of the TCGA Pan-cancer dataset. This file is processed using script process_copynumber.py by Gregory Way et al as described in https://github.com/greenelab/pancancer/ data processing and initialization steps. The files copy_number_loss_status.tsv.gz and copy_number_gain_status.tsv.gz generated from this data are used as inputs in our Galaxy pipeline. [https://xenabrowser.net/datapages/?cohort=TCGA%20Pan-Cancer%20(PANCAN)&removeHub=https%3A%2F%2Fxena.treehouse.gi.ucsc.edu%3A443] [https://doi.org/10.1016/j.celrep.2018.03.046]
4) mutation_burden_freeze.tsv.gz: Publicly available Mutational information for TCGA Pan-cancer dataset. File: mc3.v0.2.8.PUBLIC.maf.gz was processed using script process_sample_freeze.py by Gregory Way et al as described in https://github.com/greenelab/pancancer/ data processing and initialization steps. [https://github.com/greenelab/pancancer/] [http://api.gdc.cancer.gov/data/1c8cfe5f-e52d-41ba-94da-f15ea1337efc] [https://doi.org/10.1016/j.celrep.2018.03.046]
5) sample_freeze.tsv or sample_freeze_version4_modify.tsv: The file lists the frozen samples as determined by TCGA PanCancer Atlas consortium along with raw RNAseq and mutation data. These were previously determined and included for all downstream analysis. All other datasets were processed and subset according to the frozen samples. [https://github.com/greenelab/pancancer/]
6) vogelstein_cancergenes.tsv: compendium of OG and TSG used for the analysis. [https://github.com/greenelab/pancancer/]
7) CCLE_DepMap_18Q1_maf_20180207.txt.gz: Publicly available Mutational data for CCLE cell lines from Broad Institute Cancer Cell Line Encyclopedia (CCLE) / DepMap Portal. [https://depmap.org/portal/download/api/download/external?file_name=ccle%2FCCLE_DepMap_18Q1_maf_20180207.txt]
8) ccle_rnaseq_genes_rpkm_20180929.gct.gz: Publicly available Expression data for 1019 cell lines (RPKM) from Broad Institute Cancer Cell Line Encyclopedia (CCLE) / DepMap Portal. [https://depmap.org/portal/download/api/download/external?file_name=ccle%2Fccle_2019%2FCCLE_RNAseq_genes_rpkm_20180929.gct.gz]
9) CCLE_MUT_CNA_AMP_DEL_binary_Revealer.gct: Publicly available merged Mutational and copy number alterations that include gene amplifications and deletions for the CCLE cell lines. This data is represented in the binary format and provided by the Broad Institute Cancer Cell Line Encyclopedia (CCLE) / DepMap Portal. [https://data.broadinstitute.org/ccle_legacy_data/binary_calls_for_copy_number_and_mutation_data/CCLE_MUT_CNA_AMP_DEL_binary_Revealer.gct]
10) GDSC_cell_lines_EXP_CCLE_names.csv.gz: Publicly available RMA normalized expression data for Genomics of Drug Sensitivity in Cancer (GDSC) cell lines. File gdsc_cell_line_RMA_proc_basalExp.csv was downloaded. This data was subset to the 389 cell lines that are common to CCLE and GDSC. All the GDSC cell line names were replaced with CCLE cell line names for further processing. [https://www.cancerrxgene.org/gdsc1000/GDSC1000_WebResources//Data/preprocessed/Cell_line_RMA_proc_basalExp.txt.zip]
11) GDSC_CCLE_common_mut_cnv_binary.csv.gz: A subset of merged Mutational and copy number alterations that include gene amplifications and deletions for common cell lines between GDSC and CCLE. This file is generated using CCLE_MUT_CNA_AMP_DEL_binary_Revealer.gct and a list of common cell lines.
12) gdsc1_ccle_pharm_fitted_dose_data.txt.gz: Pharmacological data for GDSC1 cell lines. [ftp://ftp.sanger.ac.uk/pub/project/cancerrxgene/releases/current_release/GDSC1_fitted_dose_response_15Oct19.xlsx]
13) gdsc2_ccle_pharm_fitted_dose_data.txt.gz: Pharmacological data for GDSC2 cell lines. [ftp://ftp.sanger.ac.uk/pub/project/cancerrxgene/releases/current_release/GDSC2_fitted_dose_response_15Oct19.xlsx]
14) compounds.csv: list of pharmacological compounds tested for our analysis
15) tcga_dictonary.tsv: list of cancer types used in the analysis.
16) seg_based_scores.tsv: Measurement of total copy number burden, Percent of genome altered by copy number alterations. This file was used as part of the Pancancer analysis by Gregory Way et al as described in https://github.com/greenelab/pancancer/ data processing and initialization steps. [https://github.com/greenelab/pancancer/]
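Items 10 and 11 describe restricting a matrix to the cell lines shared by CCLE and GDSC. A minimal sketch of that subsetting step is shown below; the function names, feature labels and cell line lists are hypothetical, and the authors' actual processing scripts may differ.

```python
def common_cell_lines(ccle_names, gdsc_names):
    """Return the sorted cell line names present in both collections."""
    return sorted(set(ccle_names) & set(gdsc_names))

def subset_columns(header, rows, keep):
    """Keep the first (feature name) column plus every column whose
    header entry is in `keep`, preserving the original column order."""
    keep = set(keep)
    indices = [0] + [i for i, name in enumerate(header) if i > 0 and name in keep]
    new_header = [header[i] for i in indices]
    new_rows = [[row[i] for i in indices] for row in rows]
    return new_header, new_rows

# Hypothetical binary alteration matrix: one row per feature, one column per cell line.
header = ["feature", "A549_LUNG", "HT29_LARGE_INTESTINE", "MCF7_BREAST"]
rows = [
    ["KRAS_MUT", 1, 0, 0],
    ["MYC_AMP", 0, 1, 1],
]
# Hypothetical GDSC cell lines, already renamed to CCLE-style names.
gdsc_names = ["MCF7_BREAST", "A549_LUNG", "SW48_LARGE_INTESTINE"]

common = common_cell_lines(header[1:], gdsc_names)
sub_header, sub_rows = subset_columns(header, rows, common)
```

Renaming GDSC cell lines to CCLE names has to happen before the intersection is taken; that mapping is not shown here because the source does not describe it.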
Epithelial ovarian cancer is a very heterogeneous disease and remains the most lethal gynaecological malignancy in the Western world. Rational therapeutic approaches need to account for interpatient and intratumoral heterogeneity in treatment design. Detailed characterization of in vitro models representing the different histological and molecular subtypes is therefore imperative. Strikingly, the origin and subtype of the ~100 available ovarian cancer cell lines are largely unknown. We have extensively and uniformly characterized 39 ovarian cancer cell lines (with mRNA/microRNA expression, exon sequencing, dose response curves for clinically relevant therapeutics) and obtained all available information on the clinical features and tissue of origin of the original ovarian cancer to refine the putative histological subtypes. From 39 ovarian cell lines, 14 were assigned as high-grade serous, four serous-type, one low-grade serous and 20 non-serous type. Three morphological subtypes (21 Epithelial, 7 Round, 12 Spindle) were identified that showed distinct biological and molecular characteristics, including overexpression of cell movement and migration-associated genes for the Spindle subtype. Clinical validation showed a clear association of the spindle-like tumors with metastasis, advanced stage, suboptimal debulking and poor prognosis. In addition, the morphological subtypes were associated with the molecular C1-6 subtypes identified by Tothill et al. [1]: Spindle clustered with the C1-stromal subtype, Round with C5-mesenchymal and Epithelial with the C4 subtype. We provide a uniformly generated data resource for 39 ovarian cancer cell lines, the ovarian cancer cell line panel (OCCP). This should be the basis for selecting models to develop subtype-specific treatment approaches, which are very much needed to prolong the survival of ovarian cancer patients.
Gene expression was measured for 32 ovarian cancer cell lines using the GeneChip Human Exon 1.0 ST Array (Affymetrix). Morphological subtypes were assigned based on cell morphology, size, growth pattern and proliferation rate during culturing of the cell lines.
Long non-coding RNAs (lncRNAs) form a new class of RNA molecules implicated in various aspects of protein coding gene expression regulation. To study lncRNAs in cancer, we generated expression profiles for 1708 human lncRNAs in the NCI60 cancer cell line panel using a high-throughput nanowell RT-qPCR platform. We describe how qPCR assays were designed and validated and provide processed and normalized expression data for further analysis. Data quality is demonstrated by matching the lncRNA expression profiles with phenotypic and genomic characteristics of the cancer cell lines. This data set can be integrated with publicly available omics and pharmacological data sets to uncover novel associations between lncRNA expression and mRNA expression, miRNA expression, DNA copy number, protein coding gene mutation status or drug response. Overall design: lncRNA expression profiling of 60 cancer cell lines
In order to identify the gene targets of frequently altered chromosomal regions in retinoblastoma, a meta-analysis of genome-wide copy number alteration studies on primary retinoblastoma tissue and retinoblastoma cell lines was performed. Published studies were complemented by copy number and gene expression analysis on primary and cell line samples of retinoblastoma. This dataset includes the gene expression data of the retinoblastoma cell lines. This data set contains the gene expression (Affymetrix Human Genome U133 Plus 2.0 PM) results for 7 unique retinoblastoma cell lines. For one of the 7 unique cell lines, 3 RNA isolations were performed and were profiled on separate arrays, adding up to 9 unique array files. Copy number data for primary retinoblastoma (tumor and blood DNA) and retinoblastoma cell lines are available (controlled-access) at the European Genomics Archive. Gene expression data of primary retinoblastoma is available under GSE59983. The GSE59983 records represent the primary tissue gene expression data and the CN data will be deposited into a controlled-access database, probably EGA.
We used microarrays to assess gene expression in proliferating ovarian cancer cell lines. Ovarian cancer cell lines were plated and harvested when 50-80% confluent. RNA was extracted and hybridized on Affymetrix microarrays. At least 3 biological repeats were performed for each cell line.
Full legacy gene expression dataset run internally. Gene expression data from various panels and experiments was collected, QC'd and normalised in tandem to provide a full summary dataset of the cell line expression data generated internally.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Table of cell line and primary neutrophil gene expression data. This comma separated value file contains the final averaged log10 normalized gene expression values for undifferentiated and differentiated cell lines, as well as primary human and mouse neutrophils. This file contains all of the data included in our online searchable neutrophil gene expression database. (CSV 2970 kb)
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains the results of Avana library CRISPR-Cas9 genome-scale knockout (prefixed with Achilles) as well as mutation, copy number and gene expression data (prefixed with CCLE) for cancer cell lines as part of the Broad Institute’s Cancer Dependency Map project. We have repackaged our fileset to include all quarterly-updating datasets produced by DepMap. The Avana CRISPR-Cas9 genome-scale knockout data has expanded to include 625 cell lines, the RNAseq data includes 1,210 cell lines, and the copy number data includes 1,657 cell lines. Please see the README files for details regarding updates to the data processing pipeline procedures. As our screening efforts continue, we will be releasing additional cancer dependency data on a quarterly basis for unrestricted use. For the latest datasets available, further analyses, and to subscribe to our mailing list, visit https://depmap.org. Descriptions of the experimental methods and the CERES algorithm are published in http://dx.doi.org/10.1038/ng.3984. Additional Achilles processing information is published here: https://www.biorxiv.org/content/10.1101/720243v1.full. Some cell lines were processed using copy number data based on the Sanger Institute whole exome sequencing data (COSMIC: http://cancer.sanger.ac.uk.cell_lines, EGA accession number: EGAD00001001039) reprocessed using CCLE pipelines. A detailed description of the pipelines and tool versions for CCLE expression can be found here: https://github.com/broadinstitute/gtex-pipeline/blob/v9/TOPMed_RNAseq_pipeline.md.
version 2: uploaded a new version of CCLE_gene_cn.csv to correctly reflect released cell lines.
version 3: uploaded a new version of CCLE_gene_cn.csv that has the log2 transform correctly applied to it and to remove duplicate cell lines.
Breast cancer cell line MDA-MB-231 was treated with DMSO or UF010, a novel HDAC inhibitor, for 24 hours. The impact of UF010 treatment on global gene expression was determined. We used Affymetrix Human Transcriptome Array 2.0 (HTA-2_0) to analyze genes that were up- or downregulated upon UF010 exposure. MDA-MB-231 cells were treated with DMSO (control) or 1.0 µM of UF010 for 24 hours. Total RNAs were isolated for hybridization on Affymetrix microarrays.