https://www.proteinatlas.org/about/licence
Subcellular methods
The subcellular resource of the Human Protein Atlas provides high-resolution insights into the expression and spatiotemporal distribution of proteins encoded by 13603 genes (67% of the human protein-coding genes), as well as predictions for an additional 3459 secreted or membrane proteins, covering a total of 17062 genes (85% of the human protein-coding genes). For each gene, the subcellular distribution of the protein has been investigated by immunofluorescence (ICC-IF) and confocal microscopy in up to three different standard cell lines, selected from a panel of 42 cell lines used in the subcellular resource. For some genes, the protein has also been stained in up to three ciliated cell lines, induced pluripotent stem cells (iPSCs) and/or in human sperm cells. Upon image analysis, the subcellular localization of the protein has been classified into one or more of 49 different organelles and subcellular structures. In addition, the resource includes an annotation of genes that display single-cell variation in protein expression levels and/or subcellular distribution, as well as an extended analysis of cell cycle dependency of such variations.
The subcellular resource offers a database for detailed exploration of individual genes and proteins of interest, as well as for systematic analysis of proteomes in a broader context. More information about the content of the resource, as well as the generation and analysis of the data, can be found in the Methods summary. Learn about:
The subcellular distribution of proteins in standard human cell lines, including ciliated cells and iPSCs. The subcellular distribution of proteins in human sperm. The proteomes of different organelles and subcellular structures. Single-cell variability in the expression levels and/or localizations of proteins.
https://www.proteinatlas.org/about/licence
The Cell Atlas provides high-resolution insights into the expression and spatio-temporal distribution of RNA and proteins in human cell lines. Genome-wide mRNA expression is determined by deep RNA-sequencing in a panel of 69 cell lines, representing various cell populations in different organs and tissues of the human body. The subcellular distributions of proteins encoded by 12813 genes (65% of the human protein-coding genes) are investigated in a subset of cell lines selected based on corresponding RNA expression. Protein localization data is derived from antibody-based profiling, using immunofluorescence (ICC-IF) and confocal microscopy, and classified into 35 different organelles and fine subcellular structures. The Cell Atlas offers a database for detailed exploration of individual genes and proteins of interest, as well as for systematic analysis of transcriptomes and proteomes in broader contexts, in order to increase the understanding of human biology at the cellular and subcellular levels.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Human cancer cell lines grown in vitro are frequently used to decipher basic cell biological phenomena and to also specifically study different forms of cancer. Here we present the first large-scale study of protein expression patterns in cell lines using an antibody-based proteomics approach. We analyzed the expression pattern of 5436 proteins in 45 different cell lines using hierarchical clustering, principal component analysis, and two-group comparisons for the identification of differentially expressed proteins. Our results show that immunohistochemically determined protein profiles can categorize cell lines into groups that overall reflect the tumor tissue of origin and that hematological cell lines appear to retain their protein profiles to a higher degree than cell lines established from solid tumors. The two-group comparisons reveal well-characterized proteins as well as previously unstudied proteins that could be of potential interest for further investigations. Moreover, multiple myeloma cells and cells of myeloid origin were found to share a protein profile, relative to the protein profile of lymphoid leukemia and lymphoma cells, possibly reflecting their common dependency on the bone marrow microenvironment. This work also provides an extensive list of antibodies, for which high-resolution images as well as validation data are available on the Human Protein Atlas (www.proteinatlas.org), that are of potential use in cell line studies.
A collaborative project between the Broad Institute and the Novartis Institutes for Biomedical Research and its Genomics Institute of the Novartis Research Foundation, with the goal of conducting a detailed genetic and pharmacologic characterization of a large panel of human cancer models. The CCLE also works to develop integrated computational analyses that link distinct pharmacologic vulnerabilities to genomic patterns and to translate cell line integrative genomics into cancer patient stratification. The CCLE provides public access to genomic data, analysis and visualization for about 1000 cell lines.
mRNA microarray expression profiles for cancer cell lines
The Cell Line Data Base (CLDB) is a reference information source for human and animal cell lines. It provides the characteristics of the cell lines and their availability through distributors, allowing cell line requests to be made from collections and laboratories.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The SUM human breast cancer cell lines have been used by many labs around the world to develop extensive data sets derived from comparative genomic hybridization analysis, gene expression profiling, whole exome sequencing, and reverse phase protein array analysis. In a previous study, the authors of this paper performed genome-scale shRNA essentiality screens on the entire SUM line panel, as well as on MCF10A cells, MCF-7 cells, and MCF-7LTED cells. In this study, the authors have developed the SUM Breast Cancer Cell Line Knowledge Base, to make all of these omics data sets available to users of the SUM lines, and to allow users to mine the data and analyse them with respect to biological pathways enriched by the data in each cell line.
Data access: All the datasets supporting the findings of this study are publicly available in the SLKBase platform here: https://sumlineknowledgebase.com/. RPPA data, drug sensitivity data, apelisib response data, and data on dose response, are also part of this figshare data record (https://doi.org/10.6084/m9.figshare.12497630).
Study aims and methodology: This web-based knowledge base provides users with data and information on the derivation of each of the cell lines, provides narrative summaries of the genomics and cell biology of each breast cancer cell line, and provides protocols for the proper maintenance of the cells. The database includes a series of data mining tools that allow rapid identification of the functional oncogene signatures for each line, the enrichment of any KEGG pathway with screen hit and gene expression data for each of the lines, and a rapid analysis of protein and phospho-protein expression for the cell lines. A gene search tool that returns all of the functional genome and functional druggable data for any gene for the entire cell line panel, is included. Additionally, the authors have expanded the database to include functional genomic data for an additional 29 commonly used breast cancer cell lines.
The three overarching goals in the original development of the SLKBase are: 1) to provide a rich source of information for anyone working with any of the SUM breast cancer cell lines, 2) to give researchers ready access to the large genomic data sets that have been developed with these cells, and 3) to allow researchers to perform orthogonal analyses of the various genomics data sets that we and others have obtained from the SUM lines. For more information on the development and contents of the database, please read the related article.
Datasets supporting the paper: The data mining tools accessed the following datasets to generate the figures and tables, and these datasets are downloadable from the Data Download centre on the SLKBase:
Exome sequencing data: SLKBase.exome_.seq_.sum_.xlsx
Gene amplification and expression data for the SUM cell lines: SUM44amplificationdata.xls, SUM52.xls, SUM149.xls, SUM159.xls, SUM185.xls, SUM190.xls, SUM225.xls, SUM229.xls, SUM1315.xls
Cellecta shRNA screen data for the SUM cell lines: SUM44Celectadata.csv, SUM52Cellectadata.csv, SUM102Cellectadata.csv, SUM149Cellectadata.csv, SUM159Cellectadata.csv, SUM185Cellectadata.csv, SUM190Cellectadata.csv, SUM225Cellectadata.csv, SUM229Cellectadata.csv, SUM1315hits.hit.csv, MCF10A.hits_.csv
Breast cancer cell line data included in this data record (these datasets were used to generate figures 1, 2 and 7 in the article):
Proteomics data from the Reverse Phase Protein Array (RPPA) assay analysis: Ethier.SUMline.RPPA.xlsx
Drug sensitivity data: NAVITOCLAX.drugsensitivity.Zscores.xlsx
Apelisib response data: Apelisib all lines (2).xlsx
Dose response data: 092614 Dose Response CP 52s.11.15.xlsx
All the files are either in .xlsx or .csv file format.
Database of all cell lines used in biomedical research which include immortalized cell lines, naturally immortal cell lines (stem cells), widely used and distributed finite life cell lines, vertebrate cell lines (majority being human, mouse, and rat), and invertebrate (insects and ticks) cell lines, as well as cell line synonyms. Each cell line is provided with the following information: the recommended name (the name which appears in the original publication), a list of synonyms, a unique accession number, comments on a number of topics including misspellings and gene transfection, information on the tissue/organ origin with the UBERON code, the NCI Thesaurus or Orphanet ORDO code for the disease(s) the individual suffered from (for cancer and human genetic disease lines only), the species of origin, the parent cell line, cross-references of sister cell lines, the sex of the individual, the category in which the cell line belongs (Adult stem cell; Cancer cell line; Embryonic stem cell; Factor-dependent cell line; Finite cell line; Hybrid cell line; Hybridoma; Induced pluripotent stem cell; Spontaneously immortalized cell line; Stromal cell line; Telomerase immortalized cell line; Transformed cell line; Undefined cell line type), web links, publication references, and/or cross-references to cell line catalogs/collections, ontologies, cell lines databases/resources, and to databases that list cell lines as samples.
The Human Protein Atlas portal is a publicly available database with millions of high-resolution images showing the spatial distribution of proteins in 46 different normal human tissues and 20 different cancer types, as well as 47 different human cell lines. The data is released together with application-specific validation performed for each antibody, including immunohistochemistry, Western blot analysis and, for a large fraction, a protein array assay and immunofluorescence-based confocal microscopy. The database has been developed in a gene-centric manner with the inclusion of all human genes predicted from genome efforts. Search functionalities allow for complex queries regarding protein expression profiles, protein classes and chromosome location.
The proteome provides unique insights into biology and disease beyond the genome and transcriptome. Lack of large proteomic datasets has restricted identification of new cancer biomarkers. Here, proteomes of 949 cancer cell lines across 28 tissue types were analyzed by mass spectrometry. Deploying a clinically-relevant workflow to quantify 8,498 proteins, these data capture evidence of cell type and post-transcriptional modifications. Integrating multi-omics, drug response and CRISPR-Cas9 gene essentiality screens with a deep learning-based pipeline revealed thousands of protein-specific biomarkers of cancer vulnerabilities. Proteomic data had power to predict drug response highly similar to that of the equivalent portion of the transcriptome. Further, random downsampling to only 1,500 proteins had limited impact on predictive power, consistent with protein networks being highly connected and co-regulated. This pan-cancer proteomic map (ProCan-DepMapSanger), available at https://cellmodelpassports.sanger.ac.uk, is a comprehensive resource revealing principles of protein regulation with important implications for future studies.
The study of the function of many human proteins is often hampered by technical limitations, such as cytotoxicity and phenotypes that result from overexpression of the protein of interest together with the endogenous version. Here we present the snoMEN (snoRNA Modulator of gene ExpressioN) vector technology for generating stable cell lines where expression of the endogenous protein can be reduced and replaced by an exogenous protein, such as a fluorescent protein (FP)-tagged version. SnoMEN are snoRNAs engineered to contain complementary sequences that can promote knock-down of targeted RNAs. We have established and characterised two such partial protein replacement human cell lines (snoMEN-PR). Quantitative mass spectrometry was used to analyse the specificity of knock-down and replacement at the protein level and also showed an increased pull-down efficiency of protein complexes containing exogenous, tagged proteins in the protein replacement cell lines, as compared with conventional co-expression strategies. The snoMEN approach facilitates the study of mammalian proteins, particularly those that have so far been difficult to investigate by exogenous expression and has wide applications in basic and applied gene-expression research.
A vast assortment of human cell lines is available for cell culture model-based studies, and as such the potential exists for discrepancies in findings due to cell line selection. To investigate this concept, we determined the relative protein abundance profiles of a panel of eight diverse, but commonly studied, human cell lines. This panel includes: HAP1, HEK293T, HeLa, HepG2, Jurkat, Panc1, SH-SY5Y, and SVGp12. We use a mass spectrometry-based proteomics workflow designed to enhance quantitative accuracy while maintaining analytical depth. To this end, our strategy leverages TMTpro16-based sample multiplexing, high-Field Asymmetric Ion Mobility Spectrometry (FAIMS), and real-time database searching (RTS). The data show that cell line diversity was reflective of differences in the relative protein abundance profiles. We also determined that several hundred proteins were highly enriched for a given cell line and performed gene ontology and pathway analysis on these cell line-enriched proteins. We provide an R Shiny application to query protein abundance profiles and retrieve proteins with similar patterns. The workflows used herein can be applied to additional cell lines to aid cell line selection in addressing a given scientific inquiry or in improving an experimental design.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
In recent years, biological research involving human cell lines has been rapidly developing in China. However, some of the cell lines are not authenticated before use. Therefore, misidentified and/or cross-contaminated cell lines are unfortunately commonplace. In this study, we present a comprehensive investigation of cross-contamination and misidentification for a panel of 278 cell lines from 28 institutes in China using short tandem repeat (STR) profiling. By comparing the DNA profiles with the cell bank databases of ATCC and DSMZ, cross-contamination or misidentification was uncovered in 46.0% (128/278) of cases, originating from 22 institutes. Notably, 73.2% (52 out of 71) of the cell lines established by the Chinese researchers were misidentified and accounted for 40.6% of total misidentification (52/128). Further, 67.3% (35/52) of the misidentified cell lines established in laboratories of China were HeLa cells or a possible hybrid of HeLa with another kind of cell line. Furthermore, the bile duct cancer cell line HCCC-9810 and degenerative lung cancer Calu-6 exhibited an 88.9% match in the ATCC database (9 loci), indicating that they were from the same origin. However, when we used 21 loci to compare these two cell lines with the same algorithm, the percent match was only 48.2%, indicating that these two cell lines were different. The SNP profiles of HCCC-9810 and Calu-6 also revealed that they were different cell lines. 150 cell lines with unique profiles demonstrated a wide range of in vitro phenotypes. This panel of 150 genomically validated cancer cell lines represents a valuable resource for the cancer research community and will advance our understanding of the disease by providing a standard reference for cell lines that can be used for biological as well as preclinical studies.
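The abstract reports percent-match values between STR profiles but does not name the scoring algorithm. As a minimal sketch, assuming a Tanabe-style score (twice the number of shared alleles divided by the total number of alleles in both profiles, over the loci typed in both), the comparison could look like the following. The profiles, locus selection, and function name are hypothetical and are not taken from the study.

```python
from typing import Dict, Set

# A profile maps each STR locus name to the set of alleles called there.
Profile = Dict[str, Set[str]]


def percent_match(query: Profile, reference: Profile) -> float:
    """Tanabe-style percent match (an assumed algorithm, not the study's).

    Score = 100 * 2 * shared alleles / (query alleles + reference alleles),
    counted only over loci present in both profiles.
    """
    shared = 0
    total = 0
    for locus in query.keys() & reference.keys():
        q_alleles = query[locus]
        r_alleles = reference[locus]
        shared += len(q_alleles & r_alleles)
        total += len(q_alleles) + len(r_alleles)
    if total == 0:
        return 0.0  # no loci in common, nothing to compare
    return 100.0 * 2 * shared / total


# Hypothetical three-locus profiles, for illustration only.
line_a: Profile = {"TH01": {"6", "9"}, "D5S818": {"11", "12"}, "TPOX": {"8"}}
line_b: Profile = {"TH01": {"6", "9"}, "D5S818": {"11", "13"}, "TPOX": {"8"}}

if __name__ == "__main__":
    print(f"{percent_match(line_a, line_b):.1f}% match")
```

Because the score is computed only over the loci that were typed, adding loci gives more opportunities to observe mismatches. That is consistent with the drop from 88.9% at 9 loci to 48.2% at 21 loci described above.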
Gene-level mutation profiles for cancer cell lines
Comprehensive database of Short Tandem Repeat DNA profiles for all ATCC human cell lines. The data are collected by ATCC as part of continuing efforts to characterize and authenticate cell lines in the Cell Biology collection.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The reversible lipid modification protein S-palmitoylation can dynamically modify the localization, diffusion, function, conformation and physical interactions of substrate proteins. Dysregulated S-palmitoylation is associated with a multitude of human diseases including brain and metabolic disorders, viral infection and cancer. However, the diverse expression patterns of the genes that regulate palmitoylation in the broad range of human cell types are currently unexplored, and their expression in commonly used cell lines that are the workhorses of basic and preclinical research is often overlooked when studying palmitoylation-dependent processes. We therefore created CellPalmSeq (https://cellpalmseq.med.ubc.ca), a curated RNAseq database and interactive webtool for visualization of the expression patterns of the genes that regulate palmitoylation across human single cell types, bulk tissue, cancer cell lines and commonly used laboratory non-human cell lines. This resource will allow exploration of these expression patterns, revealing important insights into cellular physiology and disease, and will aid with cell line selection and the interpretation of results when studying important cellular processes that depend on protein S-palmitoylation.
hPSCreg is a global registry of human pluripotent stem cell (hPSC) lines containing manually validated information, including ethical provenance, procurement, derivation process, genetic and expression data, other biological and molecular characteristics, use, and quality of the line. Current status: 1123 hESC lines, 7670 hiPSC lines, 205 clinical studies, and 2402 certificates.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Raw NOME-seq data (Gene Expression Omnibus accession GSE57498) for the human cell lines HMEC, MCF7, PrEC, PC3 were aligned to hg19 using bwa-meth. Methylation and occupancy calls were made at WCG and GCH sites respectively using bwa-meth, BisSNP and bespoke scripts (https://github.com/astatham/NOMe-seq-analysis).
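To make the two site classes concrete, here is a minimal sketch that labels cytosines in a reference sequence by trinucleotide context, using the IUPAC codes W = A or T and H = A, C or T. It is an illustration only: it scans the forward strand, ignores bisulfite conversion, and is not taken from the linked scripts. The function name and example sequence are hypothetical.

```python
from typing import List, Tuple


def classify_cytosines(seq: str) -> List[Tuple[int, str]]:
    """Return (0-based position, context) for cytosines in GCH or WCG context.

    GCH: C preceded by G and followed by A, C or T (used for occupancy calls).
    WCG: C preceded by A or T and followed by G (used for methylation calls).
    Cytosines in other contexts, such as GCG, are skipped as ambiguous.
    """
    seq = seq.upper()
    sites: List[Tuple[int, str]] = []
    # The first and last bases lack a full trinucleotide context.
    for i in range(1, len(seq) - 1):
        if seq[i] != "C":
            continue
        prev_base, next_base = seq[i - 1], seq[i + 1]
        if prev_base == "G" and next_base in "ACT":
            sites.append((i, "GCH"))
        elif prev_base in "AT" and next_base == "G":
            sites.append((i, "WCG"))
    return sites


if __name__ == "__main__":
    # Hypothetical sequence: one GCH site (position 2), one WCG site (position 6).
    for position, context in classify_cytosines("AGCATACGT"):
        print(position, context)
```

Restricting calls to these two contexts keeps the signals separate, because a cytosine in a GCG context could reflect either one.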
Supplementary Data from Identifying Phased Mutations and Complex Rearrangements in Human Prostate Cancer Cell Lines through Linked-Read Whole-Genome Sequencing