https://www.proteinatlas.org/about/licence
Subcellular methods
The subcellular resource of the Human Protein Atlas provides high-resolution insights into the expression and spatiotemporal distribution of proteins encoded by 13603 genes (67% of the human protein-coding genes), as well as predictions for an additional 3459 secreted- or membrane proteins, covering a total of 17062 genes (85% of the human protein-coding genes). For each gene, the subcellular distribution of the protein has been investigated by immunofluorescence (ICC-IF) and confocal microscopy in up to three different standard cell lines, selected from a panel of 42 cell lines used in the subcellular resource. For some genes, the protein has also been stained in up to three ciliated cell lines, induced pluripotent stem cells (iPSCs) and/or in human sperm cells. Upon image analysis, the subcellular localization of the protein has been classified into one or more of 49 different organelles and subcellular structures. In addition, the resource includes an annotation of genes that display single-cell variation in protein expression levels and/or subcellular distribution, as well as an extended analysis of cell cycle dependency of such variations.
The subcellular resource offers a database for detailed exploration of individual genes and proteins of interest, as well as for systematic analysis of proteomes in a broader context. More information about the content of the resource, as well as the generation and analysis of the data, can be found in the Methods summary. Learn about:
The subcellular distribution of proteins in standard human cell lines, including ciliated cells and iPSCs.
The subcellular distribution of proteins in human sperm.
The proteomes of different organelles and subcellular structures.
Single-cell variability in the expression levels and/or localizations of proteins.
LOCATE is a curated database that houses data describing the membrane organization and subcellular localization of proteins from the RIKEN FANTOM4 mouse and human protein sequence set. The membrane organization is predicted by the high-throughput, computational pipeline MemO. The subcellular locations were determined by a high-throughput, immunofluorescence-based assay and by manually reviewing peer-reviewed publications.
eSLDB is a database of protein subcellular localization annotation for eukaryotic organisms. It contains experimental annotations derived from primary protein databases, homology-based annotations and computational predictions.
Web resource that integrates evidence on protein subcellular localization from manually curated literature, high-throughput screens, automatic text mining, and sequence-based prediction methods. All evidence is mapped to common protein identifiers and Gene Ontology terms, and is further unified by assigning confidence scores that facilitate comparison of the different types and sources of evidence. These scores are visualized on a schematic cell.
A database of protein subcellular localization containing proteins from the primary protein databases SWISS-PROT and PIR. The collected subcellular localization annotations are classified and categorized by cross-references to taxonomies and the Gene Ontology database. Annotations were taken from primary protein databases, model organism genome projects and literature texts, and were then analyzed to extract the subcellular localization features of the proteins. The proteins are also classified into different categories. Based on sequence alignment, nonredundant subsets of the database have been built, which may provide useful information for subcellular localization prediction. The database now contains >60 000 protein sequences, including 30 000 protein sequences in the nonredundant data sets. Online download, a SOAP server, BLAST tools and prediction services are also available.
Subcellular localization of proteins from low-throughput or high-throughput protein localization assays.
THIS RESOURCE IS NO LONGER IN SERVICE, documented on July 16, 2013. LOC3d is a database of predicted subcellular localization for eukaryotic proteins of known 3-D structure taken from the Protein Data Bank. Subcellular localization is currently predicted using four different methods: predictNLS (nuclear localization signal), LOChom (using homology), LOCkey (using keywords) and LOC3d (neural network based prediction). The reported localization is based on the method that predicts the localization of a given protein with the highest confidence. LOCtree is a novel system of support vector machines (SVMs) that predicts the subcellular localization of proteins, and the DNA-binding propensity of nuclear proteins, by incorporating a hierarchical ontology of localization classes modeled onto biological processing pathways. Biological similarities are incorporated from the description of cellular components provided by the Gene Ontology Consortium (GO). GO definitions have been simplified and tailored to the problem of protein sorting. Technically, the ontology has been implemented using a decision tree with SVMs as the nodes. LOCtree was extremely successful at learning evolutionary similarities among subcellular localization classes and was significantly more accurate than other traditional networks at predicting subcellular localization. Whenever available, LOCtree also reports predictions based on the following: 1) nuclear localization signals found by PredictNLS, 2) localization inferred using Prosite motifs and Pfam domains found in the protein, and 3) SWISS-PROT keywords associated with a protein. Localization is inferred in the last two cases using the entropy-based LOCkey algorithm. Additional information can be found in the LOCtree manuscript and the associated PredictNLS and LOCkey publications.
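The "decision tree with SVMs as the nodes" design described above can be illustrated with a toy sketch. Everything specific here is a hypothetical stand-in rather than LOCtree's actual hierarchy or code: the two-level tree, the class labels, the feature names `signal_peptide_score` and `nls_score`, and the simple threshold functions used in place of trained SVMs.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass
class Node:
    """Internal tree node: a binary classifier that routes a sample."""
    decide: Callable[[dict], bool]   # stand-in for a trained SVM decision
    if_true: Union["Node", str]      # subtree, or a final class label
    if_false: Union["Node", str]


def predict(node, features):
    """Walk from the root to a leaf, applying one classifier per level."""
    while isinstance(node, Node):
        node = node.if_true if node.decide(features) else node.if_false
    return node


# Hypothetical two-level hierarchy; thresholds replace real SVMs.
tree = Node(
    decide=lambda f: f["signal_peptide_score"] > 0.5,
    if_true="secretory pathway",
    if_false=Node(
        decide=lambda f: f["nls_score"] > 0.5,
        if_true="nucleus",
        if_false="cytoplasm",
    ),
)

print(predict(tree, {"signal_peptide_score": 0.1, "nls_score": 0.9}))
```

Each sample only passes through the classifiers on its own path, so every node can be trained on the narrower question of separating its two child groups.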
SUBA provides a powerful tool to investigate subcellular localization in Arabidopsis. SUBA houses large-scale proteomic and GFP localization sets from cellular compartments of Arabidopsis, and also contains pre-compiled bioinformatic predictions for protein subcellular localizations. The database functions through the unification of disparate datasets and through the provision of a web-accessible interface for the construction of user-based queries, resulting in a one-stop shop for protein localization in this model plant. Subcellular localization information can contribute towards our understanding of protein function, protein redundancy and biological inter-relationships. In an attempt to get a clearer picture of our experimental data and to more generally understand subcellular partitioning, we have brought together various data sources to build SUBA.
THIS RESOURCE IS NO LONGER IN SERVICE. Documented August 23, 2017. Annotated database of fluorescence microscope images depicting the subcellular location of proteins, with two interfaces: a text and image content search interface, and a graphical interface for exploring location patterns grouped into Subcellular Location Trees. The annotations in PSLID provide a description of sample preparation and fluorescence microscope imaging.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset provides the immunofluorescence images acquired during this study for the experimental validation of PUPS. Cells from 5 cell lines were stained with antibodies targeting α-tubulin, the ER, and one of the 9 proteins that PUPS predicted, then imaged as Z-stacks on a confocal microscope at 60x.
THIS RESOURCE IS NO LONGER IN SERVICE. Documented on August 22, 2022. Database of protein subcellular localization annotation for eukaryotic organisms. It contains experimental annotations derived from primary protein databases, homology-based annotations and computational predictions.
Attribution-NonCommercial-ShareAlike 3.0 (CC BY-NC-SA 3.0) https://creativecommons.org/licenses/by-nc-sa/3.0/
License information was derived automatically
This repository contains the data used to obtain results from a 5-fold cross-validation test of how accurately MSclassifier and other packages predict protein subcellular localization, as reported in the software article entitled "MSclassifier: median-supplement model-based classification tool for automated knowledge discovery." The data used in the software article are derived from data generated in "G. K. Acquaah-Mensah, S. M. Leach, and C. Guda, Predicting the subcellular localization of human proteins using machine learning and exploratory data analysis, Genomics Proteomics Bioinformatics, 4(2):120-133, 2006, https://doi.org/10.1016/S1672-0229(06)60023-5".
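For readers unfamiliar with the evaluation scheme mentioned above, the following is a minimal sketch of how a 5-fold cross-validation split can be produced. It is not taken from MSclassifier; the function name `k_fold_indices`, the fixed seed, and the sample count of 23 are assumptions chosen for illustration.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Each sample appears in exactly one test fold and in the other four training sets.
for train, test in k_fold_indices(23, k=5):
    print(len(train), len(test))
```

A classifier would be fitted on each `train` list and scored on the matching `test` list, with the five scores averaged to give the reported accuracy.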
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
The proteomics dataset was derived from SWISS-PROT database release 42 (2003–2004) by extracting all animal, fungal and plant protein sequences.
The dataset contains 5959 proteins, each annotated to one of 11 subcellular locations: chloroplast, cytoplasm, endoplasmic reticulum, extracellular space, Golgi apparatus, lysosome, mitochondrion, nucleus, peroxisome, plasma membrane and vacuole. The vacuole class covers proteins from plant and fungal cells; animal cells otherwise share localizations with them but have lysosomes instead of vacuoles. The only variable we intend to consider is protein sequence.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Images of 5 cell lines, each stained for 9 proteins, generated in the validation experiments of the study "Prediction of protein subcellular localization in single cells".
SUBA (http://suba.live/) is the central resource for Arabidopsis protein subcellular location data. Proteins have specific functions and locations within the plant cell. They generate, or are themselves, products important for plant growth and response. Protein subcellular location and the proximity relationships of proteins are important clues to function within the metabolic household. Subcellular location can be determined by fluorescent protein tagging or mass spectrometry detection in subcellular purifications, and by prediction using protein sequence features. SUBA provides a subcellular data query platform, protein sequence BLAST alignment, a high-confidence subcellular location reference standard and analytic tools.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Additional file 1: Table S1. Re-annotated protein names and Identifiers for the proteins included in SToPSdb.
Attribution-ShareAlike 3.0 (CC BY-SA 3.0) https://creativecommons.org/licenses/by-sa/3.0/
License information was derived automatically
Localization by immunofluorescence and confocal microscopy of 72 antibodies targeting 72 genes, with validation through siRNA knockdown to verify protein localization and antibody binding. 59 of the antibodies are those described in Stadler et al. 2012.
Version History: August 2017 - additional phenotype to CMPO ontology mappings added
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Access to Data
The Pathway Localization database (PathLocdb) was developed to serve as a central repository of subcellular localizations of metabolic pathways as well as their participant enzymes. Our database allows you to:
1. search and browse metabolic pathways by their subcellular localizations and organisms
2. systematically compare the localization profiles of metabolic pathways between different organisms
3. discover the potential regulatory mechanisms and suspicious localization of metabolic pathways
4. clarify the pathway boundary from the view of subcellular localization
5. discover the mechanism of intermediate communication between different subcellular localizations
Number of superpathways in database: 337 (SwissProt dataset), 215 (KEGG dataset) and 337 (UniProt dataset)
Number of pathways with localization annotation: 43014
Number of proteins in database: 80676
Number of pathways with multiple localization annotation: 4477
Number of superpathways with multiple localization annotation: 682
Apoptosis is a fundamental process controlling normal tissue homeostasis by regulating the balance between cell proliferation and death. Predicting the subcellular location of apoptosis proteins is very helpful for understanding the mechanism of programmed cell death. Predicting protein subcellular localization with bioinformatic techniques provides quite a few opportunities in related fields. In this work, we propose the use of a hierarchical extreme learning machine (H-ELM) to classify high-dimensional input data without requiring a dimension reduction step, which yields acceptable results. Features are extracted from different perspectives and then fused. Regarding the position-specific scoring matrix, the first feature type depicts the correlation within the sequence, using the autocorrelation function for relatively random sections of the sequence; the second type is the Kullback-Leibler (K-L) divergence of the two distributions formed by the amino acids' constituent proportions. An experiment shows that features from different sources mixed by simple concatenation yield a poor result, whereas the synthetic feature fused with t-distributed stochastic neighbor embedding (t-SNE) greatly improves the classification. Finally, after adjusting the hyper-parameters of H-ELM, the highest overall accuracy is 87.5% on ZD98 and 92.4% on CL317.
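The K-L divergence feature mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not say which two distributions are compared, so the sketch simply compares the amino acid proportions of two made-up sequences, and the pseudocount and function names are also assumptions.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(seq, pseudocount=1.0):
    """Return the amino acid proportions of seq as a dict.

    The pseudocount (an assumption of this sketch) keeps every
    proportion above zero so the logarithm below is always defined.
    """
    counts = Counter(ch for ch in seq.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
    return {aa: (counts[aa] + pseudocount) / total for aa in AMINO_ACIDS}


def kl_divergence(p, q):
    """D_KL(P || Q) = sum over residues of p * log(p / q), in nats."""
    return sum(p[aa] * math.log(p[aa] / q[aa]) for aa in AMINO_ACIDS)


# Two made-up sequences, used only to exercise the functions.
p = composition("MKKLLPTAAAGLLLLAAQPAMA")
q = composition("DYKDDDDKGSGSEEEEDDDD")
print(kl_divergence(p, q))
```

The divergence is zero when the two compositions are identical and grows as they differ. It is not symmetric, so `kl_divergence(p, q)` and `kl_divergence(q, p)` generally give different values.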
Database that integrates large-scale functional genomics assays and manual cDNA annotation with bioinformatics gene expression and protein analysis. LifeDB integrates data on full-length cDNA clones with data on the expression of the encoded proteins and their subcellular localization in mammalian cell lines. LifeDB enables the scientific community to systematically search and select genes, proteins and cDNAs of interest by specific database identifiers as well as by gene name. It enables visualization of cDNA clones and of the subcellular location of proteins. It also links the results to external biological databases in order to provide broader functional information. LifeDB also provides an annotation pipeline that facilitates an improved mapping of clones to known human reference transcripts from the RefSeq database and the Ensembl database. An advanced web interface enables researchers to view the data in a more user-friendly manner. Users can search using any one of the following search options, available in both Search genes and cDNA clones and Search sub-cellular locations of human proteins: By Keyword, By gene/transcript identifier, By plate name, By clone name, By cellular location.
* The Search genes and cDNA clones results include: Gene Name, Ensembl ID, Genomic Region, Clone name, Plate name, Plate position, Classification class, Synonymous SNPs, Non-synonymous SNPs, Number of ambiguous positions, and Alignment with reference genes.
* The Search sub-cellular locations of human proteins results include: Subcellular location, Gene Name, Ensembl ID, Clone name, True localization, Images, Start tag and End tag.
Every result page has an option to download the result data (excluding the microscopy images). On clicking the 'Download results as CSV-file' link on the result page, the user is given a choice to open or save the result data as a CSV (Comma Separated Values) file. The CSV file can then be opened using Excel or OpenOffice.
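Besides Excel or OpenOffice, a downloaded results file could also be processed with a short script. The sketch below assumes that the CSV header uses the field names listed above ("Subcellular location", "Gene Name", "Clone name"); the actual header of the export is not confirmed here, and the rows are made-up placeholders standing in for a real download.

```python
import csv
import io

# Made-up rows standing in for a downloaded results file.
sample_csv = io.StringIO(
    "Subcellular location,Gene Name,Clone name\n"
    "nucleus,GENE_A,clone_001\n"
    "cytoplasm,GENE_B,clone_002\n"
    "nucleus,GENE_C,clone_003\n"
)


def genes_at(csv_file, location):
    """Return the gene names of all rows annotated with the given location."""
    reader = csv.DictReader(csv_file)
    return [
        row["Gene Name"]
        for row in reader
        if row["Subcellular location"] == location
    ]


print(genes_at(sample_csv, "nucleus"))
```

For a real download, the `io.StringIO` object would be replaced by a file opened with `open(path, newline="")`, as the `csv` module documentation recommends.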