62 datasets found

BLASTP vs TrEMBL
figshare.com
txt
Updated Feb 16, 2025
Cite
Franco Liberati (2025). BLASTP vs TrEMBL [Dataset]. http://doi.org/10.6084/m9.figshare.28407983.v2
Explore at:
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.28407983.v2
Dataset updated
Feb 16, 2025
Dataset provided by
Figshare (http://figshare.com/)
Authors
Franco Liberati
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
BLASTP vs TrEMBL
Isoelectric point for all UniProtKB/TrEMBL proteins April 2016
repod.icm.edu.pl
commons.datacite.org
7z, bin
Updated May 18, 2016
Cite
Kozlowski, Lukasz (2016). Isoelectric point for all UniProtKB/TrEMBL proteins April 2016 [Dataset]. http://doi.org/10.18150/repod.9948646
Explore at:
Available download formats: 7z (11492396457), bin (11492396457)
Unique identifier
https://doi.org/10.18150/repod.9948646
Dataset updated
May 18, 2016
Dataset provided by
RepOD
Authors
Kozlowski, Lukasz
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Predicted isoelectric points for all UniProtKB/TrEMBL proteins (April 2016), computed using 18 different algorithms. Over 63 million protein sequences. Compressed using 7-Zip. Primary reference: Kozlowski, LP (2016) Proteome-pI: proteome isoelectric point database. Nucleic Acids Research doi: 10.1093/nar/gkw978. Website: http://isoelectricpointdb.org
BlastX Result : TrEMBL
dbarchive.biosciencedbc.jp
Updated Oct 7, 2016
Cite
(2016). BlastX Result : TrEMBL [Dataset]. http://doi.org/10.18908/lsdba.nbdc00839-003
Explore at:
Unique identifier
https://doi.org/10.18908/lsdba.nbdc00839-003
Dataset updated
Oct 7, 2016
Description
Results of the blastx search of Adiantum capillus-veneris EST (AcEST) sequences against the UniProtKB/TrEMBL (release 39.9) database. The alignment information of each hit in the Blastx hit list is provided on a single line. CSV format text file.
UniProtKB
neuinfo.org
rrid.site
+2more
Updated Oct 13, 2024
+ more versions
Cite
(2024). UniProtKB [Dataset]. http://identifiers.org/RRID:SCR_004426
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_004426
Dataset updated
Oct 13, 2024
Description
Central repository for collection of functional information on proteins, with accurate and consistent annotation. In addition to capturing core data mandatory for each UniProtKB entry (mainly, the amino acid sequence, protein name or description, taxonomic data and citation information), as much annotation information as possible is added. This includes widely accepted biological ontologies, classifications and cross-references, and experimental and computational data. The UniProt Knowledgebase consists of two sections, UniProtKB/Swiss-Prot and UniProtKB/TrEMBL. UniProtKB/Swiss-Prot (reviewed) is a high quality manually annotated and non-redundant protein sequence database which brings together experimental results, computed features, and scientific conclusions. UniProtKB/TrEMBL (unreviewed) contains protein sequences associated with computationally generated annotation and large-scale functional characterization that await full manual annotation. Users may browse by taxonomy, keyword, gene ontology, enzyme class or pathway.
ZAMMITII - BLASTX vs TrEMBL
figshare.com
txt
Updated Dec 6, 2024
Cite
Franco Liberati (2024). ZAMMITII - BLASTX vs TrEMBL [Dataset]. http://doi.org/10.6084/m9.figshare.26300614.v1
Explore at:
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.26300614.v1
Dataset updated
Dec 6, 2024
Dataset provided by
Figshare (http://figshare.com/)
Authors
Franco Liberati
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Annotation of Aedes zammitii against the TrEMBL database using DIAMOND.
NR Viral TrEMBL
figshare.com
bz2
Updated Feb 9, 2018
Cite
Feargal Ryan (2018). NR Viral TrEMBL [Dataset]. http://doi.org/10.6084/m9.figshare.5822166.v1
Explore at:
Available download formats: bz2
Unique identifier
https://doi.org/10.6084/m9.figshare.5822166.v1
Dataset updated
Feb 9, 2018
Dataset provided by
Figshare (http://figshare.com/)
Authors
Feargal Ryan
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The viral subset of the TrEMBL database clustered at 95% identity at the amino acid level to remove redundancy.
Data from: SYSTERS
neuinfo.org
dknet.org
+2more
Updated Nov 12, 2024
Cite
(2024). SYSTERS [Dataset]. http://identifiers.org/RRID:SCR_007955/resolver/mentions
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_007955 https://identifiers.org/RRID:SCR_007955/resolver/mentions
Dataset updated
Nov 12, 2024
Description
SYSTERS is a database of protein sequences grouped into homologous families and superfamilies. The SYSTERS project aims to provide a meaningful partitioning of the whole protein sequence space by a fully automatic procedure. A refined two-step algorithm assigns each protein to a family and a superfamily. The sequence data underlying SYSTERS release 4 now comprise several protein sequence databases derived from completely sequenced genomes (ENSEMBL, TAIR, SGD and GeneDB), in addition to the comprehensive Swiss-Prot/TrEMBL databases. To augment the automatically derived results, information from external databases such as Pfam and Gene Ontology is added to the web server. Furthermore, users can retrieve pre-processed analyses of families, such as multiple alignments and phylogenetic trees. New query options comprise a batch retrieval tool for functional inference about families based on automatic keyword extraction from sequence annotations. A new access point, PhyloMatrix, allows the retrieval of phylogenetic profiles of SYSTERS families across organisms with completely sequenced genomes. Keywords: Gene, Human, Vertebrate, Genome, Human ORFs
Peptide Sequence Database
dknet.org
scicrunch.org
+1more
Updated Jan 29, 2022
Cite
(2022). Peptide Sequence Database [Dataset]. http://identifiers.org/RRID:SCR_005764
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_005764
Dataset updated
Jan 29, 2022
Description
The Peptide Sequence Database contains putative peptide sequences from human, mouse, rat, and zebrafish. Compressed to eliminate redundancy, these are about 40-fold smaller than a brute-force enumeration. Current and old releases are available for download. Each species' peptide sequence database comprises peptide sequence data from relevant species-specific UniGene and IPI clusters, plus all sequences from their constituent EST, mRNA and protein sequence databases, namely RefSeq proteins and mRNAs, UniProt's SwissProt and TrEMBL, GenBank mRNA, ESTs, and high-throughput cDNAs, HInv-DB, VEGA, EMBL, IPI protein sequences, plus the enumeration of all combinations of UniProt sequence variants, Met loss PTM, and signal peptide cleavages. The README file contains some information about the non-amino-acid symbols O (digest site corresponding to a protein N- or C-terminus) and J (no digest sequence join) used in these peptide sequence databases and information about how to configure various search engines to use them. Some search engines handle (very) long sequences badly and in some cases must be patched to use these peptide sequence databases. All search engines supported by the PepArML meta-search engine can (or can be patched to) successfully search these peptide sequence databases.
E-value distribution of the BLASTx hits against the Nr and TrEMBL databases...
datasetcatalog.nlm.nih.gov
Updated Jul 15, 2015
Cite
Leng, Yan; Shi, Rui-Fang; Li, Shi-Weng (2015). E-value distribution of the BLASTx hits against the Nr and TrEMBL databases for each unigene. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001873304
Explore at:
Dataset updated
Jul 15, 2015
Authors
Leng, Yan; Shi, Rui-Fang; Li, Shi-Weng
Description
E-value distribution of the BLASTx hits against the Nr and TrEMBL databases for each unigene.
Homologous Invertebrate Genes Database
neuinfo.org
scicrunch.org
+1more
Updated Jan 29, 2022
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
(2022). Homologous Invertebrate Genes Database [Dataset]. http://identifiers.org/RRID:SCR_007716
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_007716
Dataset updated
Jan 29, 2022
Description
A database of homologous invertebrate genes, structured under the ACNUC sequence database management system. It allows one to select sets of homologous genes among invertebrate species, and to visualize multiple alignments and phylogenetic trees. The database itself contains all invertebrate protein sequences from UniProt (SWISS-PROT+TrEMBL), with some data corrected, clarified or completed (notably to address the problem of redundancy and orthology/paralogy) and with some annotation modifications. It also contains all the corresponding nucleotide sequences in EMBL. Homologous proteins are classified into families, and multiple alignments and phylogenetic trees are computed for each family. Sequences and related information have been structured in an ACNUC database. Thus, HOINVGEN is particularly useful for comparative sequence analysis, phylogeny and molecular evolution studies. More generally, HOINVGEN gives an overall view of what is known about a particular gene family.
Protein-Protein Interaction Database
neuinfo.org
dknet.org
+1more
Updated Jan 29, 2022
Cite
(2022). Protein-Protein Interaction Database [Dataset]. http://identifiers.org/RRID:SCR_007288
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_007288
Dataset updated
Jan 29, 2022
Description
Mammalian protein-protein interaction database focusing on synaptic proteins. The Protein-Protein Interaction Database was originally a single person's attempt to integrate a gamut of biological/bibliographical/molecular data and build a framework which might help in understanding how cells orchestrate their protein content in order to become what they are: machines with a purpose. This is based on the simple paradigm that functional units like signal cascades are held together in a close space, thereby allowing specific events to occur without the necessity of passive diffusion and random events. The PPID database arose from the need to interpret proteomic datasets, which were generated by analysing the NMDA-receptor complex (see H. Husi, M. A. Ward, J. S. Choudhary, W. P. Blackstock and S. G. Grant (2000). Proteomic analysis of NMDA receptor-adhesion protein signaling complexes. Nat Neurosci 3, 661-669.). Studying these clusters of proteins unavoidably requires the handling of large datasets, which PPID is aimed at and tailored for. This database unifies molecular entries across three species, namely human, rat and mouse, and is based on sequence databases such as SwissProt, EMBL, TrEMBL (translated EMBL sequences) and Unigene and the literature database PubMed. A typical entry in PPID holds up to three general entries for the three species, all protein and gene accession numbers associated with them (assembled from Blast2 searches of the databases) and the OMIM entry as maintained by Johns Hopkins University. Furthermore, protein sequence information is also included, together with known and novel splice variants of each molecule as found by ClustalW sequence alignments. Entry points also include protein-binding information together with the literature reference. The whole database is curated manually to ensure accuracy and quality.
Querying the database will be possible by online browsing and batch submission for large datasets holding accession number information, as can be generated using software like Mascot for mass spectrometry. Cluster analysis of the submitted datasets in the form of a graphical output will be developed, as well as an easy-to-use web interface. An interface is currently being built in collaboration with the Department of Informatics (T. Theodosiou and D. Armstrong) and will be deployed soon. The current team of people collating and deploying the database are H. Husi (database mining and information gathering) and T. Theodosiou (web interface and deployment). Please note that this database is not funded financially, and cannot survive without sponsorship.
feature-representation-for-LLMs
figshare.com
xlsx
Updated Mar 28, 2024
Cite
wang rui; yujuan zhang; Zeyu Luo (2024). feature-representation-for-LLMs [Dataset]. http://doi.org/10.6084/m9.figshare.24312292.v6
Explore at:
Available download formats: xlsx
Unique identifier
https://doi.org/10.6084/m9.figshare.24312292.v6
Dataset updated
Mar 28, 2024
Dataset provided by
figshare
Authors
wang rui; yujuan zhang; Zeyu Luo
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
This is a database of ESM2 feature representations, which includes Swiss data, Swiss normalized data, original TrEMBL data, original TrEMBL normalized data, non-homology TrEMBL data and Table S10. Non-homologous TrEMBL normalized data can be created by extracting the Entry IDs from the non-homologous TrEMBL data and then extracting the corresponding feature representations from the original TrEMBL normalized data. Figure S4 (eos) and Figure S5 (eos) supplement the histogram plots and scatter plots of feature eos in the corresponding Figure S4 and Figure S5. Figure S6 and Figure S8 are the results of GO annotation enrichment. The GO gene set is a grouped protein dataset used for GO annotation enrichment. Figure S7 is a silhouette score plot. For specific usage of the dataset, please refer to GitHub. The RF_model files are pickle files for different RF models, which can be used for dataset inference and interpretable analysis. Among these models, the AA_count model and feature_all model have more complex feature inputs. Therefore, we provide the Swiss training dataset as a reference for feature arrangement. The feature order for other models is simply from 0 to 1279.
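The recipe in this description for building the non-homologous normalized set (take the Entry IDs from the non-homologous TrEMBL data, then pull the matching rows from the original normalized data) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the "Entry" column name, and the in-memory row layout are assumptions, and the toy values are made up. The actual spreadsheet layout should be checked against the project's GitHub instructions.

```python
from typing import Iterable, List, Mapping


def select_non_homologous(
    non_homologous_rows: Iterable[Mapping[str, object]],
    normalized_rows: Iterable[Mapping[str, object]],
    id_column: str = "Entry",  # assumed column name, not confirmed by the dataset
) -> List[Mapping[str, object]]:
    """Return the normalized rows whose ID also appears in the non-homologous set."""
    wanted = {row[id_column] for row in non_homologous_rows}
    return [row for row in normalized_rows if row[id_column] in wanted]


# Toy rows standing in for the spreadsheet contents (hypothetical IDs and values).
non_homologous = [{"Entry": "A0A001"}, {"Entry": "A0A003"}]
normalized = [
    {"Entry": "A0A001", "0": 0.12, "1": -0.40},
    {"Entry": "A0A002", "0": 0.55, "1": 0.08},
    {"Entry": "A0A003", "0": -0.31, "1": 0.97},
]

subset = select_non_homologous(non_homologous, normalized)
print([row["Entry"] for row in subset])
```

Collecting the IDs into a set first keeps the lookup cheap even when the normalized table is large.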
IPI
neuinfo.org
scicrunch.org
+2more
Updated Jul 21, 2018
Cite
(2018). IPI [Dataset]. http://identifiers.org/RRID:SCR_003012
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_003012
Dataset updated
Jul 21, 2018
Description
IPI provides a top-level guide to the main databases (UniProtKB/Swiss-Prot, UniProtKB/TrEMBL, RefSeq, Ensembl, TAIR, H-InvDB, Vega) that describe the proteomes of higher eukaryotic organisms. IPI: (1) effectively maintains a database of cross-references between the primary data sources; (2) provides minimally redundant yet maximally complete sets of proteins for featured species (one sequence per transcript); (3) maintains stable identifiers (with incremental versioning) to allow the tracking of sequences in IPI between IPI releases. IPI is updated monthly in accordance with the latest data released by the primary data sources. As previously announced, the closure of IPI has been proposed for some time. Replacement data sets are now available through UniProt for human and mouse; sets for the other species contained within IPI are expected to be included as part of the UniProt release 2011_07. To allow users time to transition to using the new UniProt data sets, IPI releases will continue to be produced throughout the summer. The final release will be made in September 2011. Thereafter, the IPI website will cease to be maintained, although previous releases of the dataset will continue to be available from the FTP site. We would like to thank our users for their support and interest in this service.
Summary statistics for the number of protein-coding gene predictions for...
datasetcatalog.nlm.nih.gov
plos.figshare.com
Updated May 16, 2023
Cite
Buti, Matteo; Price, R. Jordan; Røen, Dag; Šurbanovski, Nada; Harrison, Richard J.; Lynn, Samantha; Sargent, Daniel James; Alsheikh, Muath; Nellist, Charlotte F.; Davik, Jahn; Fernandéz, Felicidad Fernandéz; Bates, Helen J. (2023). Summary statistics for the number of protein-coding gene predictions for ‘Anitra’, ‘Autumn Bliss’ and ‘Malling Jewel’ that returned ≥1 positive hit after the BlastP analysis with nr, Araport11, RefSeq, SwissProt and TrEMBL databases as subjects, along with the number of protein-coding gene regions assigned Interpro, GO, KEGG orthology and KEGG pathway terms. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001053064
Explore at:
Dataset updated
May 16, 2023
Authors
Buti, Matteo; Price, R. Jordan; Røen, Dag; Šurbanovski, Nada; Harrison, Richard J.; Lynn, Samantha; Sargent, Daniel James; Alsheikh, Muath; Nellist, Charlotte F.; Davik, Jahn; Fernandéz, Felicidad Fernandéz; Bates, Helen J.
Description
Summary statistics for the number of protein-coding gene predictions for ‘Anitra’, ‘Autumn Bliss’ and ‘Malling Jewel’ that returned ≥1 positive hit after the BlastP analysis with nr, Araport11, RefSeq, SwissProt and TrEMBL databases as subjects, along with the number of protein-coding gene regions assigned Interpro, GO, KEGG orthology and KEGG pathway terms.
ExPASy Biochemical Pathways
neuinfo.org
scicrunch.org
+1more
Updated Jan 29, 2022
+ more versions
Cite
(2022). ExPASy Biochemical Pathways [Dataset]. http://identifiers.org/RRID:SCR_007944
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_007944
Dataset updated
Jan 29, 2022
Description
The ExPASy (Expert Protein Analysis System) proteomics server of the Swiss Institute of Bioinformatics (SIB) is dedicated to the analysis of protein sequences and structures as well as 2-D PAGE. It is a curated protein sequence database which strives to provide a high level of annotation, a minimal level of redundancy and high level of integration with other databases. Recent developments of the database include format and content enhancements, cross-references to additional databases, new documentation files and improvements to TrEMBL, a computer-annotated supplement to SWISS-PROT.
Functional annotation returned by CELLO2GO for Gram-negative bacteria...
datasetcatalog.nlm.nih.gov
figshare.com
Updated Jun 9, 2014
+ more versions
Cite
Yu, Chin-Sheng; Lu, Chih-Hao; Su, Wen-Chi; Huang, Shao-Wei; Cheng, Chih-Wen; Chang, Kuei-Chung; Hwang, Jenn-Kang (2014). Functional annotation returned by CELLO2GO for Gram-negative bacteria sequence dataset PS30GN. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001188541
Explore at:
Dataset updated
Jun 9, 2014
Authors
Yu, Chin-Sheng; Lu, Chih-Hao; Su, Wen-Chi; Huang, Shao-Wei; Cheng, Chih-Wen; Chang, Kuei-Chung; Hwang, Jenn-Kang
Description
1. The percentage of homologous sequences for which GO functional annotations were not found by a BLAST search of the in-house database derived from the UniProtKB/SwissProt database for bacteria.
2. The percentage of homologous sequences for which GO functional annotations were not found by a BLAST search of the in-house database derived from the UniProtKB/TrEMBL database for bacteria.
3. The percentage of entries for which GO annotations for cellular components were missing or homologs were not retrieved by BLAST searching of the UniProtKB/TrEMBL databases, but for which CELLO accurately predicted the subcellular localization(s).
4. The Gram-negative bacterial benchmark dataset found in PSORTb3.0 [23], denoted PS30GN, includes 8029 protein sequences in five subcellular categories: extracellular, outer membrane, periplasmic, inner membrane, and cytoplasmic.
Identification of coimmunoprecipitated proteins from B. glabrata.
datasetcatalog.nlm.nih.gov
figshare.com
Updated Feb 20, 2013
Cite
Moné, Yves; Mitta, Guillaume; Gourbal, Benjamin; Duval, David; Du Pasquier, Louis; Kieffer-Jaquinod, Sylvie (2013). Identification of coimmunoprecipitated proteins from B. glabrata. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001652585
Explore at:
Dataset updated
Feb 20, 2013
Authors
Moné, Yves; Mitta, Guillaume; Gourbal, Benjamin; Duval, David; Du Pasquier, Louis; Kieffer-Jaquinod, Sylvie
Description
LC-MS/MS results were used to interrogate the Swiss-Prot/TrEMBL database (MSdb) and the Biomphalaria glabrata ESTs database (Bg-dbEST). A protein was considered to be correctly identified if at least two peptides were confidently matched with a score greater than 100. ID: identified, C: compatible combination, IC: incompatible combination.
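The identification criterion stated above can be written as a small predicate. This is only a sketch under one reading of the description: it assumes the score threshold applies to each matched peptide individually, which the text does not make explicit. The function name, parameter names, and sample scores are hypothetical.

```python
from typing import Sequence


def is_identified(
    peptide_scores: Sequence[float],
    min_peptides: int = 2,     # "at least two peptides"
    min_score: float = 100.0,  # "score greater than 100" (assumed per peptide)
) -> bool:
    """True if enough peptides score strictly above the threshold."""
    confident = [score for score in peptide_scores if score > min_score]
    return len(confident) >= min_peptides


# Hypothetical scores for two proteins.
print(is_identified([130.0, 101.5, 42.0]))  # two peptides above 100
print(is_identified([250.0, 99.0]))         # only one peptide above 100
```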
Xpro
neuinfo.org
dknet.org
+1more
Updated Oct 7, 2024
Cite
(2024). Xpro [Dataset]. http://identifiers.org/RRID:SCR_007976
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_007976 https://identifiers.org/RRID:SCR_007976/resolver/mentions
Dataset updated
Oct 7, 2024
Description
THIS RESOURCE IS NO LONGER IN SERVICE, documented on July 16, 2013. A relational database that contains all the eukaryotic protein-encoding DNA sequences in GenBank. It provides detailed and comprehensive features about both intron-containing and intron-less genes. In addition to the information found in the GenBank records, which includes properties such as sequence, position, length and description of introns, exons and protein-coding regions, Xpro provides annotations on splice-site motifs and intron phases. Furthermore, Xpro validates intron positions using alignment information between the record's sequence and EST sequences found in dbEST. The entries in Xpro are also cross-referenced to the SWISS-PROT/TrEMBL and Pfam databases. The unprecedented growth of data in GenBank, the primary repository of nucleotide sequences, driven by the ever-increasing number of genome and EST sequencing projects, together with the poor annotation of the exon/intron details required for molecular evolution studies in the primary nucleotide database, motivated the development of the Xpro database. It is a specialized database that contains details about genomic features specific to eukaryotic genes and provides various web tools for analyzing and visualizing these features. THIS RESOURCE IS NO LONGER IN SERVICE. Documented on September 16, 2025.
Functional annotation returned by CELLO2GO for archaeal dataset PS30Arch.
datasetcatalog.nlm.nih.gov
Updated Jun 9, 2014
Cite
Chang, Kuei-Chung; Hwang, Jenn-Kang; Yu, Chin-Sheng; Su, Wen-Chi; Huang, Shao-Wei; Cheng, Chih-Wen; Lu, Chih-Hao (2014). Functional annotation returned by CELLO2GO for archaeal dataset PS30Arch. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001188524
Explore at:
Dataset updated
Jun 9, 2014
Authors
Chang, Kuei-Chung; Hwang, Jenn-Kang; Yu, Chin-Sheng; Su, Wen-Chi; Huang, Shao-Wei; Cheng, Chih-Wen; Lu, Chih-Hao
Description
1. The percentage of homologous sequences for which GO functional annotations were not found by a BLAST search of the in-house database derived from the UniProtKB/SwissProt database for archaea.
2. The percentage of homologous sequences for which GO functional annotations were not found by a BLAST search of the in-house database derived from the UniProtKB/TrEMBL database for archaea.
3. The percentage of entries for which GO annotations for cellular components were missing or homologs were not retrieved by BLAST searching of the UniProtKB/TrEMBL databases, but for which CELLO accurately predicted the subcellular localization(s).
4. The archaeal benchmark dataset found in PSORTb3.0 [23], denoted PS30Arch, includes 805 protein sequences in four subcellular categories: extracellular, cell wall, membrane, and cytoplasmic.
Functional annotation returned by CELLO2GO for Pseudomonas aeruginosa PA01...
datasetcatalog.nlm.nih.gov
plos.figshare.com
Updated Jun 9, 2014
Cite
Su, Wen-Chi; Cheng, Chih-Wen; Lu, Chih-Hao; Yu, Chin-Sheng; Huang, Shao-Wei; Chang, Kuei-Chung; Hwang, Jenn-Kang (2014). Functional annotation returned by CELLO2GO for Pseudomonas aeruginosa PA01 dataset. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001188544
Explore at:
Dataset updated
Jun 9, 2014
Authors
Su, Wen-Chi; Cheng, Chih-Wen; Lu, Chih-Hao; Yu, Chin-Sheng; Huang, Shao-Wei; Chang, Kuei-Chung; Hwang, Jenn-Kang
Description
1. The percentage of homologous sequences for which GO functional annotations were not found by a BLAST search of the in-house database derived from the UniProtKB/SwissProt database for bacteria.
2. The percentage of homologous sequences for which GO functional annotations were not found by a BLAST search of the in-house database derived from the UniProtKB/TrEMBL database for bacteria.
3. The percentage of entries for which GO annotations for cellular components were missing or homologs were not retrieved by BLAST searching of the UniProtKB/TrEMBL databases, but for which CELLO accurately predicted the subcellular localization(s).
4. The proteomic sequence data is that of the newly documented Pseudomonas aeruginosa PA01 dataset [31], which contains hypothetical and uncharacterized proteins.