100+ datasets found

PROSITE profiles
ebi.ac.uk
Updated Feb 5, 2025
Cite
(2025). PROSITE profiles [Dataset]. https://www.ebi.ac.uk/interpro/
Dataset updated
Feb 5, 2025
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
PROSITE is a database of protein families and domains. It consists of biologically significant sites, patterns and profiles that help to reliably identify to which known protein family a new sequence belongs. PROSITE is based at the Swiss Institute of Bioinformatics (SIB), Geneva, Switzerland.
Swiss-Prot database
springernature.figshare.com
application/cdfv2
Updated Jun 1, 2023
Cite
Shuqi Wang; Cuihong You; Hongyu Ma; Yin Zhang; Guidong Miao; Qingyang Wu; Fan Lin; Jude Juventus Aweya (2023). Swiss-Prot database [Dataset]. http://doi.org/10.6084/m9.figshare.6124457.v1
Available download formats
application/cdfv2
Unique identifier
https://doi.org/10.6084/m9.figshare.6124457.v1
Dataset updated
Jun 1, 2023
Dataset provided by
Figshare http://figshare.com/
Authors
Shuqi Wang; Cuihong You; Hongyu Ma; Yin Zhang; Guidong Miao; Qingyang Wu; Fan Lin; Jude Juventus Aweya
License
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
All Portunus sanguinolentus unigenes with hits in the Swiss-Prot database.
Data from: PROSITE
prosite.expasy.org
the-mouth.com
+7 more
Updated Jun 18, 2025
Cite
(2025). PROSITE [Dataset]. https://prosite.expasy.org/
Dataset updated
Jun 18, 2025
Description
PROSITE consists of documentation entries describing protein domains, families and functional sites, as well as associated patterns and profiles to identify them. PROSITE is complemented by ProRule, a collection of rules based on profiles and patterns, which increases the discriminatory power of profiles and patterns by providing additional information about functionally and/or structurally critical amino acids.
UniProt
scicrunch.org
dknet.org
+2 more
Updated Jan 29, 2022
Cite
(2022). UniProt [Dataset]. http://identifiers.org/RRID:SCR_002380
Unique identifier
https://identifiers.org/RRID:SCR_002380
Dataset updated
Jan 29, 2022
Description
Collection of protein sequence and functional information, and a resource for protein sequence and annotation data. Consortium for the preservation of the UniProt databases: UniProt Knowledgebase (UniProtKB), UniProt Reference Clusters (UniRef), UniProt Archive (UniParc), and UniProt Proteomes. It is a collaboration between the European Bioinformatics Institute (EMBL-EBI), the SIB Swiss Institute of Bioinformatics, and the Protein Information Resource. Swiss-Prot is a curated subset of UniProtKB.
HAMAP
ebi.ac.uk
Updated Feb 5, 2025
Cite
(2025). HAMAP [Dataset]. https://www.ebi.ac.uk/interpro/
Dataset updated
Feb 5, 2025
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
HAMAP stands for High-quality Automated and Manual Annotation of Proteins. HAMAP profiles are manually created by expert curators. They identify proteins that are part of well-conserved protein families or subfamilies. HAMAP is based at the SIB Swiss Institute of Bioinformatics, Geneva, Switzerland.
Proven Drug Targets Converted to Human SwissProt Accessions
johnsnowlabs.com
csv
Updated Jan 20, 2021
Cite
John Snow Labs (2021). Proven Drug Targets Converted to Human SwissProt Accessions [Dataset]. https://www.johnsnowlabs.com/marketplace/proven-drug-targets-converted-to-human-swissprot-accessions/
Available download formats
csv
Dataset updated
Jan 20, 2021
Dataset authored and provided by
John Snow Labs
Area covered
N/A
Description
This dataset is supplementary data from "Novelty in the target landscape of the pharmaceutical industry" (2013). The listing of proven drug targets is converted to 248 human Swiss-Prot accessions.
Approved and Researched Drug Targets Human SwissProt Accessions
johnsnowlabs.com
csv
Updated Jan 20, 2021
Cite
John Snow Labs (2021). Approved and Researched Drug Targets Human SwissProt Accessions [Dataset]. https://www.johnsnowlabs.com/marketplace/approved-and-researched-drug-targets-human-swissprot-accessions/
Available download formats
csv
Dataset updated
Jan 20, 2021
Dataset authored and provided by
John Snow Labs
Area covered
N/A
Description
This dataset is supplementary data from "Analysis of in vitro bioactivity data extracted from drug discovery literature and patents: Ranking 1654 human protein targets by assayed compounds and molecular scaffolds" (2011). In this case the Entrez Gene IDs were mapped to 1651 human Swiss-Prot accessions, which include both approved and research targets.
Number of human protein variations collected from the UniProt/Swiss-Prot...
plos.figshare.com
xls
Updated Jun 1, 2023
Cite
Yongwook Choi; Gregory E. Sims; Sean Murphy; Jason R. Miller; Agnes P. Chan (2023). Number of human protein variations collected from the UniProt/Swiss-Prot database. [Dataset]. http://doi.org/10.1371/journal.pone.0046688.t001
Available download formats
xls
Unique identifier
https://doi.org/10.1371/journal.pone.0046688.t001
Dataset updated
Jun 1, 2023
Dataset provided by
PLOS http://plos.org/
Authors
Yongwook Choi; Gregory E. Sims; Sean Murphy; Jason R. Miller; Agnes P. Chan
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Number of human protein variations collected from the UniProt/Swiss-Prot database.
Matches Found in Swiss-Prot Database.
plos.figshare.com
xls
Updated May 31, 2023
Cite
Kemal Sonmez; Naunihal T. Zaveri; Ilan A. Kerman; Sharon Burke; Charles R. Neal; Xinmin Xie; Stanley J. Watson; Lawrence Toll (2023). Matches Found in Swiss-Prot Database. [Dataset]. http://doi.org/10.1371/journal.pcbi.1000258.t002
Available download formats
xls
Unique identifier
https://doi.org/10.1371/journal.pcbi.1000258.t002
Dataset updated
May 31, 2023
Dataset provided by
PLOS http://plos.org/
Authors
Kemal Sonmez; Naunihal T. Zaveri; Ilan A. Kerman; Sharon Burke; Charles R. Neal; Xinmin Xie; Stanley J. Watson; Lawrence Toll
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
False positives
Other signaling molecules: FGF-3,5,7,10,17,18; GDNF; CD8,28; PDGF-2; TGF; VEGF (vascular endothelial growth factor); HBNF-1; MIP; NGF (nerve growth factor); Cytokine A21, IFN-α (interferon alpha); IGF binding protein 1B,2,3; IL7 (interleukin 7).
Other: MAGF (microfibril associated protein), MINK (K-channel), K-channel related peptide, L-type Ca2+ channel, gamma subunit, myelin Po protein, Dif-2, Eosinophil, Syntaxin 1B (vesicle docking), Syntaxin 2, TMP21 (vesicle trafficking protein), Coagulation factor III, PGD2 synthase, syndecans, FKBP12 (FK506 binding protein), Folate receptor, ERp29, COMT, Connexin 32, Cytostatin.
SWISS-MODEL Homology Protein Models for Proteome UP000000589 - (Mus...
swissmodel.expasy.org
gz
Updated Sep 16, 2016
Cite
(2016). SWISS-MODEL Homology Protein Models for Proteome UP000000589 - (Mus musculus) [Dataset]. https://swissmodel.expasy.org/repository/species/10090
Available download formats
gz
Dataset updated
Sep 16, 2016
Description
SWISS-MODEL homology protein models mapping to UniProtKB Proteome UP000000589 (Mus musculus)
ExPASy ABCD database
neuinfo.org
dknet.org
+1 more
Updated Aug 5, 2024
Cite
(2024). ExPASy ABCD database [Dataset]. http://identifiers.org/RRID:SCR_017401
Unique identifier
https://identifiers.org/RRID:SCR_017401
Dataset updated
Aug 5, 2024
Description
Manually curated repository of sequenced antibodies, integrating curated information about each antibody and its antigen with cross-links to standardized databases of chemical and protein entities. It is developed by the Geneva Antibody Facility at the University of Geneva, in collaboration with the CALIPHO and Swiss-Prot groups at the SIB Swiss Institute of Bioinformatics. The database provides a list of sequenced antibodies with their known targets. Each antibody is assigned a unique ID number that can be used in academic publications to increase reproducibility of experiments.
Breakdown of the eukaryotic protein benchmark dataset derived from...
plos.figshare.com
xls
Updated May 30, 2023
Cite
Kuo-Chen Chou; Hong-Bin Shen (2023). Breakdown of the eukaryotic protein benchmark dataset derived from Swiss-Prot database (release 55.3) according to the procedures described in the Materials section. [Dataset]. http://doi.org/10.1371/journal.pone.0009931.t001
Available download formats
xls
Unique identifier
https://doi.org/10.1371/journal.pone.0009931.t001
Dataset updated
May 30, 2023
Dataset provided by
PLOS ONE
Authors
Kuo-Chen Chou; Hong-Bin Shen
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
None of the proteins included here has sequence identity to any other in a same subcellular location.
a See Fig. 1 and Eq. 1 as well as the relevant text for the definitions of the subsets listed in this table.
b See Eqs. 2–3 for the definition of the number of virtual proteins and its relation to the number of different proteins.
c Of the 7,766 different proteins, 6,687 belong to one subcellular location, 1,029 to two locations, 48 to three locations, and 2 to four locations. See Online Supporting Information S1 for the protein sequences.
SWISS-MODEL Homology Protein Models for Proteome UP000000625 - (Escherichia...
swissmodel.expasy.org
gz
Updated Sep 16, 2016
Cite
(2016). SWISS-MODEL Homology Protein Models for Proteome UP000000625 - (Escherichia coli) [Dataset]. https://swissmodel.expasy.org/repository/species/83333
Available download formats
gz
Dataset updated
Sep 16, 2016
Description
SWISS-MODEL homology protein models mapping to UniProtKB Proteome UP000000625 (Escherichia coli)
Data from: The SWISS-PROT protein knowledgebase and its supplement TrEMBL in...
omicsdi.org
Updated Jan 1, 1994
Cite
(1994). The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003. [Dataset]. https://www.omicsdi.org/dataset/biostudies/S-EPMC165542
Dataset updated
Jan 1, 1994
Variables measured
Unknown
Description
The SWISS-PROT protein knowledgebase (http://www.expasy.org/sprot/ and http://www.ebi.ac.uk/swissprot/) connects amino acid sequences with the current knowledge in the Life Sciences. Each protein entry provides an interdisciplinary overview of relevant information by bringing together experimental results, computed features and sometimes even contradictory conclusions. Detailed expertise that goes beyond the scope of SWISS-PROT is made available via direct links to specialised databases. SWISS-PROT provides annotated entries for all species, but concentrates on the annotation of entries from human (the HPI project) and other model organisms to ensure the presence of high quality annotation for representative members of all protein families. Part of the annotation can be transferred to other family members, as is already done for microbes by the High-quality Automated and Manual Annotation of microbial Proteomes (HAMAP) project. Protein families and groups of proteins are regularly reviewed to keep up with current scientific findings. Complementarily, TrEMBL strives to comprise all protein sequences that are not yet represented in SWISS-PROT, by incorporating a perpetually increasing level of mostly automated annotation. Researchers are welcome to contribute their knowledge to the scientific community by submitting relevant findings to SWISS-PROT at swiss-prot@expasy.org.
The Therapeutic Drug Target Database Human SwissProt
johnsnowlabs.com
csv
Cite
John Snow Labs, The Therapeutic Drug Target Database Human SwissProt [Dataset]. https://www.johnsnowlabs.com/marketplace/the-therapeutic-drug-target-database-human-swissprot/
Available download formats
csv
Dataset authored and provided by
John Snow Labs
Area covered
N/A
Description
This dataset is a selection of The Therapeutic Target Database (release 4.3.02, 18th Oct 2013) protein IDs for successful targets. The web page states 388 targets, but these reduce to 345 human Swiss-Prot accessions.
PSSH2 - database of protein sequence-to-structure homologies (including...
data.niaid.nih.gov
zenodo.org
Updated Feb 11, 2022
Cite
Sandeep Kaur (2022). PSSH2 - database of protein sequence-to-structure homologies (including Sars-CoV-2 structures) [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4279163
Dataset updated
Feb 11, 2022
Dataset provided by
Neblina Sikta
Sandeep Kaur
Sean O'Donoghue
Andrea Schafferhans
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Protein sequence and structure data

This data set contains data from UniProt (in the files called protein_sequence, protein_synonyms, protein_names, organism_synonyms) and PDB (in the files called PDB and PDB_chain) as used by the Aquaria web resource at the time of download (2022-02-08).

The PSSH2 data set

PSSH2 is a database of protein sequence-to-structure homologies based on HHblits, an alignment method employing iterative comparisons of hidden Markov models (HMMs). To ensure the highest possible final alignment quality for matches in Aquaria using HHblits, we first calculate HMM profiles for each unique PDB sequence (PDB_full) and also for each unique Swiss-Prot sequence. We generated PSSH2 using HHblits to find similarities between HMMs from PDB and HMMs from UniProt sequences.

Calculating PSSH2

The Swiss-Prot and PDB data were downloaded in November 2021. Generating PSSH2: We used UniRef30_2021_03 (originally called UniRef30_2021_06) from HH-suite, a database of non-redundant UniProt sequence clusters in which the highest pairwise sequence identity between clusters was 30%. The HHblits code and the code for running the calculations were retrieved from git (https://github.com/soedinglab/hh-suite.git and https://github.com/aschafu/PSSH2.git respectively) at the respective time of calculation in the timeframe until December 2021.

PDB based sequence-to-structure alignments

In addition to the PSSH2 data, new PDB structures were retrieved based on the primary accession of the proteins, by querying for all chains in all PDB entries with exact matches using the sequence cross references records given in PDB. Sequence-to-structure alignments were then created, again based on information provided in each PDB entry. These are contained in the PDBchain data.

This data covers sequences and PDB structures in the timeframe until February 2022.

Evaluating PSSH2

The resulting alignment data was analysed using CATH domain assignments downloaded from /cath/releases/all-releases/v4_2_0/cath-classification-data/ to define correct hits and false hits:

The set of query sequences is defined by the CATH non-redundant S40_overlap_60 dataset (ftp://orengoftp.biochem.ucl.ac.uk/cath/releases/all-releases/v4_2_0/non-redundant-data-sets/)

The set of all expected hits comprises all PDB structures containing a domain with the same CATH code, either if contained in the set of processed sequences (-> all) or only if also contained in the set of non-redundant sequences (-> nr40).

The set of true positives is defined by sharing the same CATH code up to the level of homology ("CATH") or up to the level of topology ("CAT").

The data was evaluated with respect to false discovery rate (FDR) and recall (true positive rate TPR) by cumulatively considering all hits with an E-value below the threshold ("C") or in bins with an E-value between the threshold and one tenth of the threshold ("B"). This evaluation was carried out for the data obtained in November 2021 (202111) as well as previous data from October 2020 (202010), February 2020 (202002) and September 2017 (201709). The results are collected in PSSH CATH validation.csv.
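
The cumulative ("C") and binned ("B") evaluation described above can be sketched as follows. This is a minimal illustration only, not the PSSH2 authors' code: the `Hit` class, the `evaluate` function, the parameter names, and the choice of bin boundaries (lower bound inclusive, upper bound exclusive) are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One alignment hit (hypothetical structure for this sketch)."""
    evalue: float
    is_true: bool  # True if the hit shares the CATH code with the query


def evaluate(hits, threshold, expected_total, mode="C"):
    """Return (fdr, recall) for hits selected by E-value.

    mode "C": cumulative, all hits with E-value below the threshold.
    mode "B": binned, hits with threshold/10 <= E-value < threshold
              (the exact boundary handling is an assumption).
    expected_total: number of expected hits used as the recall denominator.
    """
    if mode == "C":
        selected = [h for h in hits if h.evalue < threshold]
    elif mode == "B":
        selected = [h for h in hits if threshold / 10 <= h.evalue < threshold]
    else:
        raise ValueError("mode must be 'C' or 'B'")

    true_pos = sum(1 for h in selected if h.is_true)
    false_pos = len(selected) - true_pos
    # FDR = false positives / all selected hits; 0.0 if nothing was selected
    fdr = false_pos / len(selected) if selected else 0.0
    # Recall (TPR) = true positives / all expected hits
    recall = true_pos / expected_total if expected_total else 0.0
    return fdr, recall
```

Calling `evaluate(hits, 1e-3, expected_total, "C")` and then the same call with `"B"` on one hit list shows how the binned view isolates the quality of hits near the threshold, while the cumulative view reports the overall quality of everything accepted at that threshold.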

Known errors

Due to a processing error, the profile of PDB structure 5fia A / B (sequence md5 052667679fc644184f40063c7602c9e1) is incomplete in the pdb_full hhblits database, which led to further errors in generating sequence-based alignments for 1vtm P (sequence md5 c844aff103449363cb8489c78c58ebf1) and 434t A / B (sequence md5 d67aa1c3a36492c719cb48b5e7ecc624).
SWISS-2DPAGE
dknet.org
neuinfo.org
+1 more
Updated Jan 29, 2022
Cite
(2022). SWISS-2DPAGE [Dataset]. http://identifiers.org/RRID:SCR_006946
Unique identifier
https://identifiers.org/RRID:SCR_006946
Dataset updated
Jan 29, 2022
Description
A database of proteins identified by various 2-D PAGE and SDS-PAGE reference maps. Each SWISS-2DPAGE entry contains textual data on one protein, including mapping procedures, physiological and pathological information, experimental data (isoelectric point, molecular weight, amino acid composition, peptide masses) and bibliographical references. In addition to this textual data, SWISS-2DPAGE provides several 2-D PAGE and SDS-PAGE images showing the experimentally determined location of the protein, as well as a theoretical region computed from the protein sequence, indicating where the protein might be found in the gel. Using the database, users can locate these proteins on the 2-D PAGE maps or display the region of a 2-D PAGE map where one might expect to find a protein from UniProtKB/Swiss-Prot.
UniProtKB
scicrunch.org
Updated Oct 24, 2019
Cite
(2019). UniProtKB [Dataset]. http://identifiers.org/RRID:SCR_004426
Unique identifier
https://identifiers.org/RRID:SCR_004426
Dataset updated
Oct 24, 2019
Description
Central repository for collection of functional information on proteins, with accurate and consistent annotation. In addition to capturing core data mandatory for each UniProtKB entry (mainly, the amino acid sequence, protein name or description, taxonomic data and citation information), as much annotation information as possible is added. This includes widely accepted biological ontologies, classifications and cross-references, and experimental and computational data. The UniProt Knowledgebase consists of two sections, UniProtKB/Swiss-Prot and UniProtKB/TrEMBL. UniProtKB/Swiss-Prot (reviewed) is a high quality manually annotated and non-redundant protein sequence database which brings together experimental results, computed features, and scientific conclusions. UniProtKB/TrEMBL (unreviewed) contains protein sequences associated with computationally generated annotation and large-scale functional characterization that await full manual annotation. Users may browse by taxonomy, keyword, gene ontology, enzyme class or pathway.
NCBIFAM
ebi.ac.uk
Updated Dec 16, 2024
Cite
(2024). NCBIFAM [Dataset]. https://www.ebi.ac.uk/interpro/
Dataset updated
Dec 16, 2024
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
NCBIfam is a collection of protein families, featuring curated multiple sequence alignments, hidden Markov models (HMMs) and annotation, which provides a tool for identifying functionally related proteins based on sequence homology. NCBIfam is maintained at the National Center for Biotechnology Information (Bethesda, MD). NCBIfam includes models from TIGRFAMs, another database of protein families developed at The Institute for Genomic Research, then at the J. Craig Venter Institute (Rockville, MD, US).
Repository URL
cinergi.sdsc.edu
resource url
Cite
Repository URL [Dataset]. http://cinergi.sdsc.edu/geoportal/rest/metadata/item/323ebc5365ec476ebdcb92329cf10b57/html
Available download formats
resource url
Description
Link Function: information
