100+ datasets found
  1. PROSITE profiles

    • ebi.ac.uk
    Updated Feb 5, 2025
    Cite
    (2025). PROSITE profiles [Dataset]. https://www.ebi.ac.uk/interpro/
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    PROSITE is a database of protein families and domains. It consists of biologically significant sites, patterns and profiles that help to reliably identify to which known protein family a new sequence belongs. PROSITE is based at the Swiss Institute of Bioinformatics (SIB), Geneva, Switzerland.

  2. Swiss-Prot database

    • springernature.figshare.com
    application/cdfv2
    Updated Jun 1, 2023
    Cite
    Shuqi Wang; Cuihong You; Hongyu Ma; Yin Zhang; Guidong Miao; Qingyang Wu; Fan Lin; Jude Juventus Aweya (2023). Swiss-Prot database [Dataset]. http://doi.org/10.6084/m9.figshare.6124457.v1
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Shuqi Wang; Cuihong You; Hongyu Ma; Yin Zhang; Guidong Miao; Qingyang Wu; Fan Lin; Jude Juventus Aweya
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    All unigenes of Portunus sanguinolentus hit to the Swiss-Prot database.

  3. Data from: PROSITE

    • prosite.expasy.org
    • the-mouth.com
    • +7 more
    Updated Jun 18, 2025
    Cite
    (2025). PROSITE [Dataset]. https://prosite.expasy.org/
    Description

    PROSITE consists of documentation entries describing protein domains, families and functional sites as well as associated patterns and profiles to identify them. PROSITE is complemented by ProRule, a collection of rules based on profiles and patterns, which increases the discriminatory power of profiles and patterns by providing additional information about functionally and/or structurally critical amino acids.

  4. UniProt

    • scicrunch.org
    • dknet.org
    • +2 more
    Updated Jan 29, 2022
    Cite
    (2022). UniProt [Dataset]. http://identifiers.org/RRID:SCR_002380
    Description

    Collection of data of protein sequence and functional information. Resource for protein sequence and annotation data. Consortium for preservation of the UniProt databases: UniProt Knowledgebase (UniProtKB), UniProt Reference Clusters (UniRef), and UniProt Archive (UniParc), UniProt Proteomes. Collaboration between European Bioinformatics Institute (EMBL-EBI), SIB Swiss Institute of Bioinformatics and Protein Information Resource. Swiss-Prot is a curated subset of UniProtKB.

  5. HAMAP

    • ebi.ac.uk
    Updated Feb 5, 2025
    Cite
    (2025). HAMAP [Dataset]. https://www.ebi.ac.uk/interpro/
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    HAMAP stands for High-quality Automated and Manual Annotation of Proteins. HAMAP profiles are manually created by expert curators. They identify proteins that are part of well-conserved protein families or subfamilies. HAMAP is based at the SIB Swiss Institute of Bioinformatics, Geneva, Switzerland.

  6. Proven Drug Targets Converted to Human SwissProt Accessions

    • johnsnowlabs.com
    csv
    Updated Jan 20, 2021
    Cite
    John Snow Labs (2021). Proven Drug Targets Converted to Human SwissProt Accessions [Dataset]. https://www.johnsnowlabs.com/marketplace/proven-drug-targets-converted-to-human-swissprot-accessions/
    Dataset authored and provided by
    John Snow Labs
    Area covered
    N/A
    Description

    This dataset is supplementary data from "Novelty in the target landscape of the pharmaceutical industry" (2013). The listing of proven drug targets is converted to 248 human Swiss-Prot accessions.

  7. Approved and Researched Drug Targets Human SwissProt Accessions

    • johnsnowlabs.com
    csv
    Updated Jan 20, 2021
    Cite
    John Snow Labs (2021). Approved and Researched Drug Targets Human SwissProt Accessions [Dataset]. https://www.johnsnowlabs.com/marketplace/approved-and-researched-drug-targets-human-swissprot-accessions/
    Dataset authored and provided by
    John Snow Labs
    Area covered
    N/A
    Description

    This dataset is supplementary data from "Analysis of in vitro bioactivity data extracted from drug discovery literature and patents: Ranking 1654 human protein targets by assayed compounds and molecular scaffolds" (2011). In this case the Entrez Gene IDs were mapped to 1651 human Swiss-Prot accessions, which include both approved and research targets.

  8. Number of human protein variations collected from the UniProt/Swiss-Prot...

    • plos.figshare.com
    xls
    Updated Jun 1, 2023
    Cite
    Yongwook Choi; Gregory E. Sims; Sean Murphy; Jason R. Miller; Agnes P. Chan (2023). Number of human protein variations collected from the UniProt/Swiss-Prot database. [Dataset]. http://doi.org/10.1371/journal.pone.0046688.t001
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Yongwook Choi; Gregory E. Sims; Sean Murphy; Jason R. Miller; Agnes P. Chan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Number of human protein variations collected from the UniProt/Swiss-Prot database.

  9. Matches Found in Swiss-Prot Database.

    • plos.figshare.com
    xls
    Updated May 31, 2023
    Cite
    Kemal Sonmez; Naunihal T. Zaveri; Ilan A. Kerman; Sharon Burke; Charles R. Neal; Xinmin Xie; Stanley J. Watson; Lawrence Toll (2023). Matches Found in Swiss-Prot Database. [Dataset]. http://doi.org/10.1371/journal.pcbi.1000258.t002
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Kemal Sonmez; Naunihal T. Zaveri; Ilan A. Kerman; Sharon Burke; Charles R. Neal; Xinmin Xie; Stanley J. Watson; Lawrence Toll
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    False Positives

    Other signaling molecules: FGF-3,5,7,10,17,18; GDNF; CD8,28; PDGF-2; TGF; VEGF (vascular endothelial growth factor); HBNF-1; MIP; NGF (nerve growth factor); Cytokine A21, IFN-α (interferon alpha); IGF binding protein 1B,2,3; IL7 (interleukin 7).

    Other: MAGF (microfibril associated protein), MINK (K-channel), K-channel related peptide, L-type Ca2+ channel, gamma subunit, myelin Po protein, Dif-2, Eosinophil, Syntaxin 1B (vesicle docking), Syntaxin 2, TMP21 (vesicle trafficking protein), Coagulation factor III, PGD2 synthase, syndecans, FKBP12 (FK506 binding protein), Folate receptor, ERp29, COMT, Connexin 32, Cytostatin.

  10. SWISS-MODEL Homology Protein Models for Proteome UP000000589 - (Mus...

    • swissmodel.expasy.org
    gz
    Updated Sep 16, 2016
    Cite
    (2016). SWISS-MODEL Homology Protein Models for Proteome UP000000589 - (Mus musculus) [Dataset]. https://swissmodel.expasy.org/repository/species/10090
    Description

    SWISS-MODEL homology protein models mapping to UniProtKB Proteome UP000000589 (Mus musculus)

  11. ExPASy ABCD database

    • neuinfo.org
    • dknet.org
    • +1 more
    Updated Aug 5, 2024
    Cite
    (2024). ExPASy ABCD database [Dataset]. http://identifiers.org/RRID:SCR_017401
    Description

    Repository of sequenced antibodies, integrating curated information about each antibody and its antigen with cross links to standardized databases of chemical and protein entities. Manually curated repository of sequenced antibodies, developed by the Geneva Antibody Facility at the University of Geneva, in collaboration with the CALIPHO and Swiss-Prot groups at the SIB Swiss Institute of Bioinformatics. The database provides a list of sequenced antibodies with their known targets. Each antibody is assigned a unique ID number that can be used in academic publications to increase reproducibility of experiments.

  12. Breakdown of the eukaryotic protein benchmark dataset derived from...

    • plos.figshare.com
    xls
    Updated May 30, 2023
    Cite
    Kuo-Chen Chou; Hong-Bin Shen (2023). Breakdown of the eukaryotic protein benchmark dataset derived from Swiss-Prot database (release 55.3) according to the procedures described in the Materials section. [Dataset]. http://doi.org/10.1371/journal.pone.0009931.t001
    Dataset provided by
    PLOS ONE
    Authors
    Kuo-Chen Chou; Hong-Bin Shen
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    None of the proteins included here has sequence identity to any other in a same subcellular location.

    a. See Fig. 1 and Eq. 1 as well as the relevant text for the definitions of the subsets listed in this table.

    b. See Eqs. 2–3 for the definition of the number of virtual proteins, and its relation to the number of different proteins.

    c. Of the 7,766 different proteins, 6,687 belong to one subcellular location, 1,029 to two locations, 48 to three locations, and 2 to four locations. See Online Supporting Information S1 for the protein sequences.
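
    The per-location counts given at the end of the description above can be checked with a few lines of arithmetic. The sketch below is only illustrative: it assumes, which this description does not state, that a "virtual protein" is counted once for each subcellular location a protein belongs to. Eqs. 2–3 of the original paper remain the authoritative definition, and the variable names are hypothetical.

```python
# Number of different proteins by how many subcellular locations
# they belong to (figures taken from the description above).
counts_by_locations = {1: 6687, 2: 1029, 3: 48, 4: 2}

# Each protein counted once, regardless of how many locations it has.
different_proteins = sum(counts_by_locations.values())

# ASSUMPTION (not from the source): a protein with k locations
# contributes k "virtual" proteins.
virtual_proteins = sum(k * n for k, n in counts_by_locations.items())

print(different_proteins)  # 7766, matching the total quoted in the description
print(virtual_proteins)    # 8897 under the stated assumption only
```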

  13. SWISS-MODEL Homology Protein Models for Proteome UP000000625 - (Escherichia...

    • swissmodel.expasy.org
    gz
    Updated Sep 16, 2016
    Cite
    (2016). SWISS-MODEL Homology Protein Models for Proteome UP000000625 - (Escherichia coli) [Dataset]. https://swissmodel.expasy.org/repository/species/83333
    Description

    SWISS-MODEL homology protein models mapping to UniProtKB Proteome UP000000625 (Escherichia coli)

  14. Data from: The SWISS-PROT protein knowledgebase and its supplement TrEMBL in...

    • omicsdi.org
    Updated Jan 1, 1994
    Cite
    (1994). The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003. [Dataset]. https://www.omicsdi.org/dataset/biostudies/S-EPMC165542
    Variables measured
    Unknown
    Description

    The SWISS-PROT protein knowledgebase (http://www.expasy.org/sprot/ and http://www.ebi.ac.uk/swissprot/) connects amino acid sequences with the current knowledge in the Life Sciences. Each protein entry provides an interdisciplinary overview of relevant information by bringing together experimental results, computed features and sometimes even contradictory conclusions. Detailed expertise that goes beyond the scope of SWISS-PROT is made available via direct links to specialised databases. SWISS-PROT provides annotated entries for all species, but concentrates on the annotation of entries from human (the HPI project) and other model organisms to ensure the presence of high quality annotation for representative members of all protein families. Part of the annotation can be transferred to other family members, as is already done for microbes by the High-quality Automated and Manual Annotation of microbial Proteomes (HAMAP) project. Protein families and groups of proteins are regularly reviewed to keep up with current scientific findings. Complementarily, TrEMBL strives to comprise all protein sequences that are not yet represented in SWISS-PROT, by incorporating a perpetually increasing level of mostly automated annotation. Researchers are welcome to contribute their knowledge to the scientific community by submitting relevant findings to SWISS-PROT at swiss-prot@expasy.org.

  15. The Therapeutic Drug Target Database Human SwissProt

    • johnsnowlabs.com
    csv
    Cite
    John Snow Labs, The Therapeutic Drug Target Database Human SwissProt [Dataset]. https://www.johnsnowlabs.com/marketplace/the-therapeutic-drug-target-database-human-swissprot/
    Dataset authored and provided by
    John Snow Labs
    Area covered
    N/A
    Description

    This dataset is a selection of The Therapeutic Target Database (release 4.3.02, 18th Oct 2013) protein IDs for successful targets. The web page states 388, but these reduce to 345 human Swiss-Prot accessions.

  16. PSSH2 - database of protein sequence-to-structure homologies (including...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Feb 11, 2022
    Cite
    Sandeep Kaur (2022). PSSH2 - database of protein sequence-to-structure homologies (including Sars-CoV-2 structures) [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4279163
    Dataset provided by
    Neblina Sikta
    Sandeep Kaur
    Sean O'Donoghue
    Andrea Schafferhans
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Protein sequence and structure data

    This data set contains data from UniProt (in the files called protein_sequence, protein_synonyms, protein_names, organism_synonyms) and PDB (in the files called PDB and PDB_chain) as used by the Aquaria web resource at the time of download (2022-02-08).

    The PSSH2 data set

    PSSH2 is a database of protein sequence-to-structure homologies based on HHblits, an alignment method employing iterative comparisons of hidden Markov models (HMMs). To ensure the highest possible final alignment quality for matches in Aquaria using HHblits, we first calculate HMM profiles for each unique PDB sequence (PDB_full) and also for each unique Swiss-Prot sequence. We generated PSSH2 using HHblits to find similarities between HMMs from PDB and HMMs from UniProt sequences.

    Calculating PSSH2

    The Swiss-Prot and PDB data were downloaded in November 2021. Generating PSSH2: We used UniRef30_2021_03 (originally called UniRef30_2021_06) from HH-suite, a database of non-redundant UniProt sequence clusters in which the highest pairwise sequence identity between clusters was 30%. The HHblits code and the code for running the calculations were retrieved from git (https://github.com/soedinglab/hh-suite.git and https://github.com/aschafu/PSSH2.git respectively) at the respective time of calculation in the timeframe until December 2021.

    PDB based sequence-to-structure alignments

    In addition to the PSSH2 data, new PDB structures were retrieved based on the primary accession of the proteins, by querying for all chains in all PDB entries with exact matches using the sequence cross references records given in PDB. Sequence-to-structure alignments were then created, again based on information provided in each PDB entry. These are contained in the PDBchain data.

    This data covers sequences and PDB structures in the timeframe until February 2022.

    Evaluating PSSH2

    The resulting alignment data was analysed using CATH domain assignments downloaded from /cath/releases/all-releases/v4_2_0/cath-classification-data/ to define correct hits and false hits:

    The set of query sequences is defined by the CATH non-redundant S40_overlap_60 dataset (ftp://orengoftp.biochem.ucl.ac.uk/cath/releases/all-releases/v4_2_0/non-redundant-data-sets/)

    The set of all expected hits comprises all PDB structures containing a domain with the same CATH code, either if contained in the set of processed sequences (-> all) or only if also contained in the set of non-redundant sequences (-> nr40).

    The set of true positives is defined by sharing the same CATH code up to the level of homology ("CATH") or up to the level of topology ("CAT").

    The data was evaluated with respect to false discovery rate (FDR) and recall (true positive rate TPR) by cumulatively considering all hits with an E-value below the threshold ("C") or in bins with an E-value between the threshold and one tenth of the threshold ("B"). This evaluation was carried out for the data obtained in November 2021 (202111) as well as previous data from October 2020 (202010), February 2020 (202002) and September 2017 (201709). The results are collected in PSSH CATH validation.csv.
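
    The cumulative ("C") and binned ("B") evaluation described above can be sketched as follows. This is a minimal illustration, not the authors' code: the `Hit` class, the `fdr_and_recall` function and the example values are hypothetical, and it assumes that recall is the number of true hits selected divided by the number of expected hits, and that a bin spans from one tenth of the threshold (inclusive) up to the threshold (exclusive).

```python
from dataclasses import dataclass


@dataclass
class Hit:
    evalue: float
    is_true: bool  # True if the hit shares the CATH code with the query


def fdr_and_recall(hits, threshold, n_expected, mode="C"):
    """Return (FDR, TPR) for hits selected by E-value.

    mode "C": cumulative, all hits with E-value below the threshold.
    mode "B": binned, hits with threshold/10 <= E-value < threshold.
    """
    if mode == "C":
        selected = [h for h in hits if h.evalue < threshold]
    elif mode == "B":
        selected = [h for h in hits if threshold / 10 <= h.evalue < threshold]
    else:
        raise ValueError("mode must be 'C' or 'B'")

    true_pos = sum(h.is_true for h in selected)
    false_pos = len(selected) - true_pos
    fdr = false_pos / len(selected) if selected else 0.0
    tpr = true_pos / n_expected if n_expected else 0.0
    return fdr, tpr


# Made-up hits for illustration only; 4 expected hits in total.
example_hits = [
    Hit(1e-30, True),
    Hit(1e-12, True),
    Hit(5e-4, False),
    Hit(2e-3, True),
    Hit(0.5, False),
]

print(fdr_and_recall(example_hits, 1e-2, 4, mode="C"))  # (0.25, 0.75)
print(fdr_and_recall(example_hits, 1e-2, 4, mode="B"))  # (0.0, 0.25)
```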

    Known errors

    Due to a processing error, the profile of PDB structure 5fia A / B (sequence md5 052667679fc644184f40063c7602c9e1) is incomplete in the pdb_full HHblits database, which led to further errors in generating sequence-based alignments for 1vtm P (sequence md5 c844aff103449363cb8489c78c58ebf1) and 434t A / B (sequence md5 d67aa1c3a36492c719cb48b5e7ecc624).

  17. SWISS-2DPAGE

    • dknet.org
    • neuinfo.org
    • +1 more
    Updated Jan 29, 2022
    Cite
    (2022). SWISS-2DPAGE [Dataset]. http://identifiers.org/RRID:SCR_006946
    Description

    A database of proteins identified by various 2-D PAGE and SDS-PAGE reference maps. Each SWISS-2DPAGE entry contains textual data on one protein, including mapping procedures, physiological and pathological information, experimental data (isoelectric point, molecular weight, amino acid composition, peptide masses) and bibliographical references. In addition to this textual data, SWISS-2DPAGE provides several 2-D PAGE and SDS-PAGE images showing the experimentally determined location of the protein, as well as a theoretical region computed from the protein sequence, indicating where the protein might be found in the gel. Using the database, users can locate these proteins on the 2-D PAGE maps or display the region of a 2-D PAGE map where one might expect to find a protein from UniProtKB/Swiss-Prot.

  18. UniProtKB

    • scicrunch.org
    Updated Oct 24, 2019
    Cite
    (2019). UniProtKB [Dataset]. http://identifiers.org/RRID:SCR_004426
    Description

    Central repository for collection of functional information on proteins, with accurate and consistent annotation. In addition to capturing core data mandatory for each UniProtKB entry (mainly, the amino acid sequence, protein name or description, taxonomic data and citation information), as much annotation information as possible is added. This includes widely accepted biological ontologies, classifications and cross-references, and experimental and computational data. The UniProt Knowledgebase consists of two sections, UniProtKB/Swiss-Prot and UniProtKB/TrEMBL. UniProtKB/Swiss-Prot (reviewed) is a high quality manually annotated and non-redundant protein sequence database which brings together experimental results, computed features, and scientific conclusions. UniProtKB/TrEMBL (unreviewed) contains protein sequences associated with computationally generated annotation and large-scale functional characterization that await full manual annotation. Users may browse by taxonomy, keyword, gene ontology, enzyme class or pathway.

  19. NCBIFAM

    • ebi.ac.uk
    Updated Dec 16, 2024
    Cite
    (2024). NCBIFAM [Dataset]. https://www.ebi.ac.uk/interpro/
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    NCBIfam is a collection of protein families, featuring curated multiple sequence alignments, hidden Markov models (HMMs) and annotation, which provides a tool for identifying functionally related proteins based on sequence homology. NCBIfam is maintained at the National Center for Biotechnology Information (Bethesda, MD). NCBIfam includes models from TIGRFAMs, another database of protein families developed at The Institute for Genomic Research, then at the J. Craig Venter Institute (Rockville, MD, US).

  20. Repository URL

    • cinergi.sdsc.edu
    resource url
    Cite
    Repository URL [Dataset]. http://cinergi.sdsc.edu/geoportal/rest/metadata/item/323ebc5365ec476ebdcb92329cf10b57/html
    Description

    Link Function: information
