Database to catalog experimentally determined interactions between proteins, combining information from a variety of sources to create a single, consistent set of protein-protein interactions that can be downloaded in a variety of formats. The data were curated both manually and automatically, using computational approaches that utilize knowledge about protein-protein interaction networks extracted from the most reliable, core subset of the DIP data. Because the reliability of experimental evidence varies widely, methods of quality assessment have been developed and used to identify the most reliable subset of the interactions. This CORE set can be used as a reference when evaluating the reliability of high-throughput protein-protein interaction data sets, for the development of prediction methods, and in studies of the properties of protein interaction networks. Tools are available to analyze, visualize and integrate users' own experimental data with the information about protein-protein interactions available in the DIP database. The DIP database lists protein pairs that are known to interact with each other. Here, "interact" means that two amino acid chains were experimentally identified to bind to each other. The database lists such pairs to aid those studying a particular protein-protein interaction, those investigating entire regulatory and signaling pathways, and those studying the organization and complexity of the protein interaction network at the cellular level. Registration is required to gain access to most of the DIP features. Registration is free for members of the academic community. Trial accounts for commercial users are also available.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
IPLID integrates protein-ligand interaction data from multiple well-known resources, including BindingDB, ChEMBL, DrugBank, GPCRDB, PubChem, LINCS-HMS KinomeScan, and four published kinome assay results. Our database can facilitate projects in machine learning or deep learning-based drug development and other applications by providing integrated data sets appropriate for many research interests. Our database can be utilized for small-scale (e.g. kinases or GPCRs only) and large-scale (e.g. proteome-wide), qualitative or quantitative projects. With its ease of use and straightforward data format, IPLID offers a great educational resource for computer science and data science trainees who lack familiarity with chemistry and biology.
Data statistics
Target (data type) | Activities | Unique chemicals | Unique proteins | File name
All (binary) | 96318 | 18107 | 3107 | integrated_binary_activity.tsv
All (numerical) | 2798365 | 683009 | 5876 | integrated_continuous_activity.tsv
CYP450 (binary) | 67552 | 17273 | 47 | integrated_cyp450_binary.tsv
CRT (binary) | 4152 | 1219 | 412 | integrated_cancer_related_targets_binary.tsv
CDT (binary) | 519 | 349 | 88 | integrated_cardio_targets_binary.tsv
DRT (binary) | 4433 | 1325 | 852 | integrated_disease_related_targets_binary.tsv
FDA (binary) | 6217 | 1521 | 592 | integrated_fda_approved_targets_binary.tsv
GPCR (binary) | 1958 | 545 | 129 | integrated_gpcr_binary.tsv
NR (binary) | 1335 | 657 | 264 | integrated_nr_binary.tsv
PDT (binary) | 1469 | 674 | 404 | integrated_potential_drug_targets_binary.tsv
TF (binary) | 1966 | 998 | 304 | integrated_tf_binary.tsv
*Abbreviations: CYP450 (Cytochrome P450), CRT (Cancer-Related Target), CDT (Cardiovascular Disease candidate Target), DRT (Disease-Related Target), FDA (FDA-approved target), GPCR (G-Protein Coupled Receptor), NR (Nuclear Receptor), PDT (Potential Drug Target), TF (Transcription Factor)
*These protein classifications are from UniProt database and the Human Protein Atlas (https://www.proteinatlas.org/)
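The three count columns in the table above (activities, unique chemicals, unique proteins) can in principle be recomputed from any of the listed TSV files. The following is a minimal sketch of that calculation, not the authors' procedure. The column names `chemical_id` and `protein_id` and the sample rows are hypothetical; the real IPLID column headers are not given here and must be substituted.

```python
import csv
import io


def summarize_activities(tsv_file, chemical_col="chemical_id", protein_col="protein_id"):
    """Return (activities, unique chemicals, unique proteins) for a TSV stream.

    The default column names are placeholders (assumptions), not the
    actual IPLID headers.
    """
    reader = csv.DictReader(tsv_file, delimiter="\t")
    chemicals, proteins, n_rows = set(), set(), 0
    for row in reader:
        n_rows += 1                      # one row = one recorded activity
        chemicals.add(row[chemical_col])
        proteins.add(row[protein_col])
    return n_rows, len(chemicals), len(proteins)


# Made-up three-row sample standing in for a real file.
SAMPLE = (
    "chemical_id\tprotein_id\tactive\n"
    "C1\tP1\t1\n"
    "C1\tP2\t0\n"
    "C2\tP1\t1\n"
)

stats = summarize_activities(io.StringIO(SAMPLE))
print(stats)
```

For a downloaded file, the same function could be called on `open("integrated_binary_activity.tsv", newline="")` once the actual column names are known.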
Database of non-redundant sets of protein - small-molecule complexes that are especially suitable for structure-based drug design and protein - small-molecule interaction research. PSMDB supports:
* Frequent updates - The number of new structures in the PDB is growing rapidly. In order to utilize these structures, frequent updates are required. In contrast to manual procedures, which require significant time and effort per update, generation of the PSMDB database is fully automatic, thereby facilitating frequent database updates.
* Consideration of both protein and ligand structural redundancy - In the database, two complexes are considered redundant if they share a similar protein and ligand (the protein - small-molecule non-redundant set). This allows the database to contain structural information for the same protein bound to several different ligands (and vice versa). Additionally, for completeness, the database contains a set of non-redundant complexes when only protein structural redundancy is considered (the protein non-redundant set). The following images demonstrate the structural redundancy of the protein complexes in the PDB compared to the PSMDB.
* Efficient handling of covalent bonds - Many protein complexes contain covalently bound ligands. Typically, protein-ligand databases discard these complexes; however, the PSMDB simply removes the covalently bound ligand from the complex, retaining any non-covalently bound ligands. This increases the number of usable complexes in the database.
* Separation of complexes into protein and ligand files - The PSMDB contains individual structure files for both the protein and all non-covalently bound ligands. The unbound proteins are in PDB format, while the individual ligands are in SDF format (in their native coordinate frame).
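The redundancy rule described above (two complexes are redundant only if both protein and ligand are similar, or, for the protein-only set, if the proteins alone are similar) can be illustrated with a greedy filter. This is a sketch under stated assumptions, not the PSMDB pipeline. The similarity tests are passed in as functions, and in the example they are reduced to comparing made-up cluster labels; the real structural similarity criteria are not specified here.

```python
def non_redundant(complexes, same_protein, same_ligand, consider_ligand=True):
    """Greedily keep complexes that are not redundant with any kept one.

    With consider_ligand=True a complex is redundant only if BOTH the
    protein and the ligand match a kept complex; with False, a matching
    protein alone is enough.
    """
    kept = []
    for candidate in complexes:
        redundant = any(
            same_protein(candidate, k)
            and (not consider_ligand or same_ligand(candidate, k))
            for k in kept
        )
        if not redundant:
            kept.append(candidate)
    return kept


# Hypothetical entries: (complex id, protein cluster label, ligand cluster label).
COMPLEXES = [
    ("cpx1", "proteinA", "ligandX"),
    ("cpx2", "proteinA", "ligandX"),   # same protein and ligand as cpx1
    ("cpx3", "proteinA", "ligandY"),   # same protein, different ligand
    ("cpx4", "proteinB", "ligandX"),
]


def same_protein(a, b):
    return a[1] == b[1]


def same_ligand(a, b):
    return a[2] == b[2]


both = [c[0] for c in non_redundant(COMPLEXES, same_protein, same_ligand)]
protein_only = [
    c[0] for c in non_redundant(COMPLEXES, same_protein, same_ligand, consider_ligand=False)
]
print(both)
print(protein_only)
```

The first list keeps the same protein with two different ligands, while the second keeps one representative per protein, mirroring the two sets described in the entry.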
A publicly accessible web-based database through which the interactions between a variety of chelating groups and various central metal ions in the active site of metalloproteins can be explored in detail. Additional information can also be retrieved including protein and inhibitor names, the amino acid residues coordinated to the central metal ion, and the binding affinity of the inhibitor for the target metalloprotein.
THIS RESOURCE IS NO LONGER IN SERVICE. Documented on September 23, 2022. A manually curated protein-protein interaction database developed specifically for interactions involving PDZ domains. It currently contains 339 experimentally determined protein-protein interactions.
BindingDB is a public, web-accessible database of measured binding affinities, focusing chiefly on the interactions of proteins considered to be drug targets with small, drug-like molecules. As of May 27, 2022, BindingDB contains 41,296 entries, each with a DOI, containing 2,519,702 binding data for 8,810 protein targets and 1,080,101 small molecules. There are 5,988 protein-ligand crystal structures with BindingDB affinity measurements for proteins with 100% sequence identity, and 11,442 crystal structures when proteins with down to 85% sequence identity are allowed. You can also use BindingDB data through the Registry of Open Data on AWS: https://registry.opendata.aws/binding-db. This dataset uses the split from TransformerCPI (doi.org/10.1093/bioinformatics/btaa524).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Repression of Hh signaling in the absence of ligand depends on the transmembrane receptor protein Patched (PTCH), which inhibits Smoothened (SMO) activity by an unknown mechanism. This promotes the proteolytic processing and/or degradation of the GLI family of transcription factors and maintains the pathway in a transcriptionally repressed state (reviewed in Briscoe and Therond, 2013). In the absence of ligand, PTCH is localized in the cilium, while SMO is largely concentrated in intracellular compartments. Upon binding of Hh to the PTCH receptor, PTCH is endocytosed, relieving SMO inhibition and allowing it to accumulate in the primary cilium (Marigo et al, 1996; Chen and Struhl, 1996; Stone et al, 1996; Rohatgi et al, 2007; Corbit et al, 2005; reviewed in Goetz and Anderson, 2010). In the cilium, SMO is activated by an unknown mechanism, allowing the full length transcriptional activator forms of the GLI proteins to accumulate and translocate to the nucleus, where they bind to the promoters of Hh-responsive genes (reviewed in Briscoe and Therond, 2013).
In addition to PTCH, three additional membrane proteins have been shown to bind Hh and to be required for Hh-dependent signaling in vertebrates: CDON (CAM-related/downregulated by oncogenes), BOC (brother of CDO) and GAS1 (growth arrest specific 1) (Yao et al, 2006; Okada et al, 2006; Tenzen et al, 2006; McLellan et al, 2008; reviewed in Kang et al, 2007; Beachy et al, 2010; Sanchez-Arrones et al, 2012). CDON and BOC, homologues of Drosophila Ihog and Boi respectively, are evolutionarily conserved transmembrane glycoproteins that have been shown to bind both to Hh ligand and to the canonical receptor PTCH to promote Hh signaling (Okada et al, 2006; Yao et al, 2006; Tenzen et al, 2006, McLellan et al, 2008; Izzi et al, 2011; reviewed in Sanchez-Arrones et al, 2012). Despite the evolutionary conservation, the mode of ligand binding by CDON/Ihog and BOC/Boi is distinct in vertebrates and invertebrates. High affinity ligand-binding by CDON and BOC requires Ca2+, while invertebrate ligand-binding is heparin-dependent (Okada et al, 2006; Tenzen et al, 2006; McLellan et al, 2008; Yao et al, 2006; Kavran et al, 2010). GAS1 is a vertebrate-specific GPI-anchored protein that similarly binds both to Hh ligand and to the PTCH receptor to promote Hh signaling (Martinelli and Fan, 2007; Izzi et al, 2011; reviewed in Kang et al, 2007). CDON, BOC and GAS1 have partially overlapping but not totally redundant roles, and knock-out of all three is required to abrogate Hh signaling in mice (Allen et al, 2011; Izzi et al, 2011; reviewed in Briscoe and Therond, 2013).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
https://www.gnu.org/licenses/old-licenses/gpl-2.0-standalone.html
This dataset is a MySQL dump of the PIBASE.ligands database, which describes the overlap of small molecule and protein binding sites (http://fredpdavis.com/pibase.ligands), as described in:
The overlap of small molecule and protein binding sites within families of protein structures.
Davis FP, Sali A. PLoS Comput Biology 2010 6(2): e1000668.
doi:10.1371/journal.pcbi.1000668
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data snapshot from the Protein Ligand Binding Database (PLBD)
=============================================================
This data set contains tabular data snapshots from the PLBD. Data are provided in the time-stamped JSON [1] files.
The time stamp in all files (the first line of each data file) records the moment when the data tables were retrieved from the PLBD server. Since the PLBD is a constantly evolving database, it is recommended that the latest data from the PLBD server [2] are downloaded and used, unless historic data are needed. The provided Makefile illustrates a possible download procedure. The tables here are provided solely to comply with the editorial policies of the "Scientific Data" journal.
The ".json" files in the "outputs/main" directory contain the data; the ".nrec" files contain the number of records in the PLBD tables at the moment of download (a single decimal integer recorded on the second line of each ".nrec" file) and were used to request the appropriate number of rows when downloading the "outputs/main/*.json" files.
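The ".nrec" layout described above (a time stamp on the first line, a single decimal integer on the second) is simple enough to read with a few lines of Python. The following is a minimal sketch, assuming that the first line of a ".nrec" file carries the time stamp as it does in the data files; the exact time stamp format is not specified here, so it is kept as an opaque string, and the sample content is made up.

```python
def parse_nrec(text):
    """Split the text of a ".nrec" file into (timestamp, record_count).

    Assumes line 1 is the time stamp and line 2 is a decimal integer.
    """
    lines = text.splitlines()
    if len(lines) < 2:
        raise ValueError("expected a time stamp line and a record count line")
    timestamp = lines[0].strip()      # kept as-is; format is not assumed
    record_count = int(lines[1].strip())
    return timestamp, record_count


# Made-up example content; real values come from the PLBD server.
EXAMPLE = "2022-12-21T09:29+02:00\n1234\n"
timestamp, record_count = parse_nrec(EXAMPLE)
print(timestamp, record_count)
```

The returned count is the number that would be used to request the matching number of rows for the corresponding JSON table.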
The meaning and the data type descriptions of the JSON file attributes are provided in the database description file "main-database-description.xml". This file is deposited as a separate digital object in Zenodo [3]. The layout of the JSON files follows ideas described in the JSON:API schema [4].
Layout of the archived file tree
--------------------------------
.
├── Makefile
├── README
├── README.pdf
├── inputs
│   ├── tables.lst
│   ├── tables.lst.history
│   └── tables.lst.log
├── makefiles
│   ├── available
│   │   ├── Makelocal-download-json
│   │   └── Makelocal-download-nrec
│   └── enabled
│       ├── Makelocal-download-json
│       └── Makelocal-download-nrec
└── outputs
    └── main [52 entries exceeds filelimit, not opening dir]
References:
1. ECMA Standard. The JSON data interchange syntax. 2017, 1-16, URL: https://www.ecma-international.org/publications/files/ECMA-ST/ECMA-404.pdf
2. The Protein Ligand Binding Database (PLBD) server. URL: https://plbd.org/db/main [accessed: 2022-12-21T09:29+02:00]
3. Lingė, D.; Gedgaudas, M.; Merkys, A.; Petrauskas, V.; Vaitkus, A.; Grybauskas, A.; Paketurytė, V.; Zubrienė, A.; Zakšauskas, A.; Mickevičiūtė, A.; Smirnovienė, J.; Baranauskienė, L.; Čapkauskaitė, E.; Dudutienė, V.; Urniežius, E.; Konovalovas, A.; Kazlauskas, E.; Gražulis, S. & Matulis, D. PLBD (Protein Ligand Binding Database) table description XML file. Zenodo, 2022, DOI: https://doi.org/10.5281/ZENODO.7482008
4. Katz, Y.; Gebhardt, D.; Sullice, G.; Hanschke, J.; Kellen, T.; Klabnik, S. & Resnick, E. JSON:API version 1.1. 2022 URL: https://jsonapi.org/format/1.1/ [accessed: 2022-12-25T16:04+02:00]
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
We report for the first time the use of experimental electron density (ED) in the Protein Data Bank for modeling of noncovalent interactions (NCIs) for protein–ligand complexes. Our methodology is based on reduced electron density gradient (RDG) theory describing intermolecular NCIs by ED and its first derivative. We established a database named Experimental NCI Database (ExptNCI; http://ncidatabase.stonewise.cn/#/nci) containing ED saddle points, indicating ∼200,000 NCIs from over 12,000 protein–ligand complexes. We also demonstrated the usage of the database in the case of depicting amide−π interactions in protein–ligand binding systems. In summary, the database provides details on experimentally observed NCIs for protein–ligand complexes and can support future studies including studies on rarely documented NCIs and the development of artificial intelligence models for protein–ligand binding prediction.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The BO1 subset of the scPDB database.
BO1 consists of 766 pairs of non-redundant binding sites (383 similar pairs, 383 dissimilar pairs).
Eguida, M., & Rognan, D. (2020). A computer vision approach to align and compare protein cavities: application to fragment-based drug design. Journal of Medicinal Chemistry, 63(13), 7127-7142.
Desaphy, J., Bret, G., Rognan, D., & Kellenberger, E. (2015). sc-PDB: a 3D-database of ligandable binding sites—10 years on. Nucleic acids research, 43(D1), D399-D404.
The scPDB is available at: http://bioinfo-pharma.u-strasbg.fr/scPDB/
An object-oriented database of domain-domain interactions observed in structural data. SNAPPI-DB is a useful resource for any analysis of structures but has been optimised for the analysis of domain-domain interactions and domain-ligand interactions. The database has already been employed in 3 studies on the properties of domain-domain interactions and is currently being employed to train a protein-protein interaction predictor and a functional residue predictor. SNAPPI-DB has several features which are not available in other databases, including links to the MSD, speed, an object-oriented design, storage of multiple domain definitions, and storage of Protein Quaternary Structures (PQS).
Database for protein structural changes upon ligand binding, which are classified into 7 classes in terms of the ligand binding sites and the location where the dominant motion occurs:
# Coupled Domain motions are the domain motions induced upon ligand binding.
# Independent Domain motions are the domain motions observable regardless of ligand binding.
# Coupled Local motions are the local motions induced upon ligand binding.
# Independent Local motions are the local motions observable regardless of ligand binding.
# Burying ligand motions are the motions presumed necessary to hold the ligand inside the protein.
# No significant motions means that no notable motion occurs.
# Other motions are motions that cannot be classified as domain or local motions.
Proteins are flexible molecules that undergo structural changes to function. The Protein Data Bank contains multiple entries for identical proteins determined under different conditions, e.g. with and without a ligand molecule, which provides important information for understanding the structural changes related to protein functions. We gathered 839 protein structural pairs of ligand-free and ligand-bound states from monomeric or homo-dimeric proteins, and constructed the Protein Structural Change DataBase (PSCDB). In the database, we focused on whether the motions were coupled with ligand binding. As a result, the protein structural changes were classified into seven classes, i.e. coupled domain motion (59 structural changes), independent domain motion (70), coupled local motion (125), independent local motion (135), burying ligand motion (104), no significant motion (311) and other type motion (35). PSCDB provides lists of each class. On each entry page, users can view detailed information about the motion, accompanied by a morphing animation of the structural changes.
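As a quick consistency check on the figures quoted above, the seven per-class counts can be summed and compared with the stated 839 structural pairs. The short sketch below does this and also prints each class's share of the total; the percentages are derived here for illustration and are not stated in the PSCDB description.

```python
# Per-class counts exactly as quoted in the PSCDB description.
PSCDB_CLASS_COUNTS = {
    "coupled domain motion": 59,
    "independent domain motion": 70,
    "coupled local motion": 125,
    "independent local motion": 135,
    "burying ligand motion": 104,
    "no significant motion": 311,
    "other type motion": 35,
}

total = sum(PSCDB_CLASS_COUNTS.values())

for name, count in PSCDB_CLASS_COUNTS.items():
    # Share of all structural pairs that fall in this class.
    print(f"{name:27s} {count:4d} {100 * count / total:5.1f}%")

print("total", total)
```

The sum equals 839, matching the number of ligand-free and ligand-bound pairs reported in the text.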
THIS RESOURCE IS NO LONGER IN SERVICE, documented May 10, 2017. A pilot effort that has developed a centralized, web-based biospecimen locator that presents biospecimens collected and stored at participating Arizona hospitals and biospecimen banks, which are available for acquisition and use by researchers. Researchers may use this site to browse, search and request biospecimens to use in qualified studies. The development of the ABL was guided by the Arizona Biospecimen Consortium (ABC), a consortium of hospitals and medical centers in the Phoenix area, and is now being piloted by this Consortium under the direction of ABRC. You may browse by type (cells, fluid, molecular, tissue) or disease. Common data elements decided by the ABC Standards Committee, based on data elements in the National Cancer Institute's (NCI's) Common Biorepository Model (CBM), are displayed. These describe the minimum set of data elements that the NCI determined were most important for a researcher to see about a biospecimen. The ABL currently does not display information on whether or not clinical data is available to accompany the biospecimens. However, a requester has the ability to solicit clinical data in the request. Once a request is approved, the biospecimen provider will contact the requester to discuss the request (and the requester's questions) before finalizing the invoice and shipment. The ABL is available to the public to browse. In order to request biospecimens from the ABL, the researcher will be required to submit the required information. Upon submission of the information, shipment of the requested biospecimen(s) will be dependent on scientific and institutional review approval. Account required. Registration is open to everyone.
Documented on August 19, 2019. BOND, which requires registration of a free account, is a resource used to perform cross-database searches of available sequence, interaction, complex and pathway information. 
BOND integrates a range of component databases including GenBank and BIND, the Biomolecular Interaction Network Database. BOND contains 70+ million biological sequences, 33,000 structures, 38,000 GO terms, and over 200,000 human-curated interactions contained in BIND, and is open access. BOND serves the interests of the developing global interactome effort encompassing the genomic, proteomic and metabolomic research communities. BOND is the first open access search resource to integrate sequence and interaction information. BOND integrates BLAST functionality, and contains a well-documented API. BOND also stores annotation links for sequences, including links to Gene Ontology descriptions, MEDLINE abstracts, taxon identifiers, associated structures, redundant sequences, sequence neighbors, conserved domains, database cross-references, Online Mendelian Inheritance in Man identifiers, LocusLink identifiers and complete genomes.
BIND on BOND
The Biomolecular Interaction Network Database (BIND), a component database of BOND, is a collection of records documenting molecular interactions. The contents of BIND include high-throughput data submissions and hand-curated information gathered from the scientific literature. BIND is an interaction database with three classifications for molecular associations: molecules that associate with each other to form interactions, molecular complexes that are formed from one or more interaction(s) and pathways that are defined by a specific sequence of two or more interactions.
Interactions
A BIND record represents an interaction between two or more objects that is believed to occur in a living organism. A biological object can be a protein, DNA, RNA, ligand, molecular complex, gene, photon or an unclassified biological entity. BIND records are created for interactions which have been shown experimentally and published in at least one peer-reviewed journal. 
A record also references any papers with experimental evidence that supports or disputes the associated interaction. Interactions are the basic units of BIND and can be linked together to form molecular complexes or pathways. The BIND interaction viewer is a tool to visualize and analyze molecular interactions, complexes and pathways. The BIND interaction viewer uses Ontoglyphs to display information about a protein via attributes such as molecular function, biological process and sub-cellular localization. Ontoglyphs allow users to explore interaction networks graphically and interactively, by visualizing interactions in the context of 34 functional, 25 binding specificity and 24 sub-cellular localization Ontoglyph categories. We will continue to provide an open access version of BOND, providing its subscribers with free, unlimited access to a core content set. But we are confident you will soon want to upgrade to BONDplus.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
PLBD (Protein Ligand Binding Database) table description XML file
=================================================================
General
-------
The provided ZIP archive contains an XML file "main-database-description.xml" with the description of all tables (VIEWS) that are exposed publicly at the PLBD server (https://plbd.org/). In the XML file, all columns of the visible tables are described, specifying their SQL types, measurement units, semantics, calculation formulae, SQL statements that can be used to generate values in these columns, and publications of the formulae derivations.
The XML file conforms to the published XSD schema created for descriptions of relational databases for specifications of scientific measurement data. The XSD schema ("relational-database_v2.0.0-rc.18.xsd") and all included sub-schemas are provided in the same archive for convenience. All XSD schemas are validated against the "XMLSchema.xsd" schema from the W3C consortium.
The ZIP file contains an excerpt of the files hosted at https://plbd.org/ at the moment of submission of the PLBD database to the Scientific Data journal, and is provided to conform to the journal policies. The current data and schemas should be fetched from the published URIs:
https://plbd.org/
https://plbd.org/doc/db/schemas
https://plbd.org/doc/xml/schemas
The software used to generate SQL schemas and RestfulDB metadata, and the RestfulDB middleware that makes it possible to publish the databases generated from the XML description on the Web, are available at public Subversion repositories:
svn://www.crystallography.net/solsa-database-scripts
svn://saulius-grazulis.lt/restfuldb
Usage
-----
The unpacked ZIP file will create the "db/" directory with the tree layout given below. In addition to the database description file "main-database-description.xml", all XSD schemas necessary for validation of the XML file are provided. On a GNU/Linux operating system with a GNU Make package installed, the XML file validity can be checked by unpacking the ZIP file, entering the unpacked directory, and running 'make distclean; make'. For example, on a Linux Mint distribution, the following commands should work:
unzip main-database-description.zip
cd db/release/v0.10.0/tables/
sh -x dependencies/Linuxmint-20.1/install.sh
make distclean
make
If necessary, additional packages can be installed using the 'install.sh' script in the 'dependencies/' subdirectory corresponding to your operating system. At the time of writing, Debian-10 and Linuxmint-20.1 OSes are supported out of the box; similar OSes might work with the same 'install.sh' scripts. The installation scripts need to run the package installation command with system administrator privileges, but they use *only* the standard system package manager, so they should not put your system at risk. For validation and syntax checking, the 'rxp' and 'xmllint' programs are used.
The log files provided in the "outputs/validation" subdirectory contain validation logs obtained on the system where the XML files were last checked and should indicate the validity of the provided XML file against the referenced schemas.
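The documented route for validation is the provided Makefile. For readers who want to call 'xmllint' directly, the following sketch shows one way to assemble such a call from Python; it is an illustration, not a description of what the Makefile does internally. The `--noout` and `--schema` options are standard 'xmllint' options, while the location of the XSD file under "schema/" is an assumption and should be adjusted to the unpacked tree.

```python
import shutil
import subprocess


def xmllint_command(xml_path, xsd_path):
    """Build an xmllint call that validates xml_path against xsd_path."""
    # --noout suppresses the document dump; --schema selects XSD validation.
    return ["xmllint", "--noout", "--schema", xsd_path, xml_path]


def validate(xml_path, xsd_path):
    """Return True/False for valid/invalid, or None if xmllint is not installed."""
    if shutil.which("xmllint") is None:
        return None
    result = subprocess.run(
        xmllint_command(xml_path, xsd_path),
        capture_output=True,
        text=True,
    )
    return result.returncode == 0


# The schema path below is a guess at the layout, not taken from the archive.
command = xmllint_command(
    "main-database-description.xml",
    "schema/relational-database_v2.0.0-rc.18.xsd",
)
print(" ".join(command))
```

Only the command is printed here; calling `validate(...)` from inside the "db/release/v0.10.0/tables/" directory would actually run the check when 'xmllint' is available.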
Layout of the archived file tree
--------------------------------
db/
└── release
    └── v0.10.0
        └── tables
            ├── Makeconfig-validate-xml
            ├── Makefile
            ├── Makelocal-validate-xml
            ├── dependencies
            ├── main-database-description.xml
            ├── outputs
            └── schema
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
A systematic analysis is presented of the 220 phosphodiesterase (PDE) catalytic domain crystal structures present in the Protein Data Bank (PDB) with a focus on PDE–ligand interactions. The consistent structural alignment of 57 PDE ligand binding site residues enables the systematic analysis of PDE–ligand interaction fingerprints (IFPs), the identification of subtype-specific PDE–ligand interaction features, and the classification of ligands according to their binding modes. We illustrate how systematic mining of this phosphodiesterase structure and ligand interaction annotated (PDEStrIAn) database provides new insights into how conserved and selective PDE interaction hot spots can accommodate the large diversity of chemical scaffolds in PDE ligands. A substructure analysis of the cocrystallized PDE ligands in combination with those in the ChEMBL database provides a toolbox for scaffold hopping and ligand design. These analyses lead to an improved understanding of the structural requirements of PDE binding that will be useful in future drug discovery studies.
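To make the idea of comparing interaction fingerprints concrete, the sketch below represents each IFP as a set of (binding site residue, interaction type) pairs and compares two of them with the Tanimoto coefficient. The Tanimoto coefficient is a commonly used similarity measure for fingerprints and is chosen here as an illustration; the abstract does not say which measure PDEStrIAn uses, and the residue labels and interaction types below are invented.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto (Jaccard) similarity between two fingerprints given as sets."""
    a, b = set(fp_a), set(fp_b)
    union = a | b
    if not union:
        return 1.0          # two empty fingerprints are treated as identical
    return len(a & b) / len(union)


# Hypothetical fingerprints: (aligned residue position, interaction type).
ligand_1 = {("res1", "hydrophobic"), ("res2", "hbond"), ("res3", "aromatic")}
ligand_2 = {("res1", "hydrophobic"), ("res2", "hbond"), ("res4", "hbond")}

similarity = tanimoto(ligand_1, ligand_2)
print(similarity)
```

Here the two ligands share two of four distinct interactions, giving a similarity of 0.5. A consistent alignment of binding site residues, as described in the abstract, is what makes such position-wise comparisons meaningful across structures.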
THIS RESOURCE IS NO LONGER IN SERVICE, documented on August 18, 2014. A database for structural and functional information on various protein sites (post-translational modification, catalytic active, organic and inorganic ligand binding, protein-protein, protein-DNA and protein-RNA interactions) in the Protein Data Bank (PDB). It was developed as a daughter database accumulating the data on functional and structural characteristics of functional sites stored in the PDB, as well as their spatial surroundings. It consists of functional sites extracted from the PDB using the SITE records and of an additional set containing the protein interaction sites inferred from the contact residues in heterocomplexes. The PDBSite was set up by automated processing of the PDB. The PDBSite database can be queried through the functional description and the structural characteristics of the site and its environment. The PDBSite is integrated with the PDBSiteScan tool, allowing structural comparisons of a protein against the functional sites. The PDBSite enables the recognition of functional sites in protein tertiary structures, providing annotation of function through structure. The Protein Data Bank (PDB) contains data on the spatial protein structures and their biologically active sites (i.e., ligand binding regions, enzyme catalytic centers, regions subjected to biochemical modifications, etc.). However, none of the well-known systems for searching the PDB allows the user to make queries related to the active sites. The PDBSITE database, which stores the data on biologically active sites contained in the PDB, has been developed. PDBSITE accumulates amino acid content, structure features calculated from spatial protein structures, and physicochemical properties of sites and their spatial surroundings. The data on biologically active protein sites are of extreme importance for solving many problems in molecular biology, biotechnology, and medicine. 
High specificity of biological activity in proteins is produced by the unique structure of active sites, which are often organized in a very complicated pattern. In particular, biologically active sites in proteins are often composed of amino acid residues that are remote in the primary structure but form compact clusters in the spatial structure with a strictly ordered conformation. The specific structure and conformational parameters of these sites are determined by the structure of their spatial amino acid surroundings. For example, the spatial amino acid surroundings of enzyme catalytic centres determine the relief of the hollows in the catalytic centres in substrate-binding regions, whereas the residues of antigenic determinants of proteins determine their structure by organizing prominent parts at the protein surface. For many natural and mutant proteins, relationships were found between protein activity and the physico-chemical properties of the amino acid residues composing the local surroundings of a functional site. The spatial surroundings of biologically active sites may be detected only if data on tertiary protein structures are available. Sponsor: This site is funded by GeneNetWorks.
A gold-standard dataset of biologically relevant binding sites in protein structures. It consists of proteins with one unbound structure and at least one structure of the protein-ligand complex. Both a redundant and a non-redundant (sequence identity lower than 25%) version are available. Quaternary structures proposed by PQS (2) are used for all structures in the dataset. The availability of both unbound and bound structures for each protein guarantees that the dataset can be used to benchmark binding site prediction methods in conditions that mimic cases where the binding site is truly unknown. In cases where several different bound structures are available for a given protein, all are used to define the binding sites.
Web-accessible database of data extracted from the scientific literature, focusing on proteins that are drug targets or candidate drug targets and for which structural data are present in the Protein Data Bank. The website supports query types including searches by chemical structure, substructure and similarity, protein sequence, ligand and protein names, affinity ranges and molecular weight. Data sets generated by BindingDB queries can be downloaded in the form of annotated SDfiles for further analysis, or used as the basis for virtual screening of a compound database uploaded by the user. Data are linked to structural data in the PDB via PDB IDs and chemical and sequence searches, and to the literature in PubMed via PubMed IDs.