Web-accessible database of data extracted from the scientific literature, focusing on proteins that are drug targets or candidate drug targets and for which structural data are present in the Protein Data Bank (PDB). The website supports query types including searches by chemical structure, substructure and similarity, protein sequence, ligand and protein names, affinity ranges, and molecular weight. Data sets generated by BindingDB queries can be downloaded as annotated SDfiles for further analysis, or used as the basis for virtual screening of a compound database uploaded by the user. Data are linked to structural data in the PDB via PDB IDs and chemical and sequence searches, and to the literature in PubMed via PubMed IDs.
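As a rough illustration of working with a downloaded annotated SDfile, the sketch below reads the data items of each record and keeps those whose affinity falls in a given range. It uses only the generic SD file layout (`> <TAG>` data headers, records ending in `$$$$`). The tag names `Ki (nM)` and `Target Name` and the sample records are assumptions made for the example and should be checked against the actual download.

```python
from typing import Dict, Iterator, List

def iter_sdf_records(text: str) -> Iterator[Dict[str, str]]:
    """Yield the data items (tag -> value) of each record in an SD file."""
    for block in text.split("$$$$"):
        lines = block.strip("\n").splitlines()
        tags: Dict[str, str] = {}
        i = 0
        while i < len(lines):
            line = lines[i]
            # A data header looks like "> <Tag Name>".
            if line.startswith(">") and "<" in line and ">" in line[1:]:
                name = line[line.index("<") + 1 : line.rindex(">")]
                values: List[str] = []
                i += 1
                # The value runs until the next blank line.
                while i < len(lines) and lines[i].strip():
                    values.append(lines[i].strip())
                    i += 1
                tags[name] = "\n".join(values)
            i += 1
        if tags:
            yield tags

def in_affinity_range(record: Dict[str, str], tag: str,
                      low_nm: float, high_nm: float) -> bool:
    """True if the tag holds a plain number within [low_nm, high_nm].

    Qualified values such as ">10000" do not parse and are excluded.
    """
    try:
        value = float(record.get(tag, ""))
    except ValueError:
        return False
    return low_nm <= value <= high_nm

# Hypothetical two-record file; tag names are assumptions.
SAMPLE_SDF = """\
ligand-1
  hypothetical

  0  0  0  0  0  0  0  0  0  0999 V2000
M  END
> <Ki (nM)>
12.5

> <Target Name>
Example kinase

$$$$
ligand-2
  hypothetical

  0  0  0  0  0  0  0  0  0  0999 V2000
M  END
> <Ki (nM)>
>10000

> <Target Name>
Example protease

$$$$
"""

records = list(iter_sdf_records(SAMPLE_SDF))
hits = [r for r in records if in_affinity_range(r, "Ki (nM)", 1.0, 100.0)]
```

For real files, a cheminformatics toolkit would also parse the structure block; this sketch deliberately ignores it and reads only the annotations.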
BindingDB is a public, web-accessible database of measured binding affinities, focusing chiefly on the interactions of proteins considered to be drug targets with small, drug-like molecules. As of May 27, 2022, BindingDB contains 41,296 Entries, each with a DOI, containing 2,519,702 binding data for 8,810 protein targets and 1,080,101 small molecules. There are 5,988 protein-ligand crystal structures with BindingDB affinity measurements for proteins with 100% sequence identity, and 11,442 crystal structures allowing proteins to 85% sequence identity. You can also use BindingDB data through the Registry of Open Data on AWS: https://registry.opendata.aws/binding-db. This dataset uses the split from TransformerCPI (doi.org/10.1093/bioinformatics/btaa524).
http://www.apache.org/licenses/LICENSE-2.0
The code, dataset, and model weights are described in the paper "Interformer: An Interaction-Aware Model for Protein-Ligand Docking and Affinity Prediction."
experiment_results.zip: Contains generated results that can be used to reproduce the results reported in the paper.
benchmark.zip: Contains the docking and affinity input data for Interformer. You can use the source code to make predictions and reproduce the numbers reported in the paper.
checkpoints.zip: Contains one set of weights for the Energy model and four for the PoseScore and Affinity models.
source_code_1.0.zip: Contains the initial version of the source code.
interformer_train.tar.gz: Contains the prepared training data for Interformer. poses/ contains all structures needed for training, poses/ligand contains the re-docking poses generated by the Interformer energy model, poses/ligand/rcsb contains the conformations of the reference ligands, poses/pocket contains all pockets extracted from the raw PDB files from RCSB, poses/uff contains all ligand conformations minimized with UFF starting from the reference ligands, and train/ contains the training CSV.
You can also find the newest version of the source code at https://github.com/tencent-ailab/Interformer
Database of affinity data for protein-ligand complexes of the Protein Data Bank (PDB) providing direct and free access to the experimental affinity of a given complex structure. Affinity data are exclusively obtained from the scientific literature. As of Thursday, May 1, 2014, AffinDB contains 748 affinity values covering 474 different PDB complexes. More than one affinity value may be associated with a single PDB complex, which is most frequently due to multiple references reporting affinity data for the same complex. AffinDB provides access to data in three different forms: (1) summary information for a PDB entry, (2) an affinity information window, and (3) tabular reports.
https://github.com/DISIC/politique-de-contribution-open-source/blob/master/LICENSE.pdf
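Because one PDB complex can carry several affinity values from different references, a consumer of such data typically groups values by PDB ID before using them. The following is a minimal sketch of that grouping; the row layout `(pdb_id, value_nM, reference)` and the sample rows are hypothetical and do not reflect AffinDB's actual export format.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Row = Tuple[str, float, str]  # (PDB ID, affinity in nM, literature reference)

def group_by_complex(rows: Iterable[Row]) -> Dict[str, List[Tuple[float, str]]]:
    """Collect every (value, reference) pair reported for each PDB complex."""
    grouped: Dict[str, List[Tuple[float, str]]] = defaultdict(list)
    for pdb_id, value_nm, reference in rows:
        # PDB IDs are case-insensitive, so normalise before grouping.
        grouped[pdb_id.upper()].append((value_nm, reference))
    return dict(grouped)

# Hypothetical rows: two references report a value for the same complex.
rows = [
    ("1abc", 35.0, "reference A"),
    ("1ABC", 50.0, "reference B"),
    ("2xyz", 1200.0, "reference C"),
]
by_complex = group_by_complex(rows)

# Average number of values per complex for the counts quoted above.
values_per_complex = 748 / 474
```

With the counts given in the description, `values_per_complex` comes to roughly 1.6 affinity values per complex.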
This Zenodo repository provides comprehensive resources for the paper titled "Spatio-temporal learning from MD simulations for protein-ligand binding affinity prediction". We created a dataset of 63,000 molecular dynamics simulations by performing 10 simulations of 10 ns on each of 6,300 complexes. Neural networks were developed to learn from these data in order to predict the binding affinities of protein-ligand complexes. The implementation of these neural networks is available on GitHub. Our collection includes training/benchmark datasets, trained statistical models, and results on test sets (CSV & PDF files).
Training/benchmark datasets:
Training, validation and test sets are provided to train and evaluate the following neural networks:
Pafnucy, Proli and Densenucy without MD data augmentation (dataset file names contain "initial")
Pafnucy, Proli and Densenucy with MD data augmentation (dataset file names contain "MDDA")
Pafnucy with/without MD data augmentation, and Proli and Densenucy with MD data augmentation, were also evaluated on the FEP test set (test set file names contain "fep")
Timenucy and Videonucy using spatiotemporal learning methods (dataset file names contain "4D")
Pafnucy without MD data augmentation and on a reduced training set (dataset file names contain "reduced")
For each training methodology (MD data augmentation and spatiotemporal learning), we provide the data for the whole complex, only the ligand or only the protein. Additionally for spatiotemporal learning, we provide the data with only the ligand using the tracking mode.
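The file-name conventions listed above can be turned into a small lookup when sorting the downloaded files. This is only a sketch: the marker strings are the ones quoted in the description, while the example file names and the `.hdf` extension are hypothetical, and the match is a plain case-sensitive substring test.

```python
from typing import Dict, List

# Markers quoted in the description, mapped to the setup they denote.
NAME_MARKERS: Dict[str, str] = {
    "initial": "no MD data augmentation",
    "MDDA": "MD data augmentation",
    "fep": "FEP test set",
    "4D": "spatiotemporal learning (Timenucy, Videonucy)",
    "reduced": "reduced training set (Pafnucy, no MD data augmentation)",
}

def describe_dataset_file(filename: str) -> List[str]:
    """Return the label of every marker found in the file name."""
    return [label for marker, label in NAME_MARKERS.items() if marker in filename]

# Hypothetical file names, used only to exercise the lookup.
labels_mdda = describe_dataset_file("training_set_MDDA.hdf")
labels_4d = describe_dataset_file("validation_set_4D.hdf")
labels_none = describe_dataset_file("README.txt")
```

A file that matches no marker returns an empty list, which makes unexpected names easy to spot.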
Statistical models:
We provide the models trained with Pafnucy, Proli, Densenucy, Timenucy and Videonucy. Each model was trained in 10 replicates.
For Pafnucy, Proli, Densenucy, we provide the models trained with random and systematic rotations, as well as with or without MD data augmentation.
For Proli, Densenucy, Timenucy and Videonucy, we provide the models trained on the whole complex, only the ligand or only the protein.
For Pafnucy we also provide the models trained on the reduced set (5932 complexes).
Results on test sets (CSV & PDF files):
We provide the predictions on the PDBbind v.2016 core set.
For spatiotemporal learning methods (Timenucy and Videonucy), there are predictions for only 83 complexes, as we did not perform simulations on the whole test set.
For models trained with MD data augmentation, predictions were carried out on the crystallographic structures as well as on the frames extracted from the simulations performed on the test set (augmented test).
Results on the FEP dataset are also provided for Pafnucy, Proli and Densenucy.
Due to the large size of the raw MD data (~4.5 TB), we are not able to share these data on Zenodo, and will provide them on request.
This work was performed using HPC resources from GENCI-IDRIS (Grant 2021-A0100712496 & 2022-AD011013521) and CRIANN (Grant 2021002).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Receptor affinity data for morphine collected from the literature. The columns identify the receptor, the radioligand used in determining affinity, the species from which the receptor was obtained, the tissue from which the receptor was obtained, the Ki value in nanomolar (nM) or the IC50 value in nanomolar (the IC50 is the molar concentration of an unlabeled agonist or antagonist that inhibits the binding of a radioligand by 50%, [26]), and the literature reference from which the data were obtained.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Receptor affinity data for THC collected from the literature. The columns identify the receptor, the radioligand used in determining affinity, the species from which the receptor was obtained, the tissue from which the receptor was obtained, the Ki value in nanomolar (nM), and the literature reference from which the data were obtained.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Supplementary data file S4 from the manuscript 'The application of the Open Pharmacological Concepts Triple Store (Open PHACTS) to support Drug Discovery Research' to be published in PLOS ONE
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Historical price and volatility data for Affinity in US Dollar across different time periods.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Explore Affinity through data • Key facts: city, country, employees, revenues, company type, ESG score • Real-time news, visualizations and datasets
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Historical price and volatility data for Affinity in Taiwan New Dollar across different time periods.
http://opendatacommons.org/licenses/dbcl/1.0/
Read this article to unlock the wonderful world of deep reinforcement learning for drug design.
ReLeaSE is a public dataset, consisting of molecular structures and their corresponding binding affinity to proteins. The dataset was created for the purpose of evaluating and comparing machine learning models for the prediction of protein-ligand binding affinity.
The dataset contains a total of 10,000 molecules and their binding affinity to several target proteins, including thrombin, kinase, and protease. The molecular structures are represented using Simplified Molecular Input Line Entry System (SMILES) notation, which is a standardized method for representing molecular structures as a string of characters. The binding affinity is represented as a negative logarithm of the dissociation constant (pKd), which is a measure of the strength of the interaction between the molecule and the target protein.
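The pKd value mentioned above is the standard negative base-10 logarithm of the dissociation constant expressed in mol/L. A short sketch of the conversion in both directions follows; the function names are illustrative and not part of the dataset.

```python
import math

def kd_to_pkd(kd_molar: float) -> float:
    """Convert a dissociation constant in mol/L to pKd = -log10(Kd)."""
    if kd_molar <= 0:
        raise ValueError("Kd must be a positive concentration in mol/L")
    return -math.log10(kd_molar)

def pkd_to_kd(pkd: float) -> float:
    """Convert pKd back to a dissociation constant in mol/L."""
    return 10.0 ** (-pkd)

# A 1 nM binder (1e-9 mol/L) has a pKd of about 9; a 1 µM binder, about 6.
pkd_nanomolar = kd_to_pkd(1e-9)
pkd_micromolar = kd_to_pkd(1e-6)
```

Larger pKd values therefore mean tighter binding, and each unit corresponds to a tenfold change in Kd.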
The ReLeaSE dataset provides a standardized benchmark for evaluating machine learning models for protein-ligand binding affinity prediction. The dataset is publicly available and can be used for research purposes, making it an important resource for the drug discovery community.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dataset for LigUnity
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Over the years, several methods have been proposed for computational PPI prediction, each with a different performance evaluation strategy. When benchmarked, most of these methods suffer from poorly designed cross-validation strategies, ad hoc selection of positive/negative samples, and similar problems. To address these issues, our proposed multi-level feature-based PPI prediction approach, JUPPI, which uses sequence, domain and GO information as features, introduces a refined evaluation strategy. During the evaluation process, we first extract high-quality negative data using three-stage filtering, and then introduce a pair-input-based cross-validation strategy with three difficulty levels for test-set predictions. Our proposed evaluation strategy reduces the component-level overlap issue in test sets. The performance of JUPPI is compared with that of state-of-the-art approaches in this domain and tested on six independent PPI datasets. On almost all the datasets, JUPPI outperforms the state of the art, not only for PPI prediction at the human proteome level, but also for prediction of interactors of intrinsically disordered human proteins.
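The description does not define the three difficulty levels. One common way to build a pair-input evaluation, assumed here purely for illustration, is to grade each test pair by how many of its two proteins also occur in the training pairs: both (easiest), one, or none (hardest). The sketch below implements that assumed scheme; it is not taken from the JUPPI code, and the function name and protein labels are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

Pair = Tuple[str, str]

def split_by_difficulty(train_pairs: Iterable[Pair],
                        test_pairs: Iterable[Pair]) -> Dict[int, List[Pair]]:
    """Bucket test pairs by component-level overlap with the training set.

    Level 1: both proteins appear in training pairs.
    Level 2: exactly one protein appears in training pairs.
    Level 3: neither protein appears in training pairs.
    """
    seen = {protein for pair in train_pairs for protein in pair}
    levels: Dict[int, List[Pair]] = {1: [], 2: [], 3: []}
    for a, b in test_pairs:
        shared = (a in seen) + (b in seen)  # 0, 1 or 2 proteins already seen
        levels[3 - shared].append((a, b))
    return levels

# Hypothetical protein identifiers.
train = [("A", "B"), ("B", "C")]
test = [("A", "C"), ("A", "D"), ("D", "E")]
levels = split_by_difficulty(train, test)
```

Reporting scores per level separately shows how much of a model's performance depends on having seen the individual proteins during training.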
This dataset provides information about the number of properties, residents, and average property values for Affinity Street cross streets in Hoxie, AR.
This dataset was created by Anker Huang
It contains the following files:
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
In drug development, the efficacy of an antibody depends on how the antibody interacts with the target antigen. The strength of these interactions indicates how successful an antibody is in neutralizing an antigen. Therefore, the strength, measured by “binding affinity”, is a critical aspect of antibody engineering. In theory, the higher the binding affinity, the higher the chances are that the antibody is successful against the target antigen. Currently, techniques such as molecular docking and molecular dynamics are utilized in quantifying the binding affinity. However, owing to the computational complexity of the aforementioned techniques, running simulations for large antibodies/antigens remains a daunting task. Despite the commendable improvements in deep learning-based binding affinity prediction, such approaches are highly dependent on the quality of the antibody-antigen structures and they tend to overlook the importance of capturing the evolutionary details of proteins upon mutation. Further, most of the existing datasets for the task only include antibody-antigen pairs related to one antigen variant and, thus, are not suitable for developing comprehensive data-driven approaches. To circumvent the said complexities, we first curate the largest and most generalized datasets for antibody-antigen binding affinity prediction, consisting of both protein sequences and structures. Subsequently, we propose a deep geometric neural network comprising a structure-based model and a sequence-based model that considers both atomistic and evolutionary details when predicting the binding affinity. The proposed framework exhibited a 10% improvement in mean absolute error compared to the state-of-the-art models while showing a strong correlation between the predictions and target values. 
We release the datasets and code publicly at https://drug-discovery-entc.github.io/p2pxml/ to support the development of antibody-antigen binding affinity prediction frameworks for the benefit of science and society.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Repository for "Experimental Uncertainty in Training Data for Protein-Ligand Binding Affinity Prediction Models".
ChEMBL_33.tsv contains the raw data as downloaded from ChEMBL.
Data_Processing_ChEMBL33.ipynb contains all the code necessary to reproduce the results reported in the manuscript.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset provides information about the number of properties, residents, and average property values for Affinity Road cross streets in Fairmont, NC.