100+ datasets found
  1. Data from: PPB-Affinity: Protein-Protein Binding Affinity dataset for...

    • zenodo.org
    bin, zip
    Updated Jul 27, 2024
    Cite
    Huaqing Liu (2024). PPB-Affinity: Protein-Protein Binding Affinity dataset for AI-based protein drug discovery [Dataset]. http://doi.org/10.5281/zenodo.13054646
    Available download formats: zip, bin
    Dataset updated
    Jul 27, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Huaqing Liu
    License

    http://www.apache.org/licenses/LICENSE-2.0

    Description

    Prediction of protein-protein binding (PPB) affinity plays an important role in large-molecule drug discovery. Deep learning (DL) has been adopted to predict the change in PPB affinity upon mutation, but few studies have predicted the PPB affinity itself, mainly because open-source datasets of PPB affinity are scarce. The current study therefore introduces and releases a PPB affinity dataset (PPB-Affinity) intended to support the development of DL models that predict PPB affinity. The PPB-Affinity dataset contains key information such as crystal structures of protein-protein complexes (with or without protein mutation patterns), PPB affinity, receptor protein chain, ligand protein chain, etc. To the best of our knowledge, this is the largest publicly available PPB affinity dataset, and it may help the industry improve screening efficiency when discovering new large-molecule drugs.

    Code for PPB-Affinity database preparation is available at https://github.com/Huatsing-Lau/PPB-Affinity-DataPrepWorkflow.
    Code for the benchmark algorithm is available at https://github.com/ChenPy00/PPB-Affinity.

    Files are organized as follows:

    - PPB-Affinity.xlsx
    - samples_deleted.zip
    - PDB/
      - Affinity Benchmark v5.5/
        - file1.pdb
        - file2.pdb
        - ...
        - filek.pdb
      - ATLAS/
      - PDBbind v2020/
      - SAbDab/
      - SKEMPIv2.0/
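
    The layout above suggests one subdirectory per source database under PDB/. As a minimal sketch, assuming the archive has been extracted locally and that structure files carry a .pdb extension, the number of structures per source could be counted as follows. The function name and the extraction path are hypothetical, not part of the dataset.

```python
from pathlib import Path


def count_structures_by_source(pdb_root):
    """Count .pdb files in each immediate subdirectory of pdb_root.

    pdb_root is assumed to be the extracted "PDB/" directory, with one
    subdirectory per source database (e.g. "ATLAS", "SAbDab").
    """
    root = Path(pdb_root)
    counts = {}
    for source_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        # Only files directly inside the source directory are counted.
        counts[source_dir.name] = sum(1 for _ in source_dir.glob("*.pdb"))
    return counts
```

    Calling count_structures_by_source("PDB") on an extracted copy would return a dictionary mapping each source directory name to its file count.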

  2. Data from: PPI-Affinity: A Web Tool for the Prediction and Optimization of...

    • datasetcatalog.nlm.nih.gov
    Updated Jun 2, 2022
    Cite
    Münch, Jan; Mieres-Perez, Joel; Romero-Molina, Sandra; Sanchez-Garcia, Elsa; Ruiz-Blanco, Yasser B.; Ehrmann, Michael; Harms, Mirja (2022). PPI-Affinity: A Web Tool for the Prediction and Optimization of Protein–Peptide and Protein–Protein Binding Affinity [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000438753
    Dataset updated
    Jun 2, 2022
    Authors
    Münch, Jan; Mieres-Perez, Joel; Romero-Molina, Sandra; Sanchez-Garcia, Elsa; Ruiz-Blanco, Yasser B.; Ehrmann, Michael; Harms, Mirja
    Description

    Virtual screening of protein–protein and protein–peptide interactions is a challenging task that directly impacts the processes of hit identification and hit-to-lead optimization in drug design projects involving peptide-based pharmaceuticals. Although several screening tools designed to predict the binding affinity of protein–protein complexes have been proposed, methods specifically developed to predict protein–peptide binding affinity are comparatively scarce. Frequently, predictors trained to score the affinity of small molecules are used for peptides indistinctively, despite the larger complexity and heterogeneity of interactions rendered by peptide binders. To address this issue, we introduce PPI-Affinity, a tool that leverages support vector machine (SVM) predictors of binding affinity to screen datasets of protein–protein and protein–peptide complexes, as well as to generate and rank mutants of a given structure. The performance of the SVM models was assessed on four benchmark datasets, which include protein–protein and protein–peptide binding affinity data. In addition, we evaluated our model on a set of mutants of EPI-X4, an endogenous peptide inhibitor of the chemokine receptor CXCR4, and on complexes of the serine proteases HTRA1 and HTRA3 with peptides. PPI-Affinity is freely accessible at https://protdcal.zmb.uni-due.de/PPIAffinity.

  3. Spatio-temporal learning from molecular dynamics simulations for...

    • zenodo.org
    zip
    Updated Aug 22, 2025
    + more versions
    Cite
    Pierre-Yves Libouban; Camille Parisel; Maxime Song; Samia Aci-Sèche; Jose-Carlos Gómez-Tamayo; Gary Tresadern; Pascal Bonnet (2025). Spatio-temporal learning from molecular dynamics simulations for protein-ligand binding affinity prediction [Dataset]. http://doi.org/10.5281/zenodo.10390550
    Available download formats: zip
    Dataset updated
    Aug 22, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Pierre-Yves Libouban; Camille Parisel; Maxime Song; Samia Aci-Sèche; Jose-Carlos Gómez-Tamayo; Gary Tresadern; Pascal Bonnet
    License

    https://github.com/DISIC/politique-de-contribution-open-source/blob/master/LICENSE.pdf

    Time period covered
    Jun 6, 2024
    Description

    This Zenodo repository provides comprehensive resources for the paper titled "Spatio-temporal learning from molecular dynamics simulations for protein-ligand binding affinity prediction", published in Bioinformatics. We created a dataset of 63,000 molecular dynamics simulations by performing 10 simulations of 10 ns on each of 6,300 complexes. Neural networks were developed to learn from this data in order to predict the binding affinities of protein-ligand complexes. The implementation of these neural networks is available on GitHub. Our collection includes training/benchmark datasets, trained statistical models, and results on test sets (CSV & PDF files).

    Training/benchmark datasets:

    Training, validation and test sets are provided to train and evaluate the following neural networks:

    • Pafnucy, Proli and Densenucy without MD data augmentation (dataset file names contain "initial")
    • Pafnucy, Proli and Densenucy with MD data augmentation (dataset file names contain "MDDA")
    • Pafnucy with/without MD data augmentation and Proli and Densenucy with MD data augmentation were also evaluated on the FEP test set (test set file names contain "fep")
    • Timenucy and Videonucy using spatiotemporal learning methods (dataset file names contain "4D")
    • Pafnucy without MD data augmentation and on a reduced training set (dataset file names contain "reduced")
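
    The naming convention in the list above can be applied mechanically. The following is a minimal sketch, assuming the quoted substrings appear verbatim (and case-sensitively) in the file names; the labels, the function name, and the example file names are hypothetical and not taken from the repository.

```python
# Substrings quoted in the dataset description, mapped to a short label.
# The labels are illustrative wording, not names used by the authors.
FILENAME_TAGS = {
    "initial": "no MD data augmentation",
    "MDDA": "MD data augmentation",
    "fep": "FEP test set",
    "4D": "spatiotemporal learning",
    "reduced": "reduced training set",
}


def tag_dataset_file(filename):
    """Return the labels whose substring occurs in filename (case-sensitive)."""
    return [label for tag, label in FILENAME_TAGS.items() if tag in filename]
```

    For instance, a hypothetical file named "train_MDDA_complex.hdf" would be tagged as MD data augmentation, while a name containing none of the substrings yields an empty list.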

    For each training methodology (MD data augmentation and spatiotemporal learning), we provide the data for the whole complex, only the ligand or only the protein. Additionally for spatiotemporal learning, we provide the data with only the ligand using the tracking mode.

    Statistical models:

    We provide the models trained with Pafnucy, Proli, Densenucy, Timenucy and Videonucy. Each model was trained in 10 replicates.

    For Pafnucy, Proli, Densenucy, we provide the models trained with random and systematic rotations, as well as with or without MD data augmentation.

    For Proli, Densenucy, Timenucy and Videonucy, we provide the models trained on the whole complex, only the ligand or only the protein.

    For Pafnucy we also provide the models trained on the reduced set (5932 complexes).

    Results on test sets (CSV & PDF files):

    We provide the predictions on the PDBbind v.2016 core set.

    • For spatiotemporal learning methods (Timenucy and Videonucy), there are predictions for only 83 complexes, as we did not perform simulations on the whole test set.
    • For models trained with MD DA, predictions were carried out on the crystallographic structures as well as on the frames extracted from the simulations performed on the test set (augmented test).

    Results on the FEP dataset are also provided for Pafnucy, Proli and Densenucy.

    The raw MD data (~4.5 TB) are stored on the MDDB, where they can be visualized and downloaded.

    This work was performed using HPC resources from GENCI-IDRIS (Grant 2021-A0100712496 & 2022-AD011013521) and CRIANN (Grant 2021002).

  4. Data from: Improving generalisability of 3D binding affinity models in low...

    • zenodo.org
    bin, csv, txt, zip
    Updated Nov 8, 2024
    Cite
    Ward Haddadin; Julia Buhmann; Alan Bilsland; Lukáš Pravda; Hagen Triendl (2024). Improving generalisability of 3D binding affinity models in low data regimes [Dataset]. http://doi.org/10.5281/zenodo.14054484
    Available download formats: zip, csv, bin, txt
    Dataset updated
    Nov 8, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Ward Haddadin; Julia Buhmann; Alan Bilsland; Lukáš Pravda; Hagen Triendl
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Structures of the PDBBind dataset (general protein-ligand) prepared with CCDC protein preparation software. After preparation, 18310 structures out of the total 19443 remained (1133 failed).

  5. Compounds with binding affinity data for human DBP

    • figshare.com
    xls
    Updated Jan 19, 2016
    Cite
    Christine Chichester (2016). Compounds with binding affinity data for human DBP [Dataset]. http://doi.org/10.6084/m9.figshare.1235442.v1
    Available download formats: xls
    Dataset updated
    Jan 19, 2016
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Christine Chichester
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Supplementary data file S4 from the manuscript 'The application of the Open Pharmacological Concepts Triple Store (Open PHACTS) to support Drug Discovery Research' to be published in PLOS ONE

  6. Data from: ProAffinity-GNN: A Novel Approach to Structure-Based...

    • acs.figshare.com
    xlsx
    Updated Nov 19, 2024
    Cite
    Zhiyuan Zhou; Yueming Yin; Hao Han; Yiping Jia; Jun Hong Koh; Adams Wai-Kin Kong; Yuguang Mu (2024). ProAffinity-GNN: A Novel Approach to Structure-Based Protein–Protein Binding Affinity Prediction via a Curated Data Set and Graph Neural Networks [Dataset]. http://doi.org/10.1021/acs.jcim.4c01850.s002
    Available download formats: xlsx
    Dataset updated
    Nov 19, 2024
    Dataset provided by
    ACS Publications
    Authors
    Zhiyuan Zhou; Yueming Yin; Hao Han; Yiping Jia; Jun Hong Koh; Adams Wai-Kin Kong; Yuguang Mu
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Protein–protein interactions (PPIs) are crucial for understanding biological processes and disease mechanisms, contributing significantly to advances in protein engineering and drug discovery. The accurate determination of binding affinities, essential for decoding PPIs, faces challenges due to the substantial time and financial costs involved in experimental and theoretical methods. This situation underscores the urgent need for more effective and precise methodologies for predicting binding affinity. Despite the abundance of research on PPI modeling, the field of quantitative binding affinity prediction remains underexplored, mainly due to a lack of comprehensive data. This study seeks to address these needs by manually curating pairwise interaction labels on available 3D structures of protein complexes, with experimentally determined binding affinities, creating the largest data set for structure-based pairwise protein interaction with binding affinity to date. Subsequently, we introduce ProAffinity-GNN, a novel deep learning framework using protein language model and graph neural network (GNN) to improve the accuracy of prediction of structure-based protein–protein binding affinities. The evaluation results across several benchmark test sets and an additional case study demonstrate that ProAffinity-GNN not only outperforms existing models in terms of accuracy but also shows strong generalization capabilities.

  7. SARS-CoV-2 RBD binding affinity dataset

    • resodate.org
    Updated Jan 7, 2026
    Cite
    Shiwei Liu; Tian Zhu; Milong Ren; Chungong Yu; Dongbo Bu; Haicang Zhang (2026). SARS-CoV-2 RBD binding affinity dataset [Dataset]. https://resodate.org/resources/aHR0cHM6Ly9zZXJ2aWNlLnRpYi5ldS9sZG1zZXJ2aWNlL2RhdGFzZXQvc2Fycy1jb3YtMi1yYmQtYmluZGluZy1hZmZpbml0eS1kYXRhc2V0
    Dataset updated
    Jan 7, 2026
    Dataset provided by
    Leibniz Data Manager
    Authors
    Shiwei Liu; Tian Zhu; Milong Ren; Chungong Yu; Dongbo Bu; Haicang Zhang
    Description

    The dataset used in the paper for predicting the effects of mutations on protein-protein binding.

  8. Performance comparison of BiComp encoding, against LZMA and SW encodings,...

    • plos.figshare.com
    xls
    Updated Jun 21, 2023
    Cite
    Mahmood Kalemati; Mojtaba Zamani Emani; Somayyeh Koohi (2023). Performance comparison of BiComp encoding, against LZMA and SW encodings, for drug-target binding affinity prediction, for Davis and Kiba datasets, using feature ablation experiments. [Dataset]. http://doi.org/10.1371/journal.pcbi.1011036.t009
    Available download formats: xls
    Dataset updated
    Jun 21, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Mahmood Kalemati; Mojtaba Zamani Emani; Somayyeh Koohi
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Performance comparison of BiComp encoding, against LZMA and SW encodings, for drug-target binding affinity prediction, for Davis and Kiba datasets, using feature ablation experiments.

  9. Affinity Price Prediction Data

    • coinbase.com
    Updated Dec 26, 2025
    Cite
    (2025). Affinity Price Prediction Data [Dataset]. https://www.coinbase.com/en-ca/price-prediction/safeaffinity
    Dataset updated
    Dec 26, 2025
    Variables measured
    Growth Rate, Predicted Price
    Measurement technique
    User-defined projections based on compound growth. This is not a formal financial forecast.
    Description

    This dataset contains the predicted prices of the asset Affinity over the next 16 years. This data is calculated initially using a default 5 percent annual growth rate, and after page load, it features a sliding scale component where the user can then further adjust the growth rate to their own positive or negative projections. The maximum positive adjustable growth rate is 100 percent, and the minimum adjustable growth rate is -100 percent.
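
    The projection described above amounts to compound growth. The following is a rough sketch of that arithmetic, assuming annual compounding from a known current price; the function name and the starting price are hypothetical, and the page's exact formula is not stated in the description.

```python
def project_prices(current_price, annual_growth=0.05, years=16):
    """Project a price forward with compound annual growth.

    annual_growth is a fraction (0.05 means 5 percent) and must lie in
    [-1.0, 1.0], mirroring the -100 to 100 percent range in the description.
    Returns one projected price per year, for years 1..years.
    """
    if not -1.0 <= annual_growth <= 1.0:
        raise ValueError("annual_growth must be between -1.0 and 1.0")
    return [current_price * (1.0 + annual_growth) ** year
            for year in range(1, years + 1)]
```

    With the default 5 percent rate the list has 16 entries, one per projected year; a rate of -100 percent drives every projected price to zero.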

  10. Antibody and Nanobody Design Dataset (ANDD)

    • zenodo.org
    zip
    Updated Sep 26, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Yikai Wu (2025). Antibody and Nanobody Design Dataset (ANDD) [Dataset]. http://doi.org/10.5281/zenodo.16894086
    Available download formats: zip
    Dataset updated
    Sep 26, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Yikai Wu
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Title: Antibody and Nanobody Design Dataset (ANDD): A Comprehensive Resource with Sequence, Structure, and Binding Affinity Data

    DOI: 10.5281/zenodo.16894086

    Resource Type: Dataset

    Publisher: Zenodo

    Publication Year: 2025

    License: Creative Commons Attribution 4.0 International (CC BY 4.0)

    Overview (Abstract):

    The Antibody and Nanobody Design Dataset (ANDD) is a unified, large-scale dataset created to overcome the limitations of data fragmentation and incompleteness in antibody and nanobody research. It integrates sequence, structure, antigen information, and binding affinity data from 15 diverse sources, including OAS, PDB, SabDab, and others. ANDD comprises 48,800 antibody/nanobody sequences, structural data for 25,158 entries, antigen sequences for 12,617 entries, and a total of 9,569 binding affinity values for antibody/nanobody-antigen pairs. A key innovation is the augmentation of experimental affinity data with 5,218 high-quality predictions generated by the ANTIPASTI model. This makes ANDD the largest available dataset of its kind, providing a robust foundation for training and validating deep learning models in therapeutic antibody and nanobody design.

    Keywords: Dataset, Antibody Design, Nanobody Design, VHH, Deep Learning, Protein Engineering, Binding Affinity, Therapeutic Antibodies, Computational Biology

    Methods (Data Curation and Processing):

    The ANDD was constructed through a rigorous multi-step process:

    1. Data Collection: Data was aggregated from 15 primary sources, including both antibody/nanobody-specific databases (e.g., OAS, SAbDab, INDI, sdAb-DB) and general protein databases (e.g., PDB, UNIPROT, PDBbind).
    2. Integration and Standardization: Data from disparate sources was consolidated into a consistent format, addressing challenges of format inconsistency. Entries were manually validated to exclude non-relevant data (e.g., T-cell receptors).
    3. Affinity Data Augmentation: The ANTIPASTI deep learning model was used to predict and add binding affinity values for entries that had structural data but lacked experimental affinity measurements.
    4. Manual Curation: Web-based data and information from publicly available patents targeting key antigens (HER2, IL-6, CD45, SARS-CoV-2 RBD) were manually extracted to enhance completeness.
    5. Hierarchical Organization: Data is organized in a hierarchical structure, offering four progressively detailed levels: Sequence-only, Sequence+Structure, Sequence+Structure+Antigen, and Sequence+Structure+Antigen+Affinity.

    Data Specifications and Format:

    The dataset is distributed in two parts:

    1. ANDD.csv: A comprehensive spreadsheet containing all annotated metadata for each entry.
    2. All_structures/ folder: A directory containing the corresponding PDB structure files for entries with structural data.

    The ANDD.csv file includes the following key fields (a full description is available in the Data Record section of the paper):

    • General Info: Source, Update_Date, PDB_ID, Experimental_Method, Ab_or_Nano, Source_Organism.
    • Chain Details: Entity IDs, Asym IDs, Database Accession Codes, and Macromolecule Names for Heavy (H) and Light (L) chains.
    • Antigen Details: Ag_Name, Ag_Seq, Ag_Source Organism, and relevant database identifiers.
    • Sequence Data: Full amino acid sequences for H/L chains and individual CDR regions (H1-H3, L1-L3).
    • Affinity Data: Experimentally measured or predicted Affinity_Kd(M), ∆Gbinding(kJ), and the Affinity_Method.
    • Mutation Data: Annotation of any amino acid mutations (Ab/Nano_mutation).

    Technical Validation:

    The quality of ANDD has been ensured through extensive validation:

    1. Manual Curation: A rigorous manual review process was conducted to check for accuracy and consistency between sequence, structure, and affinity data across randomly selected entries.
    2. Affinity Validation with AlphaBind: The experimental Kd values were validated by comparing them against enrichment ratios predicted by the AlphaBind model, showing a significant correlation (Pearson’s r = 0.750).
    3. Cross-Mapping Validation: The internal consistency between Kd and ∆Gbinding values within the dataset was confirmed, showing a perfect correlation (Pearson’s r = 1.000) as per thermodynamic principles.
    4. Proof-of-Concept Application: The dataset's utility was demonstrated by fine-tuning the Diffab generative model on a subset of ANDD. The fine-tuned model showed significant improvements in generating nanobodies with better predicted binding affinity, structural diversity, and developability metrics.
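
    The thermodynamic link between Kd and ∆Gbinding mentioned in the cross-mapping step is the standard relation ∆G = RT ln(Kd), with Kd expressed relative to a 1 M standard state. The sketch below illustrates that relation only; the temperature of 298.15 K and the function names are assumptions, and the conversion actually used to build ANDD is not given in the description.

```python
import math

GAS_CONSTANT_KJ = 8.314462618e-3  # molar gas constant in kJ/(mol*K)


def delta_g_from_kd(kd_molar, temperature_k=298.15):
    """Binding free energy in kJ/mol from a dissociation constant in mol/L.

    The default temperature (25 degrees C) is an assumption for illustration.
    """
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return GAS_CONSTANT_KJ * temperature_k * math.log(kd_molar)


def kd_from_delta_g(delta_g_kj_per_mol, temperature_k=298.15):
    """Inverse of delta_g_from_kd: dissociation constant in mol/L."""
    return math.exp(delta_g_kj_per_mol / (GAS_CONSTANT_KJ * temperature_k))
```

    Under these assumptions a 1 nM binder corresponds to roughly -51.4 kJ/mol, and tighter binding (smaller Kd) gives a more negative ∆G.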

    Potential Uses:

    ANDD is designed to accelerate research in computational biology and drug discovery, including:

    • Training and benchmarking deep learning models for de novo antibody/nanobody sequence and structure generation.
    • Developing and validating predictive models for antibody-antigen binding affinity.
    • Studying structure-function relationships in antibody-antigen interactions.
    • Facilitating the design of optimized therapeutic antibodies and nanobodies with improved specificity and efficacy.

    Access and License:

    The ANDD dataset is publicly available for download under a Creative Commons Attribution 4.0 International (CC BY 4.0) license. Users are free to share and adapt the material for any purpose, even commercially, provided appropriate credit is given to the original authors and this data descriptor is cited.

  11. AffinDB

    • rrid.site
    Updated Jan 21, 2026
    Cite
    (2026). AffinDB [Dataset]. http://identifiers.org/RRID:SCR_001690
    Dataset updated
    Jan 21, 2026
    Description

    Database of affinity data for protein-ligand complexes of the Protein Data Bank (PDB) providing direct and free access to the experimental affinity of a given complex structure. Affinity data are exclusively obtained from the scientific literature. As of Thursday, May 1, 2014, AffinDB contains 748 affinity values covering 474 different PDB complexes. More than one affinity value may be associated with a single PDB complex, which is most frequently due to multiple references reporting affinity data for the same complex. AffinDB provides access to data in three different forms:

    • Summary information for PDB entry
    • Affinity information window
    • Tabular reports

  12. Davis and KIBA Datasets

    • kaggle.com
    zip
    Updated Jul 6, 2025
    Cite
    Raj Aryan (2025). Davis and KIBA Datasets [Dataset]. https://www.kaggle.com/datasets/rajaryan2315/davis-and-kiba-datasets
    Available download formats: zip (53365041 bytes)
    Dataset updated
    Jul 6, 2025
    Authors
    Raj Aryan
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    This dataset is curated from two widely used benchmarks—Davis and KIBA—for drug-target interaction (DTI) prediction tasks. It includes compound SMILES strings, target protein sequences, and corresponding binding affinity values.

    It is ideal for developing and benchmarking deep learning models that combine molecular graph representations (from SMILES) and sequence-based encodings (from protein sequences).

    Files Included

    davis_all.csv – pKd binding values between kinase inhibitors and protein targets.

    kiba_all.csv – KIBA scores representing combined bioactivity data (Ki, Kd, IC50).

    Data Columns

    Each file contains the following columns:

    Column Name        Description
    canonical_smiles   Isomeric SMILES string representing the compound structure
    target_sequence    Amino acid sequence of the protein target
    affinity           Binding affinity value (e.g., pKd for Davis, KIBA score for KIBA)

    Dataset Summary

    Davis Dataset

    Source: Davis et al., 2011

    Affinity values are provided as pKd = −log10(Kd).

    Focuses on kinase inhibitors and human kinase proteins.
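
    The pKd definition quoted above is a one-line transform. The following is a minimal sketch, assuming Kd is given in mol/L; the nanomolar helper is included because Kd values are often reported in nM, but whether this dataset's source values were in nM is an assumption. Both function names are hypothetical.

```python
import math


def pkd_from_kd(kd_molar):
    """pKd = -log10(Kd), with Kd in mol/L."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return -math.log10(kd_molar)


def pkd_from_kd_nanomolar(kd_nm):
    """Same transform for a Kd given in nanomolar (assumed unit)."""
    return pkd_from_kd(kd_nm * 1e-9)
```

    A Kd of 1 nM maps to a pKd of 9, and a weaker 10 µM binder maps to a pKd of 5, so larger pKd values mean tighter binding.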

    KIBA Dataset

    Source: Tang et al., 2014

    Combines multiple bioactivity types into a unified KIBA score.

    Broader coverage of compounds and targets.

    Applications

    Deep learning-based DTI prediction (e.g., GraphDTA, DeepDTA, MolTrans)

    Molecular representation learning (via GCN, SMILES encoding)

    Protein sequence embedding and joint modeling

    Drug discovery, repurposing, and virtual screening tasks

  13. Affinity Analysis Platform Market Research Report 2033

    • dataintelo.com
    csv, pdf, pptx
    Updated Oct 1, 2025
    Cite
    Dataintelo (2025). Affinity Analysis Platform Market Research Report 2033 [Dataset]. https://dataintelo.com/report/affinity-analysis-platform-market
    Available download formats: pptx, pdf, csv
    Dataset updated
    Oct 1, 2025
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2025 - 2034
    Area covered
    Global
    Description

    Affinity Analysis Platform Market Outlook




    According to our latest research, the global affinity analysis platform market size reached USD 1.87 billion in 2024, demonstrating robust momentum across sectors. With a projected CAGR of 13.2% during the forecast period, the market is anticipated to attain a value of USD 5.58 billion by 2033. This impressive growth is primarily attributed to increasing demand for advanced data analytics solutions, rising adoption of AI-driven customer insights, and the ongoing digital transformation across industries. As organizations strive to gain a competitive edge through data-driven decision-making, affinity analysis platforms are rapidly becoming indispensable tools for uncovering actionable patterns and optimizing business strategies.




    A major growth factor propelling the affinity analysis platform market is the exponential increase in data generation from digital channels, IoT devices, and customer interactions. Organizations across retail, BFSI, healthcare, and e-commerce are leveraging affinity analysis to mine relationships and associations within large datasets, enabling them to understand customer behavior, preferences, and trends with unprecedented accuracy. This demand is further amplified by the proliferation of omnichannel strategies, where businesses seek to create seamless and personalized experiences for their customers. As a result, the need for sophisticated analytics tools capable of real-time processing and actionable insights has never been higher, driving continuous innovation and investment in affinity analysis technologies.




    Another significant driver is the integration of artificial intelligence and machine learning algorithms within affinity analysis platforms. These technologies empower organizations to automate complex analytical processes, enhance the accuracy of predictions, and uncover hidden correlations that traditional methods might overlook. The ability to deliver highly targeted marketing campaigns, optimize product recommendations, and detect fraudulent activities in real time has become a key differentiator for businesses. Furthermore, advancements in cloud computing have democratized access to these platforms, allowing even small and medium enterprises to benefit from enterprise-grade analytics without heavy upfront investments in infrastructure.




    The increasing regulatory focus on data privacy and security is also shaping the affinity analysis platform market. As data-driven strategies become central to business operations, organizations are under pressure to comply with stringent regulations such as GDPR, CCPA, and HIPAA. This has led to a surge in demand for platforms that offer robust security features, data governance capabilities, and compliance tools. Vendors are responding by enhancing their offerings with advanced encryption, access controls, and audit trails, thereby building trust and ensuring the responsible use of customer data. This regulatory landscape, while challenging, is also fostering innovation and driving adoption among risk-averse industries like healthcare and finance.




    From a regional perspective, North America continues to dominate the affinity analysis platform market, accounting for the largest share owing to the early adoption of advanced analytics, presence of key technology providers, and high digital maturity of enterprises. However, Asia Pacific is emerging as the fastest-growing region, fueled by rapid digitalization, booming e-commerce, and increasing investments in AI and big data. Europe remains a significant market, driven by stringent data protection regulations and a strong focus on customer-centric business models. Meanwhile, Latin America and the Middle East & Africa are witnessing steady growth, supported by expanding digital infrastructure and rising awareness of the benefits of affinity analysis.



    Component Analysis




    The affinity analysis platform market by component is segmented into software and services, each playing a crucial role in delivering value to end-users. The software segment, which includes analytics engines, visualization tools, and data integration modules, holds the lion’s share of the market. This dominance is attributed to the continuous advancements in analytics algorithms, user-friendly interfaces, and integration capabilities with existing enterprise systems. Organizations are increasingly seeking scalable and customizable software solutions that can handle large vol

  14. Data from: High affinity binding of proteins HMG1 and HMG2 to semicatenated...

    • data.virginia.gov
    html
    Updated Sep 6, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    National Institutes of Health (2025). High affinity binding of proteins HMG1 and HMG2 to semicatenated DNA loops [Dataset]. https://data.virginia.gov/dataset/high-affinity-binding-of-proteins-hmg1-and-hmg2-to-semicatenated-dna-loops
    Available download formats: html
    Dataset updated
    Sep 6, 2025
    Dataset provided by
    National Institutes of Health
    Description

       Background
       Proteins HMG1 and HMG2 are two of the most abundant non-histone proteins in the nucleus of mammalian cells, and contain a domain of homology with many proteins implicated in the control of development, such as the sex-determination factor Sry and the Sox family of proteins. In vitro studies of interactions of HMG1/2 with DNA have shown that these proteins can bind to many unusual DNA structures, in particular to four-way junctions, with binding affinities of 10⁷ to 10⁹ M⁻¹.

       Results
       Here we show that HMG1 and HMG2 bind with a much higher affinity, at least 4 orders of magnitude higher, to a new structure, Form X, which consists of a DNA loop closed at its base by a semicatenated DNA junction, forming a DNA hemicatenane. The binding constant of HMG1 to Form X is higher than 5 × 10¹² M⁻¹, and the half-life of the complex is longer than one hour in vitro.
    
    
       Conclusions
       Of all DNA structures described so far with which HMG1 and HMG2 interact, we have found that Form X, a DNA loop with a semicatenated DNA junction at its base, is the structure with the highest affinity by more than 4 orders of magnitude. This suggests that, if similar structures exist in the cell nucleus, one of the functions of these proteins might be linked to the remarkable property of DNA hemicatenanes to associate two distant regions of the genome in a stable but reversible manner.
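
    The binding constants quoted above are association constants (units of M⁻¹). As a rough illustration, not part of the original dataset record, the sketch below converts them to dissociation constants and standard binding free energies. The temperature of 298.15 K and the 1 M standard state are assumptions, and the function names are hypothetical.

```python
import math

R_KCAL = 1.987e-3      # gas constant in kcal/(mol*K)
ASSUMED_T = 298.15     # assumed temperature in kelvin (not stated in the record)


def kd_from_ka(ka_per_molar):
    """Dissociation constant (M) as the reciprocal of the association constant (M^-1)."""
    return 1.0 / ka_per_molar


def binding_free_energy(ka_per_molar, temperature=ASSUMED_T):
    """Standard binding free energy dG = -RT ln(Ka) in kcal/mol, 1 M standard state."""
    return -R_KCAL * temperature * math.log(ka_per_molar)


def orders_of_magnitude(ka_high, ka_low):
    """Base-10 logarithm of the ratio between two association constants."""
    return math.log10(ka_high / ka_low)


# Values quoted in the description: four-way junction range and Form X lower bound.
four_way_junction_range = (1e7, 1e9)
form_x_lower_bound = 5e12

for ka in (*four_way_junction_range, form_x_lower_bound):
    print(f"Ka = {ka:.0e} M^-1, Kd = {kd_from_ka(ka):.1e} M, "
          f"dG = {binding_free_energy(ka):.1f} kcal/mol")
```

    Because the Form X value is a lower bound, the corresponding Kd is an upper bound and the free energy is a bound as well.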
    
  15. ATOM3D: Ligand Binding Affinity (LBA) Dataset

    • zenodo.org
    application/gzip, bin
    Updated Jun 16, 2021
    Raphael J.L. Townshend; Martin Vögele; Patricia Suriana; Alexander Derry; Alexander Powers; Yianni Laloudakis; Sidhika Balachandar; Brandon Anderson; Stephan Eismann; Risi Kondor; Russ B. Altman; Ron O. Dror (2021). ATOM3D: Ligand Binding Affinity (LBA) Dataset [Dataset]. http://doi.org/10.5281/zenodo.4914718
    Available download formats: application/gzip, bin
    Dataset updated: Jun 16, 2021
    Dataset provided by: Zenodo (http://zenodo.org/)
    Authors: Raphael J.L. Townshend; Martin Vögele; Patricia Suriana; Alexander Derry; Alexander Powers; Yianni Laloudakis; Sidhika Balachandar; Brandon Anderson; Stephan Eismann; Risi Kondor; Russ B. Altman; Ron O. Dror
    Description

    Ligand Binding Affinity (LBA) dataset from the ATOM3D project. This upload includes five zipped data directories:

    1. Full, unsplit dataset in LMDB format
    2. Split datasets at 60% sequence identity, with each in LMDB format
    3. Split datasets at 30% sequence identity, with each in LMDB format
    4. Text files containing train, validation, and test indices used to split raw dataset (for both 30% and 60%)
    5. README containing dataset details
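
    The index files in item 4 could be applied to the unsplit data roughly as sketched below. This is only an illustration: it assumes each index file holds whitespace-separated integers, which the listing does not confirm, and the helper names are hypothetical. The dataset's README is the authority on the actual file format.

```python
from pathlib import Path


def read_indices(path):
    """Read whitespace-separated integer indices from a text file (assumed format)."""
    return [int(token) for token in Path(path).read_text().split()]


def select_records(records, indices):
    """Return the records at the given positions, in index order."""
    return [records[i] for i in indices]


def splits_are_disjoint(train, val, test):
    """True if no index appears in more than one split."""
    combined = list(train) + list(val) + list(test)
    return len(set(combined)) == len(combined)


# In-memory illustration with placeholder records instead of real LMDB entries.
records = ["complex_a", "complex_b", "complex_c", "complex_d"]
train_idx, val_idx, test_idx = [0, 2], [1], [3]

if splits_are_disjoint(train_idx, val_idx, test_idx):
    train_set = select_records(records, train_idx)
    print(train_set)
```

    Checking that the splits are disjoint before training is a cheap guard against leakage between the train, validation, and test sets.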
  16. AffinDB

    • neuinfo.org
    Updated Jan 29, 2022
    (2022). AffinDB [Dataset]. http://identifiers.org/RRID:SCR_001690
    Dataset updated: Jan 29, 2022
    Description

    Database of affinity data for protein-ligand complexes of the Protein Data Bank (PDB) providing direct and free access to the experimental affinity of a given complex structure. Affinity data are exclusively obtained from the scientific literature. As of Thursday, May 01st, 2014, AffinDB contains 748 affinity values covering 474 different PDB complexes. More than one affinity value may be associated with a single PDB complex, which is most frequently due to multiple references reporting affinity data for the same complex. AffinDB provides access to data in three different forms:

    - Summary information for PDB entry
    - Affinity information window
    - Tabular reports

  17. Data from: Automated High-Throughput Affinity Capture-Mass Spectrometry...

    • datasetcatalog.nlm.nih.gov
    Updated Jan 27, 2025
    Williams, Jon D.; Kath, James E.; Marin, Violeta L.; Ma, Renze; Banlasan, Adam; Tang, Hua; Torrent, Maricel; Jing, Hui; Senaweera, Sameera; Potts, Gregory K.; Richardson, Paul L.; Patel, Shitalben; McClure, Ryan A. (2025). Automated High-Throughput Affinity Capture-Mass Spectrometry Platform with Data-Independent Acquisition [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001448713
    Dataset updated: Jan 27, 2025
    Authors: Williams, Jon D.; Kath, James E.; Marin, Violeta L.; Ma, Renze; Banlasan, Adam; Tang, Hua; Torrent, Maricel; Jing, Hui; Senaweera, Sameera; Potts, Gregory K.; Richardson, Paul L.; Patel, Shitalben; McClure, Ryan A.
    Description

    Affinity capture (AC) combined with mass spectrometry (MS)-based proteomics is widely used throughout the drug discovery pipeline to determine small-molecule target selectivity and engagement. However, the tedious sample preparation steps and time-consuming MS acquisition process have limited its use in a high-throughput format. Here, we report an automated workflow employing biotinylated probes and streptavidin magnetic beads for small-molecule target enrichment in the 96-well plate format, ending with direct sampling from EvoSep Solid Phase Extraction tips for liquid chromatography (LC)-tandem mass spectrometry (MS/MS) analysis. The streamlined process significantly reduced both the overall and hands-on time needed for sample preparation. Additionally, we developed a data-independent acquisition-mass spectrometry (DIA-MS) method to establish an efficient label-free quantitative chemical proteomic kinome profiling workflow. DIA-MS yielded a coverage of ∼380 kinases, a >60% increase compared to using a data-dependent acquisition (DDA)-MS method, and provided reproducible target profiling of the kinase inhibitor dasatinib. We further showcased the applicability of this AC-MS workflow for assessing the selectivity of two clinical-stage CDK9 inhibitors against ∼250 probe-enriched kinases. Our study provides a roadmap for efficient target engagement and selectivity profiling in native cell or tissue lysates using AC-MS.

  18. Affinity Group Locations Data for United States

    • poidata.io
    csv, json
    Updated Feb 12, 2026
    Business Data Provider (2026). Affinity Group Locations Data for United States [Dataset]. https://poidata.io/brand-report/affinity-group/united-states
    Available download formats: csv, json
    Dataset updated: Feb 12, 2026
    Dataset authored and provided by: Business Data Provider
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/ (license information was derived automatically)
    Time period covered: 2026
    Area covered: United States
    Variables measured: Website URL, Phone Number, Review Count, Business Name, Email Address, Business Hours, Customer Rating, Business Address, Brand Affiliation, Geographic Coordinates
    Description

    Dataset containing 65 verified Affinity Group locations in the United States, with contact information, ratings, reviews, and location data.

  19. Data from: A single-residue affinity scale for DNA-binding using linear...

    • datasetcatalog.nlm.nih.gov
    Updated Nov 21, 2017
    Ahmad, Shandar; Andrabi, Munazah (2017). A single-residue affinity scale for DNA-binding using linear perceptron [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001745785
    Dataset updated: Nov 21, 2017
    Authors: Ahmad, Shandar; Andrabi, Munazah
    Description

    A linear scale to estimate the DNA-binding free energy of amino acid residues is reported. Scales derived exclusively for irregular and helical positions give 76% and 68% classification accuracy, respectively, between stabilizing and destabilizing protein-DNA interactions. The mean absolute error (MAE) in ddG values is 0.786 and 0.883 kcal/mol, respectively. Without using structure information of residues to derive affinity scales, 67.0% of mutations could be correctly classified as stabilizing or destabilizing binding. The MAE and correlation of ddG predictions are 0.953 kcal/mol and 0.385, respectively.

    PRIB 2008 proceedings found at: http://dx.doi.org/10.1007/978-3-540-88436-1
    Contributors: Monash University. Faculty of Information Technology. Gippsland School of Information Technology; Chetty, Madhu; Ahmad, Shandar; Ngom, Alioune; Teng, Shyh Wei; Third IAPR International Conference on Pattern Recognition in Bioinformatics (PRIB) (3rd : 2008 : Melbourne, Australia)
    Rights: Copyright by Third IAPR International Conference on Pattern Recognition in Bioinformatics. All rights reserved.

  20. GEMS: GNN Framework For Efficient Protein-Ligand Binding Affinity Prediction...

    • zenodo.org
    application/gzip, bin +1
    Updated Dec 8, 2024
    Peter Stockinger (2024). GEMS: GNN Framework For Efficient Protein-Ligand Binding Affinity Prediction Through Robust Data Filtering and Language Model Integration [Dataset]. http://doi.org/10.5281/zenodo.14260171
    Available download formats: json, application/gzip, bin
    Dataset updated: Dec 8, 2024
    Dataset provided by: Zenodo (http://zenodo.org/)
    Authors: Peter Stockinger
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/ (license information was derived automatically)

    Description

    For fast reproduction of our results, we provide PyTorch datasets of precomputed interaction graphs for the entire PDBbind database on Zenodo. To enable quick establishment of leakage-free evaluation setups with PDBbind, we also provide pairwise similarity matrices for the entire PDBbind dataset on Zenodo.

Huaqing Liu (2024). PPB-Affinity: Protein-Protein Binding Affinity dataset for AI-based protein drug discovery [Dataset]. http://doi.org/10.5281/zenodo.13054646

Data from: PPB-Affinity: Protein-Protein Binding Affinity dataset for AI-based protein drug discovery

Available download formats: zip, bin
Dataset updated: Jul 27, 2024
Dataset provided by: Zenodo (http://zenodo.org/)
Authors: Huaqing Liu
License: http://www.apache.org/licenses/LICENSE-2.0

Description

Prediction of protein-protein binding (PPB) affinity plays an important role in large-molecular drug discovery. Deep learning (DL) has been adopted to predict the change of PPB binding affinity upon mutation, but there was a scarcity of studies predicting the PPB affinity itself. The major reason is the paucity of open-source datasets concerning PPB affinity. Therefore, the current study aimed to introduce and disclose a PPB affinity dataset (PPB-Affinity), which will definitely benefit the development of applicable DL to predict the PPB affinity. The PPB-Affinity dataset contains key information such as crystal structures of protein-protein complexes (with or without protein mutation patterns), PPB affinity, receptor protein chain, ligand protein chain, etc. To the best of our knowledge, this is the largest publicly available PPB-Affinity dataset, which may finally help the industry in improving the screening efficiency of discovering new large-molecular drugs.

Code for PPB-Affinity database preparation is available at https://github.com/Huatsing-Lau/PPB-Affinity-DataPrepWorkflow.
Code for the benchmark algorithm is available at https://github.com/ChenPy00/PPB-Affinity.

Files are organized as follows:

- PPB-Affinity.xlsx
- samples_deleted.zip
- PDB/
  - Affinity Benchmark v5.5/
    - file1.pdb
    - file2.pdb
    - ...
    - filek.pdb
  - ATLAS/
  - PDBbind v2020/
  - SAbDab/
  - SKEMPIv2.0/
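
After download and extraction, the layout above can be checked with a short script. The sketch below is a minimal illustration, assuming the source folders (Affinity Benchmark v5.5, ATLAS, and so on) sit directly under PDB/ and hold .pdb files at their top level. The function name and the example root path are hypothetical.

```python
from pathlib import Path


def count_pdb_files(root):
    """Map each subfolder of <root>/PDB to the number of .pdb files directly inside it."""
    pdb_dir = Path(root) / "PDB"
    counts = {}
    for source_dir in sorted(p for p in pdb_dir.iterdir() if p.is_dir()):
        counts[source_dir.name] = sum(1 for _ in source_dir.glob("*.pdb"))
    return counts


if __name__ == "__main__":
    # Hypothetical extraction directory; replace with the actual download location.
    root = Path("PPB-Affinity")
    if (root / "PDB").is_dir():
        for source, n_files in count_pdb_files(root).items():
            print(f"{source}: {n_files} structure files")
```

Comparing these counts against the rows of PPB-Affinity.xlsx is one way to confirm that the archive was extracted completely.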
