100+ datasets found
  1. Affinity Dataset

    • kaggle.com
    zip
    Updated Mar 4, 2021
    Cite
    Peeyush Kant Misra (2021). Affinity Dataset [Dataset]. https://www.kaggle.com/datasets/pkmisra/affinity-dataset
    Available download formats: zip (342 bytes)
    Dataset updated
    Mar 4, 2021
    Authors
    Peeyush Kant Misra
    Description

    Dataset

    This dataset was created by Peeyush Kant Misra


  2. AI3 Protein-Ligand Binding Affinity Dataset

    • registry.opendata.aws
    Updated Sep 18, 2025
    Cite
    International Institute of Information Technology Hyderabad (2025). AI3 Protein-Ligand Binding Affinity Dataset [Dataset]. https://registry.opendata.aws/ai3/
    Dataset updated
    Sep 18, 2025
    Dataset provided by
    International Institute of Information Technology Hyderabad
    Description

    The rapid advancement of computing technologies, particularly artificial intelligence (AI), has revolutionized various domains, including drug discovery. Curated datasets are crucial for developing reliable, generalizable, and accurate models for practical applications. Generating experimental data on a large scale is an expensive and arduous process. In domains such as medical diagnostics, where real-life data is hard to obtain, synthetic data has been shown to be extremely valuable. We, teams from IIIT Hyderabad, Intel, AWS, and Insilico Medicine, have performed physics-based calculations (molecular dynamics simulations) on about 20,000 protein-ligand complexes. The dataset comprises molecular dynamics snapshots, binding affinities calculated using the MM-PBSA method, and individual energy components, including electrostatic and van der Waals interactions.

    Dataset file formats: (i) 3D coordinates of the protein-ligand complexes (pdb) in tar.gz files, and (ii) CSV files containing the energy data.

    Dataset usages: (i) ML scoring functions for predicting binding affinities of given protein-ligand complexes, (ii) classification models for predicting correct binding poses of ligands, (iii) identification of cryptic binding pockets, and (iv) optimization of binding features by exploiting the individual components of the energy (experimental data has only the total binding affinity).

    Existing AI/ML training datasets lack dynamic data and are inherently biased, and the binding affinity data in the literature are obtained from different experimental protocols. This dataset was therefore created using the same computational protocol throughout, with free energy calculations based on molecular dynamics (MD) simulations. The dynamic data-enriched protein-ligand coordinates can be used to effectively train convolutional neural network-based regression models for more accurate binding affinity prediction.

  3. PDBbind Protein-Ligand Binding Affinity Dataset

    • kaggle.com
    Updated Mar 24, 2025
    Cite
    Maduka Charles (2025). PDBbind Protein-Ligand Binding Affinity Dataset [Dataset]. https://www.kaggle.com/datasets/madukacharles/pdbbind-protein-ligand-binding-affinity-dataset
    Dataset updated
    Mar 24, 2025
    Dataset provided by
    Kaggle
    Authors
    Maduka Charles
    Description

    This dataset contains raw protein-ligand complexes sourced from PDBbind, along with their experimentally measured binding affinities in -log Kd/pKd units. It serves as a valuable benchmark for training and evaluating molecular docking, scoring functions, and deep learning models for binding affinity prediction.

    The dataset includes:

    • Protein-ligand complexes identified by PDB codes
    • Binding affinity values in -log Kd (pKd)
    • Structural data needed for molecular modeling and machine learning applications
    • A dataset CSV file listing all protein-ligand complexes and their binding affinities (-log Kd/pKd)

    This dataset is not preprocessed, meaning the raw structural files (PDB/MOL2/SDF) are intact and can be featurized using tools like DeepChem for deep learning applications, such as Atomic Convolutional Neural Networks (ACNNs).
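
    The affinities in this entry are given as -log Kd (pKd). As a minimal sketch of that unit, assuming Kd is expressed in mol/L and the logarithm is base 10, the conversion in both directions could look like the following. The function names are illustrative and do not come from the dataset or from DeepChem.

```python
import math


def kd_to_pkd(kd_molar: float) -> float:
    """Convert a dissociation constant in mol/L to pKd = -log10(Kd)."""
    if kd_molar <= 0:
        raise ValueError("Kd must be a positive concentration in mol/L")
    return -math.log10(kd_molar)


def pkd_to_kd(pkd: float) -> float:
    """Invert the transform: Kd (mol/L) = 10 ** (-pKd)."""
    return 10.0 ** (-pkd)


# Illustrative values only (millimolar, micromolar, nanomolar binders).
for kd in (1e-3, 1e-6, 1e-9):
    print(f"Kd = {kd:.0e} M -> pKd = {kd_to_pkd(kd):.2f}")
```

    On this scale a larger pKd means tighter binding, and each unit corresponds to a tenfold change in Kd.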

    References:

    (1) Li, Y.; Liu, Z. H.; Han, L.; Li, J.; Liu, J.; Zhao, Z. X.; Li, C. K.; Wang, R. X. (2014). Comparative Assessment of Scoring Functions on an Updated Benchmark: I. Compilation of the Test Set. J. Chem. Inf. Model. DOI: 10.1021/ci500080q.
    (2) Li, Y.; Han, L.; Liu, Z. H.; Wang, R. X. (2014). Comparative Assessment of Scoring Functions on an Updated Benchmark: II. Evaluation Methods and General Results. J. Chem. Inf. Model. DOI: 10.1021/ci500081m.
    (3) DeepChem: An Open-Source Toolkit for Deep Learning in Drug Discovery, Quantum Chemistry, Materials Science, and Biology. Available at: https://deepchem.io.

  4. Data from: PPI-Affinity: A Web Tool for the Prediction and Optimization of...

    • datasetcatalog.nlm.nih.gov
    • acs.figshare.com
    Updated Jun 2, 2022
    Cite
    Münch, Jan; Mieres-Perez, Joel; Romero-Molina, Sandra; Sanchez-Garcia, Elsa; Ruiz-Blanco, Yasser B.; Ehrmann, Michael; Harms, Mirja (2022). PPI-Affinity: A Web Tool for the Prediction and Optimization of Protein–Peptide and Protein–Protein Binding Affinity [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000438753
    Dataset updated
    Jun 2, 2022
    Authors
    Münch, Jan; Mieres-Perez, Joel; Romero-Molina, Sandra; Sanchez-Garcia, Elsa; Ruiz-Blanco, Yasser B.; Ehrmann, Michael; Harms, Mirja
    Description

    Virtual screening of protein–protein and protein–peptide interactions is a challenging task that directly impacts the processes of hit identification and hit-to-lead optimization in drug design projects involving peptide-based pharmaceuticals. Although several screening tools designed to predict the binding affinity of protein–protein complexes have been proposed, methods specifically developed to predict protein–peptide binding affinity are comparatively scarce. Frequently, predictors trained to score the affinity of small molecules are applied to peptides indiscriminately, despite the greater complexity and heterogeneity of the interactions formed by peptide binders. To address this issue, we introduce PPI-Affinity, a tool that leverages support vector machine (SVM) predictors of binding affinity to screen datasets of protein–protein and protein–peptide complexes, as well as to generate and rank mutants of a given structure. The performance of the SVM models was assessed on four benchmark datasets, which include protein–protein and protein–peptide binding affinity data. In addition, we evaluated our model on a set of mutants of EPI-X4, an endogenous peptide inhibitor of the chemokine receptor CXCR4, and on complexes of the serine proteases HTRA1 and HTRA3 with peptides. PPI-Affinity is freely accessible at https://protdcal.zmb.uni-due.de/PPIAffinity.

  5. Antibody and Nanobody Design Dataset (ANDD)

    • zenodo.org
    zip
    Updated Sep 26, 2025
    Cite
    Yikai Wu (2025). Antibody and Nanobody Design Dataset (ANDD) [Dataset]. http://doi.org/10.5281/zenodo.16894086
    Available download formats: zip
    Dataset updated
    Sep 26, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Yikai Wu
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Title: Antibody and Nanobody Design Dataset (ANDD): A Comprehensive Resource with Sequence, Structure, and Binding Affinity Data

    DOI: 10.5281/zenodo.16894086

    Resource Type: Dataset

    Publisher: Zenodo

    Publication Year: 2025

    License: Creative Commons Attribution 4.0 International (CC BY 4.0)

    Overview (Abstract):

    The Antibody and Nanobody Design Dataset (ANDD) is a unified, large-scale dataset created to overcome the limitations of data fragmentation and incompleteness in antibody and nanobody research. It integrates sequence, structure, antigen information, and binding affinity data from 15 diverse sources, including OAS, PDB, SAbDab, and others. ANDD comprises 48,800 antibody/nanobody sequences, structural data for 25,158 entries, antigen sequences for 12,617 entries, and a total of 9,569 binding affinity values for antibody/nanobody-antigen pairs. A key innovation is the augmentation of experimental affinity data with 5,218 high-quality predictions generated by the ANTIPASTI model. This makes ANDD the largest available dataset of its kind, providing a robust foundation for training and validating deep learning models in therapeutic antibody and nanobody design.

    Keywords: Dataset, Antibody Design, Nanobody Design, VHH, Deep Learning, Protein Engineering, Binding Affinity, Therapeutic Antibodies, Computational Biology

    Methods (Data Curation and Processing):

    The ANDD was constructed through a rigorous multi-step process:

    1. Data Collection: Data was aggregated from 15 primary sources, including both antibody/nanobody-specific databases (e.g., OAS, SAbDab, INDI, sdAb-DB) and general protein databases (e.g., PDB, UniProt, PDBbind).
    2. Integration and Standardization: Data from disparate sources was consolidated into a consistent format, addressing challenges of format inconsistency. Entries were manually validated to exclude non-relevant data (e.g., T-cell receptors).
    3. Affinity Data Augmentation: The ANTIPASTI deep learning model was used to predict and add binding affinity values for entries that had structural data but lacked experimental affinity measurements.
    4. Manual Curation: Web-based data and information from publicly available patents targeting key antigens (HER2, IL-6, CD45, SARS-CoV-2 RBD) were manually extracted to enhance completeness.
    5. Hierarchical Organization: Data is organized in a hierarchical structure, offering four progressively detailed levels: Sequence-only, Sequence+Structure, Sequence+Structure+Antigen, and Sequence+Structure+Antigen+Affinity.

    Data Specifications and Format:

    The dataset is distributed in two parts:

    1. ANDD.csv: A comprehensive spreadsheet containing all annotated metadata for each entry.
    2. All_structures/ folder: A directory containing the corresponding PDB structure files for entries with structural data.

    The ANDD.csv file includes the following key fields (a full description is available in the Data Record section of the paper):

    • General Info: Source, Update_Date, PDB_ID, Experimental_Method, Ab_or_Nano, Source_Organism.
    • Chain Details: Entity IDs, Asym IDs, Database Accession Codes, and Macromolecule Names for Heavy (H) and Light (L) chains.
    • Antigen Details: Ag_Name, Ag_Seq, Ag_Source Organism, and relevant database identifiers.
    • Sequence Data: Full amino acid sequences for H/L chains and individual CDR regions (H1-H3, L1-L3).
    • Affinity Data: Experimentally measured or predicted Affinity_Kd(M), ∆Gbinding(kJ), and the Affinity_Method.
    • Mutation Data: Annotation of any amino acid mutations (Ab/Nano_mutation).

    Technical Validation:

    The quality of ANDD has been ensured through extensive validation:

    1. Manual Curation: A rigorous manual review process was conducted to check for accuracy and consistency between sequence, structure, and affinity data across randomly selected entries.
    2. Affinity Validation with AlphaBind: The experimental Kd values were validated by comparing them against enrichment ratios predicted by the AlphaBind model, showing a significant correlation (Pearson’s r = 0.750).
    3. Cross-Mapping Validation: The internal consistency between Kd and ∆Gbinding values within the dataset was confirmed, showing a perfect correlation (Pearson’s r = 1.000) as per thermodynamic principles.
    4. Proof-of-Concept Application: The dataset's utility was demonstrated by fine-tuning the DiffAb generative model on a subset of ANDD. The fine-tuned model showed significant improvements in generating nanobodies with better predicted binding affinity, structural diversity, and developability metrics.
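
    The cross-mapping check in item 3 follows from the standard thermodynamic relation ∆G = RT ln(Kd) (with Kd relative to a 1 M standard state): at a fixed temperature, ∆G is an exact linear function of ln(Kd), so their Pearson correlation is 1 by construction. The sketch below illustrates this under stated assumptions: the temperature of 298.15 K, the kJ/mol unit, and the Kd values are all illustrative and are not taken from ANDD, and whether the authors ran the check on a log scale is not stated in the description.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # molar gas constant in kJ/(mol*K)
T_KELVIN = 298.15                # assumed temperature; not stated in the source


def delta_g_from_kd(kd_molar: float, temperature: float = T_KELVIN) -> float:
    """Binding free energy in kJ/mol from Kd in mol/L: dG = R * T * ln(Kd)."""
    if kd_molar <= 0:
        raise ValueError("Kd must be a positive concentration in mol/L")
    return R_KJ_PER_MOL_K * temperature * math.log(kd_molar)


def pearson_r(xs: list, ys: list) -> float:
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


# Hypothetical Kd values in mol/L, for illustration only.
kds = [1e-6, 5e-8, 2e-9, 3e-10]
log_kds = [math.log(kd) for kd in kds]
dgs = [delta_g_from_kd(kd) for kd in kds]

print("dG (kJ/mol):", [round(dg, 2) for dg in dgs])
print("Pearson r between ln(Kd) and dG:", pearson_r(log_kds, dgs))
```

    Because the relation is deterministic, a check like this confirms that the two columns were derived consistently from each other; it does not validate the measurements themselves.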

    Potential Uses:

    ANDD is designed to accelerate research in computational biology and drug discovery, including:

    • Training and benchmarking deep learning models for de novo antibody/nanobody sequence and structure generation.
    • Developing and validating predictive models for antibody-antigen binding affinity.
    • Studying structure-function relationships in antibody-antigen interactions.
    • Facilitating the design of optimized therapeutic antibodies and nanobodies with improved specificity and efficacy.

    Access and License:

    The ANDD dataset is publicly available for download under a Creative Commons Attribution 4.0 International (CC BY 4.0) license. Users are free to share and adapt the material for any purpose, even commercially, provided appropriate credit is given to the original authors and this data descriptor is cited.

  6. Data from: SFCscoreRF: A Random Forest-Based Scoring Function for Improved...

    • acs.figshare.com
    zip
    Updated May 31, 2023
    Cite
    David Zilian; Christoph A. Sotriffer (2023). SFCscoreRF: A Random Forest-Based Scoring Function for Improved Affinity Prediction of Protein–Ligand Complexes [Dataset]. http://doi.org/10.1021/ci400120b.s002
    Available download formats: zip
    Dataset updated
    May 31, 2023
    Dataset provided by
    ACS Publications
    Authors
    David Zilian; Christoph A. Sotriffer
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    A major shortcoming of empirical scoring functions for protein–ligand complexes is the low degree of correlation between predicted and experimental binding affinities, as frequently observed not only for large and diverse data sets but also for SAR series of individual targets. Improvements can be envisaged by developing new descriptors, employing larger training sets of higher quality, and resorting to more sophisticated regression methods. Herein, we describe the use of SFCscore descriptors to develop an improved scoring function by means of a PDBbind training set of 1005 complexes in combination with random forest for regression. This provided SFCscoreRF as a new scoring function with significantly improved performance on the PDBbind and CSAR–NRC HiQ benchmarks in comparison to previously developed SFCscore functions. A leave-cluster-out cross-validation and performance in the CSAR 2012 scoring exercise point out remaining limitations but also directions for further improvements of SFCscoreRF and empirical scoring functions in general.

  7. BindingDB

    • dknet.org
    • scicrunch.org
    • +2more
    Updated Jan 29, 2022
    Cite
    (2022). BindingDB [Dataset]. http://identifiers.org/RRID:SCR_000390
    Dataset updated
    Jan 29, 2022
    Description

    Web-accessible database of data extracted from the scientific literature, focusing on proteins that are drug targets or candidate drug targets and for which structural data are present in the Protein Data Bank. The website supports query types including searches by chemical structure, substructure and similarity, protein sequence, ligand and protein names, affinity ranges, and molecular weight. Data sets generated by BindingDB queries can be downloaded in the form of annotated SDfiles for further analysis, or used as the basis for virtual screening of a compound database uploaded by the user. Data are linked to structural data in PDB via PDB IDs and chemical and sequence searches, and to literature in PubMed via PubMed IDs.

  8. Compounds with binding affinity data for human DBP

    • figshare.com
    xls
    Updated Jan 19, 2016
    Cite
    Christine Chichester (2016). Compounds with binding affinity data for human DBP [Dataset]. http://doi.org/10.6084/m9.figshare.1235442.v1
    Available download formats: xls
    Dataset updated
    Jan 19, 2016
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Christine Chichester
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Supplementary data file S4 from the manuscript 'The application of the Open Pharmacological Concepts Triple Store (Open PHACTS) to support Drug Discovery Research' to be published in PLOS ONE

  9. Data from: Spatio-temporal learning from molecular dynamics simulations for...

    • zenodo.org
    zip
    Updated Aug 22, 2025
    Cite
    Pierre-Yves Libouban; Camille Parisel; Maxime Song; Samia Aci-Sèche; Jose-Carlos Gómez-Tamayo; Gary Tresadern; Pascal Bonnet (2025). Spatio-temporal learning from molecular dynamics simulations for protein-ligand binding affinity prediction [Dataset]. http://doi.org/10.5281/zenodo.10390550
    Available download formats: zip
    Dataset updated
    Aug 22, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Pierre-Yves Libouban; Camille Parisel; Maxime Song; Samia Aci-Sèche; Jose-Carlos Gómez-Tamayo; Gary Tresadern; Pascal Bonnet
    License

    https://github.com/DISIC/politique-de-contribution-open-source/blob/master/LICENSE.pdf

    Time period covered
    Jun 6, 2024
    Description

    This Zenodo repository provides comprehensive resources for the paper titled "Spatio-temporal learning from molecular dynamics simulations for protein-ligand binding affinity prediction", published in Bioinformatics. We created a dataset of 63,000 molecular dynamics simulations by performing 10 simulations of 10 ns on each of 6,300 complexes. Neural networks were developed to learn from this data in order to predict the binding affinities of protein-ligand complexes. The implementation of these neural networks is available on GitHub. Our collection includes training/benchmark datasets, trained statistical models, and results on test sets (CSV & PDF files).

    Training/benchmark datasets:

    Training, validation and test sets are provided to train and evaluate the following neural networks:

    • Pafnucy, Proli and Densenucy without MD data augmentation (dataset file names contain "initial")
    • Pafnucy, Proli and Densenucy with MD data augmentation (dataset file names contain "MDDA")
    • Pafnucy with/without MD data augmentation, and Proli and Densenucy with MD data augmentation, were also evaluated on the FEP test set (test set file names contain "fep")
    • Timenucy and Videonucy using spatiotemporal learning methods (dataset file names contain "4D")
    • Pafnucy without MD data augmentation and on a reduced training set (dataset file names contain "reduced")

    For each training methodology (MD data augmentation and spatiotemporal learning), we provide the data for the whole complex, only the ligand or only the protein. Additionally for spatiotemporal learning, we provide the data with only the ligand using the tracking mode.

    Statistical models:

    We provide the models trained with Pafnucy, Proli, Densenucy, Timenucy and Videonucy. Each model was trained in 10 replicates.

    For Pafnucy, Proli, Densenucy, we provide the models trained with random and systematic rotations, as well as with or without MD data augmentation.

    For Proli, Densenucy, Timenucy and Videonucy, we provide the models trained on the whole complex, only the ligand or only the protein.

    For Pafnucy we also provide the models trained on the reduced set (5932 complexes).

    Results on test sets (CSV & PDF files):

    We provide the predictions on the PDBbind v.2016 core set.

    • For spatiotemporal learning methods (Timenucy and Videonucy), there are predictions for only 83 complexes, as we did not perform simulations on the whole test set.
    • For models trained with MD DA, predictions were carried out on the crystallographic structures as well as on the frames extracted from the simulations performed on the test set (augmented test).

    Results on the FEP dataset are also provided for Pafnucy, Proli and Densenucy.

    The raw MD data (~4.5 TB) are stored on the MDDB, where they can be visualized and downloaded.

    This work was performed using HPC resources from GENCI-IDRIS (Grant 2021-A0100712496 & 2022-AD011013521) and CRIANN (Grant 2021002).

  10. Affinity Price Prediction Data

    • coinbase.com
    Updated Nov 8, 2025
    Cite
    (2025). Affinity Price Prediction Data [Dataset]. https://www.coinbase.com/en-ar/price-prediction/safeaffinity
    Dataset updated
    Nov 8, 2025
    Variables measured
    Growth Rate, Predicted Price
    Measurement technique
    User-defined projections based on compound growth. This is not a formal financial forecast.
    Description

    This dataset contains the predicted prices of the asset Affinity over the next 16 years. This data is calculated initially using a default 5 percent annual growth rate, and after page load, it features a sliding scale component where the user can then further adjust the growth rate to their own positive or negative projections. The maximum positive adjustable growth rate is 100 percent, and the minimum adjustable growth rate is -100 percent.
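
    The projection described above is plain compound growth: the price after n years is the starting price multiplied by (1 + rate) to the power n. The sketch below is one way to reproduce that arithmetic, assuming yearly compounding; the starting price of 1.0 and the function name are hypothetical, and the page's actual implementation is not documented in this entry.

```python
def projected_prices(start_price: float, annual_growth: float = 0.05,
                     years: int = 16) -> list:
    """Return the projected price at the end of each year under compound growth.

    annual_growth is a fraction (0.05 means 5 percent) and is limited to the
    range described above, from -100 percent to +100 percent.
    """
    if not -1.0 <= annual_growth <= 1.0:
        raise ValueError("annual_growth must be between -1.0 and 1.0")
    return [start_price * (1.0 + annual_growth) ** year
            for year in range(1, years + 1)]


# Hypothetical starting price of 1.0 with the default 5 percent rate.
for year, price in enumerate(projected_prices(1.0), start=1):
    print(f"year {year:2d}: {price:.4f}")
```

    At the lower bound of -100 percent the projected price is zero from the first year onward, which is why the adjustable range cannot go below that value.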

  11. Affinity Group Locations Data for United States

    • poidata.io
    csv, json
    Updated Dec 3, 2025
    Cite
    Business Data Provider (2025). Affinity Group Locations Data for United States [Dataset]. https://poidata.io/brand-report/affinity-group/united-states
    Available download formats: csv, json
    Dataset updated
    Dec 3, 2025
    Dataset authored and provided by
    Business Data Provider
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    2025
    Area covered
    United States
    Variables measured
    Website URL, Phone Number, Review Count, Business Name, Email Address, Business Hours, Customer Rating, Business Address, Brand Affiliation, Geographic Coordinates
    Description

    Comprehensive dataset containing 65 verified Affinity Group locations in United States with complete contact information, ratings, reviews, and location data.

  12. AffinDB

    • neuinfo.org
    • dknet.org
    • +2more
    Updated Jan 29, 2022
    Cite
    (2022). AffinDB [Dataset]. http://identifiers.org/RRID:SCR_001690
    Dataset updated
    Jan 29, 2022
    Description

    Database of affinity data for protein-ligand complexes of the Protein Data Bank (PDB), providing direct and free access to the experimental affinity of a given complex structure. Affinity data are obtained exclusively from the scientific literature. As of Thursday, May 1, 2014, AffinDB contains 748 affinity values covering 474 different PDB complexes. More than one affinity value may be associated with a single PDB complex, most frequently because multiple references report affinity data for the same complex. AffinDB provides access to data in three different forms: summary information for a PDB entry, an affinity information window, and tabular reports.

  13. Affinity Analysis Platform Market Research Report 2033

    • dataintelo.com
    csv, pdf, pptx
    Updated Oct 1, 2025
    Cite
    Dataintelo (2025). Affinity Analysis Platform Market Research Report 2033 [Dataset]. https://dataintelo.com/report/affinity-analysis-platform-market
    Available download formats: pptx, pdf, csv
    Dataset updated
    Oct 1, 2025
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Affinity Analysis Platform Market Outlook

    According to our latest research, the global affinity analysis platform market size reached USD 1.87 billion in 2024, demonstrating robust momentum across sectors. With a projected CAGR of 13.2% during the forecast period, the market is anticipated to attain a value of USD 5.58 billion by 2033. This impressive growth is primarily attributed to increasing demand for advanced data analytics solutions, rising adoption of AI-driven customer insights, and the ongoing digital transformation across industries. As organizations strive to gain a competitive edge through data-driven decision-making, affinity analysis platforms are rapidly becoming indispensable tools for uncovering actionable patterns and optimizing business strategies.

    A major growth factor propelling the affinity analysis platform market is the exponential increase in data generation from digital channels, IoT devices, and customer interactions. Organizations across retail, BFSI, healthcare, and e-commerce are leveraging affinity analysis to mine relationships and associations within large datasets, enabling them to understand customer behavior, preferences, and trends with unprecedented accuracy. This demand is further amplified by the proliferation of omnichannel strategies, where businesses seek to create seamless and personalized experiences for their customers. As a result, the need for sophisticated analytics tools capable of real-time processing and actionable insights has never been higher, driving continuous innovation and investment in affinity analysis technologies.

    Another significant driver is the integration of artificial intelligence and machine learning algorithms within affinity analysis platforms. These technologies empower organizations to automate complex analytical processes, enhance the accuracy of predictions, and uncover hidden correlations that traditional methods might overlook. The ability to deliver highly targeted marketing campaigns, optimize product recommendations, and detect fraudulent activities in real time has become a key differentiator for businesses. Furthermore, advancements in cloud computing have democratized access to these platforms, allowing even small and medium enterprises to benefit from enterprise-grade analytics without heavy upfront investments in infrastructure.

    The increasing regulatory focus on data privacy and security is also shaping the affinity analysis platform market. As data-driven strategies become central to business operations, organizations are under pressure to comply with stringent regulations such as GDPR, CCPA, and HIPAA. This has led to a surge in demand for platforms that offer robust security features, data governance capabilities, and compliance tools. Vendors are responding by enhancing their offerings with advanced encryption, access controls, and audit trails, thereby building trust and ensuring the responsible use of customer data. This regulatory landscape, while challenging, is also fostering innovation and driving adoption among risk-averse industries like healthcare and finance.

    From a regional perspective, North America continues to dominate the affinity analysis platform market, accounting for the largest share owing to the early adoption of advanced analytics, presence of key technology providers, and high digital maturity of enterprises. However, Asia Pacific is emerging as the fastest-growing region, fueled by rapid digitalization, booming e-commerce, and increasing investments in AI and big data. Europe remains a significant market, driven by stringent data protection regulations and a strong focus on customer-centric business models. Meanwhile, Latin America and the Middle East & Africa are witnessing steady growth, supported by expanding digital infrastructure and rising awareness of the benefits of affinity analysis.

    Component Analysis

    The affinity analysis platform market by component is segmented into software and services, each playing a crucial role in delivering value to end-users. The software segment, which includes analytics engines, visualization tools, and data integration modules, holds the lion’s share of the market. This dominance is attributed to the continuous advancements in analytics algorithms, user-friendly interfaces, and integration capabilities with existing enterprise systems. Organizations are increasingly seeking scalable and customizable software solutions that can handle large vol

  14. GEMS: Resolving Data Bias Improves Generalization in Binding Affinity...

    • zenodo.org
    application/gzip +1
    Updated Sep 24, 2025
    Cite
    Peter Stockinger (2025). GEMS: Resolving Data Bias Improves Generalization in Binding Affinity Prediction [Dataset]. http://doi.org/10.5281/zenodo.17190227
    Available download formats: application/gzip, json
    Dataset updated
    Sep 24, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Peter Stockinger
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Sep 24, 2025
    Description

    For fast reproduction of our results, we provide PyTorch datasets of precomputed interaction graphs for the entire PDBbind database on Zenodo. To enable quick establishment of leakage-free evaluation setups with PDBbind, we also provide pairwise similarity matrices for the entire PDBbind dataset on Zenodo.

    Version 2 - Updated to improve the accuracy of Tanimoto Scores in the pairwise similarity matrices, which also caused minor changes in the composition of PDBbind CleanSplit.

    Version 3 - Including pairwise similarity matrix for sequence identity (from TM-align)
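
    The Tanimoto scores mentioned in the Version 2 note measure ligand similarity as the size of the intersection of two fingerprints divided by the size of their union. The sketch below shows how a pairwise similarity matrix of that kind could be computed, assuming each fingerprint is represented as a set of "on" bit indices. The fingerprints are made up for illustration, and the fingerprint type and tooling actually used for GEMS are not stated in this entry.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity of two fingerprints given as sets of on-bits."""
    union = len(a | b)
    if union == 0:
        # Convention chosen for this sketch: two empty fingerprints score 0.0.
        return 0.0
    return len(a & b) / union


def pairwise_similarity(fingerprints: list) -> list:
    """Return the full symmetric matrix of Tanimoto scores."""
    return [[tanimoto(x, y) for y in fingerprints] for x in fingerprints]


# Hypothetical fingerprints (sets of on-bit indices), for illustration only.
fps = [
    frozenset({1, 2, 3}),
    frozenset({2, 3, 4}),
    frozenset({7, 8}),
]

for row in pairwise_similarity(fps):
    print([round(score, 2) for score in row])
```

    A matrix like this can then be thresholded to keep highly similar complexes out of both the training and test sets, which is the kind of leakage-free split the description refers to.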

  15. Data from: High affinity binding of proteins HMG1 and HMG2 to semicatenated...

    • catalog.data.gov
    • data.virginia.gov
    • +1more
    Updated Sep 7, 2025
    Cite
    National Institutes of Health (2025). High affinity binding of proteins HMG1 and HMG2 to semicatenated DNA loops [Dataset]. https://catalog.data.gov/dataset/high-affinity-binding-of-proteins-hmg1-and-hmg2-to-semicatenated-dna-loops
    Dataset updated
    Sep 7, 2025
    Dataset provided by
    National Institutes of Health
    Description

    Background: Proteins HMG1 and HMG2 are two of the most abundant non-histone proteins in the nucleus of mammalian cells, and contain a domain of homology with many proteins implicated in the control of development, such as the sex-determination factor Sry and the Sox family of proteins. In vitro studies of interactions of HMG1/2 with DNA have shown that these proteins can bind to many unusual DNA structures, in particular to four-way junctions, with binding affinities of 10⁷ to 10⁹ M⁻¹.

    Results: Here we show that HMG1 and HMG2 bind with a much higher affinity, at least 4 orders of magnitude higher, to a new structure, Form X, which consists of a DNA loop closed at its base by a semicatenated DNA junction, forming a DNA hemicatenane. The binding constant of HMG1 to Form X is higher than 5 × 10¹² M⁻¹, and the half-life of the complex is longer than one hour in vitro.

    Conclusions: Of all DNA structures described so far with which HMG1 and HMG2 interact, we have found that Form X, a DNA loop with a semicatenated DNA junction at its base, is the structure with the highest affinity by more than 4 orders of magnitude. This suggests that, if similar structures exist in the cell nucleus, one of the functions of these proteins might be linked to the remarkable property of DNA hemicatenanes to associate two distant regions of the genome in a stable but reversible manner.
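
To put the association constants quoted above (10⁷ to 10⁹ M⁻¹ for four-way junctions, more than 5 × 10¹² M⁻¹ for Form X) on more familiar scales, the sketch below converts them to dissociation constants and standard binding free energies using Kd = 1/Ka and ΔG° = −RT ln Ka. The temperature of 298.15 K and the 1 M standard state are assumptions; the description does not state the experimental conditions.

```python
import math

R_KCAL = 1.987204e-3            # gas constant in kcal/(mol*K)
ASSUMED_TEMPERATURE_K = 298.15  # assumption: not given in the dataset description


def dissociation_constant(ka_per_molar: float) -> float:
    """Kd in mol/L from an association constant Ka in L/mol."""
    return 1.0 / ka_per_molar


def standard_binding_free_energy(
    ka_per_molar: float, temperature_k: float = ASSUMED_TEMPERATURE_K
) -> float:
    """Standard binding free energy in kcal/mol (1 M standard state)."""
    return -R_KCAL * temperature_k * math.log(ka_per_molar)


examples = [
    ("four-way junction, lower end", 1e7),
    ("four-way junction, upper end", 1e9),
    ("Form X, reported lower limit", 5e12),
]
for label, ka in examples:
    kd = dissociation_constant(ka)
    dg = standard_binding_free_energy(ka)
    print(f"{label}: Kd = {kd:.1e} M, dG = {dg:.1f} kcal/mol")
```

Because the Form X value is a lower limit on Ka, the corresponding Kd is an upper limit and the free energy is an upper bound (the true value would be at least as negative).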

  16. Data from: Interformer: An Interaction-Aware Model for Protein-Ligand...

    • data.niaid.nih.gov
    Updated Jan 8, 2025
    Cite
    Lai, Houtim (2025). Interformer: An Interaction-Aware Model for Protein-Ligand Docking and Affinity Prediction [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10828798
    Dataset updated
    Jan 8, 2025
    Dataset provided by
    Tencent (China)
    Authors
    Lai, Houtim
    License

    Apache License 2.0: http://www.apache.org/licenses/LICENSE-2.0

    Description

    The code, dataset, and model weights are described in the paper "Interformer: An Interaction-Aware Model for Protein-Ligand Docking and Affinity Prediction."

    experiment_results.zip: Contains the generated results needed to reproduce the results reported in the paper.

    benchmark.zip: Contains the docking and affinity input data for Interformer. You can use the source code to make predictions and reproduce the numbers reported in the paper.

    checkpoints.zip: Contains one set of weights for the Energy model and four for the PoseScore and Affinity models.

    source_code_1.0.zip: Contains the initial version of the source code.

    interformer_train.tar.gz: Contains prepared training data for Interformer. poses/ contains all structures needed for training, poses/ligand contains the re-docking poses generated by Interformer energy, poses/ligand/rcsb contains the conformations of the reference ligands, poses/pocket contains all pockets extracted from the raw PDB files from RCSB, poses/uff contains all ligand conformations minimized with UFF starting from the reference ligand, and train/ contains the training CSV.

    You can also find the newest version of the source code at https://github.com/tencent-ailab/Interformer

  17. Receptor affinity data for THC.

    • plos.figshare.com
    xls
    Updated Jun 3, 2023
    Cite
    Thomas S. Ray (2023). Receptor affinity data for THC. [Dataset]. http://doi.org/10.1371/journal.pone.0009019.t002
    Available download formats: xls
    Dataset updated
    Jun 3, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Thomas S. Ray
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Receptor affinity data for THC collected from the literature. The columns identify the receptor, the radioligand used to determine affinity, the species and tissue from which the receptor was obtained, the Ki value in nanomolar (nM), and the literature reference from which the data were obtained.

  18. Guaranteed Rate Affinity Locations Data for United States

    • poidata.io
    csv, json
    Updated Oct 31, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Business Data Provider (2025). Guaranteed Rate Affinity Locations Data for United States [Dataset]. https://poidata.io/brand-report/guaranteed-rate-affinity/united-states
    Available download formats: json, csv
    Dataset updated
    Oct 31, 2025
    Dataset authored and provided by
    Business Data Provider
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    2025
    Area covered
    United States
    Variables measured
    Website URL, Phone Number, Review Count, Business Name, Email Address, Business Hours, Customer Rating, Business Address, Brand Affiliation, Geographic Coordinates
    Description

    Comprehensive dataset containing 262 verified Guaranteed Rate Affinity locations in the United States, with contact information, ratings, reviews, and location data.

  19. Affinity Street Cross Street Data in Hoxie, AR

    • ownerly.com
    Updated Dec 9, 2021
    Cite
    Ownerly (2021). Affinity Street Cross Street Data in Hoxie, AR [Dataset]. https://www.ownerly.com/ar/hoxie/affinity-st-home-details
    Dataset updated
    Dec 9, 2021
    Dataset authored and provided by
    Ownerly
    Area covered
    Hoxie, Arkansas, Affinity Street
    Description

    This dataset provides information about the number of properties, residents, and average property values for Affinity Street cross streets in Hoxie, AR.

  20. Experimental Protein-Ligand Affinity Data for CASP16 Pharmaceutical Ligand...

    • zenodo.org
    txt
    Updated Sep 15, 2025
    Cite
    Andreas Tosstorff; Markus G. Rudolph; Bernd Kuhn; Christian Kramer; May Sharpe; Chia-Ying Huang; Alexander Metz; Julien Hazemann; Daniel Ritz; Aengus Mac Sweeney; Joerg Benz; Michael K. Gilson (2025). Experimental Protein-Ligand Affinity Data for CASP16 Pharmaceutical Ligand Challenge [Dataset]. http://doi.org/10.5281/zenodo.16762332
    Available download formats: txt
    Dataset updated
    Sep 15, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Andreas Tosstorff; Markus G. Rudolph; Bernd Kuhn; Christian Kramer; May Sharpe; Chia-Ying Huang; Alexander Metz; Julien Hazemann; Daniel Ritz; Aengus Mac Sweeney; Joerg Benz; Michael K. Gilson
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is the Supporting Information for the experimental data paper associated with the CASP16 pharmaceutical protein-ligand pose- and affinity-prediction challenge. The contents are summarized as follows. The paper's DOI will be added to this Zenodo record once it is available.

    Roche: Semicolon-delimited files with ligand SMILES strings, PDB identifiers, IC50 data (μM) for chymase and ATX, and ligand pKa data, as well as IC50 for cathepsin G, which is similar to chymase but was not used as a CASP16 target.

    Idorsia: Semicolon-delimited files with ligand SMILES strings and PDB identifiers for 3CL/Mpro targets. Table of X-ray data processing statistics (Table S1) and structure refinement statistics (Table S2).

    The SI also includes an inventory of the SI files with data definitions.
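
A semicolon-delimited file of this kind could be read along the following lines. This is only a sketch: the column names (`smiles`, `pdb_id`, `ic50_uM`), the placeholder identifiers, and the sample values are hypothetical, so the SI inventory should be consulted for the actual data definitions. The conversion uses pIC50 = 6 − log10(IC50 in μM).

```python
import csv
import io
import math

# Made-up sample in the assumed layout; not taken from the Zenodo record.
SAMPLE = (
    "smiles;pdb_id;ic50_uM\n"
    "CCO;XXXX;0.5\n"
    "c1ccccc1;YYYY;10\n"
)


def load_affinities(handle):
    """Parse semicolon-delimited rows and add a pIC50 column."""
    records = []
    for row in csv.DictReader(handle, delimiter=";"):
        ic50_um = float(row["ic50_uM"])
        records.append(
            {
                "smiles": row["smiles"],
                "pdb_id": row["pdb_id"],
                "ic50_uM": ic50_um,
                # IC50 in mol/L is ic50_um * 1e-6, so pIC50 = 6 - log10(ic50_um).
                "pIC50": 6.0 - math.log10(ic50_um),
            }
        )
    return records


records = load_affinities(io.StringIO(SAMPLE))
for rec in records:
    print(rec["pdb_id"], round(rec["pIC50"], 2))
```

For a real file, `io.StringIO(SAMPLE)` would be replaced with `open(path, newline="", encoding="utf-8")`. Rows with missing or censored values (for example entries reported as ">100") would need extra handling that is not shown here.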
