20 datasets found

DataSheet1_A Set of Experimentally Validated Decoys for the Human CC...
frontiersin.figshare.com
docx
Updated May 31, 2023
Cite
Matic Proj; Steven De Jonghe; Tom Van Loy; Marko Jukič; Anže Meden; Luka Ciber; Črtomir Podlipnik; Uroš Grošelj; Janez Konc; Dominique Schols; Stanislav Gobec (2023). DataSheet1_A Set of Experimentally Validated Decoys for the Human CC Chemokine Receptor 7 (CCR7) Obtained by Virtual Screening.docx [Dataset]. http://doi.org/10.3389/fphar.2022.855653.s001
Available download formats: docx
Unique identifier
https://doi.org/10.3389/fphar.2022.855653.s001
Dataset updated
May 31, 2023
Dataset provided by
Frontiers
Authors
Matic Proj; Steven De Jonghe; Tom Van Loy; Marko Jukič; Anže Meden; Luka Ciber; Črtomir Podlipnik; Uroš Grošelj; Janez Konc; Dominique Schols; Stanislav Gobec
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
We present a state-of-the-art virtual screening workflow aiming at the identification of novel CC chemokine receptor 7 (CCR7) antagonists. Although CCR7 is associated with a variety of human diseases, such as immunological disorders, inflammatory diseases, and cancer, this target is underexplored in drug discovery and there are no potent and selective CCR7 small molecule antagonists available today. Therefore, computer-aided ligand-based, structure-based, and joint virtual screening campaigns were performed. Hits from these virtual screenings were tested in a CCL19-induced calcium signaling assay. After careful evaluation, none of the in silico hits were confirmed to have an antagonistic effect on CCR7. Hence, we report here a valuable set of 287 inactive compounds that can be used as experimentally validated decoys.
DataSheet2_A Set of Experimentally Validated Decoys for the Human CC...
frontiersin.figshare.com
xlsx
Updated Jun 4, 2023
Cite
Matic Proj; Steven De Jonghe; Tom Van Loy; Marko Jukič; Anže Meden; Luka Ciber; Črtomir Podlipnik; Uroš Grošelj; Janez Konc; Dominique Schols; Stanislav Gobec (2023). DataSheet2_A Set of Experimentally Validated Decoys for the Human CC Chemokine Receptor 7 (CCR7) Obtained by Virtual Screening.xlsx [Dataset]. http://doi.org/10.3389/fphar.2022.855653.s002
Available download formats: xlsx
Unique identifier
https://doi.org/10.3389/fphar.2022.855653.s002
Dataset updated
Jun 4, 2023
Dataset provided by
Frontiers
Authors
Matic Proj; Steven De Jonghe; Tom Van Loy; Marko Jukič; Anže Meden; Luka Ciber; Črtomir Podlipnik; Uroš Grošelj; Janez Konc; Dominique Schols; Stanislav Gobec
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
We present a state-of-the-art virtual screening workflow aiming at the identification of novel CC chemokine receptor 7 (CCR7) antagonists. Although CCR7 is associated with a variety of human diseases, such as immunological disorders, inflammatory diseases, and cancer, this target is underexplored in drug discovery and there are no potent and selective CCR7 small molecule antagonists available today. Therefore, computer-aided ligand-based, structure-based, and joint virtual screening campaigns were performed. Hits from these virtual screenings were tested in a CCL19-induced calcium signaling assay. After careful evaluation, none of the in silico hits were confirmed to have an antagonistic effect on CCR7. Hence, we report here a valuable set of 287 inactive compounds that can be used as experimentally validated decoys.
Data from: Deep Reinforcement Learning Enables Better Bias Control in...
data.niaid.nih.gov
Updated Feb 16, 2024
Cite
Li, Shan (2024). Deep Reinforcement Learning Enables Better Bias Control in Benchmark for Virtual Screening [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7861684
Dataset updated
Feb 16, 2024
Dataset provided by
Wu, Song
Wang, Simon, Xiang
Wang, Dongmei
Li, Shan
Xia, Jie
Shen, Tao
Zhang, Liangren
License
http://www.apache.org/licenses/LICENSE-2.0
Description
This compressed file contains all datasets made for the validation of MUBDsyn.

datasets_int_val: 17 cases in this folder are derived from MUBD for GPCRs. MUBDreal was made by MUBD-DecoyMaker2.0 and MUBDsyn was made by MUBD-DecoyMakersyn.

datasets_ext_val_classical_VS: Five cases in this folder are derived from the shared cases of MUV and DUD-E. The active sets of MUV were taken as the input to make corresponding MUBD datasets. Files in SBVS are raw molecular docking results by smina.

datasets_ext_val_SI_classical_VS: DeepCoy and TocoDecoy were used to make the datasets corresponding to the same five cases above. The data of DeepCoy was directly retrieved from DeepCoy resources at OPIG while topology decoys of TocoDecoy_9W were made based on the scripts provided at TocoDecoy GitHub Repository. Files in SBVS are raw molecular docking results by smina.

datasets_ext_val_ML_VS: Ten cases in this folder are derived from NRLiSt-BDB. Corresponding MUBD datasets were made as described above.

All these datasets can be used for the reproduction of validation performed in the manuscript or to benchmark various virtual screening methods.
Data from: Comparison of Topological, Shape, and Docking Methods in Virtual...
figshare.com
acs.figshare.com
txt
Updated Jun 1, 2023
Cite
Georgia B. McGaughey; Robert P. Sheridan; Christopher I. Bayly; J. Chris Culberson; Constantine Kreatsoulas; Stacey Lindsley; Vladimir Maiorov; Jean-Francois Truchon; Wendy D. Cornell (2023). Comparison of Topological, Shape, and Docking Methods in Virtual Screening [Dataset]. http://doi.org/10.1021/ci700052x.s005
Available download formats: txt
Unique identifier
https://doi.org/10.1021/ci700052x.s005
Dataset updated
Jun 1, 2023
Dataset provided by
ACS Publications
Authors
Georgia B. McGaughey; Robert P. Sheridan; Christopher I. Bayly; J. Chris Culberson; Constantine Kreatsoulas; Stacey Lindsley; Vladimir Maiorov; Jean-Francois Truchon; Wendy D. Cornell
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
Virtual screening benchmarking studies were carried out on 11 targets to evaluate the performance of three commonly used approaches: 2D ligand similarity (Daylight, TOPOSIM), 3D ligand similarity (SQW, ROCS), and protein structure-based docking (FLOG, FRED, Glide). Active and decoy compound sets were assembled from both the MDDR and the Merck compound databases. Averaged over multiple targets, ligand-based methods outperformed docking algorithms. This was true for 3D ligand-based methods only when chemical typing was included. Using mean enrichment factor as a performance metric, Glide appears to be the best docking method among the three with FRED a close second. Results for all virtual screening methods are database dependent and can vary greatly for particular targets.
Data from: Discovery of Novel Inhibitors Targeting the Menin-Mixed Lineage...
acs.figshare.com
xlsx
Updated May 30, 2023
Cite
Yuan Xu; Liyan Yue; Yulan Wang; Jing Xing; Zhifeng Chen; Zhe Shi; Rongfeng Liu; Yu-Chih Liu; Xiaomin Luo; Hualiang Jiang; Kaixian Chen; Cheng Luo; Mingyue Zheng (2023). Discovery of Novel Inhibitors Targeting the Menin-Mixed Lineage Leukemia Interface Using Pharmacophore- and Docking-Based Virtual Screening [Dataset]. http://doi.org/10.1021/acs.jcim.6b00185.s001
Available download formats: xlsx
Unique identifier
https://doi.org/10.1021/acs.jcim.6b00185.s001
Dataset updated
May 30, 2023
Dataset provided by
ACS Publications
Authors
Yuan Xu; Liyan Yue; Yulan Wang; Jing Xing; Zhifeng Chen; Zhe Shi; Rongfeng Liu; Yu-Chih Liu; Xiaomin Luo; Hualiang Jiang; Kaixian Chen; Cheng Luo; Mingyue Zheng
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
Disrupting the interaction between mixed lineage leukemia (MLL) fusion protein and menin provides a therapeutic approach for MLL-mediated leukemia. Here, we aim to discover novel inhibitors targeting the menin-MLL interface with virtual screening. Both structure-based molecular docking and ligand-based pharmacophore models were established, and the models used for compound screening show a remarkable ability to retrieve known active ligands from decoy molecules. Verified by a fluorescence polarization assay, five hits with novel scaffolds were identified. Among them, DCZ_M123 exhibited potent inhibitory activity with an IC50 of 4.71 ± 0.12 μM and a KD of 14.70 ± 2.13 μM, and it can effectively inhibit the human MLL-rearranged leukemia cells MV4;11 and KOPN8 with GI50 values of 0.84 μM and 0.54 μM, respectively.
Representability of algebraic topology for biomolecules in machine learning...
plos.figshare.com
pdf
Updated Jun 3, 2023
Cite
Zixuan Cang; Lin Mu; Guo-Wei Wei (2023). Representability of algebraic topology for biomolecules in machine learning based scoring and virtual screening [Dataset]. http://doi.org/10.1371/journal.pcbi.1005929
Available download formats: pdf
Unique identifier
https://doi.org/10.1371/journal.pcbi.1005929
Dataset updated
Jun 3, 2023
Dataset provided by
PLOS Computational Biology
Authors
Zixuan Cang; Lin Mu; Guo-Wei Wei
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This work introduces a number of algebraic topology approaches, including multi-component persistent homology, multi-level persistent homology, and electrostatic persistence for the representation, characterization, and description of small molecules and biomolecular complexes. In contrast to the conventional persistent homology, multi-component persistent homology retains critical chemical and biological information during the topological simplification of biomolecular geometric complexity. Multi-level persistent homology enables a tailored topological description of inter- and/or intra-molecular interactions of interest. Electrostatic persistence incorporates partial charge information into topological invariants. These topological methods are paired with Wasserstein distance to characterize similarities between molecules and are further integrated with a variety of machine learning algorithms, including k-nearest neighbors, ensemble of trees, and deep convolutional neural networks, to manifest their descriptive and predictive powers for protein-ligand binding analysis and virtual screening of small molecules. Extensive numerical experiments involving 4,414 protein-ligand complexes from the PDBBind database and 128,374 ligand-target and decoy-target pairs in the DUD database are performed to test respectively the scoring power and the discriminatory power of the proposed topological learning strategies. It is demonstrated that the present topological learning outperforms other existing methods in protein-ligand binding affinity prediction and ligand-decoy discrimination.
Data from: Structure-Based Virtual Screening Approach for Discovery of...
figshare.com
txt
Updated May 31, 2023
Cite
Dora Toledo Warshaviak; Gali Golan; Kenneth W. Borrelli; Kai Zhu; Ori Kalid (2023). Structure-Based Virtual Screening Approach for Discovery of Covalently Bound Ligands [Dataset]. http://doi.org/10.1021/ci500175r.s004
Available download formats: txt
Unique identifier
https://doi.org/10.1021/ci500175r.s004
Dataset updated
May 31, 2023
Dataset provided by
ACS Publications
Authors
Dora Toledo Warshaviak; Gali Golan; Kenneth W. Borrelli; Kai Zhu; Ori Kalid
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
We present a fast and effective covalent docking approach suitable for large-scale virtual screening (VS). We applied this method to four targets (HCV NS3 protease, Cathepsin K, EGFR, and XPO1) with known crystal structures and known covalent inhibitors. We implemented a customized “VS mode” of the Schrödinger Covalent Docking algorithm (CovDock), which we refer to as CovDock-VS. Known actives and target-specific sets of decoys were docked to selected X-ray structures, and poses were filtered based on noncovalent protein–ligand interactions known to be important for activity. We were able to retrieve 71%, 72%, and 77% of the known actives for Cathepsin K, HCV NS3 protease, and EGFR within 5% of the decoy library, respectively. With the more challenging XPO1 target, where no specific interactions with the protein could be used for postprocessing of the docking results, we were able to retrieve 95% of the actives within 30% of the decoy library and achieved an early enrichment factor (EF1%) of 33. The poses of the known actives bound to existing crystal structures of 4 targets were predicted with an average RMSD of 1.9 Å. To the best of our knowledge, CovDock-VS is the first fully automated tool for efficient virtual screening of covalent inhibitors. Importantly, CovDock-VS can handle multiple chemical reactions within the same library, only requiring a generic SMARTS-based predefinition of the reaction. CovDock-VS provides a fast and accurate way of differentiating actives from decoys without significantly deteriorating the accuracy of the predicted poses for covalent protein–ligand complexes. Therefore, we propose CovDock-VS as an efficient structure-based virtual screening method for discovery of novel and diverse covalent ligands.
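The early enrichment factor quoted above (EF1%) can be illustrated with a short sketch. It assumes the common definition, the hit rate among the top-ranked fraction of the library divided by the hit rate in the whole library; the function name, parameters, and the toy data are hypothetical and are not taken from the CovDock-VS publication.

```python
def enrichment_factor(scores, labels, fraction=0.01, higher_is_better=True):
    """Enrichment factor at a given top fraction of a ranked library.

    scores: one score per compound.
    labels: 1 for a known active, 0 for a decoy (same order as scores).
    fraction: top fraction of the ranked list to inspect (0.01 for EF1%).
    higher_is_better: set to False for docking scores where lower is better.
    """
    if len(scores) != len(labels) or not scores:
        raise ValueError("scores and labels must be non-empty and equal in length")
    n_total = len(scores)
    n_actives = sum(labels)
    if n_actives == 0:
        raise ValueError("at least one active is required")
    # Size of the inspected subset; always look at one compound or more.
    n_top = max(1, int(round(n_total * fraction)))
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0],
                    reverse=higher_is_better)
    actives_in_top = sum(label for _, label in ranked[:n_top])
    # Hit rate in the top subset relative to the hit rate in the full library.
    return (actives_in_top / n_top) / (n_actives / n_total)


if __name__ == "__main__":
    # Toy library: 100 compounds, 10 actives, 5 of them ranked in the top 10.
    toy_labels = [1] * 5 + [0] * 5 + [1] * 5 + [0] * 85
    toy_scores = list(range(100, 0, -1))
    print(enrichment_factor(toy_scores, toy_labels, fraction=0.10))
```

An enrichment factor of 1 corresponds to random selection, so larger values indicate that actives are concentrated near the top of the ranking.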
Hidden bias in the DUD-E dataset leads to misleading performance of deep...
plos.figshare.com
tiff
Updated May 31, 2023
Cite
Lieyang Chen; Anthony Cruz; Steven Ramsey; Callum J. Dickson; Jose S. Duca; Viktor Hornak; David R. Koes; Tom Kurtzman (2023). Hidden bias in the DUD-E dataset leads to misleading performance of deep learning in structure-based virtual screening [Dataset]. http://doi.org/10.1371/journal.pone.0220113
Available download formats: tiff
Unique identifier
https://doi.org/10.1371/journal.pone.0220113
Dataset updated
May 31, 2023
Dataset provided by
PLOS ONE
Authors
Lieyang Chen; Anthony Cruz; Steven Ramsey; Callum J. Dickson; Jose S. Duca; Viktor Hornak; David R. Koes; Tom Kurtzman
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Recently much effort has been invested in using convolutional neural network (CNN) models trained on 3D structural images of protein-ligand complexes to distinguish binding from non-binding ligands for virtual screening. However, the dearth of reliable protein-ligand x-ray structures and binding affinity data has required the use of constructed datasets for the training and evaluation of CNN molecular recognition models. Here, we outline various sources of bias in one such widely-used dataset, the Directory of Useful Decoys: Enhanced (DUD-E). We have constructed and performed tests to investigate whether CNN models developed using DUD-E are properly learning the underlying physics of molecular recognition, as intended, or are instead learning biases inherent in the dataset itself. We find that superior enrichment efficiency in CNN models can be attributed to the analogue and decoy bias hidden in the DUD-E dataset rather than successful generalization of the pattern of protein-ligand interactions. Comparing additional deep learning models trained on PDBbind datasets, we found that their enrichment performances using DUD-E are not superior to the performance of the docking program AutoDock Vina. Together, these results suggest that biases that could be present in constructed datasets should be thoroughly evaluated before applying them to machine learning based methodology development.
Data from: Best of Both Worlds: On the Complementarity of Ligand-Based and...
acs.figshare.com
xlsx
Updated May 31, 2023
Cite
Fabio Broccatelli; Nathan Brown (2023). Best of Both Worlds: On the Complementarity of Ligand-Based and Structure-Based Virtual Screening [Dataset]. http://doi.org/10.1021/ci5001604.s001
Available download formats: xlsx
Unique identifier
https://doi.org/10.1021/ci5001604.s001
Dataset updated
May 31, 2023
Dataset provided by
ACS Publications
Authors
Fabio Broccatelli; Nathan Brown
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
Virtual screening with docking is an integral component of drug design, particularly during hit finding phases. While successful prospective studies of virtual screening exist, it remains a significant challenge to identify best practices a priori due to the many factors that influence the final outcome, including targets, data sets, software, metrics, and expert knowledge of the users. This study investigates the extent to which ligand-based methods can be applied to improve structure-based methods. The use of ligand-based methods to modulate the number of hits identified using the protein–ligand complex and also the diversity of these hits from the crystallographic ligand is discussed. In this study, 40 CDK2 ligand complexes were used together with two external data sets containing both actives and inactives from GlaxoSmithKline (GSK) and actives and decoys from the Directory of Useful Decoys (DUD). Results show how ligand-based modeling can be used to select a more appropriate protein conformation for docking, as well as to assess the reliability of the docking experiment. The time gained by reducing the pool of virtual screening candidates via ligand-based similarity can be invested in more accurate docking procedures, as well as in downstream labor-intensive approaches (e.g., visual inspection) maximizing the use of the chemical and biological information available. This provides a framework for molecular modeling scientists that are involved in initiating virtual screening campaigns with practical advice to make best use of the information available to them.
Data from: Optimization of Cavity-Based Negative Images to Boost Docking...
figshare.com
zip
Updated Jun 2, 2023
Cite
Sami T. Kurkinen; Jukka V. Lehtonen; Olli T. Pentikäinen; Pekka A. Postila (2023). Optimization of Cavity-Based Negative Images to Boost Docking Enrichment in Virtual Screening [Dataset]. http://doi.org/10.1021/acs.jcim.1c01145.s002
Available download formats: zip
Unique identifier
https://doi.org/10.1021/acs.jcim.1c01145.s002
Dataset updated
Jun 2, 2023
Dataset provided by
ACS Publications
Authors
Sami T. Kurkinen; Jukka V. Lehtonen; Olli T. Pentikäinen; Pekka A. Postila
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
Molecular docking is a key in silico method used routinely in modern drug discovery projects. Although docking provides high-quality ligand binding predictions, it regularly fails to separate the active compounds from the inactive ones. In negative image-based rescoring (R-NiB), the shape/electrostatic potential (ESP) of docking poses is compared to the negative image of the protein’s ligand binding cavity. While R-NiB often improves the docking yield considerably, the cavity-based models do not reach their full potential without expert editing. Accordingly, a greedy search-driven methodology, brute force negative image-based optimization (BR-NiB), is presented for optimizing the models via iterative editing and benchmarking. Thorough and unbiased training, testing and stringent validation with a multitude of drug targets, and alternative docking software show that BR-NiB ensures excellent docking efficacy. BR-NiB can be considered as a new type of shape-focused pharmacophore modeling, where the optimized models contain only the most vital cavity information needed for effectively filtering docked actives from the inactive or decoy compounds. Finally, the BR-NiB code for performing the automated optimization is provided free-of-charge under MIT license via GitHub (https://github.com/jvlehtonen/brutenib) for boosting the success rates of docking-based virtual screening campaigns.
Comparison of Ligand- and Structure-Based Virtual Screening on the DUD Data...
acs.figshare.com
txt
Updated Jun 3, 2023
Cite
Modest von Korff; Joel Freyss; Thomas Sander (2023). Comparison of Ligand- and Structure-Based Virtual Screening on the DUD Data Set [Dataset]. http://doi.org/10.1021/ci800303k.s003
Available download formats: txt
Unique identifier
https://doi.org/10.1021/ci800303k.s003
Dataset updated
Jun 3, 2023
Dataset provided by
ACS Publications
Authors
Modest von Korff; Joel Freyss; Thomas Sander
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
Several in-house developed descriptors and our in-house docking tool ActDock were compared with virtual screening on the data set of useful decoys (DUD). The results were compared with the chemical fingerprint descriptor from ChemAxon and with the docking results of the original DUD publication. The DUD is the first published data set providing active molecules, decoys, and references for crystal structures of ligand-target complexes. The DUD was designed for the purpose of evaluating docking programs. It contains 2950 active compounds against a total of 40 target proteins. Furthermore, for every ligand the data set contains 36 structurally dissimilar decoy compounds with similar physicochemical properties. We extracted the ligands from the target proteins to extend the applicability of the data set to include ligand based virtual screening. From the 40 target proteins, 37 contained ligands that we used as query molecules for virtual screening evaluation. With this data set a large comparison was done between four different chemical fingerprints, a topological pharmacophore descriptor, the Flexophore descriptor, and ActDock. The Actelion docking tool relies on a MM2 forcefield and a pharmacophore point interaction statistic for scoring; the details are described in this publication. In terms of enrichment rates the chemical fingerprint descriptors performed better than the Flexophore and the docking tool. After removing molecules chemically similar to the query molecules the Flexophore descriptor outperformed the chemical descriptors and the topological pharmacophore descriptors. With the similarity matrix calculations used in this study it was shown that the Flexophore is well suited to find new chemical entities via “scaffold hopping”. The Flexophore descriptor can be explored with a Java applet at http://www.cheminformatics.ch in the submenu Tools→Flexophore. Its usage is free of charge and does not require registration.
Data from: AlzyFinder: A Machine-Learning-Driven Platform for Ligand-Based...
figshare.com
xlsx
Updated Dec 10, 2024
Cite
Jessica Valero-Rojas; Camilo Ramírez-Sánchez; Laura Pacheco-Paternina; Paulina Valenzuela-Hormazabal; Fernanda I. Saldivar-González; Paula Santana; Janneth González; Tatiana Gutiérrez-Bunster; Alejandro Valdés-Jiménez; David Ramírez (2024). AlzyFinder: A Machine-Learning-Driven Platform for Ligand-Based Virtual Screening and Network Pharmacology [Dataset]. http://doi.org/10.1021/acs.jcim.4c01481.s001
Available download formats: xlsx
Unique identifier
https://doi.org/10.1021/acs.jcim.4c01481.s001
Dataset updated
Dec 10, 2024
Dataset provided by
ACS Publications
Authors
Jessica Valero-Rojas; Camilo Ramírez-Sánchez; Laura Pacheco-Paternina; Paulina Valenzuela-Hormazabal; Fernanda I. Saldivar-González; Paula Santana; Janneth González; Tatiana Gutiérrez-Bunster; Alejandro Valdés-Jiménez; David Ramírez
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
Alzheimer’s disease (AD), a prevalent neurodegenerative disorder, presents significant challenges in drug development due to its multifactorial nature and complex pathophysiology. The AlzyFinder Platform, introduced in this study, addresses these challenges by providing a comprehensive, free web-based tool for parallel ligand-based virtual screening and network pharmacology, specifically targeting over 85 key proteins implicated in AD. This innovative approach is designed to enhance the identification and analysis of potential multitarget ligands, thereby accelerating the development of effective therapeutic strategies against AD. AlzyFinder Platform incorporates machine learning models to facilitate the ligand-based virtual screening process. These models, built with the XGBoost algorithm and optimized through Optuna, were meticulously trained and validated using robust methodologies to ensure high predictive accuracy. Validation included extensive testing with active, inactive, and decoy molecules, demonstrating the platform’s efficacy in distinguishing active compounds. The models are evaluated based on balanced accuracy, precision, and F1 score metrics. A unique soft-voting ensemble approach is utilized to refine the classification process, integrating the strengths of individual models. This methodological framework enables a comprehensive analysis of interaction data, which is presented in multiple formats such as tables, heat maps, and interactive Ligand–Protein Interaction networks, thus enhancing the visualization and analysis of drug–protein interactions. AlzyFinder was applied to screen five molecules recently reported (and not used to train or validate the ML models) as active compounds against five key AD targets. The platform demonstrated its efficacy by accurately predicting all five molecules as true positives with a probability greater than 0.70. 
This result underscores the platform’s capability in identifying potential therapeutic compounds with high precision. In conclusion, AlzyFinder’s innovative approach extends beyond traditional virtual screening by incorporating network pharmacology analysis, thus providing insights into the systemic actions of drug candidates. This feature allows for the exploration of ligand–protein and protein–protein interactions and their extensions, offering a comprehensive view of potential therapeutic impacts. As the first open-access platform of its kind, AlzyFinder stands as a valuable resource for the AD research community, available at http://www.alzyfinder-platform.udec.cl with supporting data and scripts accessible via GitHub https://github.com/ramirezlab/AlzyFinder.
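The soft-voting idea mentioned in this description can be sketched as follows. This is only an illustration under stated assumptions: the per-model probabilities are made-up numbers, the function name is hypothetical, and the 0.70 cutoff simply mirrors the probability quoted above rather than a documented decision threshold of AlzyFinder.

```python
def soft_vote(probabilities, threshold=0.5):
    """Combine per-model probabilities for the 'active' class by soft voting.

    probabilities: predicted probability of activity from each model.
    threshold: consensus probability at or above which the compound
               is called active (hypothetical default).
    Returns (mean_probability, is_active).
    """
    if not probabilities:
        raise ValueError("at least one model probability is required")
    # Soft voting averages probabilities instead of counting hard class votes.
    mean_probability = sum(probabilities) / len(probabilities)
    return mean_probability, mean_probability >= threshold


if __name__ == "__main__":
    # Three hypothetical models scoring one molecule against one target.
    consensus, is_active = soft_vote([0.9, 0.6, 0.75], threshold=0.70)
    print(consensus, is_active)
```

Averaging probabilities lets a confident model outweigh a marginal one, which a simple majority vote over hard class labels would not capture.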
Data from: Predicting Ligand Binding Modes from Neural Networks Trained on...
acs.figshare.com
zip
Updated Jun 1, 2023
Cite
Vladimir Chupakhin; Gilles Marcou; Igor Baskin; Alexandre Varnek; Didier Rognan (2023). Predicting Ligand Binding Modes from Neural Networks Trained on Protein–Ligand Interaction Fingerprints [Dataset]. http://doi.org/10.1021/ci300200r.s002
Available download formats: zip
Unique identifier
https://doi.org/10.1021/ci300200r.s002
Dataset updated
Jun 1, 2023
Dataset provided by
ACS Publications
Authors
Vladimir Chupakhin; Gilles Marcou; Igor Baskin; Alexandre Varnek; Didier Rognan
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
We herewith present a novel approach to predict protein–ligand binding modes from the single two-dimensional structure of the ligand. Known protein–ligand X-ray structures were converted into binary bit strings encoding protein–ligand interactions. An artificial neural network was then set up to first learn and then predict protein–ligand interaction fingerprints from simple ligand descriptors. Specific models were constructed for three targets (CDK2, p38-α, HSP90-α) and 146 ligands for which protein–ligand X-ray structures are available. These models were able to predict protein–ligand interaction fingerprints and to discriminate important features from minor interactions. Predicted interaction fingerprints were successfully used as descriptors to discriminate true ligands from decoys by virtual screening. In some but not all cases, the predicted interaction fingerprints furthermore enable to efficiently rerank cross-docking poses and prioritize the best possible docking solutions.
Data from: Identification of Potential TMPRSS2 Inhibitors for COVID-19...
figshare.com
txt
Updated Jun 2, 2023
Cite
Rong Yang; Linhua Liu; Dansheng Jiang; Lei Liu; Huili Yang; Hongling Xu; Meirong Qin; Ping Wang; Jiangyong Gu; Yufeng Xing (2023). Identification of Potential TMPRSS2 Inhibitors for COVID-19 Treatment in Chinese Medicine by Computational Approaches and Surface Plasmon Resonance Technology [Dataset]. http://doi.org/10.1021/acs.jcim.2c01643.s002
Available download formats: txt
Unique identifier
https://doi.org/10.1021/acs.jcim.2c01643.s002
Dataset updated
Jun 2, 2023
Dataset provided by
ACS Publications
Authors
Rong Yang; Linhua Liu; Dansheng Jiang; Lei Liu; Huili Yang; Hongling Xu; Meirong Qin; Ping Wang; Jiangyong Gu; Yufeng Xing
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
Background: Coronavirus disease-19 (COVID-19) pneumonia continues to spread in the entire globe with limited medication available. In this study, the active compounds in Chinese medicine (CM) recipes targeting the transmembrane serine protease 2 (TMPRSS2) protein for the treatment of COVID-19 were explored. Methods: The conformational structure of TMPRSS2 protein (TMPS2) was built through homology modeling. A training set covering TMPS2 inhibitors and decoy molecules was docked to TMPS2, and their docking poses were re-scored with scoring schemes. A receiver operating characteristic (ROC) curve was applied to select the best scoring function. Virtual screening of the candidate compounds (CCDs) in the six highly effective CM recipes against TMPS2 was conducted based on the validated docking protocol. The potential CCDs after docking were subject to molecular dynamics (MD) simulations and surface plasmon resonance (SPR) experiment. Results: A training set of 65 molecules were docked with modeled TMPS2 and LigScore2 with the highest area under the curve, AUC, value (0.886) after ROC analysis selected to best differentiate inhibitors from decoys. A total of 421 CCDs in the six recipes were successfully docked into TMPS2, and the top 16 CCDs with LigScore2 higher than the cutoff (4.995) were screened out. MD simulations revealed a stable binding between these CCDs and TMPS2 due to the negative binding free energy. Lastly, SPR experiments validated the direct combination of narirutin, saikosaponin B1, and rutin with TMPS2. Conclusions: Specific active compounds including narirutin, saikosaponin B1, and rutin in CM recipes potentially target and inhibit TMPS2, probably exerting a therapeutic effect on COVID-19.
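Selecting a scoring function by ROC analysis, as described in the Methods above, can be sketched in a few lines. The sketch assumes that higher scores indicate more likely inhibitors and computes the AUC as the probability that a randomly chosen inhibitor outscores a randomly chosen decoy; the function names and the toy scores are hypothetical and do not reproduce the study's data.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5).

    scores: one score per molecule, higher meaning more likely active.
    labels: 1 for a known inhibitor, 0 for a decoy.
    """
    actives = [s for s, y in zip(scores, labels) if y == 1]
    decoys = [s for s, y in zip(scores, labels) if y == 0]
    if not actives or not decoys:
        raise ValueError("both inhibitors and decoys are required")
    wins = 0.0
    for a in actives:
        for d in decoys:
            if a > d:
                wins += 1.0
            elif a == d:
                wins += 0.5
    return wins / (len(actives) * len(decoys))


def best_scoring_function(score_table, labels):
    """Return the scoring function name with the highest AUC, plus all AUCs.

    score_table: dict mapping a scoring function name to its list of scores.
    """
    aucs = {name: roc_auc(scores, labels) for name, scores in score_table.items()}
    best = max(aucs, key=aucs.get)
    return best, aucs


if __name__ == "__main__":
    toy_labels = [1, 1, 0, 0]
    toy_table = {
        "score_a": [0.9, 0.8, 0.2, 0.1],  # separates inhibitors from decoys
        "score_b": [0.9, 0.1, 0.8, 0.2],  # no better than chance
    }
    print(best_scoring_function(toy_table, toy_labels))
```

The pairwise loop is quadratic in the number of molecules, which is acceptable for a training set of the size reported here; a rank-based formula would be preferable for large libraries.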
Data from: Selecting an Optimal Number of Binding Site Waters To Improve...
acs.figshare.com
zip
Updated Jun 2, 2023
Eelke B. Lenselink; Thijs Beuming; Woody Sherman; Herman W. T. van Vlijmen; Adriaan P. IJzerman (2023). Selecting an Optimal Number of Binding Site Waters To Improve Virtual Screening Enrichments Against the Adenosine A2A Receptor [Dataset]. http://doi.org/10.1021/ci5000455.s002
Available download formats: zip
Unique identifier
https://doi.org/10.1021/ci5000455.s002
Dataset updated
Jun 2, 2023
Dataset provided by
ACS Publications
Authors
Eelke B. Lenselink; Thijs Beuming; Woody Sherman; Herman W. T. van Vlijmen; Adriaan P. IJzerman
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
A major challenge in structure-based virtual screening (VS) involves the treatment of explicit water molecules during docking in order to improve the enrichment of active compounds over decoys. Here we have investigated this in the context of the adenosine A2A receptor, where water molecules have previously been shown to be important for achieving high enrichment rates with docking, and where the positions of some binding site waters are known from a high-resolution crystal structure. The effect of these waters (both their presence and orientations) on VS enrichment was assessed using a carefully curated set of 299 high affinity A2A antagonists and 17,337 decoys. We show that including certain crystal waters greatly improves VS enrichment and that optimization of water hydrogen positions is needed in order to achieve the best results. We also show that waters derived from a molecular dynamics simulation, without any knowledge of crystallographic waters, can improve enrichments to a similar degree as the crystallographic waters, which makes this strategy applicable to structures without experimental knowledge of water positions. Finally, we used decision trees to select an ensemble of structures with different water molecule positions and orientations that outperforms any single structure with water molecules. The approach presented here is validated against independent test sets of A2A receptor antagonists and decoys from the literature. In general, this water optimization strategy could be applied to any target with water-mediated protein–ligand interactions.
Data Sheet 1_i-DENV: development of QSAR based regression models for...
figshare.com
docx
Updated Jun 26, 2025
Sakshi Gautam; Anamika Thakur; Manoj Kumar (2025). Data Sheet 1_i-DENV: development of QSAR based regression models for predicting inhibitors targeting non-structural (NS) proteins of dengue virus.docx [Dataset]. http://doi.org/10.3389/fphar.2025.1605722.s002
Available download formats: docx
Unique identifier
https://doi.org/10.3389/fphar.2025.1605722.s002
Dataset updated
Jun 26, 2025
Dataset provided by
Frontiers
Authors
Sakshi Gautam; Anamika Thakur; Manoj Kumar
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Introduction: Dengue virus (DENV) is a significant global arboviral threat with fatal potential, currently lacking effective antiviral treatments or a universally applicable vaccine. In response to this unmet need, we developed the “i‐DENV” web server to facilitate structure‐based drug prediction targeting key viral proteins. Methods: The i‐DENV platform focuses on the NS3 protease and NS5 polymerase of DENV using machine learning techniques (MLTs) and quantitative structure‐activity relationship (QSAR) modeling. A total of 1213 and 157 unique compounds, along with their IC50 values targeting NS3 and NS5 respectively, were retrieved from the ChEMBL and DenvInD databases. Molecular descriptors and fingerprints were computed and used to train multiple regression‐based MLTs, including SVM, RF, kNN, ANN, XGBoost, and DNN, with ten‐fold cross‐validation. Results: The best-performing SVM and ANN models achieved Pearson correlation coefficients (PCCs) of 0.857/0.862 (NS3) and 0.982/0.964 (NS5) on training/testing sets, and 0.870/0.894 (NS3) and 0.970/0.977 (NS5) on independent validation sets. Model robustness was supported through scatter plots, chemical clustering, statistical analyses, a decoy set, etc. Virtual screening identified Micafungin, Oritavancin, and Iodixanol as top hits for NS2B/NS3 protease, and Cangrelor, Eravacycline, and Baloxavir marboxil for NS5 polymerase. Molecular docking further confirmed strong binding affinities of these compounds. Discussion: Our in silico findings suggest these repurposed drugs as promising antiviral candidates against DENV. However, further in vitro and in vivo studies are essential to validate their therapeutic potential. The i-DENV web server is freely accessible at http://bioinfo.imtech.res.in/manojk/idenv/, offering a structure-specific drug prediction platform for DENV research and antiviral drug discovery.
Table 1_i-DENV: development of QSAR based regression models for predicting...
frontiersin.figshare.com
xlsx
Updated Jun 26, 2025
Sakshi Gautam; Anamika Thakur; Manoj Kumar (2025). Table 1_i-DENV: development of QSAR based regression models for predicting inhibitors targeting non-structural (NS) proteins of dengue virus.xlsx [Dataset]. http://doi.org/10.3389/fphar.2025.1605722.s001
Available download formats: xlsx
Unique identifier
https://doi.org/10.3389/fphar.2025.1605722.s001
Dataset updated
Jun 26, 2025
Dataset provided by
Frontiers
Authors
Sakshi Gautam; Anamika Thakur; Manoj Kumar
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Introduction: Dengue virus (DENV) is a significant global arboviral threat with fatal potential, currently lacking effective antiviral treatments or a universally applicable vaccine. In response to this unmet need, we developed the “i‐DENV” web server to facilitate structure‐based drug prediction targeting key viral proteins. Methods: The i‐DENV platform focuses on the NS3 protease and NS5 polymerase of DENV using machine learning techniques (MLTs) and quantitative structure‐activity relationship (QSAR) modeling. A total of 1213 and 157 unique compounds, along with their IC50 values targeting NS3 and NS5 respectively, were retrieved from the ChEMBL and DenvInD databases. Molecular descriptors and fingerprints were computed and used to train multiple regression‐based MLTs, including SVM, RF, kNN, ANN, XGBoost, and DNN, with ten‐fold cross‐validation. Results: The best-performing SVM and ANN models achieved Pearson correlation coefficients (PCCs) of 0.857/0.862 (NS3) and 0.982/0.964 (NS5) on training/testing sets, and 0.870/0.894 (NS3) and 0.970/0.977 (NS5) on independent validation sets. Model robustness was supported through scatter plots, chemical clustering, statistical analyses, a decoy set, etc. Virtual screening identified Micafungin, Oritavancin, and Iodixanol as top hits for NS2B/NS3 protease, and Cangrelor, Eravacycline, and Baloxavir marboxil for NS5 polymerase. Molecular docking further confirmed strong binding affinities of these compounds. Discussion: Our in silico findings suggest these repurposed drugs as promising antiviral candidates against DENV. However, further in vitro and in vivo studies are essential to validate their therapeutic potential. The i-DENV web server is freely accessible at http://bioinfo.imtech.res.in/manojk/idenv/, offering a structure-specific drug prediction platform for DENV research and antiviral drug discovery.
Data from: Homology Modeling of Human Muscarinic Acetylcholine Receptors
acs.figshare.com
figshare.com
txt
Updated Jun 1, 2023
Trayder Thomas; Kimberley C. McLean; Fiona M. McRobb; David T. Manallack; David K. Chalmers; Elizabeth Yuriev (2023). Homology Modeling of Human Muscarinic Acetylcholine Receptors [Dataset]. http://doi.org/10.1021/ci400502u.s002
Available download formats: txt
Unique identifier
https://doi.org/10.1021/ci400502u.s002
Dataset updated
Jun 1, 2023
Dataset provided by
ACS Publications
Authors
Trayder Thomas; Kimberley C. McLean; Fiona M. McRobb; David T. Manallack; David K. Chalmers; Elizabeth Yuriev
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
We have developed homology models of the acetylcholine muscarinic receptors M1R–M5R, based on the β2-adrenergic receptor crystal structure as the template. This is the first report of homology modeling of all five subtypes of acetylcholine muscarinic receptors with binding sites optimized for ligand binding. The models were evaluated for their ability to discriminate between muscarinic antagonists and decoy compounds in virtual screening, using enrichment factors, area under the ROC curve (AUC), and an early enrichment measure, LogAUC. The models produce rational binding modes of docked ligands as well as good enrichment capacity when tested against property-matched decoy libraries, which demonstrates their unbiased predictive ability. To test the relative effects of homology model template selection and the binding site optimization procedure, we generated and evaluated a naïve M2R model, using the M3R crystal structure as a template. Our results confirm previous findings that binding site optimization using ligand(s) active at a particular receptor, i.e. including functional knowledge in the model building process, has a more pronounced effect on model quality than target–template sequence similarity. The optimized M1R–M5R homology models are made available as part of the Supporting Information to allow researchers to use these structures, compare them to their own results, and thus advance the development of better modeling approaches.
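The enrichment factor named in this abstract is commonly defined as the hit rate among the top-ranked fraction of a screened library divided by the hit rate in the whole library. The following is a minimal sketch of that common definition, not the authors' evaluation code: the function name, the toy data, and the assumption that higher scores rank first are all illustrative.

```python
def enrichment_factor(scores, is_active, fraction=0.1):
    """Enrichment factor at the given top fraction of a ranked library.

    scores    : one score per compound (assumed: higher = ranked earlier)
    is_active : one boolean per compound, True for known actives
    fraction  : top fraction of the ranked list to inspect
    """
    ranked = sorted(zip(scores, is_active), key=lambda pair: pair[0], reverse=True)
    n_top = max(1, int(round(fraction * len(ranked))))
    hits_top = sum(1 for _, active in ranked[:n_top] if active)
    total_actives = sum(1 for active in is_active if active)
    # Hit rate in the top slice relative to the hit rate in the full library.
    return (hits_top / n_top) / (total_actives / len(ranked))


# Made-up library of 10 compounds with 2 actives (not data from the study).
toy_scores = [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
toy_labels = [True, False, False, True, False, False, False, False, False, False]

print(enrichment_factor(toy_scores, toy_labels, fraction=0.2))
```

An enrichment factor of 1 corresponds to random selection. If the docking score in use is energy-like, so that lower is better, negate the scores before ranking.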
Table_1_An Improved Receptor-Based Pharmacophore Generation Algorithm Guided...
figshare.com
docx
Updated Dec 17, 2018
Gaoqi He; Bojie Gong; Jianqiang Li; Yiping Song; Shiliang Li; Xingjian Lu (2018). Table_1_An Improved Receptor-Based Pharmacophore Generation Algorithm Guided by Atomic Chemical Characteristics and Hybridization Types.DOCX [Dataset]. http://doi.org/10.3389/fphar.2018.01463.s001
Available download formats: docx
Unique identifier
https://doi.org/10.3389/fphar.2018.01463.s001
Dataset updated
Dec 17, 2018
Dataset provided by
Frontiers
Authors
Gaoqi He; Bojie Gong; Jianqiang Li; Yiping Song; Shiliang Li; Xingjian Lu
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Pharmacophore-based virtual screening is an important and leading compound discovery method. However, current pharmacophore generation algorithms suffer from difficulties such as ligand-dependent computation and large numbers of extracted chemical features. On the basis of the features extracted by the five probes in Pocket v.3, this paper presents an improved receptor-based pharmacophore generation algorithm guided by atomic chemical characteristics and hybridization types. The algorithm works under the constraints of receptor atom hybridization types and spatial distance. Four chemical characteristics (H-A, H-D, and positive and negative charges) were extracted using the hybridization type of receptor atoms, and the feature point sets were merged with 3 Å space constraints. Furthermore, on the basis of the original extraction of hydrophobic characteristics, extraction of aromatic ring chemical characteristics was achieved by counting the number of aromatics, searching for residual base aromatic ring, and determining the direction of aromatic rings. Accordingly, extraction of six kinds of chemical characteristics of the pharmacophore was achieved. In view of the pharmacophore characteristics, our algorithm was compared with the existing LigandScout algorithm. The results demonstrate that the pharmacophore possessing six chemical characteristics can be characterized using our algorithm, which features fewer pharmacophore characteristics and is ligand independent. Computations on many instances from the Directory of Useful Decoys (DUD) dataset show that active molecules and decoy molecules can be effectively differentiated by the presented method.
Table_2_An Improved Receptor-Based Pharmacophore Generation Algorithm Guided...
frontiersin.figshare.com
xlsx
Updated Jun 1, 2023
Gaoqi He; Bojie Gong; Jianqiang Li; Yiping Song; Shiliang Li; Xingjian Lu (2023). Table_2_An Improved Receptor-Based Pharmacophore Generation Algorithm Guided by Atomic Chemical Characteristics and Hybridization Types.XLSX [Dataset]. http://doi.org/10.3389/fphar.2018.01463.s002
Available download formats: xlsx
Unique identifier
https://doi.org/10.3389/fphar.2018.01463.s002
Dataset updated
Jun 1, 2023
Dataset provided by
Frontiers
Authors
Gaoqi He; Bojie Gong; Jianqiang Li; Yiping Song; Shiliang Li; Xingjian Lu
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Pharmacophore-based virtual screening is an important and leading compound discovery method. However, current pharmacophore generation algorithms suffer from difficulties such as ligand-dependent computation and large numbers of extracted chemical features. On the basis of the features extracted by the five probes in Pocket v.3, this paper presents an improved receptor-based pharmacophore generation algorithm guided by atomic chemical characteristics and hybridization types. The algorithm works under the constraints of receptor atom hybridization types and spatial distance. Four chemical characteristics (H-A, H-D, and positive and negative charges) were extracted using the hybridization type of receptor atoms, and the feature point sets were merged with 3 Å space constraints. Furthermore, on the basis of the original extraction of hydrophobic characteristics, extraction of aromatic ring chemical characteristics was achieved by counting the number of aromatics, searching for residual base aromatic ring, and determining the direction of aromatic rings. Accordingly, extraction of six kinds of chemical characteristics of the pharmacophore was achieved. In view of the pharmacophore characteristics, our algorithm was compared with the existing LigandScout algorithm. The results demonstrate that the pharmacophore possessing six chemical characteristics can be characterized using our algorithm, which features fewer pharmacophore characteristics and is ligand independent. Computations on many instances from the Directory of Useful Decoys (DUD) dataset show that active molecules and decoy molecules can be effectively differentiated by the presented method.