33 datasets found

Hidden bias in the DUD-E dataset leads to misleading performance of deep...
plos.figshare.com
tiff
Updated May 31, 2023
Lieyang Chen; Anthony Cruz; Steven Ramsey; Callum J. Dickson; Jose S. Duca; Viktor Hornak; David R. Koes; Tom Kurtzman (2023). Hidden bias in the DUD-E dataset leads to misleading performance of deep learning in structure-based virtual screening [Dataset]. http://doi.org/10.1371/journal.pone.0220113
Available download formats: tiff
Unique identifier
https://doi.org/10.1371/journal.pone.0220113
Dataset updated
May 31, 2023
Dataset provided by
PLOS ONE
Authors
Lieyang Chen; Anthony Cruz; Steven Ramsey; Callum J. Dickson; Jose S. Duca; Viktor Hornak; David R. Koes; Tom Kurtzman
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Recently much effort has been invested in using convolutional neural network (CNN) models trained on 3D structural images of protein-ligand complexes to distinguish binding from non-binding ligands for virtual screening. However, the dearth of reliable protein-ligand x-ray structures and binding affinity data has required the use of constructed datasets for the training and evaluation of CNN molecular recognition models. Here, we outline various sources of bias in one such widely-used dataset, the Directory of Useful Decoys: Enhanced (DUD-E). We have constructed and performed tests to investigate whether CNN models developed using DUD-E are properly learning the underlying physics of molecular recognition, as intended, or are instead learning biases inherent in the dataset itself. We find that superior enrichment efficiency in CNN models can be attributed to the analogue and decoy bias hidden in the DUD-E dataset rather than successful generalization of the pattern of protein-ligand interactions. Comparing additional deep learning models trained on PDBbind datasets, we found that their enrichment performances using DUD-E are not superior to the performance of the docking program AutoDock Vina. Together, these results suggest that biases that could be present in constructed datasets should be thoroughly evaluated before applying them to machine learning based methodology development.
Summary of Vina, Gnina and Pafnucy performance on DUD-E targets.
plos.figshare.com
xls
Updated Jun 3, 2023
Lieyang Chen; Anthony Cruz; Steven Ramsey; Callum J. Dickson; Jose S. Duca; Viktor Hornak; David R. Koes; Tom Kurtzman (2023). Summary of Vina, Gnina and Pafnucy performance on DUD-E targets. [Dataset]. http://doi.org/10.1371/journal.pone.0220113.t004
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0220113.t004
Dataset updated
Jun 3, 2023
Dataset provided by
PLOS (http://plos.org/)
Authors
Lieyang Chen; Anthony Cruz; Steven Ramsey; Callum J. Dickson; Jose S. Duca; Viktor Hornak; David R. Koes; Tom Kurtzman
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Summary of Vina, Gnina and Pafnucy performance on DUD-E targets.
Data from: Property-Unmatched Decoys in Docking Benchmarks
figshare.com
acs.figshare.com
xlsx
Updated Jun 2, 2023
Reed M. Stein; Ying Yang; Trent E. Balius; Matt J. O’Meara; Jiankun Lyu; Jennifer Young; Khanh Tang; Brian K. Shoichet; John J. Irwin (2023). Property-Unmatched Decoys in Docking Benchmarks [Dataset]. http://doi.org/10.1021/acs.jcim.0c00598.s003
Available download formats: xlsx
Unique identifier
https://doi.org/10.1021/acs.jcim.0c00598.s003
Dataset updated
Jun 2, 2023
Dataset provided by
ACS Publications
Authors
Reed M. Stein; Ying Yang; Trent E. Balius; Matt J. O’Meara; Jiankun Lyu; Jennifer Young; Khanh Tang; Brian K. Shoichet; John J. Irwin
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
Enrichment of ligands versus property-matched decoys is widely used to test and optimize docking library screens. However, the unconstrained optimization of enrichment alone can mislead, leading to false confidence in prospective performance. This can arise by over-optimizing for enrichment against property-matched decoys, without considering the full spectrum of molecules to be found in a true large library screen. Adding decoys representing charge extrema helps mitigate over-optimizing for electrostatic interactions. Adding decoys that represent the overall characteristics of the library to be docked allows one to sample molecules not represented by ligands and property-matched decoys but that one will encounter in a prospective screen. An optimized version of the DUD-E set (DUDE-Z), as well as Extrema and sets representing broad features of the library (Goldilocks), is developed here. We also explore the variability that one can encounter in enrichment calculations and how that can temper one’s confidence in small enrichment differences. The new tools and new decoy sets are freely available at http://tldr.docking.org and http://dudez.docking.org.
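The point about variability in enrichment calculations can be illustrated with a small bootstrap. The sketch below is not the procedure used by the authors of this dataset: it assumes plain ROC AUC as a stand-in enrichment metric, resamples actives and decoys separately with replacement, and reports a percentile interval. The function names `roc_auc` and `bootstrap_interval` and the toy scores are hypothetical.

```python
import random


def roc_auc(scores, labels):
    """ROC AUC as the probability that an active outscores a decoy (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one active and one decoy")
    # Pairwise comparison is O(P * N); fine for small illustrative sets.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_interval(scores, labels, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for ROC AUC, resampling each class separately."""
    rng = random.Random(seed)
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    stats = []
    for _ in range(n_boot):
        p = rng.choices(pos, k=len(pos))  # resample actives with replacement
        n = rng.choices(neg, k=len(neg))  # resample decoys with replacement
        stats.append(roc_auc(p + n, [1] * len(p) + [0] * len(n)))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


# Toy data (made up): higher score means "predicted more likely to bind".
toy_scores = [0.9, 0.7, 0.65, 0.6, 0.5, 0.4, 0.35, 0.3, 0.2, 0.1]
toy_labels = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]
point = roc_auc(toy_scores, toy_labels)
low, high = bootstrap_interval(toy_scores, toy_labels)
print(f"AUC = {point:.3f}, 95% bootstrap interval = [{low:.3f}, {high:.3f}]")
```

When the intervals of two methods overlap substantially, a small difference in the point estimate is weak evidence that one method enriches better than the other.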
PIGNet2: A versatile deep learning-based protein-ligand interaction...
zenodo.org
data.niaid.nih.gov
xz
Updated Jul 18, 2023
Moon Seokhyun; Hwang Sang-Yeon; Lim Jaechang; Kim Woo Youn (2023). PIGNet2: A versatile deep learning-based protein-ligand interaction prediction model for accurate binding affinity scoring and virtual screening [Dataset]. http://doi.org/10.5281/zenodo.8091220
Available download formats: xz
Unique identifier
https://doi.org/10.5281/zenodo.8091220
Dataset updated
Jul 18, 2023
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Moon Seokhyun; Hwang Sang-Yeon; Lim Jaechang; Kim Woo Youn
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Training and test datasets of the paper "Improving the versatility of deep learning-based protein-ligand interaction prediction for accurate binding affinity scoring and virtual screening".
LIT-PCBA(ESR1_ant) Dataset
paperswithcode.com
Yuquan Li; Chang-Yu Hsieh; Ruiqiang Lu; Xiaoqing Gong; Xiaorui Wang; Pengyong Li; Shuo Liu; Yanan Tian; Dejun Jiang; Jiaxian Yan; Qifeng Bai; Huanxiang Liu; Shengyu Zhang; Xiaojun Yao, LIT-PCBA(ESR1_ant) Dataset [Dataset]. https://paperswithcode.com/dataset/lit-pcba-esr1-ant
Authors
Yuquan Li; Chang-Yu Hsieh; Ruiqiang Lu; Xiaoqing Gong; Xiaorui Wang; Pengyong Li; Shuo Liu; Yanan Tian; Dejun Jiang; Jiaxian Yan; Qifeng Bai; Huanxiang Liu; Shengyu Zhang; Xiaojun Yao
Description
Comparative evaluation of virtual screening methods requires a rigorous benchmarking procedure on diverse, realistic, and unbiased data sets. Recent investigations from numerous research groups unambiguously demonstrate that artificially constructed ligand sets classically used by the community (e.g., DUD, DUD-E, MUV) are unfortunately biased by both obvious and hidden chemical biases, therefore overestimating the true accuracy of virtual screening methods. We herewith present a novel data set (LIT-PCBA) specifically designed for virtual screening and machine learning. LIT-PCBA relies on 149 dose–response PubChem bioassays that were additionally processed to remove false positives and assay artifacts and keep active and inactive compounds within similar molecular property ranges. To ascertain that the data set is suited to both ligand-based and structure-based virtual screening, target sets were restricted to single protein targets for which at least one X-ray structure is available in complex with ligands of the same phenotype (e.g., inhibitor, inverse agonist) as that of the PubChem active compounds. Preliminary virtual screening on the 21 remaining target sets with state-of-the-art orthogonal methods (2D fingerprint similarity, 3D shape similarity, molecular docking) enabled us to select 15 target sets for which at least one of the three screening methods is able to enrich the top 1%-ranked compounds in true actives by at least a factor of 2. The corresponding ligand sets (training, validation) were finally unbiased by the recently described asymmetric validation embedding (AVE) procedure to afford the LIT-PCBA data set, consisting of 15 targets and 7844 confirmed active and 407,381 confirmed inactive compounds. The data set mimics experimental screening decks in terms of hit rate (ratio of active to inactive compounds) and potency distribution. 
It is available online at http://drugdesign.unistra.fr/LIT-PCBA for download and for benchmarking novel virtual screening methods, notably those relying on machine learning.
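The selection criterion quoted above (enriching the top 1% of ranked compounds in true actives by at least a factor of 2) and the hit rate implied by the reported compound counts can both be worked through in a few lines. This is a minimal sketch under stated assumptions: `enrichment_factor` is a hypothetical helper using the common definition of the enrichment factor (active fraction in the top slice divided by active fraction in the whole set), the toy ranking is made up, and only the two compound counts come from the description.

```python
def enrichment_factor(scores, labels, fraction=0.01):
    """Enrichment factor in the top `fraction` of a ranking (higher score = better)."""
    if len(scores) != len(labels) or not scores:
        raise ValueError("scores and labels must be non-empty and of equal length")
    n_total = len(scores)
    n_actives = sum(labels)
    if n_actives == 0:
        raise ValueError("need at least one active compound")
    n_top = max(1, round(n_total * fraction))  # size of the top-ranked slice
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits_top = sum(label for _, label in ranked[:n_top])
    # Active rate in the top slice relative to the active rate overall.
    return (hits_top / n_top) / (n_actives / n_total)


# Hit rate implied by the counts quoted in the description.
actives, inactives = 7844, 407381
hit_rate = actives / (actives + inactives)
print(f"hit rate = {hit_rate:.2%}")

# Toy ranking (made up): 200 compounds, 4 actives, one of them ranked first.
toy_scores = [200 - i for i in range(200)]
toy_labels = [1 if i in (0, 50, 100, 150) else 0 for i in range(200)]
ef_top1 = enrichment_factor(toy_scores, toy_labels, fraction=0.01)
print(f"EF at 1% = {ef_top1:.1f}, passes factor-of-2 threshold: {ef_top1 >= 2}")
```

With a hit rate below 2%, a random ranking would place very few actives in the top 1%, which is why an enrichment factor of 2 is a meaningful threshold here.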
DockM8_Benchmarking_results
zenodo.org
data.niaid.nih.gov
txt, zip
Updated Jul 22, 2024
Antoine Lacour; Andrea Volkamer; Anna Hirsch; Hamza Ibrahim (2024). DockM8_Benchmarking_results [Dataset]. http://doi.org/10.5281/zenodo.11191685
Available download formats: zip, txt
Unique identifier
https://doi.org/10.5281/zenodo.11191685
Dataset updated
Jul 22, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Antoine Lacour; Andrea Volkamer; Anna Hirsch; Hamza Ibrahim
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
May 14, 2024
Description
The repository contains the benchmarking data obtained alongside the first version of DockM8.

The file structure is explained in DockM8_v1_file_structure_explanation.txt

We hope this data is useful for benchmarking scoring functions and machine learning models, as well as being a large repository of pre-docked poses using a variety of algorithms.
Associated Data: RASPD+: Fast protein-ligand binding free energy prediction...
data.niaid.nih.gov
zenodo.org
Updated Dec 7, 2020
Mukherjee, Goutam (2020). Associated Data: RASPD+: Fast protein-ligand binding free energy prediction using simplified physicochemical features [Dataset]. https://data.niaid.nih.gov/resources?id=ZENODO_3937425
Dataset updated
Dec 7, 2020
Dataset provided by
Mukherjee, Goutam
Adam, Lukas
Jayaram, B
Holderbach, Stefan
Wade, Rebecca
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Additional digital data to "RASPD+: Fast protein-ligand binding free energy prediction using simplified physicochemical features" (ChemRxiv preprint: https://doi.org/10.26434/chemrxiv.12636704).

Associated code can be found at: https://github.com/HITS-MCM/RASPDplus

Files:

weights.tar.gz: contains the model weights of one random dataset split and its associated crossvalidation folds. Used for standard RASPD+ evaluation.

additional_model_replicates.tar.gz: contains the remaining models trained on the full set of descriptors.

external_test_sets.tar.gz: contains the descriptor tables for all external test sets used.

dude.tar.gz: contains the descriptor tables and several identifier lists for evaluation on the Directory of Useful Decoys - Enhanced (DUD-E).

run_outputs.tar.gz: Performance metric data and predicted values created during the model training and evaluation runs. Basis for the figures and metrics in the manuscript.
Distribution Dude Importer/Buyer Data in USA, Distribution Dude Imports Data...
seair.co.in
Seair Exim, Distribution Dude Importer/Buyer Data in USA, Distribution Dude Imports Data [Dataset]. https://www.seair.co.in
Available download formats: .bin, .xml, .csv, .xls
Dataset provided by
Seair Exim Solutions
Authors
Seair Exim
Area covered
United States
Description
Subscribers can look up export and import data for 23 countries by HS code or product name. This demo is helpful for market analysis.
Dude Hadley Road Cross Street Data in Perdido, AL
ownerly.com
Updated Mar 6, 2022
Ownerly (2022). Dude Hadley Road Cross Street Data in Perdido, AL [Dataset]. https://www.ownerly.com/al/perdido/dude-hadley-rd-home-details
Dataset updated
Mar 6, 2022
Dataset authored and provided by
Ownerly
Area covered
Perdido, Alabama, Dude Hadley Road
Description
This dataset provides information about the number of properties, residents, and average property values for Dude Hadley Road cross streets in Perdido, AL.
Ligand-only CNN models that achieved high AUC (greater than 0.9) for COMT.
figshare.com
xls
Updated May 31, 2023
Lieyang Chen; Anthony Cruz; Steven Ramsey; Callum J. Dickson; Jose S. Duca; Viktor Hornak; David R. Koes; Tom Kurtzman (2023). Ligand-only CNN models that achieved high AUC (greater than 0.9) for COMT. [Dataset]. http://doi.org/10.1371/journal.pone.0220113.t002
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0220113.t002
Dataset updated
May 31, 2023
Dataset provided by
PLOS (http://plos.org/)
Authors
Lieyang Chen; Anthony Cruz; Steven Ramsey; Callum J. Dickson; Jose S. Duca; Viktor Hornak; David R. Koes; Tom Kurtzman
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Ligand-only CNN models that achieved high AUC (greater than 0.9) for COMT.
Data from: ESSENCE-Dock: A Consensus-Based Approach to Enhance Virtual...
zenodo.org
data.niaid.nih.gov
zip
Updated Nov 14, 2023
Jochem Nelen; Miguel Carmena-Bargueño; Carlos Martínez-Cortés; Alejandro Rodríguez-Martínez; José Manuel Villalgordo-Soto; Horacio Pérez-Sánchez (2023). ESSENCE-Dock: A Consensus-Based Approach to Enhance Virtual Screening Enrichment in Drug Discovery [Dataset]. http://doi.org/10.5281/zenodo.10025840
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.10025840
Dataset updated
Nov 14, 2023
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Jochem Nelen; Miguel Carmena-Bargueño; Carlos Martínez-Cortés; Alejandro Rodríguez-Martínez; José Manuel Villalgordo-Soto; Horacio Pérez-Sánchez
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
All of the individual docking results and ESSENCE-Dock consensus results for 21 diverse DUD-E targets as presented in the paper "ESSENCE-Dock: A Consensus-Based Approach to Enhance Virtual Screening Enrichment in Drug Discovery".
Docking calculations were performed using:
Metascreener (Gnina and LeadFinder Calculations; prefix VS_GN_ and VS_LF_ respectively)
DiffDockHPC (DiffDock calculations; prefix VS_DD_ )
The consensus calculations were performed using ESSENCE-Dock, available via Metascreener.
ESSENCE-Dock preprint: https://doi.org/10.26434/chemrxiv-2023-21wtv
Paper Abstract
Developing new drugs is an expensive and lengthy endeavor, partly due to the reliance on high-throughput screening (HTS), which involves significant costs and is time-consuming. Virtual screening, particularly molecular docking, offers a more cost-effective and faster alternative for identifying promising drug candidates. However, the effectiveness of molecular docking can vary greatly, which has led to the use of consensus docking approaches. These approaches combine results from different docking methods to improve the identification of active compounds and can reduce the occurrence of false positives. However, many of these methods do not fully leverage the latest advancements in docking technology. In response, we present ESSENCE-Dock (Effective Structural Screening ENrichment ConsEnsus Dock), a new consensus docking workflow aimed at decreasing false positives and increasing the discovery of active compounds. By utilizing a combination of novel docking algorithms, we improve the selection process for potential active compounds. ESSENCE-Dock has been made to be user-friendly, requiring only a few simple commands to perform a complete screening, while also being designed for use in high-performance computing (HPC) environments.
Dude Waters Drive Cross Street Data in Mulberry, FL
ownerly.com
Updated Jan 16, 2022
Ownerly (2022). Dude Waters Drive Cross Street Data in Mulberry, FL [Dataset]. https://www.ownerly.com/fl/mulberry/dude-waters-dr-home-details
Dataset updated
Jan 16, 2022
Dataset authored and provided by
Ownerly
Area covered
Dude Waters Drive, Mulberry, Florida
Description
This dataset provides information about the number of properties, residents, and average property values for Dude Waters Drive cross streets in Mulberry, FL.
The mean and SD of the AUC values across three target groups.
plos.figshare.com
xls
Updated May 30, 2023
Lieyang Chen; Anthony Cruz; Steven Ramsey; Callum J. Dickson; Jose S. Duca; Viktor Hornak; David R. Koes; Tom Kurtzman (2023). The mean and SD of the AUC values across three target groups. [Dataset]. http://doi.org/10.1371/journal.pone.0220113.t003
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0220113.t003
Dataset updated
May 30, 2023
Dataset provided by
PLOS ONE
Authors
Lieyang Chen; Anthony Cruz; Steven Ramsey; Callum J. Dickson; Jose S. Duca; Viktor Hornak; David R. Koes; Tom Kurtzman
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The mean and SD of the AUC values across three target groups.
Dude Mining Claims, Claim Map
datadiscoverystudio.org
pdf
Updated May 7, 2014
Richard E. Mieritz (2014). Dude Mining Claims, Claim Map [Dataset]. http://datadiscoverystudio.org/geoportal/rest/metadata/item/28fb90f6302a4b95ad7ad4ce8222db37/html
Available download formats: pdf
Dataset updated
May 7, 2014
Authors
Richard E. Mieritz
Description
ADMMR map collection: Dude Mining Claims, Claim Map; 1 in. to 200 feet; 22 x 17 in.
Data from: Deep Reinforcement Learning Enables Better Bias Control in...
zenodo.org
data.niaid.nih.gov
application/gzip
Updated Feb 16, 2024
Tao Shen; Shan Li; Wang, Simon, Xiang; Dongmei Wang; Song Wu; Jie Xia; Liangren Zhang (2024). Deep Reinforcement Learning Enables Better Bias Control in Benchmark for Virtual Screening [Dataset]. http://doi.org/10.5281/zenodo.7943200
Available download formats: application/gzip
Unique identifier
https://doi.org/10.5281/zenodo.7943200
Dataset updated
Feb 16, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Tao Shen; Shan Li; Wang, Simon, Xiang; Dongmei Wang; Song Wu; Jie Xia; Liangren Zhang
License
http://www.apache.org/licenses/LICENSE-2.0
Description
This compressed file contains all datasets made for the validation of MUBDsyn.
datasets_int_val: 17 cases in this folder are derived from MUBD for GPCRs. MUBDreal was made by MUBD-DecoyMaker2.0 and MUBDsyn was made by MUBD-DecoyMakersyn.
datasets_ext_val_classical_VS: Five cases in this folder are derived from the shared cases of MUV and DUD-E. The active sets of MUV were taken as the input to make corresponding MUBD datasets. Files in SBVS are raw molecular docking results by smina.
datasets_ext_val_SI_classical_VS: DeepCoy and TocoDecoy were used to make the datasets corresponding to the same five cases above. The data of DeepCoy was directly retrieved from DeepCoy resources at OPIG while topology decoys of TocoDecoy_9W were made based on the scripts provided at TocoDecoy GitHub Repository. Files in SBVS are raw molecular docking results by smina.
datasets_ext_val_ML_VS: Ten cases in this folder are derived from NRLiSt-BDB. Corresponding MUBD datasets were made as described above.
All these datasets can be used for the reproduction of validation performed in the manuscript or to benchmark various virtual screening methods.
dude.com - Historical whois Lookup
whoisdatacenter.com
csv
AllHeart Web Inc, dude.com - Historical whois Lookup [Dataset]. https://whoisdatacenter.com/domain/dude.com/
Available download formats: csv
Dataset authored and provided by
AllHeart Web Inc
License
https://whoisdatacenter.com/terms-of-use/
Time period covered
Mar 15, 1985 - Jan 31, 2025
Description
Explore the historical Whois records related to dude.com (Domain). Get insights into ownership history and changes over time.
Dude Ranches in India - 67 Available (Free Sample)
poidata.io
csv
Updated Mar 26, 2025
Poidata.io (2025). Dude Ranches in India - 67 Available (Free Sample) [Dataset]. https://www.poidata.io/report/dude-ranch/india
Available download formats: csv
Dataset updated
Mar 26, 2025
Dataset provided by
Poidata.io
Area covered
India
Description
This dataset provides information on 67 dude ranches in India as of March 2025. It includes details such as email addresses (where publicly available), phone numbers (where publicly available), and geocoded addresses. Explore market trends, identify potential business partners, and gain valuable insights into the industry. Download a complimentary sample of 10 records to see what's included.
Dude Street Cross Street Data in Sullivan, IN
ownerly.com
Updated Feb 18, 2022
Ownerly (2022). Dude Street Cross Street Data in Sullivan, IN [Dataset]. https://www.ownerly.com/in/sullivan/dude-st-home-details
Dataset updated
Feb 18, 2022
Dataset authored and provided by
Ownerly
Area covered
Sullivan, Indiana
Description
This dataset provides information about the number of properties, residents, and average property values for Dude Street cross streets in Sullivan, IN.
Dude Perfect
wikipedia.tr-tr.nina.az
Updated Jul 9, 2024
Dude Perfect [Dataset]. https://www.wikipedia.tr-tr.nina.az/Dude_Perfect.html
Dataset updated
Jul 9, 2024
Dataset provided by
Vikipedi (//www.wikipedia.org/)
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Dude Perfect is an American sports and comedy group of internet content creators. Founded on March 19, 2009, the group all ...
Subjects of Dude gun
workwithdata.com
Updated Nov 8, 2024
Work With Data (2024). Subjects of Dude gun [Dataset]. https://www.workwithdata.com/datasets/book-subjects?f=1&fcol0=j0-book&fop0=%3D&fval0=Dude+gun&j=1&j0=books
Dataset updated
Nov 8, 2024
Dataset authored and provided by
Work With Data
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset is about book subjects and is filtered to the book Dude gun. It features 10 columns, including authors, average publication date, book publishers, book subject, and books. The preview is ordered by number of books (descending).


Hidden bias in the DUD-E dataset leads to misleading performance of deep learning in structure-based virtual screening

98 scholarly articles cite this dataset (View in Google Scholar)
