33 datasets found
  1. Hidden bias in the DUD-E dataset leads to misleading performance of deep...

    • plos.figshare.com
    tiff
    Updated May 31, 2023
    Cite
    Lieyang Chen; Anthony Cruz; Steven Ramsey; Callum J. Dickson; Jose S. Duca; Viktor Hornak; David R. Koes; Tom Kurtzman (2023). Hidden bias in the DUD-E dataset leads to misleading performance of deep learning in structure-based virtual screening [Dataset]. http://doi.org/10.1371/journal.pone.0220113
    Available download formats: tiff
    Dataset updated
    May 31, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Lieyang Chen; Anthony Cruz; Steven Ramsey; Callum J. Dickson; Jose S. Duca; Viktor Hornak; David R. Koes; Tom Kurtzman
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Recently much effort has been invested in using convolutional neural network (CNN) models trained on 3D structural images of protein-ligand complexes to distinguish binding from non-binding ligands for virtual screening. However, the dearth of reliable protein-ligand x-ray structures and binding affinity data has required the use of constructed datasets for the training and evaluation of CNN molecular recognition models. Here, we outline various sources of bias in one such widely-used dataset, the Directory of Useful Decoys: Enhanced (DUD-E). We have constructed and performed tests to investigate whether CNN models developed using DUD-E are properly learning the underlying physics of molecular recognition, as intended, or are instead learning biases inherent in the dataset itself. We find that superior enrichment efficiency in CNN models can be attributed to the analogue and decoy bias hidden in the DUD-E dataset rather than successful generalization of the pattern of protein-ligand interactions. Comparing additional deep learning models trained on PDBbind datasets, we found that their enrichment performances using DUD-E are not superior to the performance of the docking program AutoDock Vina. Together, these results suggest that biases that could be present in constructed datasets should be thoroughly evaluated before applying them to machine learning based methodology development.

  2. Summary of Vina, Gnina and Pafnucy performance on DUD-E targets.

    • plos.figshare.com
    xls
    Updated Jun 3, 2023
    Cite
    Lieyang Chen; Anthony Cruz; Steven Ramsey; Callum J. Dickson; Jose S. Duca; Viktor Hornak; David R. Koes; Tom Kurtzman (2023). Summary of Vina, Gnina and Pafnucy performance on DUD-E targets. [Dataset]. http://doi.org/10.1371/journal.pone.0220113.t004
    Available download formats: xls
    Dataset updated
    Jun 3, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Lieyang Chen; Anthony Cruz; Steven Ramsey; Callum J. Dickson; Jose S. Duca; Viktor Hornak; David R. Koes; Tom Kurtzman
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Summary of Vina, Gnina and Pafnucy performance on DUD-E targets.

  3. Data from: Property-Unmatched Decoys in Docking Benchmarks

    • figshare.com
    • acs.figshare.com
    xlsx
    Updated Jun 2, 2023
    Cite
    Reed M. Stein; Ying Yang; Trent E. Balius; Matt J. O’Meara; Jiankun Lyu; Jennifer Young; Khanh Tang; Brian K. Shoichet; John J. Irwin (2023). Property-Unmatched Decoys in Docking Benchmarks [Dataset]. http://doi.org/10.1021/acs.jcim.0c00598.s003
    Available download formats: xlsx
    Dataset updated
    Jun 2, 2023
    Dataset provided by
    ACS Publications
    Authors
    Reed M. Stein; Ying Yang; Trent E. Balius; Matt J. O’Meara; Jiankun Lyu; Jennifer Young; Khanh Tang; Brian K. Shoichet; John J. Irwin
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Enrichment of ligands versus property-matched decoys is widely used to test and optimize docking library screens. However, the unconstrained optimization of enrichment alone can mislead, leading to false confidence in prospective performance. This can arise by over-optimizing for enrichment against property-matched decoys, without considering the full spectrum of molecules to be found in a true large library screen. Adding decoys representing charge extrema helps mitigate over-optimizing for electrostatic interactions. Adding decoys that represent the overall characteristics of the library to be docked allows one to sample molecules not represented by ligands and property-matched decoys but that one will encounter in a prospective screen. An optimized version of the DUD-E set (DUDE-Z), as well as Extrema and sets representing broad features of the library (Goldilocks), is developed here. We also explore the variability that one can encounter in enrichment calculations and how that can temper one’s confidence in small enrichment differences. The new tools and new decoy sets are freely available at http://tldr.docking.org and http://dudez.docking.org.

  4. PIGNet2: A versatile deep learning-based protein-ligand interaction...

    • zenodo.org
    • data.niaid.nih.gov
    xz
    Updated Jul 18, 2023
    Cite
    Moon Seokhyun; Hwang Sang-Yeon; Lim Jaechang; Kim Woo Youn (2023). PIGNet2: A versatile deep learning-based protein-ligand interaction prediction model for accurate binding affinity scoring and virtual screening [Dataset]. http://doi.org/10.5281/zenodo.8091220
    Available download formats: xz
    Dataset updated
    Jul 18, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Moon Seokhyun; Hwang Sang-Yeon; Lim Jaechang; Kim Woo Youn
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Training and test datasets of the paper "Improving the versatility of deep learning-based protein-ligand interaction prediction for accurate binding affinity scoring and virtual screening".

  5. LIT-PCBA(ESR1_ant) Dataset

    • paperswithcode.com
    Cite
    Yuquan Li; Chang-Yu Hsieh; Ruiqiang Lu; Xiaoqing Gong; Xiaorui Wang; Pengyong Li; Shuo Liu; Yanan Tian; Dejun Jiang; Jiaxian Yan; Qifeng Bai; Huanxiang Liu; Shengyu Zhang; Xiaojun Yao, LIT-PCBA(ESR1_ant) Dataset [Dataset]. https://paperswithcode.com/dataset/lit-pcba-esr1-ant
    Authors
    Yuquan Li; Chang-Yu Hsieh; Ruiqiang Lu; Xiaoqing Gong; Xiaorui Wang; Pengyong Li; Shuo Liu; Yanan Tian; Dejun Jiang; Jiaxian Yan; Qifeng Bai; Huanxiang Liu; Shengyu Zhang; Xiaojun Yao
    Description

    Comparative evaluation of virtual screening methods requires a rigorous benchmarking procedure on diverse, realistic, and unbiased data sets. Recent investigations from numerous research groups unambiguously demonstrate that artificially constructed ligand sets classically used by the community (e.g., DUD, DUD-E, MUV) are unfortunately biased by both obvious and hidden chemical biases, therefore overestimating the true accuracy of virtual screening methods. We herewith present a novel data set (LIT-PCBA) specifically designed for virtual screening and machine learning. LIT-PCBA relies on 149 dose–response PubChem bioassays that were additionally processed to remove false positives and assay artifacts and keep active and inactive compounds within similar molecular property ranges. To ascertain that the data set is suited to both ligand-based and structure-based virtual screening, target sets were restricted to single protein targets for which at least one X-ray structure is available in complex with ligands of the same phenotype (e.g., inhibitor, inverse agonist) as that of the PubChem active compounds. Preliminary virtual screening on the 21 remaining target sets with state-of-the-art orthogonal methods (2D fingerprint similarity, 3D shape similarity, molecular docking) enabled us to select 15 target sets for which at least one of the three screening methods is able to enrich the top 1%-ranked compounds in true actives by at least a factor of 2. The corresponding ligand sets (training, validation) were finally unbiased by the recently described asymmetric validation embedding (AVE) procedure to afford the LIT-PCBA data set, consisting of 15 targets and 7844 confirmed active and 407,381 confirmed inactive compounds. The data set mimics experimental screening decks in terms of hit rate (ratio of active to inactive compounds) and potency distribution. 
    It is available online at http://drugdesign.unistra.fr/LIT-PCBA for download and for benchmarking novel virtual screening methods, notably those relying on machine learning.
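
The selection criterion in the description above (enriching the top 1%-ranked compounds in true actives by at least a factor of 2) refers to an enrichment factor. The following is a minimal sketch, assuming the commonly used definition EF = (actives in top x% / compounds in top x%) / (total actives / total compounds). The function name, the rounding of the cutoff, and the toy data are hypothetical illustrations and are not taken from the LIT-PCBA paper.

```python
def enrichment_factor(scores, labels, fraction=0.01):
    """Enrichment factor within the top `fraction` of a ranked compound list.

    scores: one score per compound; this sketch assumes higher = more likely active.
    labels: 1 for an active compound, 0 for an inactive one.
    """
    if len(scores) != len(labels) or not scores:
        raise ValueError("scores and labels must be non-empty and of equal length")
    n_total = len(scores)
    n_actives = sum(labels)
    if n_actives == 0:
        raise ValueError("at least one active compound is required")
    # Size of the top slice; rounding (an assumption) avoids float edge cases.
    n_top = max(1, round(fraction * n_total))
    # Rank compounds from best to worst score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits_top = sum(label for _, label in ranked[:n_top])
    # (hits_top / n_top) / (n_actives / n_total), written with one division.
    return (hits_top * n_total) / (n_top * n_actives)


# Toy data: 1000 compounds, 20 actives, 4 of which are ranked in the top 10.
labels = [0] * 1000
for i in (0, 3, 5, 9, *range(500, 516)):
    labels[i] = 1
scores = [1000 - i for i in range(1000)]  # compound 0 has the best score
ef1 = enrichment_factor(scores, labels, fraction=0.01)
print(f"EF at 1%: {ef1}")
```

In this toy case the top 1% holds 10 compounds with 4 actives, against a background rate of 20 actives in 1000, so the enrichment factor is 20. A value of 1 corresponds to random ranking, which is why a threshold of 2 is a modest bar.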

  6. DockM8_Benchmarking_results

    • zenodo.org
    • data.niaid.nih.gov
    txt, zip
    Updated Jul 22, 2024
    Cite
    Antoine Lacour; Andrea Volkamer; Anna Hirsch; Hamza Ibrahim (2024). DockM8_Benchmarking_results [Dataset]. http://doi.org/10.5281/zenodo.11191685
    Available download formats: zip, txt
    Dataset updated
    Jul 22, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Antoine Lacour; Andrea Volkamer; Anna Hirsch; Hamza Ibrahim
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    May 14, 2024
    Description

    The repository contains the benchmarking data obtained alongside the first version of DockM8.

    The file structure is explained in DockM8_v1_file_structure_explanation.txt

    We hope this data is useful for benchmarking scoring functions and machine learning models, as well as being a large repository of pre-docked poses using a variety of algorithms.

  7. Associated Data: RASPD+: Fast protein-ligand binding free energy prediction...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Dec 7, 2020
    Cite
    Mukherjee, Goutam (2020). Associated Data: RASPD+: Fast protein-ligand binding free energy prediction using simplified physicochemical features [Dataset]. https://data.niaid.nih.gov/resources?id=ZENODO_3937425
    Dataset updated
    Dec 7, 2020
    Dataset provided by
    Mukherjee, Goutam
    Adam, Lukas
    Jayaram, B
    Holderbach, Stefan
    Wade, Rebecca
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Additional digital data to "RASPD+: Fast protein-ligand binding free energy prediction using simplified physicochemical features" (ChemRxiv preprint: https://doi.org/10.26434/chemrxiv.12636704).

    Associated code can be found at: https://github.com/HITS-MCM/RASPDplus

    Files:

    weights.tar.gz: contains the model weights of one random dataset split and its associated cross-validation folds. Used for standard RASPD+ evaluation.

    additional_model_replicates.tar.gz: contains the remaining models trained on the full set of descriptors.

    external_test_sets.tar.gz: contains the descriptor tables for all external test sets used.

    dude.tar.gz: contains the descriptor tables and several identifier lists for evaluation on the Directory of Useful Decoys - Enhanced (DUD-E).

    run_outputs.tar.gz: Performance metric data and predicted values created during the model training and evaluation runs. Basis for the figures and metrics in the manuscript.

  8. Distribution Dude Importer/Buyer Data in USA, Distribution Dude Imports Data...

    • seair.co.in
    Cite
    Seair Exim, Distribution Dude Importer/Buyer Data in USA, Distribution Dude Imports Data [Dataset]. https://www.seair.co.in
    Available download formats: .bin, .xml, .csv, .xls
    Dataset provided by
    Seair Exim Solutions
    Authors
    Seair Exim
    Area covered
    United States
    Description

    Subscribers can look up export and import data for 23 countries by HS code or product name. This demo is helpful for market analysis.

  9. Dude Hadley Road Cross Street Data in Perdido, AL

    • ownerly.com
    Updated Mar 6, 2022
    Cite
    Ownerly (2022). Dude Hadley Road Cross Street Data in Perdido, AL [Dataset]. https://www.ownerly.com/al/perdido/dude-hadley-rd-home-details
    Dataset updated
    Mar 6, 2022
    Dataset authored and provided by
    Ownerly
    Area covered
    Perdido, Alabama, Dude Hadley Road
    Description

    This dataset provides information about the number of properties, residents, and average property values for Dude Hadley Road cross streets in Perdido, AL.

  10. Ligand-only CNN models that achieved high AUC (greater than 0.9) for COMT.

    • figshare.com
    xls
    Updated May 31, 2023
    Cite
    Lieyang Chen; Anthony Cruz; Steven Ramsey; Callum J. Dickson; Jose S. Duca; Viktor Hornak; David R. Koes; Tom Kurtzman (2023). Ligand-only CNN models that achieved high AUC (greater than 0.9) for COMT. [Dataset]. http://doi.org/10.1371/journal.pone.0220113.t002
    Available download formats: xls
    Dataset updated
    May 31, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Lieyang Chen; Anthony Cruz; Steven Ramsey; Callum J. Dickson; Jose S. Duca; Viktor Hornak; David R. Koes; Tom Kurtzman
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Ligand-only CNN models that achieved high AUC (greater than 0.9) for COMT.

  11. Data from: ESSENCE-Dock: A Consensus-Based Approach to Enhance Virtual...

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Nov 14, 2023
    Cite
    Jochem Nelen; Miguel Carmena-Bargueño; Carlos Martínez-Cortés; Alejandro Rodríguez-Martínez; José Manuel Villalgordo-Soto; Horacio Pérez-Sánchez (2023). ESSENCE-Dock: A Consensus-Based Approach to Enhance Virtual Screening Enrichment in Drug Discovery [Dataset]. http://doi.org/10.5281/zenodo.10025840
    Available download formats: zip
    Dataset updated
    Nov 14, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Jochem Nelen; Miguel Carmena-Bargueño; Carlos Martínez-Cortés; Alejandro Rodríguez-Martínez; José Manuel Villalgordo-Soto; Horacio Pérez-Sánchez
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    All of the individual docking results and ESSENCE-Dock consensus results for 21 diverse DUD-E targets as presented in the paper "ESSENCE-Dock: A Consensus-Based Approach to Enhance Virtual Screening Enrichment in Drug Discovery".

    Docking calculations were performed using:

    • Metascreener (Gnina and LeadFinder Calculations; prefix VS_GN_ and VS_LF_ respectively)
    • DiffDockHPC (DiffDock calculations; prefix VS_DD_ )

    The consensus calculations were performed using ESSENCE-Dock, available via Metascreener.

    ESSENCE-Dock preprint: https://doi.org/10.26434/chemrxiv-2023-21wtv

    Paper Abstract

    Developing new drugs is an expensive and lengthy endeavor, partly due to the reliance on high-throughput screening (HTS), which involves significant costs and is time-consuming. Virtual screening, particularly molecular docking, offers a more cost-effective and faster alternative for identifying promising drug candidates. However, the effectiveness of molecular docking can vary greatly, which has led to the use of consensus docking approaches. These approaches combine results from different docking methods to improve the identification of active compounds and can reduce the occurrence of false positives. However, many of these methods do not fully leverage the latest advancements in docking technology. In response, we present ESSENCE-Dock (Effective Structural Screening ENrichment ConsEnsus Dock), a new consensus docking workflow aimed at decreasing false positives and increasing the discovery of active compounds. By utilizing a combination of novel docking algorithms, we improve the selection process for potential active compounds. ESSENCE-Dock has been made to be user-friendly, requiring only a few simple commands to perform a complete screening, while also being designed for use in high-performance computing (HPC) environments.

  12. Dude Waters Drive Cross Street Data in Mulberry, FL

    • ownerly.com
    Updated Jan 16, 2022
    Cite
    Ownerly (2022). Dude Waters Drive Cross Street Data in Mulberry, FL [Dataset]. https://www.ownerly.com/fl/mulberry/dude-waters-dr-home-details
    Dataset updated
    Jan 16, 2022
    Dataset authored and provided by
    Ownerly
    Area covered
    Dude Waters Drive, Mulberry, Florida
    Description

    This dataset provides information about the number of properties, residents, and average property values for Dude Waters Drive cross streets in Mulberry, FL.

  13. The mean and SD of the AUC values across three target groups.

    • plos.figshare.com
    xls
    Updated May 30, 2023
    Cite
    Lieyang Chen; Anthony Cruz; Steven Ramsey; Callum J. Dickson; Jose S. Duca; Viktor Hornak; David R. Koes; Tom Kurtzman (2023). The mean and SD of the AUC values across three target groups. [Dataset]. http://doi.org/10.1371/journal.pone.0220113.t003
    Available download formats: xls
    Dataset updated
    May 30, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Lieyang Chen; Anthony Cruz; Steven Ramsey; Callum J. Dickson; Jose S. Duca; Viktor Hornak; David R. Koes; Tom Kurtzman
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The mean and SD of the AUC values across three target groups.

  14. Dude Mining Claims, Claim Map

    • datadiscoverystudio.org
    pdf
    Updated May 7, 2014
    Cite
    Richard E. Mieritz (2014). Dude Mining Claims, Claim Map [Dataset]. http://datadiscoverystudio.org/geoportal/rest/metadata/item/28fb90f6302a4b95ad7ad4ce8222db37/html
    Available download formats: pdf
    Dataset updated
    May 7, 2014
    Authors
    Richard E. Mieritz
    Description

    ADMMR map collection: Dude Mining Claims, Claim Map; 1 in. to 200 feet; 22 x 17 in.

  15. Data from: Deep Reinforcement Learning Enables Better Bias Control in...

    • zenodo.org
    • data.niaid.nih.gov
    application/gzip
    Updated Feb 16, 2024
    Cite
    Tao Shen; Shan Li; Wang, Simon, Xiang; Dongmei Wang; Song Wu; Jie Xia; Liangren Zhang (2024). Deep Reinforcement Learning Enables Better Bias Control in Benchmark for Virtual Screening [Dataset]. http://doi.org/10.5281/zenodo.7943200
    Available download formats: application/gzip
    Dataset updated
    Feb 16, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Tao Shen; Shan Li; Wang, Simon, Xiang; Dongmei Wang; Song Wu; Jie Xia; Liangren Zhang
    License

    http://www.apache.org/licenses/LICENSE-2.0

    Description

    This compressed file contains all datasets made for the validation of MUBDsyn.

    • datasets_int_val: 17 cases in this folder are derived from MUBD for GPCRs. MUBDreal was made by MUBD-DecoyMaker2.0 and MUBDsyn was made by MUBD-DecoyMakersyn.
    • datasets_ext_val_classical_VS: Five cases in this folder are derived from the shared cases of MUV and DUD-E. The active sets of MUV were taken as the input to make corresponding MUBD datasets. Files in SBVS are raw molecular docking results by smina.
    • datasets_ext_val_SI_classical_VS: DeepCoy and TocoDecoy were used to make the datasets corresponding to the same five cases above. The data of DeepCoy was directly retrieved from DeepCoy resources at OPIG while topology decoys of TocoDecoy_9W were made based on the scripts provided at TocoDecoy GitHub Repository. Files in SBVS are raw molecular docking results by smina.
    • datasets_ext_val_ML_VS: Ten cases in this folder are derived from NRLiSt-BDB. Corresponding MUBD datasets were made as described above.

    All these datasets can be used for the reproduction of validation performed in the manuscript or to benchmark various virtual screening methods.

  16. dude.com - Historical whois Lookup

    • whoisdatacenter.com
    csv
    Cite
    AllHeart Web Inc, dude.com - Historical whois Lookup [Dataset]. https://whoisdatacenter.com/domain/dude.com/
    Available download formats: csv
    Dataset authored and provided by
    AllHeart Web Inc
    License

    https://whoisdatacenter.com/terms-of-use/

    Time period covered
    Mar 15, 1985 - Jan 31, 2025
    Description

    Explore the historical Whois records related to dude.com (Domain). Get insights into ownership history and changes over time.

  17. Dude Ranches in India - 67 Available (Free Sample)

    • poidata.io
    csv
    Updated Mar 26, 2025
    Cite
    Poidata.io (2025). Dude Ranches in India - 67 Available (Free Sample) [Dataset]. https://www.poidata.io/report/dude-ranch/india
    Available download formats: csv
    Dataset updated
    Mar 26, 2025
    Dataset provided by
    Poidata.io
    Area covered
    India
    Description

    This dataset provides information on 67 dude ranches in India as of March 2025. It includes details such as email addresses (where publicly available), phone numbers (where publicly available), and geocoded addresses. Explore market trends, identify potential business partners, and gain valuable insights into the industry. Download a complimentary sample of 10 records to see what's included.

  18. Dude Street Cross Street Data in Sullivan, IN

    • ownerly.com
    Updated Feb 18, 2022
    Cite
    Ownerly (2022). Dude Street Cross Street Data in Sullivan, IN [Dataset]. https://www.ownerly.com/in/sullivan/dude-st-home-details
    Dataset updated
    Feb 18, 2022
    Dataset authored and provided by
    Ownerly
    Area covered
    Sullivan, Indiana
    Description

    This dataset provides information about the number of properties, residents, and average property values for Dude Street cross streets in Sullivan, IN.

  19. Dude Perfect

    • wikipedia.tr-tr.nina.az
    Updated Jul 9, 2024
    Cite
    Dude Perfect [Dataset]. https://www.wikipedia.tr-tr.nina.az/Dude_Perfect.html
    Dataset updated
    Jul 9, 2024
    Dataset provided by
    Vikipedi (www.wikipedia.org)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dude Perfect is an American sports and comedy group of internet content creators. Founded on March 19, 2009, the group all...

  20. Subjects of Dude gun

    • workwithdata.com
    Updated Nov 8, 2024
    Cite
    Work With Data (2024). Subjects of Dude gun [Dataset]. https://www.workwithdata.com/datasets/book-subjects?f=1&fcol0=j0-book&fop0=%3D&fval0=Dude+gun&j=1&j0=books
    Dataset updated
    Nov 8, 2024
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is about book subjects and is filtered to the book Dude gun, featuring 10 columns including authors, average publication date, book publishers, book subject, and books. The preview is ordered by number of books (descending).

Cite
Lieyang Chen; Anthony Cruz; Steven Ramsey; Callum J. Dickson; Jose S. Duca; Viktor Hornak; David R. Koes; Tom Kurtzman (2023). Hidden bias in the DUD-E dataset leads to misleading performance of deep learning in structure-based virtual screening [Dataset]. http://doi.org/10.1371/journal.pone.0220113

Hidden bias in the DUD-E dataset leads to misleading performance of deep learning in structure-based virtual screening

98 scholarly articles cite this dataset (View in Google Scholar)
Available download formats: tiff
Dataset updated
May 31, 2023
Dataset provided by
PLOS ONE
Authors
Lieyang Chen; Anthony Cruz; Steven Ramsey; Callum J. Dickson; Jose S. Duca; Viktor Hornak; David R. Koes; Tom Kurtzman
License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

