2 datasets found
  1. f

    Bioactivity datasets from ChEMBL.

    • plos.figshare.com
    xls
    Updated Sep 6, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Faisal Bin Ashraf; Sanjida Akter; Sumona Hoque Mumu; Muhammad Usama Islam; Jasim Uddin (2023). Bioactivity datasets from ChEMBL. [Dataset]. http://doi.org/10.1371/journal.pone.0288053.t001
    Explore at:
    xlsAvailable download formats
    Dataset updated
    Sep 6, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Faisal Bin Ashraf; Sanjida Akter; Sumona Hoque Mumu; Muhammad Usama Islam; Jasim Uddin
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The SARS-CoV-2 3CLpro protein is one of the key therapeutic targets of interest for COVID-19 due to its critical role in viral replication, various high-quality protein crystal structures, and as a basis for computationally screening for compounds with improved inhibitory activity, bioavailability, and ADMETox properties. The ChEMBL and PubChem database contains experimental data from screening small molecules against SARS-CoV-2 3CLpro, which expands the opportunity to learn the pattern and design a computational model that can predict the potency of any drug compound against coronavirus before in-vitro and in-vivo testing. In this study, Utilizing several descriptors, we evaluated 27 machine learning classifiers. We also developed a neural network model that can correctly identify bioactive and inactive chemicals with 91% accuracy, on CheMBL data and 93% accuracy on combined data on both CheMBL and Pubchem. The F1-score for inactive and active compounds was 93% and 94%, respectively. SHAP (SHapley Additive exPlanations) on XGB classifier to find important fingerprints from the PaDEL descriptors for this task. The results indicated that the PaDEL descriptors were effective in predicting bioactivity, the proposed neural network design was efficient, and the Explanatory factor through SHAP correctly identified the important fingertips. In addition, we validated the effectiveness of our proposed model using a large dataset encompassing over 100,000 molecules. This research employed various molecular descriptors to discover the optimal one for this task. To evaluate the effectiveness of these possible medications against SARS-CoV-2, more in-vitro and in-vivo research is required.

  2. f

    Performance of neural network on ChEMBL data.

    • plos.figshare.com
    xls
    Updated Sep 6, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Faisal Bin Ashraf; Sanjida Akter; Sumona Hoque Mumu; Muhammad Usama Islam; Jasim Uddin (2023). Performance of neural network on ChEMBL data. [Dataset]. http://doi.org/10.1371/journal.pone.0288053.t005
    Explore at:
    xlsAvailable download formats
    Dataset updated
    Sep 6, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Faisal Bin Ashraf; Sanjida Akter; Sumona Hoque Mumu; Muhammad Usama Islam; Jasim Uddin
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The SARS-CoV-2 3CLpro protein is one of the key therapeutic targets of interest for COVID-19 due to its critical role in viral replication, various high-quality protein crystal structures, and as a basis for computationally screening for compounds with improved inhibitory activity, bioavailability, and ADMETox properties. The ChEMBL and PubChem database contains experimental data from screening small molecules against SARS-CoV-2 3CLpro, which expands the opportunity to learn the pattern and design a computational model that can predict the potency of any drug compound against coronavirus before in-vitro and in-vivo testing. In this study, Utilizing several descriptors, we evaluated 27 machine learning classifiers. We also developed a neural network model that can correctly identify bioactive and inactive chemicals with 91% accuracy, on CheMBL data and 93% accuracy on combined data on both CheMBL and Pubchem. The F1-score for inactive and active compounds was 93% and 94%, respectively. SHAP (SHapley Additive exPlanations) on XGB classifier to find important fingerprints from the PaDEL descriptors for this task. The results indicated that the PaDEL descriptors were effective in predicting bioactivity, the proposed neural network design was efficient, and the Explanatory factor through SHAP correctly identified the important fingertips. In addition, we validated the effectiveness of our proposed model using a large dataset encompassing over 100,000 molecules. This research employed various molecular descriptors to discover the optimal one for this task. To evaluate the effectiveness of these possible medications against SARS-CoV-2, more in-vitro and in-vivo research is required.

  3. Not seeing a result you expected?
    Learn how you can add new datasets to our index.

Share
FacebookFacebook
TwitterTwitter
Email
Click to copy link
Link copied
Close
Cite
Faisal Bin Ashraf; Sanjida Akter; Sumona Hoque Mumu; Muhammad Usama Islam; Jasim Uddin (2023). Bioactivity datasets from ChEMBL. [Dataset]. http://doi.org/10.1371/journal.pone.0288053.t001

Bioactivity datasets from ChEMBL.

Related Article
Explore at:
5 scholarly articles cite this dataset (View in Google Scholar)
xlsAvailable download formats
Dataset updated
Sep 6, 2023
Dataset provided by
PLOS ONE
Authors
Faisal Bin Ashraf; Sanjida Akter; Sumona Hoque Mumu; Muhammad Usama Islam; Jasim Uddin
License

Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

The SARS-CoV-2 3CLpro protein is one of the key therapeutic targets of interest for COVID-19 due to its critical role in viral replication, various high-quality protein crystal structures, and as a basis for computationally screening for compounds with improved inhibitory activity, bioavailability, and ADMETox properties. The ChEMBL and PubChem database contains experimental data from screening small molecules against SARS-CoV-2 3CLpro, which expands the opportunity to learn the pattern and design a computational model that can predict the potency of any drug compound against coronavirus before in-vitro and in-vivo testing. In this study, Utilizing several descriptors, we evaluated 27 machine learning classifiers. We also developed a neural network model that can correctly identify bioactive and inactive chemicals with 91% accuracy, on CheMBL data and 93% accuracy on combined data on both CheMBL and Pubchem. The F1-score for inactive and active compounds was 93% and 94%, respectively. SHAP (SHapley Additive exPlanations) on XGB classifier to find important fingerprints from the PaDEL descriptors for this task. The results indicated that the PaDEL descriptors were effective in predicting bioactivity, the proposed neural network design was efficient, and the Explanatory factor through SHAP correctly identified the important fingertips. In addition, we validated the effectiveness of our proposed model using a large dataset encompassing over 100,000 molecules. This research employed various molecular descriptors to discover the optimal one for this task. To evaluate the effectiveness of these possible medications against SARS-CoV-2, more in-vitro and in-vivo research is required.

Search
Clear search
Close search
Google apps
Main menu