31 datasets found
  1. Performance comparison of machine learning models across accuracy, AUC, MCC,...

    • plos.figshare.com
    xls
    Updated Dec 31, 2024
    Cite
    Seongil Han; Haemin Jung (2024). Performance comparison of machine learning models across accuracy, AUC, MCC, and F1 score on GMSC dataset. [Dataset]. http://doi.org/10.1371/journal.pone.0316454.t005
    Available download formats: xls
    Dataset updated
    Dec 31, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Seongil Han; Haemin Jung
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Performance comparison of machine learning models across accuracy, AUC, MCC, and F1 score on GMSC dataset.

  2. GMSC dataset (IR: Imbalance Ratio).

    • plos.figshare.com
    xls
    Updated Dec 31, 2024
    Cite
    Seongil Han; Haemin Jung (2024). GMSC dataset (IR: Imbalance Ratio). [Dataset]. http://doi.org/10.1371/journal.pone.0316454.t001
    Available download formats: xls
    Dataset updated
    Dec 31, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Seongil Han; Haemin Jung
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Credit scoring models play a crucial role for financial institutions in evaluating borrower risk and sustaining profitability. Logistic regression is widely used in credit scoring due to its robustness, interpretability, and computational efficiency; however, its predictive power decreases when applied to complex or non-linear datasets, resulting in reduced accuracy. In contrast, tree-based machine learning models often provide enhanced predictive performance but struggle with interpretability. Furthermore, imbalanced class distributions, which are prevalent in credit scoring, can adversely impact model accuracy and robustness, as the majority class tends to dominate. Despite these challenges, research that comprehensively addresses both the predictive performance and explainability aspects within the credit scoring domain remains limited. This paper introduces the Non-pArameTric oversampling approach for Explainable credit scoring (NATE), a framework designed to address these challenges by combining oversampling techniques with tree-based classifiers to enhance model performance and interpretability. NATE incorporates class balancing methods to mitigate the impact of imbalanced data distributions and integrates interpretability features to elucidate the model’s decision-making process. Experimental results show that NATE substantially outperforms traditional logistic regression in credit risk classification, with improvements of 19.33% in AUC, 71.56% in MCC, and 85.33% in F1 Score. Oversampling approaches, particularly when used with gradient boosting, demonstrated superior effectiveness compared to undersampling, achieving optimal metrics of AUC: 0.9649, MCC: 0.8104, and F1 Score: 0.9072. Moreover, NATE enhances interpretability by providing detailed insights into feature contributions, aiding in understanding individual predictions. 
    These findings highlight NATE’s capability in managing class imbalance, improving predictive performance, and enhancing model interpretability, demonstrating its potential as a reliable and transparent tool for credit scoring applications.

  3. Under-sampled dataset.

    • plos.figshare.com
    xls
    Updated Dec 31, 2024
    + more versions
    Cite
    Seongil Han; Haemin Jung (2024). Under-sampled dataset. [Dataset]. http://doi.org/10.1371/journal.pone.0316454.t003
    Available download formats: xls
    Dataset updated
    Dec 31, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Seongil Han; Haemin Jung
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Credit scoring models play a crucial role for financial institutions in evaluating borrower risk and sustaining profitability. Logistic regression is widely used in credit scoring due to its robustness, interpretability, and computational efficiency; however, its predictive power decreases when applied to complex or non-linear datasets, resulting in reduced accuracy. In contrast, tree-based machine learning models often provide enhanced predictive performance but struggle with interpretability. Furthermore, imbalanced class distributions, which are prevalent in credit scoring, can adversely impact model accuracy and robustness, as the majority class tends to dominate. Despite these challenges, research that comprehensively addresses both the predictive performance and explainability aspects within the credit scoring domain remains limited. This paper introduces the Non-pArameTric oversampling approach for Explainable credit scoring (NATE), a framework designed to address these challenges by combining oversampling techniques with tree-based classifiers to enhance model performance and interpretability. NATE incorporates class balancing methods to mitigate the impact of imbalanced data distributions and integrates interpretability features to elucidate the model’s decision-making process. Experimental results show that NATE substantially outperforms traditional logistic regression in credit risk classification, with improvements of 19.33% in AUC, 71.56% in MCC, and 85.33% in F1 Score. Oversampling approaches, particularly when used with gradient boosting, demonstrated superior effectiveness compared to undersampling, achieving optimal metrics of AUC: 0.9649, MCC: 0.8104, and F1 Score: 0.9072. Moreover, NATE enhances interpretability by providing detailed insights into feature contributions, aiding in understanding individual predictions. 
    These findings highlight NATE’s capability in managing class imbalance, improving predictive performance, and enhancing model interpretability, demonstrating its potential as a reliable and transparent tool for credit scoring applications.

  4. Increase in AUC, MCC, and F1 between oversampling and undersampling.

    • plos.figshare.com
    Updated Dec 31, 2024
    Cite
    Increase in AUC, MCC, and F1 between oversampling and undersampling. [Dataset]. https://plos.figshare.com/articles/dataset/Increase_in_AUC_MCC_and_F1_between_oversampling_and_undersampling_/28118713
    Dataset updated
    Dec 31, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Seongil Han; Haemin Jung
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Increase in AUC, MCC, and F1 between oversampling and undersampling.

  5. Searching space for hyperparameters in Table 7.

    • plos.figshare.com
    xls
    Updated Dec 31, 2024
    Cite
    Seongil Han; Haemin Jung (2024). Searching space for hyperparameters in Table 7. [Dataset]. http://doi.org/10.1371/journal.pone.0316454.t006
    Available download formats: xls
    Dataset updated
    Dec 31, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Seongil Han; Haemin Jung
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Credit scoring models play a crucial role for financial institutions in evaluating borrower risk and sustaining profitability. Logistic regression is widely used in credit scoring due to its robustness, interpretability, and computational efficiency; however, its predictive power decreases when applied to complex or non-linear datasets, resulting in reduced accuracy. In contrast, tree-based machine learning models often provide enhanced predictive performance but struggle with interpretability. Furthermore, imbalanced class distributions, which are prevalent in credit scoring, can adversely impact model accuracy and robustness, as the majority class tends to dominate. Despite these challenges, research that comprehensively addresses both the predictive performance and explainability aspects within the credit scoring domain remains limited. This paper introduces the Non-pArameTric oversampling approach for Explainable credit scoring (NATE), a framework designed to address these challenges by combining oversampling techniques with tree-based classifiers to enhance model performance and interpretability. NATE incorporates class balancing methods to mitigate the impact of imbalanced data distributions and integrates interpretability features to elucidate the model’s decision-making process. Experimental results show that NATE substantially outperforms traditional logistic regression in credit risk classification, with improvements of 19.33% in AUC, 71.56% in MCC, and 85.33% in F1 Score. Oversampling approaches, particularly when used with gradient boosting, demonstrated superior effectiveness compared to undersampling, achieving optimal metrics of AUC: 0.9649, MCC: 0.8104, and F1 Score: 0.9072. Moreover, NATE enhances interpretability by providing detailed insights into feature contributions, aiding in understanding individual predictions. 
    These findings highlight NATE’s capability in managing class imbalance, improving predictive performance, and enhancing model interpretability, demonstrating its potential as a reliable and transparent tool for credit scoring applications.

  6. Evaluation of benchmark and optimal model performance with resampling...

    • plos.figshare.com
    xls
    Updated Dec 31, 2024
    Cite
    Seongil Han; Haemin Jung (2024). Evaluation of benchmark and optimal model performance with resampling techniques. [Dataset]. http://doi.org/10.1371/journal.pone.0316454.t008
    Available download formats: xls
    Dataset updated
    Dec 31, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Seongil Han; Haemin Jung
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Evaluation of benchmark and optimal model performance with resampling techniques.

  7. iProtDNA-SMOTE code.

    • plos.figshare.com
    rar
    Updated May 15, 2025
    + more versions
    Cite
    Ruiyan Huang; Wangren Qiu; Xuan Xiao; Weizhong Lin (2025). iProtDNA-SMOTE code. [Dataset]. http://doi.org/10.1371/journal.pone.0320817.s003
    Available download formats: rar
    Dataset updated
    May 15, 2025
    Dataset provided by
    PLOS ONE
    Authors
    Ruiyan Huang; Wangren Qiu; Xuan Xiao; Weizhong Lin
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Protein-DNA interactions play a crucial role in cellular biology, essential for maintaining life processes and regulating cellular functions. We propose a method called iProtDNA-SMOTE, which utilizes non-equilibrium graph neural networks along with pre-trained protein language models to predict DNA binding residues. This approach effectively addresses the class imbalance issue in predicting protein-DNA binding sites by leveraging unbalanced graph data, thus enhancing model’s generalization and specificity. We trained the model on two datasets, TR646 and TR573, and conducted a series of experiments to evaluate its performance. The model achieved AUC values of 0.850, 0.896, and 0.858 on the independent test datasets TE46, TE129, and TE181, respectively. These results indicate that iProtDNA-SMOTE outperforms existing methods in terms of accuracy and generalization for predicting DNA binding sites, offering reliable and effective predictions to minimize errors. The model has been thoroughly validated for its ability to predict protein-DNA binding sites with high reliability and precision. For the convenience of the scientific community, the benchmark datasets and codes are publicly available at https://github.com/primrosehry/iProtDNA-SMOTE.

  8. Sample size (n) of the full dataset generated under each class-imbalance...

    • figshare.com
    xls
    Updated Jun 21, 2023
    Cite
    Khurram Nadeem; Mehdi-Abderrahman Jabri (2023). Sample size (n) of the full dataset generated under each class-imbalance ratio (IR) to achieve a target balanced sample size (nb). [Dataset]. http://doi.org/10.1371/journal.pone.0280258.t002
    Available download formats: xls
    Dataset updated
    Jun 21, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Khurram Nadeem; Mehdi-Abderrahman Jabri
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Sample size (n) of the full dataset generated under each class-imbalance ratio (IR) to achieve a target balanced sample size (nb).

  9. MCL-FWA-BILSTM accuracy comparison with existing approaches for multiclass...

    • plos.figshare.com
    xls
    Updated May 23, 2024
    + more versions
    Cite
    Arshad Hashmi; Omar M. Barukab; Ahmad Hamza Osman (2024). MCL-FWA-BILSTM accuracy comparison with existing approaches for multiclass classification with state of art on UNSW-NB15 and NSL-KDD. [Dataset]. http://doi.org/10.1371/journal.pone.0302294.t010
    Available download formats: xls
    Dataset updated
    May 23, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Arshad Hashmi; Omar M. Barukab; Ahmad Hamza Osman
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    MCL-FWA-BILSTM accuracy comparison with existing approaches for multiclass classification with state of art on UNSW-NB15 and NSL-KDD.

  10. Accuracy comparison with existing approaches for Binary Classification with...

    • plos.figshare.com
    xls
    Updated May 23, 2024
    Cite
    Arshad Hashmi; Omar M. Barukab; Ahmad Hamza Osman (2024). Accuracy comparison with existing approaches for Binary Classification with state of art on UNSW-NB15 and NSL-KDD. [Dataset]. http://doi.org/10.1371/journal.pone.0302294.t008
    Available download formats: xls
    Dataset updated
    May 23, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Arshad Hashmi; Omar M. Barukab; Ahmad Hamza Osman
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Accuracy comparison with existing approaches for Binary Classification with state of art on UNSW-NB15 and NSL-KDD.

  11. MCL-FWA-BILSTM and other existing approaches for multiclass classification...

    • plos.figshare.com
    xls
    Updated May 23, 2024
    Cite
    Arshad Hashmi; Omar M. Barukab; Ahmad Hamza Osman (2024). MCL-FWA-BILSTM and other existing approaches for multiclass classification in both datasets. [Dataset]. http://doi.org/10.1371/journal.pone.0302294.t014
    Available download formats: xls
    Dataset updated
    May 23, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Arshad Hashmi; Omar M. Barukab; Ahmad Hamza Osman
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    MCL-FWA-BILSTM and other existing approaches for multiclass classification in both datasets.

  12. Performance Metrics of the UNSW-NB15 dataset on the proposed approach.

    • figshare.com
    xls
    Updated May 23, 2024
    Cite
    Arshad Hashmi; Omar M. Barukab; Ahmad Hamza Osman (2024). Performance Metrics of the UNSW-NB15 dataset on the proposed approach. [Dataset]. http://doi.org/10.1371/journal.pone.0302294.t007
    Available download formats: xls
    Dataset updated
    May 23, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Arshad Hashmi; Omar M. Barukab; Ahmad Hamza Osman
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Performance Metrics of the UNSW-NB15 dataset on the proposed approach.

  13. Data from: Multitask Modeling with Confidence Using Matrix Factorization and...

    • acs.figshare.com
    xlsx
    Updated Jun 3, 2023
    Cite
    Ulf Norinder; Fredrik Svensson (2023). Multitask Modeling with Confidence Using Matrix Factorization and Conformal Prediction [Dataset]. http://doi.org/10.1021/acs.jcim.9b00027.s001
    Available download formats: xlsx
    Dataset updated
    Jun 3, 2023
    Dataset provided by
    ACS Publications
    Authors
    Ulf Norinder; Fredrik Svensson
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Multitask prediction of bioactivities is often faced with challenges relating to the sparsity of data and imbalance between different labels. We propose class conditional (Mondrian) conformal predictors using underlying Macau models as a novel approach for large scale bioactivity prediction. This approach handles both high degrees of missing data and label imbalances while still producing high quality predictive models. When applied to ten assay end points from PubChem, the models generated valid models with an efficiency of 74.0–80.1% at the 80% confidence level with similar performance both for the minority and majority class. Also when deleting progressively larger portions of the available data (0–80%) the performance of the models remained robust with only minor deterioration (reduction in efficiency between 5 and 10%). Compared to using Macau without conformal prediction the method presented here significantly improves the performance on imbalanced data sets.

  14. Confusion matrix of UNSW-NB-15 dataset using MCL-FWA-BILSTM approach.

    • plos.figshare.com
    xls
    Updated May 23, 2024
    Cite
    Arshad Hashmi; Omar M. Barukab; Ahmad Hamza Osman (2024). Confusion matrix of UNSW-NB-15 dataset using MCL-FWA-BILSTM approach. [Dataset]. http://doi.org/10.1371/journal.pone.0302294.t006
    Available download formats: xls
    Dataset updated
    May 23, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Arshad Hashmi; Omar M. Barukab; Ahmad Hamza Osman
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Confusion matrix of UNSW-NB-15 dataset using MCL-FWA-BILSTM approach.

  15. Description of the NSL-KDD dataset attack categories.

    • plos.figshare.com
    xls
    Updated May 23, 2024
    Cite
    Arshad Hashmi; Omar M. Barukab; Ahmad Hamza Osman (2024). Description of the NSL-KDD dataset attack categories. [Dataset]. http://doi.org/10.1371/journal.pone.0302294.t002
    Available download formats: xls
    Dataset updated
    May 23, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Arshad Hashmi; Omar M. Barukab; Ahmad Hamza Osman
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Description of the NSL-KDD dataset attack categories.

  16. Data from: FT-GNN Tool for Bridging HRMS Features and Bioactivity:...

    • acs.figshare.com
    xlsx
    Updated Apr 9, 2025
    Cite
    Fan Fan; Fu Liu; Qingmiao Yu; Ran Yi; Hongqiang Ren; Jinju Geng (2025). FT-GNN Tool for Bridging HRMS Features and Bioactivity: Uncovering Unidentified Estrogen Receptor Agonists in Sewage [Dataset]. http://doi.org/10.1021/acs.est.5c02324.s002
    Available download formats: xlsx
    Dataset updated
    Apr 9, 2025
    Dataset provided by
    ACS Publications
    Authors
    Fan Fan; Fu Liu; Qingmiao Yu; Ran Yi; Hongqiang Ren; Jinju Geng
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Identifying primary estrogen receptor (ER) agonists in municipal sewage is essential for ensuring the health of aquatic environments. Given the complex and variable chemical composition of sewage, the predominant ER agonists remain unclear. High-resolution mass spectrometry (HRMS)-based models have been developed to predict compound bioactivity in complex matrices, but further optimization is needed to effectively bridge HRMS features with ER agonists. To address this challenge, an FT-GNN (fragmentation tree-based graph neural network) model was proposed. Given limited data and class imbalance, data augmentation was performed using model predictions within the applicability domain (AD) and oversampling technique (OTE). Model development results demonstrated that integrating the FT-GNN with data augmentation improved the balanced accuracy (bACC) value by 6%–31%. The developed model, with a high bACC to identify more true ER agonists, efficiently classified tens of thousands of unidentified HRMS features in sewage, reducing postprocessing workload in nontargeted screening. Analysis of ER agonist transformation during sewage treatment revealed the anaerobic stage as key to both their removal and formation. Estrogenic effect balance analysis suggests that α-E2 and 9,11-didehydroestriol may be two previously overlooked key ER agonists. Collectively, the development and application of the FT-GNN model are crucial advancements toward credible tracking and efficient control of estrogenic risks in water.

  17. Performance metrics of NSL-KDD dataset using MCL-FWA-BILSTM model.

    • figshare.com
    xls
    Updated May 23, 2024
    Cite
    Arshad Hashmi; Omar M. Barukab; Ahmad Hamza Osman (2024). Performance metrics of NSL-KDD dataset using MCL-FWA-BILSTM model. [Dataset]. http://doi.org/10.1371/journal.pone.0302294.t005
    Available download formats: xls
    Dataset updated
    May 23, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Arshad Hashmi; Omar M. Barukab; Ahmad Hamza Osman
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Performance metrics of NSL-KDD dataset using MCL-FWA-BILSTM model.

  18. Performance comparisons of iProtDNA-SMOTE and 5 competing predictors on...

    • plos.figshare.com
    xls
    Updated May 15, 2025
    Cite
    Ruiyan Huang; Wangren Qiu; Xuan Xiao; Weizhong Lin (2025). Performance comparisons of iProtDNA-SMOTE and 5 competing predictors on TE129 under independent validation. [Dataset]. http://doi.org/10.1371/journal.pone.0320817.t003
    Available download formats: xls
    Dataset updated
    May 15, 2025
    Dataset provided by
    PLOS ONE
    Authors
    Ruiyan Huang; Wangren Qiu; Xuan Xiao; Weizhong Lin
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Performance comparisons of iProtDNA-SMOTE and 5 competing predictors on TE129 under independent validation.

  19. Performance comparisons of iProtDNA-SMOTE and 4 competing predictors on...

    • plos.figshare.com
    xls
    Updated May 15, 2025
    Cite
    Ruiyan Huang; Wangren Qiu; Xuan Xiao; Weizhong Lin (2025). Performance comparisons of iProtDNA-SMOTE and 4 competing predictors on TE181 under independent validation. [Dataset]. http://doi.org/10.1371/journal.pone.0320817.t004
    Available download formats: xls
    Dataset updated
    May 15, 2025
    Dataset provided by
    PLOS ONE
    Authors
    Ruiyan Huang; Wangren Qiu; Xuan Xiao; Weizhong Lin
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Performance comparisons of iProtDNA-SMOTE and 4 competing predictors on TE181 under independent validation.

  20. Performance comparisons of iProtDNA-SMOTE and 6 competing predictors on TE46...

    • plos.figshare.com
    xls
    Updated May 15, 2025
    Cite
    Ruiyan Huang; Wangren Qiu; Xuan Xiao; Weizhong Lin (2025). Performance comparisons of iProtDNA-SMOTE and 6 competing predictors on TE46 under independent validation. [Dataset]. http://doi.org/10.1371/journal.pone.0320817.t002
    Available download formats: xls
    Dataset updated
    May 15, 2025
    Dataset provided by
    PLOS ONE
    Authors
    Ruiyan Huang; Wangren Qiu; Xuan Xiao; Weizhong Lin
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Performance comparisons of iProtDNA-SMOTE and 6 competing predictors on TE46 under independent validation.
