15 datasets found
  1. Classification result classifiers using TF-IDF with SMOTE.

    • plos.figshare.com
    • datasetcatalog.nlm.nih.gov
    xls
    Updated May 28, 2024
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Khaled Alnowaiser (2024). Classification result classifiers using TF-IDF with SMOTE. [Dataset]. http://doi.org/10.1371/journal.pone.0302304.t007
    Explore at:
    xlsAvailable download formats
    Dataset updated
    May 28, 2024
    Dataset provided by
    PLOShttp://plos.org/
    Authors
    Khaled Alnowaiser
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Classification result classifiers using TF-IDF with SMOTE.

  2. i

    Balanced API Call Sequences Synthetic Dataset (SMOTE)

    • ieee-dataport.org
    Updated Aug 18, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    MD RAHMAN (2025). Balanced API Call Sequences Synthetic Dataset (SMOTE) [Dataset]. https://ieee-dataport.org/documents/balanced-api-call-sequences-synthetic-dataset-smote
    Explore at:
    Dataset updated
    Aug 18, 2025
    Authors
    MD RAHMAN
    Description

    797 malware API call sequences and 1

  3. Classification result of classifiers models using TF without SMOTE.

    • plos.figshare.com
    • datasetcatalog.nlm.nih.gov
    xls
    Updated May 28, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Khaled Alnowaiser (2024). Classification result of classifiers models using TF without SMOTE. [Dataset]. http://doi.org/10.1371/journal.pone.0302304.t004
    Explore at:
    xlsAvailable download formats
    Dataset updated
    May 28, 2024
    Dataset provided by
    PLOShttp://plos.org/
    Authors
    Khaled Alnowaiser
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Classification result of classifiers models using TF without SMOTE.

  4. i

    UIC GII ML SMOTE

    • ieee-dataport.org
    Updated Aug 2, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Ezenwa Nwanesi (2025). UIC GII ML SMOTE [Dataset]. https://ieee-dataport.org/documents/uic-gii-ml-smote
    Explore at:
    Dataset updated
    Aug 2, 2025
    Authors
    Ezenwa Nwanesi
    Description

    their implementation in Africa is limited.

  5. f

    Classification results of machine learning models using TF-IDF with SMOTE.

    • plos.figshare.com
    • datasetcatalog.nlm.nih.gov
    xls
    Updated Jun 14, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Eysha Saad; Saima Sadiq; Ramish Jamil; Furqan Rustam; Arif Mehmood; Gyu Sang Choi; Imran Ashraf (2023). Classification results of machine learning models using TF-IDF with SMOTE. [Dataset]. http://doi.org/10.1371/journal.pone.0270327.t006
    Explore at:
    xlsAvailable download formats
    Dataset updated
    Jun 14, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Eysha Saad; Saima Sadiq; Ramish Jamil; Furqan Rustam; Arif Mehmood; Gyu Sang Choi; Imran Ashraf
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Classification results of machine learning models using TF-IDF with SMOTE.

  6. i

    Korean Voice Phishing Detection Dataset with Multilingual Back-Translation...

    • ieee-dataport.org
    Updated Nov 11, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    MILANDU KEITH MOUSSAVOU BOUSSOUGOU (2024). Korean Voice Phishing Detection Dataset with Multilingual Back-Translation and SMOTE Augmentations [Dataset]. https://ieee-dataport.org/documents/korean-voice-phishing-detection-dataset-multilingual-back-translation-and-smote
    Explore at:
    Dataset updated
    Nov 11, 2024
    Authors
    MILANDU KEITH MOUSSAVOU BOUSSOUGOU
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Chinese

  7. f

    Classification results of machine learning models using BoW with SMOTE.

    • figshare.com
    xls
    Updated Jun 17, 2023
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Eysha Saad; Saima Sadiq; Ramish Jamil; Furqan Rustam; Arif Mehmood; Gyu Sang Choi; Imran Ashraf (2023). Classification results of machine learning models using BoW with SMOTE. [Dataset]. http://doi.org/10.1371/journal.pone.0270327.t007
    Explore at:
    xlsAvailable download formats
    Dataset updated
    Jun 17, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Eysha Saad; Saima Sadiq; Ramish Jamil; Furqan Rustam; Arif Mehmood; Gyu Sang Choi; Imran Ashraf
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Classification results of machine learning models using BoW with SMOTE.

  8. f

    Classification results of proposed ER-VC model for binary classification on...

    • plos.figshare.com
    xls
    Updated Jun 1, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Eysha Saad; Saima Sadiq; Ramish Jamil; Furqan Rustam; Arif Mehmood; Gyu Sang Choi; Imran Ashraf (2023). Classification results of proposed ER-VC model for binary classification on SMOTE-balanced data. [Dataset]. http://doi.org/10.1371/journal.pone.0270327.t014
    Explore at:
    xlsAvailable download formats
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Eysha Saad; Saima Sadiq; Ramish Jamil; Furqan Rustam; Arif Mehmood; Gyu Sang Choi; Imran Ashraf
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Classification results of proposed ER-VC model for binary classification on SMOTE-balanced data.

  9. f

    Classification results of deep neural networks without SMOTE.

    • plos.figshare.com
    xls
    Updated Jun 1, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Eysha Saad; Saima Sadiq; Ramish Jamil; Furqan Rustam; Arif Mehmood; Gyu Sang Choi; Imran Ashraf (2023). Classification results of deep neural networks without SMOTE. [Dataset]. http://doi.org/10.1371/journal.pone.0270327.t013
    Explore at:
    xlsAvailable download formats
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Eysha Saad; Saima Sadiq; Ramish Jamil; Furqan Rustam; Arif Mehmood; Gyu Sang Choi; Imran Ashraf
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Classification results of deep neural networks without SMOTE.

  10. f

    Classification results of classifiers using fastText.

    • plos.figshare.com
    • datasetcatalog.nlm.nih.gov
    xls
    Updated May 28, 2024
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Khaled Alnowaiser (2024). Classification results of classifiers using fastText. [Dataset]. http://doi.org/10.1371/journal.pone.0302304.t011
    Explore at:
    xlsAvailable download formats
    Dataset updated
    May 28, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Khaled Alnowaiser
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Classification results of classifiers using fastText.

  11. f

    Hyperparameter details of all machine learning models.

    • figshare.com
    xls
    Updated May 28, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Khaled Alnowaiser (2024). Hyperparameter details of all machine learning models. [Dataset]. http://doi.org/10.1371/journal.pone.0302304.t002
    Explore at:
    xlsAvailable download formats
    Dataset updated
    May 28, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Khaled Alnowaiser
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Hyperparameter details of all machine learning models.

  12. f

    Strength and weakness of feature representation technique.

    • plos.figshare.com
    xls
    Updated May 28, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Khaled Alnowaiser (2024). Strength and weakness of feature representation technique. [Dataset]. http://doi.org/10.1371/journal.pone.0302304.t003
    Explore at:
    xlsAvailable download formats
    Dataset updated
    May 28, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Khaled Alnowaiser
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Strength and weakness of feature representation technique.

  13. f

    Example of different sentiments from the citation sentiment corpus.

    • plos.figshare.com
    xls
    Updated May 28, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Khaled Alnowaiser (2024). Example of different sentiments from the citation sentiment corpus. [Dataset]. http://doi.org/10.1371/journal.pone.0302304.t001
    Explore at:
    xlsAvailable download formats
    Dataset updated
    May 28, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Khaled Alnowaiser
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Example of different sentiments from the citation sentiment corpus.

  14. f

    Classification results of machine learning models using CNN features with.

    • figshare.com
    xls
    Updated May 28, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Khaled Alnowaiser (2024). Classification results of machine learning models using CNN features with. [Dataset]. http://doi.org/10.1371/journal.pone.0302304.t008
    Explore at:
    xlsAvailable download formats
    Dataset updated
    May 28, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Khaled Alnowaiser
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Classification results of machine learning models using CNN features with.

  15. f

    Comparison of genomic diabetes prediction models.

    • plos.figshare.com
    xls
    Updated Jul 18, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Said A. Salloum; Khaled Mohammad Alomari; Ayham Salloum (2025). Comparison of genomic diabetes prediction models. [Dataset]. http://doi.org/10.1371/journal.pone.0328253.t001
    Explore at:
    xlsAvailable download formats
    Dataset updated
    Jul 18, 2025
    Dataset provided by
    PLOS ONE
    Authors
    Said A. Salloum; Khaled Mohammad Alomari; Ayham Salloum
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Diabetes Mellitus is a global health concern, characterized by high blood sugar levels over a prolonged period, leading to severe complications if left unmanaged. The early identification of individuals at risk is critical for effective intervention and treatment. Traditional diagnostic methods rely heavily on clinical symptoms and biochemical tests, which may not capture the underlying genetic predispositions. With the advent of genomics, DNA sequence analysis has emerged as a promising approach to uncover the genetic markers associated with Diabetes Mellitus. However, the challenge lies in accurately classifying DNA sequences to predict susceptibility to the disease, given the complex nature of genetic data. This study addresses this challenge by employing two advanced machine learning models, NuSVC (Nu-Support Vector Classification) and XGBoost (Extreme Gradient Boosting), to classify DNA sequences related to Diabetes Mellitus. The dataset, obtained from reputable sources like NCBI, was preprocessed using Natural Language Processing (NLP) techniques, where DNA sequences were treated as textual data and transformed into numerical features using TF-IDF (Term Frequency-Inverse Document Frequency). To handle the class imbalance in the dataset, SMOTE (Synthetic Minority Over-sampling Technique) was applied. The models were trained and validated using 10-fold cross-validation. XGBoost was trained with up to 300 boosting rounds, and performance was evaluated using accuracy, precision, recall, F1-score, ROC-AUC, and log loss. The results demonstrate that XGBoost outperformed NuSVC across all metrics, achieving an accuracy of 98%, a log loss of 0.0650, and an AUC of 1.00, compared to NuSVC’s accuracy of 87%, log loss of 0.2649, and AUC of 0.95. The superior performance of XGBoost indicates its robustness in handling complex genetic data and its potential utility in clinical applications for early diagnosis of Diabetes Mellitus. The findings of this study underscore the importance of advanced machine learning techniques in genomics and suggest that integrating such models into healthcare systems could significantly enhance predictive diagnostics.

  16. Not seeing a result you expected?
    Learn how you can add new datasets to our index.

Share
FacebookFacebook
TwitterTwitter
Email
Click to copy link
Link copied
Close
Cite
Khaled Alnowaiser (2024). Classification result classifiers using TF-IDF with SMOTE. [Dataset]. http://doi.org/10.1371/journal.pone.0302304.t007
Organization logo

Classification result classifiers using TF-IDF with SMOTE.

Related Article
Explore at:
2 scholarly articles cite this dataset (View in Google Scholar)
xlsAvailable download formats
Dataset updated
May 28, 2024
Dataset provided by
PLOShttp://plos.org/
Authors
Khaled Alnowaiser
License

Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

Classification result classifiers using TF-IDF with SMOTE.

Search
Clear search
Close search
Google apps
Main menu