15 datasets found

Classification result classifiers using TF-IDF with SMOTE.
plos.figshare.com
datasetcatalog.nlm.nih.gov
xls
Updated May 28, 2024
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Khaled Alnowaiser (2024). Classification result classifiers using TF-IDF with SMOTE. [Dataset]. http://doi.org/10.1371/journal.pone.0302304.t007
Explore at:
xlsAvailable download formats
Unique identifier
https://doi.org/10.1371/journal.pone.0302304.t007
Dataset updated
May 28, 2024
Dataset provided by
PLOShttp://plos.org/
Authors
Khaled Alnowaiser
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Classification result classifiers using TF-IDF with SMOTE.
i
Balanced API Call Sequences Synthetic Dataset (SMOTE)
ieee-dataport.org
Updated Aug 18, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
MD RAHMAN (2025). Balanced API Call Sequences Synthetic Dataset (SMOTE) [Dataset]. https://ieee-dataport.org/documents/balanced-api-call-sequences-synthetic-dataset-smote
Explore at:
Dataset updated
Aug 18, 2025
Authors
MD RAHMAN
Description
797 malware API call sequences and 1
Classification result of classifiers models using TF without SMOTE.
plos.figshare.com
datasetcatalog.nlm.nih.gov
xls
Updated May 28, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Khaled Alnowaiser (2024). Classification result of classifiers models using TF without SMOTE. [Dataset]. http://doi.org/10.1371/journal.pone.0302304.t004
Explore at:
xlsAvailable download formats
Unique identifier
https://doi.org/10.1371/journal.pone.0302304.t004
Dataset updated
May 28, 2024
Dataset provided by
PLOShttp://plos.org/
Authors
Khaled Alnowaiser
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Classification result of classifiers models using TF without SMOTE.
i
UIC GII ML SMOTE
ieee-dataport.org
Updated Aug 2, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Ezenwa Nwanesi (2025). UIC GII ML SMOTE [Dataset]. https://ieee-dataport.org/documents/uic-gii-ml-smote
Explore at:
Dataset updated
Aug 2, 2025
Authors
Ezenwa Nwanesi
Description
their implementation in Africa is limited.
f
Classification results of machine learning models using TF-IDF with SMOTE.
plos.figshare.com
datasetcatalog.nlm.nih.gov
xls
Updated Jun 14, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Eysha Saad; Saima Sadiq; Ramish Jamil; Furqan Rustam; Arif Mehmood; Gyu Sang Choi; Imran Ashraf (2023). Classification results of machine learning models using TF-IDF with SMOTE. [Dataset]. http://doi.org/10.1371/journal.pone.0270327.t006
Explore at:
xlsAvailable download formats
Unique identifier
https://doi.org/10.1371/journal.pone.0270327.t006
Dataset updated
Jun 14, 2023
Dataset provided by
PLOS ONE
Authors
Eysha Saad; Saima Sadiq; Ramish Jamil; Furqan Rustam; Arif Mehmood; Gyu Sang Choi; Imran Ashraf
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Classification results of machine learning models using TF-IDF with SMOTE.
i
Korean Voice Phishing Detection Dataset with Multilingual Back-Translation...
ieee-dataport.org
Updated Nov 11, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
MILANDU KEITH MOUSSAVOU BOUSSOUGOU (2024). Korean Voice Phishing Detection Dataset with Multilingual Back-Translation and SMOTE Augmentations [Dataset]. https://ieee-dataport.org/documents/korean-voice-phishing-detection-dataset-multilingual-back-translation-and-smote
Explore at:
Dataset updated
Nov 11, 2024
Authors
MILANDU KEITH MOUSSAVOU BOUSSOUGOU
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Chinese
f
Classification results of machine learning models using BoW with SMOTE.
figshare.com
xls
Updated Jun 17, 2023
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Eysha Saad; Saima Sadiq; Ramish Jamil; Furqan Rustam; Arif Mehmood; Gyu Sang Choi; Imran Ashraf (2023). Classification results of machine learning models using BoW with SMOTE. [Dataset]. http://doi.org/10.1371/journal.pone.0270327.t007
Explore at:
xlsAvailable download formats
Unique identifier
https://doi.org/10.1371/journal.pone.0270327.t007
Dataset updated
Jun 17, 2023
Dataset provided by
PLOS ONE
Authors
Eysha Saad; Saima Sadiq; Ramish Jamil; Furqan Rustam; Arif Mehmood; Gyu Sang Choi; Imran Ashraf
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Classification results of machine learning models using BoW with SMOTE.
f
Classification results of proposed ER-VC model for binary classification on...
plos.figshare.com
xls
Updated Jun 1, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Eysha Saad; Saima Sadiq; Ramish Jamil; Furqan Rustam; Arif Mehmood; Gyu Sang Choi; Imran Ashraf (2023). Classification results of proposed ER-VC model for binary classification on SMOTE-balanced data. [Dataset]. http://doi.org/10.1371/journal.pone.0270327.t014
Explore at:
xlsAvailable download formats
Unique identifier
https://doi.org/10.1371/journal.pone.0270327.t014
Dataset updated
Jun 1, 2023
Dataset provided by
PLOS ONE
Authors
Eysha Saad; Saima Sadiq; Ramish Jamil; Furqan Rustam; Arif Mehmood; Gyu Sang Choi; Imran Ashraf
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Classification results of proposed ER-VC model for binary classification on SMOTE-balanced data.
f
Classification results of deep neural networks without SMOTE.
plos.figshare.com
xls
Updated Jun 1, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Eysha Saad; Saima Sadiq; Ramish Jamil; Furqan Rustam; Arif Mehmood; Gyu Sang Choi; Imran Ashraf (2023). Classification results of deep neural networks without SMOTE. [Dataset]. http://doi.org/10.1371/journal.pone.0270327.t013
Explore at:
xlsAvailable download formats
Unique identifier
https://doi.org/10.1371/journal.pone.0270327.t013
Dataset updated
Jun 1, 2023
Dataset provided by
PLOS ONE
Authors
Eysha Saad; Saima Sadiq; Ramish Jamil; Furqan Rustam; Arif Mehmood; Gyu Sang Choi; Imran Ashraf
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Classification results of deep neural networks without SMOTE.
f
Classification results of classifiers using fastText.
plos.figshare.com
datasetcatalog.nlm.nih.gov
xls
Updated May 28, 2024
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Khaled Alnowaiser (2024). Classification results of classifiers using fastText. [Dataset]. http://doi.org/10.1371/journal.pone.0302304.t011
Explore at:
xlsAvailable download formats
Unique identifier
https://doi.org/10.1371/journal.pone.0302304.t011
Dataset updated
May 28, 2024
Dataset provided by
PLOS ONE
Authors
Khaled Alnowaiser
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Classification results of classifiers using fastText.
f
Hyperparameter details of all machine learning models.
figshare.com
xls
Updated May 28, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Khaled Alnowaiser (2024). Hyperparameter details of all machine learning models. [Dataset]. http://doi.org/10.1371/journal.pone.0302304.t002
Explore at:
xlsAvailable download formats
Unique identifier
https://doi.org/10.1371/journal.pone.0302304.t002
Dataset updated
May 28, 2024
Dataset provided by
PLOS ONE
Authors
Khaled Alnowaiser
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Hyperparameter details of all machine learning models.
f
Strength and weakness of feature representation technique.
plos.figshare.com
xls
Updated May 28, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Khaled Alnowaiser (2024). Strength and weakness of feature representation technique. [Dataset]. http://doi.org/10.1371/journal.pone.0302304.t003
Explore at:
xlsAvailable download formats
Unique identifier
https://doi.org/10.1371/journal.pone.0302304.t003
Dataset updated
May 28, 2024
Dataset provided by
PLOS ONE
Authors
Khaled Alnowaiser
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Strength and weakness of feature representation technique.
f
Example of different sentiments from the citation sentiment corpus.
plos.figshare.com
xls
Updated May 28, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Khaled Alnowaiser (2024). Example of different sentiments from the citation sentiment corpus. [Dataset]. http://doi.org/10.1371/journal.pone.0302304.t001
Explore at:
xlsAvailable download formats
Unique identifier
https://doi.org/10.1371/journal.pone.0302304.t001
Dataset updated
May 28, 2024
Dataset provided by
PLOS ONE
Authors
Khaled Alnowaiser
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Example of different sentiments from the citation sentiment corpus.
f
Classification results of machine learning models using CNN features with.
figshare.com
xls
Updated May 28, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Khaled Alnowaiser (2024). Classification results of machine learning models using CNN features with. [Dataset]. http://doi.org/10.1371/journal.pone.0302304.t008
Explore at:
xlsAvailable download formats
Unique identifier
https://doi.org/10.1371/journal.pone.0302304.t008
Dataset updated
May 28, 2024
Dataset provided by
PLOS ONE
Authors
Khaled Alnowaiser
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Classification results of machine learning models using CNN features with.
f
Comparison of genomic diabetes prediction models.
plos.figshare.com
xls
Updated Jul 18, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Said A. Salloum; Khaled Mohammad Alomari; Ayham Salloum (2025). Comparison of genomic diabetes prediction models. [Dataset]. http://doi.org/10.1371/journal.pone.0328253.t001
Explore at:
xlsAvailable download formats
Unique identifier
https://doi.org/10.1371/journal.pone.0328253.t001
Dataset updated
Jul 18, 2025
Dataset provided by
PLOS ONE
Authors
Said A. Salloum; Khaled Mohammad Alomari; Ayham Salloum
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Diabetes Mellitus is a global health concern, characterized by high blood sugar levels over a prolonged period, leading to severe complications if left unmanaged. The early identification of individuals at risk is critical for effective intervention and treatment. Traditional diagnostic methods rely heavily on clinical symptoms and biochemical tests, which may not capture the underlying genetic predispositions. With the advent of genomics, DNA sequence analysis has emerged as a promising approach to uncover the genetic markers associated with Diabetes Mellitus. However, the challenge lies in accurately classifying DNA sequences to predict susceptibility to the disease, given the complex nature of genetic data. This study addresses this challenge by employing two advanced machine learning models, NuSVC (Nu-Support Vector Classification) and XGBoost (Extreme Gradient Boosting), to classify DNA sequences related to Diabetes Mellitus. The dataset, obtained from reputable sources like NCBI, was preprocessed using Natural Language Processing (NLP) techniques, where DNA sequences were treated as textual data and transformed into numerical features using TF-IDF (Term Frequency-Inverse Document Frequency). To handle the class imbalance in the dataset, SMOTE (Synthetic Minority Over-sampling Technique) was applied. The models were trained and validated using 10-fold cross-validation. XGBoost was trained with up to 300 boosting rounds, and performance was evaluated using accuracy, precision, recall, F1-score, ROC-AUC, and log loss. The results demonstrate that XGBoost outperformed NuSVC across all metrics, achieving an accuracy of 98%, a log loss of 0.0650, and an AUC of 1.00, compared to NuSVC’s accuracy of 87%, log loss of 0.2649, and AUC of 0.95. The superior performance of XGBoost indicates its robustness in handling complex genetic data and its potential utility in clinical applications for early diagnosis of Diabetes Mellitus. The findings of this study underscore the importance of advanced machine learning techniques in genomics and suggest that integrating such models into healthcare systems could significantly enhance predictive diagnostics.
Not seeing a result you expected?
Learn how you can add new datasets to our index.