90 datasets found
  1. Classification result classifiers using TF-IDF with SMOTE.

    • plos.figshare.com
    xls
    Updated May 28, 2024
    Cite
    Khaled Alnowaiser (2024). Classification result classifiers using TF-IDF with SMOTE. [Dataset]. http://doi.org/10.1371/journal.pone.0302304.t007
    Available download formats: xls
    Dataset updated
    May 28, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Khaled Alnowaiser
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Classification result classifiers using TF-IDF with SMOTE.

  2. Adaptive Swarm Balancing Algorithms for rare-event prediction in imbalanced...

    • plos.figshare.com
    txt
    Updated Jun 18, 2023
    Cite
    Jinyan Li; Lian-sheng Liu; Simon Fong; Raymond K. Wong; Sabah Mohammed; Jinan Fiaidhi; Yunsick Sung; Kelvin K. L. Wong (2023). Adaptive Swarm Balancing Algorithms for rare-event prediction in imbalanced healthcare data [Dataset]. http://doi.org/10.1371/journal.pone.0180830
    Available download formats: txt
    Dataset updated
    Jun 18, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Jinyan Li; Lian-sheng Liu; Simon Fong; Raymond K. Wong; Sabah Mohammed; Jinan Fiaidhi; Yunsick Sung; Kelvin K. L. Wong
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Clinical data analysis and forecasting have made substantial contributions to disease control, prevention and detection. However, such data usually suffer from highly imbalanced class distributions. In this paper, we aim to formulate effective methods to rebalance binary imbalanced datasets in which the positive samples form only a minority. We investigate two meta-heuristic algorithms, particle swarm optimization and the bat algorithm, and apply them to enhance the effect of the synthetic minority over-sampling technique (SMOTE) for pre-processing the datasets. One approach is to process the full dataset as a whole. The other is to split up the dataset and adaptively process it one segment at a time. The experimental results reported in this paper reveal that the performance improvements obtained by the former approach do not scale to larger datasets. The latter methods, which we call Adaptive Swarm Balancing Algorithms, lead to significant efficiency and effectiveness improvements on large datasets, on which the first method is not viable. We also find it more consistent with typical practice for large imbalanced medical datasets. We further use the meta-heuristic algorithms to optimize two key parameters of SMOTE. The proposed methods lead to more credible classifier performance and shorter run times compared to the brute-force method.
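
    The abstract above relies on SMOTE, which creates synthetic minority samples by interpolating between a minority sample and one of its nearest minority neighbours. The following is a minimal pure-Python sketch of that interpolation step only, not the authors' implementation. The function name `smote_sketch` and the parameters `k` (number of neighbours) and `n_new` (number of synthetic points) are illustrative assumptions; the abstract does not name the two SMOTE parameters it tunes, and the sample points are made up.

```python
import math
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (hypothetical helper)."""
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    if k < 1:
        raise ValueError("need at least two minority samples")
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` among the other minority samples
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> mate
        synthetic.append(tuple(b + gap * (m - b) for b, m in zip(base, mate)))
    return synthetic


# Made-up minority-class points, for illustration only.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_sketch(minority, n_new=6, k=2)
```

    Because every synthetic point lies on a segment between two existing minority points, the new points stay inside the region already covered by the minority class. A meta-heuristic such as the ones named in the abstract could, under this sketch, search over `k` and `n_new`.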

  3. Summary table: Oversampling techniques using SMOTE, ADASYN, and weighted...

    • plos.figshare.com
    xls
    Updated Nov 16, 2023
    Cite
    Alaa Alomari; Hossam Faris; Pedro A. Castillo (2023). Summary table: Oversampling techniques using SMOTE, ADASYN, and weighted rare classes. [Dataset]. http://doi.org/10.1371/journal.pone.0290581.t007
    Available download formats: xls
    Dataset updated
    Nov 16, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Alaa Alomari; Hossam Faris; Pedro A. Castillo
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Summary table: Oversampling techniques using SMOTE, ADASYN, and weighted rare classes.

  4. Indian Liver Patient Dataset (ILPD)

    • kaggle.com
    Updated Sep 27, 2021
    Cite
    Saumya Mohandas N (2021). Indian Liver Patient Dataset (ILPD) [Dataset]. https://www.kaggle.com/datasets/saumyamohandas/indian-liver-patient-dataset-ilpd/discussion
    Dataset updated
    Sep 27, 2021
    Dataset provided by
    Kaggle http://kaggle.com/
    Authors
    Saumya Mohandas N
    Description

    Dataset

    This dataset was created by Saumya Mohandas N

  5. A comparative analysis of earlier studies.

    • plos.figshare.com
    xls
    Updated Jan 18, 2024
    Cite
    Praveen Talari; Bharathiraja N; Gaganpreet Kaur; Hani Alshahrani; Mana Saleh Al Reshan; Adel Sulaiman; Asadullah Shaikh (2024). A comparative analysis of earlier studies. [Dataset]. http://doi.org/10.1371/journal.pone.0292100.t001
    Available download formats: xls
    Dataset updated
    Jan 18, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Praveen Talari; Bharathiraja N; Gaganpreet Kaur; Hani Alshahrani; Mana Saleh Al Reshan; Adel Sulaiman; Asadullah Shaikh
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Diabetes prediction is an ongoing research topic in which medical specialists are attempting to forecast the condition with greater precision. Diabetes typically stays dormant, and if patients are diagnosed with another illness, such as damage to the kidney vessels, problems with the retina of the eye, or a heart condition, it can cause metabolic problems and various complications in the body. Various ensemble learning techniques, including voting, boosting, and bagging, have been applied in this study. The Synthetic Minority Oversampling Technique (SMOTE), along with the K-fold cross-validation approach, was used to achieve class balancing and validate the findings. The Pima Indian Diabetes (PID) dataset from the UCI Machine Learning (UCI ML) repository was chosen for this study. A feature engineering technique was used to calculate the influence of lifestyle factors. A two-phase classification model has been developed to predict insulin resistance using the Sequential Minimal Optimisation (SMO) and SMOTE approaches together. The SMOTE technique is used to preprocess data in the model’s first phase, while SMO classes are used in the second phase. All other classification techniques were outperformed by bagging decision trees in terms of misclassification error rate, accuracy, specificity, precision, recall, F1 measure, and ROC curve. The model was created using a combined SMOTE and SMO strategy, which achieved 99.07% accuracy with 0.1 ms of runtime. The suggested system’s aim is to enhance the classifier’s performance in spotting illness early.

  6. Additional file 2 of Implementation of ensemble machine learning algorithms...

    • springernature.figshare.com
    txt
    Updated Jun 20, 2023
    + more versions
    Cite
    Abdu Rehaman Pasha Syed; Rahul Anbalagan; Anagha S. Setlur; Chandrashekar Karunakaran; Jyoti Shetty; Jitendra Kumar; Vidya Niranjan (2023). Additional file 2 of Implementation of ensemble machine learning algorithms on exome datasets for predicting early diagnosis of cancers [Dataset]. http://doi.org/10.6084/m9.figshare.21592787.v1
    Available download formats: txt
    Dataset updated
    Jun 20, 2023
    Dataset provided by
    figshare
    Authors
    Abdu Rehaman Pasha Syed; Rahul Anbalagan; Anagha S. Setlur; Chandrashekar Karunakaran; Jyoti Shetty; Jitendra Kumar; Vidya Niranjan
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Additional file 2. The synthetic dataset generated through TVAE method.

  7. Classification result of classifiers models using TF without SMOTE.

    • figshare.com
    xls
    Updated May 28, 2024
    + more versions
    Cite
    Khaled Alnowaiser (2024). Classification result of classifiers models using TF without SMOTE. [Dataset]. http://doi.org/10.1371/journal.pone.0302304.t004
    Available download formats: xls
    Dataset updated
    May 28, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Khaled Alnowaiser
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Classification result of classifiers models using TF without SMOTE.

  8. Data from: Mental issues, internet addiction and quality of life predict...

    • data.mendeley.com
    Updated Jul 12, 2024
    + more versions
    Cite
    Andras Matuz (2024). Mental issues, internet addiction and quality of life predict burnout among Hungarian teachers: a machine learning analysis [Dataset]. http://doi.org/10.17632/2yy4j7rgvg.1
    Dataset updated
    Jul 12, 2024
    Authors
    Andras Matuz
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Background: Burnout is usually defined as a state of emotional, physical, and mental exhaustion that affects people in various professions (e.g. physicians, nurses, teachers). The consequences of burnout involve decreased motivation, productivity, and overall diminished well-being. The machine learning-based prediction of burnout has therefore become the focus of recent research. In this study, the aim was to detect burnout using machine learning and to identify its most important predictors in a sample of Hungarian high-school teachers.

    Methods: The final sample consisted of 1,576 high-school teachers (522 male), who completed a survey including various sociodemographic and health-related questions and psychological questionnaires. Specifically, depression, insomnia, internet habits (e.g. when and why one uses the internet) and problematic internet usage were among the most important predictors tested in this study. Supervised classification algorithms were trained to detect burnout assessed by two well-known burnout questionnaires. Feature selection was conducted using recursive feature elimination. Hyperparameters were tuned via grid search with 5-fold cross-validation. Due to class imbalance, class weights (i.e. cost-sensitive learning), downsampling and a hybrid method (SMOTE-ENN) were applied in separate analyses. The final model evaluation was carried out on a previously unseen holdout test sample.

    Results: Burnout was detected in 19.7% of the teachers included in the final dataset. The best predictive performance on the holdout test sample was achieved by support vector machine with SMOTE-ENN (AUC = .942; balanced accuracy = .868; sensitivity = .898; specificity = .837). The best predictors of burnout were Beck’s Depression Inventory scores, Athens Insomnia Scale scores, subscales of the Problematic Internet Use Questionnaire and self-reported current health status.

    Conclusions: The performances of the algorithms were comparable with previous studies; however, it is important to note that we tested our models on previously unseen holdout samples suggesting higher levels of generalizability. Another remarkable finding is that besides depression and insomnia, other variables such as problematic internet use and time spent online also turned out to be important predictors of burnout.
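
    The metrics reported in the Results can be cross-checked against each other, since balanced accuracy is the mean of sensitivity and specificity. The short sketch below uses only the figures quoted in the abstract; the function name `balanced_accuracy` is an illustrative choice, not something taken from the dataset.

```python
def balanced_accuracy(sensitivity, specificity):
    """Mean of the true-positive rate and the true-negative rate."""
    return (sensitivity + specificity) / 2


# Sensitivity and specificity as reported for the SVM with SMOTE-ENN.
reported = balanced_accuracy(0.898, 0.837)
print(round(reported, 4))
```

    The result is about 0.8675, which is consistent with the reported balanced accuracy of .868 once rounding of the inputs is taken into account.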

  9. Systematic analysis and modeling of the FLASH sparing effect as a function...

    • scidb.cn
    Updated Jun 29, 2024
    Cite
    Qibin FU; Tuchen HUANG (2024). Systematic analysis and modeling of the FLASH sparing effect as a function of dose rate and dose [Dataset]. http://doi.org/10.57760/sciencedb.j00186.00150
    Dataset updated
    Jun 29, 2024
    Dataset provided by
    Science Data Bank
    Authors
    Qibin FU; Tuchen HUANG
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Online searches through Web of Science and PubMed were conducted on 15 September 2023 for articles published after 1950 using the following terms: TS = (ultra high dose rate OR ultra-high dose rate OR ultrahigh dose rate) AND TS = (in vivo OR animal model OR mice OR preclinical). The queries produced 980 results in total, with 564 results left after removing duplicate entries. The titles and abstracts were reviewed manually by two authors, and the full text of suitable manuscripts was further screened considering factors such as topics, experimental conditions and methods, research objects, endpoints, etc. The detailed record identification and screening flows based on Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) are summarized in Figure 1. Finally, forty articles were included in our analysis. The FLASH effect was confirmed if there were significant differences in experimental phenomena and data under the two radiation conditions. In the same article, the research items with different endpoints but otherwise identical conditions were regarded as one item. As summarized in Table 1, a total of 131 items were extracted from the 40 articles included in the analysis. For each item, the FLASH effect (1 represents significant sparing effect and 0 represents no sparing effect) and detailed parameters were recorded, including type and energy of the radiation, dose, dose rate, experimental object, pulse characteristics (if provided), etc. Emulating the Quantitative Analyses of Normal Tissue Effects in the Clinic (QUANTEC), the probability of triggering the FLASH effect as a function of mean dose rate or dose was analyzed with a binary logistic regression model. The analysis was done using the SPSS software. Among the data items, there is a large imbalance between the number of entries with and without the FLASH effect (people are more inclined to report research with positive results). Therefore, a more balanced dataset was obtained by oversampling using the K-Means SMOTE algorithm (Figure S1), which was implemented in Python based on the imblearn library. The ROC curve (receiver operating characteristic curve) was plotted as TPR (True Positive Rate) against FPR (False Positive Rate) at different threshold values. The classification model was validated using the AUC (area under the ROC curve) value, which is threshold- and scale-invariant.
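
    The ROC and AUC evaluation described above can be illustrated with a small pure-Python sketch. It sweeps the decision threshold over the predicted scores, collects (FPR, TPR) points, and integrates them with the trapezoidal rule. The labels and scores are made up for illustration and are not taken from the dataset; the function names `roc_points` and `auc` are likewise assumptions.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) pairs obtained by lowering the decision threshold."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every sample tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical labels (1 = effect observed) and model scores.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
curve = roc_points(y_true, scores)
area = auc(curve)
```

    In this toy example eight of the nine positive-negative pairs are ranked correctly, so the area is 8/9. Multiplying all scores by a positive constant leaves the ranking, and therefore the AUC, unchanged, which is the scale invariance the description refers to.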

  10. Data from: CreditCardTransactions Dataset

    • cubig.ai
    Updated Jul 8, 2025
    Cite
    CUBIG (2025). CreditCardTransactions Dataset [Dataset]. https://cubig.ai/store/products/554/creditcardtransactions-dataset
    Dataset updated
    Jul 8, 2025
    Dataset authored and provided by
    CUBIG
    License

    https://cubig.ai/store/terms-of-service

    Measurement technique
    Privacy-preserving data transformation via differential privacy, Synthetic data generation using AI techniques for model training
    Description

    1) Data Introduction
    • The Credit_Card_Transactions Dataset is a representative sample dataset for building fraud detection models, including anonymized real-world transaction data such as financial transaction type, amount, sender/receiver account balance, and fraud indicators.

    2) Data Utilization
    (1) The Credit_Card_Transactions Dataset has the following characteristics:
    • This dataset provides individual transaction records on a row-by-row basis, reflecting the real-world class imbalance problem with an extremely low percentage of fraudulent transactions (isFraud=1).
    • It is an unprocessed raw data structure that allows you to directly use key variables such as transaction time, amount, and account change.
    (2) The Credit_Card_Transactions Dataset can be used for:
    • Binary classification modeling: Fraud detection models can be developed by applying imbalance-handling techniques such as SMOTE and undersampling, together with appropriate evaluation metrics such as F1-score and ROC-AUC.
    • Real-time anomaly detection: It can be used to build a real-time anomaly detection system through analysis of transaction patterns (amount, frequency, account change).

  11. Data from: Image-based automated species identification: Can virtual data...

    • data.niaid.nih.gov
    • zenodo.org
    • +1 more
    zip
    Updated Jul 12, 2021
    Cite
    Morris Klasen; Jonas Eberle; Dirk Ahrens; Volker Steinhage (2021). Image-based automated species identification: Can virtual data augmentation overcome problems of insufficient sampling? [Dataset]. http://doi.org/10.5061/dryad.f1vhhmgx9
    Available download formats: zip
    Dataset updated
    Jul 12, 2021
    Dataset provided by
    Zoological Research Museum Alexander Koenig
    University of Bonn
    Authors
    Morris Klasen; Jonas Eberle; Dirk Ahrens; Volker Steinhage
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    Automated species identification and delimitation is challenging, particularly in rare and thus often scarcely sampled species, which do not allow sufficient discrimination of infraspecific versus interspecific variation. Typical problems arising from either low or exaggerated interspecific morphological differentiation are best met by automated methods of machine learning that learn efficient and effective species identification from training samples. However, limited infraspecific sampling remains a key challenge also in machine learning. In this study, we assessed whether a data augmentation approach may help to overcome the problem of scarce training data in automated visual species identification. The stepwise augmentation of data comprised image rotation as well as visual and virtual augmentation. The visual data augmentation applies classic approaches of data augmentation and generation of artificial images using a Generative Adversarial Networks (GAN) approach. Descriptive feature vectors are derived from bottleneck features of a VGG-16 convolutional neural network (CNN) that are then stepwise reduced in dimensionality using Global Average Pooling and PCA to prevent overfitting. Finally, data augmentation employs synthetic additional sampling in feature space by an oversampling algorithm in vector space (SMOTE). Applied on four different image datasets, which include scarab beetle genitalia (Pleophylla, Schizonycha) as well as wing patterns of bees (Osmia) and cattleheart butterflies (Parides), our augmentation approach outperformed a deep learning baseline approach by means of resulting identification accuracy with non-augmented data as well as a traditional 2D morphometric approach (Procrustes analysis of scarab beetle genitalia).

  12. Results of bioassay 439 dataset in experiment 1.

    • plos.figshare.com
    xls
    Updated Jun 1, 2023
    Cite
    Jinyan Li; Lian-sheng Liu; Simon Fong; Raymond K. Wong; Sabah Mohammed; Jinan Fiaidhi; Yunsick Sung; Kelvin K. L. Wong (2023). Results of bioassay 439 dataset in experiment 1. [Dataset]. http://doi.org/10.1371/journal.pone.0180830.t004
    Available download formats: xls
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Jinyan Li; Lian-sheng Liu; Simon Fong; Raymond K. Wong; Sabah Mohammed; Jinan Fiaidhi; Yunsick Sung; Kelvin K. L. Wong
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Results of bioassay 439 dataset in experiment 1.

  13. The top five rules based on association rule learning with SMOTE for each...

    • plos.figshare.com
    xls
    Updated Jun 1, 2023
    Cite
    Aziz Zafar; Ziad Attia; Mehret Tesfaye; Sosina Walelign; Moges Wordofa; Dessie Abera; Kassu Desta; Aster Tsegaye; Ahmet Ay; Bineyam Taye (2023). The top five rules based on association rule learning with SMOTE for each infection outcome. [Dataset]. http://doi.org/10.1371/journal.pntd.0010517.t003
    Available download formats: xls
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS http://plos.org/
    Authors
    Aziz Zafar; Ziad Attia; Mehret Tesfaye; Sosina Walelign; Moges Wordofa; Dessie Abera; Kassu Desta; Aster Tsegaye; Ahmet Ay; Bineyam Taye
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    For each infection, the five rules with the highest lift values are chosen and sorted. The combinations of risk factors specified on the left lead to the given infection.
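
    Lift, the ranking criterion mentioned above, compares how often the left-hand side and the outcome occur together with how often they would co-occur if they were independent. The sketch below computes it for one rule on made-up records. The item names (`factor_a`, `factor_b`, `infection`) and the function name `lift` are hypothetical and do not come from the dataset.

```python
def lift(records, antecedent, consequent):
    """lift(A -> B) = P(A and B) / (P(A) * P(B)), estimated from records."""
    n = len(records)
    a = set(antecedent)
    b = set(consequent)
    p_a = sum(a <= r for r in records) / n
    p_b = sum(b <= r for r in records) / n
    p_ab = sum((a | b) <= r for r in records) / n
    if p_a == 0 or p_b == 0:
        raise ValueError("antecedent or consequent never occurs")
    return p_ab / (p_a * p_b)


# Made-up records: each set lists the items observed for one subject.
records = [
    {"factor_a", "factor_b", "infection"},
    {"factor_a", "factor_b", "infection"},
    {"factor_a"},
    {"factor_b"},
    {"infection"},
]
rule_lift = lift(records, {"factor_a", "factor_b"}, {"infection"})
```

    A lift above 1 means the risk-factor combination and the outcome co-occur more often than independence would predict. Sorting candidate rules by this value in descending order and keeping the first five would give a top-five list of the kind the description mentions.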

  14. Language_Indentification_v2

    • huggingface.co
    Updated Mar 18, 2025
    Cite
    ProcessVenue (2025). Language_Indentification_v2 [Dataset]. https://huggingface.co/datasets/Process-Venue/Language_Indentification_v2
    Dataset updated
    Mar 18, 2025
    Dataset authored and provided by
    ProcessVenue
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    Dataset Card for Language Identification Dataset

    Sample Notebook: https://www.kaggle.com/code/rishabhbhartiya/indian-language-classification-smote-resampled

    Kaggle Dataset link: https://www.kaggle.com/datasets/processvenue/indian-language-identification

    Dataset Summary

    A comprehensive dataset for Indian language identification and text classification. The dataset contains text samples across 18 major Indian languages, making it suitable for… See the full description on the dataset page: https://huggingface.co/datasets/Process-Venue/Language_Indentification_v2.

  15. Results of bioassay 721 dataset in experiment 1.

    • plos.figshare.com
    xls
    Updated May 31, 2023
    Cite
    Jinyan Li; Lian-sheng Liu; Simon Fong; Raymond K. Wong; Sabah Mohammed; Jinan Fiaidhi; Yunsick Sung; Kelvin K. L. Wong (2023). Results of bioassay 721 dataset in experiment 1. [Dataset]. http://doi.org/10.1371/journal.pone.0180830.t005
    Available download formats: xls
    Dataset updated
    May 31, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Jinyan Li; Lian-sheng Liu; Simon Fong; Raymond K. Wong; Sabah Mohammed; Jinan Fiaidhi; Yunsick Sung; Kelvin K. L. Wong
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Results of bioassay 721 dataset in experiment 1.

  16. Results of Bioassay 687 dataset in experiment 2.

    • plos.figshare.com
    xls
    Updated May 30, 2023
    Cite
    Jinyan Li; Lian-sheng Liu; Simon Fong; Raymond K. Wong; Sabah Mohammed; Jinan Fiaidhi; Yunsick Sung; Kelvin K. L. Wong (2023). Results of Bioassay 687 dataset in experiment 2. [Dataset]. http://doi.org/10.1371/journal.pone.0180830.t013
    Available download formats: xls
    Dataset updated
    May 30, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Jinyan Li; Lian-sheng Liu; Simon Fong; Raymond K. Wong; Sabah Mohammed; Jinan Fiaidhi; Yunsick Sung; Kelvin K. L. Wong
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Results of Bioassay 687 dataset in experiment 2.

  17. Results of bioassay 362 dataset in experiment 1.

    • figshare.com
    xls
    Updated Jun 17, 2023
    + more versions
    Cite
    Jinyan Li; Lian-sheng Liu; Simon Fong; Raymond K. Wong; Sabah Mohammed; Jinan Fiaidhi; Yunsick Sung; Kelvin K. L. Wong (2023). Results of bioassay 362 dataset in experiment 1. [Dataset]. http://doi.org/10.1371/journal.pone.0180830.t007
    Available download formats: xls
    Dataset updated
    Jun 17, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Jinyan Li; Lian-sheng Liu; Simon Fong; Raymond K. Wong; Sabah Mohammed; Jinan Fiaidhi; Yunsick Sung; Kelvin K. L. Wong
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Results of bioassay 362 dataset in experiment 1.

  18. Results of Bioassay 373 dataset in experiment 2.

    • plos.figshare.com
    xls
    Updated Jun 3, 2023
    Cite
    Jinyan Li; Lian-sheng Liu; Simon Fong; Raymond K. Wong; Sabah Mohammed; Jinan Fiaidhi; Yunsick Sung; Kelvin K. L. Wong (2023). Results of Bioassay 373 dataset in experiment 2. [Dataset]. http://doi.org/10.1371/journal.pone.0180830.t012
    Available download formats: xls
    Dataset updated
    Jun 3, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Jinyan Li; Lian-sheng Liu; Simon Fong; Raymond K. Wong; Sabah Mohammed; Jinan Fiaidhi; Yunsick Sung; Kelvin K. L. Wong
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Results of Bioassay 373 dataset in experiment 2.

  19. Average values of Accuracy, Kappa and imbalance ratio (min/maj) for the two...

    • figshare.com
    xls
    Updated May 30, 2023
    Cite
    Jinyan Li; Lian-sheng Liu; Simon Fong; Raymond K. Wong; Sabah Mohammed; Jinan Fiaidhi; Yunsick Sung; Kelvin K. L. Wong (2023). Average values of Accuracy, Kappa and imbalance ratio (min/maj) for the two methods in experiment 2. [Dataset]. http://doi.org/10.1371/journal.pone.0180830.t008
    Available download formats: xls
    Dataset updated
    May 30, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Jinyan Li; Lian-sheng Liu; Simon Fong; Raymond K. Wong; Sabah Mohammed; Jinan Fiaidhi; Yunsick Sung; Kelvin K. L. Wong
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Average values of Accuracy, Kappa and imbalance ratio (min/maj) for the two methods in experiment 2.

  20. Results of Bioassay 1608 dataset in experiment 2.

    • plos.figshare.com
    xls
    Updated May 31, 2023
    Cite
    Jinyan Li; Lian-sheng Liu; Simon Fong; Raymond K. Wong; Sabah Mohammed; Jinan Fiaidhi; Yunsick Sung; Kelvin K. L. Wong (2023). Results of Bioassay 1608 dataset in experiment 2. [Dataset]. http://doi.org/10.1371/journal.pone.0180830.t011
    Available download formats: xls
    Dataset updated
    May 31, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Jinyan Li; Lian-sheng Liu; Simon Fong; Raymond K. Wong; Sabah Mohammed; Jinan Fiaidhi; Yunsick Sung; Kelvin K. L. Wong
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Results of Bioassay 1608 dataset in experiment 2.

Khaled Alnowaiser (2024). Classification result classifiers using TF-IDF with SMOTE. [Dataset]. http://doi.org/10.1371/journal.pone.0302304.t007

Classification result classifiers using TF-IDF with SMOTE.

2 scholarly articles cite this dataset
Available download formats: xls