    Descriptions of dataset(s) employed.

    • plos.figshare.com
    xls
    Updated Aug 1, 2024
    Abhilash Pati; Amrutanshu Panigrahi; Manoranjan Parhi; Jayant Giri; Hong Qin; Saurav Mallik; Sambit Ranjan Pattanayak; Umang Kumar Agrawal (2024). Descriptions of dataset(s) employed. [Dataset]. http://doi.org/10.1371/journal.pone.0304768.t002
    Available download formats: xls
    Dataset updated
    Aug 1, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Abhilash Pati; Amrutanshu Panigrahi; Manoranjan Parhi; Jayant Giri; Hong Qin; Saurav Mallik; Sambit Ranjan Pattanayak; Umang Kumar Agrawal
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Breast cancer is a major health concern and a leading cause of death among women worldwide. Distinguishing malignant tumors from benign ones enables early diagnosis, so doctors need an accurate method of classifying tumors as either malignant or benign. Even if therapy begins immediately after diagnosis, some cancer cells may persist in the body, increasing the risk of recurrence. Metastasis and recurrence are the leading causes of death from breast cancer, so detecting a recurrence early has become a pressing medical issue. Evaluating and comparing Machine Learning (ML) techniques for breast cancer and recurrence prediction is crucial to choosing the most successful method. Inaccurate predictions are common when using datasets with a large number of attributes. In response to the limitations observed in existing approaches, this study addresses the need for effective feature selection and optimization by applying Recursive Feature Elimination (RFE) together with the Grey Wolf Optimizer (GWO). The methods are evaluated on the Wisconsin Diagnostic Breast Cancer (WDBC) and Wisconsin Prognostic Breast Cancer (WPBC) datasets taken from the UCI-ML repository. In the first step, various preprocessing techniques, including imputation and scaling, are applied to the raw data. In the second step, relevant feature correlations are used with RFE to narrow down candidate discriminative features. In the next step, the GWO chooses the combination of attributes that yields the most accurate result. Seven ML classifiers are then applied to both datasets to make a binary decision. On the WDBC and WPBC datasets, respectively, experiments showed accuracies of 98.25% and 93.27%, precisions of 98.13% and 95.56%, sensitivities of 99.06% and 96.63%, specificities of 96.92% and 73.33%, F1-scores of 98.59% and 96.09%, and AUCs of 0.982 and 0.936. The hybrid approach’s superior feature selection improved the performance indicators for breast cancer diagnosis and recurrence classification.
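
The reported accuracy, precision, sensitivity, specificity, and F1-score are all functions of a single confusion matrix, so their definitions can be sketched in a few lines. The description does not give the underlying counts; the two sets of counts below are hypothetical values, inferred here only because they reproduce the quoted percentages when rounded to two decimals. They should not be read as the authors' actual test-set results. The names `Confusion`, `metrics`, `WDBC_GUESS`, and `WPBC_GUESS` are illustrative. AUC is omitted because it depends on ranked classifier scores, which a confusion matrix does not contain.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Counts of a binary confusion matrix."""
    tp: int  # true positives
    fn: int  # false negatives
    tn: int  # true negatives
    fp: int  # false positives


def metrics(c: Confusion) -> dict[str, float]:
    """Return five threshold metrics as percentages rounded to 2 decimals."""
    total = c.tp + c.fn + c.tn + c.fp
    raw = {
        "accuracy": (c.tp + c.tn) / total,
        "precision": c.tp / (c.tp + c.fp),
        "sensitivity": c.tp / (c.tp + c.fn),   # also called recall
        "specificity": c.tn / (c.tn + c.fp),
        "f1": 2 * c.tp / (2 * c.tp + c.fp + c.fn),
    }
    return {name: round(100 * value, 2) for name, value in raw.items()}


# ASSUMPTION: hypothetical counts chosen to match the quoted percentages;
# they are not taken from the dataset description or the article.
WDBC_GUESS = Confusion(tp=105, fn=1, tn=63, fp=2)
WPBC_GUESS = Confusion(tp=86, fn=3, tn=11, fp=4)

if __name__ == "__main__":
    for label, counts in (("WDBC", WDBC_GUESS), ("WPBC", WPBC_GUESS)):
        print(label, metrics(counts))
```

Under these assumed counts, the gap between sensitivity and specificity on WPBC corresponds to a small second class (15 cases in this sketch), where each misclassification moves specificity by several percentage points.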
