31 datasets found
  1. Pima Indians Diabetes Dataset

    • cubig.ai
    Updated Jun 22, 2025
    Cite
    CUBIG (2025). Pima Indians Diabetes Dataset [Dataset]. https://cubig.ai/store/products/488/pima-indians-diabetes-dataset
    Dataset updated
    Jun 22, 2025
    Dataset authored and provided by
    CUBIG
    License

    https://cubig.ai/store/terms-of-service

    Measurement technique
    Synthetic data generation using AI techniques for model training, Privacy-preserving data transformation via differential privacy
    Description

    1) Data Introduction
    • The Pima Indians Diabetes Dataset is a tabular medical dataset for predicting diabetes (0: non-diabetic, 1: diabetic) from health examination data of Pima Indian women in the United States.

    2) Data Utilization
    (1) Characteristics of the Pima Indians Diabetes Dataset:
    • Each row contains eight health indicators (number of pregnancies, blood sugar, diastolic blood pressure, triceps skin thickness, two-hour blood insulin, BMI, family-history-based diabetes risk, and age) plus a binary outcome (with or without diabetes).
    • The data contains no personal identification information and is widely used in medical diagnosis support and for practicing binary classification algorithms.
    (2) Uses of the Pima Indians Diabetes Dataset:
    • Developing diabetes prediction models: the health indicator data can be used to build machine-learning-based diabetes prediction models such as logistic regression, decision trees, and neural networks.
    • Medical data interpretation and variable importance analysis: interpretation techniques such as SHAP can be applied to analyze each health variable's contribution to the diabetes prediction and its clinical significance.

  2. ‘Pima Indians Diabetes Database’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Jan 28, 2022
    + more versions
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘Pima Indians Diabetes Database’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-pima-indians-diabetes-database-607a/d2070de9/?iid=003-553&v=presentation
    Dataset updated
    Jan 28, 2022
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of ‘Pima Indians Diabetes Database’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/uciml/pima-indians-diabetes-database on 28 January 2022.

    --- Dataset description provided by original source is as follows ---

    Context

    This dataset is originally from the National Institute of Diabetes and Digestive and Kidney Diseases. The objective of the dataset is to diagnostically predict whether or not a patient has diabetes, based on certain diagnostic measurements included in the dataset. Several constraints were placed on the selection of these instances from a larger database. In particular, all patients here are females at least 21 years old of Pima Indian heritage.

    Content

    The dataset consists of several medical predictor variables and one target variable, Outcome. Predictor variables include the number of pregnancies the patient has had, their BMI, insulin level, age, and so on.

    Acknowledgements

    Smith, J.W., Everhart, J.E., Dickson, W.C., Knowler, W.C., & Johannes, R.S. (1988). Using the ADAP learning algorithm to forecast the onset of diabetes mellitus. In Proceedings of the Symposium on Computer Applications and Medical Care (pp. 261--265). IEEE Computer Society Press.

    Inspiration

    Can you build a machine learning model to accurately predict whether or not the patients in the dataset have diabetes?

    --- Original source retains full ownership of the source dataset ---

  3. pima-indians-diabetes-database-partitions

    • huggingface.co
    Updated May 28, 2025
    + more versions
    Cite
    Khoa Nguyen (2025). pima-indians-diabetes-database-partitions [Dataset]. https://huggingface.co/datasets/khoaguin/pima-indians-diabetes-database-partitions
    Dataset updated
    May 28, 2025
    Authors
    Khoa Nguyen
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Pima Indians Diabetes Dataset Split

    This directory contains a dataset split for Pima Indians Diabetes Database.

    Mock Data

    The mock data is a smaller dataset (10 rows) that is used to test the model components.

    Private Data

    The private data is the remaining data that is used to train the model.
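
A mock/private partition of this kind could be produced along the following lines. This is only a sketch: the helper name `split_mock_private`, the choice of taking the first 10 rows as the mock set, and the synthetic sample rows are all assumptions made for illustration, since the dataset card does not say how the mock rows were selected.

```python
import csv
import io


def split_mock_private(rows, mock_size=10):
    """Return (mock, private): the first `mock_size` rows and the remainder."""
    if mock_size < 0:
        raise ValueError("mock_size must be non-negative")
    return rows[:mock_size], rows[mock_size:]


# Tiny synthetic stand-in for the real CSV (values are made up, not real data).
SAMPLE_CSV = "Glucose,BMI,Outcome\n" + "\n".join(
    f"{100 + i},{25.0 + i},{i % 2}" for i in range(25)
)

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
mock, private = split_mock_private(rows)
print(len(mock), len(private))
```

Keeping the mock set small lets pipeline components be exercised without touching the rows reserved for training.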

  4. Pima Indians Diabetes Database

    • kaggle.com
    Updated Jun 28, 2023
    Cite
    Darshil06Shah (2023). Pima Indians Diabetes Database [Dataset]. https://www.kaggle.com/datasets/darshil06shah/pima-indians-diabetes-database
    Dataset updated
    Jun 28, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Darshil06Shah
    Description

    Dataset

    This dataset was created by Darshil06Shah

    Contents

  5. pima-indians-diabetes-database

    • kaggle.com
    zip
    Updated Nov 6, 2020
    Cite
    Angel Torres del Alamo (2020). pima-indians-diabetes-database [Dataset]. https://www.kaggle.com/angeltorresdelalamo/pimaindiansdiabetesdatabase
    Available download formats: zip (9128 bytes)
    Dataset updated
    Nov 6, 2020
    Authors
    Angel Torres del Alamo
    Description

    Dataset

    This dataset was created by Angel Torres del Alamo

    Contents

    It contains the following files:

  6. Pima-Indians-diabetes

    • kaggle.com
    zip
    Updated Sep 19, 2021
    + more versions
    Cite
    SandeepN (2021). Pima-Indians-diabetes [Dataset]. https://www.kaggle.com/sandeep2812/pimaindiansdiabetes
    Available download formats: zip (9003 bytes)
    Dataset updated
    Sep 19, 2021
    Authors
    SandeepN
    Description

    Dataset

    This dataset was created by SandeepN

    Contents

  7. [Global Dataset] Pima Indians Diabetes

    • kaggle.com
    zip
    Updated Apr 30, 2021
    Cite
    Manas Garg (2021). [Global Dataset] Pima Indians Diabetes [Dataset]. https://www.kaggle.com/gargmanas/pima-indians-diabetes
    Available download formats: zip (9001 bytes)
    Dataset updated
    Apr 30, 2021
    Authors
    Manas Garg
    License

    http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html

    Description

    Context

    Share key insights, awesome visualizations, or simply discuss advantages of data, any observed or known properties, challenges, problems, corrections, and any other helpful comments! Post and discuss recent published works that utilize this dataset (including your own). Any and all feedback is welcome and encouraged.

  8. Data from: Pima

    • huggingface.co
    Updated Sep 25, 2023
    Cite
    Pima [Dataset]. https://huggingface.co/datasets/Genius-Society/Pima
    Dataset updated
    Sep 25, 2023
    Dataset authored and provided by
    Genius Society
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset Card for Pima

    The Pima dataset is a well-known data repository in the field of healthcare and machine learning. The dataset contains demographic, clinical and diagnostic characteristics of Pima Indian women and is primarily used to predict the onset of diabetes based on these attributes. Each data point includes information such as age, number of pregnancies, body mass index, blood pressure, and glucose concentration. Researchers and data scientists use the Pima dataset to… See the full description on the dataset page: https://huggingface.co/datasets/Genius-Society/Pima.

  9. Pima Indians Diabetes Database

    • kaggle.com
    zip
    Updated Oct 6, 2016
    Cite
    UCI Machine Learning (2016). Pima Indians Diabetes Database [Dataset]. https://www.kaggle.com/uciml/pima-indians-diabetes-database
    Available download formats: zip (9128 bytes)
    Dataset updated
    Oct 6, 2016
    Dataset authored and provided by
    UCI Machine Learning
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    This dataset is originally from the National Institute of Diabetes and Digestive and Kidney Diseases. The objective of the dataset is to diagnostically predict whether or not a patient has diabetes, based on certain diagnostic measurements included in the dataset. Several constraints were placed on the selection of these instances from a larger database. In particular, all patients here are females at least 21 years old of Pima Indian heritage.

    Content

    The dataset consists of several medical predictor variables and one target variable, Outcome. Predictor variables include the number of pregnancies the patient has had, their BMI, insulin level, age, and so on.

    Acknowledgements

    Smith, J.W., Everhart, J.E., Dickson, W.C., Knowler, W.C., & Johannes, R.S. (1988). Using the ADAP learning algorithm to forecast the onset of diabetes mellitus. In Proceedings of the Symposium on Computer Applications and Medical Care (pp. 261--265). IEEE Computer Society Press.

    Inspiration

    Can you build a machine learning model to accurately predict whether or not the patients in the dataset have diabetes?

  10. PIMA Indians diabetes dataset classification result.

    • plos.figshare.com
    xls
    Updated May 8, 2024
    Cite
    Nur Farahaina Idris; Mohd Arfian Ismail; Mohd Izham Mohd Jaya; Ashraf Osman Ibrahim; Anas W. Abulfaraj; Faisal Binzagr (2024). PIMA Indians diabetes dataset classification result. [Dataset]. http://doi.org/10.1371/journal.pone.0302595.t001
    Available download formats: xls
    Dataset updated
    May 8, 2024
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Nur Farahaina Idris; Mohd Arfian Ismail; Mohd Izham Mohd Jaya; Ashraf Osman Ibrahim; Anas W. Abulfaraj; Faisal Binzagr
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    PIMA Indians diabetes dataset classification result.

  11. ‘Diabetics prediction using logistic regression’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Mar 24, 2018
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2018). ‘Diabetics prediction using logistic regression’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-diabetics-prediction-using-logistic-regression-7c04/latest
    Dataset updated
    Mar 24, 2018
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of ‘Diabetics prediction using logistic regression’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/kandij/diabetes-dataset on 13 February 2022.

    --- Dataset description provided by original source is as follows ---

    The data was collected and made available by “National Institute of Diabetes and Digestive and Kidney Diseases” as part of the Pima Indians Diabetes Database. Several constraints were placed on the selection of these instances from a larger database. In particular, all patients here belong to the Pima Indian heritage (subgroup of Native Americans), and are females of ages 21 and above.

    We’ll be using Python and some of its popular data science packages. First, we will import pandas to read our data from a CSV file and manipulate it for further use. We will also use numpy to convert our data into a format suitable to feed our classification model. We’ll use seaborn and matplotlib for visualizations. We will then import the Logistic Regression algorithm from sklearn, which will help us build our classification model. Lastly, we will use joblib, available in sklearn, to save our model for future use.

    --- Original source retains full ownership of the source dataset ---

  12. Diabetes pima-indians-diabetes-database

    • kaggle.com
    Updated Jun 18, 2020
    Cite
    Kannan.K.R (2020). Diabetes pima-indians-diabetes-database [Dataset]. https://www.kaggle.com/imkrkannan/diabetes-pimaindiansdiabetesdatabase/code
    Dataset updated
    Jun 18, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Kannan.K.R
    Description

    This dataset is originally from the National Institute of Diabetes and Digestive and Kidney Diseases. The objective is to predict based on diagnostic measurements whether a patient has diabetes.

    Several constraints were placed on the selection of these instances from a larger database. In particular, all patients here are females at least 21 years old of Pima Indian heritage.

    • Pregnancies: Number of times pregnant
    • Glucose: Plasma glucose concentration at 2 hours in an oral glucose tolerance test
    • BloodPressure: Diastolic blood pressure (mm Hg)
    • SkinThickness: Triceps skin fold thickness (mm)
    • Insulin: 2-Hour serum insulin (mu U/ml)
    • BMI: Body mass index (weight in kg/(height in m)^2)
    • DiabetesPedigreeFunction: Diabetes pedigree function
    • Age: Age (years)
    • Outcome: Class variable (0 or 1)
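
The nine columns described above map naturally onto a typed record. The following is a minimal sketch, assuming a CSV file with a header row that uses exactly these column names; the `PimaRecord` class, the `parse_records` helper, and the sample row are hypothetical and exist only for illustration (the sample values are made up, not taken from the dataset).

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class PimaRecord:
    pregnancies: int
    glucose: float
    blood_pressure: float
    skin_thickness: float
    insulin: float
    bmi: float
    diabetes_pedigree_function: float
    age: int
    outcome: int


# (CSV column name, type used to parse it), in the order listed in the description.
COLUMNS = [
    ("Pregnancies", int),
    ("Glucose", float),
    ("BloodPressure", float),
    ("SkinThickness", float),
    ("Insulin", float),
    ("BMI", float),
    ("DiabetesPedigreeFunction", float),
    ("Age", int),
    ("Outcome", int),
]


def parse_records(text):
    """Parse CSV text with the nine named columns into PimaRecord objects."""
    records = []
    for row in csv.DictReader(io.StringIO(text)):
        record = PimaRecord(*(cast(row[name]) for name, cast in COLUMNS))
        if record.outcome not in (0, 1):
            raise ValueError(f"Outcome must be 0 or 1, got {record.outcome}")
        records.append(record)
    return records


# Made-up sample row for illustration only.
SAMPLE = (
    "Pregnancies,Glucose,BloodPressure,SkinThickness,Insulin,BMI,"
    "DiabetesPedigreeFunction,Age,Outcome\n"
    "2,120,70,25,80,28.5,0.45,33,0\n"
)

records = parse_records(SAMPLE)
print(records[0])
```

Parsing into explicit types up front surfaces malformed rows early, before any model sees them.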

  13. Diabetics prediction using logistic regression

    • kaggle.com
    Updated May 12, 2019
    Cite
    kandi jagadish (2019). Diabetics prediction using logistic regression [Dataset]. https://www.kaggle.com/kandij/diabetes-dataset/activity
    Dataset updated
    May 12, 2019
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    kandi jagadish
    Description

    The data was collected and made available by “National Institute of Diabetes and Digestive and Kidney Diseases” as part of the Pima Indians Diabetes Database. Several constraints were placed on the selection of these instances from a larger database. In particular, all patients here belong to the Pima Indian heritage (subgroup of Native Americans), and are females of ages 21 and above.

    We’ll be using Python and some of its popular data science packages. First, we will import pandas to read our data from a CSV file and manipulate it for further use. We will also use numpy to convert our data into a format suitable to feed our classification model. We’ll use seaborn and matplotlib for visualizations. We will then import the Logistic Regression algorithm from sklearn, which will help us build our classification model. Lastly, we will use joblib, available in sklearn, to save our model for future use.

  14. Pima Indians Diabetes Database

    • kaggle.com
    zip
    Updated Jul 13, 2021
    Cite
    Abhishek Kumar (2021). Pima Indians Diabetes Database [Dataset]. https://www.kaggle.com/datasets/uniabhi/diabetes-dataset
    Available download formats: zip (9128 bytes)
    Dataset updated
    Jul 13, 2021
    Authors
    Abhishek Kumar
    Description

    Dataset

    This dataset was created by Abhishek Kumar

    Released under Other (specified in description)

    Contents

  15. Diabetes prediction dataset classification result.

    • plos.figshare.com
    xls
    Updated May 8, 2024
    Cite
    Nur Farahaina Idris; Mohd Arfian Ismail; Mohd Izham Mohd Jaya; Ashraf Osman Ibrahim; Anas W. Abulfaraj; Faisal Binzagr (2024). Diabetes prediction dataset classification result. [Dataset]. http://doi.org/10.1371/journal.pone.0302595.t002
    Available download formats: xls
    Dataset updated
    May 8, 2024
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Nur Farahaina Idris; Mohd Arfian Ismail; Mohd Izham Mohd Jaya; Ashraf Osman Ibrahim; Anas W. Abulfaraj; Faisal Binzagr
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Diabetes prediction dataset classification result.

  16. Pima Indians Diabetes Database

    • kaggle.com
    Updated Feb 27, 2025
    Cite
    Vivek Prasad Kushwaha (2025). Pima Indians Diabetes Database [Dataset]. https://www.kaggle.com/datasets/vivekprasadkushwaha/pima-indians-diabetes-database/versions/1
    Dataset updated
    Feb 27, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Vivek Prasad Kushwaha
    Description

    Dataset

    This dataset was created by Vivek Prasad Kushwaha

    Contents

  17. Pima Indians Diabetes (PID).

    • plos.figshare.com
    xls
    Updated Jun 4, 2023
    Cite
    Fei Ye; Xin Yuan Lou; Lin Fu Sun (2023). Pima Indians Diabetes (PID). [Dataset]. http://doi.org/10.1371/journal.pone.0173516.t006
    Available download formats: xls
    Dataset updated
    Jun 4, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Fei Ye; Xin Yuan Lou; Lin Fu Sun
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Pima Indians Diabetes (PID).

  18. Comparing the average time performance, in seconds, of the GLocal-LS-SVM...

    • figshare.com
    xls
    Updated Jun 21, 2023
    Cite
    Ahmed Youssef Ali Amer (2023). Comparing the average time performance, in seconds, of the GLocal-LS-SVM model to the global LS-SVM model, Glocal-SVM, and standard SVM applied to the Pima Indians diabetes dataset. [Dataset]. http://doi.org/10.1371/journal.pone.0285131.t008
    Available download formats: xls
    Dataset updated
    Jun 21, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Ahmed Youssef Ali Amer
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Comparing the average time performance, in seconds, of the GLocal-LS-SVM model to the global LS-SVM model, Glocal-SVM, and standard SVM applied to the Pima Indians diabetes dataset.

  19. Pima Indians Diabetes Dataset

    • kaggle.com
    Updated May 13, 2020
    Cite
    Muhammad Jamal Tariq (2020). Pima Indians Diabetes Dataset [Dataset]. https://www.kaggle.com/jamaltariqcheema/pima-indians-diabetes-dataset/code
    Dataset updated
    May 13, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Muhammad Jamal Tariq
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    The unprocessed dataset was acquired from the UCI Machine Learning organisation. I preprocessed this dataset, which is originally from the National Institute of Diabetes and Digestive and Kidney Diseases. The objective of the dataset is to accurately predict whether or not a patient has diabetes, based on multiple features included in the dataset. I achieved an accuracy score of 92.86 % with a Random Forest Classifier using this dataset. I also developed a web-service Diabetes Prediction System using that trained model. You can explore the Exploratory Data Analysis notebook to better understand the data.

    Attributes Normal Value Range

    • Glucose: Glucose (< 140) = Normal, Glucose (140-200) = Pre-Diabetic, Glucose (> 200) = Diabetic
    • BloodPressure: B.P (< 60) = Below Normal, B.P (60-80) = Normal, B.P (80-90) = Stage 1 Hypertension, B.P (90-120) = Stage 2 Hypertension, B.P (> 120) = Hypertensive Crisis
    • SkinThickness: SkinThickness (< 10) = Below Normal, SkinThickness (10-30) = Normal, SkinThickness (> 30) = Above Normal
    • Insulin: Insulin (< 200) = Normal, Insulin (> 200) = Above Normal
    • BMI: BMI (< 18.5) = Underweight, BMI (18.5-25) = Normal, BMI (25-30) = Overweight, BMI (> 30) = Obese
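
The value ranges listed above can be turned into simple labelling helpers. The sketch below covers only Glucose and BMI; the function names are hypothetical, and the handling of shared band edges is an assumption, because the description does not say which band a boundary value such as BMI 25 belongs to. Here a value equal to a shared edge goes to the higher band, while the outermost cut-offs follow the strict `<` and `>` signs as written.

```python
def categorize_glucose(value):
    """Label a glucose reading using the ranges from the description."""
    if value < 140:
        return "Normal"
    if value <= 200:  # "> 200" is Diabetic, so exactly 200 stays Pre-Diabetic
        return "Pre-Diabetic"
    return "Diabetic"


def categorize_bmi(value):
    """Label a BMI value using the ranges from the description."""
    if value < 18.5:
        return "Underweight"
    if value < 25:  # assumption: exactly 25 is treated as Overweight
        return "Normal"
    if value <= 30:  # "> 30" is Obese, so exactly 30 stays Overweight
        return "Overweight"
    return "Obese"


for glucose, bmi in [(120, 22.0), (150, 27.5), (210, 31.0)]:
    print(glucose, categorize_glucose(glucose), bmi, categorize_bmi(bmi))
```

Labels like these are descriptive bins from the dataset page, not clinical guidance.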

    Acknowledgements

    J. W. Smith, J. E. Everhart, W. C. Dickson, W. C. Knowler and R. S. Johannes, "Using the ADAP Learning Algorithm to Forecast the Onset of Diabetes Mellitus" in Proc. of the Symposium on Computer Applications and Medical Care, pp. 261-265. IEEE Computer Society Press. 1988.

    Inspiration

    Multiple models were trained on the original dataset, but only the Random Forest Classifier reached an accuracy of 78.57 %. With this new preprocessed dataset, an accuracy of 92.86 % was achieved. Can you build a machine learning model that accurately predicts whether a patient has diabetes? And can you achieve an accuracy higher than 92.86 % without overfitting the model?

  20. Comparing the average error performance of the GLocal-LS-SVM and LS-SVM...

    • figshare.com
    xls
    Updated Jun 21, 2023
    Cite
    Ahmed Youssef Ali Amer (2023). Comparing the average error performance of the GLocal-LS-SVM and LS-SVM applied to the Pima Indians Diabetes dataset. [Dataset]. http://doi.org/10.1371/journal.pone.0285131.t007
    Available download formats: xls
    Dataset updated
    Jun 21, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Ahmed Youssef Ali Amer
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Comparing the average error performance of the GLocal-LS-SVM and LS-SVM applied to the Pima Indians Diabetes dataset.
