26 datasets found
  1. [Global Dataset] Pima Indians Diabetes

    • kaggle.com
    zip
    Updated Apr 30, 2021
    Cite
    Manas Garg (2021). [Global Dataset] Pima Indians Diabetes [Dataset]. https://www.kaggle.com/gargmanas/pima-indians-diabetes
    Explore at:
    Available download formats: zip (9001 bytes)
    Dataset updated
    Apr 30, 2021
    Authors
    Manas Garg
    License

    http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html

    Description

    Context

    Share key insights, awesome visualizations, or simply discuss advantages of data, any observed or known properties, challenges, problems, corrections, and any other helpful comments! Post and discuss recent published works that utilize this dataset (including your own). Any and all feedback is welcome and encouraged.

  2. ‘Pima Indians Diabetes Database’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Nov 13, 2021
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2021). ‘Pima Indians Diabetes Database’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-pima-indians-diabetes-database-846b/34a07be3/?iid=004-844&v=presentation
    Explore at:
    Dataset updated
    Nov 13, 2021
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of ‘Pima Indians Diabetes Database’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/uciml/pima-indians-diabetes-database on 12 November 2021.

    --- Dataset description provided by original source is as follows ---

    Context

    This dataset is originally from the National Institute of Diabetes and Digestive and Kidney Diseases. The objective of the dataset is to diagnostically predict whether or not a patient has diabetes, based on certain diagnostic measurements included in the dataset. Several constraints were placed on the selection of these instances from a larger database. In particular, all patients here are females at least 21 years old of Pima Indian heritage.

    Content

    The dataset consists of several medical predictor variables and one target variable, Outcome. Predictor variables include the number of pregnancies the patient has had, their BMI, insulin level, age, and so on.

    Acknowledgements

    Smith, J.W., Everhart, J.E., Dickson, W.C., Knowler, W.C., & Johannes, R.S. (1988). Using the ADAP learning algorithm to forecast the onset of diabetes mellitus. In Proceedings of the Symposium on Computer Applications and Medical Care (pp. 261-265). IEEE Computer Society Press.

    Inspiration

    Can you build a machine learning model to accurately predict whether or not the patients in the dataset have diabetes?

    --- Original source retains full ownership of the source dataset ---

  3. pima indians diabetes dataset

    • kaggle.com
    Updated Oct 25, 2022
    Cite
    The citation is currently not available for this dataset.
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Oct 25, 2022
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    shivam khatri
    Description

    Dataset

    This dataset was created by shivam khatri

    Contents

  4. Pima-Indians-diabetes

    • kaggle.com
    zip
    Updated Sep 19, 2021
    + more versions
    Cite
    SandeepN (2021). Pima-Indians-diabetes [Dataset]. https://www.kaggle.com/sandeep2812/pimaindiansdiabetes
    Explore at:
    Available download formats: zip (9003 bytes)
    Dataset updated
    Sep 19, 2021
    Authors
    SandeepN
    Description

    Dataset

    This dataset was created by SandeepN

    Contents

  5. ‘Pima Indians Diabetes’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Sep 30, 2021
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2021). ‘Pima Indians Diabetes’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-pima-indians-diabetes-5e94/f679d3ea/?iid=001-678&v=presentation
    Explore at:
    Dataset updated
    Sep 30, 2021
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of ‘Pima Indians Diabetes’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/gargmanas/pima-indians-diabetes on 30 September 2021.

    --- Dataset description provided by original source is as follows ---

    Context

    Share key insights, awesome visualizations, or simply discuss advantages of data, any observed or known properties, challenges, problems, corrections, and any other helpful comments! Post and discuss recent published works that utilize this dataset (including your own). Any and all feedback is welcome and encouraged.

    --- Original source retains full ownership of the source dataset ---

  6. Pima Diabetes Database

    • kaggle.com
    Updated Jan 12, 2020
    Cite
    The citation is currently not available for this dataset.
    Explore at:
    Dataset updated
    Jan 12, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Rishabh Malhotra
    Description

    Sources:
    (a) Original owners: National Institute of Diabetes and Digestive and Kidney Diseases
    (b) Donor of database: Vincent Sigillito (vgs@aplcen.apl.jhu.edu), Research Center, RMI Group Leader, Applied Physics Laboratory, The Johns Hopkins University, Johns Hopkins Road, Laurel, MD 20707, (301) 953-6231
    (c) Date received: 9 May 1990

    Past Usage:

    Smith, J. W., Everhart, J. E., Dickson, W. C., Knowler, W. C., & Johannes, R. S. (1988). Using the ADAP learning algorithm to forecast the onset of diabetes mellitus. In Proceedings of the Symposium on Computer Applications and Medical Care (pp. 261-265). IEEE Computer Society Press.

    The diagnostic, binary-valued variable investigated is whether the patient shows signs of diabetes according to World Health Organization criteria (i.e., if the 2 hour post-load plasma glucose was at least 200 mg/dl at any survey examination or if found during routine medical care). The population lives near Phoenix, Arizona, USA.

    Results: Their ADAP algorithm makes a real-valued prediction between 0 and 1. This was transformed into a binary decision using a cutoff of 0.448. Using 576 training instances, the sensitivity and specificity of their algorithm were both 76% on the remaining 192 instances.

    Relevant Information: Several constraints were placed on the selection of these instances from a larger database. In particular, all patients here are females at least 21 years old of Pima Indian heritage. ADAP is an adaptive learning routine that generates and executes digital analogs of perceptron-like devices. It is a unique algorithm; see the paper for details.
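
The thresholding step described under Results can be sketched as follows. This is a minimal illustration, not the ADAP algorithm itself: the scores and labels are made-up values, the function names are hypothetical, and treating a score equal to the cutoff as positive is an assumption, since the description does not say which side the boundary falls on.

```python
def binarize(scores, cutoff=0.448):
    """Turn real-valued predictions in [0, 1] into 0/1 decisions.

    Assumption: a score exactly equal to the cutoff counts as positive.
    """
    return [1 if score >= cutoff else 0 for score in scores]


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels and predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


# Made-up scores and labels, for illustration only.
scores = [0.10, 0.50, 0.448, 0.30, 0.90, 0.40]
labels = [0, 1, 1, 0, 1, 1]

preds = binarize(scores)
sens, spec = sensitivity_specificity(labels, preds)
print(preds, sens, spec)
```

The split quoted above is consistent with the dataset size: 576 training instances plus 192 held-out instances gives 768 in total.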

  7. Replication Data for: Pima Indians Diabetes

    • dataverse.harvard.edu
    Updated Apr 6, 2016
    Cite
    Christopher Bartley (2016). Replication Data for: Pima Indians Diabetes [Dataset]. http://doi.org/10.7910/DVN/XFOZQR
    Explore at:
    Dataset updated
    Apr 6, 2016
    Dataset provided by
    Harvard Dataverse
    Authors
    Christopher Bartley
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Original data from: https://archive.ics.uci.edu/ml/datasets/Pima+Indians+Diabetes

    Changes made:
    - Rows with missing values ('0' values) in the BP, triceps, insulin, and BMI columns were removed. The number of rows was reduced from 768 (original) to 394.

    Attributes:
    0. Class variable (-1 = normal or +1 = diabetes)
    1. Number of times pregnant
    2. Plasma glucose concentration at 2 hours in an oral glucose tolerance test
    3. Diastolic blood pressure (mm Hg)
    4. Triceps skin fold thickness (mm)
    5. 2-Hour serum insulin (mu U/ml)
    6. Body mass index (weight in kg/(height in m)^2)
    7. Diabetes pedigree function
    8. Age (years)
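
The row filter described under the changes above might look like the sketch below. The column names and the sample rows are hypothetical, invented for illustration; only the rule itself (drop a row when BP, triceps, insulin, or BMI is 0) comes from the description.

```python
# Hypothetical column names for the four measurements where the
# description treats a value of 0 as "missing".
ZERO_MEANS_MISSING = ("blood_pressure", "triceps", "insulin", "bmi")


def drop_rows_with_zero_placeholders(rows, columns=ZERO_MEANS_MISSING):
    """Keep only rows where none of the given columns holds the 0 placeholder."""
    return [row for row in rows if all(row[col] != 0 for col in columns)]


# Made-up rows, not taken from the real dataset.
rows = [
    {"pregnancies": 0, "blood_pressure": 70, "triceps": 30, "insulin": 100, "bmi": 28.0},
    {"pregnancies": 2, "blood_pressure": 0, "triceps": 20, "insulin": 80, "bmi": 25.1},
    {"pregnancies": 1, "blood_pressure": 66, "triceps": 29, "insulin": 0, "bmi": 26.6},
    {"pregnancies": 3, "blood_pressure": 80, "triceps": 25, "insulin": 90, "bmi": 31.5},
]

cleaned = drop_rows_with_zero_placeholders(rows)
print(len(rows), "->", len(cleaned))
```

Note that the first row is kept even though its pregnancy count is 0: a zero there is a legitimate value, and the filter only inspects the four listed columns.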

  8. pima-indians-diabetes-database

    • kaggle.com
    zip
    Updated Nov 6, 2020
    Cite
    Angel Torres del Alamo (2020). pima-indians-diabetes-database [Dataset]. https://www.kaggle.com/angeltorresdelalamo/pimaindiansdiabetesdatabase
    Explore at:
    Available download formats: zip (9128 bytes)
    Dataset updated
    Nov 6, 2020
    Authors
    Angel Torres del Alamo
    Description

    Dataset

    This dataset was created by Angel Torres del Alamo

    Contents

    It contains the following files:

  9. ‘Diabetics prediction using logistic regression’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Sep 30, 2021
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2021). ‘Diabetics prediction using logistic regression’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-diabetics-prediction-using-logistic-regression-2c86/7b553d7b/?iid=004-895&v=presentation
    Explore at:
    Dataset updated
    Sep 30, 2021
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of ‘Diabetics prediction using logistic regression’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/kandij/diabetes-dataset on 30 September 2021.

    --- Dataset description provided by original source is as follows ---

    The data was collected and made available by “National Institute of Diabetes and Digestive and Kidney Diseases” as part of the Pima Indians Diabetes Database. Several constraints were placed on the selection of these instances from a larger database. In particular, all patients here belong to the Pima Indian heritage (subgroup of Native Americans), and are females of ages 21 and above.

    We’ll be using Python and some of its popular data science packages. First, we will import pandas to read our data from a CSV file and manipulate it for further use. We will also use numpy to convert our data into a format suitable for our classification model. We’ll use seaborn and matplotlib for visualizations. We will then import the Logistic Regression algorithm from sklearn to build our classification model. Lastly, we will use joblib, available in sklearn, to save our model for future use.

    --- Original source retains full ownership of the source dataset ---
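
The last two steps of the workflow described above, fitting a logistic regression model and saving it with joblib, could look roughly like this. It is a sketch under stated assumptions: the data is synthetic (two random features, not the Pima columns), the file name is hypothetical, and joblib is imported as the standalone package, which is an assumption about the environment rather than something the description specifies.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in data: two random features and a linearly separable label.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Fit the classifier with scikit-learn's default settings.
model = LogisticRegression().fit(X, y)

# Save the fitted model to disk and load it back.
with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, "model.joblib")  # hypothetical file name
    joblib.dump(model, model_path)
    restored = joblib.load(model_path)

# The restored model should reproduce the original predictions.
same_predictions = bool((restored.predict(X) == model.predict(X)).all())
print(same_predictions)
```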

  10. Pima Indians Diabetes

    • kaggle.com
    Updated Feb 8, 2024
    Cite
    The citation is currently not available for this dataset.
    Explore at:
    Dataset updated
    Feb 8, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    NikhilNarasimhan3264
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Dataset

    This dataset was created by NikhilNarasimhan3264

    Released under Apache 2.0

    Contents

  11. Pima Indians Diabetes

    • kaggle.com
    Updated Jun 25, 2020
    Cite
    The citation is currently not available for this dataset.
    Explore at:
    Dataset updated
    Jun 25, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Sevdanur GENC
    Description

    Dataset

    This dataset was created by Sevdanur GENC

    Contents

  12. Pima Indians Diabetes Prediction

    • kaggle.com
    Updated Feb 18, 2022
    Cite
    Md. Anas Mondol (2022). Pima Indians Diabetes Prediction [Dataset]. https://www.kaggle.com/mdanasmondol/pima-indians-diabetes-prediction/metadata
    Explore at:
    Dataset updated
    Feb 18, 2022
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Md. Anas Mondol
    Description

    Dataset

    This dataset was created by Md. Anas Mondol

    Contents

  13. pima-indians-diabetes

    • kaggle.com
    Updated Nov 28, 2019
    Cite
    Nayan Kapri (2019). pima-indians-diabetes [Dataset]. https://www.kaggle.com/datasets/nrkapri/pimaindiansdiabetes
    Explore at:
    Dataset updated
    Nov 28, 2019
    Dataset provided by
    Kaggle
    Authors
    Nayan Kapri
    Description

    Dataset

    This dataset was created by Nayan Kapri

    Contents

  14. pima-indian-diabetes

    • kaggle.com
    Updated Mar 2, 2020
    Cite
    The citation is currently not available for this dataset.
    Explore at:
    Dataset updated
    Mar 2, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Krish Kotha
    Description

    Dataset

    This dataset was created by Krish Kotha

    Contents

  15. diabetes

    • kaggle.com
    Updated Mar 22, 2020
    Cite
    The citation is currently not available for this dataset.
    Explore at:
    Dataset updated
    Mar 22, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Salih ACUR
    Description
    • Number of Instances: 768
    • Number of Attributes: 8 plus class

    For Each Attribute: (all numeric-valued)
    1. Number of times pregnant
    2. Plasma glucose concentration at 2 hours in an oral glucose tolerance test
    3. Diastolic blood pressure (mm Hg)
    4. Triceps skin fold thickness (mm)
    5. 2-Hour serum insulin (mu U/ml)
    6. Body mass index (weight in kg/(height in m)^2)
    7. Diabetes pedigree function
    8. Age (years)
    9. Class variable (0 or 1)

    • Missing Attribute Values: Yes

    • Class Distribution: (class value 1 is interpreted as "tested positive for diabetes")

    Class Value    Number of instances
    0              500
    1              268
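
The class distribution above implies a simple reference point for any classifier: always predicting the majority class. The short sketch below works through that arithmetic using only the two counts given; the variable names are illustrative.

```python
# Class counts as listed in the description (0 = negative, 1 = tested positive).
class_counts = {0: 500, 1: 268}

total = sum(class_counts.values())
shares = {label: count / total for label, count in class_counts.items()}

# Accuracy of a trivial model that always predicts the most common class.
majority_baseline = max(class_counts.values()) / total

print(total)
print({label: round(share, 3) for label, share in shares.items()})
print(round(majority_baseline, 3))
```

Because the classes are imbalanced, an accuracy figure is easier to interpret when compared against this baseline rather than against 50%.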

  16. Pima Indian Diabetes Data

    • kaggle.com
    zip
    Updated Oct 4, 2017
    Cite
    Daniel Silion (2017). Pima Indian Diabetes Data [Dataset]. https://www.kaggle.com/danielsilion/pimadata
    Explore at:
    Available download formats: zip (10748 bytes)
    Dataset updated
    Oct 4, 2017
    Authors
    Daniel Silion
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    There's a story behind every dataset and here's your opportunity to share yours.

    Content

    Pima Indian Diabetes Data

    Acknowledgements

    Jerry Kurata

    Inspiration

    Your data will be in front of the world's largest data science community. What questions do you want to see answered?

  17. Diabetes PIMA Indian

    • kaggle.com
    Updated Aug 1, 2024
    Cite
    nurussakinahh (2024). Diabetes PIMA Indian [Dataset]. https://www.kaggle.com/nurussakinahh/diabetes-pima-indian/discussion
    Explore at:
    Dataset updated
    Aug 1, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    nurussakinahh
    Description

    Dataset

    This dataset was created by nurussakinahh

    Released under Other (specified in description)

    Contents

  18. Diabetes Dataset

    • kaggle.com
    Updated Jul 30, 2022
    Cite
    codestarters (2022). Diabetes Dataset [Dataset]. https://www.kaggle.com/datasets/codestarters/diabetes-dataset/data
    Explore at:
    Dataset updated
    Jul 30, 2022
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    codestarters
    Description

    Dataset

    This dataset was created by codestarters

    Contents

    pima indian diabetes dataset.

  19. Pima Indians onset of diabetes

    • kaggle.com
    Updated Aug 10, 2018
    Cite
    lianglirong (2018). Pima Indians onset of diabetes [Dataset]. https://www.kaggle.com/lianglirong/pima-indians-onset-of-diabetes/code
    Explore at:
    Dataset updated
    Aug 10, 2018
    Dataset provided by
    Kaggle
    Authors
    lianglirong
    Description

    Dataset

    This dataset was created by lianglirong

    Contents

  20. Pima Indians Diabetes Dataset

    • kaggle.com
    zip
    Updated May 13, 2020
    Cite
    Muhammad Jamal Tariq (2020). Pima Indians Diabetes Dataset [Dataset]. https://www.kaggle.com/jamaltariqcheema/pima-indians-diabetes-dataset
    Explore at:
    Available download formats: zip (9353 bytes)
    Dataset updated
    May 13, 2020
    Authors
    Muhammad Jamal Tariq
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    The unprocessed dataset was acquired from UCI Machine Learning organisation. The data, originally from the National Institute of Diabetes and Digestive and Kidney Diseases, was preprocessed by me. The objective of the dataset is to accurately predict whether or not a patient has diabetes, based on multiple features included in the dataset. I've achieved an accuracy of 92.86 % with a Random Forest Classifier using this dataset. I've also developed a web service, Diabetes Prediction System, using that trained model. You can explore the Exploratory Data Analysis notebook to better understand the data.

    Attributes Normal Value Range

    • Glucose: Glucose (< 140) = Normal, Glucose (140-200) = Pre-Diabetic, Glucose (> 200) = Diabetic
    • BloodPressure: B.P (< 60) = Below Normal, B.P (60-80) = Normal, B.P (80-90) = Stage 1 Hypertension, B.P (90-120) = Stage 2 Hypertension, B.P (> 120) = Hypertensive Crisis
    • SkinThickness: SkinThickness (< 10) = Below Normal, SkinThickness (10-30) = Normal, SkinThickness (> 30) = Above Normal
    • Insulin: Insulin (< 200) = Normal, Insulin (> 200) = Above Normal
    • BMI: BMI (< 18.5) = Underweight, BMI (18.5-25) = Normal, BMI (25-30) = Overweight, BMI (> 30) = Obese
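
The value ranges listed above can be turned into labelling functions, as in the sketch below for glucose and BMI. The ranges are the dataset author's, reproduced as stated and not offered as clinical guidance. The function names are hypothetical, and the handling of boundary values is an assumption: the list places 140 and 25 in two ranges at once and does not assign exactly 200 or exactly 30 to any range, so the code picks one side in each case and says so in a comment.

```python
def glucose_category(glucose):
    """Label a glucose value using the ranges listed in the description."""
    if glucose < 140:
        return "Normal"
    if glucose <= 200:  # assumption: 140 and 200 both count as Pre-Diabetic
        return "Pre-Diabetic"
    return "Diabetic"


def bmi_category(bmi):
    """Label a BMI value using the ranges listed in the description."""
    if bmi < 18.5:
        return "Underweight"
    if bmi < 25:  # assumption: exactly 25 counts as Overweight
        return "Normal"
    if bmi <= 30:  # assumption: exactly 30 counts as Overweight
        return "Overweight"
    return "Obese"


# Made-up sample values, for illustration only.
for glucose, bmi in [(120, 22.0), (140, 27.0), (201, 35.0)]:
    print(glucose, glucose_category(glucose), bmi, bmi_category(bmi))
```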

    Acknowledgements

    J. W. Smith, J. E. Everhart, W. C. Dickson, W. C. Knowler and R. S. Johannes, "Using the ADAP Learning Algorithm to Forecast the Onset of Diabetes Mellitus" in Proc. of the Symposium on Computer Applications and Medical Care, pp. 261-265. IEEE Computer Society Press. 1988.

    Inspiration

    Multiple models were trained on the original dataset, but only the Random Forest Classifier was able to reach an accuracy of 78.57 %. With this new preprocessed dataset, an accuracy of 92.86 % was achieved. Can you build a machine learning model that can accurately predict whether a patient has diabetes or not? And can you achieve an accuracy even higher than 92.86 % without overfitting the model?

[Global Dataset] Pima Indians Diabetes

The Pima Indians Diabetes Dataset!

Explore at:
2 scholarly articles cite this dataset (View in Google Scholar)