3 datasets found

Final model performance.
datasetcatalog.nlm.nih.gov
plos.figshare.com
Cite
Kim, Min-Hee; Ishikawa, Kyle; Ahn, Hyeong Jun (2024). Final model performance. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001483647
Dataset updated
Jun 28, 2024
Authors
Kim, Min-Hee; Ishikawa, Kyle; Ahn, Hyeong Jun
Description
In this study, we applied several machine learning models to predict metabolic phenotypes, with a focus on thyroid function, using data from the National Health and Nutrition Examination Survey (NHANES) from 2007 to 2012. The predictors were laboratory parameters relevant to thyroid function or metabolic dysregulation, together with demographic features, and the aim was to uncover potential associations between thyroid function and metabolic phenotypes. Multinomial Logistic Regression performed best, achieving an area under the receiver operating characteristic curve (AUROC) of 0.818, followed closely by Neural Network (AUROC 0.814). Random Forest, Boosted Trees, and K Nearest Neighbors performed below these two methods (AUROC 0.811, 0.811, and 0.786, respectively). In Random Forest, homeostatic model assessment for insulin resistance, serum uric acid, serum albumin, gamma glutamyl transferase, and the triiodothyronine/thyroxine ratio ranked among the most important variables. These results highlight the potential of machine learning for understanding complex relationships in health data. Model performance may, however, vary with data characteristics and specific requirements. We also emphasize the importance of accounting for sampling weights when analyzing complex survey data, and the potential benefit of incorporating additional variables to improve model accuracy and insight. Future research can explore methodologies that combine machine learning, sample weights, and expanded variable sets to further advance survey data analysis.
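The description reports AUROC for multi-class models and stresses sampling weights, but it does not say how the two were combined. The following is a minimal sketch of one possible approach, assuming a one-vs-rest, macro-averaged AUROC in which each positive/negative pair is weighted by the product of the two survey weights. The function names, class labels, and toy values are hypothetical and are not taken from the dataset.

```python
from typing import Hashable, Sequence


def weighted_binary_auroc(
    labels: Sequence[int], scores: Sequence[float], weights: Sequence[float]
) -> float:
    """Weighted probability that a positive case scores above a negative one.

    Each positive/negative pair counts with weight w_pos * w_neg;
    tied scores contribute half of that pair weight.
    """
    pos = [(s, w) for y, s, w in zip(labels, scores, weights) if y == 1]
    neg = [(s, w) for y, s, w in zip(labels, scores, weights) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    concordant = 0.0
    total = 0.0
    for s_pos, w_pos in pos:
        for s_neg, w_neg in neg:
            pair_weight = w_pos * w_neg
            total += pair_weight
            if s_pos > s_neg:
                concordant += pair_weight
            elif s_pos == s_neg:
                concordant += 0.5 * pair_weight
    return concordant / total


def macro_ovr_auroc(
    y_true: Sequence[Hashable],
    proba: Sequence[Sequence[float]],
    weights: Sequence[float],
    classes: Sequence[Hashable],
) -> float:
    """Macro average of one-vs-rest weighted AUROCs (an assumed scheme).

    proba[i][k] is the predicted probability that case i belongs to classes[k].
    """
    per_class = []
    for k, cls in enumerate(classes):
        binary = [1 if y == cls else 0 for y in y_true]
        column = [row[k] for row in proba]
        per_class.append(weighted_binary_auroc(binary, column, weights))
    return sum(per_class) / len(per_class)


if __name__ == "__main__":
    # Toy values for illustration only; not NHANES data.
    y = ["A", "B", "C", "A"]
    p = [[0.7, 0.2, 0.1], [0.3, 0.5, 0.2], [0.2, 0.2, 0.6], [0.4, 0.4, 0.2]]
    w = [1.0, 2.5, 0.8, 1.7]
    print(macro_ovr_auroc(y, p, w, classes=["A", "B", "C"]))
```

The pairwise loop is quadratic in the number of cases, which keeps the sketch easy to verify by hand; a rank-based formulation would be preferable on a full survey sample.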
Preprocessing steps.
datasetcatalog.nlm.nih.gov
plos.figshare.com
Cite
Kim, Min-Hee; Ahn, Hyeong Jun; Ishikawa, Kyle (2024). Preprocessing steps. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001483628
Dataset updated
Jun 28, 2024
Authors
Kim, Min-Hee; Ahn, Hyeong Jun; Ishikawa, Kyle
Hyperparameter settings.
datasetcatalog.nlm.nih.gov
plos.figshare.com
Cite
Ishikawa, Kyle; Ahn, Hyeong Jun; Kim, Min-Hee (2024). Hyperparameter settings. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001483636
Dataset updated
Jun 28, 2024
Authors
Ishikawa, Kyle; Ahn, Hyeong Jun; Kim, Min-Hee