63 datasets found
  1. ObesityDataSet_raw_and_data_sinthetic

    • kaggle.com
    zip
    Updated Nov 8, 2025
    Cite
    Ezzaldeen Esmail (2025). ObesityDataSet_raw_and_data_sinthetic [Dataset]. https://www.kaggle.com/datasets/ezzaldeenesmail/obesitydataset-raw-and-data-sinthetic
    Explore at:
    Available download formats: zip (58967 bytes)
    Dataset updated
    Nov 8, 2025
    Authors
    Ezzaldeen Esmail
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Obesity Level Estimation Dataset

    This dataset contains comprehensive information for estimating obesity levels in individuals based on their eating habits and physical conditions. The data includes 2,111 records with 17 attributes collected from individuals in Mexico, Peru, and Colombia, aged between 14 and 61 years.[1][2][3][4]

    Dataset Overview

    The dataset comprises 2,111 observations across 17 features, with no missing values, making it ready for immediate analysis and modeling. An important characteristic of this dataset is that 77% of the data was generated synthetically using the Weka tool and the SMOTE (Synthetic Minority Over-sampling Technique) filter, while 23% was collected directly from real users through a web platform. The data is relatively balanced across seven obesity categories, ranging from insufficient weight to obesity type III.[2][4][1]

    Origin and Context

    This dataset was donated to the UCI Machine Learning Repository on August 26, 2019 by Fabio Mendoza Palechor and Alexis De la Hoz Manotas, and published in the journal Data in Brief. The dataset was created to support the development of intelligent computational tools for identifying obesity levels and building recommender systems to monitor obesity. The synthetic data augmentation approach has been validated and is widely recognized as an effective method for obesity detection research.[4][5][2]

    Features Description

    Demographic Information:
    • Gender: Male or Female
    • Age: Age of the individual (14-61 years)
    • Height: Height in meters (1.45-1.98 m)
    • Weight: Weight in kilograms (39-173 kg)

    Family History:
    • family_history_with_overweight: Family history of overweight (yes/no)

    Eating Habits:
    • FAVC (Frequent consumption of high caloric food): yes/no
    • FCVC (Frequency of consumption of vegetables): Scale 1-3
    • NCP (Number of main meals): 1-4 meals per day
    • CAEC (Consumption of food between meals): no, Sometimes, Frequently, Always
    • CH2O (Consumption of water daily): Scale 1-3 liters

    Physical Condition and Lifestyle:
    • SCC (Calories consumption monitoring): yes/no
    • FAF (Physical activity frequency): Scale 0-3 (times per week)
    • TUE (Time using technology devices): Scale 0-2 hours per day
    • CALC (Consumption of alcohol): no, Sometimes, Frequently, Always

    Habits:
    • SMOKE: Smoking habit (yes/no)
    • MTRANS (Transportation used): Public_Transportation, Automobile, Walking, Motorbike, Bike

    Target Variable:
    • NObeyesdad (Obesity Level): Seven categories
        • Insufficient_Weight (272 records)
        • Normal_Weight (287 records)
        • Overweight_Level_I (290 records)
        • Overweight_Level_II (290 records)
        • Obesity_Type_I (351 records)
        • Obesity_Type_II (297 records)
        • Obesity_Type_III (324 records)
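
    The "relatively balanced" claim can be checked directly from the class counts listed above. The sketch below only restates those counts; the variable names are illustrative and nothing is read from the actual file.

```python
# Class counts exactly as given in the target-variable list above.
CLASS_COUNTS = {
    "Insufficient_Weight": 272,
    "Normal_Weight": 287,
    "Overweight_Level_I": 290,
    "Overweight_Level_II": 290,
    "Obesity_Type_I": 351,
    "Obesity_Type_II": 297,
    "Obesity_Type_III": 324,
}

total = sum(CLASS_COUNTS.values())  # should match the stated 2,111 records
shares = {label: count / total for label, count in CLASS_COUNTS.items()}

# Print each class with its share of the dataset, largest first.
for label, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{label:<22} {CLASS_COUNTS[label]:>4}  {share:6.1%}")

largest = max(shares, key=shares.get)
smallest = min(shares, key=shares.get)
print(f"imbalance ratio: {CLASS_COUNTS[largest] / CLASS_COUNTS[smallest]:.2f}")
```

    The counts add up to the stated record total, and the ratio between the largest and smallest class gives a quick sense of how much rebalancing, if any, a classifier might need.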

    Dataset Statistics

    The dataset exhibits diverse characteristics with ages averaging 24.3 years (ranging from 14 to 61), heights averaging 1.70m, and weights averaging 86.6 kg. The gender distribution is nearly balanced with 1,068 males and 1,043 females. Notably, 81.8% of individuals have a family history of overweight, and 88.4% frequently consume high-caloric food. The most common transportation method is public transportation (74.8%), and most individuals do not smoke (97.9%) or monitor their calorie consumption (95.5%).[1]

    Data Characteristics

    Feature Types: Mixed (continuous, categorical, ordinal, binary)[2]
    Subject Area: Health and Medicine[2]
    Associated Tasks: Multi-class Classification, Regression, Clustering[2]
    Data Source: 23% real survey data + 77% synthetic data using SMOTE[4][2]

    Potential Use Cases

    This dataset is ideal for:
    1. Multi-class Classification: Predicting obesity levels (7 categories) using machine learning algorithms (Decision Trees, Random Forest, SVM, Neural Networks, XGBoost)
    2. Binary Classification: Simplifying to obese vs. non-obese predictions
    3. Regression Analysis: Predicting BMI based on lifestyle and eating habits
    4. Feature Importance Analysis: Identifying key factors contributing to obesity
    5. Clustering Analysis: Discovering natural groupings in eating habits and physical conditions
    6. Health Recommender Systems: Building personalized health monitoring and intervention systems
    7. Public Health Research: Understanding obesity patterns across Latin American populations
    8. Synthetic Data Methodology: Studying the effectiveness of SMOTE for healthcare data augmentation
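
    For the binary-classification and BMI-regression use cases, BMI is weight in kilograms divided by the square of height in meters. The sketch below is illustrative only: the function names are hypothetical, and the cut-offs of 25 (overweight) and 30 (obese) are the common adult thresholds rather than anything this description specifies about how NObeyesdad was assigned.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight in kilograms over height in meters squared."""
    if height_m <= 0:
        raise ValueError("height_m must be positive")
    return weight_kg / height_m ** 2


def bmi_band(value: float) -> str:
    """Coarse band using the common adult cut-offs (an assumption here)."""
    if value >= 30:
        return "obese"
    if value >= 25:
        return "overweight"
    return "not overweight"


# Using the dataset-wide averages quoted in this description
# (86.6 kg, 1.70 m) purely as an example input.
example = bmi(86.6, 1.70)
print(round(example, 2), bmi_band(example))
```

    Because BMI is a deterministic function of Height and Weight, a model that sees both columns can reproduce it almost exactly, so it is worth deciding up front whether those two features belong in the model.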

    Research Applications

    This dataset has been extensively used in machine learning research, with state-of-the-art models achieving accuracy rates exceeding 97% when including BMI-related features (height and weigh...

  2. Percentage of obese U.S. adults by state 2023

    • statista.com
    Updated Nov 19, 2025
    Cite
    Statista (2025). Percentage of obese U.S. adults by state 2023 [Dataset]. https://www.statista.com/statistics/378988/us-obesity-rate-by-state/
    Explore at:
    Dataset updated
    Nov 19, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    2023
    Area covered
    United States
    Description

    West Virginia, Mississippi, and Arkansas are the U.S. states with the highest percentage of their population who are obese. The states with the lowest percentage of their population who are obese include Colorado, Hawaii, and Massachusetts.

    Obesity in the United States

    Obesity is a growing problem in many countries around the world, but the United States has the highest rate of obesity among all OECD countries. The prevalence of obesity in the United States has risen steadily over the previous two decades, with no signs of declining. Obesity in the U.S. is more common among women than men, and overweight and obesity rates are higher among African Americans than any other race or ethnicity.

    Causes and health impacts

    Obesity is most commonly the result of a combination of poor diet, overeating, physical inactivity, and a genetic susceptibility. Obesity is associated with various negative health impacts, including an increased risk of cardiovascular diseases, certain types of cancer, and diabetes type 2. As of 2022, around 8.4 percent of the U.S. population had been diagnosed with diabetes. Diabetes is currently the eighth leading cause of death in the United States.

  3. 💀Deaths And Obesity - 🎀Health

    • kaggle.com
    zip
    Updated May 24, 2024
    Cite
    waticson (2024). 💀Deaths And Obesity - 🎀Health [Dataset]. https://www.kaggle.com/datasets/yutodennou/death-and-obesity
    Explore at:
    Available download formats: zip (224551 bytes)
    Dataset updated
    May 24, 2024
    Authors
    waticson
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    This data set summarizes obesity and the number of deaths caused by it in each country.

    💡I have already divided these into TRAIN data, TEST data, and ANSWER data so you guys can start working on the regression problem right away.

    • train.csv: Obesity and deaths data from 1990 to 2013
    • test.csv: The explanatory variable in 2014
    • answer.csv: The objective variable in 2014

    These data were created with the assumption that the number of deaths due to obesity in 2014 will be estimated from data from 1990 to 2013.

    There is also something called HINT data(hint.csv). This is data for 2015 and beyond. I have left it out of the train or test data because it has many missing values, but it may be useful for forecasting and for those who are interested in more recent data.

    Variables and descriptions:
    • Country: 205 country names
    • Code: Country code, like AFG for Afghanistan
    • Year: Year of collecting data
    • Population: Population in a country
    • Percentage-Overweight: Percentage of adults defined as overweight, BMI >= 25 (age-standardized estimate) (%), Sex: both sexes, Age group: 18+
    • Mean-Daily-Caloric-Supply: Mean of daily supply of calories among overweight or obesity, BMI >= 25 (age-standardized). Covers men only
    • Mean-BMI: BMI, Age group: 18+ years. Two columns, one each for males and females
    • Percentage-Overweighted-Male: Percentage of adults who are overweight (age-standardized), Age group: 18+ years. Two columns, one each for males and females
    • Prevalence-Hypertension-Male: Prevalence of hypertension among adults aged 30-79 years (age-standardized). Two columns, one each for males and females
    • Prevalence-Obesity: Prevalence of obesity among adults, BMI >= 30 (age-standardized estimate) (%), Sex: both sexes, Age group: 18+
    • Death-By-High-BMI: Deaths from all causes attributed to high body-mass index per 100,000 people, both sexes, age-standardized
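
    The train/test/answer split described above maps onto a simple fit-then-score loop. The following is a minimal sketch using only the Python standard library: it fits a one-feature least-squares line and scores predictions with RMSE. The column names are taken from the variable list in this entry and may not match the real CSV headers, and the helper names are hypothetical.

```python
import csv
import io
import math

# Assumed column names (from the variable list in this entry).
FEATURE = "Prevalence-Obesity"
TARGET = "Death-By-High-BMI"


def load_pairs(source, feature=FEATURE, target=TARGET):
    """Read (feature, target) float pairs from an open CSV file object,
    skipping rows where either value is missing or non-numeric."""
    pairs = []
    for row in csv.DictReader(source):
        try:
            pairs.append((float(row[feature]), float(row[target])))
        except (KeyError, TypeError, ValueError):
            continue
    return pairs


def fit_line(pairs):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    if sxx == 0:
        raise ValueError("feature has no variance")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rmse(predicted, actual):
    """Root mean squared error between two equal-length sequences."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


# Toy, made-up rows standing in for train.csv; not real data.
toy_train = io.StringIO(
    "Prevalence-Obesity,Death-By-High-BMI\n10,20\n20,40\n30,60\n"
)
slope, intercept = fit_line(load_pairs(toy_train))
print(slope, intercept)
```

    With the real files, the same functions would be called on `open("train.csv", newline="")`, predictions would be made for the rows of test.csv, and `rmse` would compare them against answer.csv, assuming the three files share row order and headers.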
  4. Diabetes Prevalence Data

    • kaggle.com
    zip
    Updated Feb 22, 2024
    Cite
    Sirisha Singla (2024). Diabetes Prevalence Data [Dataset]. https://www.kaggle.com/datasets/sirishasingla1906/diabetes-prevalence-data
    Explore at:
    Available download formats: zip (101551 bytes)
    Dataset updated
    Feb 22, 2024
    Authors
    Sirisha Singla
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    "Explore detailed statistics on diabetes and obesity prevalence in U.S. states and counties, with a focus on both men and women. This dataset includes numeric data and percentages, shedding light on critical health indicators. The comprehensive insights derived from this dataset serve as a valuable resource for public health professionals, policymakers, and researchers to inform evidence-based interventions and strategies for addressing health disparities across regions."

  5. Statistics on Obesity, Physical Activity and Diet (replaced by Statistics on...

    • digital.nhs.uk
    Updated May 5, 2020
    + more versions
    Cite
    (2020). Statistics on Obesity, Physical Activity and Diet (replaced by Statistics on Public Health) [Dataset]. https://digital.nhs.uk/data-and-information/publications/statistical/statistics-on-obesity-physical-activity-and-diet
    Explore at:
    Dataset updated
    May 5, 2020
    License

    https://digital.nhs.uk/about-nhs-digital/terms-and-conditions

    Time period covered
    Apr 1, 2018 - Dec 31, 2019
    Description

    This report presents information on obesity, physical activity and diet drawn together from a variety of sources for England. More information can be found in the source publications which contain a wider range of data and analysis. Each section provides an overview of key findings, as well as providing links to relevant documents and sources. Some of the data have been published previously by NHS Digital. A data visualisation tool (link provided within the key facts) allows users to select obesity related hospital admissions data for any Local Authority (as contained in the data tables), along with time series data from 2013/14. Regional and national comparisons are also provided.

    The report includes information on:
    • Obesity related hospital admissions, including obesity related bariatric surgery
    • Obesity prevalence
    • Physical activity levels
    • Walking and cycling rates
    • Prescription items for the treatment of obesity
    • Perception of weight and weight management
    • Food and drink purchases and expenditure
    • Fruit and vegetable consumption

    Key facts cover the latest year of data available:
    • Hospital admissions: 2018/19
    • Adult obesity: 2018
    • Childhood obesity: 2018/19
    • Adult physical activity: 12 months to November 2019
    • Children and young people's physical activity: 2018/19 academic year

  6. Estimation of obesity levels UCI dataset

    • kaggle.com
    zip
    Updated Dec 12, 2021
    Cite
    Jayita Bhattacharyya (2021). Estimation of obesity levels UCI dataset [Dataset]. https://www.kaggle.com/datasets/jayitabhattacharyya/estimation-of-obesity-levels-uci-dataset
    Explore at:
    Available download formats: zip (118158 bytes)
    Dataset updated
    Dec 12, 2021
    Authors
    Jayita Bhattacharyya
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    This dataset includes data for the estimation of obesity levels in individuals from Mexico, Peru and Colombia, based on their eating habits and physical condition. The data contains 17 attributes and 2111 records. The records are labeled with the class variable NObesity (Obesity Level), which allows classification of the data using the values Insufficient Weight, Normal Weight, Overweight Level I, Overweight Level II, Obesity Type I, Obesity Type II and Obesity Type III. 77% of the data was generated synthetically using the Weka tool and the SMOTE filter; 23% was collected directly from users through a web platform.

    Original dataset

    • Gender: Female/Male
    • Age: Numeric value
    • Height: Numeric value in meters
    • Weight: Numeric value in kilograms
    • Has a family member suffered or suffers from overweight? Yes/No
    • Do you eat high caloric food frequently? Yes/No
    • Do you usually eat vegetables in your meals? Never/Sometimes/Always
    • How many main meals do you have daily? Between 1 and 2/Three/More than three
    • Do you eat any food between meals? No/Sometimes/Frequently/Always
    • Do you smoke? Yes/No
    • How much water do you drink daily? Less than a liter/Between 1 and 2 L/More than 2 L
    • Do you monitor the calories you eat daily? Yes/No
    • How often do you have physical activity? I do not have/1 or 2 days/2 or 4 days/4 or 5 days
    • How much time do you use technological devices such as cell phone, videogames, television, computer and others? 0–2 hours/3–5 hours/More than 5 hours
    • How often do you drink alcohol? I do not drink/Sometimes/Frequently/Always
    • Which transportation do you usually use? Automobile/Motorbike/Bike/Public Transportation/Walking

  7. Obesity Level

    • kaggle.com
    zip
    Updated Jan 2, 2024
    Cite
    Muh. Rama Saputra (2024). Obesity Level [Dataset]. https://www.kaggle.com/datasets/muhramasaputra/obesity-based-on-eating-habits-and-physical-cond
    Explore at:
    Available download formats: zip (39310 bytes)
    Dataset updated
    Jan 2, 2024
    Authors
    Muh. Rama Saputra
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset includes data for the estimation of obesity levels in individuals from Mexico, Peru and Colombia, based on their eating habits and physical condition. 23% of the data was collected directly from users through a survey conducted by Fabio Mendoza Palechor and Alexis de la Hoz Manotas on a web platform, and 77% of the data was generated synthetically using the Weka tool and the SMOTE filter.

    The data contains 19 attributes and 2111 records.

    • Gender is 1 if a respondent is male and 0 if a respondent is female.
    • Age is a respondent’s age in years.
    • family_history_with_overweight is 1 if a respondent has family member who is or was overweight, 0 if not.
    • FAVC is 1 if a respondent eats high caloric food frequently, 0 if not.
    • FCVC is 1 if a respondent usually eats vegetables in their meals, 0 if not.
    • NCP represents how many main meals a respondent has daily (0 for 1-2 meals, 1 for 3 meals, and 2 for more than 3 meals).
    • CAEC represents how much food a respondent eats between meals on a scale of 0 to 3.
    • SMOKE is 1 if a respondent smokes, 0 if not.
    • CH2O represents how much water a respondent drinks on a scale of 0 to 2.
    • SCC is 1 if a respondent monitors their caloric intake, 0 if not.
    • FAF represents how much physical activity a respondent does on a scale of 0 to 3.
    • TUE represents how much time a respondent spends looking at devices with screens on a scale of 0 to 2.
    • CALC represents how often a respondent drinks alcohol on a scale of 0 to 3.
    • Automobile, Bike, Motorbike, Public_Transportation, and Walking indicate a respondent’s primary mode of transportation. Their primary mode of transportation is indicated by a 1 and the other columns will contain a 0.
    • NObeyesdad is a 1 if a patient is obese and a 0 if not.
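
    The transport columns and the binary target described above can be derived from the original UCI-style fields. The sketch below shows one possible mapping: it assumes the source record has an MTRANS string and a seven-class NObeyesdad label, and it assumes that the three Obesity_Type_* classes are the ones coded as 1, which this description does not state explicitly. The function name is hypothetical.

```python
TRANSPORT_MODES = [
    "Automobile",
    "Bike",
    "Motorbike",
    "Public_Transportation",
    "Walking",
]

# Assumption: only these original classes count as "obese" in the 0/1 target.
OBESE_LABELS = {"Obesity_Type_I", "Obesity_Type_II", "Obesity_Type_III"}


def encode_record(record):
    """Return the one-hot transport columns plus the binary target
    for a single original-format record (a dict of strings)."""
    encoded = {mode: int(record["MTRANS"] == mode) for mode in TRANSPORT_MODES}
    encoded["NObeyesdad"] = int(record["NObeyesdad"] in OBESE_LABELS)
    return encoded


row = {"MTRANS": "Walking", "NObeyesdad": "Overweight_Level_II"}
print(encode_record(row))
```

    Exactly one transport column is set to 1 per record, which matches the one-hot layout described in the bullet list.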
  8. Data set from Ranucci M, de Vincentiis C, Menicanti L, La Rovere MT,...

    • data.niaid.nih.gov
    Updated Oct 3, 2020
    Cite
    Ranucci M; de Vincentiis C; Menicanti L; La Rovere MT; Pistuddi V. (2020). Data set from Ranucci M, de Vincentiis C, Menicanti L, La Rovere MT, Pistuddi V. A gender-based analysis of the obesity paradox in cardiac surgery: height for women, weight for men? Eur J Cardiothorac Surg. 2019 Jul 1;56(1):72-78. doi: 10.1093/ejcts/ezy454. PMID: 30657927. [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4063852
    Explore at:
    Dataset updated
    Oct 3, 2020
    Dataset provided by
    Department of Cardiovascular Anesthesia and Intensive Care, IRCCS Policlinico San Donato, San Donato Milanese, Milan, Italy.
    Department of Cardiac Surgery, IRCCS Policlinico San Donato, San Donato Milanese, Milan, Italy.
    Department of Cardiology, Fondazione Salvatore Maugeri, IRCCS Istituto Scientifico di Montescano, Montescano, Italy.
    Authors
    Ranucci M; de Vincentiis C; Menicanti L; La Rovere MT; Pistuddi V.
    Description

    Data set from Ranucci M, de Vincentiis C, Menicanti L, La Rovere MT, Pistuddi V. A gender-based analysis of the obesity paradox in cardiac surgery: height for women, weight for men? Eur J Cardiothorac Surg. 2019 Jul 1;56(1):72-78. doi: 10.1093/ejcts/ezy454. PMID: 30657927.

    This is the abstract:

    Objectives: In cardiac surgery, obesity is associated with a lower mortality risk. This study aims to investigate the association between body mass index (BMI) and operative mortality separately in female patients and male patients undergoing cardiac surgery and to separate the effects of weight and height in each gender-based cohort of patients.

    Methods: A retrospective cohort study including 7939 consecutive patients who underwent cardiac surgery was conducted. The outcome measure was the operative mortality.

    Results: In men, there was a U-shaped relationship between the BMI and the operative mortality, with the lower mortality rate at a BMI of 35 kg/m². In women, the relationship is J-shaped, with the lower mortality at a BMI of 22 kg/m². Female patients with obesity class II-III had a relative risk for operative mortality of 2.6 [95% confidence interval (CI) 1.37-4.81, P = 0.002]. The relationship between weight and mortality rate is U-shaped both in men and women, with the lower mortality rate at 100 kg for men and 70 kg for women. Height was linearly and inversely associated with the operative mortality in men and women. After correction for the potential confounders, height, but not weight, was independently associated with operative mortality in women (odds ratio 0.949, 95% CI 0.915-0.983; P = 0.004); conversely, in men, this association exists for weight (odds ratio 1.017, 95% CI 1.001-1.032; P = 0.034), but not height.

    Conclusions: Contrary to men, in women obesity does not reduce the operative mortality in cardiac surgery, whereas the height seems to be associated with a lower mortality.

  9. Data_Sheet_1_Sex-Specific Temporal Trends in Overweight and Obese Among...

    • frontiersin.figshare.com
    pdf
    Updated Jun 11, 2023
    Cite
    Yung-Chieh Chang; Wan-Hua Hsieh; Sen-Fang Huang; Hsinyi Hsiao; Ying-Wei Wang; Chia-Hsiang Chu; Shu-Hui Wen (2023). Data_Sheet_1_Sex-Specific Temporal Trends in Overweight and Obese Among Schoolchildren From 2009 to 2018: An Age Period Cohort Analysis.PDF [Dataset]. http://doi.org/10.3389/fped.2021.615483.s001
    Explore at:
    Available download formats: pdf
    Dataset updated
    Jun 11, 2023
    Dataset provided by
    Frontiers Media (http://www.frontiersin.org/)
    Authors
    Yung-Chieh Chang; Wan-Hua Hsieh; Sen-Fang Huang; Hsinyi Hsiao; Ying-Wei Wang; Chia-Hsiang Chu; Shu-Hui Wen
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Background: Our study examined the age, period, and cohort effects on overweight and obesity in children using a 10-year dataset collected from schoolchildren in Hualien, Taiwan.

    Methods: We used data from the annual health checkup of a total of 94,661 schoolchildren in primary schools and junior high schools in Hualien from 2009 to 2018. Children were defined as overweight or obese by the gender- and age-specific norm of the body mass index. We conducted the age-period-cohort (APC) analysis in boys and girls separately.

    Results: From 2009 to 2018, the rates of children overweight and obese were 12.78 and 14.23%, respectively. Boys had higher rates of overweight and obesity than girls (29.73 vs. 24.03%, P < 0.001). Based on APC analysis results, positive age effect existed regardless of gender. The risk of overweight or obesity of children aged 9 or 12 years was significantly higher compared to the average rate. As for period effect, a fluctuating downward trend in overweight was evident in 2016, and a similar trend in obesity was seen in 2017 across gender groups. The birth cohort of 2007 to 2009 had a significant higher proportion of overweight and obese than other birth cohorts. This indicated that the proportion of children overweight and obese in the young generation is higher than that in the old generation.

    Conclusion: An increased risk of children overweight or obese was associated with age and later birth cohort. For the period effect, the trend in the prevalence of overweight and obesity fluctuated downward slowly from 2016 to 2017.

  10. Data from: Sex differences in risk factors for coronary heart disease: a...

    • catalog.data.gov
    • data.virginia.gov
    Updated Sep 6, 2025
    Cite
    National Institutes of Health (2025). Sex differences in risk factors for coronary heart disease: a study in a Brazilian population [Dataset]. https://catalog.data.gov/dataset/sex-differences-in-risk-factors-for-coronary-heart-disease-a-study-in-a-brazilian-populati
    Explore at:
    Dataset updated
    Sep 6, 2025
    Dataset provided by
    National Institutes of Health
    Description

    Background: In Brazil coronary heart disease (CHD) constitutes the most important cause of death in both sexes in all the regions of the country and interestingly, the difference between the sexes in the CHD mortality rates is one of the smallest in the world because of high rates among women. Since a question has been raised about whether or how the incidence of several CHD risk factors differs between the sexes in Brazil the prevalence of various risk factors for CHD such as high blood cholesterol, diabetes mellitus, hypertension, obesity, sedentary lifestyle and cigarette smoking was compared between the sexes in a Brazilian population; also the relationships between blood cholesterol and the other risk factors were evaluated.

    Results: The population presented high frequencies of all the risk factors evaluated. High blood cholesterol (CHOL) and hypertension were more prevalent among women as compared to men. Hypertension, diabetes and smoking showed equal or higher prevalence in women in pre-menopausal ages as compared to men. Obesity and physical inactivity were equally prevalent in both sexes respectively in the postmenopausal age group and at all ages. CHOL was associated with BMI, sex, age, hypertension and physical inactivity.

    Conclusions: In this population the high prevalence of the CHD risk factors indicated that there is an urgent need for its control; the higher or equal prevalences of several risk factors in women could in part explain the high rates of mortality from CHD in females as compared to males.

  11. Obesity_worlds

    • kaggle.com
    zip
    Updated Jan 10, 2024
    Cite
    willian oliveira (2024). Obesity_worlds [Dataset]. https://www.kaggle.com/datasets/willianoliveiragibin/obesity-worlds/versions/1
    Explore at:
    Available download formats: zip (40316 bytes)
    Dataset updated
    Jan 10, 2024
    Authors
    willian oliveira
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    In this dataset, various factors are considered to analyze the health and lifestyle of respondents. The binary variable "Gender" takes a value of 1 for males and 0 for females. "Age" represents the respondent's age in years, providing insight into the demographic composition. The presence of a family history of overweight individuals is denoted by the variable "family_history_with_overweight," where 1 signifies the existence of such a history and 0 indicates the absence.

    Dietary habits are reflected in variables such as "FAVC," indicating frequent consumption of high-caloric foods, and "FCVC," denoting the regular intake of vegetables. The number of main meals consumed daily is captured by "NCP," with values of 0, 1, and 2 representing 1-2 meals, 3 meals, and more than 3 meals, respectively. "CAEC" quantifies the amount of food consumed between meals on a scale from 0 to 3.

    Health-related behaviors include "SMOKE," indicating smoking habits, and "CH2O," measuring daily water intake on a scale of 0 to 2. The monitoring of caloric intake is represented by "SCC," where 1 signifies adherence to such monitoring. Physical activity levels are expressed through "FAF," ranging from 0 to 3. The time spent looking at screens is captured by "TUE" on a scale of 0 to 2, providing insights into sedentary behaviors.

    Alcohol consumption frequency is coded in "CALC," with values ranging from 0 to 3. Modes of transportation are illustrated by the variables "Automobile," "Bike," "Motorbike," "Public_Transportation," and "Walking," each having a value of 1 for the respondent's primary mode and 0 for others.

    The target variable, "NObeyesdad," categorizes respondents as obese (1) or not (0), offering a crucial health indicator. This dataset amalgamates diverse aspects of lifestyle, dietary choices, and health-related behaviors, enabling a comprehensive analysis of factors influencing obesity. By examining the interplay of these variables, researchers and healthcare professionals can gain valuable insights into patterns and trends contributing to obesity within this population.

  12. Supplementary information files for Trends in childhood body mass index...

    • repository.lboro.ac.uk
    • datasetcatalog.nlm.nih.gov
    Updated Sep 25, 2023
    Cite
    Will Johnson (2023). Supplementary information files for Trends in childhood body mass index between 1936 and 2011 showed that underweight remained more common than obesity among 398,970 Danish school children [Dataset]. http://doi.org/10.17028/rd.lboro.24190311.v1
    Explore at:
    Dataset updated
    Sep 25, 2023
    Dataset provided by
    Loughborough University
    Authors
    Will Johnson
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description

    Supplementary files for article: Trends in childhood body mass index between 1936 and 2011 showed that underweight remained more common than obesity among 398,970 Danish school children

    Aim: To examine trends in all body mass index (BMI) groups in children from 1936-2011.

    Methods: We included 197,694 girls and 201,276 boys from the Copenhagen School Health Records Register, born 1930-1996, with longitudinal weight and height measurements (6-14 years). Using International Obesity Task Force criteria, BMI was classified as underweight, normal-weight, overweight and obesity. Sex- and age-specific prevalences were calculated.

    Results: From the 1930s, the prevalence of underweight was stable until a small increase occurred from 1950-1970s, and thereafter it declined into the early 2000s. Using 7-year-olds as an example, underweight changed from 10% to 7% in girls and from 9% to 6% in boys during the study period. The prevalence of overweight plateaued from 1950-1970s and then steeply increased from 1970s onwards and in 1990-2000s 15% girls and 11% boys at 7 years had overweight. The prevalence of obesity particularly increased from 1980s onwards and in 1990-2000s 5% girls and 4% boys at 7 years had obesity. These trends slightly differed by age.

    Conclusion: Among Danish schoolchildren, the prevalence of underweight was greater than overweight until the 1980s and greater than obesity throughout the period. Thus, monitoring the prevalence of childhood underweight remains an important public health issue.

  13. Table_1_Sex non-specific growth charts and potential clinical implications...

    • frontiersin.figshare.com
    • datasetcatalog.nlm.nih.gov
    bin
    Updated Aug 11, 2023
    Cite
    Eric Morris Bomberg; Bradley Scott Miller; Oppong Yaw Addo; Alan David Rogol; Mutaz M. Jaber; Kyriakie Sarafoglou (2023). Table_1_Sex non-specific growth charts and potential clinical implications in the care of transgender youth.docx [Dataset]. http://doi.org/10.3389/fendo.2023.1227886.s001
    Explore at:
    Available download formats: bin
    Dataset updated
    Aug 11, 2023
    Dataset provided by
    Frontiers Media (http://www.frontiersin.org/)
    Authors
    Eric Morris Bomberg; Bradley Scott Miller; Oppong Yaw Addo; Alan David Rogol; Mutaz M. Jaber; Kyriakie Sarafoglou
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Introduction: The Centers for Disease Control and Prevention (CDC) and World Health Organization (WHO) created separate growth charts for girls and boys because growth patterns and rates differ between sexes. However, scenarios exist in which this dichotomizing “girls versus boys” approach may not be ideal, including the care of non-binary youth or transgender youth undergoing transitions consistent with their gender identity. There is therefore a need for growth charts that age smooth differences in pubertal timing between sexes to determine how youth are growing as “children” versus “girls or boys” (e.g., age- and sex-neutral, compared to age- and sex-specific, growth charts). Methods: Employing similar statistical techniques and datasets used to create the CDC 2000 growth charts, we developed age-adjusted, sex non-specific growth charts for height, weight, and body mass index (BMI), and z-score calculators for these parameters. Specifically, these were created using anthropometric data from five US cross-sectional studies including National Health Examination Surveys II-III and National Health and Nutrition Examination Surveys I-III. To illustrate contemporary clinical practice, we overlaid our charts on CDC 2000 girls and boys growth charts. Results: 39,119 youth 2-20 years old (49.5% female; 66.7% non-Hispanic White; 21.7% non-Hispanic Black) were included in the development of our growth charts, reference ranges, and z-score calculators. Respective curves were largely superimposable through around 10 years of age after which, coinciding with pubertal onset timing, differences became more apparent. Discussion: We conclude that age-adjusted, sex non-specific growth charts may be used in clinical situations such as transgender youth in which standard “girls versus boys” growth charts are not ideal. Until longitudinal auxological data are available in these populations, our growth charts may help to assess a transgender youth’s growth trajectory and weight classification, and expectations surrounding these.

  14. Supplementary information files for: The associations of maternal and...

    • datasetcatalog.nlm.nih.gov
    • repository.lboro.ac.uk
    Updated Dec 8, 2022
    Johnson, Will; Baker, Jenifer L.; Pereira, Snehal M. Pinto; Norris, Tom; Costa, Silvia (2022). Supplementary information files for: The associations of maternal and paternal obesity with latent patterns of offspring BMI development between 7-17 years of age: pooled analyses of cohorts born in 1958 and 2001 in the United Kingdom [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000281654
    Explore at:
    Dataset updated
    Dec 8, 2022
    Authors
    Johnson, Will; Baker, Jenifer L.; Pereira, Snehal M. Pinto; Norris, Tom; Costa, Silvia
    Area covered
    United Kingdom
    Description

    Supplementary information files for: The associations of maternal and paternal obesity with latent patterns of offspring BMI development between 7-17 years of age: pooled analyses of cohorts born in 1958 and 2001 in the United Kingdom. Objective: We aimed to 1) describe how the UK obesity epidemic reflects a change over time in the proportion of the population demonstrating adverse latent patterns of BMI development and 2) investigate the potential roles of maternal and paternal BMI in this secular process. Methods: We used serial BMI data between 7-17 years of age from 13220 boys and 12711 girls. Half the sample was born in 1958 and half in 2001. Sex-specific growth mixture models were developed. The relationships of maternal and paternal BMI and weight status with class membership were estimated using the 3-step BCH approach, with covariate adjustment. Results: The selected models had five classes. For each sex, in addition to the two largest normal weight classes, there were “normal weight increasing to overweight” (17% of boys and 20% of girls), “overweight increasing to obesity” (8% and 6%), and “overweight decreasing to normal weight” (3% and 6%) classes. More than 1-in-10 children from the 2001 birth cohort were in the “overweight increasing to obesity” class, compared to less than 1-in-30 from the 1958 birth cohort. Approximately 75% of the mothers and fathers of this class had overweight or obesity. When considered together, both maternal and paternal BMI were associated with latent class membership, with evidence of negative departure from additivity (i.e., the combined effect of maternal and paternal BMI was smaller than the sum of the individual effects). The odds of a girl belonging to the “overweight increasing to obesity” class (compared to the largest normal weight class) were 13.11 (8.74, 19.66) times higher if both parents had overweight or obesity (compared to both parents having normal weight); the equivalent estimate for boys was 9.01 (6.37, 12.75). Conclusions: The increase in obesity rates in the UK over more than 40 years has been partly driven by the growth of a sub-population demonstrating excess BMI gain during adolescence. Our results implicate both maternal and paternal BMI as correlates of this secular process.

  15. Data from: Overweight in Brazilian industry workers: Prevalence and...

    • scielo.figshare.com
    • datasetcatalog.nlm.nih.gov
    jpeg
    Updated May 30, 2023
    Pablo Magno da Silveira; Kelly Samara Silva; Jaqueline Aragoni da Silva; Elusa Santina Antunes de Oliveira; Mauro Virgílio Gomes de Barros; Markus Vinicius Nahas (2023). Overweight in Brazilian industry workers: Prevalence and association with demographic and socioeconomic factors and soft drink intake [Dataset]. http://doi.org/10.6084/m9.figshare.20018303.v1
    Explore at:
    Available download formats: jpeg
    Dataset updated
    May 30, 2023
    Dataset provided by
    SciELO (http://www.scielo.org/)
    Authors
    Pablo Magno da Silveira; Kelly Samara Silva; Jaqueline Aragoni da Silva; Elusa Santina Antunes de Oliveira; Mauro Virgílio Gomes de Barros; Markus Vinicius Nahas
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Objective: To estimate the prevalence of overweight in industry workers and its association with demographic and socioeconomic factors and soft drink intake (including type). Methods: This is a nationwide cross-sectional cohort survey of "Lifestyle and leisure habits of industry workers" conducted between 2006 and 2008 in 24 Brazilian federate units. The participants answered a previously tested questionnaire and self-reported their weight and height. Statistical analyses consisted of crude and adjusted Poisson regression. Results: Males and females had overweight prevalences of 45.7% (95%CI=45.1; 46.2) and 28.1% (95%CI=27.4; 28.9) respectively. Older and married individuals and those working in medium-sized and large factories were more likely to be overweight. Males with higher education levels and gross family incomes were also more likely to be overweight, but not females. Finally, men (PR=1.24; 95%CI=1.13; 1.36) and women (PR=1.40; 95%CI=1.22; 1.61) who consumed diet/light soft drinks were also more likely to be overweight than those who did not consume soft drinks. Conclusion: More than one-third of the workers were overweight according to their self-reported weight and height, and the prevalence of overweight was higher in males. Demographic and socioeconomic variables and diet/light soft drink intake were associated with overweight. These data may be helpful for the development of actions that reduce the risk of overweight in this population.

  16. Two-way ANOVA for BMI among male and female adolescents by age category.

    • plos.figshare.com
    • datasetcatalog.nlm.nih.gov
    xls
    Updated Jun 15, 2023
    Ahmad R. Al-Haifi; Balqees A. Al-Awadhi; Yousef A. Al-Dashti; Badriyah H. Aljazzaf; Ahmad R. Allafi; Mariam A. Al-Mannai; Hazzaa M. Al-Hazzaa (2023). Two-way ANOVA for BMI among male and female adolescents by age category. [Dataset]. http://doi.org/10.1371/journal.pone.0262101.t002
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 15, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Ahmad R. Al-Haifi; Balqees A. Al-Awadhi; Yousef A. Al-Dashti; Badriyah H. Aljazzaf; Ahmad R. Allafi; Mariam A. Al-Mannai; Hazzaa M. Al-Hazzaa
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Two-way ANOVA for BMI among male and female adolescents by age category.

  17. Obesity, Suicides and Unemployment by Country

    • data.niaid.nih.gov
    Updated Apr 12, 2022
    Martin Sanchez Pueyo; Marina Peña Alonso (2022). Obesity, Suicides and Unemployment by Country [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6448785
    Explore at:
    Dataset updated
    Apr 12, 2022
    Authors
    Martin Sanchez Pueyo; Marina Peña Alonso
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains data on obesity, suicides and unemployment, broken down by country. The data come from Wikipedia tables as updated on 11/04/2022. More information can be found in the project's GitHub repository: https://github.com/martinsanc/wikipedia_scraper

    Countries (List of countries by population (United Nations) - Wikipedia)
      Country
      UN continental region
      UN statistical subregion
      Population 1 July 2018
      Population 1 July 2019
      Change

    Unemployment (List of countries by unemployment rate - Wikipedia)
      Unemployment Rate
      Source / date of information

    Suicides (List of countries by suicide rate - Wikipedia)
      All
      Male
      Female

    Obesity rate by country (List of countries by suicide rate - Wikipedia)
      Rank
      Obesity rate

  18. Data from: Secular trends of obesity prevalence in urban Chinese children...

    • omicsdi.org
    • plos.figshare.com
    xml
    + more versions
    Song Y, Secular trends of obesity prevalence in urban Chinese children from 1985 to 2010: gender disparity. [Dataset]. https://www.omicsdi.org/dataset/biostudies/S-EPMC3540080
    Explore at:
    Available download formats: xml
    Authors
    Song Y
    Variables measured
    Unknown
    Description

    Based on the data from six Chinese National Surveys on Students Constitution and Health (CNSSCH) from 1985 to 2010, we explored the secular trend in the prevalence of obesity in urban Chinese children over a period of 25 years. The aim of this study was to examine the gender disparities in the prevalence of childhood obesity over time. The standardized prevalence of obesity in Chinese children increased rapidly during the past 25 years from 0.2% in 1985 to 8.1% in 2010. The increasing trend was significant in all age subgroups (p<0.01). Although the prevalence of obesity continuously increased in both boys and girls, the pace of change was faster in boys than in girls. Age-specific prevalence odds ratios (PORs) of boys versus girls for obesity increased over time during the 25-year period. The prevalence of obesity in boys was significantly higher than in girls in all age-specific subgroups from 1991 onwards. The gradually expanding gender disparity suggests that obesity in boys contributes a large and growing proportion of obese children. Therefore, it is critical to develop and implement gender-specific preventive guidelines and public health policies in China.

  19. Data_Sheet_1_Natural Mineral Waters and Metabolic Syndrome: Insights From...

    • datasetcatalog.nlm.nih.gov
    • frontiersin.figshare.com
    Updated May 24, 2022
    Marianelli, Cinzia; Chiarotti, Flavia; Narciso, Laura; Frassanito, Paolo; Torriani, Flavio; Bernardini, Roberta; Martinelli, Andrea (2022). Data_Sheet_1_Natural Mineral Waters and Metabolic Syndrome: Insights From Obese Male and Female C57BL/6 Mice on Caloric Restriction.docx [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000409350
    Explore at:
    Dataset updated
    May 24, 2022
    Authors
    Marianelli, Cinzia; Chiarotti, Flavia; Narciso, Laura; Frassanito, Paolo; Torriani, Flavio; Bernardini, Roberta; Martinelli, Andrea
    Description

    Metabolic syndrome (MetS) represents one of the greatest challenges to public health given its serious consequences on cardiovascular diseases and type 2 diabetes. A carbohydrate-restricted, low-fat diet is the current therapy for MetS. Natural mineral waters (NMWs) are known to exert beneficial effects on human health. Our primary objective was to shed light on the potential therapeutic properties of NMWs in MetS. A total of 125 C57BL/6 male and female mice were included in the study. Of these, 10 were left untreated. They were fed a standard diet with tap water throughout the study period, and stayed healthy. The remaining 115 mice were initially fed a high-calorie diet (HCD) consisting of a high-fat feed (60% of energy from fat) with 10% fructose in tap water, served ad libitum over a period of 4 months to induce MetS (the MetS induction phase). Mice were then randomly divided into six treatment groups and a control group, all of which received a low-calorie diet (LCD), but with a different kind of drinking water, for 2 months (the treatment phase). Five groups were each treated with a different kind of NMW, one group by alternating the five NMWs, and one group – the control group – was given tap water. Body weight and blood biochemistry were monitored over the 6-month trial. After 4 months, male and female mice on HCD developed obesity, hypercholesterolaemia and hyperglycaemia, although gains in body weight, total cholesterol, and blood glucose in males were greater than those observed in females (P < 0.0001). When combined with an LCD, the NMWs rich in sulphate, magnesium and bicarbonate, and the minimally mineralised one were the most effective in reducing the blood levels of total cholesterol, high-density lipoprotein (HDL) cholesterol, and glucose. Sex differences emerged during both the MetS induction phase and the treatment phase. 
These results suggest that NMWs rich in specific macronutrients, such as bicarbonate, sulphate and magnesium, and minimally mineralised water, in combination with an LCD, may contribute to controlling blood lipid and glucose levels in subjects with MetS. Further studies are needed to confirm these results and to extend them to humans.

  20. Body mass index (BMI) based on self-reported height and weight, by age group...

    • open.canada.ca
    • www150.statcan.gc.ca
    • +1more
    csv, html, xml
    Updated Jan 17, 2023
    + more versions
    Statistics Canada (2023). Body mass index (BMI) based on self-reported height and weight, by age group and sex, household population aged 18 and over excluding pregnant females, (CCHS 3.1, January to June 2005), Canada, provinces and health regions (June 2005 boundaries) [Dataset]. https://open.canada.ca/data/en/dataset/8a87e4a3-60b4-41fa-ba7f-efedf791d313
    Explore at:
    Available download formats: html, csv, xml
    Dataset updated
    Jan 17, 2023
    Dataset provided by
    Statistics Canada
    License

    Open Government Licence - Canada 2.0: https://open.canada.ca/en/open-government-licence-canada
    License information was derived automatically

    Area covered
    Canada
    Description

    This table contains 136080 series, with data for years 2005 - 2005 (not all combinations necessarily have data for all years). This table contains data described by the following dimensions (Not all combinations are available): Geography (126 items: Canada; Central Regional Integrated Health Authority; Newfoundland and Labrador; Newfoundland and Labrador; Eastern Regional Integrated Health Authority; Newfoundland and Labrador ...), Age group (5 items: Total; 18 years and over;18 to 34 years ...), Sex (3 items: Both sexes; Males; Females ...), Body mass index (BMI), self-reported (9 items: Total population for the variable body mass index; self-reported; Normal weight; body mass index; self-reported 18.5 to 24.9;Overweight; body mass index; self-reported 25.0 to 29.9;Underweight; body mass index; self-reported under 18.5 ...), Characteristics (8 items: Number of persons; Low 95% confidence interval; number of persons; Coefficient of variation for number of persons; High 95% confidence interval; number of persons ...).
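The series count quoted in this description appears to be the product of the five dimension sizes it lists. A minimal sketch of that arithmetic, assuming every combination of dimension values counts as one series (the dictionary and variable names are illustrative, not part of the Statistics Canada table):

```python
from math import prod

# Dimension sizes as listed in the table description above.
dimension_sizes = {
    "Geography": 126,
    "Age group": 5,
    "Sex": 3,
    "Body mass index (BMI), self-reported": 9,
    "Characteristics": 8,
}

# Assumption: one series per combination of dimension values.
series_count = prod(dimension_sizes.values())
print(series_count)  # 136080
```

Because the description notes that not all combinations are available, the number of series that actually carry data may be smaller than this product.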

Ezzaldeen Esmail (2025). ObesityDataSet_raw_and_data_sinthetic [Dataset]. https://www.kaggle.com/datasets/ezzaldeenesmail/obesitydataset-raw-and-data-sinthetic

ObesityDataSet_raw_and_data_sinthetic

Explore at:
Available download formats: zip (58967 bytes)
Dataset updated
Nov 8, 2025
Authors
Ezzaldeen Esmail
License

https://creativecommons.org/publicdomain/zero/1.0/

Description

Obesity Level Estimation Dataset

This dataset contains comprehensive information for estimating obesity levels in individuals based on their eating habits and physical conditions. The data includes 2,111 records with 17 attributes collected from individuals in Mexico, Peru, and Colombia, aged between 14 and 61 years.[1][2][3][4]

Dataset Overview

The dataset comprises 2,111 observations across 17 features, with no missing values, making it ready for immediate analysis and modeling. An important characteristic of this dataset is that 77% of the data was generated synthetically using the Weka tool and the SMOTE (Synthetic Minority Over-sampling Technique) filter, while 23% was collected directly from real users through a web platform. The data is relatively balanced across seven obesity categories, ranging from insufficient weight to obesity type III.[2][4][1]

Origin and Context

This dataset was donated to the UCI Machine Learning Repository on August 26, 2019 by Fabio Mendoza Palechor and Alexis De la Hoz Manotas, and published in the journal Data in Brief. The dataset was created to support the development of intelligent computational tools for identifying obesity levels and building recommender systems to monitor obesity. The synthetic data augmentation approach has been validated and is widely recognized as an effective method for obesity detection research.[4][5][2]

Features Description

Demographic Information:
- Gender: Male or Female
- Age: Age of the individual (14-61 years)
- Height: Height in meters (1.45-1.98m)
- Weight: Weight in kilograms (39-173 kg)

Family History:
- family_history_with_overweight: Family history of overweight (yes/no)

Eating Habits:
- FAVC (Frequent consumption of high caloric food): yes/no
- FCVC (Frequency of consumption of vegetables): Scale 1-3
- NCP (Number of main meals): 1-4 meals per day
- CAEC (Consumption of food between meals): no, Sometimes, Frequently, Always
- CH2O (Consumption of water daily): Scale 1-3 liters

Physical Condition and Lifestyle:
- SCC (Calories consumption monitoring): yes/no
- FAF (Physical activity frequency): Scale 0-3 (times per week)
- TUE (Time using technology devices): Scale 0-2 hours per day
- CALC (Consumption of alcohol): no, Sometimes, Frequently, Always

Habits:
- SMOKE: Smoking habit (yes/no)
- MTRANS (Transportation used): Public_Transportation, Automobile, Walking, Motorbike, Bike

Target Variable:
- NObeyesdad (Obesity Level): Seven categories
  - Insufficient_Weight (272 records)
  - Normal_Weight (287 records)
  - Overweight_Level_I (290 records)
  - Overweight_Level_II (290 records)
  - Obesity_Type_I (351 records)
  - Obesity_Type_II (297 records)
  - Obesity_Type_III (324 records)
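
The per-class record counts listed for the target variable can be checked against the stated total of 2,111 records. A small sketch of that check, using only the counts given above (the variable names are illustrative):

```python
# Record counts per NObeyesdad category, copied from the list above.
class_counts = {
    "Insufficient_Weight": 272,
    "Normal_Weight": 287,
    "Overweight_Level_I": 290,
    "Overweight_Level_II": 290,
    "Obesity_Type_I": 351,
    "Obesity_Type_II": 297,
    "Obesity_Type_III": 324,
}

total = sum(class_counts.values())

# Share of each class as a percentage of all records, rounded to one decimal.
shares = {label: round(100 * n / total, 1) for label, n in class_counts.items()}

print(total)
for label, pct in shares.items():
    print(f"{label}: {pct}%")
```

The counts sum to 2,111, and the class shares run from roughly 13% to 17%, which is consistent with the description of the data as relatively balanced.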

Dataset Statistics

The dataset exhibits diverse characteristics with ages averaging 24.3 years (ranging from 14 to 61), heights averaging 1.70m, and weights averaging 86.6 kg. The gender distribution is nearly balanced with 1,068 males and 1,043 females. Notably, 81.8% of individuals have a family history of overweight, and 88.4% frequently consume high-caloric food. The most common transportation method is public transportation (74.8%), and most individuals do not smoke (97.9%) or monitor their calorie consumption (95.5%).[1]

Data Characteristics

Feature Types: Mixed (continuous, categorical, ordinal, binary)[2]
Subject Area: Health and Medicine[2]
Associated Tasks: Multi-class Classification, Regression, Clustering[2]
Data Source: 23% real survey data + 77% synthetic data using SMOTE[4][2]

Potential Use Cases

This dataset is ideal for:
1. Multi-class Classification: Predicting obesity levels (7 categories) using machine learning algorithms (Decision Trees, Random Forest, SVM, Neural Networks, XGBoost)
2. Binary Classification: Simplifying to obese vs. non-obese predictions
3. Regression Analysis: Predicting BMI based on lifestyle and eating habits
4. Feature Importance Analysis: Identifying key factors contributing to obesity
5. Clustering Analysis: Discovering natural groupings in eating habits and physical conditions
6. Health Recommender Systems: Building personalized health monitoring and intervention systems
7. Public Health Research: Understanding obesity patterns across Latin American populations
8. Synthetic Data Methodology: Studying the effectiveness of SMOTE for healthcare data augmentation
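
Two of the use cases above, the binary simplification and the BMI regression target, can be sketched in a few lines. This is only an illustration under stated assumptions: treating the three Obesity_Type_* labels as "obese" is one possible split rather than one the dataset defines, BMI is taken as the standard weight in kilograms divided by height in meters squared, and the sample rows are invented for the example rather than drawn from the data.

```python
# Assumption: "obese" means any of the three Obesity_Type_* labels.
OBESE_LABELS = {"Obesity_Type_I", "Obesity_Type_II", "Obesity_Type_III"}


def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight (kg) divided by height (m) squared."""
    return weight_kg / height_m ** 2


def is_obese(label: str) -> bool:
    """Collapse the seven NObeyesdad categories into a binary target."""
    return label in OBESE_LABELS


# Hypothetical rows using the dataset's column names; values are made up.
sample_rows = [
    {"Height": 1.62, "Weight": 52.0, "NObeyesdad": "Normal_Weight"},
    {"Height": 1.75, "Weight": 110.0, "NObeyesdad": "Obesity_Type_II"},
]

for row in sample_rows:
    value = bmi(row["Weight"], row["Height"])
    print(f"BMI={value:.1f} obese={is_obese(row['NObeyesdad'])}")
```

Note that a BMI computed this way is a direct function of the Height and Weight columns, so a model predicting BMI from lifestyle features would need to exclude those two columns to be meaningful.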

Research Applications

This dataset has been extensively used in machine learning research, with state-of-the-art models achieving accuracy rates exceeding 97% when including BMI-related features (height and weigh...
