https://creativecommons.org/publicdomain/zero/1.0/
This dataset contains 100,000 patient records designed for diabetes risk prediction, analysis, and machine learning applications. The dataset is clean, preprocessed, and ready for use in classification, regression, feature engineering, statistical analysis, and data visualization.
File: diabetes_dataset.csv
The dataset includes patient profiles with features based on demographics, lifestyle habits, family history, and clinical measurements that are well-established indicators of diabetes risk. All data is generated using statistical distributions inspired by real-world medical research, ensuring privacy preservation while reflecting realistic health patterns.
| Column | Type | Description | Values/Range |
|---|---|---|---|
| patient_id | Integer | Unique patient identifier | 1–100000 |
| age | Integer | Age of patient in years | 18–90 |
| gender | String | Patient gender | 'Male', 'Female', 'Other' |
| ethnicity | String | Ethnic background | 'White', 'Hispanic', 'Black', 'Asian', 'Other' |
| education_level | String | Highest completed education | 'No formal', 'Highschool', 'Graduate', 'Postgraduate' |
| income_level | String | Income category | 'Low', 'Medium', 'High' |
| employment_status | String | Employment type | 'Employed', 'Unemployed', 'Retired', 'Student' |
| smoking_status | String | Smoking behavior | 'Never', 'Former', 'Current' |
| alcohol_consumption_per_week | Float | Drinks consumed per week | 0–30 |
| physical_activity_minutes_per_week | Integer | Physical activity (weekly minutes) | 0–600 |
| diet_score | Integer | Diet quality (higher = healthier) | 0–10 |
| sleep_hours_per_day | Float | Average daily sleep hours | 3–12 |
| screen_time_hours_per_day | Float | Average daily screen time hours | 0–12 |
| family_history_diabetes | Integer | Family history of diabetes | 0 = No, 1 = Yes |
| hypertension_history | Integer | Hypertension history | 0 = No, 1 = Yes |
| cardiovascular_history | Integer | Cardiovascular history | 0 = No, 1 = Yes |
| bmi | Float | Body Mass Index (kg/m²) | 15–45 |
| waist_to_hip_ratio | Float | Waist-to-hip ratio | 0.7–1.2 |
| systolic_bp | Integer | Systolic blood pressure (mmHg) | 90–180 |
| diastolic_bp | Integer | Diastolic blood pressure (mmHg) | 60–120 |
| heart_rate | Integer | Resting heart rate (bpm) | 50–120 |
| cholesterol_total | Float | Total cholesterol (mg/dL) | 120–300 |
| hdl_cholesterol | Float | HDL cholesterol (mg/dL) | 20–100 |
| ldl_cholesterol | Float | LDL cholesterol (mg/dL) | 50–200 |
| triglycerides | Float | Triglycerides (mg/dL) | 50–500 |
| glucose_fasting | Float | Fasting glucose (mg/dL) | 70–250 |
| glucose_postprandial | Float | Post-meal glucose (mg/dL) | 90–350 |
| insulin_level | Float | Blood insulin level (µU/mL) | 2–50 |
| hba1c | Float | HbA1c (%) | 4–14 |
| diabetes_risk_score | Integer | Risk score (calculated, 0–100) | 0–100 |
| diabetes_stage | String | Stage of diabetes | 'No Diabetes', 'Pre-Diabetes', 'Type 1', 'Type 2', 'Gestational' |
| diagnosed_diabetes | Integer | Target: Diabetes diagnosis | 0 = No, 1 = Yes |
Possible target variables: diagnosed_diabetes (Yes/No); diabetes_stage; glucose_fasting, hba1c, or diabetes_risk_score.
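Before modeling, it can help to confirm that a file actually respects the ranges documented in the column table. The following is a minimal sketch using only the Python standard library; it checks a subset of columns, and the inline `SAMPLE_CSV` rows are made-up values standing in for diabetes_dataset.csv, not real records from the dataset. The helper name `find_violations` is hypothetical.

```python
import csv
import io

# Documented ranges copied from the column table (a subset only).
NUMERIC_RANGES = {
    "age": (18, 90),
    "bmi": (15.0, 45.0),
    "glucose_fasting": (70.0, 250.0),
    "hba1c": (4.0, 14.0),
    "diabetes_risk_score": (0, 100),
}
BINARY_COLUMNS = ("family_history_diabetes", "diagnosed_diabetes")


def find_violations(rows):
    """Return (patient_id, column, value) for every value outside its documented range."""
    violations = []
    for row in rows:
        for column, (low, high) in NUMERIC_RANGES.items():
            value = float(row[column])
            if not low <= value <= high:
                violations.append((row["patient_id"], column, value))
        for column in BINARY_COLUMNS:
            if row[column] not in ("0", "1"):
                violations.append((row["patient_id"], column, row[column]))
    return violations


# Made-up sample rows (assumption): the second row has an age below the documented minimum.
SAMPLE_CSV = """patient_id,age,bmi,glucose_fasting,hba1c,diabetes_risk_score,family_history_diabetes,diagnosed_diabetes
1,45,27.3,98.0,5.4,22,0,0
2,17,31.0,140.0,7.1,68,1,1
"""

sample_rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
sample_violations = find_violations(sample_rows)
print(sample_violations)
```

To run the same check on the real file, the `io.StringIO(SAMPLE_CSV)` source could be swapped for an opened file handle, assuming the CSV header matches the column names in the table.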
These indicators are presented by Public Health – Seattle & King County, in conjunction with the King County Hospitals for a Healthier Community (HHC). The data offer a comprehensive overview of demographics, health, and health behaviors among King County residents. Users can search by key word or topic area to filter the table of contents displayed below. After clicking on an indicator, a summary tab will open and users can click on additional tabs to explore data analyzed by demographic characteristics, see how rates have changed over time, and view data for cities/neighborhoods. Most indicators are interactive and users can hover over maps or charts to find more information. The data presented on this website may be reproduced without permission. Please use the following citation when reproducing: "Retrieved (date) from Public Health – Seattle & King County, Community Health Indicators. www.kingcounty.gov/chi"
By Health Data New York [source]
The New York State Community Health Indicator Reports (CHIRS) provide a resource for analyzing the health of all communities in the state. This dataset contains more than 300 indicators across 15 health topics, organized by region and county. The indicators include event counts, percentages/rates, confidence intervals, measure units, quartiles, and more. Whether you are a researcher or a policymaker interested in public health issues in the state, this dataset can inform your decisions and support visualizations built on its wealth of data points. Use it to explore factors that could be affecting public health outcomes and to discover insights about public health trends in the Empire State.
This dataset contains data on more than 300 health indicators for all 62 New York State counties, 11 regions (including New York City), the State excluding New York City, and New York State. It can be used to analyze different trends in population health from a local and state-level perspective. Here is a guide on how to use this dataset:
- Familiarize yourself with the data columns: Have an understanding of what each column represents in order to have a better grasp of what type of analyses you will be able to do with this dataset. Additionally, look into other potential features that may not be included within this dataset but could help you with your research or analysis.
- Clean and prepare the data: Make sure that the data is up-to-date and free of errors by cleaning it up prior to conducting any analysis or research project. Some cleaning steps may include inspecting for accuracy, addressing missing values/outliers, formatting irregularities etc.
- Generate questions related to public health issues: Brainstorm public health topics or possible implications that interest you, then use those questions as stepping stones for further research or analysis of this dataset.
- Visualize key information through plots and charts: Create charts and graphs that present the data in an understandable way, for example by showing correlations between factors or frequency distributions.
- Develop conclusions from your exploratory findings: Use your calculations and your reading of the charts to draw conclusions that answer the questions you raised.
- Comparing health indicators across different New York State counties and regions: This dataset can be used to compare health indicators at the county and region levels, helping identify strengths or weaknesses in an area's public health conditions.
- Examining changes over time: By analyzing data from multiple years, this dataset can be used to understand patterns in changes of public health outcomes throughout NY state regions since 2012.
- Generating targeted public health initiatives and interventions: Understanding the geographical distribution of positive or negative public health outcomes could help generate policy interventions that are tailored to local needs more effectively than a one-size-fits-all approach.
If you use this dataset in your research, please credit the original authors.
See the dataset description for more information.
File: community-health-indicator-reports-chirs-latest-data-1.csv
| Column name | Description |
|:---|:---|
| County Name | Name of the county in New York State. (String) |
| Health Topic Number | Number assigned to each hea...
https://cubig.ai/store/terms-of-service
1) Data Introduction
• The Diabetes Health Indicators Dataset is a large health dataset that collects various health indicators and lifestyle information related to diabetes diagnosis, based on health surveys and medical records of the U.S. population.
2) Data Utilization
(1) Characteristics of the Diabetes Health Indicators Dataset:
• The dataset consists of more than 250,000 samples and contains more than 20 health and demographic variables, including diabetes (binary or three-class label), age, gender, BMI, blood pressure, cholesterol, smoking and drinking habits, physical activity, mental health, income, and education level.
(2) The Diabetes Health Indicators Dataset can be used for:
• Diabetes prediction model development: It can be used to develop machine learning-based classification models that use health indicators and lifestyle data to predict the risk of developing diabetes.
• Studying the correlation between lifestyle and diabetes: It can be used in epidemiological and public health studies to analyze the effects of lifestyle and demographic variables such as smoking, drinking, exercise, and eating habits on diabetes incidence.
This dataset provides counts and percentages of diagnoses broken down by each patient’s Healthy Places Index percentile ranking (based on ZIP code of residence). Healthcare encounters are categorized into four diagnosis groups: mental health disorders, substance use disorders, co-occurring disorders, and all other diagnoses. To view and interact with a fully functioning version of the HPI map and data used in these HCAI analyses of behavioral health, please click the link to visit https://map.healthyplacesindex.org/.
The Service Delivery Indicators (SDI) are a set of health and education indicators that examine the effort and ability of staff and the availability of key inputs and resources that contribute to a functioning school or health facility. The indicators are standardized, allowing comparison between and within countries over time.
The Health SDIs include healthcare provider effort, knowledge and ability, and the availability of key inputs (for example, basic equipment, medicines and infrastructure, such as toilets and electricity). The indicators provide a snapshot of the health facility and assess the availability of key resources for providing high quality care.
The Uganda SDI Health survey team visited a sample of 394 health facilities across Uganda between June and October 2013. The survey team collected rosters covering 2,347 workers for absenteeism and assessed 733 health workers for competence using patient case simulations.
National
Health facilities and healthcare providers
All health facilities providing primary-level care.
Sample survey data [ssd]
The sampling strategy for SDI surveys is designed towards attaining indicators that are accurate and representative at the national level, as this allows for proper cross-country (i.e. international benchmarking) and across time comparisons, when applicable. In addition, other levels of representativeness are sought to allow for further disaggregation (rural/urban areas, public/private facilities, subregions, etc.) during the analysis stage.
The sampling strategy for SDI surveys follows a multistage sampling approach. The main units of analysis are facilities (schools and health centers) and providers (health and education workers: teachers, doctors, nurses, facility managers, etc.). The multi-stage sampling approach makes sampling procedures more practical by dividing the selection of large populations of sampling units in a step-by-step fashion. After defining the sampling frame and categorizing it by stratum, a first stage selection of sampling units is carried out independently within each stratum. Often, the primary sampling units (PSU) for this stage are cluster locations (e.g. districts, communities, counties, neighborhoods, etc.) which are randomly drawn within each stratum with a probability proportional to the size (PPS) of the cluster (measured by the location’s number of facilities, providers or pupils). Once locations are selected, a second stage takes place by randomly selecting facilities within location (either with equal probability or with PPS) as secondary sampling units. At a third stage, a fixed number of health and education workers and pupils are randomly selected within facilities to provide information for the different questionnaire modules.
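The first two stages described above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the SDI team's actual procedure: it uses systematic PPS selection for the first stage (one common way to draw with probability proportional to size), equal-probability selection for the second stage, and a made-up sampling frame for a single stratum. The function names `pps_systematic` and `two_stage_sample` are hypothetical, and the third stage (selecting workers within facilities) is omitted.

```python
import random


def pps_systematic(sizes, n_clusters, rng):
    """Systematic PPS draw: larger clusters are proportionally more likely to be hit."""
    total = sum(sizes.values())
    step = total / n_clusters          # sampling interval on the cumulative size scale
    start = rng.random() * step        # random start within the first interval
    targets = [start + i * step for i in range(n_clusters)]
    chosen, cumulative, next_target = [], 0.0, 0
    for name, size in sizes.items():
        cumulative += size
        # A cluster at least as large as the interval could be hit more than once.
        while next_target < n_clusters and targets[next_target] < cumulative:
            chosen.append(name)
            next_target += 1
    return chosen


def two_stage_sample(facilities_by_district, n_districts, n_facilities, rng):
    """Stage 1: PPS over districts. Stage 2: equal-probability facilities per district."""
    sizes = {d: len(f) for d, f in facilities_by_district.items()}
    sample = {}
    for district in pps_systematic(sizes, n_districts, rng):
        pool = facilities_by_district[district]
        sample[district] = rng.sample(pool, min(n_facilities, len(pool)))
    return sample


# Hypothetical sampling frame for one stratum: district -> facility IDs.
frame = {
    "District A": [f"A{i}" for i in range(12)],
    "District B": [f"B{i}" for i in range(5)],
    "District C": [f"C{i}" for i in range(8)],
    "District D": [f"D{i}" for i in range(15)],
}
rng = random.Random(2013)
selected = two_stage_sample(frame, n_districts=2, n_facilities=3, rng=rng)
print(selected)
```

In this frame no district is as large as the sampling interval (40 facilities / 2 districts = 20), so the two draws always land in different districts. In a real design each stratum would be sampled independently, and the selection probabilities would feed into the survey weights.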
Detailed information about the specific sampling process is available in the associated SDI Country Report included as part of the documentation that accompany these datasets.
Face-to-face [f2f]
The SDI Health Survey Questionnaire consists of four modules and weights:
Module 1: General Information - Administered to the health facility manager to collect information on equipment, medicines, infrastructure and other facets of the health facility.
Module 2: Provider Absence - A roster of healthcare providers is collected and absence measured.
Module 3: Clinical Vignettes – A selection of providers are given clinical vignettes to measure knowledge of common medical conditions.
Module 4: Facility finances – Information on facility revenue and expenditures is collected from the health facility manager.
Weights: Weights for facilities, absentee-related analyses and clinical vignette analyses.
Quality control was performed in Stata.
By Humanitarian Data Exchange [source]
This dataset contains a range of indicators related to health, health systems, and sustainable development from the World Health Organization's data portal. It covers topics ranging from mortality and global health estimates to essential health technologies, youth engagement, mental health initiatives, and infectious diseases. With data points including publish state codes and display values, this dataset provides detailed insight into how healthcare is managed around the globe. From tracking malaria outbreaks to exploring international agreements on public healthcare initiatives, it offers a wide array of information for machine learning projects designed to improve our understanding of global healthcare trends. Explore the correlations between different countries' universal healthcare coverage measures, or investigate discrepancies between developed and developing nations.
Getting Started: First, download the dataset from Kaggle. Once it is saved on your computer, open it with spreadsheet software such as Excel or Google Sheets.
Exploring the Data: The dataset contains columns with information about health indicators in Malaysia, including mortality rates, prevention programs and providers, financing information, human resource information, and more. To explore particular aspects of the data, filter the rows using any of these column values. For example, if you want results for a specific year or region, filter by ‘year’ or ‘region’ accordingly. Note that some columns are related to one another (e.g., country code corresponds to country display name).
Data Outputs: This dataset can be used to generate visual representations, such as graphs, that display trends over time in variables such as human resources, funding rates, or pregnancy outcomes. Because the data come from the WHO's global reporting for member countries like Malaysia, these outputs can highlight areas where risk mitigation efforts are still lacking and where governance and healthcare practices could improve quality of life.
- Analysis of health coverage and services in Malaysia, allowing comparison between different public health organizations and the effect of specific prevention programs.
- Identification of gaps in existing healthcare access, providing a standardized, data-driven reference point to ensure equitable access across different regions in the country.
- Creation of interactive geographical dashboards that display comparisons among relevant indicators, providing a visual representation of how best to target the distribution of resources for optimal impact.
If you use this dataset in your research, please credit the original authors.
See the dataset description for more information.
File: rsud-service-organization-and-delivery-prevention-programs-and-providers-indicators-for-malaysia-38.csv
| Column name | Description |
|:---|:---|
| GHO (CODE) | The Global Health Observatory code for the indicator. (String) |
| GHO (DISPLAY) | The name of the indicator. (String) |
| GHO (URL) | The URL for the indicator. (URL) |
| PUBLISHSTATE (CODE) | The code for the publishing state of the indicator. (String) |
| PUBLISHSTATE (DISPLAY) | The name of the publishing state of the indicator. (String) |
| PUBLISHSTATE (URL) | The URL for the publishing state of the indicator. (URL) |
| YEAR (CODE) | The code for...
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Contains data from World Health Organization's data portal covering the following categories: Mortality and global health estimates, Sustainable development goals, Millennium Development Goals (MDGs), Health systems, Malaria, Tuberculosis, Child health, Infectious diseases, Neglected Tropical Diseases, World Health Statistics, Health financing, Tobacco, Substance use and mental health, Injuries and violence, HIV/AIDS and other STIs, Public health and environment, Nutrition, Urban health, Child mortality, Noncommunicable diseases, Noncommunicable diseases CCS, Negelected tropical diseases, Infrastructure, Essential health technologies, Medical equipment, Demographic and socioeconomic statistics, Health inequality monitor, Health Equity Monitor, Child malnutrition, TOBACCO, Neglected tropical diseases, International Health Regulations (2005) monitoring framework, 0, Insecticide resistance, Oral health, Universal Health Coverage, Global Observatory for eHealth (GOe), RSUD: GOVERNANCE, POLICY AND FINANCING : PREVENTION, RSUD: GOVERNANCE, POLICY AND FINANCING: TREATMENT, RSUD: GOVERNANCE, POLICY AND FINANCING: FINANCING, RSUD: SERVICE ORGANIZATION AND DELIVERY: TREATMENT SECTORS AND PROVIDERS, RSUD: SERVICE ORGANIZATION AND DELIVERY: TREATMENT CAPACITY AND TREATMENT COVERAGE, RSUD: SERVICE ORGANIZATION AND DELIVERY: PHARMACOLOGICAL TREATMENT, RSUD: SERVICE ORGANIZATION AND DELIVERY: SCREENING AND BRIEF INTERVENTIONS, RSUD: SERVICE ORGANIZATION AND DELIVERY: PREVENTION PROGRAMS AND PROVIDERS, RSUD: SERVICE ORGANIZATION AND DELIVERY: SPECIAL PROGRAMMES AND SERVICES, RSUD: HUMAN RESOURCES, RSUD: INFORMATION SYSTEMS, RSUD: YOUTH, FINANCIAL PROTECTION, AMR GLASS, Noncommunicable diseases and mental health, Health workforce, AMR GASP, ICD, SEXUAL AND REPRODUCTIVE HEALTH, Immunization, NLIS, AMC GLASS. For links to individual indicator metadata, see resource descriptions.
Open Government Licence 3.0 http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Health Index scores at national, regional, and upper- and lower-tier local authority level for England, including indicator details to construct the Index.
The Case Mix Index (CMI) is the average relative DRG weight of a hospital’s inpatient discharges, calculated by summing the Medicare Severity-Diagnosis Related Group (MS-DRG) weight for each discharge and dividing the total by the number of discharges. The CMI reflects the diversity, clinical complexity, and resource needs of all the patients in the hospital. A higher CMI indicates a more complex and resource-intensive case load. Although the MS-DRG weights, provided by the Centers for Medicare & Medicaid Services (CMS), were designed for the Medicare population, they are applied here to all discharges regardless of payer. Note: It is not meaningful to add the CMI values together.
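The calculation described above is a simple average, which a short Python sketch can make concrete. The weights below are made-up illustrative numbers, not real CMS MS-DRG weights, and the function name `case_mix_index` is hypothetical.

```python
def case_mix_index(drg_weights):
    """Average MS-DRG relative weight across a hospital's inpatient discharges."""
    if not drg_weights:
        raise ValueError("CMI is undefined for a hospital with no discharges")
    return sum(drg_weights) / len(drg_weights)


# Hypothetical MS-DRG weights, one entry per discharge (not real CMS values).
hospital_a = [0.75, 1.25, 2.0, 1.0]
hospital_b = [1.5, 3.0]

cmi_a = case_mix_index(hospital_a)
cmi_b = case_mix_index(hospital_b)

# Adding cmi_a and cmi_b has no meaning; a combined CMI must be recomputed
# from the pooled discharges, which weights each hospital by its volume.
pooled = case_mix_index(hospital_a + hospital_b)
print(cmi_a, cmi_b, pooled)
```

The pooled value sits between the two hospital values and closer to the hospital with more discharges, which illustrates why CMI values should be recomputed from discharge-level weights rather than summed or naively averaged.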
The California Healthy Places Index 3.0 data file was acquired on 04/25/22 from the Public Health Institute on behalf of the Public Health Alliance of Southern California.
According to the Public Health Institute, "The HPI tool evaluates the relationship between 23 identified key drivers of health and life expectancy at birth -- which can vary dramatically by neighborhood. Based on that analysis, it produces a score ranking from 1 to 99 that shows the relative impact of conditions in a selected area compared to all other such places in the state." The HPI score is divided across four quartiles. (The Enhanced HPI 3.0: Advancing Health Equity Through High-Quality Data)
Potential indicators assigned to eight policy action areas (domains):
- Economics
- Education
- Healthcare access
- Housing
- Neighborhood Conditions
- Clean Environment
- Social Environment
- Transportation
An HPI score, domains, and individual indicator values and their percentile rankings are presented in the table.
For more information, visit the California Healthy Places Index website at https://www.healthyplacesindex.org/
Process
Converted the XLSX file received from the Public Health Institute to a file geodatabase table. Filtered the statewide data to Los Angeles County only. The filtered dataset retains the original default HPI score rank, which is based on conditions across statewide census tracts. Edited field alias names for readability. Joined table to CENSUS_TRACTS_2010 from the Los Angeles County eGIS Data Repository. Exported to new file geodatabase feature class.
By City of Chicago [source]
This public health dataset contains a comprehensive selection of indicators related to natality, mortality, infectious disease, lead poisoning, and economic status from Chicago community areas. It is an invaluable resource for those interested in understanding the current state of public health within each area in order to identify any deficiencies or areas of improvement needed.
The data includes 27 indicators such as birth and death rates, percentages of prenatal care beginning in the first trimester, preterm birth rates, breast cancer incidence per hundred thousand female population, all-sites cancer rates per hundred thousand population, and more. For each indicator, it details the geographical region so that trends can be analyzed at a local level. Furthermore, this dataset allows stakeholders to measure performance along these indicators or compare different community areas side by side.
This dataset provides a valuable tool for those striving toward better public health outcomes for the citizens of Chicago's communities. It gives insight into trends specific to geographic regions, which could lead to further research and to practices grounded in the empirical evidence gathered from this selection of indicators.
In order to use this dataset effectively to assess the public health of a given area or areas in the city:
- Understand which data is available: The list of data included in this dataset can be found above. It is important to know everything that is included, as well as the definitions, so that accurate conclusions can be drawn when using the data for research or analysis.
- Identify areas of interest: Once you are familiar with what type of data is present, identify which community areas you would like to study more closely or compare with one another.
- Choose your variables: Once you have identified your areas, decide which variables are most relevant for your study, and formulate specific research questions about these variables based on what you are trying to learn from this dataset.
- Analyze the data: Once your variables have been selected and clarified, move on to analyzing the corresponding values across different community areas using statistical tests such as t-tests or correlations. This will help answer questions like “Are there significant differences between two sets of values?”, allowing you to compare how different Chicago community areas stack up against each other on the public health statistics tracked by this dataset.
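As an illustration of the t-test step above, the following sketch computes Welch's t statistic and its approximate degrees of freedom with the Python standard library. It is a minimal sketch under stated assumptions: the two lists are made-up birth rates for two hypothetical groups of community areas, not values from the dataset, and the function name `welch_t` is hypothetical. The standard library does not provide a t distribution, so turning the statistic into a p-value would need a statistics package or a lookup table.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    # statistics.variance is the sample variance (divides by n - 1).
    se_a = statistics.variance(sample_a) / len(sample_a)
    se_b = statistics.variance(sample_b) / len(sample_b)
    t_stat = (mean_a - mean_b) / math.sqrt(se_a + se_b)
    dof = (se_a + se_b) ** 2 / (
        se_a ** 2 / (len(sample_a) - 1) + se_b ** 2 / (len(sample_b) - 1)
    )
    return t_stat, dof


# Made-up birth rates per 1,000 population for two groups of community areas.
group_one = [14.0, 16.0, 18.0, 16.0]
group_two = [10.0, 12.0, 14.0, 12.0]

t_stat, dof = welch_t(group_one, group_two)
print(f"t = {t_stat:.3f}, degrees of freedom = {dof:.1f}")
```

Welch's variant is used here because it does not assume the two groups have equal variances, which is a cautious default when comparing groups of community areas of different sizes.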
- Creating interactive maps that show data on public health indicators by Chicago community area to allow users to explore the data more easily.
- Designing a machine learning model to predict future variations in public health indicators by Chicago community area such as birth rate, preterm births, and childhood lead poisoning levels.
- Developing an app that enables users to search for public health information in their own community areas and compare it with other areas within the city or across different cities in the US.
If you use this dataset in your research, please credit the original authors.
See the dataset description for more information.
File: public-health-statistics-selected-public-health-indicators-by-chicago-community-area-1.csv
| Column name | Description |
|:---|:---|
| Community Area | Unique identifier for each community area in Chicago. (Integer) |
| Community Area Name | Name of the community area in Chicago. (String) |
| Birth Rate | Number of live births per 1,000 population. (Float) |
| General Fertility Rate | Number of live births per 1,000 women aged 15-44. (Float) |
...
The Service Delivery Indicators (SDI) are a set of health and education indicators that examine the effort and ability of staff and the availability of key inputs and resources that contribute to a functioning school or health facility. The indicators are standardized, allowing comparison between and within countries over time.
The Health SDIs include healthcare provider effort, knowledge and ability, and the availability of key inputs (for example, basic equipment, medicines and infrastructure, such as toilets and electricity). The indicators provide a snapshot of the health facility and assess the availability of key resources for providing high quality care.
The Mozambique SDI Health survey team visited a sample of 195 health facilities across Mozambique between April and June 2014. The survey team collected rosters covering 2,972 workers for absenteeism and assessed 694 health workers for competence using patient case simulations.
National
Health facilities and healthcare providers
All health facilities providing primary-level care
Sample survey data [ssd]
The sampling strategy for SDI surveys is designed towards attaining indicators that are accurate and representative at the national level, as this allows for proper cross-country (i.e. international benchmarking) and across time comparisons, when applicable. In addition, other levels of representativeness are sought to allow for further disaggregation (rural/urban areas, public/private facilities, subregions, etc.) during the analysis stage.
The sampling strategy for SDI surveys follows a multistage sampling approach. The main units of analysis are facilities (schools and health centers) and providers (health and education workers: teachers, doctors, nurses, facility managers, etc.). The multi-stage sampling approach makes sampling procedures more practical by dividing the selection of large populations of sampling units in a step-by-step fashion. After defining the sampling frame and categorizing it by stratum, a first stage selection of sampling units is carried out independently within each stratum. Often, the primary sampling units (PSU) for this stage are cluster locations (e.g. districts, communities, counties, neighborhoods, etc.) which are randomly drawn within each stratum with a probability proportional to the size (PPS) of the cluster (measured by the location’s number of facilities, providers or pupils). Once locations are selected, a second stage takes place by randomly selecting facilities within location (either with equal probability or with PPS) as secondary sampling units. At a third stage, a fixed number of health and education workers and pupils are randomly selected within facilities to provide information for the different questionnaire modules.
Detailed information about the specific sampling process is available in the associated SDI Country Report included as part of the documentation that accompany these datasets.
Face-to-face [f2f]
The SDI Health Survey Questionnaire consists of four modules and weights:
Module 1: General Information - Administered to the health facility manager to collect information on equipment, medicines, infrastructure and other facets of the health facility.
Module 2: Provider Absence - A roster of healthcare providers is collected and absence measured.
Module 3: Clinical Vignettes – A selection of providers are given clinical vignettes to measure knowledge of common medical conditions.
Module 4: Facility finances – Information on facility revenue and expenditures is collected from the health facility manager.
Weights: Weights for facilities, absentee-related analyses and clinical vignette analyses.
Quality control was performed in Stata.
The Community Health and Equity Index was developed by Raimi + Associates to compare health conditions, vulnerabilities, and cumulative burdens across the City of Los Angeles. The Index standardizes demographic, socio-economic, health conditions, land use, transportation, food environment, crime, and pollution burden variables, and then averages them together, yielding a score on a scale of 0-100. Lower values indicate better community health.
Variables used in the index include: Hardship Index, Life Expectancy, Health Variables (Heart Disease Mortality, Emergency Department Visits for Heart Attacks, Respiratory Disease Mortality, Diabetes Mortality, Stroke Mortality, Childhood Obesity, Percentage of Low Birth Weight Infants, Number of Emergency Department Visits for Asthma for Under 17 and 18+ age groups), Walkability Index, Complete Communities Index (amenities and establishments serving the community), Transportation Index, Modified Retail Food Environment Index, Crime Rate (Violent Crimes, Property Crimes), and Pollution Burden (Pollution Exposure, Environmental Effects).
Variables were assigned weights and averaged together. Weights were assigned based on the weights used in the 2013 Health Atlas. For more information, see page 181 of the 2013 Health Atlas, which is available as a PDF on the Los Angeles City Planning website, https://planning.lacity.gov.
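The general shape of a standardize-then-average index can be sketched as follows. This is only an illustration of the idea, not the method used by Raimi + Associates: the description does not state which standardization was applied, so min-max scaling to 0-100 is an assumption here, and the indicator values, weights, and function names (`min_max_scale`, `composite_index`) are all hypothetical. Each variable is oriented so that higher scaled values mean worse conditions, matching the statement that lower index values indicate better community health.

```python
def min_max_scale(values, higher_is_worse=True):
    """Rescale values to 0-100, oriented so that 100 is the worst observed value."""
    low, high = min(values), max(values)
    if high == low:
        return [50.0 for _ in values]  # no variation: place every area at the midpoint
    scaled = [100.0 * (v - low) / (high - low) for v in values]
    return scaled if higher_is_worse else [100.0 - s for s in scaled]


def composite_index(indicators, weights, higher_is_worse):
    """Weighted average of the scaled indicators, one score per area."""
    names = list(indicators)
    n_areas = len(indicators[names[0]])
    scaled = {
        name: min_max_scale(indicators[name], higher_is_worse[name]) for name in names
    }
    total_weight = sum(weights[name] for name in names)
    return [
        sum(weights[name] * scaled[name][i] for name in names) / total_weight
        for i in range(n_areas)
    ]


# Made-up values for three areas (assumption, not Health Atlas data).
indicators = {
    "diabetes_mortality": [10.0, 20.0, 30.0],   # higher is worse
    "life_expectancy": [85.0, 77.5, 75.0],      # higher is better
}
weights = {"diabetes_mortality": 1.0, "life_expectancy": 3.0}
higher_is_worse = {"diabetes_mortality": True, "life_expectancy": False}

scores = composite_index(indicators, weights, higher_is_worse)
print(scores)
```

With these made-up inputs the first area scores best (lowest) and the third scores worst (highest); the middle area's score is pulled toward its life-expectancy value because that variable carries three times the weight.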
By Health Data New York [source]
This dataset provides comprehensive measures to evaluate the quality of medical services provided to Medicaid beneficiaries by Health Homes, including the Centers for Medicare & Medicaid Services (CMS) Core Set and Health Home State Plan Amendment (SPA) measures. This gives insight into how well these health homes are performing in terms of delivering high-quality care. The data sources include the Medicaid Data Mart, QARR Member Level Files, and the New York State Delivery System Reform Incentive Payment (DSRIP) Program Data Warehouse. With this dataset you can explore essential indicators such as rates for indicators within the scope of Core Set Measures; sub domains, domains, and measure descriptions; age categories used; denominators of each measure; level of significance for each indicator; and more. By understanding more about Health Home Quality Measures from this resource, you can help make informed decisions about evidence-based health practices while also promoting better patient outcomes.
This dataset contains measures that evaluate the quality of care delivered by Health Homes for the Centers for Medicare & Medicaid Services (CMS). With this dataset, you can get an overview of how a health home is performing in terms of quality. You can use this data to compare different health homes and their respective service offerings.
The data used to create this dataset was collected from the Medicaid Data Mart, QARR Member Level Files, and the New York State Delivery System Reform Incentive Payment (DSRIP) Program Data Warehouse.
In order to use this dataset effectively, you should start by looking at the columns provided. These include: Measurement Year; Health Home Name; Domain; Sub Domain; Measure Description; Age Category; Denominator; Rate; Level of Significance; Indicator. Each column provides valuable insight into how a particular health home is performing in various measurements of healthcare quality.
When examining this data, keep in mind that each measure reflects many variables and that results may change over time because of factors such as population shifts or the financial resources available for healthcare delivery. Policy changes may also affect performance, so take these factors into account when evaluating a health home from one year to the next or when comparing health homes on a specific measure or set of indicators.
- Using this dataset, state governments can evaluate the effectiveness of their health home programs by comparing the performance across different domains and subdomains.
- Healthcare providers and organizations can use this data to identify areas for improvement in quality of care provided by health homes and strategies to reduce disparities between individuals receiving care from health homes.
- Researchers can use this dataset to analyze how variations in cultural context, geography, demographics, or other factors affect the delivery of quality health home services across different locations.
If you use this dataset in your research, please credit the original authors.
See the dataset description for more information.
File: health-home-quality-measures-beginning-2013-1.csv
| Column name | Description |
|---|---|
| Measurement Year | The year in which the data was collected. (Integer) |
| Health Home Name | The name of the health home. (String) |
| Domain | The domain of the measure. (String) |
| Sub Domain | The sub domain of the measure. (String) |
| Measure Description | A description of the measure. (String) |
| Age Category | The age category of the patient. (String) |
| Denominator | The denominator of the measure. (Integer) |
| Rate | The rate of the measure. (Float) |
| Level of Significance | The level of significance of the measure. (String) |
| Indicator | The indicator of the measure. (String) |
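As a rough illustration of comparing health homes on a single measure, the sketch below reads rows with the column names listed above and computes a denominator-weighted rate per health home. The sample rows, health home names, and the helper `weighted_rate_by_home` are invented for illustration and are not taken from the actual file; weighting by the denominator is also an assumption about how rates might reasonably be pooled.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows using the documented column names;
# all names and values below are invented for illustration.
SAMPLE_CSV = """Measurement Year,Health Home Name,Domain,Sub Domain,Measure Description,Age Category,Denominator,Rate,Level of Significance,Indicator
2013,Home A,Domain X,Sub X,Measure 1,All,200,50.0,,
2013,Home B,Domain X,Sub X,Measure 1,All,100,80.0,,
2014,Home A,Domain X,Sub X,Measure 1,All,300,60.0,,
"""


def weighted_rate_by_home(rows, measure):
    """Return the denominator-weighted mean Rate per health home for one measure."""
    numerator = defaultdict(float)
    denominator = defaultdict(float)
    for row in rows:
        if row["Measure Description"] != measure:
            continue
        try:
            d = float(row["Denominator"])
            r = float(row["Rate"])
        except ValueError:
            continue  # skip rows with blank or non-numeric cells
        numerator[row["Health Home Name"]] += d * r
        denominator[row["Health Home Name"]] += d
    return {
        home: numerator[home] / denominator[home]
        for home in denominator
        if denominator[home] > 0
    }


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
rates = weighted_rate_by_home(rows, "Measure 1")
```

This version pools all measurement years together. Since performance can shift from year to year, filtering on `Measurement Year` before pooling may be the safer comparison.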
...
This national report summarizes key findings from the 2016 National Survey on Drug Use and Health (NSDUH) for indicators of substance use and mental health among people aged 12 years old or older in the civilian, noninstitutionalized population of the United States. Estimates include tobacco use, alcohol use, illicit drug use, opioid use, substance use disorders, major depressive episode, any mental illness, serious mental illness, suicide, co-occurring disorders, and receipt of treatment or services.
An Environmental Quality Index (EQI) for all counties in the United States for the period 2000-2005 was developed, incorporating data from five environmental domains: air, water, land, built, and socio-demographic. The EQI was developed in four parts: domain identification; data source identification and review; variable construction; and data reduction using principal components analysis (PCA). The methods provide a reproducible approach that relies almost exclusively on publicly available data sources. The primary goal in creating the EQI is to use it as a composite environmental indicator for research on human health. A series of peer-reviewed manuscripts used the EQI to examine health outcomes.
This dataset is not publicly accessible because this series of papers is considered human health research and is not to be loaded onto ScienceHub. The EQI data can be accessed at: https://edg.epa.gov/data/Public/ORD/NHEERL/EQI. Format: EQI data, metadata, formats, and data dictionary are all available at the website.
This dataset is associated with the following publications:
- Gray, C., L. Messer, K. Rappazzo, J. Jagai, S. Grabich, and D. Lobdell. The association between physical inactivity and obesity is modified by five domains of environmental quality in U.S. adults: A cross-sectional study. PLoS ONE. Public Library of Science, San Francisco, CA, USA, 13(8): e0203301, (2018).
- Patel, A., J. Jagai, L. Messer, C. Gray, K. Rappazzo, S. Deflorio-Barker, and D. Lobdell. Associations between environmental quality and infant mortality in the United States, 2000-2005. Archives of Public Health. BioMed Central Ltd, London, UK, 76(60): 1, (2018).
- Gray, C., D. Lobdell, K. Rappazzo, Y. Jian, J. Jagai, L. Messer, A. Patel, S. Deflorio-Barker, C. Lyttle, J. Solway, and A. Rzhetsky. Associations between environmental quality and adult asthma prevalence in medical claims data. Environmental Research. Elsevier B.V., Amsterdam, Netherlands, 166: 529-536, (2018).
Public Domain Mark 1.0: https://creativecommons.org/publicdomain/mark/1.0/
License information was derived automatically
The global Ocean Health Index measures the state of the world’s oceans. The global OHI score for the 2024 assessment was 69, four points lower than last year’s score of 73. This was due to COVID-related declines in tourism and recreation (the 2024 scores reflect 2021 data). You can explore this and other goals using the interactive map, which shows how different countries and goals contribute to the global score, as well as how the score has changed since 2012. Click on colored regions (i.e. EEZs) to see short country summaries.
Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Score for each LSOA in the Health Deprivation and Disability domain. The English Indices of Deprivation provide a relative measure of deprivation at small area level across England. Areas are ranked from least deprived to most deprived on seven different dimensions of deprivation and an overall composite measure of multiple deprivation. Most of the data underlying the 2010 indices are for the year 2008. The indices have been constructed by the Social Disadvantage Research Centre at the University of Oxford for the Department for Communities and Local Government. All figures can only be reproduced if the source (Department for Communities and Local Government, Indices of Deprivation 2010) is fully acknowledged. The domains used in the Indices of Deprivation 2010 are: income deprivation; employment deprivation; health deprivation and disability; education deprivation; crime deprivation; barriers to housing and services deprivation; and living environment deprivation. Each of these domains has its own scores and ranks, allowing users to focus on specific aspects of deprivation. Because the indices give a relative measure, they can tell you if one area is more deprived than another but not by how much. For example, an area with a rank of 40 is not half as deprived as an area with a rank of 20. The Index of Multiple Deprivation was constructed by combining scores from the seven domains. When comparing areas, a higher deprivation score indicates a higher proportion of people living there who are classed as deprived. As with ranks, however, deprivation scores can only tell you if one area is more deprived than another, not by how much. This dataset was created from a spreadsheet provided by the Department for Communities and Local Government, which can be downloaded here. The method for calculating the IMD score and underlying indicators is detailed in the report 'The English Indices of Deprivation 2010: Technical Report'.
The data is represented here as Linked Data, using the Data Cube ontology.
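The point that ranks carry order only can be illustrated with a small sketch. The area names and scores are invented, the helper `rank_by_score` is hypothetical, and the convention that rank 1 marks the most deprived area is an assumption made for this example rather than something stated in the description.

```python
def rank_by_score(scores):
    """Map each area to its rank; rank 1 = highest score (assumed most deprived).

    Ties are broken by input order, which is a simplification.
    """
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {area: position for position, area in enumerate(ordered, start=1)}


# Invented scores for three hypothetical areas.
scores = {"Area A": 2.0, "Area B": 1.9, "Area C": 0.1}
ranks = rank_by_score(scores)
```

The ranks come out evenly spaced (1, 2, 3) no matter how the underlying scores are distributed, so a rank can say that one area is more deprived than another but cannot express a ratio such as "twice as deprived".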
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Small Business Sentiment in the United States increased to 51.67 in February from 49.03 in January of 2015. This dataset provides the latest reported value for the US Small Business Health Index, plus previous releases, historical high and low, short-term forecast and long-term prediction, economic calendar, survey consensus, and news.
https://creativecommons.org/publicdomain/zero/1.0/
This dataset contains 100,000 patient records designed for diabetes risk prediction, analysis, and machine learning applications. The dataset is clean, preprocessed, and ready for use in classification, regression, feature engineering, statistical analysis, and data visualization.
File: diabetes_dataset.csv
The dataset includes patient profiles with features based on demographics, lifestyle habits, family history, and clinical measurements that are well-established indicators of diabetes risk. All data is generated using statistical distributions inspired by real-world medical research, ensuring privacy preservation while reflecting realistic health patterns.
| Column | Type | Description | Values/Range |
|---|---|---|---|
| patient_id | Integer | Unique patient identifier | 1–100000 |
| age | Integer | Age of patient in years | 18–90 |
| gender | String | Patient gender | 'Male', 'Female', 'Other' |
| ethnicity | String | Ethnic background | 'White', 'Hispanic', 'Black', 'Asian', 'Other' |
| education_level | String | Highest completed education | 'No formal', 'Highschool', 'Graduate', 'Postgraduate' |
| income_level | String | Income category | 'Low', 'Medium', 'High' |
| employment_status | String | Employment type | 'Employed', 'Unemployed', 'Retired', 'Student' |
| smoking_status | String | Smoking behavior | 'Never', 'Former', 'Current' |
| alcohol_consumption_per_week | Float | Drinks consumed per week | 0–30 |
| physical_activity_minutes_per_week | Integer | Physical activity (weekly minutes) | 0–600 |
| diet_score | Integer | Diet quality (higher = healthier) | 0–10 |
| sleep_hours_per_day | Float | Average daily sleep hours | 3–12 |
| screen_time_hours_per_day | Float | Average daily screen time hours | 0–12 |
| family_history_diabetes | Integer | Family history of diabetes | 0 = No, 1 = Yes |
| hypertension_history | Integer | Hypertension history | 0 = No, 1 = Yes |
| cardiovascular_history | Integer | Cardiovascular history | 0 = No, 1 = Yes |
| bmi | Float | Body Mass Index (kg/m²) | 15–45 |
| waist_to_hip_ratio | Float | Waist-to-hip ratio | 0.7–1.2 |
| systolic_bp | Integer | Systolic blood pressure (mmHg) | 90–180 |
| diastolic_bp | Integer | Diastolic blood pressure (mmHg) | 60–120 |
| heart_rate | Integer | Resting heart rate (bpm) | 50–120 |
| cholesterol_total | Float | Total cholesterol (mg/dL) | 120–300 |
| hdl_cholesterol | Float | HDL cholesterol (mg/dL) | 20–100 |
| ldl_cholesterol | Float | LDL cholesterol (mg/dL) | 50–200 |
| triglycerides | Float | Triglycerides (mg/dL) | 50–500 |
| glucose_fasting | Float | Fasting glucose (mg/dL) | 70–250 |
| glucose_postprandial | Float | Post-meal glucose (mg/dL) | 90–350 |
| insulin_level | Float | Blood insulin level (µU/mL) | 2–50 |
| hba1c | Float | HbA1c (%) | 4–14 |
| diabetes_risk_score | Integer | Risk score (calculated, 0–100) | 0–100 |
| diabetes_stage | String | Stage of diabetes | 'No Diabetes', 'Pre-Diabetes', 'Type 1', 'Type 2', 'Gestational' |
| diagnosed_diabetes | Integer | Target: Diabetes diagnosis | 0 = No, 1 = Yes |
- diagnosed_diabetes (Yes/No)
- diabetes_stage
- glucose_fasting, hba1c, or diabetes_risk_score
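A minimal sketch of separating features from a chosen target column is shown below. The task labels (`binary`, `multiclass`, `regression`), the helper `split_features_target`, and the decision to drop the identifier and other outcome-derived columns are assumptions made for illustration; the sample rows are invented and use only a few of the documented columns.

```python
import csv
import io

# Target column per task; the task labels are assumptions, the column
# names come from the column table above.
TARGETS = {
    "binary": "diagnosed_diabetes",
    "multiclass": "diabetes_stage",
    "regression": "hba1c",
}
# Columns assumed to encode the outcome directly; excluding them from the
# features is a precaution against leakage, not a documented requirement.
OUTCOME_COLUMNS = {"diagnosed_diabetes", "diabetes_stage", "diabetes_risk_score"}
ID_COLUMNS = {"patient_id"}


def split_features_target(rows, task):
    """Return (features, target) for the given task.

    Values stay as strings, exactly as read from the CSV.
    """
    target = TARGETS[task]
    excluded = OUTCOME_COLUMNS | ID_COLUMNS | {target}
    features = [{k: v for k, v in row.items() if k not in excluded} for row in rows]
    labels = [row[target] for row in rows]
    return features, labels


# Invented rows for illustration only.
SAMPLE_CSV = """patient_id,age,bmi,hba1c,diabetes_risk_score,diabetes_stage,diagnosed_diabetes
1,54,31.2,7.1,62,Type 2,1
2,29,22.4,5.2,18,No Diabetes,0
"""
rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
X_bin, y_bin = split_features_target(rows, "binary")
X_reg, y_reg = split_features_target(rows, "regression")
```

Whether clinical measurements such as `hba1c` or `glucose_fasting` should remain as features for the classification tasks depends on the intended use, since they are closely tied to how diabetes is diagnosed.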