17 datasets found

Medical Service Study Areas
data.chhs.ca.gov
healthdata.gov
+5 more
Updated Dec 6, 2024
Cite
Department of Health Care Access and Information (2024). Medical Service Study Areas [Dataset]. https://data.chhs.ca.gov/dataset/medical-service-study-areas
Explore at:
csv, html, geojson, kml, zip, arcgis geoservices rest api (available download formats)
Dataset updated
Dec 6, 2024
Dataset authored and provided by
Department of Health Care Access and Information
Description
This is the current Medical Service Study Area. California Medical Service Study Areas are created by the California Department of Health Care Access and Information (HCAI).

Check the Data Dictionary for field descriptions.

Search for the Medical Service Study Area data on the CHHS Open Data Portal.

Check out the California Healthcare Atlas for more Medical Service Study Area information.
This is an update to the MSSA geometries and demographics to reflect the new 2020 Census tract data. The Medical Service Study Area (MSSA) polygon layer represents the best fit mapping of all new 2020 California census tract boundaries to the original 2010 census tract boundaries used in the construction of the original 2010 MSSA file. Each of the state's new 9,129 census tracts was assigned to one of the previously established medical service study areas (excluding tracts with no land area), as identified in this data layer. The MSSA Census tract data is aggregated by HCAI, to create this MSSA data layer. This represents the final re-mapping of 2020 Census tracts to the original 2010 MSSA geometries. The 2010 MSSA were based on U.S. Census 2010 data and public meetings held throughout California.

Source of update: American Community Survey 5-year 2006-2010 data for poverty. For source tables refer to InfoUSA update procedural documentation. The 2010 MSSA Detail layer was developed to update fields affected by population change. The American Community Survey 5-year 2006-2010 population data pertaining to total population, population in households, race, ethnicity, age, and poverty were used in the update. The 2010 MSSA Census Tract Detail map layer was developed to support geographic information systems (GIS) applications, representing the 2010 census tract geography that is the foundation of the 2010 medical service study area (MSSA) boundaries. This version contains the finalized MSSA reconfiguration boundaries based on the U.S. Census Bureau 2010 Census.

In 1976, the Garamendi Rural Health Services Act required the development of a geographic framework for determining which parts of the state were rural and which were urban, and for determining which parts of counties and cities had adequate health care resources and which were "medically underserved". Thus, sub-city and sub-county geographic units called "medical service study areas [MSSAs]" were developed, using combinations of census-defined geographic units, established following General Rules promulgated by a statutory commission. After each subsequent census the MSSAs were revised. In the scheduled revisions that followed the 1990 census, community meetings of stakeholders (including county officials and representatives of hospitals and community health centers) were held in larger metropolitan areas. The meetings were designed to develop consensus on how to draw the sub-city units so as to best display health care disparities.
The importance of involving stakeholders was heightened in 1992 when the United States Department of Health and Human Services' Health Resources and Services Administration entered a formal agreement to recognize the state-determined MSSAs as "rational service areas" for federal recognition of "health professional shortage areas" and "medically underserved areas".

After the 2000 census, two innovations transformed the process and set the stage for GIS to emerge as a major factor in health care resource planning in California. First, the Office of Statewide Health Planning and Development [OSHPD], which organizes the community stakeholder meetings and provides the staff to administer the MSSAs, entered into an Enterprise GIS contract. Second, OSHPD authorized at least one community meeting to be held in each of the 58 counties, a significant number of which were wholly rural or frontier counties. For populous Los Angeles County, 11 community meetings were held. As a result, health resource data in California are collected and organized by 541 geographic units. The boundaries of these units were established by community healthcare experts, with the objective of maximizing their usefulness for needs assessment purposes. The most dramatic consequence was introducing a data simultaneously displayed in a GIS format. A two-person team, incorporating healthcare policy and GIS expertise, conducted the series of meetings and supervised the development of the 2000-census configuration of the MSSAs.

MSSA Configuration Guidelines (General Rules):
- Each MSSA is composed of one or more complete census tracts.
- As a general rule, MSSAs are deemed to be "rational service areas [RSAs]" for purposes of designating health professional shortage areas [HPSAs], medically underserved areas [MUAs] or medically underserved populations [MUPs].
- MSSAs will not cross county lines.
- To the extent practicable, all census-defined places within the MSSA are within 30 minutes travel time to the largest population center within the MSSA, except in those circumstances where meeting this criterion would require splitting a census tract.
- To the extent practicable, areas that, standing alone, would meet both the definition of an MSSA and a Rural MSSA, should not be a part of an Urban MSSA.
- Any Urban MSSA whose population exceeds 200,000 shall be divided into two or more Urban MSSA Subdivisions.
- Urban MSSA Subdivisions should be within a population range of 75,000 to 125,000, but may not be smaller than five square miles in area. If removing any census tract on the perimeter of the Urban MSSA Subdivision would cause the area to fall below five square miles in area, then the population of the Urban MSSA may exceed 125,000.
- To the extent practicable, Urban MSSA Subdivisions should reflect recognized community and neighborhood boundaries and take into account such demographic information as income level and ethnicity.

Rural Definitions: A rural MSSA is an MSSA adopted by the Commission, which has a population density of less than 250 persons per square mile, and which has no census defined place within the area with a population in excess of 50,000. Only the population that is located within the MSSA is counted in determining the population of the census defined place. A frontier MSSA is a rural MSSA adopted by the Commission which has a population density of less than 11 persons per square mile. Any MSSA which is not a rural or frontier MSSA is an urban MSSA.
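The rural, frontier, and urban definitions above can be read as a small decision rule. The following is a minimal sketch of that rule, not HCAI's actual procedure: the function names and parameters (`classify_mssa`, `needs_subdivision`, `largest_place_population`) are hypothetical, and the caller is assumed to supply the population of the largest census-defined place counted only within the MSSA.

```python
def classify_mssa(population: int, land_area_sq_mi: float,
                  largest_place_population: int) -> str:
    """Return 'frontier', 'rural', or 'urban' using the thresholds quoted above.

    All names here are illustrative assumptions; only the numeric
    thresholds (250, 11, 50,000) come from the dataset description.
    """
    if land_area_sq_mi <= 0:
        raise ValueError("land area must be positive")

    density = population / land_area_sq_mi  # persons per square mile

    # Rural: density under 250 and no census-defined place above 50,000.
    is_rural = density < 250 and largest_place_population <= 50_000

    # Frontier: a rural MSSA whose density is under 11.
    if is_rural and density < 11:
        return "frontier"
    if is_rural:
        return "rural"
    return "urban"


def needs_subdivision(population: int, category: str) -> bool:
    """An Urban MSSA whose population exceeds 200,000 is to be subdivided."""
    return category == "urban" and population > 200_000
```

The sketch covers only the numeric thresholds; the "to the extent practicable" rules on travel time and neighborhood boundaries require judgment and are not modeled.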
Last updated December 6th 2024.
Medical Lake, WA Population Pyramid Dataset: Age Groups, Male and Female...
neilsberg.com
csv, json
Updated Sep 16, 2023
+ more versions
Cite
Neilsberg Research (2023). Medical Lake, WA Population Pyramid Dataset: Age Groups, Male and Female Population, and Total Population for Demographics Analysis [Dataset]. https://www.neilsberg.com/research/datasets/62e47581-3d85-11ee-9abe-0aa64bf2eeb2/
Explore at:
json, csv (available download formats)
Dataset updated
Sep 16, 2023
Dataset authored and provided by
Neilsberg Research
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Medical Lake, Washington
Variables measured
Male and Female Population Under 5 Years, Male and Female Population over 85 years, Male and Female Total Population for Age Groups, Male and Female Population Between 5 and 9 years, Male and Female Population Between 10 and 14 years, Male and Female Population Between 15 and 19 years, Male and Female Population Between 20 and 24 years, Male and Female Population Between 25 and 29 years, Male and Female Population Between 30 and 34 years, Male and Female Population Between 35 and 39 years, and 9 more
Measurement technique
The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates. To measure the three variables, namely (a) male population, (b) female population, and (c) total population, we first analyzed and categorized the data for each of the age groups. We divided ages between 0 and 85 into roughly 5-year buckets and aggregated all ages over 85 into a single group. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
Dataset funded by
Neilsberg Research
Description
About this dataset

Context

The dataset tabulates the data for the Medical Lake, WA population pyramid, which represents the Medical Lake population distribution across age and gender, using estimates from the U.S. Census Bureau American Community Survey 5-Year estimates. It lists the male and female population for each age group, along with the total population for those age groups. Higher numbers at the bottom of the table suggest population growth, whereas higher numbers at the top indicate declining birth rates. Furthermore, the dataset can be utilized to understand the youth dependency ratio, old-age dependency ratio, total dependency ratio, and potential support ratio.

Key observations

Youth dependency ratio, which is the number of children aged 0-14 per 100 persons aged 15-64, for Medical Lake, WA, is 21.5.

Old-age dependency ratio, which is the number of persons aged 65 or over per 100 persons aged 15-64, for Medical Lake, WA, is 12.5.

Total dependency ratio for Medical Lake, WA is 34.0.

Potential support ratio, which is the number of working-age persons (aged 15-64) per person aged 65 or over, for Medical Lake, WA is 8.0.
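The four ratios listed above all derive from three population counts. As a minimal sketch, assuming the age-group rows have already been summed into counts for ages 0-14, 15-64, and 65 and over (the function and parameter names are hypothetical, not part of the dataset):

```python
def dependency_ratios(children_0_14: int, working_15_64: int,
                      seniors_65_plus: int) -> dict:
    """Compute the ratios described above from three age-band counts."""
    if working_15_64 <= 0:
        raise ValueError("working-age population must be positive")

    # Dependency ratios are expressed per 100 persons aged 15-64.
    youth = 100 * children_0_14 / working_15_64
    old_age = 100 * seniors_65_plus / working_15_64
    total = 100 * (children_0_14 + seniors_65_plus) / working_15_64

    # Potential support ratio: working-age persons per person aged 65+.
    support = (working_15_64 / seniors_65_plus
               if seniors_65_plus > 0 else float("inf"))

    return {
        "youth_dependency": youth,
        "old_age_dependency": old_age,
        "total_dependency": total,
        "potential_support": support,
    }
```

For instance, illustrative counts of 430 children, 2,000 working-age persons, and 250 seniors (made-up numbers, not the Medical Lake data) give the same ratios reported above. The reported figures are also internally consistent: 21.5 + 12.5 = 34.0, and 100 / 12.5 = 8.0.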

Content

When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.

Age groups:

Under 5 years

5 to 9 years

10 to 14 years

15 to 19 years

20 to 24 years

25 to 29 years

30 to 34 years

35 to 39 years

40 to 44 years

45 to 49 years

50 to 54 years

55 to 59 years

60 to 64 years

65 to 69 years

70 to 74 years

75 to 79 years

80 to 84 years

85 years and over

Variables / Data Columns

Age Group: This column displays the age group for the Medical Lake population analysis. There are 18 expected values, defined above in the age groups section.

Population (Male): The male population in Medical Lake for the selected age group.

Population (Female): The female population in Medical Lake for the selected age group.

Total Population: The total population of Medical Lake for the selected age group.

Good to know

Margin of Error

Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

Custom data

If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

Inspiration

The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

Recommended for further research

This dataset is a part of the main dataset for Medical Lake Population by Age.
Medical Service Study Area Demographics
usc-geohealth-hub-uscssi.hub.arcgis.com
Updated Nov 10, 2021
+ more versions
Cite
Spatial Sciences Institute (2021). Medical Service Study Area Demographics [Dataset]. https://usc-geohealth-hub-uscssi.hub.arcgis.com/datasets/medical-service-study-area-demographics
Explore at:
Dataset updated
Nov 10, 2021
Dataset authored and provided by
Spatial Sciences Institute
Area covered

Description
Medical Service Study Areas (MSSAs)

As defined by California's Office of Statewide Health Planning and Development (OSHPD) in 2013, "MSSAs are sub-city and sub-county geographical units used to organize and display population, demographic and physician data" (Source). Each census tract in CA is assigned to a given MSSA. The most recent MSSA dataset (2014) was used. Spatial data are available via OSHPD at the California Open Data Portal. This information may be useful in studying health equity.

Definitions:

Race/Ethnicity: Race/ethnicity is categorized as: All races/ethnicities, Non-Hispanic (NH) White, NH Black, Asian/Pacific Islander, or Hispanic. "All races" includes all of the above, as well as other and unknown race/ethnicity and American Indian/Alaska Native. The latter two groups are not reported separately due to small numbers for many cancer sites.

Racial/Ethnic Composition: Distribution of residents' race/ethnicity (e.g., % Hispanic, % non-Hispanic White, % non-Hispanic Black, % non-Hispanic Asian/Pacific Islander). (Source: US Census, 2010.)

Rural: Percent of residents who reside in blocks that are designated as rural. (Source: US Census, 2010.)

Foreign Born: Percent of residents who were born outside the United States. (Source: American Community Survey, 2008-2012.)

Socioeconomic Status (Neighborhood Level): A composite measure of seven indicator variables created by principal component analysis; indicators include: education, blue-collar job, unemployment, household income, poverty, rent, and house value. Quintiles based on state distribution, with quintile 1 being the lowest SES and 5 being the highest. (Source: American Community Survey, 2008-2012.)

Spatial extent: California
Spatial Unit: MSSA
Created: n/a
Updated: n/a
Source: California Health Maps
Contact Email: gbacr@ucsf.edu
Source Link: https://www.californiahealthmaps.org/?areatype=mssa&address=&sex=Both&site=AllSite&race=&year=05yr&overlays=none&choropleth=Obesity
Mexico-WHO Health Indicators
kaggle.com
zip
Updated Jan 22, 2023
Cite
The Devastator (2023). Mexico-WHO Health Indicators [Dataset]. https://www.kaggle.com/datasets/thedevastator/mexico-who-health-indicators
Explore at:
zip, 818791 bytes (available download formats)
Dataset updated
Jan 22, 2023
Authors
The Devastator
Area covered
Mexico
Description
Mexico-WHO Health Indicators

Demographic, Disease, and Treatment Coverage Data

By Humanitarian Data Exchange [source]

About this dataset

This Kaggle dataset contains a wide array of health and socioeconomic indicators relating to Mexico. It covers topics ranging from mortality and global health estimates to Sustainable Development Goals, Millennium Development Goals (MDGs), Health Systems, Malaria and Tuberculosis, Child Health, Infectious Diseases, World Health Statistics, Health Financing, and Public Health & Environment. Furthermore, it includes indicators for Substance Use & Mental Health; Tobacco Use; Injuries & Violence; HIV/AIDS & Other STIs; Nutrition; Urban Health; Noncommunicable Diseases (NCDs); Neglected Tropical Diseases (NTDs); Infrastructure; Essential Technologies in healthcare systems; and Demographic & Socioeconomic Statistics. Finally, it features indicators on International Regulations Monitoring Frameworks and Insecticide Resistance, among other topics.

This dataset describes where Mexico stands across many aspects of its development, giving researchers deeper insight into the country's health landscape and the data needed to pinpoint potential "hotspots": areas that may require heightened attention from policy makers or from individuals looking for ways to direct their efforts where they benefit the target population most.

How to use the dataset

The dataset is organized into several key categories and each category contains a number of different indicators related to that particular area of healthcare. In order to better understand any given indicator in more detail, each one also has an associated metadata page with additional information about its definition and calculation method.

To make use of the data in this dataset, take the following steps:
- Decide what aspect or area of healthcare you would like to explore in more detail;
- Review and understand any associated metadata regarding its definition or calculation method;
- Download any necessary files containing relevant numbers or figures;
- Analyze or explore this data further;
- Use your findings to inform decisions about policy interventions for improving general public health outcomes in Mexico!

Research Ideas

Analyzing Mexico's progress towards achieving the desired health indicators for the Sustainable Development Goals (SDGs).

Examining how access to healthcare and mental health services vary by region, as well as disparities in treatment within regions.

Developing machine learning models to predict outcomes based on different factors such as environment and socioeconomic status.

Acknowledgements

If you use this dataset in your research, please credit the original authors. Data Source

License

See the dataset description for more information.

Columns

File: infrastructure-indicators-for-mexico-11.csv

| Column name | Description |
|:---------------------------|:---------------------------------------------------------------|
| GHO (CODE) | The Global Health Observatory code for the indicator. (String) |
| GHO (DISPLAY) | The name of the indicator. (String) |
| GHO (URL) | The URL for the indicator. (URL) |
| PUBLISHSTATE (CODE) | The code for the publication state of the indicator. (String) |
| PUBLISHSTATE (DISPLAY) | The name of the publication state of the indicator. (String) |
| PUBLISHSTATE (URL) | The URL for the publication state of the indicator. (URL) |
| YEAR (CODE) | The code for the year of the indicator. (String) |
| YEAR (DISPLAY) | The name of the year of the indicator. (String) |
| YEAR (URL) | The URL for the year of the indicator. (URL) |
| REGION (CODE) | The code for the region of the indicator. (String) |
| REGION (DISPLAY) | The name of the region of the indicator. (String) |
| REGION (URL) |...
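As a rough illustration of working with these columns, the sketch below counts rows per indicator code using Python's standard `csv` module. It assumes the file is comma-separated and that its header row uses the column names exactly as listed above; neither is confirmed by the listing. The sample text and its values (`CODE_A`, `Placeholder indicator A`) are made-up placeholders, not real records.

```python
import csv
import io
from collections import Counter

# Placeholder rows for illustration only; not taken from the real file.
SAMPLE = """GHO (CODE),GHO (DISPLAY),YEAR (DISPLAY)
CODE_A,Placeholder indicator A,2010
CODE_A,Placeholder indicator A,2011
CODE_B,Placeholder indicator B,2011
"""


def rows_per_indicator(csv_text: str) -> Counter:
    """Count how many rows each 'GHO (CODE)' value has."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row["GHO (CODE)"] for row in reader)


counts = rows_per_indicator(SAMPLE)
```

To run this against the downloaded file, read its contents into a string first, or pass an open file object to `csv.DictReader` instead of the `io.StringIO` wrapper.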
COVID-19 Deaths by Population Characteristics
catalog.data.gov
data.sfgov.org
+2 more
Updated Oct 25, 2025
Cite
data.sfgov.org (2025). COVID-19 Deaths by Population Characteristics [Dataset]. https://catalog.data.gov/dataset/covid-19-deaths-by-population-characteristics
Explore at:
Dataset updated
Oct 25, 2025
Dataset provided by
data.sfgov.org
Description
A. SUMMARY This dataset shows San Francisco COVID-19 deaths by population characteristics. This data may not be immediately available for recently reported deaths. Data updates as more information becomes available. Because of this, death totals may increase or decrease.

Population characteristics are subgroups, or demographic cross-sections, like age, race, or gender. The City tracks how deaths have been distributed among different subgroups. This information can reveal trends and disparities among groups.

B. HOW THE DATASET IS CREATED As of January 1, 2023, COVID-19 deaths are defined as persons who had COVID-19 listed as a cause of death or a significant condition contributing to their death on their death certificate. This definition is in alignment with the California Department of Public Health and the national Council of State and Territorial Epidemiologists. Death certificates are maintained by the California Department of Public Health.

Data on the population characteristics of COVID-19 deaths are from:
* Case reports
* Medical records
* Electronic lab reports
* Death certificates

Data are continually updated to maximize completeness of information and reporting on San Francisco COVID-19 deaths.

To protect resident privacy, we summarize COVID-19 data by only one population characteristic at a time. Data are not shown until cumulative citywide deaths reach five or more.

Data notes on select population characteristic types are listed below.

Race/ethnicity * We include all race/ethnicity categories that are collected for COVID-19 cases.

Gender * The City collects information on gender identity using these guidelines.

C. UPDATE PROCESS Updates automatically at 06:30 and 07:30 AM Pacific Time on Wednesday each week.

Dataset will not update on the business day following any federal holiday.

D. HOW TO USE THIS DATASET Population estimates are only available for age groups and race/ethnicity categories. San Francisco population estimates for race/ethnicity and age groups can be found in a dataset based on the San Francisco Population and Demographic Census dataset. These population estimates are from the 2018-2022 5-year American Community Survey (ACS).

This dataset includes several characteristic types. Filter the “Characteristic Type” column to explore a topic area. Then, the “Characteristic Group” column shows each group or category within that topic area and the number of cumulative deaths. Cumulative deaths are the running total of all San Francisco COVID-19 deaths in that characteristic group up to the date listed.

To explore data on the total number of deaths, use the COVID-19 Deaths Over Time dataset.

E. CHANGE LOG
[Archived] COVID-19 Deaths by Population Characteristics Over Time
data.sfgov.org
healthdata.gov
+1 more
csv, xlsx, xml
Updated Jun 27, 2024
Cite
(2024). [Archived] COVID-19 Deaths by Population Characteristics Over Time [Dataset]. https://data.sfgov.org/Health-and-Social-Services/-Archived-COVID-19-Deaths-by-Population-Characteri/kkr3-wq7h
Explore at:
xlsx, xml, csv (available download formats)
Dataset updated
Jun 27, 2024
License
ODC Public Domain Dedication and Licence (PDDL) v1.0 http://www.opendatacommons.org/licenses/pddl/1.0/
License information was derived automatically
Description
As of July 2nd, 2024 the COVID-19 Deaths by Population Characteristics Over Time dataset has been retired. This dataset is archived and will no longer update. We will be publishing a cumulative deaths by population characteristics dataset that will update moving forward.

A. SUMMARY This dataset shows San Francisco COVID-19 deaths by population characteristics and by date. This data may not be immediately available for recently reported deaths. Data updates as more information becomes available. Because of this, death totals for previous days may increase or decrease. More recent data is less reliable.

Population characteristics are subgroups, or demographic cross-sections, like age, race, or gender. The City tracks how deaths have been distributed among different subgroups. This information can reveal trends and disparities among groups.

B. HOW THE DATASET IS CREATED As of January 1, 2023, COVID-19 deaths are defined as persons who had COVID-19 listed as a cause of death or a significant condition contributing to their death on their death certificate. This definition is in alignment with the California Department of Public Health and the national Council of State and Territorial Epidemiologists (https://preparedness.cste.org/wp-content/uploads/2022/12/CSTE-Revised-Classification-of-COVID-19-associated-Deaths.Final_.11.22.22.pdf). Death certificates are maintained by the California Department of Public Health.

Data on the population characteristics of COVID-19 deaths are from:
* Case reports
* Medical records
* Electronic lab reports
* Death certificates

Data are continually updated to maximize completeness of information and reporting on San Francisco COVID-19 deaths.

To protect resident privacy, we summarize COVID-19 data by only one characteristic at a time. Data are not shown until cumulative citywide deaths reach five or more.

Data notes on each population characteristic type are listed below.

Race/ethnicity * We include all race/ethnicity categories that are collected for COVID-19 cases.

Gender * The City collects information on gender identity using these guidelines.

C. UPDATE PROCESS Updates automatically at 06:30 and 07:30 AM Pacific Time on Wednesday each week.

Dataset will not update on the business day following any federal holiday.

D. HOW TO USE THIS DATASET Population estimates are only available for age groups and race/ethnicity categories. San Francisco population estimates for race/ethnicity and age groups can be found in a view based on the San Francisco Population and Demographic Census dataset. These population estimates are from the 2016-2020 5-year American Community Survey (ACS).

This dataset includes many different types of characteristics. Filter the “Characteristic Type” column to explore a topic area. Then, the “Characteristic Group” column shows each group or category within that topic area and the number of deaths on each date.

New deaths are the count of deaths within that characteristic group on that specific date. Cumulative deaths are the running total of all San Francisco COVID-19 deaths in that characteristic group up to the date listed.
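The relationship between new deaths and cumulative deaths described above is a running sum over dates within one characteristic group. A minimal sketch, assuming the rows for a single group have been reduced to (ISO date string, new deaths) pairs; this tuple layout and the function name are illustrative assumptions, not the dataset's actual schema:

```python
from itertools import accumulate


def cumulative_deaths(new_deaths_by_date):
    """Turn (iso_date, new_deaths) pairs for one characteristic group
    into (iso_date, cumulative_deaths) pairs ordered by date."""
    # ISO-formatted date strings (YYYY-MM-DD) sort chronologically.
    ordered = sorted(new_deaths_by_date)
    dates = [date for date, _ in ordered]
    running_totals = accumulate(count for _, count in ordered)
    return list(zip(dates, running_totals))
```

Because the source notes that totals for previous days may increase or decrease as records are updated, a running sum computed this way should be recomputed from the full series after each refresh rather than extended incrementally.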

This data may not be immediately available for more recent deaths. Data updates as more information becomes available.

To explore data on the total number of deaths, use the COVID-19 Deaths Over Time dataset.

E. CHANGE LOG
9/11/2023 - on this date, we began using an updated definition of a COVID-19 death to align with the California Department of Public Health. This change was applied to COVID-19 deaths retrospectively beginning on 1/1/2023. More information about the recommendation by the Council of State and Territorial Epidemiologists that motivated this change can be found at https://preparedness.cste.org/wp-content/uploads/2022/12/CSTE-Revised-Classification-of-COVID-19-associated-Deaths.Final_.11.22.22.pdf.
6/6/2023 - data on deaths by transmission type have been removed. See section ARCHIVED DATA for more detail.
5/16/2023 - data on deaths by sexual orientation, comorbidities, homelessness, and single room occupancy have been removed. See section ARCHIVED DATA for more detail.
4/6/2023 - the State implemented system updates to improve the integrity of historical data.
1/31/2023 - column “population_estimate” added.
3/23/2022 - ‘Native American’ changed to ‘American Indian or Alaska Native’ to align with the census.
1/22/2022 - system updates to improve timeliness and accuracy of cases and deaths data were implemented.
Patient demographics.
datasetcatalog.nlm.nih.gov
plos.figshare.com
Updated Nov 5, 2015
Cite
Fitzgerald-Hughes, Deirdre; O’Keeffe, Kate M.; Hearnden, Claire H.; Leech, John M.; Brown, Aisling F.; Lalor, Stephen J.; McLoughlin, Rachel M.; Rogers, Thomas R.; Mac Aogáin, Micheál; Lacey, Keenan A.; Murphy, Alison G.; Foster, Timothy J.; Tavakol, Mehri; O’Halloran, Dara P.; Geoghegan, Joan A.; Lavelle, Ed C.; Fennell, Jérôme P.; Humphreys, Hilary; van Wamel, Willem J. (2015). Patient demographics. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001916216
Explore at:
Dataset updated
Nov 5, 2015
Authors
Fitzgerald-Hughes, Deirdre; O’Keeffe, Kate M.; Hearnden, Claire H.; Leech, John M.; Brown, Aisling F.; Lalor, Stephen J.; McLoughlin, Rachel M.; Rogers, Thomas R.; Mac Aogáin, Micheál; Lacey, Keenan A.; Murphy, Alison G.; Foster, Timothy J.; Tavakol, Mehri; O’Halloran, Dara P.; Geoghegan, Joan A.; Lavelle, Ed C.; Fennell, Jérôme P.; Humphreys, Hilary; van Wamel, Willem J.
Description
a Healthcare-associated infections were defined as (i) index positive blood culture collected ≥48hrs after hospital admission, and no signs or symptoms of the infection noted at time of admission; OR (ii) index positive blood culture collected <48hrs after hospital admission if any of the following criteria are met: received intravenous therapy in an ambulatory setting in the 30 days before onset of BSI, attended a hospital clinic or haemodialysis in the 30 days before onset of BSI, hospitalised in an acute care hospital for ≥ 2 days in the 90 days prior to onset of BSI, resident of nursing home or long-term care facility.

b Staphylococcus aureus bacteraemia was defined as uncomplicated if all of the following criteria were met: exclusion of endocarditis; no evidence of metastatic infection; absence of implanted prostheses; follow-up blood cultures at 2–4 days culture-negative for S. aureus; defervescence within 72 h of initiating effective therapy. Percentages shown are of entire S. aureus BSI population.

† Three patients had chronic diabetic foot ulcers as a source of their S. aureus BSI, and in all cases the contiguous underlying bone was also found to be infected.

MRSA = methicillin-resistant Staphylococcus aureus. NA = not applicable. BSI = bloodstream infection.

Data are displayed as median (interquartile range) and number (percentage). P values are calculated by Mann-Whitney and Fisher’s exact test respectively.
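The two-branch definition of a healthcare-associated infection in footnote a can be expressed as a predicate. This is a sketch of the stated criteria only, not the authors' code; the record type and every field name below are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BsiEpisode:
    """Hypothetical record for one bloodstream infection (BSI) episode."""
    hours_from_admission_to_culture: float
    symptomatic_at_admission: bool
    iv_therapy_ambulatory_past_30d: bool = False
    clinic_or_haemodialysis_past_30d: bool = False
    acute_hospital_2plus_days_past_90d: bool = False
    nursing_home_or_ltc_resident: bool = False


def is_healthcare_associated(episode: BsiEpisode) -> bool:
    """Apply criteria (i) and (ii) as worded in footnote a."""
    if episode.hours_from_admission_to_culture >= 48:
        # (i) culture at or after 48 h, with no signs or symptoms
        # of the infection noted at the time of admission.
        return not episode.symptomatic_at_admission

    # (ii) culture before 48 h: any one prior-exposure criterion suffices.
    return any([
        episode.iv_therapy_ambulatory_past_30d,
        episode.clinic_or_haemodialysis_past_30d,
        episode.acute_hospital_2plus_days_past_90d,
        episode.nursing_home_or_ltc_resident,
    ])
```

The footnote does not say how to classify an episode cultured at or after 48 hours in a patient who was already symptomatic at admission; the sketch returns False for that case, which is an interpretation rather than something the source states.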
MIMIC-III Clinical Database
physionet.org
oppositeofnorth.com
Updated Sep 4, 2016
Cite
Alistair Johnson; Tom Pollard; Roger Mark (2016). MIMIC-III Clinical Database [Dataset]. http://doi.org/10.13026/C2XW26
Explore at:
Unique identifier
https://doi.org/10.13026/C2XW26
Dataset updated
Sep 4, 2016
Authors
Alistair Johnson; Tom Pollard; Roger Mark
License
https://github.com/MIT-LCP/license-and-dua/tree/master/drafts
Description
MIMIC-III is a large, freely-available database comprising deidentified health-related data associated with over forty thousand patients who stayed in critical care units of the Beth Israel Deaconess Medical Center between 2001 and 2012. The database includes information such as demographics, vital sign measurements made at the bedside (~1 data point per hour), laboratory test results, procedures, medications, caregiver notes, imaging reports, and mortality (including post-hospital discharge).

MIMIC supports a diverse range of analytic studies spanning epidemiology, clinical decision-rule improvement, and electronic tool development. It is notable for three factors: it is freely available to researchers worldwide; it encompasses a diverse and very large population of ICU patients; and it contains highly granular data, including vital signs, laboratory results, and medications.
Claims-based definitions of death.
plos.figshare.com
datasetcatalog.nlm.nih.gov
xls
Updated Jun 2, 2023
Cite
Nobuhiro Ooba; Soko Setoguchi; Takashi Ando; Tsugumichi Sato; Takuhiro Yamaguchi; Mayumi Mochizuki; Kiyoshi Kubota (2023). Claims-based definitions of death. [Dataset]. http://doi.org/10.1371/journal.pone.0066116.t001
Explore at:
xls (available download formats)
Unique identifier
https://doi.org/10.1371/journal.pone.0066116.t001
Dataset updated
Jun 2, 2023
Dataset provided by
PLOS http://plos.org/
Authors
Nobuhiro Ooba; Soko Setoguchi; Takashi Ando; Tsugumichi Sato; Takuhiro Yamaguchi; Mayumi Mochizuki; Kiyoshi Kubota
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Abbreviation: CCI, Charlson Comorbidity Index.
Assessing the validity of a data driven segmentation approach: A 4 year...
plos.figshare.com
docx
Updated May 31, 2023
Cite
Lian Leng Low; Shi Yan; Yu Heng Kwan; Chuen Seng Tan; Julian Thumboo (2023). Assessing the validity of a data driven segmentation approach: A 4 year longitudinal study of healthcare utilization and mortality [Dataset]. http://doi.org/10.1371/journal.pone.0195243
Explore at:
Available download formats: docx
Unique identifier
https://doi.org/10.1371/journal.pone.0195243
Dataset updated
May 31, 2023
Dataset provided by
PLOS (http://plos.org/)
Authors
Lian Leng Low; Shi Yan; Yu Heng Kwan; Chuen Seng Tan; Julian Thumboo
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Background

Segmentation of heterogeneous patient populations into parsimonious and relatively homogenous groups with similar healthcare needs can facilitate healthcare resource planning and development of effective integrated healthcare interventions for each segment. We aimed to apply a data-driven, healthcare utilization-based clustering analysis to segment a regional health system patient population and validate its discriminative ability on 4-year longitudinal healthcare utilization and mortality data.

Methods

We extracted data from the Singapore Health Services Electronic Health Intelligence System, an electronic medical record database that included healthcare utilization (inpatient admissions, specialist outpatient clinic visits, emergency department visits, and primary care clinic visits), mortality, diseases, and demographics for all adult Singapore residents who resided in and had a healthcare encounter with our regional health system in 2012. Hierarchical clustering analysis (Ward’s linkage) and K-means cluster analysis using age and healthcare utilization data in 2012 were applied to segment the selected population. These segments were compared using their demographics (other than age) and morbidities in 2012, and longitudinal healthcare utilization and mortality from 2013–2016.

Results

Among 146,999 subjects, five distinct patient segments (“Young, healthy”; “Middle age, healthy”; “Stable, chronic disease”; “Complicated chronic disease”; and “Frequent admitters”) were identified. Healthcare utilization patterns in 2012, morbidity patterns, and demographics differed significantly across all segments. The “Frequent admitters” segment had the smallest number of patients (1.79% of the population) but consumed 69% of inpatient admissions, 77% of specialist outpatient visits, 54% of emergency department visits, and 23% of primary care clinic visits in 2012. 11.5% and 31.2% of this segment have end stage renal failure and malignancy respectively. The validity of cluster-analysis derived segments is supported by discriminative ability for longitudinal healthcare utilization and mortality from 2013–2016. Incident rate ratios for healthcare utilization and Cox hazards ratio for mortality increased as patient segments increased in complexity. Patients in the “Frequent admitters” segment accounted for a disproportionate healthcare utilization and 8.16 times higher mortality rate.

Conclusion

Our data-driven clustering analysis on a general patient population in Singapore identified five patient segments with distinct longitudinal healthcare utilization patterns and mortality risk to provide an evidence-based segmentation of a regional health system’s healthcare needs.
Infectious Disease Prediction
kaggle.com
zip
Updated Jul 14, 2020
Cite
Haithem Hermassi (2020). Infectious Disease Prediction [Dataset]. https://www.kaggle.com/haithemhermessi/infectious-disease-prediction
Explore at:
Available download formats: zip (1804291 bytes)
Dataset updated
Jul 14, 2020
Authors
Haithem Hermassi
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Context

These data contain counts and rates for Centers for Infectious Diseases-related disease cases among California residents by county, disease, sex, and year spanning 2001-2014 (as of September 2015). Data were extracted on communicable disease cases with an estimated onset or diagnosis date from 2001 through 2014 from California Confidential Morbidity Reports and/or Laboratory Reports that were submitted to CDPH by September 2015 and which met the surveillance case definition for that disease. Cleansing and exploration steps have been performed to generate the train and test datasets.

Content

The train dataset contains 75614 rows and the test dataset has 18904 rows.

Features:

Disease (plain text): The name of the disease reported for the patient.

County (plain text): The county in which the case resided when they were diagnosed and/or where they are currently receiving care; in most cases this will be the county that reported the case.

Year (number): Year is derived from the estimated illness onset date. We defined the estimated illness onset date for each case as the date closest to the time when symptoms first appeared. Because date of illness onset may not be recorded, the estimated date of illness onset can range from the first appearance of symptoms to the date the report was made to CDPH. For diseases with insidious illness onset (for instance, coccidioidomycosis), estimated illness onset was more frequently drawn from the diagnosis date. Values include: years spanning 2001-2014, unless otherwise indicated below.

Sex (plain text): Values include: Male, Female.

Count (number): The number of occurrences of each disease that meet the surveillance definition and/or inclusion criteria specific to that disease for that County, Year, Sex strata. National surveillance case definitions for these conditions can be found at http://wwwn.cdc.gov/nndss/case-definitions.html.

Population (number): The estimated population size (rounded to the nearest integer) for each County, Year, Sex strata. California Department of Finance (DOF) Population Projection data (P-3 data table) were used to determine the population proportion of a particular demographic subgroup relative to the total State/County population for a given year. These proportions were then applied to the DOF Estimate totals (E-2 data table) for the given State/County and year total, to obtain the estimates used. These data are available at http://www.dof.ca.gov/research/demographic/reports/view.php. Value: a positive integer.

Rate (number): The rate of disease per 100,000 population for the corresponding County, Year, Sex strata using the standard calculation (Count * 100,000 / Population). Value: a positive real number (xxx.xxx).

CI.lower (number): The lower bound of the 95% confidence interval for the calculated rate. The confidence interval was calculated with the R software package (R Core Team (2015). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL http://www.R-project.org/.) using the "Exact Pearson-Klopper" method as implemented in the "binom" package (Sundar Dorai-Raj (2014). binom: Binomial Confidence Intervals For Several Parameterizations. R package version 1.1-1. http://CRAN.R-project.org/package=binom). Value: a positive real number (xxx.xxx).

CI.upper (number): The upper bound of the 95% confidence interval for the calculated rate, calculated as above. Value: a positive real number (xxx.xxx).
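The Rate and confidence-interval columns described here can be illustrated with a small Python sketch. It is not the authors' R code: it assumes the exact binomial interval (often called Clopper-Pearson) is applied to Count out of Population, and it solves for the bounds by bisection with only the standard library. The function names are hypothetical.

```python
import math


def rate_per_100k(count: int, population: int) -> float:
    """Rate per 100,000 population: Count * 100,000 / Population."""
    return count * 100_000 / population


def _binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p), summed in log space for stability."""
    if p <= 0.0:
        return 1.0
    if p >= 1.0:
        return 1.0 if k >= n else 0.0
    total = 0.0
    for i in range(k + 1):
        log_pmf = (
            math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * math.log(p) + (n - i) * math.log1p(-p)
        )
        total += math.exp(log_pmf)
    return min(total, 1.0)


def _solve_decreasing(f, target: float) -> float:
    """Bisect on [0, 1] for f(p) == target, where f decreases in p."""
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(count: int, population: int, level: float = 0.95) -> tuple[float, float]:
    """Exact binomial interval for the proportion count / population."""
    alpha = 1.0 - level
    # Lower bound p solves P(X >= count) = alpha/2,
    # i.e. P(X <= count - 1) = 1 - alpha/2.
    lower = 0.0 if count == 0 else _solve_decreasing(
        lambda p: _binom_cdf(count - 1, population, p), 1 - alpha / 2)
    # Upper bound p solves P(X <= count) = alpha/2.
    upper = 1.0 if count == population else _solve_decreasing(
        lambda p: _binom_cdf(count, population, p), alpha / 2)
    return lower, upper
```

Multiplying both bounds by 100,000 would put them on the same scale as the Rate column, under the same assumption about how the dataset's interval was derived.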

Acknowledgements
National Health & Nutrition Exam Survey 2017-2018
kaggle.com
zip
Updated Jan 12, 2024
Cite
Riley Zurrin (2024). National Health & Nutrition Exam Survey 2017-2018 [Dataset]. https://www.kaggle.com/rileyzurrin/national-health-and-nutrition-exam-survey-2017-2018
Explore at:
Available download formats: zip (12252608 bytes)
Dataset updated
Jan 12, 2024
Authors
Riley Zurrin
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
As of January 2024, this is the most recent NHANES dataset whose data collection was not affected by COVID-19.

Context

The National Health and Nutrition Examination Survey (NHANES) is a program of studies designed to assess the health and nutritional status of adults and children in the United States. The survey is unique in that it combines interviews and physical examinations. NHANES is a major program of the National Center for Health Statistics (NCHS). NCHS is part of the Centers for Disease Control and Prevention (CDC) and has the responsibility for producing vital and health statistics for the Nation.

The NHANES program began in the early 1960s and has been conducted as a series of surveys focusing on different population groups or health topics. In 1999, the survey became a continuous program that has a changing focus on a variety of health and nutrition measurements to meet emerging needs. The survey examines a nationally representative sample of about 5,000 persons each year. These persons are located in counties across the country, 15 of which are visited each year.

The NHANES interview includes demographic, socioeconomic, dietary, and health-related questions. The examination component consists of medical, dental, and physiological measurements, as well as laboratory tests administered by highly trained medical personnel.

To date, thousands of research findings have been published using the NHANES data.

Content

The 2017-2018 NHANES datasets include the following components:

1. Demographics dataset:

A complete variable dictionary can be found here

2. Examinations dataset, which contains factors like:

Blood pressure

Body measures

Muscle strength - grip test

Oral health - dentition

Taste & smell

A complete variable dictionary can be found here

3. Dietary data - total nutrient intake:

A complete variable dictionary can be found here

4. Laboratory dataset, which includes factors like:

Albumin & Creatinine - Urine

Apolipoprotein B

Blood Lead, Cadmium, Total Mercury, Selenium, and Manganese

Blood mercury: inorganic, ethyl and methyl

Cholesterol - HDL

Cholesterol - LDL & Triglycerides

Cholesterol - Total

Complete Blood Count with 5-part Differential - Whole Blood

Copper, Selenium & Zinc - Serum

Fasting Questionnaire

Fluoride - Plasma

Fluoride - Water

Glycohemoglobin

Hepatitis A

Hepatitis B Surface Antibody

Hepatitis B: core antibody, surface antigen, and Hepatitis D antibody

Hepatitis C RNA (HCV-RNA) and Hepatitis C Genotype

Hepatitis E: IgG & IgM Antibodies

Herpes Simplex Virus Type-1 & Type-2

HIV Antibody Test

Human Papillomavirus (HPV) - Oral Rinse

Human Papillomavirus (HPV) DNA - Vaginal Swab: Roche Cobas & Roche Linear Array

Human Papillomavirus (HPV) DNA Results from Penile Swab Samples: Roche Linear Array

Insulin

Iodine - Urine

Perchlorate, Nitrate & Thiocyanate - Urine

Perfluoroalkyl and Polyfluoroalkyl Substances (formerly Polyfluoroalkyl Chemicals - PFC)

Personal Care and Consumer Product Chemicals and Metabolites

Phthalates and Plasticizers Metabolites - Urine

Plasma Fasting Glucose

Polycyclic Aromatic Hydrocarbons (PAH) - Urine

Standard Biochemistry Profile

Tissue Transglutaminase Assay (IgA-TTG) & IgA Endomyseal Antibody Assay (IgA EMA)

Trichomonas - Urine

Two-hour Oral Glucose Tolerance Test

Urinary Chlamydia

Urinary Mercury

Urinary Speciated Arsenics

Urinary Total Arsenic

Urine Flow Rate

Urine Metals

Urine Pregnancy Test

Vitamin B12

A complete variable dictionary can be found here

5. Questionnaire dataset, which includes items like:

Acculturation

Alcohol Use

Blood Pressure & Cholesterol

Cardiovascular Health

Consumer Behavior

Current Health Status

Dermatology

Diabetes

Diet Behavior & Nutrition

Disability

Drug Use

Early Childhood

Food Security

Health Insurance

Hepatitis

Hospital Utilization & Access to Care

Housing Characteristics

Immunization

Income

Medical Conditions

Mental Health - Depression Screener

Occupation

Oral Health

Osteoporosis

Pesticide Use

Physical Activity

Physical Functioning

Preventive Aspirin Us...
Socio-demographics of the study population (n = 671).
plos.figshare.com
datasetcatalog.nlm.nih.gov
xls
Updated May 31, 2023
Cite
Birte Pantenburg; Claudia Sikorski; Melanie Luppa; Georg Schomerus; Hans-Helmut König; Perla Werner; Steffi G. Riedel-Heller (2023). Socio-demographics of the study population (n = 671). [Dataset]. http://doi.org/10.1371/journal.pone.0048113.t001
Explore at:
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0048113.t001
Dataset updated
May 31, 2023
Dataset provided by
PLOS (http://plos.org/)
Authors
Birte Pantenburg; Claudia Sikorski; Melanie Luppa; Georg Schomerus; Hans-Helmut König; Perla Werner; Steffi G. Riedel-Heller
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
SD = standard deviation; ow = overweight. *Definition of migrational background adopted from Federal Statistical Office of Germany: Participant not born in Germany or not in possession of German passport, or at least one of participant’s parents not born in Germany [38].
ARCHIVED: COVID-19 Cases and Deaths Summarized by Geography
catalog.data.gov
Updated Mar 29, 2025
+ more versions
Cite
data.sfgov.org (2025). ARCHIVED: COVID-19 Cases and Deaths Summarized by Geography [Dataset]. https://catalog.data.gov/dataset/covid-19-cases-and-deaths-summarized-by-geography
Explore at:
Dataset updated
Mar 29, 2025
Dataset provided by
data.sfgov.org
Description
A. SUMMARY

Medical provider confirmed COVID-19 cases and confirmed COVID-19 related deaths in San Francisco, CA aggregated by several different geographic areas and normalized by 2016-2020 American Community Survey (ACS) 5-year estimates for population data to calculate rate per 10,000 residents. On September 12, 2021, a new case definition of COVID-19 was introduced that includes criteria for enumerating new infections after previous probable or confirmed infections (also known as reinfections). A reinfection is defined as a confirmed positive PCR lab test more than 90 days after a positive PCR or antigen test. The first reinfection case was identified on December 7, 2021. Cases and deaths are both mapped to the residence of the individual, not to where they were infected or died. For example, if one was infected in San Francisco at work but lives in the East Bay, those are not counted as SF Cases, or if one dies in Zuckerberg San Francisco General but is from another county, that is also not counted in this dataset. Dataset is cumulative and covers cases going back to 3/2/2020 when testing began.

Geographic areas summarized are:

1. Analysis Neighborhoods
2. Census Tracts
3. Census Zip Code Tabulation Areas

B. HOW THE DATASET IS CREATED

Addresses from medical data are geocoded by the San Francisco Department of Public Health (SFDPH). Those addresses are spatially joined to the geographic areas. Counts are generated based on the number of address points that match each geographic area. The 2016-2020 American Community Survey (ACS) population estimates provided by the Census are used to create a rate which is equal to ([count] / [acs_population]) * 10000, representing the number of cases per 10,000 residents.

C. UPDATE PROCESS

Geographic analysis is scripted by SFDPH staff and synced to this dataset daily at 7:30 Pacific Time.

D. HOW TO USE THIS DATASET

San Francisco population estimates for geographic regions can be found in a view based on the San Francisco Population and Demographic Census dataset. These population estimates are from the 2016-2020 5-year American Community Survey (ACS).

Privacy rules in effect

To protect privacy, certain rules are in effect:

1. Case counts greater than 0 and less than 10 are dropped - these will be null (blank) values
2. Death counts greater than 0 and less than 10 are dropped - these will be null (blank) values
3. Cases and deaths dropped altogether for areas where acs_population < 1000

Rate suppression in effect where counts lower than 20

Rates are not calculated unless the case count is greater than or equal to 20. Rates are generally unstable at small numbers, so we avoid calculating them directly. We advise you to apply the same approach as this is best practice in epidemiology.

A note on Census ZIP Code Tabulation Areas (ZCTAs)

ZIP Code Tabulation Areas are special boundaries created by the U.S. Census based on ZIP Codes developed by the USPS. They are not, however, the same thing. ZCTAs are areal representations of routes. Read how the Census develops ZCTAs on their website.

Row included for Citywide case counts, incidence rate, and deaths

A single row is included that has the Citywide case counts and incidence rate. This can be used for comparisons. Citywide will capture all cases regardless of address quality. While some cases cannot be mapped to sub-areas like Census Tracts, ongo
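The rate formula and suppression rules in this description can be sketched as a single function. This is an illustration under stated assumptions, not SFDPH's script: the function name and the returned dictionary keys are hypothetical, and `None` stands in for the null (blank) values the description mentions.

```python
from typing import Optional


def summarize_area(cases: int, deaths: int,
                   acs_population: int) -> dict[str, Optional[float]]:
    """Apply the described privacy and rate-suppression rules to one area."""
    # Rule 3: cases and deaths are dropped altogether for small populations.
    if acs_population < 1000:
        return {"cases": None, "deaths": None, "rate": None}

    # Rules 1 and 2: counts greater than 0 and less than 10 become null.
    shown_cases = None if 0 < cases < 10 else cases
    shown_deaths = None if 0 < deaths < 10 else deaths

    # Rates are only calculated when the case count is at least 20:
    # ([count] / [acs_population]) * 10000 gives cases per 10,000 residents.
    rate = cases / acs_population * 10000 if cases >= 20 else None
    return {"cases": shown_cases, "deaths": shown_deaths, "rate": rate}
```

For example, an area with 250 cases, 3 deaths, and 25,000 residents would report the case count, a null death count, and a rate of 100 per 10,000 residents.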
Medicine Park, OK Population Pyramid Dataset: Age Groups, Male and Female...
neilsberg.com
csv, json
Updated Sep 16, 2023
+ more versions
Cite
Neilsberg Research (2023). Medicine Park, OK Population Pyramid Dataset: Age Groups, Male and Female Population, and Total Population for Demographics Analysis [Dataset]. https://www.neilsberg.com/research/datasets/62e48990-3d85-11ee-9abe-0aa64bf2eeb2/
Explore at:
Available download formats: json, csv
Dataset updated
Sep 16, 2023
Dataset authored and provided by
Neilsberg Research
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Medicine Park, Oklahoma
Variables measured
Male and Female Population Under 5 Years, Male and Female Population over 85 years, Male and Female Total Population for Age Groups, Male and Female Population Between 5 and 9 years, Male and Female Population Between 10 and 14 years, Male and Female Population Between 15 and 19 years, Male and Female Population Between 20 and 24 years, Male and Female Population Between 25 and 29 years, Male and Female Population Between 30 and 34 years, Male and Female Population Between 35 and 39 years, and 9 more
Measurement technique
The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates. To measure the three variables, namely (a) male population, (b) female population, and (c) total population, we first analyzed and categorized the data for each of the age groups. We divided ages between 0 and 85 into buckets of roughly 5 years each, and aggregated all ages over 85 into a single group. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
Dataset funded by
Neilsberg Research
Description
About this dataset

Context

The dataset tabulates the data for the Medicine Park, OK population pyramid, which represents the Medicine Park population distribution across age and gender, using estimates from the U.S. Census Bureau American Community Survey 5-Year estimates. It lists the male and female population for each age group, along with the total population for those age groups. Higher numbers at the bottom of the table suggest population growth, whereas higher numbers at the top indicate declining birth rates. Furthermore, the dataset can be utilized to understand the youth dependency ratio, old-age dependency ratio, total dependency ratio, and potential support ratio.

Key observations

Youth dependency ratio, which is the number of children aged 0-14 per 100 persons aged 15-64, for Medicine Park, OK, is 19.1.

Old-age dependency ratio, which is the number of persons aged 65 or over per 100 persons aged 15-64, for Medicine Park, OK, is 18.2.

Total dependency ratio for Medicine Park, OK is 37.3.

Potential support ratio, which is the number of working-age persons (aged 15-64) per elderly person (aged 65 or over), for Medicine Park, OK, is 5.5.
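The four ratios listed in the key observations follow directly from three population counts. The sketch below uses a hypothetical function name and hypothetical counts chosen only so that the results match the published ratios; they are not the actual Medicine Park population figures.

```python
def dependency_ratios(children: int, working_age: int, elderly: int) -> dict[str, float]:
    """Compute dependency ratios from counts of ages 0-14, 15-64, and 65+."""
    youth = 100 * children / working_age    # children per 100 working-age persons
    old_age = 100 * elderly / working_age   # elderly per 100 working-age persons
    return {
        "youth": youth,
        "old_age": old_age,
        "total": youth + old_age,           # total dependency ratio
        "potential_support": working_age / elderly,  # working-age persons per elderly person
    }


# Hypothetical counts (not from the dataset) that reproduce the listed ratios.
ratios = dependency_ratios(children=191, working_age=1000, elderly=182)
print({name: round(value, 1) for name, value in ratios.items()})
```

Note that the potential support ratio is the reciprocal of the old-age dependency ratio scaled by 100, which is why 18.2 and 5.5 appear together.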

Content

When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.

Age groups:

Under 5 years

5 to 9 years

10 to 14 years

15 to 19 years

20 to 24 years

25 to 29 years

30 to 34 years

35 to 39 years

40 to 44 years

45 to 49 years

50 to 54 years

55 to 59 years

60 to 64 years

65 to 69 years

70 to 74 years

75 to 79 years

80 to 84 years

85 years and over

Variables / Data Columns

Age Group: This column displays the age group for the Medicine Park population analysis. There are 18 expected values, defined above in the age groups section.

Population (Male): The male population in Medicine Park for the selected age group.

Population (Female): The female population in Medicine Park for the selected age group.

Total Population: The total population of Medicine Park for the selected age group.

Good to know

Margin of Error

Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

Custom data

If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to assess the feasibility of a custom tabulation on a fee-for-service basis.

Inspiration

The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

Recommended for further research

This dataset is a part of the main dataset for Medicine Park Population by Age. You can refer to it here
MIMIC-III - Deep Reinforcement Learning
kaggle.com
zip
Updated Apr 7, 2022
Cite
Asjad K (2022). MIMIC-III - Deep Reinforcement Learning [Dataset]. https://www.kaggle.com/datasets/asjad99/mimiciii
Explore at:
Available download formats: zip (11100065 bytes)
Dataset updated
Apr 7, 2022
Authors
Asjad K
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Digitization of healthcare data along with algorithmic breakthroughs in AI will have a major impact on healthcare delivery in the coming years. It is interesting to see applications of AI that assist clinicians during patient treatment in a privacy-preserving way. While scientific knowledge can help guide interventions, there remains a key need to quickly cut through the space of decision policies to find effective strategies to support patients during the care process.

Offline reinforcement learning (also referred to as safe or batch reinforcement learning) is a promising sub-field of RL that provides a mechanism for solving real-world sequential decision-making problems where access to a simulator is not available. Here we assume that we learn a policy from a fixed dataset of trajectories without further interaction with the environment (the agent doesn't receive a reward or punishment signal from the environment). It has been shown that such an approach can leverage vast amounts of existing logged data (in the form of previous interactions with the environment) and can outperform supervised learning approaches or heuristic-based policies for solving real-world decision-making problems. Offline RL algorithms, when trained on sufficiently large and diverse offline datasets, can produce close-to-optimal policies (ability to generalize beyond training data).
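To make the idea of learning from a fixed dataset concrete, here is a minimal sketch of tabular Q-learning replayed over a logged batch of transitions. It is a toy stand-in, not the deep RL method used for this dataset: the states, actions, rewards, and function names are all hypothetical, and the sketch does not address actions that are missing from the logged data.

```python
from collections import defaultdict

# Hypothetical logged transitions: (state, action, reward, next_state, done).
BATCH = [
    (0, 0, 0.0, 1, False),
    (0, 1, 0.5, 0, True),
    (1, 0, 1.0, 1, True),
    (1, 1, 0.0, 1, True),
]


def batch_q_learning(batch, n_actions=2, gamma=0.9, alpha=0.5, sweeps=200):
    """Learn Q-values by replaying a fixed batch; the environment is never called."""
    q = defaultdict(lambda: [0.0] * n_actions)
    for _ in range(sweeps):
        for state, action, reward, next_state, done in batch:
            # Bootstrapped target from the logged next state (no new interaction).
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
    return q


def greedy_policy(q):
    """Pick the highest-valued action in each state seen in the batch."""
    return {s: max(range(len(v)), key=v.__getitem__) for s, v in q.items()}


q_values = batch_q_learning(BATCH)
print(greedy_policy(q_values))
```

In this toy batch, action 0 in state 0 leads to state 1, where a reward of 1.0 is available, so its discounted value (0.9) exceeds the immediate 0.5 reward of action 1.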

As part of my PhD research, I investigated the problem of developing a Clinical Decision Support System for sepsis management using offline deep reinforcement learning.

MIMIC-III ('Medical Information Mart for Intensive Care') is a large open-access anonymized single-center database which consists of comprehensive clinical data of 61,532 critical care admissions from 2001–2012 collected at a Boston teaching hospital. The dataset consists of 47 features (including demographics, vitals, and lab test results) on a cohort of sepsis patients who meet the Sepsis-3 definition criteria.

We try to answer the following question:

Given a particular patient’s characteristics and physiological information at each time step as input, can our DeepRL approach learn an optimal treatment policy that can prescribe the right intervention (e.g. use of a ventilator) to the patient at each stage of the treatment process, in order to improve the final outcome (e.g. patient mortality)?

We can use popular state-of-the-art algorithms such as Deep Q Learning (DQN), Double Deep Q Learning (DDQN), DDQN combined with BNC, Mixed Monte Carlo (MMC) and Persistent Advantage Learning (PAL). Using these methods we can train an RL policy to recommend the optimum treatment path for a given patient.

Data acquisition, standard pre-processing and modelling details can be found here in Github repo: https://github.com/asjad99/MIMIC_RL_COACH
Participants demographic data with lifestyle factors.
figshare.com
plos.figshare.com
xls
Updated Apr 25, 2025
Cite
Andrew Li; Jie Lian; Varut Vardhanabhuti (2025). Participants demographic data with lifestyle factors. [Dataset]. http://doi.org/10.1371/journal.pdig.0000795.t001
Explore at:
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pdig.0000795.t001
Dataset updated
Apr 25, 2025
Dataset provided by
PLOS Digital Health
Authors
Andrew Li; Jie Lian; Varut Vardhanabhuti
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Participants demographic data with lifestyle factors.
Cite

Department of Health Care Access and Information (2024). Medical Service Study Areas [Dataset]. https://data.chhs.ca.gov/dataset/medical-service-study-areas

Medical Service Study Areas

Explore at:

Available download formats: csv, html, geojson, kml, zip, arcgis geoservices rest api

Dataset updated

Dec 6, 2024

Dataset authored and provided by

Department of Health Care Access and Information

Description

This is the current Medical Service Study Area. California Medical Service Study Areas are created by the California Department of Health Care Access and Information (HCAI).

Check the Data Dictionary for field descriptions.

Search for the Medical Service Study Area data on the CHHS Open Data Portal.

Check out the California Healthcare Atlas for more Medical Service Study Area information.

This is an update to the MSSA geometries and demographics to reflect the new 2020 Census tract data. The Medical Service Study Area (MSSA) polygon layer represents the best fit mapping of all new 2020 California census tract boundaries to the original 2010 census tract boundaries used in the construction of the original 2010 MSSA file. Each of the state's new 9,129 census tracts was assigned to one of the previously established medical service study areas (excluding tracts with no land area), as identified in this data layer. The MSSA Census tract data is aggregated by HCAI, to create this MSSA data layer. This represents the final re-mapping of 2020 Census tracts to the original 2010 MSSA geometries. The 2010 MSSA were based on U.S. Census 2010 data and public meetings held throughout California.

https://hcai.ca.gov/

Source of update: American Community Survey 5-year 2006-2010 data for poverty. For source tables refer to InfoUSA update procedural documentation. The 2010 MSSA Detail layer was developed to update fields affected by population change. The American Community Survey 5-year 2006-2010 population data pertaining to total, in households, race, ethnicity, age, and poverty was used in the update. The 2010 MSSA Census Tract Detail map layer was developed to support geographic information systems (GIS) applications, representing 2010 census tract geography that is the foundation of 2010 medical service study area (MSSA) boundaries. This version is the finalized MSSA reconfiguration boundaries based on the US Census Bureau 2010 Census. The 1976 Garamendi Rural Health Services Act required the development of a geographic framework for determining which parts of the state were rural and which were urban, and for determining which parts of counties and cities had adequate health care resources and which were "medically underserved". Thus, sub-city and sub-county geographic units called "medical service study areas [MSSAs]" were developed, using combinations of census-defined geographic units, established following General Rules promulgated by a statutory commission. After each subsequent census the MSSAs were revised. In the scheduled revisions that followed the 1990 census, community meetings of stakeholders (including county officials, and representatives of hospitals and community health centers) were held in larger metropolitan areas. The meetings were designed to develop consensus as to how to draw the sub-city units so as to best display health care disparities.
The importance of involving stakeholders was heightened in 1992 when the United States Department of Health and Human Services' Health and Resources Administration entered a formal agreement to recognize the state-determined MSSAs as "rational service areas" for federal recognition of "health professional shortage areas" and "medically underserved areas". After the 2000 census, two innovations transformed the process, and set the stage for GIS to emerge as a major factor in health care resource planning in California. First, the Office of Statewide Health Planning and Development [OSHPD], which organizes the community stakeholder meetings and provides the staff to administer the MSSAs, entered into an Enterprise GIS contract. Second, OSHPD authorized at least one community meeting to be held in each of the 58 counties, a significant number of which were wholly rural or frontier counties. For populous Los Angeles County, 11 community meetings were held. As a result, health resource data in California are collected and organized by 541 geographic units. The boundaries of these units were established by community healthcare experts, with the objective of maximizing their usefulness for needs assessment purposes. The most dramatic consequence was introducing a data simultaneously displayed in a GIS format. A two-person team, incorporating healthcare policy and GIS expertise, conducted the series of meetings, and supervised the development of the 2000-census configuration of the MSSAs.

MSSA Configuration Guidelines (General Rules):

- Each MSSA is composed of one or more complete census tracts.
- As a general rule, MSSAs are deemed to be "rational service areas [RSAs]" for purposes of designating health professional shortage areas [HPSAs], medically underserved areas [MUAs] or medically underserved populations [MUPs].
- MSSAs will not cross county lines.
- To the extent practicable, all census-defined places within the MSSA are within 30 minutes travel time to the largest population center within the MSSA, except in those circumstances where meeting this criterion would require splitting a census tract.
- To the extent practicable, areas that, standing alone, would meet both the definition of an MSSA and a Rural MSSA, should not be a part of an Urban MSSA.
- Any Urban MSSA whose population exceeds 200,000 shall be divided into two or more Urban MSSA Subdivisions.
- Urban MSSA Subdivisions should be within a population range of 75,000 to 125,000, but may not be smaller than five square miles in area. If removing any census tract on the perimeter of the Urban MSSA Subdivision would cause the area to fall below five square miles in area, then the population of the Urban MSSA may exceed 125,000.
- To the extent practicable, Urban MSSA Subdivisions should reflect recognized community and neighborhood boundaries and take into account such demographic information as income level and ethnicity.

Rural Definitions: A rural MSSA is an MSSA adopted by the Commission, which has a population density of less than 250 persons per square mile, and which has no census defined place within the area with a population in excess of 50,000. Only the population that is located within the MSSA is counted in determining the population of the census defined place. A frontier MSSA is a rural MSSA adopted by the Commission which has a population density of less than 11 persons per square mile. Any MSSA which is not a rural or frontier MSSA is an urban MSSA.
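The rural, frontier, and urban definitions above amount to a short decision rule. The sketch below is only an illustration of those thresholds: the function and parameter names are hypothetical, and it assumes the population of the largest census-defined place has already been restricted to the part located within the MSSA.

```python
def classify_mssa(population: int, area_sq_miles: float,
                  largest_place_population: int) -> str:
    """Classify an MSSA as 'frontier', 'rural', or 'urban' per the stated definitions."""
    density = population / area_sq_miles  # persons per square mile

    # Rural: density below 250 and no census-defined place above 50,000 people.
    is_rural = density < 250 and largest_place_population <= 50_000

    # Frontier: a rural MSSA with density below 11 persons per square mile.
    if is_rural and density < 11:
        return "frontier"
    if is_rural:
        return "rural"
    # Anything that is neither rural nor frontier is urban.
    return "urban"
```

For instance, an MSSA with 5,000 residents spread over 1,000 square miles (5 persons per square mile) would be classified as frontier under these thresholds.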
Last updated December 6th 2024.
