12 datasets found
  1. NCHS mortality data 2014-2022

    • zenodo.org
    bin
    Updated Jul 24, 2024
    Cite
    Weinberger Daniel; Weinberger Daniel (2024). NCHS mortality data 2014-2022 [Dataset]. http://doi.org/10.5281/zenodo.12808102
    Explore at:
    Available download formats: bin
    Dataset updated
    Jul 24, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Weinberger Daniel; Weinberger Daniel
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is a database (parquet format) containing publicly available multiple cause mortality data from the US (CDC/NCHS) for 2014-2022. Not all variables are included in this export. Please see below for the restrictions NCHS imposes on the use of these data. You can use the arrow package in R to open the file. For an example analysis, see https://github.com/DanWeinberger/pneumococcal_mortality/blob/main/analysis_nongeo.Rmd . For instance, save this file in a folder called "parquet3":

    library(arrow)

    library(dplyr)

    pneumo.deaths.in <- open_dataset("R:/parquet3", format = "parquet") %>% # open the dataset without reading it into memory
      filter(grepl("J13|A39|J181|A403|B953|G001", all_icd)) %>% # keep records that list any of the selected ICD codes
      collect() # load the result into memory; do as much filtering as you can before calling collect() to avoid memory issues

    The variables included are listed below (see the full dictionary: https://www.cdc.gov/nchs/nvss/mortality_public_use_data.htm)

    year: Calendar year of death

    month: Calendar month of death

    age_detail_number: number indicating years or part of a year; it cannot be interpreted on its own here. See the agey variable instead

    sex: M/F

    place_of_death:

    Place of Death and Decedent’s Status
    1 ... Hospital, Clinic or Medical Center - Inpatient
    2 ... Hospital, Clinic or Medical Center - Outpatient or admitted to Emergency Room
    3 ... Hospital, Clinic or Medical Center - Dead on Arrival
    4 ... Decedent’s home
    5 ... Hospice facility
    6 ... Nursing home/long term care
    7 ... Other
    9 ... Place of death unknown

    all_icd: Cause of death coded as ICD10 codes. ICD1-ICD21 pasted into a single string, with separation of codes by an underscore

    hisp_recode: 0=Non-Hispanic; 1=Hispanic; 999= Not specified

    race_recode: race coding prior to 2018 (reconciled in race_recode_new)

    race_recode_alt: race coding after 2018 (reconciled in race_recode_new)

    race_recode_new:

    1='White'
    2='Black'
    3='Hispanic'
    4='American Indian'
    5='Asian/Pacific Islanders'

    agey: age in years (or a fraction of a year for infants under 12 months)
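
    The all_icd format described above (codes pasted into one underscore-separated string) can also be searched by splitting the string and matching code prefixes. The following is a minimal Python sketch, not part of the original dataset documentation. The sample records are hypothetical, and it assumes codes are stored without a decimal point, as the R filter above suggests (e.g. "J181").

```python
# Hypothetical all_icd strings: ICD-10 codes joined by underscores.
records = [
    "J13_I10_E119",
    "I251_E119",
    "A403_N179",
    "C349_J181",
]

# Same code list as the R filter in the description.
TARGET_PREFIXES = ("J13", "A39", "J181", "A403", "B953", "G001")


def has_target_code(all_icd, prefixes=TARGET_PREFIXES):
    """Return True if any code in the string starts with a target prefix."""
    codes = [code for code in all_icd.split("_") if code]
    return any(code.startswith(prefixes) for code in codes)


matches = [record for record in records if has_target_code(record)]
print(matches)
```

    Splitting on the underscore makes the prefix match explicit for each code, so a short prefix such as "A39" matches its subcodes as well.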

    https://www.cdc.gov/nchs/data_access/restrictions.htm

    Please Read Carefully Before Using NCHS Public Use Survey Data

    The National Center for Health Statistics (NCHS), Centers for Disease Control and Prevention (CDC), conducts statistical and epidemiological activities under the authority granted by the Public Health Service Act (42 U.S.C. § 242k). NCHS survey data are protected by Federal confidentiality laws including Section 308(d) Public Health Service Act [42 U.S.C. 242m(d)] and the Confidential Information Protection and Statistical Efficiency Act or CIPSEA [Pub. L. No. 115-435, 132 Stat. 5529 § 302]. These confidentiality laws state the data collected by NCHS may be used only for statistical reporting and analysis. Any effort to determine the identity of individuals and establishments violates the assurances of confidentiality provided by federal law.

    Terms and Conditions

    NCHS does all it can to assure that the identity of individuals and establishments cannot be disclosed. All direct identifiers, as well as any characteristics that might lead to identification, are omitted from the dataset. Any intentional identification or disclosure of an individual or establishment violates the assurances of confidentiality given to the providers of the information. Therefore, users will:

    1. Use the data in this dataset for statistical reporting and analysis only.
    2. Make no attempt to learn the identity of any person or establishment included in these data.
    3. Not link this dataset with individually identifiable data from other NCHS or non-NCHS datasets.
    4. Not engage in any efforts to assess disclosure methodologies applied to protect individuals and establishments or any research on methods of re-identification of individuals and establishments.

    By using these data you signify your agreement to comply with the above-stated statutorily based requirements.

    Sanctions for Violating NCHS Data Use Agreement

    Anyone who willfully discloses any information that could identify a person or establishment in any manner to a person or agency not entitled to receive it shall be guilty of a class E felony and imprisoned for not more than 5 years, or fined not more than $250,000, or both.

  2. NCHS - Leading Causes of Death: United States

    • catalog.data.gov
    • healthdata.gov
    • +6 more
    Updated Apr 23, 2025
    + more versions
    Cite
    Centers for Disease Control and Prevention (2025). NCHS - Leading Causes of Death: United States [Dataset]. https://catalog.data.gov/dataset/nchs-leading-causes-of-death-united-states
    Explore at:
    Dataset updated
    Apr 23, 2025
    Dataset provided by
    Centers for Disease Control and Prevention (http://www.cdc.gov/)
    Area covered
    United States
    Description

    This dataset presents the age-adjusted death rates for the 10 leading causes of death in the United States beginning in 1999. Data are based on information from all resident death certificates filed in the 50 states and the District of Columbia using demographic and medical characteristics. Age-adjusted death rates (per 100,000 population) are based on the 2000 U.S. standard population. Populations used for computing death rates after 2010 are postcensal estimates based on the 2010 census, estimated as of July 1, 2010. Rates for census years are based on populations enumerated in the corresponding censuses. Rates for non-census years before 2010 are revised using updated intercensal population estimates and may differ from rates previously published. Causes of death classified by the International Classification of Diseases, Tenth Revision (ICD–10) are ranked according to the number of deaths assigned to rankable causes. Cause of death statistics are based on the underlying cause of death. SOURCES CDC/NCHS, National Vital Statistics System, mortality data (see http://www.cdc.gov/nchs/deaths.htm); and CDC WONDER (see http://wonder.cdc.gov). REFERENCES National Center for Health Statistics. Vital statistics data available. Mortality multiple cause files. Hyattsville, MD: National Center for Health Statistics. Available from: https://www.cdc.gov/nchs/data_access/vitalstatsonline.htm. Murphy SL, Xu JQ, Kochanek KD, Curtin SC, and Arias E. Deaths: Final data for 2015. National vital statistics reports; vol 66. no. 6. Hyattsville, MD: National Center for Health Statistics. 2017. Available from: https://www.cdc.gov/nchs/data/nvsr/nvsr66/nvsr66_06.pdf.
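
    The age-adjusted rates described above are produced by direct standardization: each age-specific death rate is weighted by that age group's share of a standard population. The following Python sketch illustrates the arithmetic only. The three age groups, the counts, and the standard population weights are hypothetical and are not the 2000 U.S. standard population.

```python
# Hypothetical inputs for three broad age groups (real tables use more groups).
age_groups = ["0-44", "45-64", "65+"]
deaths = [50, 200, 900]
population = [600_000, 250_000, 150_000]
# Illustrative standard population, NOT the 2000 U.S. standard.
standard_population = [650_000, 220_000, 130_000]


def age_adjusted_rate(deaths, population, standard_population, per=100_000):
    """Weight each age-specific rate by the standard population share."""
    total_standard = sum(standard_population)
    weighted = 0.0
    for d, p, s in zip(deaths, population, standard_population):
        weighted += (d / p) * (s / total_standard)
    return weighted * per


crude_rate = sum(deaths) / sum(population) * 100_000
adjusted_rate = age_adjusted_rate(deaths, population, standard_population)
print(round(crude_rate, 1), round(adjusted_rate, 1))
```

    In this made-up example the population is older than the standard, so the adjusted rate comes out below the crude rate.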

  3. Reduced Access to Care During COVID-19

    • catalog.data.gov
    • data.virginia.gov
    • +5 more
    Updated Apr 23, 2025
    + more versions
    Cite
    Centers for Disease Control and Prevention (2025). Reduced Access to Care During COVID-19 [Dataset]. https://catalog.data.gov/dataset/reduced-access-to-care-during-covid-19
    Explore at:
    Dataset updated
    Apr 23, 2025
    Dataset provided by
    Centers for Disease Control and Prevention (http://www.cdc.gov/)
    Description

    The Research and Development Survey (RANDS) is a platform designed for conducting survey question evaluation and statistical research. RANDS is an ongoing series of surveys from probability-sampled commercial survey panels used for methodological research at the National Center for Health Statistics (NCHS). RANDS estimates are generated using an experimental approach that differs from the survey design approaches generally used by NCHS, including possible biases from different response patterns and sampling frames as well as increased variability from lower sample sizes. Use of the RANDS platform allows NCHS to produce more timely data than would be possible using traditional data collection methods. RANDS is not designed to replace NCHS’ higher quality, core data collections. Below are experimental estimates of reduced access to healthcare for three rounds of RANDS during COVID-19. Data collection for the three rounds of RANDS during COVID-19 occurred between June 9, 2020 and July 6, 2020, August 3, 2020 and August 20, 2020, and May 17, 2021 and June 30, 2021. Information needed to interpret these estimates can be found in the Technical Notes. RANDS during COVID-19 included questions about unmet care in the last 2 months during the coronavirus pandemic. Unmet needs for health care are often the result of cost-related barriers. The National Health Interview Survey, conducted by NCHS, is the source for high-quality data to monitor cost-related health care access problems in the United States. For example, in 2018, 7.3% of persons of all ages reported delaying medical care due to cost and 4.8% reported needing medical care but not getting it due to cost in the past year. However, cost is not the only reason someone might delay or not receive needed medical care. 
As a result of the coronavirus pandemic, people also may not get needed medical care due to cancelled appointments, cutbacks in transportation options, fear of going to the emergency room, or an altruistic desire to not be a burden on the health care system, among other reasons. The Household Pulse Survey (https://www.cdc.gov/nchs/covid19/pulse/reduced-access-to-care.htm), an online survey conducted in response to the COVID-19 pandemic by the Census Bureau in partnership with other federal agencies including NCHS, also reports estimates of reduced access to care during the pandemic (beginning in Phase 1, which started on April 23, 2020). The Household Pulse Survey reports the percentage of adults who delayed medical care in the last 4 weeks or who needed medical care at any time in the last 4 weeks for something other than coronavirus but did not get it because of the pandemic. The experimental estimates on this page are derived from RANDS during COVID-19 and show the percentage of U.S. adults who were unable to receive medical care (including urgent care, surgery, screening tests, ongoing treatment, regular checkups, prescriptions, dental care, vision care, and hearing care) in the last 2 months. Technical Notes: https://www.cdc.gov/nchs/covid19/rands/reduced-access-to-care.htm#limitations

  4. Trends in COVID-19 Cases and Deaths in the United States, by County-level...

    • data.cdc.gov
    • data.virginia.gov
    • +2 more
    csv, xlsx, xml
    Updated Jun 8, 2023
    + more versions
    Cite
    CDC COVID-19 Response (2023). Trends in COVID-19 Cases and Deaths in the United States, by County-level Population Factors - ARCHIVED [Dataset]. https://data.cdc.gov/dataset/Trends-in-COVID-19-Cases-and-Deaths-in-the-United-/njmz-dpbc
    Explore at:
    Available download formats: xml, xlsx, csv
    Dataset updated
    Jun 8, 2023
    Dataset provided by
    Centers for Disease Control and Prevention (http://www.cdc.gov/)
    Authors
    CDC COVID-19 Response
    Area covered
    United States
    Description

    Reporting of Aggregate Case and Death Count data was discontinued on May 11, 2023, with the expiration of the COVID-19 public health emergency declaration. Although these data will continue to be publicly available, this dataset will no longer be updated.

    The surveillance case definition for COVID-19, a nationally notifiable disease, was first described in a position statement from the Council for State and Territorial Epidemiologists, which was later revised. However, there is some variation in how jurisdictions implemented these case definitions. More information on how CDC collects COVID-19 case surveillance data can be found at FAQ: COVID-19 Data and Surveillance.

    Aggregate Data Collection Process

    Since the beginning of the COVID-19 pandemic, data were reported from state and local health departments through a robust process with the following steps:

    • Aggregate county-level counts were obtained indirectly, via automated overnight web collection, or directly, via a data submission process.
    • If more than one official county data source existed, CDC used a comprehensive data selection process comparing each official county data source to retrieve the highest case and death counts, unless otherwise specified by the state.
    • A CDC data team reviewed counts for congruency prior to integration and set up alerts to monitor for discrepancies in the data.
    • CDC routinely compiled these data and posted the finalized information on COVID Data Tracker.
    • County level data were aggregated to obtain state- and territory- specific totals.
    • Counting of cases and deaths is based on date of report and not on the date of symptom onset. CDC calculates rates in these data by using population estimates provided by the US Census Bureau Population Estimates Program (2019 Vintage).
    • COVID-19 aggregate case and death data are organized in a time series that includes the cumulative number of cases and deaths as reported by a jurisdiction on a given date. New case and death counts are calculated as the week-to-week change in cumulative counts of cases and deaths reported (i.e., newly reported cases and deaths = cumulative number of cases/deaths reported this week minus the cumulative total reported the prior week).
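
    The last step above, deriving new counts from cumulative counts, amounts to a week-to-week difference. A minimal Python sketch with hypothetical cumulative totals for one county:

```python
# Hypothetical cumulative case totals for one county, one value per week.
cumulative_cases = [100, 130, 130, 175]


def weekly_new_counts(cumulative):
    """New count for each week = this week's cumulative minus the prior week's."""
    return [current - prior for prior, current in zip(cumulative, cumulative[1:])]


new_cases = weekly_new_counts(cumulative_cases)
print(new_cases)
```

    The first week has no prior week, so this sketch reports differences from the second week onward. Because cumulative counts were sometimes revised retrospectively, a difference computed this way can be negative.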

    This process was collaborative, with CDC and jurisdictions working together to ensure the accuracy of COVID-19 case and death numbers. County counts provided the most up-to-date numbers on cases and deaths by report date. Throughout data collection, CDC retrospectively updated counts to correct known data quality issues.

    Description

    This archived public use dataset focuses on the cumulative and weekly case and death rates per 100,000 persons within various sociodemographic factors across all states and their counties. All resulting data are expressed as rates calculated as the number of cases or deaths per 100,000 persons in counties meeting various classification criteria using the US Census Bureau Population Estimates Program (2019 Vintage).

    Each county within jurisdictions is classified into multiple categories for each factor. All rates in this dataset are based on classification of counties by the characteristics of their population, not individual-level factors. This applies to each of the available factors observed in this dataset. Specific factors and their corresponding categories are detailed below.

    Population-level factors

    Each unique population factor is detailed below. Please note that the “Classification” column describes each of the 12 factors in the dataset, including a data dictionary describing what each numeric digit means within each classification. The “Category” column uses numeric digits (2-6, depending on the factor) defined in the “Classification” column.

    Metro vs. Non-Metro – “Metro_Rural”

    Metro vs. Non-Metro classification type is an aggregation of the 6 National Center for Health Statistics (NCHS) Urban-Rural classifications, where “Metro” counties include Large Central Metro, Large Fringe Metro, Medium Metro, and Small Metro areas and “Non-Metro” counties include Micropolitan and Non-Core (Rural) areas.

    1 – Metro, including “Large Central Metro, Large Fringe Metro, Medium Metro, and Small Metro” areas
    2 – Non-Metro, including “Micropolitan, and Non-Core” areas

    Urban/rural - “NCHS_Class”

    Urban/rural classification type is based on the 2013 National Center for Health Statistics Urban-Rural Classification Scheme for Counties. Levels consist of:

    1 Large Central Metro
    2 Large Fringe Metro
    3 Medium Metro
    4 Small Metro
    5 Micropolitan
    6 Non-Core (Rural)
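
    The relationship between the six “NCHS_Class” levels and the two “Metro_Rural” categories described above can be written as a small lookup. This Python sketch uses only the codes listed in the description. The function name is hypothetical.

```python
# The six NCHS urban-rural levels listed in the description.
NCHS_CLASS = {
    1: "Large Central Metro",
    2: "Large Fringe Metro",
    3: "Medium Metro",
    4: "Small Metro",
    5: "Micropolitan",
    6: "Non-Core (Rural)",
}


def metro_rural(nchs_class):
    """Collapse an NCHS_Class code (1-6) to Metro_Rural: 1 = Metro, 2 = Non-Metro."""
    if nchs_class not in NCHS_CLASS:
        raise ValueError(f"unknown NCHS_Class code: {nchs_class}")
    return 1 if nchs_class <= 4 else 2


collapsed = {code: metro_rural(code) for code in NCHS_CLASS}
print(collapsed)
```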

    American Community Survey (ACS) data were used to classify counties based on their age, race/ethnicity, household size, poverty level, and health insurance status distributions. Cut points were generated by using tertiles and categorized as High, Moderate, and Low percentages. The classification “Percent non-Hispanic, Native Hawaiian/Pacific Islander” is only available for “Hawaii” due to low numbers in this category for other available locations. This limitation also applies to other race/ethnicity categories within certain jurisdictions, where 0 counties fall into the certain category. The cut points for each ACS category are further detailed below:

    Age 65 - “Age65”

    1 Low (0-24.4%) 2 Moderate (>24.4%-28.6%) 3 High (>28.6%)

    Non-Hispanic, Asian - “NHAA”

    1 Low (<=5.7%) 2 Moderate (>5.7%-17.4%) 3 High (>17.4%)

    Non-Hispanic, American Indian/Alaskan Native - “NHIA”

    1 Low (<=0.7%) 2 Moderate (>0.7%-30.1%) 3 High (>30.1%)

    Non-Hispanic, Black - “NHBA”

    1 Low (<=2.5%) 2 Moderate (>2.5%-37%) 3 High (>37%)

    Hispanic - “HISP”

    1 Low (<=18.3%) 2 Moderate (>18.3%-45.5%) 3 High (>45.5%)

    Population in Poverty - “Pov”

    1 Low (0-12.3%) 2 Moderate (>12.3%-17.3%) 3 High (>17.3%)

    Population Uninsured- “Unins”

    1 Low (0-7.1%) 2 Moderate (>7.1%-11.4%) 3 High (>11.4%)

    Average Household Size - “HH”

    1 Low (1-2.4) 2 Moderate (>2.4-2.6) 3 High (>2.6)

    Community Vulnerability Index Value - “CCVI”

    COVID-19 Community Vulnerability Index (CCVI) scores, which range from 0 to 1, are from Surgo Ventures. Scores were categorized as:

    1 Low Vulnerability (0.0-0.4) 2 Moderate Vulnerability (0.4-0.6) 3 High Vulnerability (0.6-1.0)

    Social Vulnerability Index Value – “SVI”

    Social Vulnerability Index (SVI) scores (vintage 2020), which also range from 0 to 1, are from CDC/ATSDR’s Geospatial Research, Analysis & Service Program. Cut points for CCVI and SVI scores were generated based on tertiles and categorized as:

    1 Low Vulnerability (0-0.333) 2 Moderate Vulnerability (0.334-0.666) 3 High Vulnerability (0.667-1)
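
    The three-level categories above all follow the same pattern: a value at or below the first cut point is Low (1), a value above it and at or below the second is Moderate (2), and anything higher is High (3). A minimal Python sketch using three of the cut points listed above. The function and dictionary names are hypothetical.

```python
# Cut points copied from the description: (upper bound of Low, upper bound of Moderate).
CUT_POINTS = {
    "Pov": (12.3, 17.3),    # percent of population in poverty
    "Unins": (7.1, 11.4),   # percent of population uninsured
    "HH": (2.4, 2.6),       # average household size
}

LABELS = {1: "Low", 2: "Moderate", 3: "High"}


def categorize(factor, value):
    """Return the category digit (1, 2, or 3) for a county-level value."""
    low_max, moderate_max = CUT_POINTS[factor]
    if value <= low_max:
        return 1
    if value <= moderate_max:
        return 2
    return 3


category = categorize("Pov", 15.0)
print(category, LABELS[category])
```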

  5. COVID-19 Cases and Deaths by Race/Ethnicity - ARCHIVE

    • data.ct.gov
    • s.cnmilf.com
    • +1 more
    application/rdfxml +5
    Updated Jun 24, 2022
    Cite
    Department of Public Health (2022). COVID-19 Cases and Deaths by Race/Ethnicity - ARCHIVE [Dataset]. https://data.ct.gov/Health-and-Human-Services/COVID-19-Cases-and-Deaths-by-Race-Ethnicity-ARCHIV/7rne-efic
    Explore at:
    Available download formats: xml, tsv, csv, application/rdfxml, json, application/rssxml
    Dataset updated
    Jun 24, 2022
    Dataset authored and provided by
    Department of Public Health
    License

    U.S. Government Works: https://www.usa.gov/government-works
    License information was derived automatically

    Description

    Note: DPH is updating and streamlining the COVID-19 cases, deaths, and testing data. As of 6/27/2022, the data will be published in four tables instead of twelve.

    The COVID-19 Cases, Deaths, and Tests by Day dataset contains cases and test data by date of sample submission. The death data are by date of death. This dataset is updated daily and contains information back to the beginning of the pandemic. The data can be found at https://data.ct.gov/Health-and-Human-Services/COVID-19-Cases-Deaths-and-Tests-by-Day/g9vi-2ahj.

    The COVID-19 State Metrics dataset contains over 93 columns of data. This dataset is updated daily and currently contains information starting June 21, 2022 to the present. The data can be found at https://data.ct.gov/Health-and-Human-Services/COVID-19-State-Level-Data/qmgw-5kp6 .

    The COVID-19 County Metrics dataset contains 25 columns of data. This dataset is updated daily and currently contains information starting June 16, 2022 to the present. The data can be found at https://data.ct.gov/Health-and-Human-Services/COVID-19-County-Level-Data/ujiq-dy22 .

    The COVID-19 Town Metrics dataset contains 16 columns of data. This dataset is updated daily and currently contains information starting June 16, 2022 to the present. The data can be found at https://data.ct.gov/Health-and-Human-Services/COVID-19-Town-Level-Data/icxw-cada . To protect confidentiality, if a town has fewer than 5 cases or positive NAAT tests over the past 7 days, those data will be suppressed.

    COVID-19 cases and associated deaths that have been reported among Connecticut residents, broken down by race and ethnicity. All data in this report are preliminary; data for previous dates will be updated as new reports are received and data errors are corrected. Deaths reported to either the Office of the Chief Medical Examiner (OCME) or the Department of Public Health (DPH) are included in the COVID-19 update.

    The following data show the number of COVID-19 cases and associated deaths per 100,000 population by race and ethnicity. Crude rates represent the total cases or deaths per 100,000 people. Age-adjusted rates consider the age of the person at diagnosis or death when estimating the rate and use a standardized population to provide a fair comparison between population groups with different age distributions. Age adjustment is important in Connecticut because the median age among the non-Hispanic white population is 47 years, whereas it is 34 years among non-Hispanic blacks and 29 years among Hispanics. Because most non-Hispanic white residents who died were over 75 years of age, the age-adjusted rates are lower than the unadjusted rates. In contrast, Hispanic residents who died tend to be younger than 75 years of age, which results in higher age-adjusted rates.

    The population data used to calculate rates is based on the CT DPH population statistics for 2019, which is available online here: https://portal.ct.gov/DPH/Health-Information-Systems--Reporting/Population/Population-Statistics. Prior to 5/10/2021, the population estimates from 2018 were used.

    Rates are standardized to the 2000 US Millions Standard population (data available here: https://seer.cancer.gov/stdpopulations/). Standardization was done using 19 age groups (0, 1-4, 5-9, 10-14, ..., 80-84, 85 years and older). More information about direct standardization for age adjustment is available here: https://www.cdc.gov/nchs/data/statnt/statnt06rv.pdf
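
    The 19 age groups named above (0, 1-4, 5-9, 10-14, ..., 80-84, 85 years and older) can be generated and assigned programmatically. This is a Python sketch of the binning step only, with hypothetical function and variable names. It does not reproduce DPH's actual processing.

```python
# Build the 19 labels: "0", "1-4", then five-year bands up to "80-84", then "85+".
AGE_GROUPS = ["0", "1-4"] + [f"{low}-{low + 4}" for low in range(5, 85, 5)] + ["85+"]


def age_group_label(age_years):
    """Map an age in years to one of the 19 standardization age groups."""
    if age_years < 0:
        raise ValueError("age must be non-negative")
    if age_years < 1:
        return "0"
    if age_years < 5:
        return "1-4"
    if age_years >= 85:
        return "85+"
    low = int(age_years) // 5 * 5
    return f"{low}-{low + 4}"


print(len(AGE_GROUPS), age_group_label(47))
```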

    Categories are mutually exclusive. The category “multiracial” includes people who answered ‘yes’ to more than one race category. Counts may not add up to total case counts as data on race and ethnicity may be missing. Age-adjusted rates are calculated only for groups with more than 20 deaths. Abbreviation: NH=Non-Hispanic.

    Data on Connecticut deaths were obtained from the Connecticut Deaths Registry maintained by the DPH Office of Vital Records. Cause of death was determined by a death certifier (e.g., physician, APRN, medical examiner) using their best clinical judgment. Additionally, all COVID-19 deaths, including suspected or related, are required to be reported to OCME. On April 4, 2020, CT DPH and OCME released a joint memo to providers and facilities within Connecticut providing guidelines for certifying deaths due to COVID-19 that were consistent with the CDC’s guidelines and a reminder of the required reporting to OCME [25,26]. As of July 1, 2021, OCME had reviewed every case reported and performed additional investigation on about one-third of reported deaths to better ascertain whether COVID-19 caused or contributed to the death. Some of these investigations resulted in the OCME performing postmortem swabs for PCR testing on individuals whose deaths were suspected to be due to COVID-19 but for whom an antemortem diagnosis could not be made [31]. The OCME issued or re-issued about 10% of COVID-19 death certificates and, when appropriate, removed COVID-19 from the death certificate. For standardization and tabulation of mortality statistics, written cause of death statements made by the certifiers on death certificates are sent to the National Center for Health Statistics (NCHS) at the CDC, which assigns cause of death codes according to the International Classification of Diseases, 10th Revision (ICD-10) classification system [25,26]. COVID-19 deaths in this report are defined as those for which the death certificate has an ICD-10 code of U07.1 as either a primary (underlying) or a contributing cause of death. More information on COVID-19 mortality can be found at the following link: https://portal.ct.gov/DPH/Health-Information-Systems--Reporting/Mortality/Mortality-Statistics

    Data are subject to future revision as reporting changes.

    Starting in July 2020, this dataset will be updated every weekday.

    Additional notes: A delay in the data pull schedule occurred on 06/23/2020. Data from 06/22/2020 was processed on 06/23/2020 at 3:30 PM. The normal data cycle resumed with the data for 06/23/2020.

    A network outage on 05/19/2020 resulted in a change in the data pull schedule. Data from 5/19/2020 was processed on 05/20/2020 at 12:00 PM. Data from 5/20/2020 was processed on 5/20/2020 8:30 PM. The normal data cycle resumed on 05/20/2020 with the 8:30 PM data pull. As a result of the network outage, the timestamp on the datasets on the Open Data Portal differ from the timestamp in DPH's daily PDF reports.

    Starting 5/10/2021, the date field will represent the date this data was updated on data.ct.gov. Previously the date the data was pulled by DPH was listed, which typically coincided with the date before the data was published on data.ct.gov. This change was made to standardize the COVID-19 data sets on data.ct.gov.

  6. Weekly United States COVID-19 Cases and Deaths by County - ARCHIVED

    • data.cdc.gov
    • data.virginia.gov
    • +1 more
    csv, xlsx, xml
    Updated Jul 10, 2023
    Cite
    CDC COVID-19 Response (2023). Weekly United States COVID-19 Cases and Deaths by County - ARCHIVED [Dataset]. https://data.cdc.gov/w/yviw-z6j5/tdwk-ruhb?cur=0sEK0zoBw6T
    Explore at:
    Available download formats: xml, xlsx, csv
    Dataset updated
    Jul 10, 2023
    Dataset provided by
    Centers for Disease Control and Prevention (http://www.cdc.gov/)
    Authors
    CDC COVID-19 Response
    Area covered
    United States
    Description

    Note: The cumulative case count for some counties (with small population) is higher than expected due to the inclusion of non-permanent residents in COVID-19 case counts.

    Reporting of Aggregate Case and Death Count data was discontinued on May 11, 2023, with the expiration of the COVID-19 public health emergency declaration. Although these data will continue to be publicly available, this dataset will no longer be updated.

    Aggregate Data Collection Process

    Since the beginning of the COVID-19 pandemic, data were reported through a robust process with the following steps:

    • Aggregate county-level counts were obtained indirectly, via automated overnight web collection, or directly, via a data submission process.
    • If more than one official county data source existed, CDC used a comprehensive data selection process comparing each official county data source to retrieve the highest case and death counts, unless otherwise specified by the state.
    • A CDC data team reviewed counts for congruency prior to integration. CDC routinely compiled these data and posted the finalized information on COVID Data Tracker.
    • Cases and deaths are based on date of report and not on the date of symptom onset. CDC calculates rates in this data by using population estimates provided by the US Census Bureau Population Estimates Program (2019 Vintage).
    • COVID-19 aggregate case and death data were organized in a time series that includes the cumulative number of cases and deaths as reported by a jurisdiction on a given date. New case and death counts were calculated as the week-to-week change in reported cumulative cases and deaths (i.e., newly reported cases and deaths = cumulative number of cases/deaths reported this week minus the cumulative total reported the week before).

    This process was collaborative, with CDC and jurisdictions working together to ensure the accuracy of COVID-19 case and death numbers. County counts provided the most up-to-date numbers on cases and deaths by report date. Throughout data collection, CDC retrospectively updated counts to correct known data quality issues. CDC also worked with jurisdictions after the end of the public health emergency declaration to finalize county data.

    • Source: The weekly archived dataset is based on county-level aggregate count data
    • Confirmed/Probable Cases/Death breakdown: Cumulative cases and deaths for each county are included. Total reported cases include probable and confirmed cases.
    • Time Series Frequency: The weekly archived dataset contains weekly time series data (i.e., one record per week per county)

    Important note: The counts reflected during a given time period in this dataset may not match the counts reflected for the same time period in the daily archived dataset noted above. Discrepancies may exist due to differences between county and state COVID-19 case surveillance and reconciliation efforts.

    The surveillance case definition for COVID-19, a nationally notifiable disease, was first described in a position statement from the Council for State and Territorial Epidemiologists, which was later revised. However, there is some variation in how jurisdictions implement these case classifications. More information on how CDC collects COVID-19 case surveillance data can be found at FAQ: COVID-19 Data and Surveillance.

    Confirmed and Probable Counts

    In this dataset, counts by jurisdiction are not displayed by confirmed or probable status. Instead, counts of confirmed and probable cases and deaths are included in the Total Cases and Total Deaths columns, when available. Not all jurisdictions reported probable cases and deaths to CDC. Confirmed and probable case definition criteria are described here: Coronavirus Disease 2019 (COVID-19) 2023 Case Definition | CDC (https://ndc.services.cdc.gov/case-definitions/coronavirus-disease-2019-covid-19/); Council of State and Territorial Epidemiologists (ymaws.com).

    Deaths

    COVID-19 deaths were reported to CDC from several sources since the beginning of the pandemic, including aggregate death data and NCHS Provisional Death Counts. Historic information presented on the COVID Data Tracker pages was based on the same source (Aggregate Data) as the present dataset until the expiration of the public health emergency declaration on May 11, 2023; however, the NCHS Death Counts are based on death certificate data that use information reported by physicians, medical examiners, or coroners in the cause-of-death section of each certificate. Counts from previous weeks were continually revised as more records were received and processed.

    Number of Jurisdictions Reporting

    There were 60 public health jurisdictions that reported cases and deaths of COVID-19. This included the 50 states, the District of Columbia, New York City, the U.S. territories of American Samoa, Guam, the Commonwealth of the Northern Mariana Islands, Puerto Rico, and the U.S. Virgin Islands, as well as three independent countries in compacts of free association with the United States: the Federated States of Micronesia, the Republic of the Marshall Islands, and the Republic of Palau. In total there were 3,222 counties for which counts were tracked within the 60 public health jurisdictions.

    Additional COVID-19 public use datasets, including line-level (patient-level) data, are available at: https://data.cdc.gov/browse?tags=covid-19.

    Note: In early 2020, Alaska enacted changes to their counties/boroughs due to low populations in certain areas:

    • Case and death counts for Yakutat City and Borough, Alaska, are shown as 0 by default.
    • Case and death counts for Hoonah-Angoon Census Area, Alaska, represent total cases and deaths in residents of Hoonah-Angoon Census Area, Alaska, and Yakutat City and Borough, Alaska.
    • Case and death counts for Bristol Bay Borough, Alaska, are shown as 0 by default.
    • Case and death counts for Lake and Peninsula Borough, Alaska, represent total cases and deaths in residents of Lake and Peninsula Borough, Alaska, and Bristol Bay Borough, Alaska.
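
The Alaska note above matters when these counts are compared with a source that reports the four areas separately. The following is a minimal sketch, not part of the CDC dataset, of folding another source's counts into the same combined areas. The names `MERGED_INTO` and `fold_alaska_counts` and the sample numbers are hypothetical and chosen only for illustration.

```python
# Hypothetical helper: fold counts so they follow the convention described above.
MERGED_INTO = {
    "Yakutat City and Borough": "Hoonah-Angoon Census Area",
    "Bristol Bay Borough": "Lake and Peninsula Borough",
}


def fold_alaska_counts(counts):
    """Return a copy of counts with merged areas reported under their partner area.

    The merged areas themselves are set to 0, matching the dataset's default.
    """
    folded = dict(counts)
    for child, parent in MERGED_INTO.items():
        if child in folded:
            folded[parent] = folded.get(parent, 0) + folded[child]
            folded[child] = 0
    return folded


# Made-up example counts from some other county-level source.
other_source = {
    "Yakutat City and Borough": 3,
    "Hoonah-Angoon Census Area": 10,
    "Bristol Bay Borough": 2,
    "Lake and Peninsula Borough": 7,
}
folded = fold_alaska_counts(other_source)
print(folded)
```

Folding keeps the overall total unchanged, so state-level sums are unaffected; only the county-level attribution moves.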

    Historical cases and deaths are not tracked separately in the county level datasets, and differences in weekly new cases and deaths could exist when county-level data are aggregated to the state-level (i.e., when compared to this dataset: https://data.cdc.gov/Case-Surveillance/United-States-COVID-19-Cases-and-Deaths-by-State-o/9mfq-cb36).

  7. Death in the United States

    • kaggle.com
    zip
    Updated Aug 3, 2017
    Centers for Disease Control and Prevention (2017). Death in the United States [Dataset]. https://www.kaggle.com/datasets/cdc/mortality
    Available download formats: zip (766333584 bytes)
    Dataset updated
    Aug 3, 2017
    Dataset authored and provided by
    Centers for Disease Control and Prevention (http://www.cdc.gov/)
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    United States
    Description

    Every year the CDC releases the country’s most detailed report on death in the United States under the National Vital Statistics System. This mortality dataset is a record of every death in the country for 2005 through 2015, including detailed information about causes of death and the demographic background of the deceased.

    It's been said that "statistics are human beings with the tears wiped off." This is especially true with this dataset. Each death record represents somebody's loved one, often connected with a lifetime of memories and sometimes tragically too short.

    Putting the sensitive nature of the topic aside, analyzing mortality data is essential to understanding the complex circumstances of death across the country. The US Government uses this data to determine life expectancy and understand how death in the U.S. differs from the rest of the world. Whether you’re looking for macro trends or analyzing unique circumstances, we challenge you to use this dataset to find your own answers to one of life’s great mysteries.

    Overview

    This dataset is a collection of CSV files, each containing one year's worth of data, and paired JSON files containing the code mappings, plus an ICD-10 code set. The CSVs were reformatted from their original fixed-width file formats using information extracted from the CDC's PDF manuals using this script. Please note that this process may have introduced errors, as the text extracted from the PDF is not a perfect match. If you have any questions or find errors in the preparation process, please leave a note in the forums. We hope to publish additional years of data using this method soon.

    A more detailed overview of the data can be found here. You'll find that the fields are consistent within this time window, but some of the data codes change every few years. For example, the 113_cause_recode entry 069 only covers ICD codes (I10,I12) in 2005, but by 2015 it covers (I10,I12,I15). When I post data from years prior to 2005, expect some of the fields themselves to change as well.

    All data come from the CDC’s National Vital Statistics System, with the exception of Icd10Code, which is sourced from the World Health Organization.
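
As a rough illustration of how a yearly CSV and its paired JSON code mapping might be combined, the sketch below decodes one coded column. The JSON layout (a field name mapped to a code-to-label dictionary), the column names `sex` and `detail_age`, the sample rows, and the labels are all assumptions made for illustration; check the actual files before relying on this structure.

```python
import csv
import io
import json

# Assumed layout: {field_name: {code: label}}. The real JSON files may differ.
codes_json = '{"sex": {"M": "Male", "F": "Female"}}'
# Made-up rows standing in for one year's CSV (column names are hypothetical).
csv_text = "sex,detail_age\nM,67\nF,82\nM,45\n"


def decode_column(rows, mapping, field):
    """Replace coded values in `field` with labels; unknown codes are kept as-is."""
    labels = mapping.get(field, {})
    return [{**row, field: labels.get(row[field], row[field])} for row in rows]


mapping = json.loads(codes_json)
rows = list(csv.DictReader(io.StringIO(csv_text)))
decoded = decode_column(rows, mapping, "sex")
print(decoded[0])
```

Because some codes change meaning between years, as noted above, it is safer to decode each year with its own JSON file rather than reuse one mapping across years.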

    Project ideas

    • The CDC's mortality data was the basis of a widely publicized paper, by Anne Case and Nobel prize winner Angus Deaton, arguing that middle-aged whites are dying at elevated rates. One of the criticisms against the paper is that it failed to properly account for the exact ages within the broad bins available through the CDC's WONDER tool. What do these results look like with exact/not-binned age data?
    • Similarly, how sensitive are the mortality trends being discussed in the news to the choice of bin-widths?
    • As noted above, the data preparation process could have introduced errors. Can you find any discrepancies compared to the aggregate metrics on WONDER? If so, please let me know in the forums!
    • WONDER is cited in numerous economics, sociology, and public health research papers. Can you find any papers whose conclusions would be altered if they used the exact data available here rather than binned data from WONDER?
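
One way to start on the bin-width questions above is to count the same deaths under different bin widths and compare the results. This is a minimal sketch with made-up ages; the function name `bin_counts` is hypothetical, and a real analysis would use rates with population denominators rather than raw counts.

```python
from collections import Counter


def bin_counts(ages, width):
    """Count ages in bins [k*width, (k+1)*width), keyed by each bin's lower edge."""
    counts = Counter((age // width) * width for age in ages)
    return dict(sorted(counts.items()))


# Made-up ages at death, for illustration only.
ages = [44, 45, 46, 49, 50, 51, 54, 55, 59]

five_year = bin_counts(ages, 5)
ten_year = bin_counts(ages, 10)
print(five_year)
print(ten_year)
```

With exact ages available, the same records can be re-binned at any width, which is not possible with pre-binned output.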

    Differences from the first version of the dataset

    • This version of the dataset was prepared in a completely different manner. This has allowed us to provide a much larger volume of data and ensure that codes are available for every field.
    • We've replaced the batch of SQL files with a single JSON per year. Kaggle's platform currently offers better support for JSON files, and this keeps the number of files manageable.
    • A tutorial kernel providing a quick introduction to the new format is available here.
    • Lastly, I apologize if the transition has interrupted anyone's work! If need be, you can still download v1.

  8. CDC's NCHS, 2005 Hispanic population by county by single age, U.S., 2005

    • geocommons.com
    Updated May 6, 2008
    data (2008). CDC's NCHS, 2005 Hispanic population by county by single age, U.S., 2005 [Dataset]. http://geocommons.com/search.html
    Dataset updated
    May 6, 2008
    Dataset provided by
    data
    Postcensal bridged race data from National Center for Health Statistics of CDC
    Description

    Hispanic population at the county level by single year of age in 2000. The data cover all ages from 1 to 84, plus infants and those aged 85 and over. The original data published by NCHS (National Center for Health Statistics) of CDC include data by race and ethnicity. This particular dataset was extracted for counties in the lower 48 states for people of Hispanic descent.

  9. Data to reproduce: "The impact of population heterogeneity on the age...

    • zenodo.org
    bin
    Updated Apr 29, 2025
    Jonas Schöley; Jonas Schöley (2025). Data to reproduce: "The impact of population heterogeneity on the age trajectory of neonatal mortality" [Dataset]. http://doi.org/10.5281/zenodo.15304230
    Available download formats: bin
    Dataset updated
    Apr 29, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Jonas Schöley; Jonas Schöley
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Processed cohort linked birth / infant death data set by the US National Center for Health Statistics.

    Used as input data for "The impact of population heterogeneity on the age trajectory of neonatal mortality".

    Original data: ftp://ftp.cdc.gov/pub/Health_Statistics/NCHS/Datasets/DVS/cohortlinkedus/.

  10. National Hospital Ambulatory Medical Care Survey

    • datacatalog.med.nyu.edu
    Updated Mar 28, 2025
    United States - Centers for Disease Control and Prevention (CDC) (2025). National Hospital Ambulatory Medical Care Survey [Dataset]. https://datacatalog.med.nyu.edu/dataset/10520
    Dataset updated
    Mar 28, 2025
    Dataset provided by
    Centers for Disease Control and Prevention (http://www.cdc.gov/)
    Authors
    United States - Centers for Disease Control and Prevention (CDC)
    Time period covered
    Jan 1, 1992 - Present
    Area covered
    United States
    Description

    The National Hospital Ambulatory Medical Care Survey (NHAMCS) is a national survey that collects information about the provision and use of ambulatory medical care services in the United States. The survey samples visits to hospital outpatient departments (OPD), hospital emergency departments (ED), and hospital-based ambulatory surgery locations (ASL). The survey has been conducted annually since 1992; since 2018, it has collected data only on hospital emergency department visits.

    Approximately 500 nationally representative hospitals are selected to provide data on a sample of patient visits each year. Federal hospitals, hospital units within institutions, and hospitals with fewer than six staffed beds for patient use are excluded. Data collected include patient demographics, conditions treated, services provided, and payment methods. The data are weighted to produce national estimates.
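
To make the idea of weighting concrete, here is a minimal sketch of how a weighted sample is turned into a national estimate: each sampled visit counts for as many visits as its weight. The field names `age` and `weight`, the function `weighted_total`, and all numbers are hypothetical; the actual survey files define their own weight variable, and variance estimation for the complex sample design is not shown.

```python
def weighted_total(records, predicate):
    """Sum the weights of sampled visits that satisfy the predicate."""
    return sum(r["weight"] for r in records if predicate(r))


# Made-up sample: each sampled visit stands for `weight` visits nationally.
visits = [
    {"age": 34, "weight": 1200.0},
    {"age": 71, "weight": 950.0},
    {"age": 68, "weight": 1100.0},
]

total = weighted_total(visits, lambda r: True)
aged_65_plus = weighted_total(visits, lambda r: r["age"] >= 65)
share = aged_65_plus / total
print(total, aged_65_plus, round(share, 3))
```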

  11. Post-COVID Conditions

    • catalog.data.gov
    • data.virginia.gov
    Updated Apr 23, 2025
    Centers for Disease Control and Prevention (2025). Post-COVID Conditions [Dataset]. https://catalog.data.gov/dataset/post-covid-conditions-89bb3
    Dataset updated
    Apr 23, 2025
    Dataset provided by
    Centers for Disease Control and Prevention (http://www.cdc.gov/)
    Description

    As part of an ongoing partnership with the Census Bureau, the National Center for Health Statistics (NCHS) recently added questions to assess the prevalence of post-COVID-19 conditions (long COVID) on the experimental Household Pulse Survey. This 20-minute online survey was designed to complement the ability of the federal statistical system to rapidly respond and provide relevant information about the impact of the coronavirus pandemic in the U.S. Data collection began on April 23, 2020. Beginning in Phase 3.5 (on June 1, 2022), NCHS included questions about the presence of symptoms of COVID that lasted three months or longer. Phase 3.5 will continue with a two-weeks on, two-weeks off collection and dissemination approach.

    Estimates on this page are derived from the Household Pulse Survey and cover adults aged 18 and over. They show:

    a) as a proportion of the U.S. population, the percentage of adults who EVER experienced post-COVID conditions (long COVID). These adults had COVID and had some symptoms that lasted three months or longer;
    b) as a proportion of adults who said they ever had COVID, the percentage who EVER experienced post-COVID conditions;
    c) as a proportion of the U.S. population, the percentage of adults who are CURRENTLY experiencing post-COVID conditions. These adults had COVID, had long-term symptoms, and are still experiencing symptoms;
    d) as a proportion of adults who said they ever had COVID, the percentage who are CURRENTLY experiencing post-COVID conditions; and
    e) as a proportion of the U.S. population, the percentage of adults who said they ever had COVID.
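
The measures above are linked arithmetically: because everyone who experienced post-COVID conditions also had COVID, measure (a) should be roughly measure (b) multiplied by measure (e), and likewise (c) is roughly (d) multiplied by (e). The sketch below shows that conversion with made-up percentages; the function name is hypothetical, and published estimates may not match exactly because of rounding and item nonresponse.

```python
def share_of_population(pct_among_ever_covid, pct_ever_covid):
    """Convert a percentage among adults who ever had COVID into a percentage of all adults."""
    return pct_among_ever_covid * pct_ever_covid / 100.0


# Made-up inputs: 30% among those who had COVID, and 50% of adults ever had COVID.
pct_all_adults = share_of_population(30.0, 50.0)
print(pct_all_adults)
```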

  12. NHANES 2017-2018 Height Weight Data

    • figshare.com
    txt
    Updated Feb 25, 2023
    Liang Zhao (2023). NHANES 2017-2018 Height Weight Data [Dataset]. http://doi.org/10.6084/m9.figshare.22086662.v2
    Available download formats: txt
    Dataset updated
    Feb 25, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Liang Zhao
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is the age, height, and weight data extracted from the NHANES 2017-2018 survey dataset. The original data were BMX_J.xpt (see https://wwwn.cdc.gov/nchs/nhanes/search/datapage.aspx?Component=Examination&CycleBeginYear=2017) and DEMO_J.xpt (see https://wwwn.cdc.gov/nchs/nhanes/search/datapage.aspx?Component=Demographics&CycleBeginYear=2017).

    I used Linux Mint 20 to get the CSV files from the above XPT files. First, I installed the R foreign package with the following command:

    $ sudo apt install r-cran-foreign

    Then, I developed two R scripts to extract the CSV data. The scripts are attached to this dataset. For analysis of the CSV file, I used the following commands within the R environment.

    data h =20 & data$age w =20 & data$age wt ht model summary(model)
    Call: lm(formula = wt ~ ht)
    Residuals:
         Min       1Q   Median       3Q      Max
    -0.29406 -0.07182 -0.00558  0.06514  0.47048
    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)  1.46404    0.01423  102.90


Weinberger Daniel; Weinberger Daniel (2024). NCHS mortality data 2014-2022 [Dataset]. http://doi.org/10.5281/zenodo.12808102

NCHS mortality data 2014-2022

Available download formats: bin
Dataset updated
Jul 24, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Weinberger Daniel; Weinberger Daniel
License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

This is a database (parquet format) containing publicly available multiple cause mortality data from the US (CDC/NCHS) for 2014-2022. Not all variables are included in this export. Please see below for restrictions on the use of these data imposed by NCHS. You can use the arrow package in R to open the file. See here for an example analysis: https://github.com/DanWeinberger/pneumococcal_mortality/blob/main/analysis_nongeo.Rmd . For instance, save this file in a folder called "parquet3":

library(arrow)

library(dplyr)

pneumo.deaths.in <- open_dataset("R:/parquet3", format = "parquet") %>% # open the dataset without reading it into memory
  filter(grepl("J13|A39|J181|A403|B953|G001", all_icd)) %>% # keep records that have any of the selected ICD codes
  collect() # load the result into memory; do as many operations as you can before calling collect() to limit memory use

The variables included are listed below (see the full dictionary: https://www.cdc.gov/nchs/nvss/mortality_public_use_data.htm):

year: Calendar year of death

month: Calendar month of death

age_detail_number: number indicating years or part of a year; it cannot be interpreted on its own here. See the agey variable instead.

sex: M/F

place_of_death:

Place of Death and Decedent’s Status
1 ... Hospital, Clinic or Medical Center - Inpatient
2 ... Hospital, Clinic or Medical Center - Outpatient or admitted to Emergency Room
3 ... Hospital, Clinic or Medical Center - Dead on Arrival
4 ... Decedent’s home
5 ... Hospice facility
6 ... Nursing home/long term care
7 ... Other
9 ... Place of death unknown

all_icd: Cause of death coded as ICD10 codes. ICD1-ICD21 pasted into a single string, with separation of codes by an underscore
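
Because all_icd is a single underscore-separated string, one option is to split it into individual codes and match on code prefixes. This is a minimal sketch in Python rather than R; the function name `has_any_code` and the sample strings are hypothetical. The prefixes are the ones used in the R example above. Prefix matching on split codes is close to, but slightly stricter than, the substring match that `grepl` performs on the whole string.

```python
def has_any_code(all_icd, prefixes):
    """True if any underscore-separated code in all_icd starts with one of the prefixes."""
    codes = [code for code in all_icd.split("_") if code]
    return any(code.startswith(prefix) for code in codes for prefix in prefixes)


# Prefixes taken from the R example above.
PNEUMO_PREFIXES = ("J13", "A39", "J181", "A403", "B953", "G001")

# Made-up all_icd strings, for illustration only.
records = ["I251_J181_E119", "C349_J449", "A403"]
flags = [has_any_code(record, PNEUMO_PREFIXES) for record in records]
print(flags)
```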

hisp_recode: 0=Non-Hispanic; 1=Hispanic; 999= Not specified

race_recode: race coding prior to 2018 (reconciled in race_recode_new)

race_recode_alt: race coding after 2018 (reconciled in race_recode_new)

race_recode_new:

1 = 'White'

2 = 'Black'

3 = 'Hispanic'

4 = 'American Indian'

5 = 'Asian/Pacific Islanders'

agey: age in years (partial years for infants under 12 months)

https://www.cdc.gov/nchs/data_access/restrictions.htm

Please Read Carefully Before Using NCHS Public Use Survey Data

The National Center for Health Statistics (NCHS), Centers for Disease Control and Prevention (CDC), conducts statistical and epidemiological activities under the authority granted by the Public Health Service Act (42 U.S.C. § 242k). NCHS survey data are protected by Federal confidentiality laws including Section 308(d) Public Health Service Act [42 U.S.C. 242m(d)] and the Confidential Information Protection and Statistical Efficiency Act or CIPSEA [Pub. L. No. 115-435, 132 Stat. 5529 § 302]. These confidentiality laws state the data collected by NCHS may be used only for statistical reporting and analysis. Any effort to determine the identity of individuals and establishments violates the assurances of confidentiality provided by federal law.

Terms and Conditions

NCHS does all it can to assure that the identity of individuals and establishments cannot be disclosed. All direct identifiers, as well as any characteristics that might lead to identification, are omitted from the dataset. Any intentional identification or disclosure of an individual or establishment violates the assurances of confidentiality given to the providers of the information. Therefore, users will:

  1. Use the data in this dataset for statistical reporting and analysis only.
  2. Make no attempt to learn the identity of any person or establishment included in these data.
  3. Not link this dataset with individually identifiable data from other NCHS or non-NCHS datasets.
  4. Not engage in any efforts to assess disclosure methodologies applied to protect individuals and establishments or any research on methods of re-identification of individuals and establishments.

By using these data you signify your agreement to comply with the above-stated statutorily based requirements.

Sanctions for Violating NCHS Data Use Agreement

Willfully disclosing any information that could identify a person or establishment in any manner to a person or agency not entitled to receive it, shall be guilty of a class E felony and imprisoned for not more than 5 years, or fined not more than $250,000, or both.
