47 datasets found
  1. Statewide Death Profiles

    • data.chhs.ca.gov
    • data.ca.gov
    • +3 more
    csv, zip
    Updated Jul 28, 2025
    Cite
    California Department of Public Health (2025). Statewide Death Profiles [Dataset]. https://data.chhs.ca.gov/dataset/statewide-death-profiles
    Explore at:
    Available download formats: csv(4689434), csv(16301), csv(5034), csv(463460), csv(2026589), csv(5401561), csv(164006), csv(200270), csv(419332), zip, csv(385695)
    Dataset updated
    Jul 28, 2025
    Dataset authored and provided by
    California Department of Public Health (https://www.cdph.ca.gov/)
    Description

    This dataset contains counts of deaths for California as a whole based on information entered on death certificates. Final counts are derived from static data and include out-of-state deaths to California residents, whereas provisional counts are derived from incomplete and dynamic data. Provisional counts are based on the records available when the data was retrieved and may not represent all deaths that occurred during the time period. Deaths involving injuries from external or environmental forces, such as accidents, homicide and suicide, often require additional investigation that tends to delay certification of the cause and manner of death. This can result in significant under-reporting of these deaths in provisional data.

    The final data tables include both deaths that occurred in California regardless of the place of residence (by occurrence) and deaths to California residents (by residence), whereas the provisional data table only includes deaths that occurred in California regardless of the place of residence (by occurrence). The data are reported as totals, as well as stratified by age, gender, race-ethnicity, and death place type. Deaths due to all causes (ALL) and selected underlying cause of death categories are provided. See temporal coverage for more information on which combinations are available for which years.

    The cause of death categories are based solely on the underlying cause of death as coded by the International Classification of Diseases. The underlying cause of death is defined by the World Health Organization (WHO) as "the disease or injury which initiated the train of events leading directly to death, or the circumstances of the accident or violence which produced the fatal injury." It is a single value assigned to each death based on the details as entered on the death certificate. When more than one cause is listed, the order in which they are listed can affect which cause is coded as the underlying cause. This means that similar events could be coded with different underlying causes of death depending on variations in how they were entered. Consequently, while underlying cause of death provides a convenient comparison between cause of death categories, it may not capture the full impact of each cause of death as it does not always take into account all conditions contributing to the death.

  2. Death Profiles by Leading Causes of Death

    • data.chhs.ca.gov
    • data.ca.gov
    • +4 more
    web link, zip
    Updated Apr 22, 2025
    Cite
    California Department of Public Health (2025). Death Profiles by Leading Causes of Death [Dataset]. https://data.chhs.ca.gov/dataset/death-profiles-by-leading-causes-of-death
    Explore at:
    Available download formats: web link, zip
    Dataset updated
    Apr 22, 2025
    Dataset authored and provided by
    California Department of Public Health (https://www.cdph.ca.gov/)
    Description

    Data for deaths by leading cause of death categories are now available in the death profiles dataset for each geographic granularity.

    The cause of death categories are based solely on the underlying cause of death as coded by the International Classification of Diseases. The underlying cause of death is defined by the World Health Organization (WHO) as "the disease or injury which initiated the train of events leading directly to death, or the circumstances of the accident or violence which produced the fatal injury." It is a single value assigned to each death based on the details as entered on the death certificate. When more than one cause is listed, the order in which they are listed can affect which cause is coded as the underlying cause. This means that similar events could be coded with different underlying causes of death depending on variations in how they were entered. Consequently, while underlying cause of death provides a convenient comparison between cause of death categories, it may not capture the full impact of each cause of death as it does not always take into account all conditions contributing to the death.

    Cause of death categories for years 1999 and later are based on the tenth revision of the International Classification of Diseases (ICD-10) codes. Comparable categories are provided for years 1979 through 1998 based on ninth revision (ICD-9) codes. For more information on the comparability of cause of death classification between ICD revisions, see Comparability of Cause-of-death Between ICD Revisions.
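
    As a minimal sketch of the year ranges described above, the following Python helper maps a death year to the ICD revision its cause of death categories are based on. The function name is hypothetical and not part of the dataset; years before 1979 are rejected because the description does not cover them.

```python
def icd_revision_for_year(year: int) -> str:
    """Return the ICD revision the description associates with a death year.

    Hypothetical helper: the name and the error behaviour are assumptions,
    only the year ranges come from the dataset description.
    """
    if year >= 1999:
        return "ICD-10"  # 1999 and later: tenth revision
    if 1979 <= year <= 1998:
        return "ICD-9"   # 1979 through 1998: ninth revision
    raise ValueError(f"no cause of death categories described for year {year}")


if __name__ == "__main__":
    for y in (1979, 1998, 1999, 2025):
        print(y, icd_revision_for_year(y))
```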

  3. COVID-19 Cases and Deaths by Race/Ethnicity - ARCHIVE

    • catalog.data.gov
    • data.ct.gov
    • +1 more
    Updated Aug 12, 2023
    Cite
    data.ct.gov (2023). COVID-19 Cases and Deaths by Race/Ethnicity - ARCHIVE [Dataset]. https://catalog.data.gov/dataset/covid-19-cases-and-deaths-by-race-ethnicity
    Explore at:
    Dataset updated
    Aug 12, 2023
    Dataset provided by
    data.ct.gov
    Description

    Note: DPH is updating and streamlining the COVID-19 cases, deaths, and testing data. As of 6/27/2022, the data will be published in four tables instead of twelve.

    The COVID-19 Cases, Deaths, and Tests by Day dataset contains cases and test data by date of sample submission. The death data are by date of death. This dataset is updated daily and contains information back to the beginning of the pandemic. The data can be found at https://data.ct.gov/Health-and-Human-Services/COVID-19-Cases-Deaths-and-Tests-by-Day/g9vi-2ahj.

    The COVID-19 State Metrics dataset contains over 93 columns of data. This dataset is updated daily and currently contains information starting June 21, 2022 to the present. The data can be found at https://data.ct.gov/Health-and-Human-Services/COVID-19-State-Level-Data/qmgw-5kp6 .

    The COVID-19 County Metrics dataset contains 25 columns of data. This dataset is updated daily and currently contains information starting June 16, 2022 to the present. The data can be found at https://data.ct.gov/Health-and-Human-Services/COVID-19-County-Level-Data/ujiq-dy22 .

    The COVID-19 Town Metrics dataset contains 16 columns of data. This dataset is updated daily and currently contains information starting June 16, 2022 to the present. The data can be found at https://data.ct.gov/Health-and-Human-Services/COVID-19-Town-Level-Data/icxw-cada . To protect confidentiality, if a town has fewer than 5 cases or positive NAAT tests over the past 7 days, those data will be suppressed.

    COVID-19 cases and associated deaths that have been reported among Connecticut residents, broken down by race and ethnicity. All data in this report are preliminary; data for previous dates will be updated as new reports are received and data errors are corrected. Deaths reported to either the Office of the Chief Medical Examiner (OCME) or Department of Public Health (DPH) are included in the COVID-19 update.

    The following data show the number of COVID-19 cases and associated deaths per 100,000 population by race and ethnicity. Crude rates represent the total cases or deaths per 100,000 people. Age-adjusted rates consider the age of the person at diagnosis or death when estimating the rate and use a standardized population to provide a fair comparison between population groups with different age distributions. Age-adjustment is important in Connecticut as the median age among the non-Hispanic white population is 47 years, whereas it is 34 years among non-Hispanic blacks, and 29 years among Hispanics. Because most non-Hispanic white residents who died were over 75 years of age, the age-adjusted rates are lower than the unadjusted rates. In contrast, Hispanic residents who died tend to be younger than 75 years of age, which results in higher age-adjusted rates.

    The population data used to calculate rates is based on the CT DPH population statistics for 2019, which is available online here: https://portal.ct.gov/DPH/Health-Information-Systems--Reporting/Population/Population-Statistics. Prior to 5/10/2021, the population estimates from 2018 were used. Rates are standardized to the 2000 US Millions Standard population (data available here: https://seer.cancer.gov/stdpopulations/). Standardization was done using 19 age groups (0, 1-4, 5-9, 10-14, ..., 80-84, 85 years and older). More information about direct standardization for age adjustment is available here: https://www.cdc.gov/nchs/data/statnt/statnt06rv.pdf

    Categories are mutually exclusive. The category “multiracial” includes people who answered ‘yes’ to more than one race category. Counts may not add up to total case counts as data on race and ethnicity may be missing. Age-adjusted rates are calculated only for groups with more than 20 deaths. Abbreviation: NH=Non-Hispanic.

    Data on Connecticut deaths were obtained from the Connecticut Deaths Registry maintained by the DPH Office of Vital Records.
Cause of death was determined by a death certifier (e.g., physician, APRN, medical
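
    The crude and age-adjusted rates described above can be illustrated with a short Python sketch of direct standardization. This is an illustration under stated assumptions, not DPH's actual code: the function names are hypothetical, the example uses three made-up age groups instead of the 19 groups named in the description, and every number in it is invented.

```python
from typing import Optional, Sequence


def crude_rate(events: Sequence[float], population: Sequence[float],
               per: int = 100_000) -> float:
    """Total events per `per` people, ignoring age structure."""
    return sum(events) * per / sum(population)


def age_adjusted_rate(events: Sequence[float], population: Sequence[float],
                      standard_population: Sequence[float],
                      per: int = 100_000,
                      min_events: int = 20) -> Optional[float]:
    """Directly standardized rate: age-specific rates weighted by the
    standard population's age distribution.

    Returns None unless the group has more than `min_events` events,
    mirroring the "more than 20 deaths" rule in the description.
    """
    if not (len(events) == len(population) == len(standard_population)):
        raise ValueError("all inputs need one value per age group")
    if sum(events) <= min_events:
        return None
    total_standard = sum(standard_population)
    weighted = sum(
        (e / p) * (s / total_standard)
        for e, p, s in zip(events, population, standard_population)
    )
    return weighted * per


if __name__ == "__main__":
    # Invented toy data for three age groups (young, middle, old).
    deaths = [10, 20, 70]
    group_population = [50_000, 30_000, 20_000]
    standard = [40_000, 35_000, 25_000]
    print(crude_rate(deaths, group_population))
    print(age_adjusted_rate(deaths, group_population, standard))
```

    In this toy example the standard population has a larger share of older people than the group itself, so the age-adjusted rate comes out higher than the crude rate.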

  4. Rates of COVID-19 Cases or Deaths by Age Group and Vaccination Status

    • data.cdc.gov
    • data.virginia.gov
    • +1 more
    application/rdfxml +5
    Updated Feb 22, 2023
    Cite
    CDC COVID-19 Response, Epidemiology Task Force (2023). Rates of COVID-19 Cases or Deaths by Age Group and Vaccination Status [Dataset]. https://data.cdc.gov/Public-Health-Surveillance/Rates-of-COVID-19-Cases-or-Deaths-by-Age-Group-and/3rge-nu2a
    Explore at:
    Available download formats: tsv, application/rssxml, csv, application/rdfxml, xml, json
    Dataset updated
    Feb 22, 2023
    Dataset provided by
    Centers for Disease Control and Prevention (http://www.cdc.gov/)
    Authors
    CDC COVID-19 Response, Epidemiology Task Force
    Description

    Data for CDC’s COVID Data Tracker site on Rates of COVID-19 Cases and Deaths by Vaccination Status.

    Dataset and data visualization details: These data were posted on October 21, 2022, archived on November 18, 2022, and revised on February 22, 2023. These data reflect cases among persons with a positive specimen collection date through September 24, 2022, and deaths among persons with a positive specimen collection date through September 3, 2022.

    Vaccination status: A person vaccinated with a primary series had SARS-CoV-2 RNA or antigen detected on a respiratory specimen collected ≥14 days after verifiably completing the primary series of an FDA-authorized or approved COVID-19 vaccine. An unvaccinated person had SARS-CoV-2 RNA or antigen detected on a respiratory specimen and has not been verified to have received COVID-19 vaccine. Excluded were partially vaccinated people who received at least one FDA-authorized vaccine dose but did not complete a primary series ≥14 days before collection of a specimen where SARS-CoV-2 RNA or antigen was detected.

    Additional or booster dose: A person vaccinated with a primary series and an additional or booster dose had SARS-CoV-2 RNA or antigen detected on a respiratory specimen collected ≥14 days after receipt of an additional or booster dose of any COVID-19 vaccine on or after August 13, 2021. For people ages 18 years and older, data are graphed starting the week including September 24, 2021, when a COVID-19 booster dose was first recommended by CDC for adults 65+ years old and people in certain populations and high risk occupational and institutional settings. For people ages 12-17 years, data are graphed starting the week of December 26, 2021, 2 weeks after the first recommendation for a booster dose for adolescents ages 16-17 years. For people ages 5-11 years, data are included starting the week of June 5, 2022, 2 weeks after the first recommendation for a booster dose for children aged 5-11 years. For people ages 50 years and older, data on second booster doses are graphed starting the week including March 29, 2022, when the recommendation was made for second boosters. Vertical lines represent dates when changes occurred in U.S. policy for COVID-19 vaccination (details provided above). Reporting is by primary series vaccine type rather than additional or booster dose vaccine type.
    The booster dose vaccine type may be different than the primary series vaccine type. Because data on the immune status of cases and associated deaths are unavailable, an additional dose in an immunocompromised person cannot be distinguished from a booster dose. This is a relevant consideration because vaccines can be less effective in this group.

    Deaths: A COVID-19–associated death occurred in a person with a documented COVID-19 diagnosis who died; health department staff reviewed to make a determination using vital records, public health investigation, or other data sources. Rates of COVID-19 deaths by vaccination status are reported based on when the patient was tested for COVID-19, not the date they died. Deaths usually occur up to 30 days after COVID-19 diagnosis.

    Participating jurisdictions: Currently, these 31 health departments that regularly link their case surveillance to immunization information system data are included in these incidence rate estimates: Alabama, Arizona, Arkansas, California, Colorado, Connecticut, District of Columbia, Florida, Georgia, Idaho, Indiana, Kansas, Kentucky, Louisiana, Massachusetts, Michigan, Minnesota, Nebraska, New Jersey, New Mexico, New York, New York City (New York), North Carolina, Philadelphia (Pennsylvania), Rhode Island, South Dakota, Tennessee, Texas, Utah, Washington, and West Virginia; 30 jurisdictions also report deaths among vaccinated and unvaccinated people. These jurisdictions represent 72% of the total U.S. population and all ten of the Health and Human Services Regions. Data on cases among people who received additional or booster doses were reported from 31 jurisdictions; 30 jurisdictions also reported data on deaths among people who received one or more additional or booster dose; 28 jurisdictions reported cases among people who received two or more additional or booster doses; and 26 jurisdictions reported deaths among people who received two or more additional or booster doses.
    This list will be updated as more jurisdictions participate.

    Incidence rate estimates: Weekly age-specific incidence rates by vaccination status were calculated as the number of cases or deaths divided by the number of people vaccinated with a primary series, overall or with/without a booster dose (cumulative) or unvaccinated (obtained by subtracting the cumulative number of people vaccinated with a primary series and partially vaccinated people from the 2019 U.S. intercensal population estimates) and multiplied by 100,000. Overall incidence rates were age-standardized using the 2000 U.S. Census standard population. To estimate population counts for ages 6 months through 1 year, half of the single-year population counts for ages 0 through 1 year were used. All rates are plotted by positive specimen collection date to reflect when incident infections occurred. For the primary series analysis, age-standardized rates include ages 12 years and older from April 4, 2021 through December 4, 2021, ages 5 years and older from December 5, 2021 through July 30, 2022 and ages 6 months and older from July 31, 2022 onwards. For the booster dose analysis, age-standardized rates include ages 18 years and older from September 19, 2021 through December 25, 2021, ages 12 years and older from December 26, 2021, and ages 5 years and older from June 5, 2022 onwards. Small numbers could contribute to less precision when calculating death rates among some groups.

    Continuity correction: A continuity correction has been applied to the denominators by capping the percent population coverage at 95%. To do this, we assumed that at least 5% of each age group would always be unvaccinated in each jurisdiction. Adding this correction ensures that there is always a reasonable denominator for the unvaccinated population that would prevent incidence and death rates from growing unrealistically large due to potential overestimates of vaccination coverage.
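
    The rate calculation and the continuity correction described above can be sketched in Python as follows. This is a rough illustration, not CDC's code: the function and parameter names are hypothetical, the example counts are invented, and applying the 5% floor to the population left after subtracting both fully and partially vaccinated people is an assumption about how the cap is implemented.

```python
def unvaccinated_denominator(population: float,
                             primary_series: float,
                             partially_vaccinated: float,
                             min_unvaccinated_share: float = 0.05) -> float:
    """Estimate the unvaccinated population for one age group.

    Subtracts vaccinated and partially vaccinated people from the
    population estimate, then applies the floor: at least 5% of the age
    group is assumed to be unvaccinated (coverage capped at 95%).
    """
    raw = population - primary_series - partially_vaccinated
    floor = min_unvaccinated_share * population
    return max(raw, floor)


def incidence_rate(events: float, denominator: float,
                   per: int = 100_000) -> float:
    """Cases or deaths per `per` people in the given group."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return events * per / denominator


if __name__ == "__main__":
    # Invented counts for a single age group in a single week.
    # Reported coverage here exceeds 95%, so the 5% floor applies.
    denom = unvaccinated_denominator(population=1_000,
                                     primary_series=980,
                                     partially_vaccinated=10)
    print(denom)
    print(incidence_rate(events=3, denominator=denom))
```

    Without the floor, the toy denominator would be 10 people and the resulting rate five times larger, which is the kind of inflation the correction is meant to prevent.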
    Incidence rate ratios (IRRs): IRRs for the past one month were calculated by dividing the average weekly incidence rates among unvaccinated people by that among people vaccinated with a primary series either overall or with a booster dose.

    Publications:

    • Scobie HM, Johnson AG, Suthar AB, et al. Monitoring Incidence of COVID-19 Cases, Hospitalizations, and Deaths, by Vaccination Status — 13 U.S. Jurisdictions, April 4–July 17, 2021. MMWR Morb Mortal Wkly Rep 2021;70:1284–1290.
    • Johnson AG, Amin AB, Ali AR, et al. COVID-19 Incidence and Death Rates Among Unvaccinated and Fully Vaccinated Adults with and Without Booster Doses During Periods of Delta and Omicron Variant Emergence — 25 U.S. Jurisdictions, April 4–December 25, 2021. MMWR Morb Mortal Wkly Rep 2022;71:132–138.
    • Johnson AG, Linde L, Ali AR, et al. COVID-19 Incidence and Mortality Among Unvaccinated and Vaccinated Persons Aged ≥12 Years by Receipt of Bivalent Booster Doses and Time Since Vaccination — 24 U.S. Jurisdictions, October 3, 2021–December 24, 2022. MMWR Morb Mortal Wkly Rep 2023;72:145–152.
    • Johnson AG, Linde L, Payne AB, et al. Notes from the Field: Comparison of COVID-19 Mortality Rates Among Adults Aged ≥65 Years Who Were Unvaccinated and Those Who Received a Bivalent Booster Dose Within the Preceding 6 Months — 20 U.S. Jurisdictions, September 18, 2022–April 1, 2023. MMWR Morb Mortal Wkly Rep 2023;72:667–669.
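
    The incidence rate ratio defined in this description (average weekly rate among unvaccinated people divided by the average weekly rate among vaccinated people) can be sketched in Python as below. The function name is hypothetical and the weekly rates are invented; the description does not say how many weeks make up "the past one month", so the caller is assumed to pass whatever window applies.

```python
from statistics import fmean
from typing import Sequence


def incidence_rate_ratio(unvaccinated_weekly_rates: Sequence[float],
                         vaccinated_weekly_rates: Sequence[float]) -> float:
    """Ratio of average weekly incidence rates, unvaccinated over vaccinated."""
    if not unvaccinated_weekly_rates or not vaccinated_weekly_rates:
        raise ValueError("both groups need at least one weekly rate")
    vaccinated_average = fmean(vaccinated_weekly_rates)
    if vaccinated_average == 0:
        raise ValueError("average vaccinated rate is zero; ratio undefined")
    return fmean(unvaccinated_weekly_rates) / vaccinated_average


if __name__ == "__main__":
    # Invented weekly rates per 100,000 over a four-week window.
    unvaccinated = [400, 380, 420, 400]
    vaccinated = [100, 90, 110, 100]
    print(incidence_rate_ratio(unvaccinated, vaccinated))
```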

  5. COVID-19 Time-Series Metrics by County and State (ARCHIVED)

    • data.chhs.ca.gov
    • data.ca.gov
    • +2 more
    csv, xlsx, zip
    Updated Aug 28, 2024
    Cite
    California Department of Public Health (2024). COVID-19 Time-Series Metrics by County and State (ARCHIVED) [Dataset]. https://data.chhs.ca.gov/dataset/covid-19-time-series-metrics-by-county-and-state
    Explore at:
    Available download formats: csv(6223281), csv(7729431), csv(3313), xlsx(6471), xlsx(11305), csv(4836928), xlsx(7811), zip
    Dataset updated
    Aug 28, 2024
    Dataset authored and provided by
    California Department of Public Health (https://www.cdph.ca.gov/)
    Description

    Note: This COVID-19 data set is no longer being updated as of December 1, 2023. Access current COVID-19 data on the CDPH respiratory virus dashboard (https://www.cdph.ca.gov/Programs/CID/DCDC/Pages/Respiratory-Viruses/RespiratoryDashboard.aspx) or in open data format (https://data.chhs.ca.gov/dataset/respiratory-virus-dashboard-metrics).

    As of August 17, 2023, data is being updated each Friday.

    For death data after December 31, 2022, California uses Provisional Deaths from the Centers for Disease Control and Prevention’s National Center for Health Statistics (NCHS) National Vital Statistics System (NVSS). Prior to January 1, 2023, death data was sourced from the COVID-19 registry. The change in data source occurred in July 2023 and was applied retroactively to all 2023 data to provide a consistent source of death data for the year 2023.

    As of May 11, 2023, data on cases, deaths, and testing is being updated each Thursday. Metrics by report date have been removed, but previous versions of files with report date metrics are archived below.

    All metrics include people in state and federal prisons, US Immigration and Customs Enforcement facilities, US Marshal detention facilities, and Department of State Hospitals facilities. Members of California's tribal communities are also included.

    The "Total Tests" and "Positive Tests" columns show totals based on the collection date. There is a lag between when a specimen is collected and when it is reported in this dataset. As a result, the most recent dates on the table will temporarily show NONE in the "Total Tests" and "Positive Tests" columns. This should not be interpreted as no tests being conducted on these dates. Instead, these values will be updated with the number of tests conducted as data is received.
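
    One way to honor the reporting lag described above when reading the files is to treat NONE as a missing value rather than as zero. The Python sketch below assumes the placeholder appears as the literal text NONE (or an empty cell) in the "Total Tests" and "Positive Tests" columns; the function name is hypothetical and the exact file encoding of the placeholder is not confirmed by the description.

```python
from typing import Optional


def parse_test_count(cell: str) -> Optional[int]:
    """Parse one "Total Tests" or "Positive Tests" cell.

    Returns None for the NONE placeholder or an empty cell, meaning
    "not yet reported", which is distinct from a reported count of 0.
    """
    text = cell.strip()
    if text == "" or text.upper() == "NONE":
        return None
    return int(float(text))  # tolerates values written as "1250.0"


if __name__ == "__main__":
    # Invented cell values for the most recent dates in a table.
    for raw in ["1250", "0", "NONE", ""]:
        print(repr(raw), "->", parse_test_count(raw))
```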

  6. COVID-19 Tests, Cases, Hospitalizations, and Deaths (Statewide) - ARCHIVE

    • data.ct.gov
    • catalog.data.gov
    application/rdfxml +5
    Updated Jun 24, 2022
    Cite
    Department of Public Health (2022). COVID-19 Tests, Cases, Hospitalizations, and Deaths (Statewide) - ARCHIVE [Dataset]. https://data.ct.gov/Health-and-Human-Services/COVID-19-Tests-Cases-Hospitalizations-and-Deaths-S/rf3k-f8fg
    Explore at:
    Available download formats: tsv, application/rdfxml, xml, json, csv, application/rssxml
    Dataset updated
    Jun 24, 2022
    Dataset authored and provided by
    Department of Public Health
    License

    U.S. Government Works (https://www.usa.gov/government-works)
    License information was derived automatically

    Description

    Note: DPH is updating and streamlining the COVID-19 cases, deaths, and testing data. As of 6/27/2022, the data will be published in four tables instead of twelve.

    The COVID-19 Cases, Deaths, and Tests by Day dataset contains cases and test data by date of sample submission. The death data are by date of death. This dataset is updated daily and contains information back to the beginning of the pandemic. The data can be found at https://data.ct.gov/Health-and-Human-Services/COVID-19-Cases-Deaths-and-Tests-by-Day/g9vi-2ahj.

    The COVID-19 State Metrics dataset contains over 93 columns of data. This dataset is updated daily and currently contains information starting June 21, 2022 to the present. The data can be found at https://data.ct.gov/Health-and-Human-Services/COVID-19-State-Level-Data/qmgw-5kp6 .

    The COVID-19 County Metrics dataset contains 25 columns of data. This dataset is updated daily and currently contains information starting June 16, 2022 to the present. The data can be found at https://data.ct.gov/Health-and-Human-Services/COVID-19-County-Level-Data/ujiq-dy22 .

    The COVID-19 Town Metrics dataset contains 16 columns of data. This dataset is updated daily and currently contains information starting June 16, 2022 to the present. The data can be found at https://data.ct.gov/Health-and-Human-Services/COVID-19-Town-Level-Data/icxw-cada . To protect confidentiality, if a town has fewer than 5 cases or positive NAAT tests over the past 7 days, those data will be suppressed.
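
    The confidentiality rule above can be sketched as a small Python helper that blanks out any 7-day count below 5. The function name is hypothetical, and suppressing a count of zero along with counts of 1 through 4 is an assumption: the description says only "fewer than 5" and does not state how zeros are published.

```python
from typing import Optional


def suppress_small_count(seven_day_count: int,
                         threshold: int = 5) -> Optional[int]:
    """Return the count, or None when it falls below the threshold."""
    if seven_day_count < 0:
        raise ValueError("counts cannot be negative")
    return None if seven_day_count < threshold else seven_day_count


if __name__ == "__main__":
    # Invented 7-day case counts for a handful of towns.
    for count in [0, 4, 5, 37]:
        print(count, "->", suppress_small_count(count))
```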

    COVID-19 tests, cases, and associated deaths that have been reported among Connecticut residents. All data in this report are preliminary; data for previous dates will be updated as new reports are received and data errors are corrected. Hospitalization data were collected by the Connecticut Hospital Association and reflect the number of patients currently hospitalized with laboratory-confirmed COVID-19. Deaths reported to either the Office of the Chief Medical Examiner (OCME) or Department of Public Health (DPH) are included in the daily COVID-19 update.

    Data on Connecticut deaths were obtained from the Connecticut Deaths Registry maintained by the DPH Office of Vital Records. Cause of death was determined by a death certifier (e.g., physician, APRN, medical examiner) using their best clinical judgment. Additionally, all COVID-19 deaths, including suspected or related, are required to be reported to OCME. On April 4, 2020, CT DPH and OCME released a joint memo to providers and facilities within Connecticut providing guidelines for certifying deaths due to COVID-19 that were consistent with the CDC’s guidelines and a reminder of the required reporting to OCME. As of July 1, 2021, OCME had reviewed every case reported and performed additional investigation on about one-third of reported deaths to better ascertain if COVID-19 did or did not cause or contribute to the death. Some of these investigations resulted in the OCME performing postmortem swabs for PCR testing on individuals whose deaths were suspected to be due to COVID-19, but for whom an antemortem diagnosis could not be made. The OCME issued or re-issued about 10% of COVID-19 death certificates and, when appropriate, removed COVID-19 from the death certificate. For standardization and tabulation of mortality statistics, written cause of death statements made by the certifiers on death certificates are sent to the National Center for Health Statistics (NCHS) at the CDC, which assigns cause of death codes according to the International Classification of Diseases, 10th Revision (ICD-10) classification system. COVID-19 deaths in this report are defined as those for which the death certificate has an ICD-10 code of U07.1 as either a primary (underlying) or a contributing cause of death. More information on COVID-19 mortality can be found at the following link: https://portal.ct.gov/DPH/Health-Information-Systems--Reporting/Mortality/Mortality-Statistics
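
    The case definition above (ICD-10 code U07.1 as either the underlying or a contributing cause) can be written as a short Python predicate. This is only a sketch: the function name and the record layout (one underlying code plus a list of contributing codes) are assumptions, and the non-COVID codes in the example are illustrative inputs rather than values from the dataset.

```python
from typing import Iterable

COVID_ICD10_CODE = "U07.1"


def is_covid_death(underlying_cause: str,
                   contributing_causes: Iterable[str] = ()) -> bool:
    """True when U07.1 is the underlying cause or any contributing cause."""
    def normalize(code: str) -> str:
        return code.strip().upper()

    if normalize(underlying_cause) == COVID_ICD10_CODE:
        return True
    return any(normalize(code) == COVID_ICD10_CODE
               for code in contributing_causes)


if __name__ == "__main__":
    print(is_covid_death("U07.1"))             # COVID-19 as underlying cause
    print(is_covid_death("J18.9", ["u07.1"]))  # COVID-19 as contributing cause
    print(is_covid_death("I21.9", ["E11.9"]))  # COVID-19 not on the certificate
```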

    Data are reported daily, with timestamps indicated in the daily briefings posted at: portal.ct.gov/coronavirus. Data are subject to future revision as reporting changes.

    Starting in July 2020, this dataset will be updated every weekday.

    Additional notes: As of 11/5/2020, CT DPH has added antigen testing for SARS-CoV-2 to reported test counts in this dataset. The tests included in this dataset include both molecular and antigen tests. Molecular tests reported include polymerase chain reaction (PCR) and nucleic acid amplification (NAAT) tests.

    A delay in the data pull schedule occurred on 06/23/2020. Data from 06/22/2020 was processed on 06/23/2020 at 3:30 PM. The normal data cycle resumed with the data for 06/23/2020.

    A network outage on 05/19/2020 resulted in a change in the data pull schedule. Data from 5/19/2020 was processed on 05/20/2020 at 12:00 PM. Data from 5/20/2020 was processed on 5/20/2020 at 8:30 PM. The normal data cycle resumed on 05/20/2020 with the 8:30 PM data pull. As a result of the network outage, the timestamp on the datasets on the Open Data Portal differs from the timestamp in DPH's daily PDF reports.

    Starting 5/10/2021, the date field will represent the date this data was updated on data.ct.gov. Previously the date the data was pulled by DPH was listed, which typically coincided with the date before the data was published on data.ct.gov. This change was made to standardize the COVID-19 data sets on data.ct.gov.

    Starting April 4, 2022, negative rapid antigen and rapid PCR test results for SARS-CoV-2 are no longer required to be reported to the Connecticut Department of Public Health. Negative results from laboratory-based molecular (PCR/NAAT) tests are still required to be reported, as are all positive test results from both molecular (PCR/NAAT) and antigen tests.

    On 5/16/2022, 8,622 historical cases were included in the data. The date range for these cases was August 2021 – April 2022.

  7. Death Profiles by ZIP Code

    • data.chhs.ca.gov
    • data.ca.gov
    • +2 more
    csv, zip
    Updated Apr 22, 2025
    Cite
    California Department of Public Health (2025). Death Profiles by ZIP Code [Dataset]. https://data.chhs.ca.gov/dataset/death-profiles-by-zip-code
    Explore at:
    Available download formats: csv(78958555), csv(80054609), csv(80055974), csv(40627562), csv(4571), zip
    Dataset updated
    Apr 22, 2025
    Dataset authored and provided by
    California Department of Public Health
    Description

    This dataset contains counts of deaths for California residents by ZIP Code based on information entered on death certificates. Final counts are derived from static data and include out-of-state deaths of California residents. The data tables include deaths of residents of California by ZIP Code of residence (by residence). The data are reported as totals, as well as stratified by age and gender. Deaths due to all causes (ALL) and selected underlying cause of death categories are provided. See temporal coverage for more information on which combinations are available for which years.

    The cause of death categories are based solely on the underlying cause of death as coded by the International Classification of Diseases. The underlying cause of death is defined by the World Health Organization (WHO) as "the disease or injury which initiated the train of events leading directly to death, or the circumstances of the accident or violence which produced the fatal injury." It is a single value assigned to each death based on the details as entered on the death certificate. When more than one cause is listed, the order in which they are listed can affect which cause is coded as the underlying cause. This means that similar events could be coded with different underlying causes of death depending on variations in how they were entered. Consequently, while underlying cause of death provides a convenient comparison between cause of death categories, it may not capture the full impact of each cause of death as it does not always take into account all conditions contributing to the death.

  8. Dataset: Faces extracted from Time Magazine 1923-2014

    • dataverse.harvard.edu
    • marketplace.sshopencloud.eu
    Updated Mar 18, 2020
    Cite
    Ana Jofre (2020). Dataset: Faces extracted from Time Magazine 1923-2014 [Dataset]. http://doi.org/10.7910/DVN/JMFQT7
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Mar 18, 2020
    Dataset provided by
    Harvard Dataverse
    Authors
    Ana Jofre
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    The data presented here consists of three parts.

    Dataset 1: We extracted 327,322 faces from our entire collection of 3,389 issues and automatically classified each face as male or female. We present this data as a single table with columns identifying the date, issue, page number, the coordinates of the face on the page, and the classification (male or female). The coordinates are based on the size and resolution of the pages found in the “Time Vault”.

    Dataset 2: Dataset 2 consists of 8,789 classified faces from 100 selected issues. Human labor was used to identify and extract 3,299 face images from 39 issues, which were later classified by another set of workers. This selection of 39 issues contains one issue per decade spanned by the archive plus one issue per year between 1961 and 1991, and the extracted face images were used to train the face extraction algorithm. The remaining 5,490 faces from 61 issues were extracted via machine learning before being classified by human coders. These 61 issues were chosen to complement the first selection of 39 issues: one issue per year for all years in the archive excluding those between 1961 and 1991. Thus, Dataset 2 contains fully labelled faces from at least one issue per year.

    Dataset 3: In the interest of transparency, Dataset 3 consists of the raw data collected to create Dataset 2, presented as two tables. Before explaining these tables, we briefly describe our data collection and verification procedures, which have been fully described elsewhere. A custom AMT interface was used to enable human workers to classify faces according to the categories in Table 4. Each worker was given a randomly selected batch of 25 pages, each with a clearly highlighted face to be categorized; three of these were verification pages with known features, used for quality control.

    Each face was labeled by two distinct human coders, determined at random so that the pairing of coders varied with the image. A proficiency rating was calculated for each coder by considering all images they annotated and computing the average number of labels that matched those identified by the image’s other coder. The tables in Dataset 2 were created by resolving inconsistencies between the two image coders by selecting the labels from the coder with the higher proficiency rating. Prior to calculating the proficiency score, all faces tagged as having ‘Poor’ or ‘Error’ image quality by either of the two coders were eliminated. Due to technical bugs when the AMT interface was first implemented, a small number of images were only labeled once; these were also eliminated from Datasets 2 and 3. In Dataset 3, we present the raw annotations for each coder that tagged each face, along with demographic data for each coder. Dataset 3 consists of two tables: the raw data from each of the two sets of coders, and the demographic information for each of the coders.
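
    The proficiency-based resolution described above can be sketched as follows. This is a minimal illustration, not the authors' code: the tuple layout, the label categories ("gender", "age"), and the choice to score agreement as the fraction of matching labels per image are all assumptions made for the example.

```python
from collections import defaultdict

def group_by_image(annotations):
    """Group (image_id, coder_id, labels) tuples by image."""
    by_image = defaultdict(list)
    for image_id, coder_id, labels in annotations:
        by_image[image_id].append((coder_id, labels))
    return by_image

def proficiency(annotations):
    """Mean agreement of each coder with the other coder of the same image.

    Agreement is taken here as the fraction of shared label categories
    on which the two coders match (an assumption for this sketch).
    """
    scores = defaultdict(list)
    for pair in group_by_image(annotations).values():
        if len(pair) != 2:
            continue  # images labeled only once are dropped
        (c1, l1), (c2, l2) = pair
        keys = l1.keys() & l2.keys()
        if not keys:
            continue
        agreement = sum(l1[k] == l2[k] for k in keys) / len(keys)
        scores[c1].append(agreement)
        scores[c2].append(agreement)
    return {coder: sum(v) / len(v) for coder, v in scores.items()}

def resolve(annotations, ratings):
    """Keep, for each image, the labels of the coder with the higher rating."""
    resolved = {}
    for image_id, pair in group_by_image(annotations).items():
        if len(pair) != 2:
            continue
        # On a tie, max() keeps the first coder listed (arbitrary choice).
        best = max(pair, key=lambda p: ratings.get(p[0], 0.0))
        resolved[image_id] = best[1]
    return resolved

# Hypothetical annotations: three images, three coders.
annotations = [
    ("img1", "A", {"gender": "female", "age": "adult"}),
    ("img1", "B", {"gender": "female", "age": "child"}),
    ("img2", "A", {"gender": "male", "age": "adult"}),
    ("img2", "C", {"gender": "male", "age": "adult"}),
    ("img3", "B", {"gender": "male", "age": "adult"}),
    ("img3", "C", {"gender": "female", "age": "adult"}),
]
ratings = proficiency(annotations)
resolved = resolve(annotations, ratings)
```

    In this toy input, coder B disagrees with a partner on both of its images, so B's labels lose whenever they conflict with those of A or C.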

  9. A geometric shape regularity effect in the human brain: fMRI dataset

    • openneuro.org
    Updated Mar 14, 2025
    + more versions
    Cite
    Mathias Sablé-Meyer; Lucas Benjamin; Cassandra Potier Watkins; Chenxi He; Maxence Pajot; Théo Morfoisse; Fosca Al Roumi; Stanislas Dehaene (2025). A geometric shape regularity effect in the human brain: fMRI dataset [Dataset]. http://doi.org/10.18112/openneuro.ds006010.v1.0.1
    Explore at:
    Dataset updated
    Mar 14, 2025
    Dataset provided by
    OpenNeuro (https://openneuro.org/)
    Authors
    Mathias Sablé-Meyer; Lucas Benjamin; Cassandra Potier Watkins; Chenxi He; Maxence Pajot; Théo Morfoisse; Fosca Al Roumi; Stanislas Dehaene
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    A geometric shape regularity effect in the human brain: fMRI dataset

    Authors:

    • Mathias Sablé-Meyer*
    • Lucas Benjamin
    • Cassandra Potier Watkins
    • Chenxi He
    • Maxence Pajot
    • Théo Morfoisse
    • Fosca Al Roumi
    • Stanislas Dehaene

    *Corresponding author: mathias.sable-meyer@ucl.ac.uk

    Abstract

    The perception and production of regular geometric shapes is a characteristic trait of human cultures since prehistory, whose neural mechanisms are unknown. Behavioral studies suggest that humans are attuned to discrete regularities such as symmetries and parallelism, and rely on their combinations to encode regular geometric shapes in a compressed form. To identify the relevant brain systems and their dynamics, we collected functional MRI and magnetoencephalography data in both adults and six-year-olds during the perception of simple shapes such as hexagons, triangles and quadrilaterals. The results revealed that geometric shapes, relative to other visual categories, induce a hypoactivation of ventral visual areas and an overactivation of the intraparietal and inferior temporal regions also involved in mathematical processing, whose activation is modulated by geometric regularity. While convolutional neural networks captured the early visual activity evoked by geometric shapes, they failed to account for subsequent dorsal parietal and prefrontal signals, which could only be captured by discrete geometric features or by more advanced transformer models of vision. We propose that the perception of abstract geometric regularities engages an additional symbolic mode of visual perception.

    Notes about this dataset

    We separately share the MEG dataset at https://openneuro.org/datasets/ds006012. Below are some notes about the fMRI dataset of N=20 adult participants (sub-2xx, numbers between 204 and 223), and N=22 children (sub-3xx, numbers between 301 and 325).

    • The code for the analyses is provided at https://github.com/mathias-sm/AGeometricShapeRegularityEffectHumanBrain
      However, the analyses work from already preprocessed data. Since there is no custom code per se for the preprocessing, I have not included it in the repository. To preprocess the data as was done in the published article, here is the command and software information:
      • fMRIPrep version: 20.0.5
      • fMRIPrep command: /usr/local/miniconda/bin/fmriprep /data /out participant --participant-label <label> --output-spaces MNI152NLin6Asym:res-2 MNI152NLin2009cAsym:res-2
    • Defacing was performed with bidsonym, running the pydeface masking and the nobrainer brain registration pipeline.
      The published analyses have been performed on the non-defaced data. I have checked for data quality on all participants after defacing. In specific cases, I may be able to request the permission to share the original, non-defaced dataset.
    • sub-325 was acquired by a different experimenter and defaced before being shared with the rest of the research team, hence the slightly different defacing mask. That participant was also preprocessed separately, using a more recent fMRIPrep version: 20.2.6.
    • The data associated with the children has a few missing files. Notably:
      1. sub-313 and sub-316 are missing one run of the localizer each
      2. sub-316 has no data at all for the geometry
      3. sub-308 has no usable data for the intruder task

      Since all of these participants still have some data to contribute to either task, all available files were kept in this dataset. The analysis code handles these inconsistencies with specific exceptions where required.

  10. MyAuto.ge-CarDetails

    • kaggle.com
    Updated Apr 27, 2020
    Cite
    Ilia Gogotchuri (2020). MyAuto.ge-CarDetails [Dataset]. https://www.kaggle.com/datasets/gogotchuri/myautogecardetails
    Explore at:
    Dataset updated
    Apr 27, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Ilia Gogotchuri
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) (https://creativecommons.org/licenses/by-nc-sa/4.0/)
    License information was derived automatically

    Description

    MyAutoData

    Actual dataset location on Kaggle. Contains data scraped by MyAutoScrapper (written in Go).

    Purpose

    Since this Kaggle dataset contains real car deals, placed by real people with pictures, it can be used for real-world machine learning (ML) or machine vision tasks such as price prediction and image processing.

    Data structure

    This dataset contains a data.csv file with details of 100,000 car deals, one deal per row. data.csv has 18 columns:

    • ID: Unique integer identifier for each entry, starting from 0. For each ID there is a corresponding sub-folder in images that contains the images for the given deal.
    • Manufacturer: A string identifying the car manufacturer.
    • Model: A string identifying the car model.
    • Year: An integer for the car production year.
    • Category: The type of the vehicle (Sedan, Cabriolet, etc.).
    • Mileage: An integer representing car mileage in kilometers.
    • FuelType: The fuel type the car uses.
    • EngineVolume: A floating-point number representing engine volume in litres.
    • DriveWheels: A string representing the car's drive wheels (e.g. Front, Rear, 4x4).
    • GearBox: A string identifying the gearbox of the transmission (Manual, Automatic, etc.).
    • Doors: A string representing the number of car doors (4, 4/5, etc.).
    • Wheel: Steering wheel position (Left Wheel, Right Wheel).
    • Color: Color of the car body.
    • InteriorColor: Interior color.
    • VIN: VIN of the vehicle, represented as a string.
    • LeatherInterior: A boolean value, true if the car has a leather interior.
    • Price: Price of the car in USD. If omitted, the price was set as negotiable.
    • Clearance: A boolean value indicating whether customs has been cleared or not.

    Important Note! (Disclaimer)

    None of the fields (except ID) are guaranteed to be filled, or filled with correct information, since people sometimes enter incorrect information or withhold some of it. For most entries, however, most fields are expected to be filled with correct information.
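
    Given the column list and the disclaimer above, a reader would likely want to parse rows defensively. The sketch below uses only the Python standard library; the sample rows, the subset of columns shown, and the "true"/"false" spelling of booleans are hypothetical, since the exact file encoding is not described here.

```python
import csv
import io

# Hypothetical sample using a subset of the documented columns.
SAMPLE = """ID,Manufacturer,Model,Year,Mileage,EngineVolume,LeatherInterior,Price
0,TOYOTA,Prius,2012,150000,1.8,true,7500
1,BMW,X5,,98000,3.0,false,
"""

def to_int(text):
    """Return an int, or None when the field is empty or malformed."""
    text = (text or "").strip()
    try:
        return int(text)
    except ValueError:
        return None

def parse_deal(row):
    """Convert one CSV row into a dict, tolerating missing fields."""
    price = to_int(row.get("Price"))
    return {
        "id": int(row["ID"]),  # ID is the only field said to be always present
        "manufacturer": row.get("Manufacturer") or None,
        "year": to_int(row.get("Year")),
        "price_usd": price,
        # Per the description, an omitted price means "negotiable".
        "negotiable": price is None,
    }

deals = [parse_deal(row) for row in csv.DictReader(io.StringIO(SAMPLE))]
```

    Treating every field except ID as optional keeps the loader from failing on the incomplete entries the disclaimer warns about.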

  11. COVID-19 case rate per 100,000 population and percent test positivity in the...

    • data.ct.gov
    • catalog.data.gov
    application/rdfxml +5
    Updated Jun 23, 2022
    Cite
    Department of Public Health (2022). COVID-19 case rate per 100,000 population and percent test positivity in the last 14 days by town - ARCHIVE [Dataset]. https://data.ct.gov/Health-and-Human-Services/COVID-19-case-rate-per-100-000-population-and-perc/hree-nys2
    Explore at:
    application/rssxml, xml, csv, json, tsv, application/rdfxml
    Available download formats
    Dataset updated
    Jun 23, 2022
    Dataset authored and provided by
    Department of Public Health
    License

    U.S. Government Works (https://www.usa.gov/government-works)
    License information was derived automatically

    Description

    Note: DPH is updating and streamlining the COVID-19 cases, deaths, and testing data. As of 6/27/2022, the data will be published in four tables instead of twelve.

    The COVID-19 Cases, Deaths, and Tests by Day dataset contains cases and test data by date of sample submission. The death data are by date of death. This dataset is updated daily and contains information back to the beginning of the pandemic. The data can be found at https://data.ct.gov/Health-and-Human-Services/COVID-19-Cases-Deaths-and-Tests-by-Day/g9vi-2ahj.

    The COVID-19 State Metrics dataset contains over 93 columns of data. This dataset is updated daily and currently contains information starting June 21, 2022 to the present. The data can be found at https://data.ct.gov/Health-and-Human-Services/COVID-19-State-Level-Data/qmgw-5kp6 .

    The COVID-19 County Metrics dataset contains 25 columns of data. This dataset is updated daily and currently contains information starting June 16, 2022 to the present. The data can be found at https://data.ct.gov/Health-and-Human-Services/COVID-19-County-Level-Data/ujiq-dy22 .

    The COVID-19 Town Metrics dataset contains 16 columns of data. This dataset is updated daily and currently contains information starting June 16, 2022 to the present. The data can be found at https://data.ct.gov/Health-and-Human-Services/COVID-19-Town-Level-Data/icxw-cada . To protect confidentiality, if a town has fewer than 5 cases or positive NAAT tests over the past 7 days, those data will be suppressed.

    This dataset includes a count and rate per 100,000 population for COVID-19 cases, a count of COVID-19 molecular diagnostic tests, and a percent positivity rate for tests among people living in community settings for the previous two-week period. Dates are based on date of specimen collection (cases and positivity).

    A person is considered a new case only upon their first COVID-19 testing result because a case is defined as an instance or bout of illness. If they are tested again subsequently and are still positive, it still counts toward the test positivity metric but they are not considered another case.

    Percent positivity is calculated as the number of positive tests among community residents conducted during the 14 days divided by the total number of positive and negative tests among community residents during the same period. If someone was tested more than once during that 14 day period, then those multiple test results (regardless of whether they were positive or negative) are included in the calculation.
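
    The percent positivity definition above amounts to a simple ratio. The sketch below is illustrative only; the function name and the choice to return None when no tests were reported are assumptions.

```python
def percent_positivity(positive_tests, negative_tests):
    """Positive tests as a percentage of all positive and negative tests.

    Counts are test results, not people: repeat tests of the same
    person within the 14-day window are all included.
    """
    total = positive_tests + negative_tests
    if total == 0:
        return None  # no tests reported, so the metric is undefined
    return 100.0 * positive_tests / total

# Hypothetical counts for one town over a 14-day period.
example = percent_positivity(30, 970)
```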

    These case and test counts do not include cases or tests among people residing in congregate settings, such as nursing homes, assisted living facilities, or correctional facilities.

    These data are updated weekly and reflect the previous two full Sunday-Saturday (MMWR) weeks (https://wwwn.cdc.gov/nndss/document/MMWR_week_overview.pdf).
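
    As a rough illustration of the reporting window, the following sketch computes the two most recent full Sunday-Saturday weeks before a given date. It assumes the week containing the given date is never counted as complete; the function name is hypothetical.

```python
from datetime import date, timedelta

def previous_two_full_weeks(today):
    """Return (start, end) of the last two complete Sunday-Saturday weeks."""
    # date.weekday() is Monday=0 ... Sunday=6, so shift to make Sunday=0.
    days_since_sunday = (today.weekday() + 1) % 7
    current_week_start = today - timedelta(days=days_since_sunday)
    end = current_week_start - timedelta(days=1)     # the preceding Saturday
    start = current_week_start - timedelta(days=14)  # Sunday two weeks earlier
    return start, end

window = previous_two_full_weeks(date(2022, 6, 23))
```

    For June 23, 2022 (a Thursday), the window runs from Sunday, June 5 through Saturday, June 18.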

    DPH note about change from 7-day to 14-day metrics: Prior to 10/15/2020, these metrics were calculated using a 7-day average rather than a 14-day average. The 7-day metrics are no longer being updated as of 10/15/2020 but the archived dataset can be accessed here: https://data.ct.gov/Health-and-Human-Services/COVID-19-case-rate-per-100-000-population-and-perc/s22x-83rd

    As you know, we are learning more about COVID-19 all the time, including the best ways to measure COVID-19 activity in our communities. CT DPH has decided to shift to 14-day rates because these are more stable, particularly at the town level, as compared to 7-day rates. In addition, since the school indicators were initially published by DPH last summer, CDC has recommended 14-day rates and other states (e.g., Massachusetts) have started to implement 14-day metrics for monitoring COVID transmission as well.

    With respect to geography, we also have learned that many people are looking at the town-level data to inform decision making, despite emphasis on the county-level metrics in the published addenda. This is understandable as there has been variation within counties in COVID-19 activity (for example, rates that are higher in one town than in most other towns in the county).

    Additional notes: As of 11/5/2020, CT DPH has added antigen testing for SARS-CoV-2 to reported test counts in this dataset. The tests included in this dataset include both molecular and antigen tests. Molecular tests reported include polymerase chain reaction (PCR) and nucleic acid amplification (NAAT) tests.

    The population data used to calculate rates is based on the CT DPH population statistics for 2019, which is available online here: https://portal.ct.gov/DPH/Health-Information-Systems--Reporting/Population/Population-Statistics. Prior to 5/10/2021, the population estimates from 2018 were used.

    Data suppression is applied when the rate is <5 cases per 100,000 or if there are <5 cases within the town. Information on why data suppression rules are applied can be found online here: https://www.cdc.gov/cancer/uscs/technical_notes/stat_methods/suppression.htm
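
    The rate and suppression rules can be sketched as below. The description does not say whether the published rate is further averaged per day, so this example computes a plain rate over the reporting period; the function names and the use of None for suppressed values are assumptions.

```python
def case_rate_per_100k(cases, population):
    """Cases per 100,000 population over the reporting period."""
    return cases * 100_000 / population

def published_rate(cases, population):
    """Return the rate, or None when the suppression rule applies.

    Suppression follows the stated rule: fewer than 5 cases in the
    town, or a rate below 5 cases per 100,000.
    """
    rate = case_rate_per_100k(cases, population)
    if cases < 5 or rate < 5:
        return None
    return rate

# Hypothetical towns.
shown = published_rate(70, 50_000)
hidden_small_count = published_rate(4, 50_000)
hidden_low_rate = published_rate(10, 1_000_000)
```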

  12. California Hospital Inpatient Mortality Rates and Quality Ratings

    • data.chhs.ca.gov
    • data.ca.gov
    • +5more
    csv, pdf, xls, zip
    Updated Apr 2, 2025
    Cite
    Department of Health Care Access and Information (2025). California Hospital Inpatient Mortality Rates and Quality Ratings [Dataset]. https://data.chhs.ca.gov/dataset/california-hospital-inpatient-mortality-rates-and-quality-ratings
    Explore at:
    pdf(306372), pdf, xls(143872), pdf(134270), pdf(83317), pdf(445171), pdf(700782), pdf(280571), pdf(419645), xls(214016), xls(165376), csv(3189182), xls, pdf(451935), pdf(253971), pdf(791847), pdf(150793), xls(141824), xls(166400), xls(163840), pdf(1235022), xls(172032), pdf(713960), pdf(363570), pdf(798633), pdf(538945), pdf(100994), pdf(288823), pdf(452858), pdf(146736), pdf(114573), pdf(264343), pdf(730246), pdf(238223), pdf(796065), pdf(254426), pdf(729792), zip, pdf(239000), pdf(321071), pdf(147517), csv(6740988)
    Available download formats
    Dataset updated
    Apr 2, 2025
    Dataset authored and provided by
    Department of Health Care Access and Information
    Description

    The dataset contains risk-adjusted mortality rates, quality ratings, and number of deaths and cases for 6 medical conditions treated (Acute Stroke, Acute Myocardial Infarction, Heart Failure, Gastrointestinal Hemorrhage, Hip Fracture and Pneumonia) and 3 procedures performed (Carotid Endarterectomy, Pancreatic Resection, and Percutaneous Coronary Intervention) in California hospitals. The 2023 IMIs were generated using AHRQ Version 2024, while previous years' IMIs were generated with older versions of AHRQ software (2022 IMIs by Version 2023, 2021 IMIs by Version 2022, 2020 IMIs by Version 2021, 2019 IMIs by Version 2020, 2016-2018 IMIs by Version 2019, 2014 and 2015 IMIs by Version 5.0, and 2012 and 2013 IMIs by Version 4.5). Differences in the statistical methods and in the inclusion and exclusion criteria between versions can lead to different results, so users should not compare trends of mortality rates over time. However, many hospitals showed consistent performance over the years; “better” performing hospitals may perform better and “worse” performing hospitals may perform worse consistently across years. This dataset does not include conditions treated or procedures performed in outpatient settings. Please refer to the statewide table for California overall rates: https://data.chhs.ca.gov/dataset/california-hospital-inpatient-mortality-rates-and-quality-ratings/resource/af88090e-b6f5-4f65-a7ea-d613e6569d96

  13. 2-meter Universal Thermal Climate Index (UTCI) and Human Heat Health Index...

    • zenodo.org
    bin, png, tiff
    Updated Jul 6, 2024
    Cite
    Harsh Kamath; Trevor Brooks; Kevin Lanza; Marc Coudert; Dev Niyogi (2024). 2-meter Universal Thermal Climate Index (UTCI) and Human Heat Health Index (H3I) hazard for Austin, Texas [Dataset]. http://doi.org/10.5281/zenodo.10870068
    Explore at:
    png, bin, tiff
    Available download formats
    Dataset updated
    Jul 6, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Harsh Kamath; Trevor Brooks; Kevin Lanza; Marc Coudert; Dev Niyogi
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Area covered
    Texas, Austin
    Description

    Universal Thermal Climate Index (UTCI) is a physiological temperature that is widely used in biometeorological studies to assess the heat stress felt by humans. UTCI considers the shortwave and longwave radiation incident on humans from the six cubical directions as well as air temperature, humidity, wind speed and clothing. As part of the NOAA National Integrated Heat Health Information System (NIHHIS) and NASA Interdisciplinary Research in Earth Science (IDS) project, we have generated the UTCI data for Austin, Texas and the surrounding peri-urban area at 2-meter spatial resolution for the year 2017. Details on data generation and methodology can be found in Kamath et al. (2023) but are summarized here.

    1. Datasets and model used

    The solar and longwave environmental irradiance geometry (SOLWEIG) model was used to simulate shadows, mean radiant temperature (TMRT) and the UTCI (Lindberg et al., 2008). TMRT is the equivalent temperature due to exposure to absorbed shortwave and longwave radiation from all directions in a standing position. SOLWEIG was forced using near-surface ERA-5 data available at a spatial resolution of 0.25° x 0.25°. Building and vegetation heights and the digital terrain model were derived from 3DEP LiDAR point cloud data. SOLWEIG was run using the urban multi-scale environment predictor (UMEP) (Lindberg et al., 2018) plug-in with QGIS.

    2. Data availability

    Diurnal UTCI data were calculated for typical meteorological clear sky days corresponding to Summer and Fall. The typical clear sky day was selected using the 10-year Typical Meteorological Year (TMY) for Austin, Texas (30.2672° N, 97.7431° W) provided by the National Solar Radiation Database (NSRDB). More details on TMY files can be found at: https://nsrdb.nrel.gov/data-sets/tmy

    Additionally, heat hazard data are provided for the daytime Human Heat Health Index (H3I) calculation as defined by Kamath et al. (2023). Briefly, this heat hazard is defined as the fraction of the day when the UTCI exceeds a certain threshold. The thresholds used to calculate heat hazard for Summer and Fall were 35 °C and 32 °C, respectively, both of which imply strong heat stress (Jendritzky et al., 2012). Note that UTCI is on a different scale compared to air temperature, and could yield different heat stress levels.
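
    The heat hazard definition above can be illustrated for a single location as follows. This is a sketch under assumptions, not the method of Kamath et al. (2023): it assumes equally spaced UTCI samples over the period of interest and a strict "greater than" comparison, and the sample values are hypothetical.

```python
def heat_hazard(utci_values_c, threshold_c):
    """Fraction of samples whose UTCI exceeds the threshold (0.0 to 1.0)."""
    if not utci_values_c:
        raise ValueError("at least one UTCI sample is required")
    exceeding = sum(1 for value in utci_values_c if value > threshold_c)
    return exceeding / len(utci_values_c)

# Hypothetical UTCI samples (degrees C) at one pixel during the day.
samples = [30.0, 33.0, 36.0, 38.0, 37.0, 34.0]

summer_hazard = heat_hazard(samples, 35.0)  # Summer threshold from the text
fall_hazard = heat_hazard(samples, 32.0)    # Fall threshold from the text
```

    A lower threshold can only keep or raise the exceedance fraction, so the same series yields a larger hazard value under the Fall threshold.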

    3. Data format

    The georeferenced UTCI and heat hazard data are available in the GeoTIFF file format. The files can be readily visualized using GIS software such as QGIS and ArcGIS, as well as programming languages such as Python.

    4. Companion dataset

    Based on the calculated UTCI here, the potential locations for tree planting were calculated to increase the shade to reduce heat vulnerability for Austin, Texas. [https://doi.org/10.5281/zenodo.6363494]

    References

    1. Kamath, H. G., Martilli, A., Singh, M., Brooks, T., Lanza, K., Bixler, R. P., ... & Niyogi, D. (2023). Human heat health index (H3I) for holistic assessment of heat hazard and mitigation strategies beyond urban heat islands. Urban Climate, 52, 101675.
    2. Lindberg, F., Holmer, B., & Thorsson, S. (2008). SOLWEIG 1.0–Modelling spatial variations of 3D radiant fluxes and mean radiant temperature in complex urban settings. International journal of biometeorology, 52, 697-713.
    3. Lindberg, F., Grimmond, C. S. B., Gabey, A., Huang, B., Kent, C. W., Sun, T., ... & Zhang, Z. (2018). Urban Multi-scale Environmental Predictor (UMEP): An integrated tool for city-based climate services. Environmental modelling & software, 99, 70-87.
    4. Jendritzky, G., de Dear, R., & Havenith, G. (2012). UTCI—why another thermal index?. International journal of biometeorology, 56, 421-428.
    5. Bixler, R. P., Coudert, M., Richter, S. M., Jones, J. M., Llanes Pulido, C., Akhavan, N., ... & Niyogi, D. (2022). Reflexive co-production for urban resilience: Guiding framework and experiences from Austin, Texas. Frontiers in Sustainable Cities, 4, 1015630.
    6. Lanza, K., Jones, J., Acuña, F., Coudert, M., Bixler, R. P., Kamath, H., & Niyogi, D. (2023). Heat vulnerability of Latino and Black residents in a low-income community and their recommended adaptation strategies: A qualitative study. Urban Climate, 51, 101656.
  14. Cervical Cancer Risk Classification - Dataset - CKAN

    • data.poltekkes-smg.ac.id
    Updated Oct 7, 2024
    + more versions
    Cite
    (2024). Cervical Cancer Risk Classification - Dataset - CKAN [Dataset]. https://data.poltekkes-smg.ac.id/dataset/cervical-cancer-risk-classification
    Explore at:
    Dataset updated
    Oct 7, 2024
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Cervical Cancer Risk Factors for Biopsy: This dataset was obtained from the UCI Repository, which is kindly acknowledged. This file contains a list of risk factors for cervical cancer leading to a biopsy examination.

    About 11,000 new cases of invasive cervical cancer are diagnosed each year in the U.S. However, the number of new cervical cancer cases has been declining steadily over the past decades. Although it is the most preventable type of cancer, each year cervical cancer kills about 4,000 women in the U.S. and about 300,000 women worldwide. In the United States, cervical cancer mortality rates plunged by 74% from 1955 - 1992 thanks to increased screening and early detection with the Pap test.

    AGE

    Fifty percent of cervical cancer diagnoses occur in women ages 35 - 54, and about 20% occur in women over 65 years of age. The median age of diagnosis is 48 years. About 15% of women develop cervical cancer between the ages of 20 - 30. Cervical cancer is extremely rare in women younger than age 20. However, many young women become infected with multiple types of human papilloma virus, which then can increase their risk of getting cervical cancer in the future. Young women with early abnormal changes who do not have regular examinations are at high risk for localized cancer by the time they are age 40, and for invasive cancer by age 50.

    SOCIOECONOMIC AND ETHNIC FACTORS

    Although the rate of cervical cancer has declined among both Caucasian and African-American women over the past decades, it remains much more prevalent in African-Americans -- whose death rates are twice as high as Caucasian women. Hispanic American women have more than twice the risk of invasive cervical cancer as Caucasian women, also due to a lower rate of screening. These differences, however, are almost certainly due to social and economic differences. Numerous studies report that high poverty levels are linked with low screening rates. In addition, lack of health insurance, limited transportation, and language difficulties hinder a poor woman’s access to screening services.

    HIGH SEXUAL ACTIVITY

    Human papilloma virus (HPV) is the main risk factor for cervical cancer. In adults, the most important risk factor for HPV is sexual activity with an infected person. Women most at risk for cervical cancer are those with a history of multiple sexual partners, sexual intercourse at age 17 years or younger, or both. A woman who has never been sexually active has a very low risk for developing cervical cancer. Sexual activity with multiple partners increases the likelihood of many other sexually transmitted infections (chlamydia, gonorrhea, syphilis). Studies have found an association between chlamydia and cervical cancer risk, including the possibility that chlamydia may prolong HPV infection.

    FAMILY HISTORY

    Women have a higher risk of cervical cancer if they have a first-degree relative (mother, sister) who has had cervical cancer.

    USE OF ORAL CONTRACEPTIVES

    Studies have reported a strong association between cervical cancer and long-term use of oral contraception (OC). Women who take birth control pills for more than 5 - 10 years appear to have a much higher risk of HPV infection (up to four times higher) than those who do not use OCs. (Women taking OCs for fewer than 5 years do not have a significantly higher risk.) The reasons for this risk from OC use are not entirely clear. Women who use OCs may be less likely to use a diaphragm, condoms, or other methods that offer some protection against sexually transmitted diseases, including HPV. Some research also suggests that the hormones in OCs might help the virus enter the genetic material of cervical cells.

    HAVING MANY CHILDREN

    Studies indicate that having many children increases the risk for developing cervical cancer, particularly in women infected with HPV.

    SMOKING

    Smoking is associated with a higher risk for precancerous changes (dysplasia) in the cervix and for progression to invasive cervical cancer, especially for women infected with HPV.

    IMMUNOSUPPRESSION

    Women with weak immune systems (such as those with HIV / AIDS) are more susceptible to acquiring HPV. Immunocompromised patients are also at higher risk for having cervical precancer develop rapidly into invasive cancer.

    DIETHYLSTILBESTROL (DES)

    From 1938 - 1971, diethylstilbestrol (DES), an estrogen-related drug, was widely prescribed to pregnant women to help prevent miscarriages. The daughters of these women face a higher risk for cervical cancer. DES is no longer prescribed.

  15. COVID-19 Reported Patient Impact and Hospital Capacity by Facility

    • healthdata.gov
    • data.ct.gov
    • +5more
    Updated May 3, 2024
    + more versions
    Cite
    U.S. Department of Health & Human Services (2024). COVID-19 Reported Patient Impact and Hospital Capacity by Facility [Dataset]. https://healthdata.gov/Hospital/COVID-19-Reported-Patient-Impact-and-Hospital-Capa/anag-cw7u
    Explore at:
    tsv, application/rssxml, csv, xml, application/rdfxml, application/geo+json, kmz, kml
    Available download formats
    Dataset updated
    May 3, 2024
    Dataset provided by
    United States Department of Health and Human Services (http://www.hhs.gov/)
    Authors
    U.S. Department of Health & Human Services
    License

    https://www.usa.gov/government-works

    Description

    After May 3, 2024, this dataset and webpage will no longer be updated because hospitals are no longer required to report data on COVID-19 hospital admissions, and hospital capacity and occupancy data, to HHS through CDC’s National Healthcare Safety Network. Data voluntarily reported to NHSN after May 1, 2024, will be available starting May 10, 2024, at COVID Data Tracker Hospitalizations.

    The following dataset provides facility-level data for hospital utilization aggregated on a weekly basis (Sunday to Saturday). These are derived from reports with facility-level granularity across two main sources: (1) HHS TeleTracking, and (2) reporting provided directly to HHS Protect by state/territorial health departments on behalf of their healthcare facilities.

    The hospital population includes all hospitals registered with Centers for Medicare & Medicaid Services (CMS) as of June 1, 2020. It includes non-CMS hospitals that have reported since July 15, 2020. It does not include psychiatric, rehabilitation, Indian Health Service (IHS) facilities, U.S. Department of Veterans Affairs (VA) facilities, Defense Health Agency (DHA) facilities, and religious non-medical facilities.

    For a given entry, the term “collection_week” signifies the start of the period that is aggregated. For example, a “collection_week” of 2020-11-15 means the average/sum/coverage of the elements captured from that given facility starting and including Sunday, November 15, 2020, and ending and including reports for Saturday, November 21, 2020.
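
    The relationship between a “collection_week” value and the days it covers can be sketched as below. The function name is hypothetical, and the value is assumed to be an ISO-formatted date string as in the example above.

```python
from datetime import date, timedelta

def collection_week_range(collection_week):
    """Return (first_day, last_day) covered by a collection_week value."""
    start = date.fromisoformat(collection_week)  # the starting Sunday
    end = start + timedelta(days=6)              # the following Saturday
    return start, end

first_day, last_day = collection_week_range("2020-11-15")
```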

    Reported elements include an append of either “_coverage”, “_sum”, or “_avg”.

    • A “_coverage” append denotes how many times the facility reported that element during that collection week.
    • A “_sum” append denotes the sum of the reports provided for that facility for that element during that collection week.
    • A “_avg” append is the average of the reports provided for that facility for that element during that collection week.

    The file will be updated weekly. No statistical analysis is applied to impute non-response. For averages, calculations are based on the number of values collected for a given hospital in that collection week. Suppression is applied to the file for sums and averages less than four (4). In these cases, the field will be replaced with “-999,999”.
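
    A minimal sketch of how the three suffixes and the suppression rule could fit together for one element at one facility is shown below. It is not the HHS implementation: the use of None for a missing daily report, the function name, and the literal reading of "less than four" (which here also masks zero) are assumptions.

```python
SUPPRESSED = -999999  # sentinel shown as "-999,999" in the description

def summarize_week(daily_reports):
    """Compute coverage, sum and avg for one element in one collection week.

    daily_reports holds up to seven values; None marks a day with no report.
    Non-response is not imputed: the average uses only the reported values.
    """
    values = [v for v in daily_reports if v is not None]
    coverage = len(values)
    if coverage == 0:
        return {"coverage": 0, "sum": None, "avg": None}
    total = sum(values)
    average = total / coverage

    def mask(x):
        # Literal reading of the rule: anything below four is suppressed.
        return SUPPRESSED if x < 4 else x

    return {"coverage": coverage, "sum": mask(total), "avg": mask(average)}

# Hypothetical week: reports on four of seven days.
busy = summarize_week([10, None, 12, 8, None, None, 10])
# Hypothetical week with small counts that trigger suppression.
quiet = summarize_week([1, 1, None, None, None, None, None])
```

    Because suppression is applied to sums and averages separately, a week can show a visible sum next to a suppressed average.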

    A story page was created to display both corrected and raw datasets and can be accessed at this link: https://healthdata.gov/stories/s/nhgk-5gpv

    This data is preliminary and subject to change as more data become available. Data is available starting on July 31, 2020.

    Sometimes, reports for a given facility will be provided to both HHS TeleTracking and HHS Protect. When this occurs, to ensure that there are not duplicate reports, deduplication is applied according to prioritization rules within HHS Protect.

    For influenza fields listed in the file, the current HHS guidance marks these fields as optional. As a result, coverage of these elements varies.

    For recent updates to the dataset, scroll to the bottom of the dataset description.

    On May 3, 2021, the following fields were added to this data set:

    • hhs_ids
    • previous_day_admission_adult_covid_confirmed_7_day_coverage
    • previous_day_admission_pediatric_covid_confirmed_7_day_coverage
    • previous_day_admission_adult_covid_suspected_7_day_coverage
    • previous_day_admission_pediatric_covid_suspected_7_day_coverage
    • previous_week_personnel_covid_vaccinated_doses_administered_7_day_sum
    • total_personnel_covid_vaccinated_doses_none_7_day_sum
    • total_personnel_covid_vaccinated_doses_one_7_day_sum
    • total_personnel_covid_vaccinated_doses_all_7_day_sum
    • previous_week_patients_covid_vaccinated_doses_one_7_day_sum
    • previous_week_patients_covid_vaccinated_doses_all_7_day_sum

    On May 8, 2021, this data set was converted to a corrected data set. The corrections smooth out data anomalies caused by keyed-in data errors. To help determine which records have been corrected, an additional Boolean field called is_corrected has been added.

    On May 13, 2021, the vaccination fields were changed from sum fields to max or min fields. Each field now reflects the maximum or minimum number reported for that metric in a given week.

    On June 7, 2021, the vaccination fields were changed from max or min fields to Wednesday-reported-only fields. Each field now reflects only the number reported for that metric on the Wednesday of a given week.

    On September 20, 2021, the data set was updated to use the analytic dataset as a source.

    On January 19, 2022, the following fields were added to this dataset:

    • inpatient_beds_used_covid_7_day_avg
    • inpatient_beds_used_covid_7_day_sum
    • inpatient_beds_used_covid_7_day_coverage

    On April 28, 2022, the following pediatric fields were added to this dataset:

    • all_pediatric_inpatient_bed_occupied_7_day_avg
    • all_pediatric_inpatient_bed_occupied_7_day_coverage
    • all_pediatric_inpatient_bed_occupied_7_day_sum
    • all_pediatric_inpatient_beds_7_day_avg
    • all_pediatric_inpatient_beds_7_day_coverage
    • all_pediatric_inpatient_beds_7_day_sum
    • previous_day_admission_pediatric_covid_confirmed_0_4_7_day_sum
    • previous_day_admission_pediatric_covid_confirmed_12_17_7_day_sum
    • previous_day_admission_pediatric_covid_confirmed_5_11_7_day_sum
    • previous_day_admission_pediatric_covid_confirmed_unknown_7_day_sum
    • staffed_icu_pediatric_patients_confirmed_covid_7_day_avg
    • staffed_icu_pediatric_patients_confirmed_covid_7_day_coverage
    • staffed_icu_pediatric_patients_confirmed_covid_7_day_sum
    • staffed_pediatric_icu_bed_occupancy_7_day_avg
    • staffed_pediatric_icu_bed_occupancy_7_day_coverage
    • staffed_pediatric_icu_bed_occupancy_7_day_sum
    • total_staffed_pediatric_icu_beds_7_day_avg
    • total_staffed_pediatric_icu_beds_7_day_coverage
    • total_staffed_pediatric_icu_beds_7_day_sum

    On October 24, 2022, more analytical calculations were added to the data in an effort to provide a cleaner dataset. For a raw version of this dataset, please follow this link: https://healthdata.gov/Hospital/COVID-19-Reported-Patient-Impact-and-Hospital-Capa/uqq2-txqb

    Due to changes in reporting requirements, after June 19, 2023, a collection week is defined as starting on a Sunday and ending on the next Saturday.

  16. Native American Multi-Year Facial Image Dataset

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Cite
    FutureBee AI (2022). Native American Multi-Year Facial Image Dataset [Dataset]. https://www.futurebeeai.com/dataset/image-dataset/facial-images-historical-native-american
    Explore at:
    wav (available download formats)
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreement

    Area covered
    United States
    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    Welcome to the Native American Multi-Year Facial Image Dataset, thoughtfully curated to support the development of advanced facial recognition systems, biometric identification models, KYC verification tools, and other computer vision applications. This dataset is ideal for training AI models to recognize individuals over time, track facial changes, and enhance age progression capabilities.

    Facial Image Data

    This dataset includes over 5,000 high-quality facial images, organized into individual participant sets, each containing:

    Historical Images: 22 facial images per participant captured across a span of 10 years
    Enrollment Image: One recent high-resolution facial image for reference or ground truth

    Diversity & Representation

    Geographic Coverage: Participants from the USA, Canada, Mexico, and other Native American regions
    Demographics: Individuals aged 18 to 70 years, with a gender distribution of 60% male and 40% female
    File Formats: All images are available in JPEG and HEIC formats

    Image Quality & Capture Conditions

    To ensure model generalization and practical usability, images in this dataset reflect real-world diversity:

    Lighting Conditions: Images captured under various natural and artificial lighting setups
    Backgrounds: A wide range of indoor and outdoor backgrounds
    Device Quality: Captured using modern, high-resolution mobile devices for consistency and clarity

    Metadata

    Each participant’s dataset is accompanied by rich metadata to support advanced model training and analysis, including:

    Unique participant ID
    File name
    Age at the time of image capture
    Gender
    Country of origin
    Demographic profile
    File format

    Use Cases & Applications

    This dataset is highly valuable for a wide range of AI and computer vision applications:

    Facial Recognition Systems: Train models for high-accuracy face matching across time
    KYC & Identity Verification: Improve time-spanning verification for banks, insurance, and government services
    Biometric Security Solutions: Build reliable identity authentication models
    Age Progression & Estimation Models: Train AI to predict aging patterns or estimate age from facial features
    Generative AI: Support creation and validation of synthetic age progression or longitudinal face generation

    Secure & Ethical Collection

    Platform: All data was securely collected and processed through FutureBeeAI’s proprietary systems
    Ethical Compliance: Full participant consent obtained with transparent communication of use cases
    Privacy-Protected: No personally identifiable information is included; all data is anonymized and handled with care

    Dataset Updates & Customization

    To keep pace with evolving AI needs, this dataset is regularly updated and customizable. Custom data collection options include:


  17. Data from: People Detection Dataset

    • kaggle.com
    Updated Jun 15, 2025
    Cite
    Adil Shamim (2025). People Detection Dataset [Dataset]. https://www.kaggle.com/datasets/adilshamim8/people-detection
    Explore at:
    Croissant (Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.)
    Dataset updated
    Jun 15, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Adil Shamim
    License

    Attribution-NoDerivs 4.0 (CC BY-ND 4.0): https://creativecommons.org/licenses/by-nd/4.0/
    License information was derived automatically

    Description

    Give Machines the Power to See People.

    This isn’t just a dataset — it’s a foundation for building the future of human-aware technology. Carefully crafted and annotated with precision, the People Detection dataset enables AI systems to recognize and understand human presence in dynamic, real-world environments.

    Whether you’re building smart surveillance, autonomous vehicles, crowd analytics, or next-gen robotics, this dataset gives your model the eyes it needs.

    What Makes This Dataset Different?

    • Real-World Images – Diverse environments, realistic lighting, and real human motion
    • High-Quality Annotations – Every person labeled with clean YOLO-format bounding boxes
    • Plug-and-Play – Comes with pre-split training, validation, and test sets — no extra prep needed
    • Speed-Optimized – Perfect for real-time object detection applications
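    As a rough illustration of the YOLO-format annotations mentioned above: a YOLO label line conventionally holds a class index followed by the box centre, width, and height, all normalized to the image size. The sketch below converts one such line to pixel corner coordinates. The function name and the sample values are hypothetical, and the exact label layout of this particular dataset is assumed rather than confirmed by the listing.

```python
def yolo_to_pixel_box(label_line: str, img_w: int, img_h: int):
    """Convert one YOLO label line to (class_id, (x_min, y_min, x_max, y_max)).

    Assumes the conventional layout "class x_center y_center width height",
    with the four box values normalized to the range 0..1.
    """
    class_id, xc, yc, w, h = label_line.split()
    # Scale the normalized values back to pixels.
    xc, w = float(xc) * img_w, float(w) * img_w
    yc, h = float(yc) * img_h, float(h) * img_h
    # Move from centre/size form to corner form.
    box = (xc - w / 2, yc - h / 2, xc + w / 2, yc + h / 2)
    return int(class_id), box


# Hypothetical label: one person centred in a 640x480 image.
class_id, box = yolo_to_pixel_box("0 0.5 0.5 0.5 0.5", 640, 480)
```

    Corner coordinates are what most drawing and cropping routines expect, so a conversion like this is usually the first step when inspecting annotations by eye.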

    Built for Visionaries

    • Detect people instantly — in cities, offices, or crowds
    • Build systems that respond to human presence
    • Train intelligent agents to navigate human spaces safely and smartly

    Created using Roboflow. Optimized for clarity, performance, and scale. Source Dataset on Roboflow →

    This is more than a dataset. It’s a step toward a smarter world — One where machines can understand people.

  18. COVID-19 Cases and Deaths by Age Group - ARCHIVE

    • catalog.data.gov
    • data.ct.gov
    Updated Aug 12, 2023
    Cite
    data.ct.gov (2023). COVID-19 Cases and Deaths by Age Group - ARCHIVE [Dataset]. https://catalog.data.gov/dataset/covid-19-cases-and-deaths-by-age-group
    Explore at:
    Dataset updated
    Aug 12, 2023
    Dataset provided by
    data.ct.gov
    Description

    Note: DPH is updating and streamlining the COVID-19 cases, deaths, and testing data. As of 6/27/2022, the data will be published in four tables instead of twelve.

    • The COVID-19 Cases, Deaths, and Tests by Day dataset contains cases and test data by date of sample submission. The death data are by date of death. This dataset is updated daily and contains information back to the beginning of the pandemic. The data can be found at https://data.ct.gov/Health-and-Human-Services/COVID-19-Cases-Deaths-and-Tests-by-Day/g9vi-2ahj
    • The COVID-19 State Metrics dataset contains over 93 columns of data. This dataset is updated daily and currently contains information from June 21, 2022 to the present. The data can be found at https://data.ct.gov/Health-and-Human-Services/COVID-19-State-Level-Data/qmgw-5kp6
    • The COVID-19 County Metrics dataset contains 25 columns of data. This dataset is updated daily and currently contains information from June 16, 2022 to the present. The data can be found at https://data.ct.gov/Health-and-Human-Services/COVID-19-County-Level-Data/ujiq-dy22
    • The COVID-19 Town Metrics dataset contains 16 columns of data. This dataset is updated daily and currently contains information from June 16, 2022 to the present. The data can be found at https://data.ct.gov/Health-and-Human-Services/COVID-19-Town-Level-Data/icxw-cada

    To protect confidentiality, if a town has fewer than 5 cases or positive NAAT tests over the past 7 days, those data will be suppressed.

    This dataset contains COVID-19 cases and associated deaths reported among Connecticut residents, broken out by age group. All data in this report are preliminary; data for previous dates will be updated as new reports are received and data errors are corrected. Deaths reported to either the Office of the Chief Medical Examiner (OCME) or the Department of Public Health (DPH) are included in the daily COVID-19 update.

    Data are reported daily, with timestamps indicated in the daily briefings posted at: portal.ct.gov/coronavirus. Data are subject to future revision as reporting changes. Starting in July 2020, this dataset will be updated every weekday.

    Additional notes: A delay in the data pull schedule occurred on 06/23/2020. Data from 06/22/2020 was processed on 06/23/2020 at 3:30 PM. The normal data cycle resumed with the data for 06/23/2020. A network outage on 05/19/2020 resulted in a change in the data pull schedule. Data from 5/19/2020 was processed on 05/20/2020 at 12:00 PM. Data from 5/20/2020 was processed on 5/20/2020 at 8:30 PM. The normal data cycle resumed on 05/20/2020 with the 8:30 PM data pull. As a result of the network outage, the timestamps on the datasets on the Open Data Portal differ from the timestamps in DPH's daily PDF reports.

    Starting 5/10/2021, the date field will represent the date this data was updated on data.ct.gov. Previously, the date the data was pulled by DPH was listed, which typically was the day before the data was published on data.ct.gov. This change was made to standardize the COVID-19 data sets on data.ct.gov.

  19. Global Human Footprint Index

    • hub.arcgis.com
    • cacgeoportal.com
    • +1more
    Updated Jul 14, 2015
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Columbia (2015). Global Human Footprint Index [Dataset]. https://hub.arcgis.com/maps/65518e782be04e7db31de65d53d591a9
    Explore at:
    Dataset updated
    Jul 14, 2015
    Dataset authored and provided by
    Columbia
    Description

    Global Human Footprint Index represents the relative human influence in each terrestrial biome expressed as a percentage. The purpose is to provide an updated map of anthropogenic impacts on the environment in geographic projection which can be used in wildlife conservation planning, natural resource management, and research on human-environment interactions.

    Dataset Summary

    The Global Human Footprint Index Dataset of the Last of the Wild Project, Version 2, 2005 (LWP-2) is the Human Influence Index (HII) normalized by biome and realm. The HII is a global dataset of 1-kilometer grid cells, created from nine global data layers of human population pressure (population density), human land use and infrastructure (built-up areas, nighttime lights, land use/land cover), and human access (coastlines, roads, railroads, navigable rivers). A value of zero represents the least influenced (“most wild”) part of the biome, and a value of 100 represents the most influenced (least wild) part of the biome. The dataset is produced by the Wildlife Conservation Society (WCS) and the Columbia University Center for International Earth Science Information Network (CIESIN).

    Recommended Citation

    Wildlife Conservation Society - WCS, and Center for International Earth Science Information Network - CIESIN - Columbia University. 2005. Last of the Wild Project, Version 2, 2005 (LWP-2): Global Human Footprint Dataset (Geographic). Palisades, NY: NASA Socioeconomic Data and Applications Center (SEDAC). http://dx.doi.org/10.7927/H4M61H5F. Accessed DAY MONTH YEAR.

  20. Glaive Function Calling V2

    • kaggle.com
    • huggingface.co
    Updated Nov 24, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    The Devastator (2023). Glaive Function Calling V2 [Dataset]. https://www.kaggle.com/datasets/thedevastator/ai-chatbot-conversational-data/code
    Explore at:
    Croissant (Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.)
    Dataset updated
    Nov 24, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    The Devastator
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    AI Chatbot Conversational Data

    A Knowledge Base for Trainable Natural Language Processing

    By Huggingface Hub [source]

    About this dataset

    This dataset contains valuable records of conversations between humans and AI-driven chatbots in real-world scenarios. This is a great opportunity to explore the nuances and intricacies of conversations between humans and machines, opening the door to interesting research directions for machine learning, artificial intelligence, natural language processing (NLP), and beyond. With this data, researchers can determine how well machines are able to simulate real conversation behavior such as nonverbal exchanges, intonations, humorous insights or even sarcasm. The data also provides an avenue for comparative studies between human behavior and AI capabilities in carrying out meaningful dialogues with humans. This knowledge base is invaluable for those who aim to create more astounding AI systems that can closely imitate comprehensible speech patterns through their trained technology models

    How to use the dataset

    This dataset contains conversations between humans and AI-driven chatbots in real-world scenarios. With this dataset, you will be able to use the data to build an AI system that can respond intelligently in natural language conversations. For example, you can build a system with the ability to further engage users by replying with meaningful responses as the conversation progresses.

    In order to get started, first familiarize yourself with the columns included in this dataset: 'chat' and 'system'. The column 'chat' contains conversations between humans and chatbot systems while the column 'system' contains responses from AI-driven chatbots.

    Once you understand what is included in the data set, you can start building your AI system. Depending on how complex or advanced your goal is, several approaches could be used with this data set, such as supervised learning models like seq2seq networks or unsupervised methods like autoencoders. For more detailed information on those methods, refer to external materials available online.

    After training your model, test its performance. Enter some sample text into your model using either a web form or a command-line interface, then compare how it responds with the expected chatbot responses stored in the 'system' column of the training dataset (see above). A correctly trained model should produce responses that closely resemble instances from the training dataset. If more accuracy is needed, adjust the model's parameters until the desired results are achieved.

    Research Ideas

    • AI-driven natural language generation: Using this dataset, developers can train AI systems to automatically generate natural conversations between humans and machines.
    • Automatic response selection: The data in the dataset could be used to train AI algorithms which select the most appropriate response in any given conversation.
    • Evaluating human-machine interaction: Researchers can use this data to identify areas of improvement in conversational interactions between humans and machines, as well as evaluate various techniques for creating effective dialogue systems

    Acknowledgements

    If you use this dataset in your research, please credit the original authors. Data Source

    License

    License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.

    Columns

    File: train.csv

    | Column name | Description                                             |
    |:------------|:--------------------------------------------------------|
    | chat        | Contains dialogues uttered by the human. (String)       |
    | system      | Contains responses from the AI-driven chatbot. (String) |
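    A minimal sketch of reading the two columns listed above, using only the Python standard library. The function name and the sample rows are hypothetical; the only details taken from the listing are the column names 'chat' and 'system'.

```python
import csv
import io


def load_pairs(fileobj):
    """Read (chat, system) pairs from a CSV file object with those two columns."""
    reader = csv.DictReader(fileobj)
    return [(row["chat"], row["system"]) for row in reader]


# Hypothetical in-memory sample standing in for train.csv.
sample = io.StringIO(
    "chat,system\n"
    '"Hi, can you help me?","Sure, what do you need?"\n'
    '"What time is it?","I cannot check the time."\n'
)
pairs = load_pairs(sample)
```

    For the real file, the same function could be called on a file opened with `open("train.csv", newline="", encoding="utf-8")`, assuming the file is UTF-8 encoded.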

    Acknowledgements

    If you use this dataset in your research, please credit the original authors and Huggingface Hub.

Cite
California Department of Public Health (2025). Statewide Death Profiles [Dataset]. https://data.chhs.ca.gov/dataset/statewide-death-profiles
Statewide Death Profiles

Explore at:
4 scholarly articles cite this dataset (View in Google Scholar)
csv(4689434), csv(16301), csv(5034), csv(463460), csv(2026589), csv(5401561), csv(164006), csv(200270), csv(419332), zip, csv(385695) (available download formats)
Dataset updated
Jul 28, 2025
Dataset authored and provided by
California Department of Public Health (https://www.cdph.ca.gov/)
Description

This dataset contains counts of deaths for California as a whole based on information entered on death certificates. Final counts are derived from static data and include out-of-state deaths to California residents, whereas provisional counts are derived from incomplete and dynamic data. Provisional counts are based on the records available when the data was retrieved and may not represent all deaths that occurred during the time period. Deaths involving injuries from external or environmental forces, such as accidents, homicide and suicide, often require additional investigation that tends to delay certification of the cause and manner of death. This can result in significant under-reporting of these deaths in provisional data.

The final data tables include both deaths that occurred in California regardless of the place of residence (by occurrence) and deaths to California residents (by residence), whereas the provisional data table only includes deaths that occurred in California regardless of the place of residence (by occurrence). The data are reported as totals, as well as stratified by age, gender, race-ethnicity, and death place type. Deaths due to all causes (ALL) and selected underlying cause of death categories are provided. See temporal coverage for more information on which combinations are available for which years.

The cause of death categories are based solely on the underlying cause of death as coded by the International Classification of Diseases. The underlying cause of death is defined by the World Health Organization (WHO) as "the disease or injury which initiated the train of events leading directly to death, or the circumstances of the accident or violence which produced the fatal injury." It is a single value assigned to each death based on the details as entered on the death certificate. When more than one cause is listed, the order in which they are listed can affect which cause is coded as the underlying cause. This means that similar events could be coded with different underlying causes of death depending on variations in how they were entered. Consequently, while underlying cause of death provides a convenient comparison between cause of death categories, it may not capture the full impact of each cause of death as it does not always take into account all conditions contributing to the death.
