Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset gives the risk of developing or dying from various types of cancer, for both males and females.
The columns include:
- Gender: despite its name, this column holds the type of cancer or category (e.g., "Any cancer", "Bladder").
- Risk of developing (Male): the percentage risk and the equivalent "one in _ person" statistic.
- Risk of dying (Male): the percentage risk and the equivalent "one in _ person" statistic.
- Risk of developing (Woman): the percentage risk and the equivalent "one in _ person" statistic.
- Risk of dying (Woman): the percentage risk and the equivalent "one in _ person" statistic.
Columns in the Dataset
- Gender
- Risk of developing (Male): Percentage
- Risk of developing (Male): One in _ Person
- Risk of dying (Male): Percentage
- Risk of dying (Male): One in _ Person
- Risk of developing (Woman): Percentage
- Risk of developing (Woman): One in _ Person
- Risk of dying (Woman): Percentage
- Risk of dying (Woman): One in _ Person
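The percentage and "one in _ person" columns express the same risk in two forms. A minimal sketch of the conversion, assuming the "one in N" figure is simply 100 divided by the percentage and rounded to a whole person (the dataset page does not state its rounding rule, and the function names are hypothetical):

```python
def percent_to_one_in(percent: float) -> int:
    """Return N such that a risk of `percent` % is roughly 'one in N' people."""
    if percent <= 0:
        raise ValueError("percent must be positive")
    return round(100 / percent)


def one_in_to_percent(n: float) -> float:
    """Return the percentage risk equivalent to 'one in n' people."""
    if n <= 0:
        raise ValueError("n must be positive")
    return 100 / n


# Hypothetical values, not taken from the dataset.
print(percent_to_one_in(25.0))  # one in 4
print(one_in_to_percent(50))    # 2.0 percent
```

A check like this can help spot rows where the two columns disagree, for example after a parsing error.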
Data on death rates in the United States by age and cause of death. Toward the bottom of the table some of the columns are misaligned, but if you download the file you should be able to make out all the numbers and information.
Looking at death rates in the United States can be a sobering experience, but it can also be a helpful way to see where our country needs to focus its efforts in terms of public health. This dataset contains information on death rates in the United States in 2014, by age and cause of death. This can be used to help identify which age groups are most at risk for certain causes of death, and what factors may contribute to those risks.
- Find out what age group is dying the most and why.
- Compare death rates from different causes of death.
- Find out which states have the highest death rates.
License
Unknown License - Please check the dataset description for more information.
File: 2014 Death Rates by Age & Cause.csv
| Column name | Description |
|:---|:---|
| Cause of death (based on ICD–10) | The cause of death that the row represents. This is given as a code based on the International Classification of Diseases (ICD). (String) |
| All ages1 | The number of deaths due to the given cause in the given age group. (Integer) |
| Under 1 year2 | The number of deaths due to the given cause in the given age group. (Integer) |
| 1–4 | The number of deaths due to the given cause in the given age group. (Integer) |
| 5–14 | The number of deaths due to the given cause in the given age group. (Integer) |
| 15–24 | The number of deaths due to the given cause in the given age group. (Integer) |
| 25–34 | The number of deaths due to the given cause in the given age group. (Integer) |
| 35–44 | The number of deaths due to the given cause in the given age group. (Integer) |
| 45–54 | The number of deaths due to the given cause in the given age group. (Integer) |
| 55–64 | The number of deaths due to the given cause in the given age group. (Integer) |
| 65–74 | The number of deaths due to the given cause in the given age group. (Integer) |
| 75–84 | The number of deaths due to the given cause in the given age group. (Integer) |
| 85 and over | The number of deaths due to the given cause in the given age group. (Integer) |
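To find which age group has the highest value for each cause, one option is to scan each row and take the column with the maximum. The sketch below uses only the standard library and a small inline sample; the cause names, the reduced set of age columns, and all numbers are hypothetical, and the real file may need extra cleaning (for example, dropping the all-ages total or handling non-numeric cells) before this works.

```python
import csv
import io

# Hypothetical sample in the same shape as the file; numbers are invented.
SAMPLE = """Cause of death,1–4,5–14,15–24
Cause A,10,20,30
Cause B,5,50,15
"""


def peak_age_group(row: dict, cause_column: str) -> str:
    """Return the age-group column holding the largest value in this row."""
    values = {col: int(val) for col, val in row.items() if col != cause_column}
    return max(values, key=values.get)


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
peaks = {row["Cause of death"]: peak_age_group(row, "Cause of death") for row in rows}
print(peaks)
```

With a real file, `io.StringIO(SAMPLE)` would be replaced by an opened file handle.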
Rank, number of deaths, percentage of deaths, and age-specific mortality rates for the leading causes of death, by age group and sex, 2000 to most recent year.
The model on which this dataset is based contains the following nodes, grouped as follows:
- Personal info: Age range and Gender.
- Probability of contracting the disease or infection described by the node: Respiratory infection, Malignancy, Cardiovascular disease, Poisoning, Nature force, Fall.
- Probability of suffering harm from an intermediate cause related to the nodes in the previous point: Other injuries, Sickness, Transport incident, Self-harm violence, Other sickness, and Unintentional injuries.
- Probability of dying: Death.
All of the nodes are discrete random variables with range true, false. The only exceptions are Age range, with range young, adult, old, and Gender, with range male, female. To build our model, we first followed the causal order, working out which nodes directly influence which others.
Attribution-NonCommercial-ShareAlike 3.0 (CC BY-NC-SA 3.0) https://creativecommons.org/licenses/by-nc-sa/3.0/
License information was derived automatically
Effect of suicide rates on life expectancy dataset
Abstract
In 2015, approximately 55 million people died worldwide, of whom roughly 800,000 died by suicide. Suicide is one of the main causes of death in the USA, so this experiment asks how much suicide rates affect the statistics of average life expectancy.
The experiment takes two datasets, one with the number of suicides and one with life expectancy, and combines them into a single dataset. I then look for patterns and correlations among the variables and perform a statistical test using simple regression to confirm my assumptions.
Data
The experiment uses two datasets, WHO Suicide Statistics[1] and WHO Life Expectancy[2], which were first preprocessed. The final merged dataset has 13 variables, with country and year used as the index:
- Country
- Year
- Suicides number
- Life expectancy
- Adult Mortality: probability of dying between 15 and 60 years per 1000 population
- Infant deaths: number of infant deaths per 1000 population
- Alcohol: recorded per capita (15+) alcohol consumption
- Under-five deaths: number of under-five deaths per 1000 population
- HIV/AIDS: deaths per 1 000 live births due to HIV/AIDS
- GDP: Gross Domestic Product per capita
- Population
- Income composition of resources: Human Development Index in terms of income composition of resources
- Schooling: number of years of schooling
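The merge on country and year followed by a simple regression can be sketched as follows. This is an illustration only, not the author's code: the records are invented, the dictionaries stand in for the two preprocessed datasets, and the regression is ordinary least squares with one predictor written out by hand.

```python
# Hypothetical records keyed by (country, year); values are invented.
suicides = {("A", 2000): 100, ("A", 2001): 200, ("B", 2000): 300}
life_exp = {("A", 2000): 80.0, ("A", 2001): 78.0, ("B", 2000): 76.0, ("B", 2001): 75.0}

# Inner join: keep only the (country, year) keys present in both datasets.
merged = {key: (suicides[key], life_exp[key]) for key in suicides.keys() & life_exp.keys()}


def simple_regression(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


xs = [pair[0] for pair in merged.values()]  # suicides number
ys = [pair[1] for pair in merged.values()]  # life expectancy
slope, intercept = simple_regression(xs, ys)
print(f"life expectancy ~ {intercept:.2f} + {slope:.4f} * suicides")
```

The inner join drops `("B", 2001)` because it has no suicide record, which mirrors how unmatched country-year pairs would fall out of a merged table.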
LICENSE
The experiment uses two datasets, WHO Suicide Statistics and WHO Life Expectancy, which were collected from the WHO and United Nations websites. Therefore, all datasets are under the license Attribution-NonCommercial-ShareAlike 3.0 IGO (https://creativecommons.org/licenses/by-nc-sa/3.0/igo/).
[1] https://www.kaggle.com/szamil/who-suicide-statistics
[2] https://www.kaggle.com/kumarajarshi/life-expectancy-who
Open Government Licence 3.0 http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Annual data on death registrations by single year of age for the UK (1974 onwards) and England and Wales (1963 onwards).
The following tables provide historical and projected probabilities of death by age, sex, and year for the period 2011 - 2090. Death probabilities for males.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Context:
This dataset provides data on death rates for suicide categorized by selected population characteristics including sex, race, Hispanic origin, and age in the United States. It includes critical information about measures, definitions, and changes over time.
Source:
- NCHS, National Vital Statistics System (NVSS)
- Grove RD, Hetzel AM. Vital statistics rates in the United States, 1940–1960. National Center for Health Statistics. 1968
- Numerator data from NVSS annual public-use Mortality Files
- Denominator data from U.S. Census Bureau national population estimates
- Murphy SL, Xu JQ, Kochanek KD, Arias E, Tejada-Vera B. Deaths: Final data for 2018. National Vital Statistics Reports; vol 69 no 13. Hyattsville, MD: National Center for Health Statistics. 2021
Source URLs:
- Death rates for suicide by sex, race, Hispanic origin, and age: United States
- HUS 2019 Data Finder
- National Vital Statistics Reports
- NVSS Appendix Entry
The dataset consists of data collected from the National Vital Statistics System (NVSS) and the U.S. Census Bureau, providing a comprehensive overview of suicide death rates across different demographics in the United States from 1950 to 2001.
| Column Name | Description |
|---|---|
| INDICATOR | Indicator for the data type, e.g., Death rate |
| UNIT | Unit of measurement, e.g., Deaths per 100,000 population |
| UNIT_NU | Numerical value representing the unit |
| STUB_NA | Stub name for category, e.g., Total |
| STUB_LA | Label for the stub category, e.g., All persons |
| STUB_LA_1 | Additional label information for the stub category |
| YEAR | The year the data was recorded |
| YEAR_NUM | Numerical value representing the year |
| AGE | Age group category, e.g., All ages |
| AGE_NUM | Numerical value representing the age group |
| ESTIMATE | Estimated death rate |
This dataset presents the age-adjusted death rates for the 10 leading causes of death in the United States beginning in 1999. Data are based on information from all resident death certificates filed in the 50 states and the District of Columbia using demographic and medical characteristics. Age-adjusted death rates (per 100,000 population) are based on the 2000 U.S. standard population. Populations used for computing death rates after 2010 are postcensal estimates based on the 2010 census, estimated as of July 1, 2010. Rates for census years are based on populations enumerated in the corresponding censuses. Rates for non-census years before 2010 are revised using updated intercensal population estimates and may differ from rates previously published. Causes of death classified by the International Classification of Diseases, Tenth Revision (ICD–10) are ranked according to the number of deaths assigned to rankable causes. Cause of death statistics are based on the underlying cause of death.
SOURCES: CDC/NCHS, National Vital Statistics System, mortality data (see http://www.cdc.gov/nchs/deaths.htm); and CDC WONDER (see http://wonder.cdc.gov).
REFERENCES:
- National Center for Health Statistics. Vital statistics data available. Mortality multiple cause files. Hyattsville, MD: National Center for Health Statistics. Available from: https://www.cdc.gov/nchs/data_access/vitalstatsonline.htm.
- Murphy SL, Xu JQ, Kochanek KD, Curtin SC, and Arias E. Deaths: Final data for 2015. National vital statistics reports; vol 66. no. 6. Hyattsville, MD: National Center for Health Statistics. 2017. Available from: https://www.cdc.gov/nchs/data/nvsr/nvsr66/nvsr66_06.pdf.
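Age adjustment against a standard population is usually done by direct standardization: each age group's crude rate is weighted by that group's share of the standard population. The sketch below illustrates the arithmetic under that assumption; the two age groups, their counts, and the weights are invented and are not the actual 2000 U.S. standard population weights.

```python
def age_adjusted_rate(deaths, population, standard_weights, per=100_000):
    """Directly standardized death rate per `per` population.

    deaths, population: counts for each age group, in the same order.
    standard_weights: the standard population's share for each age group.
    """
    if not (len(deaths) == len(population) == len(standard_weights)):
        raise ValueError("all inputs must cover the same age groups")
    total_weight = sum(standard_weights)
    weighted = sum(d / p * w for d, p, w in zip(deaths, population, standard_weights))
    return per * weighted / total_weight


# Hypothetical two-group example: a younger and an older group.
rate = age_adjusted_rate(
    deaths=[10, 50],
    population=[100_000, 50_000],
    standard_weights=[0.75, 0.25],
)
print(round(rate, 2))
```

Dividing by the total weight means the weights may be given either as proportions or as raw standard-population counts.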
https://creativecommons.org/publicdomain/zero/1.0/
Description
This dataset contains the probability of dying between ages 30 and 70 from any of cardiovascular disease, cancer, diabetes, or chronic respiratory disease. (Last updated 18-12-2024)
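A probability of this kind is commonly derived with a life-table calculation over the eight five-year age groups from 30-34 to 65-69. The sketch below assumes that approach (each age-specific death rate is converted to a five-year probability of dying, and the survival probabilities are multiplied together); the dataset page does not describe its method, and the input rates are hypothetical.

```python
def prob_dying_30_to_70(rates):
    """Probability of dying between exact ages 30 and 70.

    rates: eight age-specific death rates (deaths per person-year) from the
    selected causes, for the age groups 30-34, 35-39, ..., 65-69.
    """
    if len(rates) != 8:
        raise ValueError("expected one rate for each of the 8 five-year age groups")
    survival = 1.0
    for m in rates:
        q = 5 * m / (1 + 2.5 * m)  # five-year probability of dying in this group
        survival *= 1 - q          # chance of surviving through this group
    return 1 - survival


# Hypothetical rates: 2 deaths per 1,000 person-years in every age group.
p = prob_dying_30_to_70([0.002] * 8)
print(f"{100 * p:.1f}%")
```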
Acknowledgements
Photo by Joshua Hoehne on Unsplash
By Data Exercises [source]
This dataset is a comprehensive collection of county-level cancer mortality and incidence rates in the United States between 2000 and 2014. It provides a detailed view of cancer cases, deaths, and trends at a local level. The columns include County, FIPS, age-adjusted death rate, average death rate per year, recent trend (2) in death rates, recent 5-year trend (2) in death rates, and average annual count for each county. The dataset can be used to study the patterns and effects of cancer on communities and to help inform policy decisions on mitigating risk factors or increasing preventive measures such as screenings. Because the records cover the whole United States over 15 years, they can inform decisions about individual patient care or policy development within your own community.
This dataset provides comprehensive US county-level cancer mortality and incidence rates from 2000 to 2014. It includes the mortality and incidence rate for each county, as well as whether the county met the objective of 45.5 deaths per 100,000 people. It also provides information on recent trends in death rates and average annual counts of cases over the five-year period studied.
This dataset can be useful to researchers studying trends in cancer death rates across counties. With it, researchers can examine how different counties are performing in providing treatment and prevention services for cancer patients, and whether preventive measures and healthcare access are reducing cancer mortality rates over time. The data can also inform policy makers about counties that need more targeted prevention efforts or additional resources for better healthcare access within at-risk communities.
When using this dataset, pay close attention to qualitative columns such as “Recent Trend” or “Recent 5-Year Trend (2)”. They may reveal long-term changes that are not readily apparent from quantitative variables such as the age-adjusted death rate or average deaths per year, which cover shorter periods of one year or five years respectively. Additionally, when comparing counties, take note of any differences in standard FIPS codes, which may indicate that the data was collected by a different source with a different methodology than the one used in other areas.
- Using this dataset, we can identify patterns in cancer mortality and incidence rates that are statistically significant to create treatment regimens or preventive measures specifically targeting those areas.
- This data can be useful for policymakers to target areas with elevated cancer mortality and incidence rates so they can allocate financial resources to these areas more efficiently.
- This dataset can be used to investigate which factors (such as pollution levels, access to medical care, genetic makeup) may have an influence on the cancer mortality and incidence rates in different US counties.
If you use this dataset in your research, please credit the original authors.
License: Dataset copyright by authors
- You are free to:
  - Share: copy and redistribute the material in any medium or format for any purpose, even commercially.
  - Adapt: remix, transform, and build upon the material for any purpose, even commercially.
- You must:
  - Give appropriate credit: provide a link to the license, and indicate if changes were made.
  - ShareAlike: you must distribute your contributions under the same license as the original.
  - Keep intact: all notices that refer to this license, including copyright notices.
File: death .csv | Column name | Description | |:-------------------------------------------|:-------------------------------------------------------------------...
Number and percentage of deaths, by place of death (in hospital or non-hospital), 1991 to most recent year.
This dataset of U.S. mortality trends since 1900 highlights trends in age-adjusted death rates for five selected major causes of death. Age-adjusted death rates (deaths per 100,000) after 1998 are calculated based on the 2000 U.S. standard population. Populations used for computing death rates for 2011–2017 are postcensal estimates based on the 2010 census, estimated as of July 1, 2010. Rates for census years are based on populations enumerated in the corresponding censuses. Rates for noncensus years between 2000 and 2010 are revised using updated intercensal population estimates and may differ from rates previously published. Data on age-adjusted death rates prior to 1999 are taken from historical data (see References below). Revisions to the International Classification of Diseases (ICD) over time may result in discontinuities in cause-of-death trends.
SOURCES: CDC/NCHS, National Vital Statistics System, historical data, 1900-1998 (see https://www.cdc.gov/nchs/nvss/mortality_historical_data.htm); CDC/NCHS, National Vital Statistics System, mortality data (see http://www.cdc.gov/nchs/deaths.htm); and CDC WONDER (see http://wonder.cdc.gov).
REFERENCES:
- National Center for Health Statistics, Data Warehouse. Comparability of cause-of-death between ICD revisions. 2008. Available from: http://www.cdc.gov/nchs/nvss/mortality/comparability_icd.htm.
- National Center for Health Statistics. Vital statistics data available. Mortality multiple cause files. Hyattsville, MD: National Center for Health Statistics. Available from: https://www.cdc.gov/nchs/data_access/vitalstatsonline.htm.
- Kochanek KD, Murphy SL, Xu JQ, Arias E. Deaths: Final data for 2017. National Vital Statistics Reports; vol 68 no 9. Hyattsville, MD: National Center for Health Statistics. 2019. Available from: https://www.cdc.gov/nchs/data/nvsr/nvsr68/nvsr68_09-508.pdf.
- Arias E, Xu JQ. United States life tables, 2017. National Vital Statistics Reports; vol 68 no 7. Hyattsville, MD: National Center for Health Statistics. 2019. Available from: https://www.cdc.gov/nchs/data/nvsr/nvsr68/nvsr68_07-508.pdf.
- National Center for Health Statistics. Historical Data, 1900-1998. 2009. Available from: https://www.cdc.gov/nchs/nvss/mortality_historical_data.htm.
This dataset contains counts of deaths for California as a whole based on information entered on death certificates. Final counts are derived from static data and include out-of-state deaths to California residents, whereas provisional counts are derived from incomplete and dynamic data. Provisional counts are based on the records available when the data was retrieved and may not represent all deaths that occurred during the time period. Deaths involving injuries from external or environmental forces, such as accidents, homicide and suicide, often require additional investigation that tends to delay certification of the cause and manner of death. This can result in significant under-reporting of these deaths in provisional data.
The final data tables include both deaths that occurred in California regardless of the place of residence (by occurrence) and deaths to California residents (by residence), whereas the provisional data table only includes deaths that occurred in California regardless of the place of residence (by occurrence). The data are reported as totals, as well as stratified by age, gender, race-ethnicity, and death place type. Deaths due to all causes (ALL) and selected underlying cause of death categories are provided. See temporal coverage for more information on which combinations are available for which years.
The cause of death categories are based solely on the underlying cause of death as coded by the International Classification of Diseases. The underlying cause of death is defined by the World Health Organization (WHO) as "the disease or injury which initiated the train of events leading directly to death, or the circumstances of the accident or violence which produced the fatal injury." It is a single value assigned to each death based on the details as entered on the death certificate. When more than one cause is listed, the order in which they are listed can affect which cause is coded as the underlying cause. This means that similar events could be coded with different underlying causes of death depending on variations in how they were entered. Consequently, while underlying cause of death provides a convenient comparison between cause of death categories, it may not capture the full impact of each cause of death as it does not always take into account all conditions contributing to the death.
Attribution-ShareAlike 3.0 (CC BY-SA 3.0) https://creativecommons.org/licenses/by-sa/3.0/
License information was derived automatically
Shipwrecks can cause serious, large-scale damage to nature.
Ship accidents are divided into two categories:
- peacetime shipwrecks
- wartime shipwrecks
In addition, shipwrecks often cause great loss of life. Shipwrecks lead to the following results:
- a large number of casualties
- damage to the marine ecosystem as a result of the release of oil, sulfur or other toxic and chemical substances into the sea from the wrecked ship
- large-scale economic losses
- destruction of property on land when ship accidents happen near ports
There are two datasets here:
- I collected data from Wikipedia and created the "Maritime disasters of the 20th century" dataset.
- I got the "Death probabilities male and female by age" dataset from data.world.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time series data for the statistic Probability of dying among youth ages 20-24 years (per 1,000) and country Armenia.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
This dataset explores the factors influencing life expectancy across various countries and years, aiming to uncover patterns and disparities in health outcomes based on geographic locations. By examining key features such as adult mortality, alcohol consumption, healthcare expenditures, and socioeconomic indicators, this dataset provides insights into the complex interplay of factors shaping life expectancy worldwide.
| Feature | Description |
|---|---|
| Country | Name of the country |
| Year | Year of observation |
| Status | Developed or developing status of the country |
| Life expectancy | Life expectancy at birth in years |
| Adult Mortality | Probability of dying between 15 and 60 years per 1000 |
| Infant deaths | Number of infant deaths per 1000 population |
| Alcohol | Alcohol consumption, measured as liters per capita |
| Percentage expenditure | Expenditure on health as a percentage of GDP |
| Hepatitis B | Hepatitis B immunization coverage among 1-year-olds (%) |
| Measles | Number of reported measles cases per 1000 population |
| BMI | Average Body Mass Index of the population |
| Under-five deaths | Number of deaths under age five per 1000 population |
| Polio | Polio immunization coverage among 1-year-olds (%) |
| Total expenditure | Total government health expenditure as a percentage of GDP |
| Diphtheria | Diphtheria tetanus toxoid and pertussis immunization coverage among 1-year-olds (%) |
| HIV/AIDS | Deaths per 1 000 live births due to HIV/AIDS (0-4 years) |
| GDP | Gross Domestic Product per capita (in USD) |
| Population | Population of the country |
| Thinness 1-19 years | Prevalence of thinness among children and adolescents aged 10–19 (%) |
| Thinness 5-9 years | Prevalence of thinness among children aged 5–9 (%) |
| Income composition of resources | Human Development Index in terms of income composition of resources (0 to 1) |
| Schooling | Number of years of schooling |
World Health Organization (WHO), United Nations (UN), World Bank, etc.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
This dataset was built for the task of estimating the number of deaths due to obesity in 2014 from data for 1990 to 2013.
There is also a HINT file (hint.csv), which holds data for 2015 and beyond. I have left it out of the train and test data because it has many missing values, but it may be useful for forecasting and for those who are interested in more recent data.
| Variables | Description |
|---|---|
| Country | 205 country names |
| Code | Country code, such as AFG for Afghanistan |
| Year | Year the data was collected |
| Population | Population of the country |
| Percentage-Overweight | Percentage of adults defined as overweight, BMI >= 25 (age-standardized estimate) (%). Sex: both sexes. Age group: 18+ |
| Mean-Daily-Caloric-Supply | Mean daily supply of calories among people with overweight or obesity, BMI >= 25 (age-standardized). Men only |
| Mean-BMI | Mean BMI. Age group: 18+ years. Two columns, one for male and one for female |
| Percentage-Overweighted-Male | Percentage of adults who are overweight (age-standardized). Age group: 18+ years. Two columns, one for male and one for female |
| Prevalence-Hypertension-Male | Prevalence of hypertension among adults aged 30-79 years (age-standardized). Two columns, one for male and one for female |
| Prevalence-Obesity | Prevalence of obesity among adults, BMI >= 30 (age-standardized estimate) (%). Sex: both sexes. Age group: 18+ |
| Death-By-High-BMI | Deaths from all causes attributed to high body-mass index per 100,000 people, both sexes, age-standardized |
The leading causes of death by sex and ethnicity in New York City since 2007. Cause of death is derived from the NYC death certificate, which is issued for every death that occurs in New York City.
Report last run: 09/24/2019
Note: Starting October 10th, 2025 this dataset is deprecated and is no longer being updated. As of April 27, 2023 updates changed from daily to weekly.
Summary: The cumulative number of confirmed COVID-19 deaths among Maryland residents by age: 0-9; 10-19; 20-29; 30-39; 40-49; 50-59; 60-69; 70-79; 80+; Unknown.
Description: The MD COVID-19 - Confirmed Deaths by Age Distribution data layer is a collection of the statewide confirmed COVID-19 related deaths that have been reported each day by the Vital Statistics Administration by designated age ranges. A death is classified as confirmed if the person had a laboratory-confirmed positive COVID-19 test result. Some data on deaths may be unavailable due to the time lag between the death, typically reported by a hospital or other facility, and the submission of the complete death certificate. Probable deaths are available from the MD COVID-19 - Probable Deaths by Age Distribution data layer.
Terms of Use: The Spatial Data, and the information therein, (collectively the "Data") is provided "as is" without warranty of any kind, either expressed, implied, or statutory. The user assumes the entire risk as to quality and performance of the Data. No guarantee of accuracy is granted, nor is any responsibility for reliance thereon assumed. In no event shall the State of Maryland be liable for direct, indirect, incidental, consequential or special damages of any kind. The State of Maryland does not accept liability for any damages or misrepresentation caused by inaccuracies in the Data or as a result to changes to the Data, nor is there responsibility assumed to maintain the Data in any manner or form. The Data can be freely distributed as long as the metadata entry is not modified or deleted. Any data derived from the Data must acknowledge the State of Maryland in the metadata.