By Data Exercises [source]
This dataset is a comprehensive collection of county-level cancer mortality and incidence rates in the United States between 2000 and 2014. It provides a detailed view of cancer cases, deaths, and trends at a local level. The columns include County, FIPS, age-adjusted death rate, average death rate per year, recent trend (2) in death rates, recent 5-year trend (2) in death rates, and average annual count for each county. The dataset can be used to study the patterns and effects of cancer on communities and to help inform policy decisions on mitigating risk factors or expanding preventive measures such as screenings. With records from across the United States spanning 15 years, it can support informed decisions about individual patient care or policy development within your own community.
This dataset provides comprehensive US county-level cancer mortality and incidence rates from 2000 to 2014. It includes the mortality and incidence rate for each county, as well as whether the county met the objective of 45.5 deaths per 100,000 people. It also provides information on recent trends in death rates and average annual counts of cases over the five year period studied.
This dataset can be useful to researchers studying trends in cancer death rates across counties. With it, researchers can examine how different counties are performing in providing treatment and prevention services for cancer patients, and whether preventive measures and healthcare access are reducing cancer mortality rates over time. The data can also inform policymakers about counties that need more targeted prevention efforts or additional resources for better healthcare access within at-risk communities.
When using this dataset, pay close attention to qualitative columns such as “Recent Trend” or “Recent 5-Year Trend (2)”. They may reveal long-term changes that are not readily apparent from quantitative variables such as the age-adjusted death rate or average deaths per year, which cover shorter periods of one year or five years respectively. Additionally, when comparing counties, take note of any differences in standard FIPS codes, which may indicate that the data was collected by a different source using a different methodology from that used in the other areas studied.
- Using this dataset, we can identify statistically significant patterns in cancer mortality and incidence rates and use them to design treatment regimens or preventive measures that target the affected areas.
- This data can help policymakers target areas with elevated cancer mortality and incidence rates so that they can allocate financial resources to those areas more efficiently.
- This dataset can be used to investigate which factors (such as pollution levels, access to medical care, or genetic makeup) may influence cancer mortality and incidence rates in different US counties.
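As a starting point for the kinds of analysis listed above, the sketch below flags counties whose age-adjusted death rate is at or below the objective of 45.5 deaths per 100,000 people mentioned in the description. The column names (`County`, `FIPS`, `Age-Adjusted Death Rate`) and the sample rows are assumptions made for illustration only; check them against the header of the actual CSV before use.

```python
import csv
import io

OBJECTIVE = 45.5  # deaths per 100,000, as stated in the dataset description

# Made-up sample rows standing in for the real file; column names are assumed.
SAMPLE = """County,FIPS,Age-Adjusted Death Rate
Example County A,01001,52.3
Example County B,01003,44.1
Example County C,01005,*
"""


def parse_rate(text):
    """Return the rate as a float, or None when the cell is not numeric."""
    try:
        return float(text)
    except ValueError:
        return None


def flag_counties(handle, objective=OBJECTIVE):
    """Yield (county, fips, rate, met_objective) for rows with a numeric rate."""
    for row in csv.DictReader(handle):
        rate = parse_rate(row["Age-Adjusted Death Rate"])
        if rate is None:
            continue  # skip suppressed or missing values
        yield row["County"], row["FIPS"], rate, rate <= objective


results = list(flag_counties(io.StringIO(SAMPLE)))
for county, fips, rate, met in results:
    status = "met" if met else "not met"
    print(f"{county} ({fips}): {rate:.1f} -> objective {status}")
```

FIPS codes are kept as strings here so that leading zeros are preserved.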
If you use this dataset in your research, please credit the original authors. Data Source
License: Dataset copyright by authors
You are free to:
- Share - copy and redistribute the material in any medium or format for any purpose, even commercially.
- Adapt - remix, transform, and build upon the material for any purpose, even commercially.
You must:
- Give appropriate credit - Provide a link to the license, and indicate if changes were made.
- ShareAlike - You must distribute your contributions under the same license as the original.
- Keep intact - all notices that refer to this license, including copyright notices.
File: death .csv | Column name | Description | |:-------------------------------------------|:-------------------------------------------------------------------...
Mortality Rates for Lake County, Illinois. Explanation of field attributes:
Average Age of Death – The average age at which people in the given zip code die.
Cancer Deaths – Individuals who have died of cancer as the underlying cause. This is a rate per 100,000.
Heart Disease Related Deaths – Individuals who have died of heart disease as the underlying cause. This is a rate per 100,000.
COPD Related Deaths – Individuals who have died of chronic obstructive pulmonary disease (COPD) as the underlying cause. This is a rate per 100,000.
This map service portrays the number of deaths per 100,000 people per square mile from lung and colon cancer. It displays the distribution of lung and colon cancer across the United States. Pop-ups show attributes such as state name, county name, number of colon or lung cancer deaths, and square miles per area.
Lung Cancer: Death due to malignant neoplasm of the trachea, bronchus and lung.
Colon Cancer: Death due to malignant neoplasm of the colon, rectum and anus.
This data was sourced from: Community Health Status Indicators
Number and rate of new cancer cases diagnosed annually from 1992 to the most recent diagnosis year available. Included are all invasive cancers and in situ bladder cancer with cases defined using the Surveillance, Epidemiology and End Results (SEER) Groups for Primary Site based on the World Health Organization International Classification of Diseases for Oncology, Third Edition (ICD-O-3). Random rounding of case counts to the nearest multiple of 5 is used to prevent inappropriate disclosure of health-related information.
Open Government Licence 3.0 http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Age-standardised rate of mortality from oral cancer (ICD-10 codes C00-C14) in persons of all ages and sexes per 100,000 population.
Rationale
Over the last decade in the UK (between 2003-2005 and 2012-2014), oral cancer mortality rates have increased by 20% for males and 19% for females (1). Five-year survival rates are 56%. Most oral cancers are triggered by tobacco and alcohol, which together account for 75% of cases (2). Cigarette smoking is associated with an increased risk of the more common forms of oral cancer. The risk among cigarette smokers is estimated to be 10 times that for non-smokers. More intense use of tobacco increases the risk, while ceasing to smoke for 10 years or more reduces it to almost the same as that of non-smokers (3). Oral cancer mortality rates can be used in conjunction with registration data to inform service planning as well as comparing survival rates across areas of England to assess the impact of public health prevention policies such as smoking cessation.
References:
(1) Cancer Research Campaign. Cancer Statistics: Oral – UK. London: CRC, 2000.
(2) Blot WJ, McLaughlin JK, Winn DM et al. Smoking and drinking in relation to oral and pharyngeal cancer. Cancer Res 1988; 48: 3282-7.
(3) La Vecchia C, Tavani A, Franceschi S et al. Epidemiology and prevention of oral cancer. Oral Oncology 1997; 33: 302-12.
Definition of numerator
All cancer mortality for lip, oral cavity and pharynx (ICD-10 C00-C14) in the respective calendar years aggregated into quinary age bands (0-4, 5-9,…, 85-89, 90+). This does not include secondary cancers or recurrences. Data are reported according to the calendar year in which the cancer was diagnosed.
Counts of deaths for years up to and including 2019 have been adjusted where needed to take account of the MUSE ICD-10 coding change introduced in 2020. Detailed guidance on the MUSE implementation is available at: https://www.ons.gov.uk/peoplepopulationandcommunity/birthsdeathsandmarriages/deaths/articles/causeofdeathcodinginmortalitystatisticssoftwarechanges/january2020
Counts of deaths for years up to and including 2013 have been double adjusted by applying comparability ratios from both the IRIS coding change and the MUSE coding change where needed to take account of both the MUSE ICD-10 coding change and the IRIS ICD-10 coding change introduced in 2014. The detailed guidance on the IRIS implementation is available at: https://www.ons.gov.uk/peoplepopulationandcommunity/birthsdeathsandmarriages/deaths/bulletins/impactoftheimplementationofirissoftwareforicd10causeofdeathcodingonmortalitystatisticsenglandandwales/2014-08-08
Counts of deaths for years up to and including 2010 have been triple adjusted by applying comparability ratios from the 2011 coding change, the IRIS coding change and the MUSE coding change where needed to take account of the MUSE ICD-10 coding change, the IRIS ICD-10 coding change and the ICD-10 coding change introduced in 2011. The detailed guidance on the 2011 implementation is available at https://webarchive.nationalarchives.gov.uk/ukgwa/20160108084125/http://www.ons.gov.uk/ons/guide-method/classifications/international-standard-classifications/icd-10-for-mortality/comparability-ratios/index.html
Definition of denominator
Population-years (aggregated populations for the three years) for people of all ages, aggregated into quinary age bands (0-4, 5-9, …, 85-89, 90+)
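The numerator and denominator defined above feed a directly age-standardised rate. The sketch below shows one generic way to combine age-band deaths and population-years with a set of standard-population weights. The weights and the two-band example are placeholders chosen for readability; the source does not state which standard population is used, so treat this as an illustration of the technique and not as the publisher's exact procedure.

```python
def age_standardised_rate(deaths, population_years, standard_weights, per=100_000):
    """Directly standardised rate: weighted mean of age-specific rates.

    deaths, population_years and standard_weights are parallel sequences,
    one entry per age band (e.g. 0-4, 5-9, ..., 90+).
    """
    if not (len(deaths) == len(population_years) == len(standard_weights)):
        raise ValueError("all inputs must have one entry per age band")
    total_weight = sum(standard_weights)
    weighted = sum(
        w * d / p
        for d, p, w in zip(deaths, population_years, standard_weights)
    )
    return per * weighted / total_weight


# Hypothetical two-band example (not real data).
example_rate = age_standardised_rate(
    deaths=[1, 4],
    population_years=[10_000, 20_000],
    standard_weights=[3, 1],
)
print(f"{example_rate:.1f} deaths per 100,000")
```

Where comparability ratios apply, the death counts would be multiplied by the relevant ratio before being passed in.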
Open Government Licence 3.0 http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
One-year and five-year net survival for adults (15-99) in England diagnosed with one of 29 common cancers, by age and sex.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
India IN: Mortality from CVD, Cancer, Diabetes or CRD between Exact Ages 30 and 70: Female data was reported at 19.800 NA in 2016. This records a decrease from the previous number of 20.000 NA for 2015. India IN: Mortality from CVD, Cancer, Diabetes or CRD between Exact Ages 30 and 70: Female data is updated yearly, averaging 21.200 NA from Dec 2000 (Median) to 2016, with 5 observations. The data reached an all-time high of 23.400 NA in 2000 and a record low of 19.800 NA in 2016. India IN: Mortality from CVD, Cancer, Diabetes or CRD between Exact Ages 30 and 70: Female data remains active status in CEIC and is reported by World Bank. The data is categorized under Global Database’s India – Table IN.World Bank.WDI: Health Statistics. Mortality from CVD, cancer, diabetes or CRD is the percent of 30-year-old-people who would die before their 70th birthday from any of cardiovascular disease, cancer, diabetes, or chronic respiratory disease, assuming that s/he would experience current mortality rates at every age and s/he would not die from any other cause of death (e.g., injuries or HIV/AIDS).; ; World Health Organization, Global Health Observatory Data Repository (http://apps.who.int/ghodata/).; Weighted average;
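The indicator definition above (the percent of 30-year-olds who would die before their 70th birthday from the four causes, assuming current rates and no other cause of death) can be illustrated with a standard life-table calculation. The sketch below assumes eight five-year age bands (30-34 to 65-69) and converts each band's annual cause-specific death rate into a five-year probability of dying. The input rates are hypothetical, and the function is an illustration of the general approach, not a reproduction of the published figures.

```python
def prob_dying_30_to_70(death_rates_by_band):
    """Percent of 30-year-olds dying before 70 from the selected causes.

    death_rates_by_band: eight annual death rates (deaths per person-year)
    for the age bands 30-34, 35-39, ..., 65-69, counting only the four
    causes of interest.
    """
    if len(death_rates_by_band) != 8:
        raise ValueError("expected eight five-year age bands (30-34 to 65-69)")
    survival = 1.0
    for m in death_rates_by_band:
        # Five-year probability of dying in the band, from the annual rate.
        q = 5 * m / (1 + 2.5 * m)
        survival *= 1 - q
    return 100 * (1 - survival)


# Hypothetical flat rate of 4 deaths per 1,000 person-years in every band.
example_pct = prob_dying_30_to_70([0.004] * 8)
print(f"{example_pct:.1f}%")
```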
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
What is Lung Cancer Dataset?
The effectiveness of the cancer prediction system helps people to know their cancer risk at a low cost and it also helps the people to take the appropriate decision based on their cancer risk status. The data is collected from the website online lung cancer prediction system.
Acknowledgments
When we use this dataset in our research, we credit the authors as :
License : CC BY 4.0.
Hong, Z.Q. and Yang, J.Y. "Optimal Discriminant Plane for a Small Number of Samples and Design Method of Classifier on the Plane", Pattern Recognition, Vol. 24, No. 4, pp. 317-324, 1991. It is published for reuse in the Google research dataset.
The main reason for uploading this dataset is to practice data analysis with my students. I work at a college and want my students to apply what we study to a large dataset. It may not be up to date, and I mention the collection years, but it is a good resource for practice.
https://digital.nhs.uk/about-nhs-digital/terms-and-conditions
Rapid Cancer Registration Data (RCRD) provides a quick, indicative source of cancer data. It is provided to support the planning and provision of cancer services. The data is based on a rapid processing of cancer registration data sources, in particular on Cancer Outcomes and Services Dataset (COSD) information. In comparison, National Cancer Registration Data (NCRD) relies on additional data sources, enhanced follow-up with trusts and expert processing by cancer registration officers. The Rapid Cancer Registration Data (RCRD) may be useful for service improvement projects including healthcare planning and prioritisation. However, it is poorly suited for epidemiological research due to limitations in the data quality and completeness.
Age standardized rate of cancer incidence, by selected sites of cancer and sex, three-year average, census metropolitan areas.
https://spdx.org/licenses/CC0-1.0.html
Abstract
In Italy, approximately 400,000 new cases of malignant tumors are recorded every year. The average of annual deaths caused by tumors, according to the Italian Cancer Registers, is about 3.5 deaths and about 2.5 per 1,000 men and women respectively, for a total of about 3 deaths every 1,000 people. Long-term (at least a decade) and spatially detailed data (up to the municipality scale) are neither easily accessible nor fully available for public consultation by the citizens, scientists, research groups, and associations. Therefore, here we present a ten-year (2009–2018) database on cancer mortality rates (in the form of Standardized Mortality Ratios, SMR) for 23 cancer macro-types in Italy on municipal, provincial, and regional scales. We aim to make easily accessible a comprehensive, ready-to-use, and openly accessible source of data on the most updated status of cancer mortality in Italy for local and national stakeholders, researchers, and policymakers and to provide researchers with ready-to-use data to perform specific studies.
Methods
For a given locality, year, and cause of death, the SMR is the ratio between the observed number of deaths ($O_m$) and the number of expected deaths ($E_m$):
$$\mathrm{SMR} = \frac{O_m}{E_m} \qquad (1)$$
where $O_m$ should be an available observational data and $E_m$ is estimated as the weighted sum of age-specific population size for the given locality ($n_i$) per age-specific death rates of the reference population ($MR_i$):
$$E_m = \sum_i MR_i \, n_i \qquad (2)$$
$MR_i$ could be provided by a public health organization or be estimated as the ratio between the age-specific number of deaths of reference population ($M_i$) to the age-specific reference population size ($N_i$):
$$MR_i = \frac{M_i}{N_i} \qquad (3)$$
Thus, the value of $E_m$ is weighted by the age distribution of deaths and population size. SMR assumes value 1 when the number of observed and expected deaths are equal.
Following eqns. (1-3), the SMR was computed for single years of the period 2009-2018 and for single cause of death as defined by the International ICD-10 classification system by using the following data:
- age-specific number of deaths by cause of reference population (i.e., $M_i$) from the Italian National Institute of Statistics (ISTAT, http://www.istat.it/en/, last access: 26/01/2022);
- age-specific census data on reference population (i.e., $N_i$) from ISTAT;
- the observed number of deaths by cause (i.e., $O_m$) from ISTAT;
- the age-specific census data on population ($n_i$).
The SMR was estimated at three different levels of aggregation: municipal, provincial (equivalent to the European classification NUTS 3) and regional (i.e., NUTS2). The SMR was also computed for the broad category of malignant tumors (i.e. C00-C979, hereinafter cancer macro-type C), and for the broad category of malignant tumor plus non-malignant tumors (i.e. C00-C979 plus D0-D489, hereinafter cancer macro-type CD). Lower 90% and 95% confidence intervals of 10-year average values were computed according to the Byar method.
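Equations (1) to (3) translate directly into a few lines of code. The sketch below computes the expected deaths, the SMR, and a confidence interval using Byar's approximation to the Poisson limits. The function names and the two-age-band inputs are hypothetical, and the interval code is a generic textbook form of Byar's method, not necessarily the exact procedure the authors applied to their 10-year averages.

```python
import math


def expected_deaths(reference_deaths, reference_population, local_population):
    """E_m = sum_i (M_i / N_i) * n_i, following equations (2) and (3)."""
    return sum(
        (M / N) * n
        for M, N, n in zip(reference_deaths, reference_population, local_population)
    )


def smr(observed, expected):
    """SMR = O_m / E_m, equation (1)."""
    return observed / expected


def byar_interval(observed, expected, z=1.96):
    """Approximate confidence limits for the SMR (z=1.96 for 95%, 1.645 for 90%)."""
    if observed > 0:
        lower = observed * (
            1 - 1 / (9 * observed) - z / (3 * math.sqrt(observed))
        ) ** 3
    else:
        lower = 0.0
    up = observed + 1
    upper = up * (1 - 1 / (9 * up) + z / (3 * math.sqrt(up))) ** 3
    return lower / expected, upper / expected


# Hypothetical two-age-band example (not real data).
expected = expected_deaths(
    reference_deaths=[10, 40],
    reference_population=[100_000, 100_000],
    local_population=[10_000, 10_000],
)
ratio = smr(observed=10, expected=expected)
lower, upper = byar_interval(observed=10, expected=expected)
print(f"E_m={expected:.2f}  SMR={ratio:.2f}  95% CI=({lower:.2f}, {upper:.2f})")
```

An SMR above 1 with a lower limit also above 1 would suggest more deaths than expected from the reference rates.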
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
In the shadows of the Covid-19 pandemic, there is another global health crisis that has gone largely unnoticed. This is the Noncommunicable Disease (NCD) pandemic.
The WHO website describes NCDs as follows:
Noncommunicable diseases (NCDs), also known as chronic diseases, tend to be of long duration and are the result of a combination of genetic, physiological, environmental and behavioural factors.
The main types of NCDs are cardiovascular diseases (like heart attacks and stroke), cancers, chronic respiratory diseases (such as chronic obstructive pulmonary disease and asthma) and diabetes.
NCDs disproportionately affect people in low- and middle-income countries where more than three quarters of global NCD deaths – 32 million – occur.
- Noncommunicable diseases (NCDs) kill 41 million people each year, equivalent to 71% of all deaths globally.
- Each year, 15 million people die from a NCD between the ages of 30 and 69 years; over 85% of these "premature" deaths occur in low- and middle-income countries.
- Cardiovascular diseases account for most NCD deaths, or 17.9 million people annually, followed by cancers (9.0 million), respiratory diseases (3.9 million), and diabetes (1.6 million).
- These 4 groups of diseases account for over 80% of all premature NCD deaths.
- Tobacco use, physical inactivity, the harmful use of alcohol and unhealthy diets all increase the risk of dying from a NCD.
- Detection, screening and treatment of NCDs, as well as palliative care, are key components of the response to NCDs.
This data repository consists of 3 CSV files: WHO-cause-of-death-by-NCD.csv is the main dataset, which provides the percentage of deaths caused by NCDs out of all causes of death, for each nation globally. Metadata_Country.csv and Metadata_Indicator.csv provide additional metadata which is helpful for interpreting the main CSV.
The data collected spans a period from 2000 to 2016. The main CSV has columns for every year from 1960 to 2019. It is advisable to drop all redundant columns where no data was collected.
Furthermore, it is advisable to merge Metadata_Country.csv with the main CSV as it provides valuable additional information, particularly on the economic situation of each nation.
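The two clean-up steps recommended above, dropping year columns that contain no data and merging in the country metadata, can be sketched with the standard library alone. The column names (`Country Name`, `Country Code`, `Region`, `IncomeGroup`) and the inline sample rows are assumptions for illustration; the real files may use different headers, so verify them first.

```python
import csv
import io

# Made-up stand-ins for WHO-cause-of-death-by-NCD.csv and Metadata_Country.csv.
MAIN = """Country Name,Country Code,1960,2000,2016,2019
Exampleland,EXA,,60.1,65.3,
Samplestan,SAM,,40.2,45.0,
"""
META = """Country Code,Region,IncomeGroup
EXA,Example Region,High income
SAM,Sample Region,Low income
"""


def load(text):
    """Parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def drop_empty_columns(rows):
    """Keep only columns that hold a non-blank value in at least one row."""
    keep = [col for col in rows[0] if any(r[col].strip() for r in rows)]
    return [{col: r[col] for col in keep} for r in rows]


def merge_on(rows, meta, key="Country Code"):
    """Left-join metadata onto the main rows using the shared key column."""
    lookup = {m[key]: m for m in meta}
    return [{**r, **lookup.get(r[key], {})} for r in rows]


merged = merge_on(drop_empty_columns(load(MAIN)), load(META))
for row in merged:
    print(row)
```

With pandas available, the same idea is usually expressed with `dropna(axis=1, how="all")` followed by `merge`.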
This dataset has been extracted from The World Bank 'Cause of death, by non-communicable diseases (% of total)' Dataset, derived based on the data from WHO's Global Health Estimates. It is freely provided under a Creative Commons Attribution 4.0 International License (CC BY 4.0), with the additional terms as stated on the World Bank website: World Bank Terms of Use for Datasets.
I would be interested to see some good data wrangling (dropping redundant columns), as well as kernels interpreting the additional information in the 'SpecialNotes' column in Metadata_Country.csv.
It would also be great to see what different factors influence NCDs: most of all, the geopolitical factors. Would be great to see some choropleth visualisations to get an idea of which regions are most affected by NCDs.
The Haberman data set is the result of research conducted between 1958 and 1970 to examine whether a patient survived for less than five years, or for five years or longer, after an operation. The study was held at the University of Chicago's Billings Hospital.
It contains three features and two classes:
age - Age of patient
year - Year of operation
nodes - Positive lymph nodes
class 1 - patient survived five years or longer, class 2 - patient survived less than five years
All columns are numerical data.
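Because every column is numeric, the file is easy to parse by hand. The sketch below reads rows in the order age, year, nodes, class and reports the share of patients in class 1. The four rows are invented for illustration and are not taken from the dataset; the column order is also an assumption.

```python
from collections import Counter

# Invented rows in the assumed order: age, year of operation, nodes, class.
SAMPLE_ROWS = [
    "41,60,0,1",
    "55,63,7,2",
    "48,66,2,1",
    "62,59,12,2",
]


def parse(line):
    """Turn one comma-separated line into a dict of integer fields."""
    age, year, nodes, label = (int(x) for x in line.split(","))
    return {"age": age, "year": year, "nodes": nodes, "class": label}


records = [parse(line) for line in SAMPLE_ROWS]
counts = Counter(r["class"] for r in records)
share_class_1 = counts[1] / len(records)
print(f"class counts: {dict(counts)}; share in class 1: {share_class_1:.0%}")
```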
SUMMARY
This analysis, designed and executed by Ribble Rivers Trust, identifies areas across England with the greatest levels of cancer (in persons of all ages). Please read the below information to gain a full understanding of what the data shows and how it should be interpreted.
ANALYSIS METHODOLOGY
The analysis was carried out using Quality and Outcomes Framework (QOF) data, derived from NHS Digital, relating to cancer (in persons of all ages).
This information was recorded at the GP practice level. However, GP catchment areas are not mutually exclusive: they overlap, with some areas covered by 30+ GP practices. Therefore, to increase the clarity and usability of the data, the GP-level statistics were converted into statistics based on Middle Layer Super Output Area (MSOA) census boundaries.
The percentage of each MSOA’s population (all ages) with cancer was estimated. This was achieved by calculating a weighted average based on:
- The percentage of the MSOA area that was covered by each GP practice’s catchment area
- Of the GPs that covered part of that MSOA: the percentage of registered patients that have that illness
The estimated percentage of each MSOA’s population with cancer was then combined with Office for National Statistics Mid-Year Population Estimates (2019) data for MSOAs, to estimate the number of people in each MSOA with cancer, within the relevant age range.
Each MSOA was assigned a relative score between 1 and 0 (1 = worst, 0 = best) based on:
A) the PERCENTAGE of the population within that MSOA who are estimated to have cancer
B) the NUMBER of people within that MSOA who are estimated to have cancer
An average of scores A & B was taken, and converted to a relative score between 1 and 0 (1 = worst, 0 = best). The closer to 1 the score, the greater both the number and percentage of the population in the MSOA that are estimated to have cancer, compared to other MSOAs.
In other words, those are areas where it’s estimated a large number of people suffer from cancer, and where those people make up a large percentage of the population, indicating there is a real issue with cancer within the population and the investment of resources to address that issue could have the greatest benefits.
LIMITATIONS
1. GP data for the financial year 1st April 2018 – 31st March 2019 was used in preference to data for the financial year 1st April 2019 – 31st March 2020, as the onset of the COVID19 pandemic during the latter year could have affected the reporting of medical statistics by GPs. However, for 53 GPs (out of 7670) that did not submit data in 2018/19, data from 2019/20 was used instead. Note also that some GPs (997 out of 7670) did not submit data in either year. This dataset should be viewed in conjunction with the ‘Health and wellbeing statistics (GP-level, England): Missing data and potential outliers’ dataset, to determine areas where data from 2019/20 was used, where one or more GPs did not submit data in either year, or where there were large discrepancies between the 2018/19 and 2019/20 data (differences in statistics that were > mean +/- 1 St.Dev.), which suggests erroneous data in one of those years (it was not feasible for this study to investigate this further), and thus where data should be interpreted with caution. Note also that there are some rural areas (with little or no population) that do not officially fall into any GP catchment area (although this will not affect the results of this analysis if there are no people living in those areas).
2. Although all of the obesity/inactivity-related illnesses listed can be caused or exacerbated by inactivity and obesity, it was not possible to distinguish from the data the cause of the illnesses in patients: obesity and inactivity are highly unlikely to be the cause of all cases of each illness.
By combining the data with data relating to levels of obesity and inactivity in adults and children (see the ‘Levels of obesity, inactivity and associated illnesses: Summary (England)’ dataset), we can identify where obesity/inactivity could be a contributing factor, and where interventions to reduce obesity and increase activity could be most beneficial for the health of the local population.
3. It was not feasible to incorporate ultra-fine-scale geographic distribution of populations that are registered with each GP practice or who live within each MSOA. Populations might be concentrated in certain areas of a GP practice’s catchment area or MSOA and relatively sparse in other areas. Therefore, the dataset should be used to identify general areas where there are high levels of cancer, rather than interpreting the boundaries between areas as ‘hard’ boundaries that mark definite divisions between areas with differing levels of cancer.
TO BE VIEWED IN COMBINATION WITH:
This dataset should be viewed alongside the following datasets, which highlight areas of missing data and potential outliers in the data:
- Health and wellbeing statistics (GP-level, England): Missing data and potential outliers
- Levels of obesity, inactivity and associated illnesses (England): Missing data
DOWNLOADING THIS DATA
To access this data on your desktop GIS, download the ‘Levels of obesity, inactivity and associated illnesses: Summary (England)’ dataset.
DATA SOURCES
This dataset was produced using:
Quality and Outcomes Framework data: Copyright © 2020, Health and Social Care Information Centre. The Health and Social Care Information Centre is a non-departmental body created by statute, also known as NHS Digital.
GP Catchment Outlines. Copyright © 2020, Health and Social Care Information Centre. The Health and Social Care Information Centre is a non-departmental body created by statute, also known as NHS Digital.
Data was cleaned by Ribble Rivers Trust before use.
MSOA boundaries: © Office for National Statistics licensed under the Open Government Licence v3.0. Contains OS data © Crown copyright and database right 2021.
Population data: Mid-2019 (June 30) Population Estimates for Middle Layer Super Output Areas in England and Wales. © Office for National Statistics licensed under the Open Government Licence v3.0. © Crown Copyright 2020.
COPYRIGHT NOTICE
The reproduction of this data must be accompanied by the following statement:
© Ribble Rivers Trust 2021. Analysis carried out using data that is: Copyright © 2020, Health and Social Care Information Centre. The Health and Social Care Information Centre is a non-departmental body created by statute, also known as NHS Digital; © Office for National Statistics licensed under the Open Government Licence v3.0. Contains OS data © Crown copyright and database right 2021. © Crown Copyright 2020.
CaBA HEALTH & WELLBEING EVIDENCE BASE
This dataset forms part of the wider CaBA Health and Wellbeing Evidence Base.
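The scoring steps in the analysis methodology for this dataset can be sketched as follows. The description does not say how the weighted average is normalised or how values are converted to a 0 to 1 score, so this sketch assumes coverage weights are divided by their total and that scores use min-max scaling. All function names and numbers are hypothetical.

```python
def weighted_prevalence(coverage_pct, practice_prevalence_pct):
    """Weighted average of GP-level prevalence, weighted by MSOA area coverage."""
    total = sum(coverage_pct)
    return sum(c * p for c, p in zip(coverage_pct, practice_prevalence_pct)) / total


def min_max(values):
    """Rescale values to the range 0 (lowest) to 1 (highest)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def msoa_scores(prevalence_pct, population):
    """Combine score A (percentage) and score B (number) into one relative score."""
    counts = [p / 100 * n for p, n in zip(prevalence_pct, population)]
    score_a = min_max(prevalence_pct)
    score_b = min_max(counts)
    averaged = [(a + b) / 2 for a, b in zip(score_a, score_b)]
    return min_max(averaged)


# Hypothetical inputs: one MSOA covered half-and-half by two practices,
# then three MSOAs with estimated prevalence and mid-year population.
example_prevalence = weighted_prevalence([50, 50], [2.0, 4.0])
scores = msoa_scores([2.0, 3.0, 4.0], [8000, 6000, 10000])
print(example_prevalence, [round(s, 3) for s in scores])
```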
Death rate has been age-adjusted by the 2000 U.S. standard population. Single-year data are only available for Los Angeles County overall, Service Planning Areas, Supervisorial Districts, City of Los Angeles overall, and City of Los Angeles Council Districts.
Lung cancer is a leading cause of cancer-related death in the US. People who smoke have the greatest risk of lung cancer, though lung cancer can also occur in people who have never smoked. Most cases are due to long-term tobacco smoking or exposure to secondhand tobacco smoke. Cities and communities can take an active role in curbing tobacco use and reducing lung cancer by adopting policies to regulate tobacco retail; reducing exposure to secondhand smoke in outdoor public spaces, such as parks, restaurants, or in multi-unit housing; and improving access to tobacco cessation programs and other preventive services.
For more information about the Community Health Profiles Data Initiative, please see the initiative homepage.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Nigeria NG: Mortality from CVD, Cancer, Diabetes or CRD between Exact Ages 30 and 70: Male data was reported at 20.900 NA in 2016. This records an increase from the previous number of 20.800 NA for 2015. Nigeria NG: Mortality from CVD, Cancer, Diabetes or CRD between Exact Ages 30 and 70: Male data is updated yearly, averaging 21.000 NA from Dec 2000 (Median) to 2016, with 5 observations. The data reached an all-time high of 22.600 NA in 2000 and a record low of 20.800 NA in 2015. Nigeria NG: Mortality from CVD, Cancer, Diabetes or CRD between Exact Ages 30 and 70: Male data remains active status in CEIC and is reported by World Bank. The data is categorized under Global Database’s Nigeria – Table NG.World Bank.WDI: Health Statistics. Mortality from CVD, cancer, diabetes or CRD is the percent of 30-year-old-people who would die before their 70th birthday from any of cardiovascular disease, cancer, diabetes, or chronic respiratory disease, assuming that s/he would experience current mortality rates at every age and s/he would not die from any other cause of death (e.g., injuries or HIV/AIDS).; ; World Health Organization, Global Health Observatory Data Repository (http://apps.who.int/ghodata/).; Weighted average;
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0) https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
The dataset consists of CT brain scans with cancer, tumor, and aneurysm. Each scan represents a detailed image of a patient's brain taken using CT (Computed Tomography). The data are presented in 2 different formats: .jpg and .dcm.
The dataset of CT brain scans is valuable for research in neurology, radiology, and oncology. It allows the development and evaluation of computer-based algorithms, machine learning models, and deep learning techniques for automated detection, diagnosis, and classification of these conditions.
keywords: aneurysm, cancer detection, cancer segmentation, tumor, computed tomography, head, skull, brain scan, eye sockets, sinuses, medical imaging, radiology dataset, neurology dataset, oncology dataset, image dataset, abnormalities detection, brain anatomy, health, brain formations, imaging procedure, x-rays measurements, machine learning, computer vision, deep learning
Attribution 3.0 (CC BY 3.0) https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
This dataset presents the footprint of cancer mortality statistics in Australia for all cancers combined and the 6 top cancer groupings (colorectal, leukaemia, lung, lymphoma, melanoma of the skin and pancreas) and their respective ICD-10 codes. The data spans the years 2009-2013 and is aggregated to Greater Capital City Statistical Areas (GCCSA) from the 2011 Australian Statistical Geography Standard (ASGS).
Mortality data refer to the number of deaths due to cancer in a given time period. Cancer deaths data are sourced from the Australian Institute of Health and Welfare (AIHW) 2013 National Mortality Database (NMD).
For further information about this dataset, please visit:
Please note:
AURIN has spatially enabled the original data.
Due to changes in geographic classifications over time, long-term trends are not available.
Values assigned to "n.p." in the original data have been removed from the data.
The Australian and jurisdictional totals include people who could not be assigned a GCCSA. The number of people who could not be assigned a GCCSA is less than 1% of the total.
The Australian total also includes residents of Other Territories (Cocos (Keeling) Islands, Christmas Island and Jervis Bay Territory).
Cause of Death Unit Record File data are provided to the AIHW by the Registries of Births, Deaths and Marriages and the National Coronial Information System (managed by the Victorian Department of Justice) and include cause of death coded by the Australian Bureau of Statistics (ABS). The data are maintained by the AIHW in the NMD.
Year refers to year of occurrence of death for years up to and including 2012, and year of registration of death for 2013. Deaths registered in 2011 and earlier are based on the final version of cause of death data; deaths registered in 2012 and 2013 are based on revised and preliminary versions, respectively and are subject to further revision by the ABS.
Cause of death information is based on underlying cause of death and is classified according to the International Classification of Diseases and Related Health Problems (ICD). Deaths registered from 1997 onwards are classified according to the 10th revision (ICD-10).
Colorectal deaths presented are underestimates. For further information, refer to "Complexities in the measurement of bowel cancer in Australia" in Causes of Death, Australia (ABS cat. no. 3303.0).
Note: DPH is updating and streamlining the COVID-19 cases, deaths, and testing data. As of 6/27/2022, the data will be published in four tables instead of twelve.
The COVID-19 Cases, Deaths, and Tests by Day dataset contains cases and test data by date of sample submission. The death data are by date of death. This dataset is updated daily and contains information back to the beginning of the pandemic. The data can be found at https://data.ct.gov/Health-and-Human-Services/COVID-19-Cases-Deaths-and-Tests-by-Day/g9vi-2ahj.
The COVID-19 State Metrics dataset contains over 93 columns of data. This dataset is updated daily and currently contains information starting June 21, 2022 to the present. The data can be found at https://data.ct.gov/Health-and-Human-Services/COVID-19-State-Level-Data/qmgw-5kp6 .
The COVID-19 County Metrics dataset contains 25 columns of data. This dataset is updated daily and currently contains information starting June 16, 2022 to the present. The data can be found at https://data.ct.gov/Health-and-Human-Services/COVID-19-County-Level-Data/ujiq-dy22 .
The COVID-19 Town Metrics dataset contains 16 columns of data. This dataset is updated daily and currently contains information starting June 16, 2022 to the present. The data can be found at https://data.ct.gov/Health-and-Human-Services/COVID-19-Town-Level-Data/icxw-cada . To protect confidentiality, if a town has fewer than 5 cases or positive NAAT tests over the past 7 days, those data will be suppressed.
This dataset contains COVID-19 cases and associated deaths that have been reported among Connecticut residents, broken down by race and ethnicity. All data in this report are preliminary; data for previous dates will be updated as new reports are received and data errors are corrected. Deaths reported to either the Office of the Chief Medical Examiner (OCME) or the Department of Public Health (DPH) are included in the COVID-19 update.
The following data show the number of COVID-19 cases and associated deaths per 100,000 population by race and ethnicity. Crude rates represent the total cases or deaths per 100,000 people. Age-adjusted rates consider the age of the person at diagnosis or death when estimating the rate and use a standardized population to provide a fair comparison between population groups with different age distributions. Age adjustment is important in Connecticut as the median age among the non-Hispanic white population is 47 years, whereas it is 34 years among non-Hispanic blacks, and 29 years among Hispanics. Because most non-Hispanic white residents who died were over 75 years of age, the age-adjusted rates are lower than the unadjusted rates. In contrast, Hispanic residents who died tend to be younger than 75 years of age, which results in higher age-adjusted rates. The population data used to calculate rates are based on the CT DPH population statistics for 2019, which are available online here: https://portal.ct.gov/DPH/Health-Information-Systems--Reporting/Population/Population-Statistics. Prior to 5/10/2021, the population estimates from 2018 were used. Rates are standardized to the 2000 US Millions Standard population (data available here: https://seer.cancer.gov/stdpopulations/). Standardization was done using 19 age groups (0, 1-4, 5-9, 10-14, ..., 80-84, 85 years and older). More information about direct standardization for age adjustment is available here: https://www.cdc.gov/nchs/data/statnt/statnt06rv.pdf Categories are mutually exclusive. The category “multiracial” includes people who answered ‘yes’ to more than one race category. Counts may not add up to total case counts as data on race and ethnicity may be missing. Age-adjusted rates are calculated only for groups with more than 20 deaths. Abbreviation: NH=Non-Hispanic. Data on Connecticut deaths were obtained from the Connecticut Deaths Registry maintained by the DPH Office of Vital Records.
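The difference between crude and age-adjusted rates described above can be illustrated with a minimal sketch of direct standardization. The three age groups, counts, and standard population below are hypothetical values chosen for illustration only; they are not Connecticut data and not the 19-group 2000 US standard population, and the function names are assumptions rather than anything used by DPH.

```python
def crude_rate(deaths, population, per=100_000):
    """Total deaths divided by total population, scaled to `per` people."""
    return sum(deaths) / sum(population) * per


def age_adjusted_rate(deaths, population, standard_population, per=100_000):
    """Direct standardization: weight each age-specific rate by the
    share of the standard population that falls in that age group."""
    total_standard = sum(standard_population)
    adjusted = 0.0
    for d, n, s in zip(deaths, population, standard_population):
        adjusted += (d / n) * (s / total_standard)
    return adjusted * per


# Hypothetical groups (young, middle, old); NOT real data.
standard = [500, 300, 200]  # illustrative standard population

older_deaths, older_pop = [1, 5, 60], [1_000, 3_000, 6_000]
younger_deaths, younger_pop = [6, 15, 10], [6_000, 3_000, 1_000]

older_crude = crude_rate(older_deaths, older_pop)
older_adjusted = age_adjusted_rate(older_deaths, older_pop, standard)
younger_crude = crude_rate(younger_deaths, younger_pop)
younger_adjusted = age_adjusted_rate(younger_deaths, younger_pop, standard)

print(f"older population:   crude {older_crude:.0f}, adjusted {older_adjusted:.0f}")
print(f"younger population: crude {younger_crude:.0f}, adjusted {younger_adjusted:.0f}")
```

In this toy setup the older population's adjusted rate falls below its crude rate, while the younger population's adjusted rate rises above its crude rate, which is the same direction of effect the description reports.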
Cause of death was determined by a death certifier (e.g., physician, APRN, medical
https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/
Characteristic | Value (N = 26254) |
---|---|
Age (years) | Mean ± SD: 61.4 ± 5; Median (IQR): 60 (57-65); Range: 43-75 |
Sex | Male: 15512 (59%) Female: 10742 (41%) |
Race | White: 23969 (91.3%) |
Ethnicity | Not Available |
Background: The aggressive and heterogeneous nature of lung cancer has thwarted efforts to reduce mortality from this cancer through the use of screening. The advent of low-dose helical computed tomography (CT) altered the landscape of lung-cancer screening, with studies indicating that low-dose CT detects many tumors at early stages. The National Lung Screening Trial (NLST) was conducted to determine whether screening with low-dose CT could reduce mortality from lung cancer.
Methods: From August 2002 through April 2004, we enrolled 53,454 persons at high risk for lung cancer at 33 U.S. medical centers. Participants were randomly assigned to undergo three annual screenings with either low-dose CT (26,722 participants) or single-view posteroanterior chest radiography (26,732). Data were collected on cases of lung cancer and deaths from lung cancer that occurred through December 31, 2009. This dataset includes the low-dose CT scans from 26,254 of these subjects, as well as digitized histopathology images from 451 subjects.
Results: The rate of adherence to screening was more than 90%. The rate of positive screening tests was 24.2% with low-dose CT and 6.9% with radiography over all three rounds. A total of 96.4% of the positive screening results in the low-dose CT group and 94.5% in the radiography group were false positive results. The incidence of lung cancer was 645 cases per 100,000 person-years (1060 cancers) in the low-dose CT group, as compared with 572 cases per 100,000 person-years (941 cancers) in the radiography group (rate ratio, 1.13; 95% confidence interval [CI], 1.03 to 1.23). There were 247 deaths from lung cancer per 100,000 person-years in the low-dose CT group and 309 deaths per 100,000 person-years in the radiography group, representing a relative reduction in mortality from lung cancer with low-dose CT screening of 20.0% (95% CI, 6.8 to 26.7; P=0.004). The rate of death from any cause was reduced in the low-dose CT group, as compared with the radiography group, by 6.7% (95% CI, 1.2 to 13.6; P=0.02).
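As a rough check on how the headline figures relate to each other, the sketch below recomputes the incidence rate ratio and the relative mortality reduction from the rounded per-100,000 rates quoted above. The function names are illustrative assumptions. Because the inputs are already rounded, the recomputed values can differ from the published ones in the last digit, and the confidence intervals are not reproduced here.

```python
def rate_ratio(rate_a, rate_b):
    """Ratio of two event rates expressed in the same units."""
    return rate_a / rate_b


def relative_reduction(rate_intervention, rate_comparison):
    """Proportional reduction of the intervention rate relative to the comparison rate."""
    return 1 - rate_intervention / rate_comparison


# Rounded rates per 100,000 person-years, as quoted in the Results paragraph.
incidence_ct, incidence_radiography = 645, 572
mortality_ct, mortality_radiography = 247, 309

incidence_rr = rate_ratio(incidence_ct, incidence_radiography)
mortality_rrr = relative_reduction(mortality_ct, mortality_radiography)

print(f"incidence rate ratio: {incidence_rr:.2f}")
print(f"relative reduction in lung-cancer mortality: {mortality_rrr:.1%}")
```

The rate ratio comes out at about 1.13 and the relative reduction at roughly 20%, consistent with the reported results.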
Conclusions: Screening with the use of low-dose CT reduces mortality from lung cancer. (Funded by the National Cancer Institute; National Lung Screening Trial ClinicalTrials.gov number, NCT00047385).
Data Availability: A summary of the National Lung Screening Trial and its available datasets is provided on the Cancer Data Access System (CDAS). CDAS is maintained by Information Management System (IMS), contracted by the National Cancer Institute (NCI) as keepers and statistical analyzers of the NLST trial data. The full clinical data set from NLST is available through CDAS. Users of TCIA can download without restriction a publicly distributable subset of that clinical data, along with the CT and histopathology images collected during the trial. (These were previously restricted.)
By Data Exercises [source]
This dataset is a comprehensive collection of county-level cancer mortality and incidence rates in the United States between 2000 and 2014. It provides a detailed view of cancer cases, deaths, and trends at a local level. The columns include County, FIPS, age-adjusted death rate, average death rate per year, recent trend (2) in death rates, recent 5-year trend (2) in death rates, and average annual count for each county. This dataset can be used to gain insight into the patterns and effects of cancer on communities and to help inform policy decisions related to mitigating risk factors or increasing preventive measures such as screenings. With records from across the United States spanning 15 years, it can support informed decisions about patient care or policy development within your own community.
This dataset provides comprehensive US county-level cancer mortality and incidence rates from 2000 to 2014. It includes the mortality and incidence rate for each county, as well as whether the county met the objective of 45.5 deaths per 100,000 people. It also provides information on recent trends in death rates and average annual counts of cases over the five-year period studied.
This dataset can be useful to researchers studying trends in cancer death rates across counties. Using this data, researchers can examine how different counties are performing in providing treatment and prevention services for cancer patients, and whether preventive measures and healthcare access are reducing cancer mortality rates over time. The data can also be used to inform policy makers about counties that need more targeted prevention efforts or additional resources for better healthcare access within at-risk communities.
When using this dataset, pay close attention to qualitative columns such as “Recent Trend” or “Recent 5-Year Trend (2)”, which may reveal long-term changes that are not readily apparent from quantitative variables such as the age-adjusted death rate or average deaths per year measured over shorter periods of one or five years. Additionally, when comparing counties, take note of any differences in standard FIPS codes, which may indicate that the data were collected by a different source using a different methodology than in other areas.
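A minimal sketch of two checks suggested by the description: normalizing FIPS codes before comparing counties, and flagging whether a county met the objective of 45.5 deaths per 100,000. The field names (`county`, `fips`, `rate`), the sample rows, and the choice to treat "met" as "at or below the objective" are all assumptions made for illustration; the actual column names in the CSV may differ.

```python
OBJECTIVE = 45.5  # deaths per 100,000, the objective cited in the description


def normalize_fips(value):
    """Return a county FIPS code as a zero-padded 5-character string.

    FIPS codes read as integers lose their leading zeros (1001 vs "01001"),
    which can make identical counties look different when joining files.
    """
    return str(int(value)).zfill(5)


def met_objective(age_adjusted_death_rate, objective=OBJECTIVE):
    """Assumed rule: the objective is met when the rate is at or below it."""
    return age_adjusted_death_rate <= objective


# Hypothetical rows; the rates are invented and are NOT values from the dataset.
rows = [
    {"county": "County A", "fips": 1001, "rate": 52.3},
    {"county": "County B", "fips": "06037", "rate": 41.0},
]

flagged = [
    {**row, "fips": normalize_fips(row["fips"]), "met": met_objective(row["rate"])}
    for row in rows
]

for row in flagged:
    print(row["fips"], row["county"], row["rate"], "met" if row["met"] else "not met")
```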
- Using this dataset, we can identify statistically significant patterns in cancer mortality and incidence rates and use them to design treatment regimens or preventive measures that specifically target the affected areas.
- This data can help policymakers target areas with elevated cancer mortality and incidence rates so that financial resources can be allocated to those areas more efficiently.
- This dataset can be used to investigate which factors (such as pollution levels, access to medical care, or genetic makeup) may influence cancer mortality and incidence rates in different US counties.
If you use this dataset in your research, please credit the original authors. Data Source
License: Dataset copyright by authors
- You are free to:
  - Share - copy and redistribute the material in any medium or format for any purpose, even commercially.
  - Adapt - remix, transform, and build upon the material for any purpose, even commercially.
- You must:
  - Give appropriate credit - Provide a link to the license, and indicate if changes were made.
  - ShareAlike - You must distribute your contributions under the same license as the original.
  - Keep intact - all notices that refer to this license, including copyright notices.
File: death .csv | Column name | Description | |:-------------------------------------------|:-------------------------------------------------------------------...