By Noah Rippner [source]
This dataset provides comprehensive information on county-level cancer death and incidence rates, as well as various related variables. It includes data on age-adjusted death rates, average deaths per year, recent trends in cancer death rates, recent 5-year trends in death rates, and average annual counts of cancer deaths or incidence. The dataset also includes the federal information processing standards (FIPS) codes for each county.
Additionally, the dataset indicates whether each county met the objective of a targeted death rate of 45.5. The recent trend in cancer deaths or incidence is also captured for analysis purposes.
The purpose of the death.csv file within this dataset is to offer detailed information specifically concerning county-level cancer death rates and related variables. On the other hand, the incd.csv file contains data on county-level cancer incidence rates and additional relevant variables.
To provide more context and understanding about the included data points, there is a separate file named cancer_data_notes.csv. This file serves to provide informative notes and explanations regarding the various aspects of the cancer data used in this dataset.
Please note that this description provides an overview of a linear regression walkthrough in Python using this dataset. It covers how to source and import the data before moving on to data preparation steps such as exploratory analysis. The walkthrough then covers model selection and important model diagnostics.
It's essential to bear in mind that this example serves as an initial attempt at creating a multivariate Ordinary Least Squares regression model using these datasets from various sources like cancer.gov along with US Census American Community Survey data. This baseline model allows easy comparisons with future iterations intended for improvements or refinements.
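As a rough illustration of what a multivariate Ordinary Least Squares fit computes, here is a minimal dependency-free sketch. It is not the walkthrough's own code: the function name `ols_fit` and the toy numbers are hypothetical, and in practice a library such as NumPy or statsmodels would normally do this step.

```python
def ols_fit(X, y):
    """Return OLS coefficients [intercept, b1, ..., bk] via the normal equations."""
    # Prepend a constant 1.0 to every row so the first coefficient is the intercept.
    rows = [[1.0] + [float(v) for v in r] for r in X]
    k = len(rows[0])
    # Build the augmented system (X'X | X'y).
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(k)
    ]
    # Solve it with Gauss-Jordan elimination and partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; no unique OLS solution")
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(k):
            if r != col:
                factor = A[r][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]


# Toy data (hypothetical): two predictors per county and a response
# generated exactly as y = 2 + 3*x1 - 1*x2.
X = [(1, 2), (2, 1), (3, 5), (4, 3), (5, 8)]
y = [3, 7, 6, 11, 9]

intercept, b1, b2 = ols_fit(X, y)
print(f"intercept={intercept:.3f}, b1={b1:.3f}, b2={b2:.3f}")
```

With real data, the predictors would be columns drawn from the cancer.gov and American Community Survey files, and the fit would be followed by the diagnostics the walkthrough mentions.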
Important columns in this Kaggle dataset include County names and their corresponding FIPS codes, a standardized coding system defined by the Federal Information Processing Standards (FIPS). The Met Objective of 45.5? (1) column denotes whether a specific county achieved the targeted death rate of 45.5.
Overall, this dataset aims to offer valuable insights into county-level cancer death and incidence rates across various regions, providing policymakers, researchers, and healthcare professionals with essential information for analysis and decision-making purposes.
Familiarize Yourself with the Columns:
- County: The name of the county.
- FIPS: The Federal Information Processing Standards code for the county.
- Met Objective of 45.5? (1): Indicates whether the county met the objective of a death rate of 45.5 (Boolean).
- Age-Adjusted Death Rate: The age-adjusted death rate for cancer in the county.
- Average Deaths per Year: The average number of deaths per year due to cancer in the county.
- Recent Trend (2): The recent trend in cancer death rates/incidence in the county.
- Recent 5-Year Trend (2) in Death Rates: The recent 5-year trend in cancer death rates/incidence in the county.
- Average Annual Count: The average annual count of cancer deaths/incidence in the county.
Determine Counties Meeting Objective: Use this dataset to identify counties that have met or not met the objective death rate threshold of 45.5 (a rate, not a percentage). Look for entries where Met Objective of 45.5? (1) is marked as True or False.
Analyze Age-Adjusted Death Rates: Study and compare age-adjusted death rates across different counties using Age-Adjusted Death Rate values provided as floats.
Explore Average Deaths per Year: Examine and compare average annual counts and trends regarding deaths caused by cancer, using Average Deaths per Year as a reference point.
Investigate Recent Trends: Assess recent trends related to cancer deaths or incidence by analyzing data under columns such as Recent Trend, Recent Trend (2), and Recent 5-Year Trend (2) in Death Rates. These columns provide information on how cancer death rates/incidence have changed over time.
Compare Counties: Utilize this dataset to compare counties based on their cancer death rates and related variables. Identify counties with lower or higher average annual counts, age-adjusted death rates, or recent trends to analyze and understand the factors contributing ...
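The steps above can be sketched with Python's standard `csv` module. This is a minimal example under stated assumptions: the sample rows are invented, the real death.csv may label columns or mark missing values differently, and "meeting the objective" is assumed here to mean a rate at or below 45.5.

```python
import csv
import io

# Hypothetical sample rows standing in for death.csv.
SAMPLE = """County,FIPS,Age-Adjusted Death Rate
Alpha County,01001,40.2
Beta County,01003,58.7
Gamma County,01005,51.3
Delta County,01007,*
"""

OBJECTIVE = 45.5  # target death rate named in the column header


def load_rows(handle):
    """Read rows, converting the rate to float and skipping non-numeric rates."""
    rows = []
    for row in csv.DictReader(handle):
        try:
            row["Age-Adjusted Death Rate"] = float(row["Age-Adjusted Death Rate"])
        except ValueError:
            continue  # assumed: suppressed values appear as non-numeric markers
        rows.append(row)  # FIPS stays a string, so leading zeros are preserved
    return rows


rows = load_rows(io.StringIO(SAMPLE))

# Counties at or below the objective (assumed meaning of "met").
met = [r["County"] for r in rows if r["Age-Adjusted Death Rate"] <= OBJECTIVE]

# Counties ranked from lowest to highest age-adjusted death rate.
ranked = sorted(rows, key=lambda r: r["Age-Adjusted Death Rate"])

print("Met objective:", met)
for r in ranked:
    print(r["FIPS"], r["County"], r["Age-Adjusted Death Rate"])
```

To run this against the actual file, replace `io.StringIO(SAMPLE)` with `open("death.csv", newline="")` and adjust the column names to match the file's header.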
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This study examines the global evolution of 43 Sustainable Development Goals (SDG) indicators, spanning 7 major health themes across 185 countries, to evaluate the potential progress loss due to the COVID-19 pandemic. Both the cross-country and temporal variability of the dataset are used to estimate an empirical model based on an extended version of the Preston curve, which links well-being to income levels and other key socioeconomic health determinants. The approach reveals significant global evolution trends operating in each SDG indicator assessed. We extrapolate the model yearly between 2020 and 2030 using the IMF’s pre-COVID-19 economic growth projections to show how each country in the dataset is expected to evolve in these health topics throughout the decade, assuming no other external shocks. The results of this baseline scenario are contrasted with a post-COVID-19 scenario, in which most of the pandemic costs were already known. According to the IMF’s projections, economic growth losses are estimated, on average, at 42% and 28% for low- and lower middle-income countries, and at 15% and 7% for high- and upper middle-income countries, respectively. These disproportionate figures are shown to exacerbate the global health inequalities revealed by the curves. The expected progress loss in infectious diseases in low-income countries, for instance, averages 34%, against a mean of 6% in high-income countries. Infectious diseases is the worst-performing theme, followed by injuries and violence; maternal and reproductive health; health systems coverage; and neonatal and infant health. Low-income countries can expect an average progress loss of 16% across all health indicators assessed, whereas in high-income countries the estimated loss is as low as 3%. The disparity across individual countries is even more pronounced, with cases where the estimated progress loss is as much as nine times the average loss of 8%.
Conversely, countries with greater fiscal capacity are likely to fare much better under the circumstances, despite their worse death count, in many cases. Overall, these findings support the critical importance of integrating the fight against inequalities into the global development agendas.
This dataset is based on the WaterBase water quality data shared by the EEA. After most of the original columns were dropped, new data was created with the help of World Bank, OSM, Foursquare, and SEDAC. After extracting the country and city from the available location information, socioeconomic features of that country were added. In addition, the distance to certain road types near those coordinates was added using OSM. It is thought that such information plays an important role in water pollution.
Features:
- parameterWaterBodyCategory: Water body category code, as defined in the codelist. (Taken from EEA)
- observedPropertyDeterminandCode: Unique code of the determinand monitored, as defined in the codelist. (Taken from EEA)
- procedureAnalysedFraction: Specification of which fraction of the sample was analysed. (Taken from EEA)
- procedureAnalysedMedia: Type of media monitored. (Taken from EEA)
- resultUom: Unit of measure for the reported values. (Taken from EEA)
- phenomenonTimeReferenceYear: Year during which the data were sampled. (Taken from EEA)
- parameterSamplingPeriod: The period of the year during which the data used for the aggregation were sampled. (Taken from EEA)
- resultMeanValue: Mean value of the data used for aggregation. (Taken from EEA)
- waterBodyIdentifier: Unique international identifier of the water body in which the data were obtained. (Taken from EEA)
- Country: Country info generated from the coordinates.
- PopulationDensity: Population density of the country.
- TerraMarineProtected_2016_2018: Mean of protected terrestrial and marine areas of the country between 2016 and 2018.
- TouristMean_1990_2020: Mean tourist count of the country between 1990 and 2020.
- VenueCount: Number of venues near the given coordinates.
- netMigration_2011_2018: Mean net migration of the country between 2011 and 2018.
- literacyRate_2010_2018: Literacy rate of the country between 2010 and 2018.
- combustibleRenewables_2009_2014: Combustible renewables count in the country between 2009 and 2014.
- droughts_floods_temperature
- gdp
- composition_food_organic_waste_percent
- composition_glass_percent
- composition_metal_percent
- composition_other_percent
- composition_paper_cardboard_percent
- composition_plastic_percent
- composition_rubber_leather_percent
- composition_wood_percent
- composition_yard_garden_green_waste_percent
- waste_treatment_recycling_percent
Sources:
- https://www.eea.europa.eu/data-and-maps/data/waterbase-water-quality-2
- https://datacatalog.worldbank.org/dataset/what-waste-global-database