U.S. citizens with a professional degree had the highest median household income in 2023, at 172,100 U.S. dollars. In comparison, those with less than a 9th grade education made significantly less money, at 35,690 U.S. dollars.
Household income
The median household income in the United States has fluctuated since 1990, but rose to around 70,000 U.S. dollars in 2021. Maryland had the highest median household income in the United States in 2021. Maryland's high level of wealth has several causes, including the state's proximity to the nation's capital.
Household income and ethnicity
The median income of white non-Hispanic households in the United States had been on the rise since 1990, but has been declining since 2019. Although it has also been on the rise, the median income of Hispanic households was much lower than that of white, non-Hispanic private households. However, the median income of Black households is even lower than that of Hispanic households. Income inequality is a problem without an easy solution in the United States, especially since ethnicity is a contributing factor. Systemic racism contributes to the non-White population suffering from income inequality, which causes the opportunity for growth to stagnate.
An individual's annual income results from various factors. Intuitively, it is influenced by the individual's education level, age, gender, occupation, etc.
This is a widely cited KNN dataset. I encountered it during my course, and I wish to share it here because it is a good starter example for data pre-processing and machine learning practices.
Fields
The dataset contains 16 columns
Target field: Income
-- The income is divided into two classes: <=50K and >50K
Number of attributes: 14
-- These are the demographics and other features to describe a person
We can explore the possibility of predicting income level based on the individual's personal information.
Acknowledgements
This dataset, named "adult", is found in the UCI machine learning repository http://www.cs.toronto.edu/~delve/data/adult/desc.html
The detailed description of the dataset can be found in the original UCI documentation http://www.cs.toronto.edu/~delve/data/adult/adultDetail.html
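Since the dataset is described above as a KNN example, a minimal sketch of k-nearest-neighbour classification may help. The six rows below are made up for illustration and are not taken from the dataset, and the three numeric features (age, education-num, hours-per-week) are only an assumed subset of the attributes. The helper names are hypothetical.

```python
import math
from collections import Counter

# Made-up rows: (age, education-num, hours-per-week) -> income class.
rows = [[25, 9, 40], [28, 10, 35], [22, 9, 30],
        [45, 14, 50], [50, 13, 45], [38, 16, 60]]
labels = ["<=50K", "<=50K", "<=50K", ">50K", ">50K", ">50K"]


def fit_scaler(data):
    """Return per-column means and standard deviations."""
    cols = list(zip(*data))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return means, stds


def scale(row, means, stds):
    """Standardize one row so no feature dominates the distance."""
    return [(v - m) / s for v, m, s in zip(row, means, stds)]


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training rows closest to the query."""
    nearest = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))
    votes = Counter(label for _, label in nearest[:k])
    return votes.most_common(1)[0][0]


means, stds = fit_scaler(rows)
train_x = [scale(r, means, stds) for r in rows]

older_worker = knn_predict(train_x, labels, scale([47, 14, 48], means, stds))
younger_worker = knn_predict(train_x, labels, scale([24, 9, 38], means, stds))
print(older_worker, younger_worker)
```

Standardizing before measuring distance matters here because hours-per-week and age live on different scales; on the real data the categorical attributes would also need encoding first.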
In 2022, about 37.7 percent of the U.S. population aged 25 and above had graduated from college or another higher education institution, a slight decline from 37.9 percent the previous year. However, this is a significant increase from 1960, when only 7.7 percent of the U.S. population had graduated from college.
Demographics
Educational attainment varies by gender, location, race, and age throughout the United States. Asian Americans and Pacific Islanders had the highest level of education, on average, while Massachusetts and the District of Columbia have the highest rates of residents with a bachelor's degree or higher. However, education levels are correlated with wealth. While public education is free up until the 12th grade, the cost of university is out of reach for many Americans, making social mobility increasingly difficult.
Earnings
White Americans with a professional degree earned the most money on average, compared to other educational levels and races. However, regardless of educational attainment, males typically earned far more on average than females. Although the wage gap has narrowed over the years in the country, it remains an issue to this day. Not only is there a large wage gap between males and females, but there is also a large income gap linked to race.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Multivariate ordinal logistic regression model of family demographics, parenting variables, and participant education predicting to participant adult income.
This intermediate-level data set was extracted from the Census Bureau database. It contains 48,842 instances with a mix of continuous and discrete attributes (train = 32,561, test = 16,281).
The data set has 15 attributes, which include age, sex, education level and other relevant details of a person. The data set will help to improve your skills in Exploratory Data Analysis, Data Wrangling, Data Visualization and Classification Models.
Feel free to explore the data set with multiple supervised and unsupervised learning techniques. The following description gives more details on this data set:
age: The age of an individual.
workclass: The type of work or employment of an individual. It can have the following categories:
Final Weight: The weights on the CPS files are controlled to independent estimates of the civilian noninstitutional population of the US. These are prepared monthly for us by Population Division here at the Census Bureau. We use 3 sets of controls. These are:
1. A single cell estimate of the population 16+ for each state.
2. Controls for Hispanic Origin by age and sex.
3. Controls by Race, age and sex.
We use all three sets of controls in our weighting program and "rake" through them 6 times so that by the end we come back to all the controls we used.
People with similar demographic characteristics should have similar weights. There is one important caveat to remember about this statement. That is that since the CPS sample is actually a collection of 51 state samples, each with its own probability of selection, the statement only applies within state.
education: The highest level of education completed.
education-num: The number of years of education completed.
marital-status: The marital status.
occupation: Type of work performed by an individual.
relationship: The relationship status.
race: The race of an individual.
sex: The gender of an individual.
capital-gain: The amount of capital gain (financial profit).
capital-loss: The amount of capital loss an individual has incurred.
hours-per-week: The number of hours worked per week.
native-country: The country of origin or the native country.
income: The income level of an individual, which serves as the target variable. It indicates whether the income is greater than $50,000 or less than or equal to $50,000, denoted as (>50K, <=50K).
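The Final Weight description above mentions "raking" through the control totals several times. A rough sketch of that idea (iterative proportional fitting) is shown below, assuming only two sets of controls arranged as the rows and columns of a small table rather than the three sets the Census Bureau uses. All numbers and the function name `rake` are made up for illustration; this is not the Bureau's weighting program.

```python
def rake(weights, row_targets, col_targets, passes=6):
    """Scale a table of cell weights toward row and column control totals.

    Each pass rescales every row to its control, then every column to its
    control. Repeating the passes brings both margins close to the controls.
    """
    w = [row[:] for row in weights]  # work on a copy
    for _ in range(passes):
        for i, target in enumerate(row_targets):
            total = sum(w[i])
            if total > 0:
                w[i] = [x * target / total for x in w[i]]
        for j, target in enumerate(col_targets):
            total = sum(row[j] for row in w)
            if total > 0:
                for row in w:
                    row[j] *= target / total
    return w


# Hypothetical starting weights for a 2 x 2 demographic table.
start = [[10.0, 20.0],
         [30.0, 40.0]]
raked = rake(start, row_targets=[40.0, 60.0], col_targets=[50.0, 50.0])

row_sums = [sum(row) for row in raked]
col_sums = [sum(row[j] for row in raked) for j in range(2)]
print(row_sums, col_sums)
```

After the final pass the column totals match their controls exactly and the row totals match to within a small tolerance, which is the sense in which the procedure "comes back" to all the controls.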
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset contains information about adult income prediction. It includes the following columns:
workclass: The type of employment (e.g., Private, Self-emp-not-inc, Federal-gov, Local-gov)
fnlwgt: The number of people the census believes the entry represents
education: The highest level of education achieved
education-num: The numeric representation of the previous column
marital-status: The marital status of the individual
occupation: The occupation of the individual
relationship: The relationship of the individual to their household
race: The race of the individual
sex: The gender of the individual
capital-gain: The capital gains of the individual
capital-loss: The capital losses of the individual
hours-per-week: The number of hours the individual works per week
country: The native country of the individual
salary: The income level of the individual, which is the target variable to predict.
The goal of this dataset is to build a model that can accurately predict the income level of an individual based on the provided features.
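Before a model can use categorical columns such as workclass, they usually have to be turned into numbers. The sketch below shows one common option, one-hot encoding, together with a 0/1 encoding of the salary target. The three records are invented for illustration, and the helper `one_hot` is a hypothetical name, not part of the dataset or of any library.

```python
def one_hot(records, column):
    """Replace one categorical column with a 0/1 indicator per category."""
    categories = sorted({r[column] for r in records})
    encoded = []
    for r in records:
        row = {k: v for k, v in r.items() if k != column}
        for c in categories:
            row[f"{column}={c}"] = 1 if r[column] == c else 0
        encoded.append(row)
    return encoded, categories


# Invented records using column names from the list above.
records = [
    {"workclass": "Private", "hours-per-week": 40, "salary": "<=50K"},
    {"workclass": "Federal-gov", "hours-per-week": 45, "salary": ">50K"},
    {"workclass": "Private", "hours-per-week": 38, "salary": "<=50K"},
]

encoded, categories = one_hot(records, "workclass")
target = [1 if r["salary"] == ">50K" else 0 for r in records]
print(categories)
print(encoded[0])
print(target)
```

One indicator per category avoids implying an order between employment types, which a single integer code would do.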
In the United States, the rate of obesity is lower among college graduates compared to those who did not graduate from college. For example, in 2023, around 27 percent of college graduates were obese, while 36 percent of those with some college or technical school were obese. At that time, rates of obesity were highest among those with less than a high school education, at around 37 percent.
Income and obesity
As with education level, there are also differences in rates of obesity in the United States based on income. Adults in the U.S. with an annual income of 75,000 U.S. dollars or more have the lowest rates of obesity, with around 29 percent of this population obese in 2023. On the other hand, those earning less than 15,000 U.S. dollars per year had the highest rates of obesity at that time, at 37 percent. One reason for this disparity may be a lack of access to fresh food among those earning less, as cheap food in the United States tends to be unhealthier.
What is the most obese state?
As of 2023, the states with the highest rates of obesity were West Virginia, Mississippi, and Arkansas. At that time, around 41 percent of adults in West Virginia were obese. The states with the lowest rates of obesity were Colorado, Hawaii, and Massachusetts. Still, around a quarter of adults in Colorado were obese in 2023. West Virginia and Mississippi are also the states with the highest rates of obesity among high school students. Children with obesity are more likely to be obese as adults and are at increased risk of health conditions such as asthma, type 2 diabetes, and sleep apnea.
Click a census tract on the map to view the details. Click "Layers" to explore other demographic layers.
The Racial and Social Equity Index combines information on race, ethnicity, and related demographics with data on socioeconomic and health disadvantages to identify where priority populations make up relatively large proportions of neighborhood residents. Click here for a User Guide.
The Composite Index includes sub-indices of:
Race, English Language Learners, and Origins Index ranks census tracts by an index of three measures weighted as follows:
-- Persons of color (weight: 1.0)
-- English language learner (weight: 0.5)
-- Foreign born (weight: 0.5)
Socioeconomic Disadvantage Index ranks census tracts by an index of two equally weighted measures:
-- Income below 200% of poverty level
-- Educational attainment less than a bachelor's degree
Health Disadvantage Index ranks census tracts by an index of seven equally weighted measures:
-- Adults with no leisure-time physical activity
-- Adults with diagnosed diabetes
-- Adults with obesity
-- Adults who reported mental health as not good
-- Adults with asthma
-- Low life expectancy at birth
-- Adults with one or more disabilities
The index does not reflect population densities, nor does it show variation within census tracts, which can be important considerations at a local level.
Sources are as indicated below. Additional layers are updated annually by the Office of Planning and Community Development.
Produced by City of Seattle Office of Planning & Community Development. For more information on the indices, including guidance for use, contact Diana Canzoneri (diana.canzoneri@seattle.gov).
Get the data for this map from SeattleGeoData.
Sources: 2017-2021 5-Year American Community Survey Estimates, U.S. Census Bureau; 2020 Decennial Census, U.S. Census Bureau; modeled estimates from the Centers for Disease Control and Prevention's PLACES project; Washington State Department of Health's Washington Tracking Network (WTN); and estimates from Public Health – Seattle & King County (based on the Community Health Assessment Tool).
Notes: Language is for population age 5 and older. Educational attainment is for the population age 25 and over. Life expectancy is life expectancy at birth. Other health measures are based on percentages of the adult population.
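To make the weighting in the Race, English Language Learners, and Origins Index concrete, here is a small sketch of one way a weighted rank index could be computed. The text above gives only the weights (1.0, 0.5, 0.5); the use of percentile ranks, the normalization by total weight, the tract names and all values are assumptions for illustration, not the Office of Planning and Community Development's documented method.

```python
# Weights as stated in the description above.
WEIGHTS = {"persons_of_color": 1.0, "english_learner": 0.5, "foreign_born": 0.5}


def percentile_ranks(values):
    """Share of tracts whose value is at or below each tract's value."""
    n = len(values)
    return [sum(1 for other in values if other <= v) / n for v in values]


def composite_scores(tracts):
    """Weighted average of per-measure percentile ranks for each tract."""
    names = list(tracts)
    scores = {name: 0.0 for name in names}
    for measure, weight in WEIGHTS.items():
        ranks = percentile_ranks([tracts[name][measure] for name in names])
        for name, rank in zip(names, ranks):
            scores[name] += weight * rank
    total_weight = sum(WEIGHTS.values())
    return {name: s / total_weight for name, s in scores.items()}


# Hypothetical tracts with made-up population shares.
tracts = {
    "A": {"persons_of_color": 0.6, "english_learner": 0.2, "foreign_born": 0.3},
    "B": {"persons_of_color": 0.3, "english_learner": 0.1, "foreign_born": 0.1},
    "C": {"persons_of_color": 0.5, "english_learner": 0.3, "foreign_born": 0.2},
}

scores = composite_scores(tracts)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Because the persons-of-color measure carries twice the weight of each of the other two, tract A outranks tract C even though C has the highest share of English language learners.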
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Background
To investigate the effects of age and sex on the relationship between socioeconomic status (SES) and the prevalence and control status of diabetes mellitus (DM) in Korean adults.
Methods
Data came from 16,175 adults (6,951 men and 9,227 women) over the age of 30 who participated in the 2008-2010 Korea National Health and Nutrition Examination Survey. SES was measured by household income or education level. The adjusted odds ratios (ORs) and corresponding 95% confidence intervals (95% CI) for the prevalence or control status of diabetes were calculated using multiple logistic regression analyses across household income quartiles and education levels.
Results
The household income-DM and education level-DM relationships were significant in younger age groups for both men and women. The adjusted ORs and 95% CI for diabetes were 1.51 (0.97, 2.34) and 2.28 (1.29, 4.02) for the lowest vs. highest quartiles of household income and education level, respectively, in women younger than 65 years of age (both P for linear trend < 0.05 with Bonferroni adjustment). The adjusted OR and 95% CI for diabetes was 2.28 (1.53, 3.39) for the lowest vs. highest quartile of household income in men younger than 65 (P for linear trend < 0.05 with Bonferroni adjustment). However, in men and women older than 65, no associations were found between SES and the prevalence of DM. No significant association between SES and the status of glycemic control was detected.
Conclusions
We found age- and sex-specific differences in the relationship of household income and education with the prevalence of DM in Korea. DM preventive care is needed for groups with a low SES, particularly in young or middle-aged populations.
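For readers unfamiliar with odds ratios, the sketch below computes a crude (unadjusted) odds ratio and a Wald 95% confidence interval from a 2 x 2 table. This is a simpler stand-in for the adjusted ORs in the abstract, which come from multiple logistic regression; the counts are invented and do not come from the survey.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Crude odds ratio with a Wald confidence interval.

    a: exposed with the condition      b: exposed without it
    c: unexposed with the condition    d: unexposed without it
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of the log odds ratio.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Invented counts: lowest vs. highest income quartile, diabetes yes/no.
estimate, lower, upper = odds_ratio_ci(a=20, b=80, c=10, d=90)
print(f"OR = {estimate:.2f}, 95% CI ({lower:.2f}, {upper:.2f})")
```

An interval that includes 1, as with the 1.51 (0.97, 2.34) estimate quoted above, means the data are compatible with no difference in odds at that confidence level.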
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Purpose
The purpose of this study was to explore the relationship between education level and health behavior including sleep, work activity, exercise activity, and sedentary behavior among emerging adults.
Methods
This study utilized data from the National Health and Nutrition Examination Survey (NHANES) collected between 2007 and 2018. The study sample included 4,484 emerging adults aged 18–25 years and the weighted participants were 30,057,813. Weighted multivariable regression analysis was performed to investigate the association between education level and the aforementioned health behavior, adjusting for age, gender, race/ethnicity, marital status, poverty-income ratio, BMI, smoking, and alcohol drinking status.
Results
This study revealed that higher education level was associated with shorter sleep duration [Fully adjusted model, β (95% CI): −0.588 (−0.929, −0.246), p < 0.001]. Additionally, those with higher education levels were more likely to allocate time in sedentary behavior [β (95% CI): 90.162 (41.087, 139.238), p < 0.001]. Moreover, higher education level was related to less work activity [β (95% CI): −806.991 (−1,500.280, −113.703), p = 0.023] and more exercise activity time [β (95% CI): 118.196 (−21.992, 258.385), p = 0.097]. Subgroup analysis further verified this trend and detected that males with higher education level tended to participate in less work activity [β (95% CI): −1,139.972 (−2,136.707, −143.237), p = 0.026] while females with higher education level tended to engage in more exercise activity [Fully adjusted model, β (95% CI): 141.709 (45.468, 237.950), p = 0.004].
Conclusion
This study highlighted the importance of education level as a significant factor in promoting healthy behavior among emerging adults. The findings underscored the need for the Ministry of Education to prioritize educating this demographic about the significance of maintaining adequate sleep patterns and reducing sedentary habits. Encouraging them to allocate more time for work and physical activities can significantly contribute to their overall wellbeing and success, ultimately fostering a healthier next generation.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Background
The prevalence of diabetes is increasing rapidly in low- and middle-income countries (LMICs), urgently requiring detailed evidence to guide the response of health systems to this epidemic. In an effort to understand at what step in the diabetes care continuum individuals are lost to care, and how this varies between countries and population groups, this study examined health system performance for diabetes among adults in 28 LMICs using a cascade of care approach.
Methods and findings
We pooled individual participant data from nationally representative surveys done between 2008 and 2016 in 28 LMICs. Diabetes was defined as fasting plasma glucose ≥ 7.0 mmol/l (126 mg/dl), random plasma glucose ≥ 11.1 mmol/l (200 mg/dl), HbA1c ≥ 6.5%, or reporting to be taking medication for diabetes. Stages of the care cascade were as follows: tested, diagnosed, lifestyle advice and/or medication given ("treated"), and controlled (HbA1c < 8.0% or equivalent). We stratified cascades of care by country, geographic region, World Bank income group, and individual-level characteristics (age, sex, educational attainment, household wealth quintile, and body mass index [BMI]). We then used logistic regression models with country-level fixed effects to evaluate predictors of (1) testing, (2) treatment, and (3) control. The final sample included 847,413 adults in 28 LMICs (8 low income, 9 lower-middle income, 11 upper-middle income). Survey sample size ranged from 824 in Guyana to 750,451 in India. The prevalence of diabetes was 8.8% (95% CI: 8.2%–9.5%), and the prevalence of undiagnosed diabetes was 4.8% (95% CI: 4.5%–5.2%). Health system performance for management of diabetes showed large losses to care at the stage of being tested, and low rates of diabetes control. Total unmet need for diabetes care (defined as the sum of those not tested, tested but undiagnosed, diagnosed but untreated, and treated but with diabetes not controlled) was 77.0% (95% CI: 74.9%–78.9%). Performance along the care cascade was significantly better in upper-middle income countries, but across all World Bank income groups, only half of participants with diabetes who were tested achieved diabetes control. Greater age, educational attainment, and BMI were associated with higher odds of being tested, being treated, and achieving control. The limitations of this study included the use of a single glucose measurement to assess diabetes, differences in the approach to wealth measurement across surveys, and variation in the date of the surveys.
Conclusions
The study uncovered poor management of diabetes along the care cascade, indicating large unmet need for diabetes care across 28 LMICs. Performance across the care cascade varied by World Bank income group and individual-level characteristics, particularly age, educational attainment, and BMI. This policy-relevant analysis can inform country-specific interventions and offers a baseline by which future progress can be measured.
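The definition of total unmet need above is a sum of the losses between consecutive cascade stages. The sketch below works through that arithmetic for a single hypothetical population; the counts are invented and the function name `unmet_need` is illustrative, so the result is not comparable to the study's pooled, survey-weighted estimate.

```python
def unmet_need(total, tested, diagnosed, treated, controlled):
    """Losses at each cascade stage and their sum as a share of all cases."""
    losses = {
        "not tested": total - tested,
        "tested but undiagnosed": tested - diagnosed,
        "diagnosed but untreated": diagnosed - treated,
        "treated but not controlled": treated - controlled,
    }
    share = sum(losses.values()) / total
    return losses, share


# Invented counts of people with diabetes reaching each stage.
losses, share = unmet_need(total=200, tested=120, diagnosed=90,
                           treated=70, controlled=50)
for stage, count in losses.items():
    print(f"{stage}: {count}")
print(f"total unmet need: {share:.1%}")
```

Because each stage is nested in the previous one, the four losses add up to everyone who did not reach control, so the unmet need equals one minus the controlled share.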
ODC Public Domain Dedication and Licence (PDDL) v1.0 http://www.opendatacommons.org/licenses/pddl/1.0/
License information was derived automatically
!!PLEASE NOTE!! When downloading the data, please select "File Geodatabase" to preserve long field names. Shapefile will truncate field names to 10 characters.
Version: Current
The Racial and Social Equity Index combines information on race, ethnicity, and related demographics with data on socioeconomic and health disadvantages to identify where priority populations make up relatively large proportions of neighborhood residents. Click here for a User Guide.
See the layer in action in the Racial and Social Equity Viewer.
Click here for an 11x17 printable pdf version of the map.
The Composite Index includes sub-indices of:
Race, English Language Learners, and Origins Index ranks census tracts by an index of three measures weighted as follows:
-- Persons of color (weight: 1.0)
-- English language learner (weight: 0.5)
-- Foreign born (weight: 0.5)
Socioeconomic Disadvantage Index ranks census tracts by an index of two equally weighted measures:
-- Income below 200% of poverty level
-- Educational attainment less than a bachelor's degree
Health Disadvantage Index ranks census tracts by an index of seven equally weighted measures:
-- No leisure-time physical activity
-- Diagnosed diabetes
-- Obesity
-- Mental health not good
-- Asthma
-- Low life expectancy at birth
-- Disability
The index does not reflect population densities, nor does it show variation within census tracts, which can be important considerations at a local level.
Sources are as indicated below.
Produced by City of Seattle Office of Planning & Community Development. For more information on the indices, including guidance for use, contact Diana Canzoneri (diana.canzoneri@seattle.gov).
Sources: 2017-2021 Five-Year American Community Survey Estimates, U.S. Census Bureau; 2020 Decennial Census, U.S. Census Bureau; estimates from the Centers for Disease Control and Prevention's Behavioral Risk Factor Surveillance System (BRFSS), published in the "500 Cities Project"; Washington State Department of Health's Washington Tracking Network (WTN); and estimates from Public Health – Seattle & King County (based on the Community Health Assessment Tool).
Language is for population age 5 and older. Educational attainment is for the population age 25 and over. Life expectancy is life expectancy at birth. Other health measures are based on percentages of the adult population.
https://www.gesis.org/en/institute/data-usage-terms
In the project "Studie zum Zusammenhang von Kompetenzen und Arbeitsmarktchancen von gering Qualifizierten in Deutschland" (Study on the Relationship between Skills and Labour Market Opportunities of People with Low Qualifications in Germany, funded by the Federal Ministry of Education and Research, funding number PLI3061), the skills and labour market opportunities of people aged 26 to 55 in Germany were examined in more detail. This is an age group that is in the active employment phase and has generally completed its training phase. In order to be able to make reliable statements about this group, a supplementary sample of people aged 26 to 55 living in eastern Germany was drawn at the same time as the PIAAC sample. The 560 additional cases surveyed are not part of the main sample in the PIAAC Public and Scientific Use Files (ZA 5845), but were later combined with the net cases of the PIAAC main sample (aged 26 to 55) in the present dataset.
The present data set thus includes the supplementary sample for East Germany and the 26 to 55-year-old respondents from the main sample (study number ZA 5845). For these persons, background information is available, as well as competence values (plausible values) in the following areas:
-- reading literacy
-- everyday mathematics
-- technology-based problem solving
Respondents aged 26 to 55 from the main PIAAC sample may have slightly different values in some variables in this data set. These include competence, income and weighting variables. The reason for this is that the imputation and scaling procedures for these variables were performed separately for both datasets to ensure maximum internal consistency of each dataset.
The background questionnaire for PIAAC is divided into the following topics:
A: General information such as age and gender
B: Education as the highest educational attainment, current education, participation in further education
C: Employment status and background such as paid work and unpaid work for a family business, job search information
D: Information on current employment such as occupation, self-employment and income
E: Information on last gainful employment such as occupation, self-employment, reason for leaving the company
F: Skills used at work such as influence and physical skills
G: Reading, writing, etc. during work
H: Reading, writing etc. in everyday life
I: Attitude and self-assessment to e.g. learning and voluntary work
J: Background information such as country of birth, nationality, language, professions of parents
In addition, the data set contains further derived background variables, information on competence measurement, information on sampling and weighting, limited regional data, and time data for the interview.
For data protection reasons, the information on the municipal size class is only available to a limited extent. Furthermore, the data on the country of origin, nationality and the country where the highest school leaving certificate was obtained have been coarsened. These data were categorised on the basis of the microcensus.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Objective
Type II diabetes is a recognized risk factor of declining cognitive function in high-income countries. However, there is limited research on this association across low- and middle-income countries. We aimed to examine and compare the relationship between type II diabetes and cognition amongst adults aged 60 years and older for two of the largest LMICs: India and China.
Methods
Cross-sectional data was analyzed from population-based Harmonized Cognitive Assessment Protocols studies in India (n = 4,062) and China (n = 9,741). Multivariable-adjusted linear regression models examined the relationship between diabetes (self-reported or biomarker HbA1c ≥6.5%) and general cognition. Interaction testing assessed effect modification based on urban versus rural residence and educational attainment.
Results
Type II diabetes was not associated with general cognitive scores in India or China in fully adjusted models. Interaction testing revealed a positive association in rural but not urban residences in India; however, this was not seen in China. Both countries showed effect modification by educational attainment. In India, diabetes was associated with higher average cognitive scores among those with none or early childhood education, while the relationship was null among those with at least an upper secondary education. In China, diabetes was inversely related to average cognitive scores among those with less than lower secondary education, while the relationship was null among the remainder of the study sample.
Conclusion
The type II diabetes and cognitive function association in India and China differs from that observed in high-income countries. These findings suggest epidemiologic and nutrition transition variations. In India, health care access, urbanization and social differences between urban and rural areas may influence this relationship. In both countries, epidemiologic and nutrition patterns may adversely impact individuals from socially and financially vulnerable populations with less than lower secondary education. Longitudinal research using harmonized cognitive scores is encouraged to further investigate these findings.
ODC Public Domain Dedication and Licence (PDDL) v1.0 http://www.opendatacommons.org/licenses/pddl/1.0/
License information was derived automatically
!!PLEASE NOTE!! When downloading the data, please select "File Geodatabase" to preserve long field names. Shapefile will truncate field names to 10 characters.
This version of the Racial and Social Equity Index indexes all tracts in the remainder of King County against tracts in the city of Seattle. This index should only be used in direct consultation with the Office of Planning and Community Development, and is intended to be of use for comparing tracts in the remainder of King County within the context of percentiles set by tracts within the city of Seattle.
Version: Current
The Racial and Social Equity Index combines information on race, ethnicity, and related demographics with data on socioeconomic and health disadvantages to identify where priority populations make up relatively large proportions of neighborhood residents. Click here for a User Guide.
See the City of Seattle RSE Index in action in the Racial and Social Equity Viewer.
The Composite Index includes sub-indices of:
Race, English Language Learners, and Origins Index ranks census tracts by an index of three measures weighted as follows:
-- Persons of color (weight: 1.0)
-- English language learner (weight: 0.5)
-- Foreign born (weight: 0.5)
Socioeconomic Disadvantage Index ranks census tracts by an index of two equally weighted measures:
-- Income below 200% of poverty level
-- Educational attainment less than a bachelor's degree
Health Disadvantage Index ranks census tracts by an index of seven equally weighted measures:
-- No leisure-time physical activity
-- Diagnosed diabetes
-- Obesity
-- Mental health not good
-- Asthma
-- Low life expectancy at birth
-- Disability
The index does not reflect population densities, nor does it show variation within census tracts, which can be important considerations at a local level.
Sources are as indicated below.
Produced by City of Seattle Office of Planning & Community Development. For more information on the indices, including guidance for use, contact Diana Canzoneri (diana.canzoneri@seattle.gov).
Sources: 2017-2021 Five-Year American Community Survey Estimates, U.S. Census Bureau; 2020 Decennial Census, U.S. Census Bureau; estimates from the Centers for Disease Control and Prevention's Behavioral Risk Factor Surveillance System (BRFSS), published in the "500 Cities Project"; Washington State Department of Health's Washington Tracking Network (WTN); and estimates from Public Health – Seattle & King County (based on the Community Health Assessment Tool).
Language is for population age 5 and older. Educational attainment is for the population age 25 and over. Life expectancy is life expectancy at birth. Other health measures are based on percentages of the adult population.
https://search.gesis.org/research_data/datasearch-httpwww-da-ra-deoaip--oaioai-da-ra-de473812
Abstract (en): This data collection is comprised of responses from two sets of survey questionnaires, the basic Current Population Survey (CPS) and a Tobacco Use Supplement (TUS) survey. The TUS 2010-2011 Wave consists of four collections: May 2010, August 2010, January 2011, and May 2010-May 2011. The Current Population Survey, administered monthly, is the source of the official government statistics on employment and unemployment. From time to time, additional questions are included on health, education, and previous work experience. Similar to other CPS supplements, the Tobacco Use Supplement was designed for both proxy and self-respondents. All CPS household members age 18 and older who had completed CPS core items were eligible for the supplement items. Both proxy and self-respondents were asked about their smoking status and the use of other tobacco products. For self-respondents only, different questions were asked depending on their tobacco use status: For former/current smokers, questions were asked about type of cigarettes smoked, measures of addiction, attempts to quit smoking, methods and treatments used to quit, cost of cigarettes and age initiating everyday cigarette smoking and the state of residence at that time, etc. Current smokers were asked whether the medical and dental community had advised them to quit smoking, and if they were planning to quit in the future. All self-respondents were asked about smoking policy at their work place and their attitudes towards smoking in different locations. Demographic information within this collection includes age, sex, race, Hispanic origin, marital status, veteran status, educational attainment, family relationship, occupation, and income. All adult records retain the basic CPS final weight, PWSSWGT, which controls for age, race, sex, Hispanic origin estimates, and individual state estimates. Please use this basic final weight for tallying the labor force items. 
This collection also contains two special supplement weights: a supplement non-response adjustment weight (PWNRWGT), and a supplement self-response adjustment weight (PWSRWGT). Please use PWNRWGT for tallying the supplement items. Users interested in self-response analysis (especially for those items requiring self-response only) should use PWSRWGT for tallying the supplement items. Additional weights include:
-- HWHHWGT, the household weight used for tallying household characteristics, which adjusts for household nonresponse.
-- PWFMWGT, the family weight used only for tallying family characteristics.
-- PWLGWGT, the longitudinal weight found only on adult records matched from month to month; also used for gross flows analysis.
-- PWORWGT, the outgoing rotation weight used for tallying information collected only in outgoing rotations.
-- PWVETWGT, the veterans weight used for tallying veterans' data only; controlled to estimates of veterans supplied by VA.
-- PWCMPWGT, the composited final weight used to create BLS's published labor force statistics.
For more information on weights, please refer to the Technical Documentation. ICPSR data undergo a confidentiality review and are altered when necessary to limit the risk of disclosure. ICPSR also routinely creates ready-to-go data files along with setups in the major statistical software formats as well as standard codebooks to accompany the data. In addition to these procedures, ICPSR performed the following processing steps for this data collection: created variable labels and/or value labels; created online analysis version with question text; checked for undocumented or out-of-range codes.
Universe: All persons aged 18 and above in the civilian non-institutional population of the United States.
Smallest Geographic Unit: city
The current CPS sample is selected based on information from the 2000 census. 
The first stage of the sample design created 2,025 geographic areas called primary sampling units (PSU) in the entire United States. The PSUs were grouped into strata within each state. A total of 824 PSUs were selected for sampling. Approximately 72,000 housing units within the selected PSUs are assigned for interview each month. A unique feature of the CPS is its panel design, in which each household in the sample is surveyed for four consecutive months and then four more consecutive months nine months later. Due to this sampling strategy, a subset of persons who were in sample for any given month of TUS-CPS fielding can be linked with other CPS conducted wi...
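The weighting guidance in this abstract (PWNRWGT for supplement items, PWSRWGT for items requiring self-response) can be illustrated with a small weighted tally. This is only a sketch under stated assumptions: the three records and the `current_smoker` field are invented for illustration and are not values or variable names from the collection, and skipping records with a non-positive weight is a choice made here, not a documented rule.

```python
def weighted_share(records, flag, weight):
    """Weighted proportion of records whose `flag` field is truthy.

    `weight` names the field holding the tally weight, for example
    "PWNRWGT" or "PWSRWGT" as named in the abstract.
    """
    total = 0.0
    hits = 0.0
    for rec in records:
        w = rec[weight]
        if w <= 0:
            continue  # assumption of this sketch: non-positive weights are skipped
        total += w
        if rec[flag]:
            hits += w
    if total == 0:
        raise ValueError("no records with a positive weight")
    return hits / total


# Toy records: all numbers and the `current_smoker` field are hypothetical.
records = [
    {"current_smoker": True, "PWNRWGT": 1500.0, "PWSRWGT": 1800.0},
    {"current_smoker": False, "PWNRWGT": 2500.0, "PWSRWGT": 0.0},
    {"current_smoker": False, "PWNRWGT": 1000.0, "PWSRWGT": 1200.0},
]

share_all = weighted_share(records, "current_smoker", "PWNRWGT")
share_self = weighted_share(records, "current_smoker", "PWSRWGT")
print(f"supplement-weighted share: {share_all:.3f}")
print(f"self-response-weighted share: {share_self:.3f}")
```

The two calls differ only in which weight field they read, which mirrors the abstract's advice to pick the weight that matches the items being tallied.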
By Data Exercises [source]
This dataset contains a wealth of health-related information and socio-economic data aggregated from multiple sources such as the American Community Survey, clinicaltrials.gov, and cancer.gov, covering a variety of US counties. Your task is to use this collection of data to build an Ordinary Least Squares (OLS) regression model that predicts the target death rate in each county. The model should incorporate variables related to population size, health insurance coverage, educational attainment levels, median incomes and poverty rates. Additionally, you will need to assess linearity between the predictors and the target; measure serial independence among errors; test for heteroskedasticity; evaluate normality in the residual distribution; identify any outliers or missing values and determine how categorical variables are handled; compare linear regression models using k=10 cross validation; and assess multicollinearity among the predictors. Evaluate your results with fit statistics such as R-squared and Root Mean Square Error (RMSE), and interpret what the analysis implies about how health outcomes relate to demographic correlates across geographic boundaries in the United States.
This dataset provides data on health outcomes, demographics, and socio-economic factors for various US counties from 2010 to 2016. It can be used to uncover trends in health outcomes and socioeconomic factors across different counties in the US over a six-year period.
The dataset contains a variety of information including statefips (a two-digit code that identifies the state), countyfips (a three-digit code that identifies the county), average household size, average annual count of cancer cases, average deaths per year, target death rate, median household income, population estimate for 2015, poverty percent, study per capita, and binned income, as well as demographic information such as median age of the male and female population, percent married households, adults with no high school diploma, adults with a high school diploma, percentage with some college education, bachelor's degree holders among adults over 25 years old, employed persons 16 and over, unemployed persons 16 and over, private coverage available, private coverage available alone, temporary private coverage available, public coverage available, public coverage available alone, percentages of white, Black, Asian, and other-race residents, married households, and birth rate.
Using this dataset you can build a multivariate ordinary least squares regression model to predict “target_deathrate”. You will also need to implement k-fold (k=10) cross validation to select your model parameters. Model diagnostics should be performed to assess linearity, serial independence, heteroskedasticity, normality, multicollinearity, etc.; outliers, missing values, and categorical variables will also affect your model selection process. Finally, interpret the resulting models in context, taking into account factors such as outliers, missing values, and demographic changes, before arriving at a conclusion that may explain trends in health outcomes and socioeconomic factors found within this dataset.
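The workflow described above (fit an OLS model, then compare candidate models with k=10 cross validation and RMSE) might be sketched as follows. This is a minimal illustration in plain Python, not the dataset author's method: the data are synthetic, the two predictors and their coefficients are invented stand-ins for columns such as median income or poverty percent, and in practice a statistics library would normally replace the hand-written solver.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(X, y):
    """Return [intercept, b1, ...] from the normal equations (X'X) beta = X'y."""
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(p)]
    return solve(xtx, xty)


def predict(beta, X):
    """Apply fitted coefficients to each row of X."""
    return [beta[0] + sum(b * v for b, v in zip(beta[1:], r)) for r in X]


def kfold_rmse(X, y, k=10, seed=0):
    """Mean held-out RMSE over k folds; each row is held out exactly once."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for fold in folds:
        held = set(fold)
        train = [i for i in idx if i not in held]
        beta = fit_ols([X[i] for i in train], [y[i] for i in train])
        preds = predict(beta, [X[i] for i in fold])
        mse = sum((p - y[i]) ** 2 for p, i in zip(preds, fold)) / len(fold)
        scores.append(mse ** 0.5)
    return sum(scores) / k


# Synthetic stand-in data: two hypothetical predictors and a noisy linear target.
rng = random.Random(42)
X = [[rng.uniform(0, 10), rng.uniform(0, 5)] for _ in range(200)]
y = [180.0 - 2.0 * a + 3.0 * b + rng.gauss(0, 1) for a, b in X]

rmse_full = kfold_rmse(X, y, k=10)                        # both predictors
rmse_reduced = kfold_rmse([[a] for a, _ in X], y, k=10)   # first predictor only
print(f"10-fold RMSE, full model: {rmse_full:.3f}")
print(f"10-fold RMSE, reduced model: {rmse_reduced:.3f}")
```

Comparing the two cross-validated RMSE values is one way to choose between candidate predictor sets; the diagnostics listed above (heteroskedasticity, normality, multicollinearity) would still need to be checked separately.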
- Analysis of factors influencing target deathrates in different US counties.
- Prediction of the effects of varying poverty levels on health outcomes in different US counties.
- In-depth analysis of how various socio-economic factors (e.g., median income, educational attainment, etc.) contribute to overall public health outcomes in US counties
If you use this dataset in your research, please credit the original authors.
License: Dataset copyright by authors - You are free to: - Share - copy and redistribute the material in any medium or format for any purpose, even commercially. - Adapt - remix, transform, and build upon the material for any purpose, even commercially. - You must: - Give appropriate credit - Provide a link to the license, and indicate if changes were made. - ShareAlike - You must distribute your contributions under the same license as the original. -...
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
SABE Study (Saúde, Bem-estar e Envelhecimento), Brazil, 2015.
The 2018 Tonga National Disability Survey was conducted jointly by the Tonga Department of Statistics (TDS) and the Ministry of Internal Affairs, Social Protection and Disability. It is the first population-based comprehensive disability survey in the country. Funding was provided through a number of bodies including UNICEF, DFAT and the Tonga Government. The Pacific Community provided technical support throughout the different stages of the survey.
The main purpose of the survey is to describe demographic, social and economic characteristics of persons with disabilities and determine the prevalence by type of disability in Tonga, and thus help the government and decision makers in formulating more suitable national plans and policies relevant to persons with disabilities.
The other objectives of the Disability survey were to collect data that would determine, but not be limited to, the following: a. disability prevalence rate at the national, urban and rural levels based on the Washington Group recommendations; b. degree of activity limitations and participation restrictions and societal activities for persons with disability; c. ascertain the specific vulnerabilities that children and adults with disability face in Tonga; d. establish the accessibility of health and social services for persons with disability in Tonga; e. generate data that guides the development of policies and strategies that ensure equity and opportunities for children and adults with disabilities.
An additional module was included to collect information on people's perception/experiences of service delivery of Government to the public.
Version 01: Clean, labelled and de-identified version of the Master file.
The scope of the study involves Disability. Various sections of the Questionnaire are listed below.
HOUSEHOLDS:
-Basic household characteristics of the private dwellings, including sanitation, water, electricity, household materials and household wealth;
INDIVIDUALS:
-Basic demographic characteristics of individuals in a particular household dwelling, including age, sex, ethnicity, religion, marital status, educational attainment, and economic activity
(Children aged 2-4 years:
-Level of difficulty functioning by domain, tools and supports, age of onset of difficulty, cause of difficulty, health, transport;)
(Children aged 5-17 years:
-Level of difficulty functioning by domain, tools and supports received, age of onset of difficulty, cause of difficulty, health, transport, education, employment, income, participation and accessibility)
(Adults aged 18 years and older:
-Level of difficulty functioning by domain, tools and supports received, age of onset of difficulty, cause of difficulty, health, transport, education, employment, income, participation and accessibility).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This study examines the associations of socioeconomic status (SES) with intensity of different types of physical activity (PA) in Chinese adults, aimed at outlining and projecting socioeconomic disparities in PA among the population undergoing a rapid nutrition transition. A community-based survey was conducted among 3,567 residents aged 30-65 years old in Jiaxing, China, in 2010. SES and PA were assessed by a structured questionnaire. SES was assessed as socioeconomic index (SEI) score based on self-reported educational attainment, household income and occupation. Metabolic equivalents (METs) were calculated for each subject to quantify the total amount of PA from occupation, exercise, transportation and housework.
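The MET-based total described above can be illustrated with the usual MET-hours arithmetic: each domain's weekly hours are multiplied by a MET intensity value and the products are summed. This is a sketch only. The MET values and weekly hours below are placeholders chosen for illustration, not figures from the study, and the study's own scoring procedure may differ.

```python
# Hypothetical MET intensities per activity domain (placeholders, not study values).
MET_VALUES = {
    "occupation": 2.5,
    "exercise": 6.0,
    "transportation": 3.5,
    "housework": 3.0,
}


def total_met_hours(hours_per_week, met_values=MET_VALUES):
    """Sum MET value times weekly hours over the four activity domains."""
    return sum(met_values[domain] * hours for domain, hours in hours_per_week.items())


# One hypothetical respondent's self-reported weekly hours in each domain.
respondent = {"occupation": 40, "exercise": 2, "transportation": 5, "housework": 7}
met_hours = total_met_hours(respondent)
print(f"total MET-hours per week: {met_hours}")
```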