Socioeconomic indicators like the poverty rate, population change, unemployment rate, and education levels vary across the nation. ERS has compiled the latest data on these measures into a mapping and data display/download application that allows users to identify and compare States and counties on these indicators.
Data on county socioeconomic status for 2,132 US counties and each county’s average annual cardiovascular mortality rate (CMR) and total PM2.5 concentration for 21 years (1990-2010). County CMR, PM2.5, and socioeconomic data were obtained from the U.S. National Center for Health Statistics, U.S. Environmental Protection Agency’s Community Multiscale Air Quality modeling system, and the U.S. Census, respectively. A socioeconomic index was created using seven county-level measures from the 1990 US census using factor analysis. Quintiles of this index were used to generate categories of county socioeconomic status. This dataset is associated with the following publication: Wyatt, L., G. Peterson, T. Wade, L. Neas, and A. Rappold. The contribution of improved air quality to reduced cardiovascular mortality: Declines in socioeconomic differences over time. ENVIRONMENT INTERNATIONAL. Elsevier B.V., Amsterdam, NETHERLANDS, 136: 105430, (2020).
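The quintile step described above can be sketched as follows. This is only an illustration under stated assumptions: the factor analysis that produces the index is not shown, the function name and the sample index values are hypothetical, and ties are broken by input order, which the source does not specify.

```python
def quintile_categories(index_values):
    """Assign each county a category 1 (lowest index) to 5 (highest index).

    Assumption: counties are split into five groups of near-equal size
    by rank; the source does not say how ties or uneven counts were handled.
    """
    n = len(index_values)
    # Positions of the counties, ordered from lowest to highest index value.
    order = sorted(range(n), key=lambda i: index_values[i])
    categories = [0] * n
    for position, county in enumerate(order):
        categories[county] = position * 5 // n + 1
    return categories


# Hypothetical index values for ten counties (not real data).
example_index = [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
example_categories = quintile_categories(example_index)
```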
https://www.icpsr.umich.edu/web/ICPSR/studies/36366/terms
These data are part of NACJD's Fast Track Release and are distributed as they were received from the data depositor. The files have been zipped by NACJD for release, but not checked or processed except for the removal of direct identifiers. Users should refer to the accompanying readme file for a brief description of the files available with this collection and consult the investigator(s) if further information is needed. The study includes data collected with the purpose of creating an integrated dataset that would allow researchers to address significant, policy-relevant gaps in the literature--those that are best answered with cross-jurisdictional data representing a wide array of economic and social factors. The research addressed five research questions: What is the impact of gentrification and suburban diversification on crime within and across jurisdictional boundaries? How does crime cluster along and around transportation networks and hubs in relation to other characteristics of the social and physical environment? What is the distribution of criminal justice-supervised populations in relation to services they must access to fulfill their conditions of supervision? What are the relationships among offenders, victims, and crimes across jurisdictional boundaries? What is the increased predictive power of simulation models that employ cross-jurisdictional data?
https://www.icpsr.umich.edu/web/ICPSR/studies/38528/terms
These datasets contain measures of socioeconomic and demographic characteristics by U.S. census tract for the years 1990-2022 and ZIP code tabulation area (ZCTA) for the years 2008-2022. Example measures include population density; population distribution by race, ethnicity, age, and income; income inequality by race and ethnicity; and proportion of population living below the poverty level, receiving public assistance, and female-headed or single parent families with kids. The datasets also contain a set of theoretically derived measures capturing neighborhood socioeconomic disadvantage and affluence, as well as a neighborhood index of Hispanic, foreign born, and limited English.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Country Socioeconomic Status Scores: 1880-2010’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/sdorius/globses on 14 February 2022.
--- Dataset description provided by original source is as follows ---
This dataset contains estimates of the socioeconomic status (SES) position of each of 149 countries covering the period 1880-2010. Measures of SES, which are in decades, allow for a 130-year time-series analysis of the changing position of countries in the global status hierarchy. SES scores are the average of each country’s income and education ranking and are reported as percentile rankings ranging from 1-99. As such, they can be interpreted similarly to other percentile rankings, such as high school standardized test scores. If country A has an SES score of 55, for example, it indicates that 55 percent of the world’s people live in a country with a lower average income and education ranking than country A. ISO alpha and numeric country codes are included to allow users to merge these data with other variables, such as those found in the World Bank’s World Development Indicators Database and the United Nations Common Database.
See here for a working example of how the data might be used to better understand how the world came to look the way it does, at least in terms of status position of countries.
VARIABLE DESCRIPTIONS:
UNID: ISO numeric country code (used by the United Nations)
WBID: ISO alpha country code (used by the World Bank)
SES: Socioeconomic status score (percentile, 1-99) based on GDP per capita and educational attainment (n=174 countries)
country: Short country name
year: Survey year
gdppc: GDP per capita: Single time-series (imputed)
yrseduc: Completed years of education in the adult (15+) population
popshare: Total population shares
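The population-weighted reading of the SES score (the share of the world's people living in countries with a lower ranking) can be illustrated with a short sketch. The function name, the composite scores, and the population figures below are hypothetical, and this is not the compiler's actual procedure, which the description does not spell out.

```python
def population_weighted_percentile(scores, population):
    """Percent of total population living in countries with a strictly lower score.

    scores:     dict mapping country -> composite income/education score
    population: dict mapping country -> population (any consistent unit)
    """
    total = sum(population.values())
    result = {}
    for country, score in scores.items():
        # Sum the population of every country that ranks below this one.
        below = sum(population[c] for c, s in scores.items() if s < score)
        result[country] = 100.0 * below / total
    return result


# Hypothetical three-country world (populations in millions, not real data).
example_scores = {"A": 3.0, "B": 1.0, "C": 2.0}
example_population = {"A": 20, "B": 50, "C": 30}
example_percentiles = population_weighted_percentile(example_scores, example_population)
```

In this toy example, country A outranks countries B and C, which together hold 80 of the 100 million people, so A's population-weighted percentile is 80.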
DATA SOURCES:
The dataset was compiled by Shawn Dorius (sdorius@iastate.edu) from a large number of data sources, listed below.
GDP per Capita:
1. Maddison, Angus. 2004. 'The World Economy: Historical Statistics'. Organization for Economic Co-operation and Development: Paris. Maddison population data in 000s; GDP & GDP per capita data in (1990 Geary-Khamis dollars, PPPs of currencies and average prices of commodities). Maddison data collected from: http://www.ggdc.net/MADDISON/Historical_Statistics/horizontal-file_02-2010.xls.
2. World Development Indicators Database
Years of Education
1. Morrisson and Murtin. 2009. 'The Century of Education'. Journal of Human Capital 3(1):1-42. Data downloaded from http://www.fabricemurtin.com/
2. Cohen, Daniel & Marcelo Soto. 2007. 'Growth and human capital: Good data, good results'. Journal of Economic Growth 12(1):51-76. Data downloaded from http://soto.iae-csic.org/Data.htm
3. Barro, Robert and Jong-Wha Lee, 2013, "A New Data Set of Educational Attainment in the World, 1950-2010." Journal of Development Economics, vol 104, pp.184-198. Data downloaded from http://www.barrolee.com/
Total Population
1. Maddison, Angus. 2004. 'The World Economy: Historical Statistics'. Organization for Economic Co-operation and Development: Paris. 13.
2. United Nations Population Division. 2009.
--- Original source retains full ownership of the source dataset ---
The American Community Survey (ACS) 5 Year 2016-2020 socioeconomic estimate data is a subset of information derived from the following census tables:
B08013 - Aggregate Travel Time To Work Of Workers By Sex
B08303 - Travel Time To Work
B17019 - Poverty Status In The Past 12 Months Of Families By Household Type By Tenure
B17021 - Poverty Status Of Individuals In The Past 12 Months By Living Arrangement
B19001 - Household Income In The Past 12 Months
B19013 - Median Household Income In The Past 12 Months
B19025 - Aggregate Household Income In The Past 12 Months
B19113 - Median Family Income In The Past 12 Months
B19202 - Median Non-family Household Income In The Past 12 Months
B23001 - Sex By Age By Employment Status For The Population 16 Years And Over
B25014 - Tenure By Occupants Per Room
B25026 - Total Population in Occupied Housing Units by Tenure by year Householder Moved into Unit
B25106 - Tenure By Housing Costs As A Percentage Of Household Income In The Past 12 Months
C24010 - Sex By Occupation For The Civilian Employed Population 16 Years And Over
B20004 - Median Earnings In the Past 12 Months (In 2015 Inflation-Adjusted Dollars) by Sex by Educational Attainment for the Population 25 Years and Over
B23006 - Educational Attainment by Employment Status for the Population 25 to 64 Years
B24021 - Occupation By Median Earnings In The Past 12 Months (In 2015 Inflation-Adjusted Dollars) For The Full-Time, Year-Round Civilian Employed Population 16 Years And Over
To learn more about the American Community Survey (ACS) and associated datasets, visit https://www.census.gov/programs-surveys/acs. For questions about the spatial attribution of this dataset, please reach out to us at GISHelpdesk@hud.gov.
Data Dictionary: DD_ACS 5-Year Socioeconomic Estimate Data by State
Date of Coverage: 2016-2020
http://opendatacommons.org/licenses/dbcl/1.0/
This dataset contains estimates of the socioeconomic status (SES) position of each of 149 countries covering the period 1880-2010. Measures of SES, which are in decades, allow for a 130-year time-series analysis of the changing position of countries in the global status hierarchy. SES scores are the average of each country’s income and education ranking and are reported as percentile rankings ranging from 1-99. As such, they can be interpreted similarly to other percentile rankings, such as high school standardized test scores. If country A has an SES score of 55, for example, it indicates that 55 percent of the countries in this dataset have a lower average income and education ranking than country A. ISO alpha and numeric country codes are included to allow users to merge these data with other variables, such as those found in the World Bank’s World Development Indicators Database and the United Nations Common Database.
See here for a working example of how the data might be used to better understand how the world came to look the way it does, at least in terms of status position of countries.
VARIABLE DESCRIPTIONS:
unid: ISO numeric country code (used by the United Nations)
wbid: ISO alpha country code (used by the World Bank)
SES: Country socioeconomic status score (percentile) based on GDP per capita and educational attainment (n=174)
country: Short country name
year: Survey year
gdppc: GDP per capita: Single time-series (imputed)
yrseduc: Completed years of education in the adult (15+) population
region5: Five category regional coding schema
regionUN: United Nations regional coding schema
DATA SOURCES:
The dataset was compiled by Shawn Dorius (sdorius@iastate.edu) from a large number of data sources, listed below.
GDP per Capita:
1. Maddison, Angus. 2004. 'The World Economy: Historical Statistics'. Organization for Economic Co-operation and Development: Paris. GDP & GDP per capita data in (1990 Geary-Khamis dollars, PPPs of currencies and average prices of commodities). Maddison data collected from: http://www.ggdc.net/MADDISON/Historical_Statistics/horizontal-file_02-2010.xls.
2. World Development Indicators Database
Years of Education
1. Morrisson and Murtin. 2009. 'The Century of Education'. Journal of Human Capital 3(1):1-42. Data downloaded from http://www.fabricemurtin.com/
2. Cohen, Daniel & Marcelo Soto. 2007. 'Growth and human capital: Good data, good results'. Journal of Economic Growth 12(1):51-76. Data downloaded from http://soto.iae-csic.org/Data.htm
3. Barro, Robert and Jong-Wha Lee, 2013, "A New Data Set of Educational Attainment in the World, 1950-2010." Journal of Development Economics, vol 104, pp.184-198. Data downloaded from http://www.barrolee.com/
Total Population
1. Maddison, Angus. 2004. 'The World Economy: Historical Statistics'. Organization for Economic Co-operation and Development: Paris. 13.
2. United Nations Population Division. 2009.
The purpose of the SEPHER data set is to allow for testing, assessing and generating new analysis and metrics that can address inequalities and climate injustice. The data set was created by Tedesco, M., C. Hultquist, S. E. Char, C. Constantinides, T. Galjanic, and A. D. Sinha.
SEPHER draws upon four major source datasets: CDC Social Vulnerability Index, FEMA National Risk Index, Home Mortgage Disclosure Act, and Evictions datasets. The data from these source datasets have been merged, cleaned, and standardized, and all of the variables are documented in the data dictionary.
CDC Social Vulnerability Index
The CDC Social Vulnerability Index (SVI) dataset was prepared for the Centers for Disease Control and Prevention to assess the degree of social vulnerability of American communities to natural hazards and anthropogenic events. It contains data on 15 social factors taken or derived from Census reports, as well as rankings of each tract on these individual factors, on groups of factors corresponding to four related themes (Socioeconomic, Household Composition & Disability, Minority Status & Language, and Housing Type & Transportation), and overall. The data are available for the years 2000, 2010, 2014, 2016, and 2018.
FEMA National Risk Index
The National Risk Index (NRI) dataset compiled by the Federal Emergency Management Agency (FEMA) consists of historic natural disaster data from across the United States at the tract level. The dataset includes information about 18 natural disasters including earthquakes, tsunamis, wildfires, volcanic activity and many others. Each disaster is described in terms of its frequency, historic impact, potential exposure, expected annual loss and associated risk. The dataset also includes some summary variables for each tract, including the total expected loss in terms of building loss, human loss and agricultural loss, the population of the tract, and the area covered by the tract. Finally, it includes a few more features that characterize the population, such as social vulnerability rating and community resilience.
Home Mortgage Disclosure Act
The Home Mortgage Disclosure Act (HMDA) dataset contains loan-level data for home mortgages including information on applications, denials, approvals, and institution purchases. It is managed and expanded annually by the Consumer Financial Protection Bureau based on the data collected from financial institutions. The dataset is used by public officials to make decisions and policies, uncover lending patterns and discrimination among mortgage applicants, and investigate if lenders are serving the housing needs of the communities. It covers the period from 2007 to 2017.
Evictions
The Evictions dataset is compiled and managed by the Eviction Lab at Princeton University and consists of court records related to eviction cases in the United States between 2000 and 2016. Its purpose is to estimate the prevalence of court-ordered evictions and compare eviction rates among states, counties, cities, and neighborhoods. Besides information on eviction filings and judgments, the dataset includes socioeconomic and real estate data for each tract including race/ethnic origin, household income, poverty rate, property value, median gross rent, rent burden, and others.
This dataset contains a selection of six socioeconomic indicators of public health significance and a “hardship index,” by Chicago community area, for the years 2008 – 2012. The indicators are the percent of occupied housing units with more than one person per room (i.e., crowded housing); the percent of households living below the federal poverty level; the percent of persons in the labor force over the age of 16 years that are unemployed; the percent of persons over the age of 25 years without a high school diploma; the percent of the population under 18 or over 64 years of age (i.e., dependency); and per capita income. Indicators for Chicago as a whole are provided in the final row of the table. See the full dataset description for more information at: https://data.cityofchicago.org/api/views/fwb8-6aw5/files/A5KBlegGR2nWI1jgP6pjJl32CTPwPbkl9KU3FxlZk-A?download=true&filename=P:\EPI\OEPHI\MATERIALS\REFERENCES\ECONOMIC_INDICATORS\Dataset_Description_socioeconomic_indicators_2012_FOR_PORTAL_ONLY.pdf
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
LivWell is a global longitudinal database which provides a range of key indicators related to women’s socioeconomic status, health and well-being, access to basic services, and demographic outcomes. Data are available at the sub-national level for 52 countries and 447 regions. A total of 134 indicators are based on 199 Demographic and Health Surveys for the period 1990-2019, supplemented by extensive information on socioeconomic and climatic conditions in the respective regions for a total of 190 indicators. The resulting data offer various opportunities for policy-relevant research on gender inequality, inclusive development, and demographic trends at the sub-national level.
For a full description, please refer to the article describing the database here: https://www.nature.com/articles/s41597-022-01824-2
The companion repository livwelldata makes it easy to use the database in R. The R package can be downloaded by following the instructions in this git repository: https://gitlab.pik-potsdam.de/belmin/livwelldata. The version of the database in the package is the same as in this repository.
https://cubig.ai/store/terms-of-service
1) Data Introduction
• The Adult Census Dataset is a structured dataset for binary classification, designed to predict whether an individual's income exceeds a certain threshold based on demographic and employment attributes such as age, education, occupation, and hours worked per week. It serves as a valuable foundation for analyzing socioeconomic factors and understanding the determinants of income levels.
2) Data Utilization
(1) Characteristics of the Adult Census Dataset:
• Each data entry includes personal socioeconomic attributes such as age, gender, education level, occupation, marital status, and working hours, with a binary label indicating whether the individual's income is over or under the specified threshold.
(2) Applications of the Adult Census Dataset:
• Income prediction model development: Machine learning classification models can be trained to predict individual income levels, which can be applied to financial product recommendations and policy design.
• Socioeconomic factor analysis: The dataset can be used in sociological and economic research to analyze and visualize how various factors, such as education, occupation, and gender, impact income distribution.
A more recent web map on this same topic is available for ArcGIS Online subscribers here.
This map shows the socioeconomic status of each census tract. Data come from the US Census Bureau's 2011-2015 American Community Survey. Neighborhood Socioeconomic Status, over and above individual socioeconomic status, is a predictor of many health outcomes. The Neighborhood Socioeconomic Status (NSES) Index is on a scale from 0 to 100 with 50 being the national average around 2010. The Index incorporates the following indicators (fields in this layer's attribute table):
Median Household Income (from Table B19013)
Percent of individuals with income below the Federal Poverty Line (from Table S1701)
The educational attainment of adults (age 25+) (from Table B15003)
Unemployment Rate (from Table S2301)
Percent of households with children under the age of 18 that are "female-headed" (no male present) (from Table B11005)
NSES = log(median household income) + (-1.129 * (log(percent of female-headed households))) + (-1.104 * (log(unemployment rate))) + (-1.974 * (log(percent below poverty))) + .451*((high school grads)+(2*(bachelor's degree holders)))
To learn more about how the NSES Index was developed, please explore this journal article: Miles, Jeremy; Weden, Margaret; Lavery, Diana; Escarce, José; Cagney, Kathleen; Shih, Regina. 2016. “Constructing a Time-Invariant Measure of the Socio-Economic Status of U.S. Census Tracts.” Journal of Urban Health, vol. 93, issue no. 1, pp. 213-232. Alternatively, see this PPT presentation given at the University of Texas at San Antonio's Applied Demography Conference in 2014.
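The NSES formula quoted above can be written as a small function. This is a sketch under several assumptions the description leaves open: `log` is taken to be the natural logarithm, the function and parameter names are hypothetical, and the units of the percentage and education inputs (percent versus proportion) are not stated in the source. The rescaling that puts the published index on a 0 to 100 scale is also not given, so only the raw weighted sum is computed.

```python
import math


def nses_raw(median_income, pct_female_headed, unemployment_rate,
             pct_below_poverty, hs_grads, bachelors):
    """Raw NSES sum using the coefficients quoted in the description.

    Assumptions: natural log; all logged inputs must be positive;
    no 0-100 rescaling is applied because the source does not specify it.
    """
    return (math.log(median_income)
            - 1.129 * math.log(pct_female_headed)
            - 1.104 * math.log(unemployment_rate)
            - 1.974 * math.log(pct_below_poverty)
            + 0.451 * (hs_grads + 2 * bachelors))


# Hypothetical tract values (not real data): higher poverty lowers the score.
lower_poverty_tract = nses_raw(50000, 10, 5, 10, 0.3, 0.2)
higher_poverty_tract = nses_raw(50000, 10, 5, 20, 0.3, 0.2)
```

Because each logged term enters with a negative coefficient except income, a tract with a higher poverty, unemployment, or female-headed household share scores lower, all else equal.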
This dataset contains a selection of six socioeconomic indicators of public health significance and a “hardship index,” by Chicago community area, for the years 2007 – 2011. The indicators are the percent of occupied housing units with more than one person per room (i.e., crowded housing); the percent of households living below the federal poverty level; the percent of persons in the labor force over the age of 16 years that are unemployed; the percent of persons over the age of 25 years without a high school diploma; the percent of the population under 18 or over 64 years of age (i.e., dependency); and per capita income. Indicators for Chicago as a whole are provided in the final row of the table. See the full dataset description for more information at https://data.cityofchicago.org/api/assets/8D10B9D1-CCA3-4E7E-92C7-5125E9AB46E9.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The data were collected from the administrative records of a Colombian public university. This dataset was compiled by the Universidad Nacional de Colombia to analyze socioeconomic characteristics of students in the first semester of 2021. The dataset includes attributes such as socioeconomic status, tuition fees, academic program, and campus location. Each record corresponds to a unique student, with identifiers fully anonymized to ensure privacy. Additionally, it contains detailed socioeconomic variables, such as the income levels of students' parents, family residence type, and other demographic indicators, allowing for a more nuanced analysis of students' backgrounds. The data were processed and converted into CSV format, resulting in a structured dataset containing approximately 3361 records with multiple variables that enable extensive analysis of socioeconomic distribution, educational access, and the influence of family economic factors on higher education outcomes. This dataset is intended for research on the socioeconomic dynamics within higher education and requires minimal preprocessing before analysis.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Socioeconomic Status Social Vulnerability Index 2014-2018’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://catalog.data.gov/dataset/2c5de914-693a-405e-80f7-5abb6b7dd3d3 on 28 January 2022.
--- Dataset description provided by original source is as follows ---
Socioeconomic Status is one of the themes of the Social Vulnerability Index from the CDC. This data set is data from the Social Vulnerability Index. Only the Socioeconomic Status columns are represented in this dataset.
ATSDR’s Geospatial Research, Analysis & Services Program (GRASP) created the Centers for Disease Control and Prevention Social Vulnerability Index (CDC SVI or simply SVI, hereafter) to help public health officials and emergency response planners identify and map the communities that will most likely need support before, during, and after a hazardous event.
SVI indicates the relative vulnerability of every U.S. Census tract. Census tracts are subdivisions of counties for which the Census collects statistical data. SVI ranks the tracts on 15 social factors, including unemployment, minority status, and disability, and further groups them into four related themes. Thus, each tract receives a ranking for each Census variable and for each of the four themes, as well as an overall ranking.
In addition to tract-level rankings, SVI 2018 also has corresponding rankings at the county level. Notes below that describe “tract” methods also refer to county methods.
--- Original source retains full ownership of the source dataset ---
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset presents the mean household income for each of the five quintiles in Social Circle, GA, as reported by the U.S. Census Bureau. The dataset highlights the variation in mean household income across quintiles, offering valuable insights into income distribution and inequality.
Key observations
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is a part of the main dataset for Social Circle median household income. You can refer to the same here.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset comprises electroencephalogram (EEG) data collected from 127 young adults (18-30 years), along with retrospective objective and subjective indicators of childhood family socioeconomic status (SES), as well as SES indicators in adulthood, such as educational attainment, individual and household income, food security, and home and neighborhood characteristics. The EEG data were recorded with tasks directly acquired from the Event-Related Potentials Compendium of Open Resources and Experiments ERP CORE (Kappenman et al., 2021), or adapted from these tasks (Isbell et al., 2024). These tasks were optimized to capture neural activity manifest in perception, cognition, and action, in neurotypical young adults. Furthermore, the dataset includes a symptoms checklist, consisting of questions that were found to be predictive of symptoms consistent with attention-deficit/hyperactivity disorder (ADHD) in adulthood, which can be used to investigate the links between ADHD symptoms and neural activity in a socioeconomically diverse young adult sample. The detailed description of the dataset is accepted for publication in Scientific Data, with the title: "Cognitive Electrophysiology in Socioeconomic Context in Adulthood."
EEG data were recorded using the Brain Products actiCHamp Plus system, in combination with BrainVision Recorder (Version 1.25.0101). We used a 32-channel actiCAP slim active electrode system, with electrodes mounted on elastic snap caps (Brain Products GmbH, Gilching, Germany). The ground electrode was placed at FPz. From the electrode bundle, we repurposed 2 electrodes by placing them on the mastoid bones behind the left and right ears to be used for re-referencing after data collection. We also repurposed 3 additional electrodes to record electrooculogram (EOG). To capture eye artifacts, we placed the horizontal EOG (HEOG) electrodes lateral to the external canthus of each eye. We also placed one vertical EOG (VEOG) electrode below the right eye. The remaining 27 electrodes were used as scalp electrodes, which were mounted per the international 10/20 system. EEG data were recorded at a sampling rate of 500 Hz and referenced to the Cz electrode. StimTrak was used to assess stimulus presentation delays for both the monitor and headphones. The results indicated that both the visual and auditory stimuli had a delay of approximately 20 ms. Therefore, users should shift the event codes by 20 ms when conducting stimulus-locked analyses.
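The 20 ms correction can be illustrated with a little arithmetic: at a 500 Hz sampling rate, 20 ms corresponds to 10 samples. The sketch below assumes event codes are stored as sample indices and that the shift is toward later samples (because the stimulus appears after its event code); the function names and sample values are hypothetical and are not taken from the shared EEGLAB/ERPLAB scripts.

```python
SAMPLING_RATE_HZ = 500   # from the dataset description
STIMULUS_DELAY_MS = 20   # approximate delay measured with StimTrak


def delay_in_samples(delay_ms=STIMULUS_DELAY_MS, sampling_rate_hz=SAMPLING_RATE_HZ):
    """Convert a delay in milliseconds to a whole number of samples."""
    return round(delay_ms * sampling_rate_hz / 1000)


def shift_event_samples(event_samples, delay_ms=STIMULUS_DELAY_MS,
                        sampling_rate_hz=SAMPLING_RATE_HZ):
    """Move each event code later by the stimulus delay (assumed direction)."""
    shift = delay_in_samples(delay_ms, sampling_rate_hz)
    return [sample + shift for sample in event_samples]


# Hypothetical event onsets in samples (not taken from the dataset).
corrected_events = shift_event_samples([100, 250, 1000])
```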
Before the data were publicly shared, all identifiable information was removed, including date of birth, date of session, race/ethnicity, zip code, occupation (self and parent), and names of the languages the participants reported speaking and understanding fluently. Date of birth and date of session were used to compute age in years, which is included in the dataset. Furthermore, several variables were recoded based on re-identification risk assessments. Users who would like to establish secure access to components of the dataset we could not publicly share due to re-identification risks, should contact the corresponding researcher as described below. The dataset consists of participants recruited for studies on adult cognition in context. To provide the largest sample size, we included all participants who completed at least one of the EEG tasks of interest. Each participant completed each EEG task only once. The original participant IDs with which the EEG data were saved were recoded and the raw EEG files were renamed to make the dataset BIDS compatible.
The ERP CORE experimental tasks can be found on OSF, under Experiment Control Files: https://osf.io/thsqg/
Examples of EEGLAB/ERPLAB data processing scripts that can be used with the EEG data shared here can be found on OSF:
osf.io/thsqg osf.io/43H75
Contact
* If you have any questions, comments, or requests, please contact:
* Elif Isbell: eisbell@ucmerced.edu
This dataset is licensed under CC0.
Isbell, E., Peters, A. N., Richardson, D. M., & Rodas De León, N. E. (2025). Cognitive electrophysiology in socioeconomic context in adulthood. Scientific Data, 12(1), 1–9. https://doi.org/10.1038/s41597-025-05209-z
Isbell, E., De León, N. E. R., & Richardson, D. M. (2024). Childhood family socioeconomic status is linked to adult brain electrophysiology. PLOS ONE, 19(8), e0307406.
Isbell, E., De León, N. E. R., & Richardson, D. M. (2024). Childhood family socioeconomic status is linked to adult brain electrophysiology - accompanying analytic data and code. OSF. https://doi.org/10.17605/osf.io/43H75
Kappenman, E. S., Farrens, J. L., Zhang, W., Stewart, A. X., & Luck, S. J. (2021). ERP CORE: An open resource for human event-related potential research. NeuroImage, 225, 117465.
Kappenman, E. S., Farrens, J., Zhang, W., Stewart, A. X., & Luck, S. J. (2020). ERP CORE. https://osf.io/thsqg
Kappenman, E., Farrens, J., Zhang, W., Stewart, A., & Luck, S. (2020). Experiment control files. https://osf.io/47uf2
blockgroupvulnerability

OPPORTUNITY

The US Centers for Disease Control (CDC) publishes a set of percentiles that compare US geographies by vulnerability across household, socioeconomic, racial/ethnic and housing themes. These Social Vulnerability Indexes (SVI) were originally intended to help public health officials and emergency response planners identify communities that will need support around an event. They are generally valuable for any public interest that wants to relate itself to needy communities by geography. The SVI publication and its basis variables are provided at the Census tract level of geographic detail. The Census' American Community Survey (ACS), however, is available down to the block group level. Recasting the SVI methods at this lower level of geography allows the index to be tied to thousands of other demographic variables. Because the SVI relies on ACS variables that are only available at the tract level, a projection model needs to be applied to approximate its results using block-group-level ACS variables. The blockgroupvulnerability dataset casts a prediction of the CDC's logic as a new contribution to the Open Environments blockgroup series available on Harvard's Dataverse platform.

DATA

The CDC's annual SVI publication starts with 23 simple derivations using 50 ACS Census variables. Next, the SVI process ranks census geographies to calculate a percentile rank for each, where Percentile Rank = (Rank-1) / (N-1). The SVI themes are then calculated at the tract level as a percentile rank of a sum of the percentile ranks of the first-level ACS-derived variables. Finally, the overall ranking is taken as the sum of the theme percentile rankings.
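The ranking steps in the DATA paragraph can be sketched in a few lines. This is an illustrative sketch only, not the CDC's or the dataset author's code: the function names are hypothetical, and the tie rule (tied values share the lowest rank) is an assumption that the description does not specify.

```python
def percentile_ranks(values):
    """Percentile Rank = (Rank - 1) / (N - 1) for each value.

    Assumption: tied values share the lowest rank among them.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two geographies to rank")
    first_rank = {}
    for rank, value in enumerate(sorted(values), start=1):
        first_rank.setdefault(value, rank)  # keep the lowest rank for ties
    return [(first_rank[value] - 1) / (n - 1) for value in values]


def theme_percentile(variable_columns):
    """Percentile rank of the per-geography sum of variable percentile ranks.

    variable_columns holds one list per variable, each with one value
    per geography, in the same geography order.
    """
    per_variable = [percentile_ranks(column) for column in variable_columns]
    sums = [sum(ranks) for ranks in zip(*per_variable)]
    return percentile_ranks(sums)


if __name__ == "__main__":
    # Hypothetical values for two variables across three tracts.
    print(theme_percentile([[10, 30, 20], [1, 3, 2]]))
```

With N geographies the lowest value always maps to 0 and the highest to 1, which matches the [0,1] range of the published themes.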
The SVI data publication:
* is keyed by geography (7 columns), where ultimately the Census Tract FIPS code is 2 State + 3 County + 4 Tract + 2 Tract Decimals; e.g., 56043000301 is 56 Wyoming, 043 Washakie County, Tract 3.01
* republishes Census demographics called 'adjunct variables', including area, population, households and housing units from the ACS, and daytime population taken from LandScan 2020 estimates
* derives 23 SVI variables from 50 ACS 5 Year variables, each having an estimate (E_), estimate percentage (EP_), margin of error (M_), margin percentage (MP_) and flag variable (F_) for those greater than 90% or less than 10%
* provides the final 4 themes and a composite SVI percentile annually

    vars = (
        ['ST', 'STATE', 'ST_ABBR', 'STCNTY', 'COUNTY', 'FIPS', 'LOCATION']
        + ['SNGPNT', 'LIMENG', 'DISABL', 'AGE65', 'AGE17', 'NOVEH', 'MUNIT', 'MOBILE',
           'GROUPQ', 'CROWD', 'UNINSUR', 'UNEMP', 'POV150', 'NOHSDP', 'HBURD', 'TWOMORE',
           'OTHERRACE', 'NHPI', 'MINRTY', 'HISP', 'ASIAN', 'AIAN', 'AFAM', 'NOINT']
        + ['TOTAL', 'THEME1', 'THEME2', 'THEME3', 'THEME4']
        + ['AREA_SQMI', 'TOTPOP', 'DAYPOP', 'HU', 'HH']
    )
    knowns = (
        vars
        # Estimates, the result of calculations against ACS variables
        + ['E_' + v for v in vars]
        # Flag (0 or 1): whether this geography is in the 90th percentile rank (it is vulnerable)
        + ['F_' + v for v in vars]
        # Margin of error for ACS calculations
        + ['M_' + v for v in vars]
        # Margin of error for ACS calculations, as a percentage
        + ['MP_' + v for v in vars]
        # Estimates of ACS calculations, as a percentage
        + ['EP_' + v for v in vars]
        # Estimated percentile ranks
        + ['EPL_' + v for v in vars]
        # Sum across variable percentile ranks
        + ['SPL_' + v for v in vars]
        # Percentile rank of the sum of percentile ranks
        + ['RPL_' + v for v in vars]
    )
    # Columns of the tract-level SVI table (svitract) that none of the known names explain
    [c for c in svitract.columns if c not in knowns]

The SVI themes range over [0,1], but the CDC uses -999 as an NA value; this is set for ~800 or 1% of tracts, which have no total population.
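To make the tract key and the NA convention concrete, the sketch below splits an 11-digit tract FIPS code into the parts described above and maps the -999 sentinel to a missing value. The function names are hypothetical and not part of the SVI publication; the zero-padding step assumes codes may arrive as integers that have lost a leading zero.

```python
SVI_NA = -999  # NA sentinel used by the CDC, per the description above


def parse_tract_fips(fips):
    """Split a tract FIPS code into state, county and tract parts.

    Layout: 2 State + 3 County + 4 Tract + 2 Tract Decimals.
    """
    code = str(fips).zfill(11)  # assumption: restore a dropped leading zero
    if len(code) != 11 or not code.isdigit():
        raise ValueError(f"not an 11-digit tract FIPS code: {fips!r}")
    state, county, tract = code[:2], code[2:5], code[5:]
    return {
        "state": state,
        "county": county,
        "tract": tract,
        # Human-readable tract number, e.g. '000301' -> '3.01'
        "tract_label": f"{int(tract[:4])}.{tract[4:]}",
    }


def na_to_none(value):
    """Return None for the -999 sentinel, otherwise the value unchanged."""
    return None if value == SVI_NA else value


if __name__ == "__main__":
    print(parse_tract_fips("56043000301"))
    print([na_to_none(v) for v in (0.25, -999, 1.0)])
```

Converting -999 before any averaging or ranking avoids treating unpopulated tracts as extremely low-vulnerability values.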
The themes are numbered:
* Socioeconomic Status – RPL_THEME1
* Household Characteristics – RPL_THEME2
* Racial & Ethnic Minority Status – RPL_THEME3
* Housing Type & Transportation – RPL_THEME4

Unlike Census data, the CDC ranks Puerto Rico and Tribal tracts separately from the rest of the US.

The themes with their variables and ACS sources are as follows:

| Theme | SVI Variable | ACS Table | ACS Variables |
| --- | --- | --- | --- |
| Socioeconomic | E_UNINSUR | S2701 | S2701_C04_001E |
| Socioeconomic | E_UNEMP | DP03 | DP03_0005E |
| Socioeconomic | E_POV150 | S1701 | S1701_C01_040E |
| Socioeconomic | E_NOHSDP | B06009 | B06009_002E |
| Socioeconomic | E_HBURD | S2503 | S2503_C01_028E + S2503_C01_032E + S2503_C01_036E + S2503_C01_040E |
| Household | E_SNGPNT | B11012 | B11012_010E + B11012_015E |
| Household | E_LIMENG | B16005 | B16005_007E + B16005_008E + B16005_012E + B16005_013E + B16005_017E + B16005_018E + B16005_022E + B16005_023E + B16005_029E + B16005_030E + B16005_034E + B16005_035E + B16005_039E + B16005_040E + B16005_044E + B16005_045E |
| Household | E_DISABL | DP02 | DP02_0072E |
| Household | E_AGE65 | S0101 | S0101_C01_030E |
| Household | E_AGE17 | B09001 | B09001_001E |
| Racial & Ethnic | E_TWOMORE | DP05 | DP05_0083E |
| Racial & Ethnic | E_OTHERRACE | DP05 | DP05_0082E |
| Racial & Ethnic | E_NHPI | DP05 | DP05_0081E |
| Racial & Ethnic | E_MINRTY | DP05 | DP05_0071E + DP05_0078E + DP05_0079E + DP05_0080E + DP05_0081E + DP05_0082E + ... |

Visit https://dataone.org/datasets/sha256%3A3edd5defce2f25c7501953ca3e77c4f15a8c71251352373a328794f961755c1c for complete metadata about this dataset.
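The variable listing for this dataset defines several SVI estimates as sums of ACS columns. A minimal sketch of that derivation follows, covering only three of the listed variables; the dictionary name, function name and sample counts are hypothetical, and the ACS values shown are made up for illustration.

```python
# Subset of the SVI-to-ACS mapping given in the dataset description.
SVI_SOURCES = {
    "E_UNEMP": ["DP03_0005E"],
    "E_SNGPNT": ["B11012_010E", "B11012_015E"],
    "E_HBURD": [
        "S2503_C01_028E",
        "S2503_C01_032E",
        "S2503_C01_036E",
        "S2503_C01_040E",
    ],
}


def derive_estimates(acs_row, sources=SVI_SOURCES):
    """Sum the ACS columns behind each SVI estimate for one geography."""
    return {
        svi_name: sum(acs_row[column] for column in columns)
        for svi_name, columns in sources.items()
    }


if __name__ == "__main__":
    # Made-up ACS counts for a single tract (illustration only).
    example_row = {
        "DP03_0005E": 120,
        "B11012_010E": 40,
        "B11012_015E": 15,
        "S2503_C01_028E": 30,
        "S2503_C01_032E": 25,
        "S2503_C01_036E": 10,
        "S2503_C01_040E": 5,
    }
    print(derive_estimates(example_row))
```

Keeping the mapping as data rather than code makes it straightforward to extend to the remaining variables once their ACS sources are confirmed from the full metadata.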
https://creativecommons.org/publicdomain/zero/1.0/
When I was searching for COVID-19 datasets online, I soon realized that there were no comprehensive county-level datasets for the United States that included social, economic, and demographic factors in addition to the general case information already available on several sites. To quench my thirst for clean and relevant data, I gathered information from several sources to compile the dataset I was looking for.
I started by looking for a reliable dataset with general information such as confirmed cases and deaths. I found the Johns Hopkins COVID-19 dataset to be the best one for this purpose, as it is well organized and updated daily. Then, I set out looking for economic factors and population data for each county in the United States. I found a collection of such files compiled by the Economic Research Service branch of the USDA on their website. Finally, I had to find a dataset with racial and demographic information for each county, which I found on the US Census Bureau's website on a page dedicated to county population data by several characteristics. Now that I had all the data I was looking for, I determined which counties were common to all datasets. After several hours of cleaning each dataset and extracting relevant information, I combined everything into one CSV file with 2959 counties of clean information - exactly what I was looking for.
I hope that the Kaggle community will use this dataset to answer important questions regarding COVID-19 in the United States and the role that external economic, social, and demographic factors play in the shaping of the pandemic. I know that there are several patterns to be discovered and I sincerely hope that this helps our community understand just a little more about the pandemic than we do right now.
Socioeconomic indicators like the poverty rate, population change, unemployment rate, and education levels vary across the nation. ERS has compiled the latest data on these measures into a mapping and data display/download application that allows users to identify and compare States and counties on these indicators.