This dataset contains replication files for "The Surrogate Index: Combining Short-Term Proxies to Estimate Long-Term Treatment Effects More Rapidly and Precisely" by Susan Athey, Raj Chetty, Guido Imbens, and Hyunseung Kang. For more information, see https://opportunityinsights.org/paper/the-surrogate-index/. A summary of the related publication follows. The impacts of many policies, such as efforts to increase upward income mobility or improve health outcomes, are only observed with long delays. For example, it can take decades to see the effects of early childhood interventions on lifetime earnings. This problem has greatly limited researchers’ and policymakers’ ability to test and improve policies and arises frequently in our own work at Opportunity Insights on the determinants of economic opportunity. In this study, we develop a new method of estimating the long-term impacts of policies more rapidly and precisely using short-term proxies. We predict long-term outcomes (e.g., lifetime earnings) using short-term outcomes (e.g., earnings in early adulthood or test scores). We then show that the causal effects of policies on this predictive index (which we term a “surrogate index”, following terminology in the statistics literature) can help us learn about their long-term impacts more quickly under certain assumptions that are described in the full paper. We apply our method to analyze the long-term impacts of a job training experiment in California. Using short-term employment rates as surrogates, we show that one could have estimated the program’s impact on mean employment rates over a 9 year horizon within 1.5 years, with a 35% reduction in standard errors. The success of the surrogate index in this job training application suggests that our method could be applied to predict the long-term impacts of other programs as well. 
Going forward, we hope to build a public library of early indicators (surrogate indices) for social science by harnessing historical experiments along with the large-scale datasets we have built. If you would like to contribute to this effort by reporting a surrogate index that predicts long-term impacts estimated in an experiment, as in the GAIN program, please contact us.
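The two-step logic described in the summary above can be sketched as follows. This is only an illustrative sketch, not the authors' replication code: the data are made up, the prediction step is assumed to be an ordinary least-squares regression, and the names fit_ols and surrogate_index are hypothetical. The identifying assumptions discussed in the full paper are not checked here.

```python
from statistics import mean

def fit_ols(rows, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    # Augmented system [X'X | X'y].
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * yi for x, yi in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

def surrogate_index(coefs, surrogates):
    """Predicted long-term outcome given a unit's short-term outcomes."""
    return coefs[0] + sum(b * s for b, s in zip(coefs[1:], surrogates))

# Step 1 (hypothetical observational sample): learn how two short-term
# outcomes predict the long-term outcome.
obs_surrogates = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1), (1, 2)]
obs_long_term = [1.0, 3.0, 2.0, 4.0, 6.0, 5.0]
coefs = fit_ols(obs_surrogates, obs_long_term)

# Step 2 (hypothetical experimental sample): compare the mean predicted
# index between treated and control units.
treated = [(2, 1), (1, 2), (2, 2)]
control = [(1, 1), (0, 1), (1, 0)]
effect = (mean(surrogate_index(coefs, s) for s in treated)
          - mean(surrogate_index(coefs, s) for s in control))
print(round(effect, 6))
```

The experimental sample only needs the short-term outcomes, which is why the estimate can be formed before the long-term outcome is observed.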
https://dataverse.harvard.edu/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.7910/DVN/EI4WE2
This dataset contains replication files for "The Impacts of Neighborhoods on Intergenerational Mobility I: Childhood Exposure Effects" and "The Impacts of Neighborhoods on Intergenerational Mobility II: County-Level Estimates" by Raj Chetty and Nathaniel Hendren. For more information, see https://opportunityinsights.org/paper/neighborhoodsi/ and https://opportunityinsights.org/paper/neighborhoodsii/. A summary of the related publications follows. To what extent are children’s opportunities for upward economic mobility shaped by the neighborhoods in which they grow up? We study this question using data from de-identified tax records on more than five million children whose families moved across counties between 1996 and 2012. The study consists of two parts. In part one, we show that the area in which a child grows up has significant causal effects on her prospects for upward mobility. In part two, we present estimates of the causal effect of each county in the United States on a child’s chances of success. Using these results, we identify the properties of high- vs. low-opportunity areas to obtain insights into policies that can increase economic opportunity. The opinions expressed in this paper are those of the authors alone and do not necessarily reflect the views of the Internal Revenue Service or the U.S. Treasury Department. This work is a component of a larger project examining the effects of tax expenditures on the budget deficit and economic activity. All results based on tax data in this paper are constructed using statistics originally reported in the SOI Working Paper “The Economic Impacts of Tax Expenditures: Evidence from Spatial Variation across the U.S.,” approved under IRS contract TIRNO-12-P-00374.
This dataset contains replication files for "The Fading American Dream: Trends in Absolute Income Mobility Since 1940" by Raj Chetty, David Grusky, Maximilian Hell, Nathaniel Hendren, Robert Manduca, and Jimmy Narang. For more information, see https://opportunityinsights.org/paper/the-fading-american-dream/. A summary of the related publication follows. One of the defining features of the “American Dream” is the ideal that children have a higher standard of living than their parents. We assess whether the U.S. is living up to this ideal by estimating rates of “absolute income mobility” – the fraction of children who earn more than their parents – since 1940. We measure absolute mobility by comparing children’s household incomes at age 30 (adjusted for inflation using the Consumer Price Index) with their parents’ household incomes at age 30. We find that rates of absolute mobility have fallen from approximately 90% for children born in 1940 to 50% for children born in the 1980s. Absolute income mobility has fallen across the entire income distribution, with the largest declines for families in the middle class. These findings are unaffected by using alternative price indices to adjust for inflation, accounting for taxes and transfers, measuring income at later ages, and adjusting for changes in household size. Absolute mobility fell in all 50 states, although the rate of decline varied, with the largest declines concentrated in states in the industrial Midwest, such as Michigan and Illinois. The decline in absolute mobility is especially steep – from 95% for children born in 1940 to 41% for children born in 1984 – when we compare the sons’ earnings to their fathers’ earnings. Why have rates of upward income mobility fallen so sharply over the past half-century? 
There have been two important trends that have affected the incomes of children born in the 1980s relative to those born in the 1940s and 1950s: lower Gross Domestic Product (GDP) growth rates and greater inequality in the distribution of growth. We find that most of the decline in absolute mobility is driven by the more unequal distribution of economic growth rather than the slowdown in aggregate growth rates. When we simulate an economy that restores GDP growth to the levels experienced in the 1940s and 1950s but distributes that growth across income groups as it is distributed today, absolute mobility only increases to 62%. In contrast, maintaining GDP at its current level but distributing it more broadly across income groups – as it was distributed for children born in the 1940s – would increase absolute mobility to 80%, thereby reversing more than two-thirds of the decline in absolute mobility. These findings show that higher growth rates alone are insufficient to restore absolute mobility to the levels experienced in mid-century America. Under the current distribution of GDP, we would need real GDP growth rates above 6% per year to return to rates of absolute mobility in the 1940s. Intuitively, because a large fraction of GDP goes to a small fraction of high-income households today, higher GDP growth does not substantially increase the number of children who earn more than their parents. Of course, this does not mean that GDP growth does not matter: changing the distribution of growth naturally has smaller effects on absolute mobility when there is very little growth to be distributed. The key point is that increasing absolute mobility substantially would require more broad-based economic growth. We conclude that absolute mobility has declined sharply in America over the past half-century primarily because of the growth in inequality. 
If one wants to revive the “American Dream” of high rates of absolute mobility, one must have an interest in growth that is shared more broadly across the income distribution.
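The absolute mobility measure defined in the summary above (the fraction of children who earn more than their parents, with incomes adjusted for inflation) can be sketched as follows. The incomes and price-index values are made up for illustration, and absolute_mobility is a hypothetical helper name rather than anything from the replication files.

```python
def absolute_mobility(child_incomes, parent_incomes, child_cpi, parent_cpi):
    """Share of children whose real income exceeds their parents' real income.

    Incomes are nominal; dividing by the price index of the year in which
    each income was measured puts both on a common real basis.
    """
    pairs = list(zip(child_incomes, parent_incomes))
    higher = sum(1 for c, p in pairs if c / child_cpi > p / parent_cpi)
    return higher / len(pairs)

# Hypothetical parent-child pairs, both measured at age 30.
child = [60000, 40000, 90000, 30000]
parent = [20000, 25000, 30000, 20000]

# Hypothetical price levels: prices doubled between the two measurement years.
rate = absolute_mobility(child, parent, child_cpi=2.0, parent_cpi=1.0)
print(rate)
```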
https://dataverse.harvard.edu/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.7910/DVN/HM91JN
This dataset contains replication files for "Is the United States Still a Land of Opportunity? Recent Trends in Intergenerational Mobility" by Raj Chetty, Nathaniel Hendren, Patrick Kline, Emmanuel Saez, and Nicholas Turner. For more information, see https://opportunityinsights.org/paper/recentintergenerationalmobility/. A summary of the related publication follows. We present new evidence on trends in intergenerational mobility in the U.S. using administrative earnings records. We find that percentile rank-based measures of intergenerational mobility have remained extremely stable for the 1971-1993 birth cohorts. For children born between 1971 and 1986, we measure intergenerational mobility based on the correlation between parent and child income percentile ranks. For more recent cohorts, we measure mobility as the correlation between a child’s probability of attending college and her parents’ income rank. We also calculate transition probabilities, such as a child’s chances of reaching the top quintile of the income distribution starting from the bottom quintile. Based on all of these measures, we find that children entering the labor market today have the same chances of moving up in the income distribution (relative to their parents) as children born in the 1970s. However, because inequality has risen, the consequences of the “birth lottery” – the parents to whom a child is born – are larger today than in the past. The views expressed in this paper are those of the authors and do not necessarily represent the views or policies of the US Treasury Department or the Internal Revenue Service or the National Bureau of Economic Research.
This dataset contains replication files for "Measuring the Impacts of Teachers I: Evaluating Bias in Teacher Value-Added Estimates" and "Measuring the Impacts of Teachers II: Teacher Value-Added and Student Outcomes in Adulthood" by Raj Chetty, John Friedman, and Jonah E. Rockoff. For more information, see https://opportunityinsights.org/paper/teachersi/ and https://opportunityinsights.org/paper/teachersii/. A summary of each related publication follows. I: Are teachers’ impacts on students’ test scores (“value-added”) a good measure of their quality? One reason this question has sparked debate is disagreement about whether value-added (VA) measures provide unbiased estimates of teachers’ causal impacts on student achievement. We test for bias in VA using previously unobserved parent characteristics and a quasi-experimental design based on changes in teaching staff. Using school district and tax records for more than one million children, we find that VA models which control for a student’s prior test scores exhibit little bias in forecasting teachers’ impacts on student achievement. II: Are teachers’ impacts on students’ test scores (“value-added”) a good measure of their quality? This question has sparked debate partly because of a lack of evidence on whether high value-added (VA) teachers improve students’ long-term outcomes. Using school district and tax records for more than one million children, we find that students assigned to high-VA teachers are more likely to attend college, earn higher salaries, and are less likely to have children as teenagers. Replacing a teacher whose VA is in the bottom 5% with an average teacher would increase the present value of students’ lifetime income by approximately $250,000 per classroom.
This dataset contains replication files for "The Association Between Income and Life Expectancy in the United States, 2001-2014" by Augustin Bergeron, Raj Chetty, David Cutler, Benjamin Scuderi, Michael Stepner, and Nicholas Turner. For more information, see https://opportunityinsights.org/paper/lifeexpectancy/. A summary of the related publication follows. How can we reduce socioeconomic disparities in health outcomes? Although it is well known that there are significant differences in health and longevity between income groups, debate remains about the magnitudes and determinants of these differences. We use new data from 1.4 billion anonymous earnings and mortality records to construct more precise estimates of the relationship between income and life expectancy at the national level than was feasible in prior work. We then construct new local area (county and metro area) estimates of life expectancy by income group and identify factors that are associated with higher levels of life expectancy for low-income individuals. Our findings show that disparities in life expectancy are not inevitable. There are cities throughout America — from New York to San Francisco to Birmingham, AL — where gaps in life expectancy are relatively small or are narrowing over time. Replicating these successes more broadly will require targeted local efforts, focusing on improving health behaviors among the poor in cities such as Las Vegas and Detroit. Our findings also imply that federal programs such as Social Security and Medicare are less redistributive than they might appear because low-income individuals obtain these benefits for significantly fewer years than high-income individuals, especially in cities like Detroit. Going forward, the challenge is to understand the mechanisms that lead to better health and longevity for low-income individuals in some parts of the U.S. 
To facilitate future research and monitor local progress, we have posted annual statistics on life expectancy by income group and geographic area (state, CZ, and county) at The Health Inequality Project website. Using these data, researchers will be able to study why certain places have high or improving levels of life expectancy and ultimately apply these lessons to reduce health disparities in other parts of the country.
https://dataverse.harvard.edu/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.7910/DVN/RCHDXX
This dataset contains replication files for "A Practical Method to Reduce Privacy Loss when Disclosing Statistics Based on Small Samples" by Raj Chetty and John Friedman. For more information, see https://opportunityinsights.org/paper/differential-privacy/. A summary of the related publication follows. Releasing statistics based on small samples – such as estimates of social mobility by Census tract, as in the Opportunity Atlas – is very valuable for policy but can potentially create privacy risks by unintentionally disclosing information about specific individuals. To mitigate such risks, we worked with researchers at the Harvard Privacy Tools Project and Census Bureau staff to develop practical methods of reducing the risks of privacy loss when releasing such data. This paper describes the methods that we developed, which can be applied to disclose any statistic of interest that is estimated using a sample with a small number of observations. We focus on the case where the dataset can be broken into many groups (“cells”) and one is interested in releasing statistics for one or more of these cells. Building on ideas from the differential privacy literature, we add noise to the statistic of interest in proportion to the statistic’s maximum observed sensitivity, defined as the maximum change in the statistic from adding or removing a single observation across all the cells in the data. Intuitively, our approach permits the release of statistics in arbitrarily small samples by adding sufficient noise to the estimates to protect privacy. Although our method does not offer a formal privacy guarantee, it generally outperforms widely used methods of disclosure limitation such as count-based cell suppression both in terms of privacy loss and statistical bias. We illustrate how the method can be implemented by discussing how it was used to release estimates of social mobility by Census tract in the Opportunity Atlas. 
We also provide a step-by-step guide and illustrative Stata code to implement our approach.
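The noise-infusion idea described above can be sketched as follows. This is not the authors' Stata code. It assumes the statistic is a cell mean, that sensitivity is measured only by removing one observation at a time (the summary also mentions adding one), that the noise is Laplace-distributed, and that a privacy parameter called epsilon sets the noise scale. All function names and data are hypothetical.

```python
import random

def removal_sensitivity(values):
    """Largest change in a cell's mean from dropping any single observation."""
    n = len(values)  # each cell needs at least two observations
    full = sum(values) / n
    return max(abs(full - (sum(values) - v) / (n - 1)) for v in values)

def max_observed_sensitivity(cells):
    """Maximum of the per-cell sensitivities across all cells."""
    return max(removal_sensitivity(v) for v in cells.values())

def noisy_cell_means(cells, epsilon, rng):
    """Cell means plus noise scaled by the maximum observed sensitivity."""
    scale = max_observed_sensitivity(cells) / epsilon  # assumed scaling rule
    released = {}
    for name, values in cells.items():
        # The difference of two unit exponentials is a standard Laplace draw.
        noise = scale * (rng.expovariate(1.0) - rng.expovariate(1.0))
        released[name] = sum(values) / len(values) + noise
    return released

# Hypothetical cells (e.g. small geographic units) with a few observations each.
cells = {"cell_a": [1.0, 2.0, 3.0], "cell_b": [10.0, 10.0, 40.0]}
released = noisy_cell_means(cells, epsilon=1.0, rng=random.Random(0))
print(max_observed_sensitivity(cells), sorted(released))
```

Because one sensitivity value is shared by all cells, the noise scale does not reveal which cell contains the most influential observation.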
Daily Track the Recovery Economic Data
Geography Level: State
Item Vintage: Not Available
Update Frequency: Daily
Agency: Opportunity Insights
Available File Type: Excel with GitHub link to other orgs CSV and other data files
This dataset contains replication files for "Who Becomes an Inventor in America? The Importance of Exposure to Innovation" by Alex Bell, Raj Chetty, Xavier Jaravel, Neviana Petkova, and John van Reenen. For more information, see https://opportunityinsights.org/paper/losteinsteins/. A summary of the related publication follows. Innovation is widely viewed as the engine of economic growth. As a result, many policies have been proposed to spur innovation, ranging from tax cuts to investments in STEM (science, technology, engineering, and math) education. Unfortunately, the effectiveness of such policies is unclear because we know relatively little about the factors that induce people to become inventors. Who are America’s most successful inventors and what can we learn from their experiences in designing policies to stimulate innovation? We study the lives of more than one million inventors in the United States using a new de-identified database linking patent records to tax and school district records. Tracking these individuals from birth onward, we identify the key factors that determine who becomes an inventor, as measured by filing a patent. Our results shed light on what policies can be most effective in increasing innovation, showing in particular that increasing exposure to innovation among women, minorities, and children from low-income families may have greater potential to spark innovation and growth than traditional approaches such as reducing tax rates. The opinions expressed in this paper are those of the authors alone and do not necessarily reflect the views of the Internal Revenue Service, U.S. Department of the Treasury, or the National Institutes of Health.
https://dataverse.harvard.edu/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.7910/DVN/UM5S3X
This dataset contains replication files for "Childhood Environment and Gender Gaps in Adulthood" by Raj Chetty, Nathaniel Hendren, Frina Lin, Jeremy Majerovitz, and Benjamin Scuderi. For more information, see https://opportunityinsights.org/paper/gendergaps/. A summary of the related publication follows. We show that differences in childhood environments play an important role in shaping gender gaps in adulthood by documenting three facts using population tax records for children born in the 1980s. First, gender gaps in employment rates, earnings, and college attendance vary substantially across the parental income distribution. Notably, the traditional gender gap in employment rates is reversed for children growing up in poor families: boys in families in the bottom quintile of the income distribution are less likely to work than girls. Second, these gender gaps vary substantially across counties and commuting zones in which children grow up. The degree of variation in outcomes across places is largest for boys growing up in poor, single-parent families. Third, the spatial variation in gender gaps is highly correlated with proxies for neighborhood disadvantage. Low-income boys who grow up in high-poverty, high-minority areas work significantly less than girls. These areas also have higher rates of crime, suggesting that boys growing up in concentrated poverty substitute from formal employment to crime. Together, these findings demonstrate that gender gaps in adulthood have roots in childhood, perhaps because childhood disadvantage is especially harmful for boys.
Small business transactions and revenue data aggregated from several credit card processors, collected by Womply and compiled by Opportunity Insights. Transactions and revenue are reported based on the ZIP code where the business is located.
Data provided for CT (FIPS code 9), MA (25), NJ (34), NY (36), and RI (44).
Data notes from Opportunity Insights: Seasonally adjusted change since January 2020. Data in both 2019 and 2020 is indexed as the change relative to the January index period. We then seasonally adjust by dividing year-over-year, which represents the difference between the change since January observed in 2020 and the change since January observed in 2019. We account for differences in the dates of federal holidays between 2019 and 2020 by shifting the 2019 reference data to align the holidays before performing the year-over-year division.
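One way to read the indexing and year-over-year division described above is sketched below. The exact formula is an assumption made for illustration (each year's value divided by its own January average, then the 2020 index divided by the 2019 index); the linked documentation is the authority on the actual procedure. The holiday alignment step is omitted, and the function name and numbers are hypothetical.

```python
def seasonally_adjusted_change(value_2020, january_2020, value_2019, january_2019):
    """Change since January 2020, net of the change seen over the same span in 2019."""
    index_2020 = value_2020 / january_2020  # level relative to January 2020
    index_2019 = value_2019 / january_2019  # level relative to January 2019
    return index_2020 / index_2019 - 1.0    # year-over-year division

# Hypothetical revenue figures for the same calendar day in each year.
change = seasonally_adjusted_change(
    value_2020=80.0, january_2020=100.0,
    value_2019=110.0, january_2019=100.0,
)
print(round(change, 4))
```

In this made-up case revenue is 20% below its January 2020 level, but because the same day in 2019 was 10% above January, the adjusted decline is larger than the raw one.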
Small businesses are defined as those with annual revenue below the Small Business Administration’s thresholds. Thresholds vary by 6-digit NAICS code; depending on the industry, the maximum number of employees for a firm to be considered a small business ranges from 100 to 1,500.
County-level and metro-level data and breakdowns by High/Middle/Low income ZIP codes have been temporarily removed since the August 21st 2020 update due to revisions in the structure of the raw data we receive. We hope to add them back to the OI Economic Tracker soon.
More detailed documentation on Opportunity Insights data can be found here: https://github.com/OpportunityInsights/EconomicTracker/blob/main/docs/oi_tracker_data_documentation.pdf
This dataset is an export from the Opportunity Insights Economic Tracker (https://www.tracktherecovery.org/).
The data in this dataset was last updated September 17, 2020. More current data is available at the project's GitHub repository: https://github.com/OpportunityInsights/EconomicTracker
From the Web site: The Opportunity Insights Economic Tracker (https://tracktherecovery.org) combines anonymized data from leading private companies – from credit card processors to payroll firms – to provide a real-time picture of indicators such as employment rates, consumer spending, and job postings across counties, industries, and income groups.
All of the data displayed on the Economic Tracker can be downloaded here. In collaboration with our data partners, we are making this data freely available in order to assist in efforts to inform the public, policymakers, and researchers about the real-time state of the economy and the effects of COVID-19.
Anyone is welcome to use this data; we simply ask that you attribute our work by citing or linking to the accompanying paper and the Economic Tracker at https://tracktherecovery.org.
URL of original source: https://www.zearn.org/. To access the data dictionary, go to: https://github.com/OpportunityInsights/EconomicTracker/blob/main/docs/oi_tracker_data_dictionary.md
Number of active employees, aggregating information from multiple data providers. This series is based on firm-level payroll data from Paychex and Intuit, worker-level data on employment and earnings from Earnin, and firm-level timesheet data from Kronos. This data is compiled by Opportunity Insights.
Data notes from Opportunity Insights:
Data Source: Paychex, Intuit, Earnin, Kronos
Update Frequency: Weekly
Date Range: January 15th 2020 until the most recent date available. The most recent date available for the full series depends on the combination of Paychex, Intuit and Earnin data. We extend the national trend of aggregate employment and employment by income quartile by using Kronos timecard data and Paychex data for workers paid on a weekly paycycle to forecast beyond the end of the Paychex, Intuit and Earnin data.
Data Frequency: Daily, presented as a 7-day moving average
Indexing Period: January 4th - January 31st
Indexing Type: Change relative to the January 2020 index period, not seasonally adjusted.
More detailed documentation on Opportunity Insights data can be found here: https://github.com/OpportunityInsights/EconomicTracker/blob/main/docs/oi_tracker_data_documentation.pdf
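The indexing and smoothing described in the data notes above can be sketched as follows. Several details are assumptions made for illustration: the moving average is taken as a trailing window, the series is indexed before it is smoothed, and the input is a made-up daily series. The function names are hypothetical; the linked documentation describes the actual processing.

```python
from datetime import date, timedelta

def change_vs_january(series, start=date(2020, 1, 4), end=date(2020, 1, 31)):
    """Express each value as a change relative to the mean over the index period."""
    base_values = [v for d, v in series.items() if start <= d <= end]
    base = sum(base_values) / len(base_values)
    return {d: v / base - 1.0 for d, v in series.items()}

def seven_day_average(series):
    """Trailing 7-day mean for each date that has a full week of history."""
    smoothed = {}
    for d in series:
        window = [d - timedelta(days=k) for k in range(7)]
        if all(w in series for w in window):
            smoothed[d] = sum(series[w] for w in window) / 7
    return smoothed

# Hypothetical daily employment counts: flat through January, 10% lower in February.
first_day = date(2020, 1, 1)
raw = {first_day + timedelta(days=i): (100.0 if i < 31 else 90.0) for i in range(45)}

indexed = change_vs_january(raw)
smoothed = seven_day_average(indexed)
print(round(smoothed[date(2020, 2, 7)], 4))
```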
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Unemployment Claims by Type’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://catalog.data.gov/dataset/d8b9fa53-e257-4df9-8971-33f2d732d522 on 27 January 2022.
--- Dataset description provided by original source is as follows ---
Weekly unemployment insurance claims counts and rates (as a share of the 2019 labor force) for Connecticut from the U.S. Department of Labor, compiled by Opportunity Insights.
Breakdowns by claim type:
Initial Claims: Regular Claims, PUA Claims, Combined Claims
Continued Claims: Regular Claims, PUA Claims, PEUC Claims, Combined Claims
More detailed documentation on Opportunity Insights data can be found here: https://github.com/OpportunityInsights/EconomicTracker/blob/main/docs/oi_tracker_data_documentation.pdf
--- Original source retains full ownership of the source dataset ---
U.S. Government Works https://www.usa.gov/government-works