analyze the current population survey (cps) annual social and economic supplement (asec) with r

the annual march cps-asec has been supplying the statistics for the census bureau's report on income, poverty, and health insurance coverage since 1948. wow. the us census bureau and the bureau of labor statistics (bls) tag-team on this one. until the american community survey (acs) hit the scene in the early aughts (2000s), the current population survey had the largest sample size of all the annual general demographic data sets outside of the decennial census - about two hundred thousand respondents. this provides enough sample to conduct state- and a few large metro area-level analyses. your sample size will vanish if you start investigating subgroups by state - consider pooling multiple years. county-level is a no-no. despite the american community survey's larger size, the cps-asec contains many more variables related to employment, sources of income, and insurance - and can be trended back to harry truman's presidency. aside from questions specifically asked about an annual experience (like income), many of the questions in this march data set should be treated as point-in-time statistics. cps-asec generalizes to the united states non-institutional, non-active duty military population.

the national bureau of economic research (nber) provides sas, spss, and stata importation scripts to create a rectangular file (rectangular data means only person-level records; household- and family-level information gets attached to each person). to import these files into r, the parse.SAScii function uses nber's sas code to determine how to import the fixed-width file, then RSQLite puts everything into a schnazzy database. you can try reading through the nber march 2012 sas importation code yourself, but it's a bit of a proc freak show.

this new github repository contains three scripts:

2005-2012 asec - download all microdata.R
- download the fixed-width file containing household, family, and person records
- import by separating this file into three tables, then merge 'em together at the person level
- download the fixed-width file containing the person-level replicate weights
- merge the rectangular person-level file with the replicate weights, then store it in a sql database
- create a new variable - one - in the data table

2012 asec - analysis examples.R
- connect to the sql database created by the 'download all microdata' program
- create the complex sample survey object, using the replicate weights
- perform a boatload of analysis examples

replicate census estimates - 2011.R
- connect to the sql database created by the 'download all microdata' program
- create the complex sample survey object, using the replicate weights
- match the sas output shown in the png file below

2011 asec replicate weight sas output.png
- statistic and standard error generated from the replicate-weighted example sas script contained in this census-provided person replicate weights usage instructions document.

click here to view these three scripts

for more detail about the current population survey - annual social and economic supplement (cps-asec), visit:
- the census bureau's current population survey page
- the bureau of labor statistics' current population survey page
- the current population survey's wikipedia article

notes: interviews are conducted in march about experiences during the previous year. the file labeled 2012 includes information (income, work experience, health insurance) pertaining to 2011. when you use the current population survey to talk about america, subtract a year from the data file name. as of the 2010 file (the interview focusing on america during 2009), the cps-asec contains exciting new medical out-of-pocket spending variables most useful for supplemental (medical spending-adjusted) poverty research.

confidential to sas, spss, stata, sudaan users: why are you still rubbing two sticks together after we've invented the butane lighter? time to transition to r. :D
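a minimal sketch of the import approach described above, not the repository's actual scripts: the file names are placeholders, and the real code also splits the household, family, and person records before loading anything into the database.

library(SAScii)    # read.SAScii() follows the nber sas input statement to parse the fixed-width file
library(DBI)
library(RSQLite)

# placeholder file names - the repository scripts download the real files
sas_script <- "cpsmar2012.sas"
fwf_file   <- "asec2012_pubuse.dat"

asec <- read.SAScii(fwf_file, sas_script)          # import using the sas dictionary

db <- dbConnect(SQLite(), "cps_asec.sqlite")       # stash the table in a sqlite database
dbWriteTable(db, "asec12", asec, overwrite = TRUE)
dbDisconnect(db)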
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
analyze the survey of consumer finances (scf) with r the survey of consumer finances (scf) tracks the wealth of american families. every three years, more than five thousand households answer a battery of questions about income, net worth, credit card debt, pensions, mortgages, even the lease on their cars. plenty of surveys collect annual income, only the survey of consumer finances captures such detailed asset data. responses are at the primary economic unit-level (peu) - the economically dominant, financially interdependent family members within a sampled household. norc at the university of chicago administers the data collection, but the board of governors of the federal reserve pay the bills and therefore call the shots. if you were so brazen as to open up the microdata and run a simple weighted median, you'd get the wrong answer. the five to six thousand respondents actually gobble up twenty-five to thirty thousand records in the final pub lic use files. why oh why? well, those tables contain not one, not two, but five records for each peu. wherever missing, these data are multiply-imputed, meaning answers to the same question for the same household might vary across implicates. each analysis must account for all that, lest your confidence intervals be too tight. to calculate the correct statistics, you'll need to break the single file into five, necessarily complicating your life. this can be accomplished with the meanit sas macro buried in the 2004 scf codebook (search for meanit - you'll need the sas iml add-on). or you might blow the dust off this website referred to in the 2010 codebook as the home of an alternative multiple imputation technique, but all i found were broken links. perhaps it's time for plan c, and by c, i mean free. read the imputation section of the latest codebook (search for imputation), then give these scripts a whirl. they've got that new r smell. the lion's share of the respondents in the survey of consumer finances get drawn from a pretty standard sample of american dwellings - no nursing homes, no active-duty military. then there's this secondary sample of richer households to even out the statistical noise at the higher end of the i ncome and assets spectrum. you can read more if you like, but at the end of the day the weights just generalize to civilian, non-institutional american households. one last thing before you start your engine: read everything you always wanted to know about the scf. my favorite part of that title is the word always. 
this new github repository contains t hree scripts: 1989-2010 download all microdata.R initiate a function to download and import any survey of consumer finances zipped stata file (.dta) loop through each year specified by the user (starting at the 1989 re-vamp) to download the main, extract, and replicate weight files, then import each into r break the main file into five implicates (each containing one record per peu) and merge the appropriate extract data onto each implicate save the five implicates and replicate weights to an r data file (.rda) for rapid future loading 2010 analysis examples.R prepare two survey of consumer finances-flavored multiply-imputed survey analysis functions load the r data files (.rda) necessary to create a multiply-imputed, replicate-weighted survey design demonstrate how to access the properties of a multiply-imput ed survey design object cook up some descriptive statistics and export examples, calculated with scf-centric variance quirks run a quick t-test and regression, but only because you asked nicely replicate FRB SAS output.R reproduce each and every statistic pr ovided by the friendly folks at the federal reserve create a multiply-imputed, replicate-weighted survey design object re-reproduce (and yes, i said/meant what i meant/said) each of those statistics, now using the multiply-imputed survey design object to highlight the statistically-theoretically-irrelevant differences click here to view these three scripts for more detail about the survey of consumer finances (scf), visit: the federal reserve board of governors' survey of consumer finances homepage the latest scf chartbook, to browse what's possible. (spoiler alert: everything.) the survey of consumer finances wikipedia entry the official frequently asked questions notes: nationally-representative statistics on the financial health, wealth, and assets of american hous eholds might not be monopolized by the survey of consumer finances, but there isn't much competition aside from the assets topical module of the survey of income and program participation (sipp). on one hand, the scf interview questions contain more detail than sipp. on the other hand, scf's smaller sample precludes analyses of acute subpopulations. and for any three-handed martians in the audience, ther e's also a few biases between these two data sources that you ought to consider. the survey methodologists at the federal reserve take their job...
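a bare-bones illustration of the multiply-imputed design logic described above, using toy data with made-up values; the real scripts also attach the replicate weights and apply the scf-specific variance adjustment, so treat this only as a sketch of the mechanics.

library(survey)
library(mitools)

# five toy implicates standing in for the scf implicate files created by the download script;
# `wgt` and `networth` follow scf naming conventions but the values here are invented
set.seed(1)
implicates <- lapply(1:5, function(i)
  data.frame(wgt = runif(100, 500, 5000),
             networth = rlnorm(100, 11, 1.5)))

scf_design <- svydesign(ids = ~1, weights = ~wgt,
                        data = imputationList(implicates))

# estimate within each implicate, then combine across implicates with rubin's rules
MIcombine(with(scf_design, svymean(~networth)))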
Contains the following files:
1) EconLetters.txt
2) EconLetters.dta
3) DoFileEconLetters.do
The first two contain the data used in: Escobari, Diego. "Systematic Peak-load Pricing, Congestion Premia and Demand Diverting: Empirical Evidence." Economics Letters, 103 (1), April 2009, 59-61. (1) is a text file, while (2) is in Stata format. The third file is the Stata do-file to replicate all the tables in the paper. Please cite the paper if you use these data in your own research. Stephanie C. Reynolds helped in the collection of the data. This research received financial support from the Private Enterprise Research Center at Texas A&M and the Bradley Foundation. Feel free to e-mail me if you have any questions: escobaridiego@gmail.com
Diego Escobari, Ph.D.
Professor of Economics
Department of Economics
Robert C. Vackar College of Business and Entrepreneurship
The University of Texas Rio Grande Valley
1201 West University Drive
Edinburg, Texas 78541
Phone: (956) 665 3366
https://faculty.utrgv.edu/diego.escobari/
Replication Package – Decompressing to Prevent Unrest
David Altman, Pontificia Universidad Católica de Chile

1. Description
This dataset accompanies the article: Altman, David. “Decompressing to prevent unrest: political participation through citizen-initiated mechanisms of direct democracy” (2025), Social Movement Studies. It contains the data and code necessary to replicate all statistical analyses and tables presented in the article.

2. Coverage
Time frame: 1970–2019
Countries: 116 democracies worldwide (electoral and liberal, according to V-Dem v14)
Unit of analysis: Country-year

3. Data Sources
V-Dem v14 (Coppedge et al., 2024): direct democracy indices (CIC-DPVI, TOC-DPVI), civil society participation index.
NAVCO 1.3 (Chenoweth & Shay, 2020): violent and nonviolent resistance campaigns (dependent variable).
World Bank, World Development Indicators: GDP per capita (constant 2015 US$), inflation.
Author’s coding: harmonization and cleaning of datasets, construction of dependent variable (excluding self-determination/secession cases).

4. Variables
accepted: dichotomous dependent variable (1 if violent or nonviolent regime-change/“other” campaign occurred in a given year; 0 otherwise).
CIC_DPVI: citizen-initiated component of V-Dem’s Direct Popular Vote Index.
TOC_DPVI: top-down component of direct democracy (plebiscites, obligatory referenda).
pc_GDP: GDP per capita (constant 2015 US$).
Inflation: annual inflation (%).
v2x_cspart: Civil Society Participation Index (V-Dem).
country, year: identifiers.

5. Files Included
data.dta / data.csv – panel dataset used in the article.
master.do – Stata do-file to reproduce all analyses.
tables.do – generates Tables 1–2.
figures.do – generates Figure 1 (coefficient plot).
ReadMe.txt – this document.

6. Instructions
Open master.do in Stata (v17 or later). Set the working directory to the folder containing the replication package. Run the file. This will:
Load data.dta
Estimate the models (fixed-effects and random-effects logit with lagged IVs)
Produce Tables 1–2 in /results/
Produce Figure 1 in /figures/

7. Citation
If you use this dataset, please cite: Altman, David (2025). Replication data for: Decompressing to Prevent Unrest: Political Participation through Citizen-Initiated Mechanisms of Direct Democracy. Harvard Dataverse. DOI: [to be added]
Attribution 3.0 (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
The evolution of a software system can be studied in terms of how various properties as reflected by software metrics change over time. Current models of software evolution have allowed for inferences to be drawn about certain attributes of the software system, for instance, regarding the architecture, complexity and its impact on the development effort. However, an inherent limitation of these models is that they do not provide any direct insight into where growth takes place. In particular, we cannot assess the impact of evolution on the underlying distribution of size and complexity among the various classes. Such an analysis is needed in order to answer questions such as 'do developers tend to evenly distribute complexity as systems get bigger?', and 'do large and complex classes get bigger over time?'. These are questions of more than passing interest since by understanding what typical and successful software evolution looks like, we can identify anomalous situations and take action earlier than might otherwise be possible. Information gained from an analysis of the distribution of growth will also show if there are consistent boundaries within which a software design structure exists. The specific research questions that we address in Chapter 5 (Growth Dynamics) of the thesis this data accompanies are: What is the nature of distribution of software size and complexity measures? How does the profile and shape of this distribution change as software systems evolve? Is the rate and nature of change erratic? Do large and complex classes become bigger and more complex as software systems evolve? In our study of metric distributions, we focused on 10 different measures that span a range of size and complexity measures. In order to assess assigned responsibilities we use the two metrics Load Instruction Count and Store Instruction Count. Both metrics provide a measure for the frequency of state changes in data containers within a system. Number of Branches, on the other hand, records all branch instructions and is used to measure the structural complexity at class level. This measure is equivalent to Weighted Method Count (WMC) as proposed by Chidamber and Kemerer (1994) if a weight of 1 is applied for all methods and the complexity measure used is cyclomatic complexity. We use the measures of Fan-Out Count and Type Construction Count to obtain insight into the dynamics of the software systems. The former offers a means to document the degree of delegation, whereas the latter can be used to count the frequency of object instantiations. The remaining metrics provide structural size and complexity measures. In-Degree Count and Out-Degree Count reveal the coupling of classes within a system. These measures are extracted from the type dependency graph that we construct for each analyzed system. The vertices in this graph are classes, whereas the edges are directed links between classes. We associate popularity (i.e., the number of incoming links) with In-Degree Count and usage or delegation (i.e., the number of outgoing links) with Out-Degree Count. Number of Methods, Public Method Count, and Number of Attributes define typical object-oriented size measures and provide insights into the extent of data and functionality encapsulation. The raw metric data (4 .txt files and 1 .log file in a .zip file measuring ~0.5MB in total) is provided as a comma separated values (CSV) file, and the first line of the CSV file contains the header. 
A detailed output of the statistical analysis undertaken is provided as log files generated directly from Stata (statistical analysis software).
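As a small illustration of the coupling measures described above, in-degree (popularity) and out-degree (usage/delegation) can be read directly off a type dependency graph's edge list; this is only an R sketch with invented class names, not part of the original analysis.

# toy type-dependency edge list: each row is a directed link "from uses to"
edges <- data.frame(from = c("Parser", "Parser", "Lexer", "Compiler"),
                    to   = c("Lexer", "Token", "Token", "Parser"))

out_degree <- table(edges$from)   # Out-Degree Count: outgoing links (usage / delegation)
in_degree  <- table(edges$to)     # In-Degree Count: incoming links (popularity)

out_degree
in_degree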
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains data from an online experiment designed to test whether economically equivalent penalties—fees (paid before taking) and fines (paid after taking)—influence prosocial behaviour differently. Participants played a modified dictator game in which they could take points from another participant.
The dataset is provided in Excel format (Full-data.xlsx), along with a Stata do-file (submit.do) that reshapes, cleans, and analyses the data.
Platform: oTree
Recruitment: Prolific
Sample size: 201 participants
Design: Each participant played 20 rounds: 10 in the control condition and 10 in one treatment condition (fee or fine). Order of blocks was randomised.
Payment: 200 points = £1. One round was randomly selected for payment.
session – Session number
id – Participant ID
treatment – Assigned treatment (1 = Fee, 2 = Fine)
order – Order of blocks (0 = Control first, 1 = Treatment first)
For each round, participants made decisions in both control (c) and treatment (t) conditions.
c1, t1, c2, t2, … – Tokens available and/or allocated across control and treatment rounds.
takeX – Amount taken from the other participant in case X.
Social norms were elicited after the taking task. Variables include empirical, normative, and responsibility measures at both extensive and intensive margins:
eyX, etX – Empirical expectations (beliefs about what others do)
nyX, ntX – Normative expectations (beliefs about what others think is appropriate)
ryX, rtX – Responsibility measures
casenormX – Case identifier for norm elicitation
From survey responses:
Sex – Gender
Ethnicitysimplified – Simplified ethnicity category
Countryofresidence – Participant’s country of residence
order, session – Experimental setup metadata
Do-file (analysis.do): The .do file performs the following steps (an illustrative R sketch of the equivalent data preparation follows this list):
Data Preparation
Import raw Excel file
Reshape from wide to long format (cases per participant)
Declare panel data (xtset id)
Variable Generation
Rename variables for clarity (e.g., take for amount taken)
Generate treatment dummies (treat)
Construct demographic dummies (gender, race, nationality)
Analysis Preparation
Create extensive and intensive margin variables
Generate expectation and norm measures
Output
Ready-to-analyse panel dataset for regression and statistical analysis
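The do-file itself is the authoritative pipeline; as rough orientation only, below is a hedged R sketch of the wide-to-long reshaping step. File and variable names follow the codebook above, but the exact column layout is an assumption.

library(readxl)   # read the Excel export
library(dplyr)
library(tidyr)

raw <- read_excel("Full-data.xlsx")

# reshape the per-case take columns (take1, take2, ...) from wide to long,
# mirroring the do-file's reshape step; the other per-case columns (c/t, ey/et, ...)
# would be reshaped the same way in a full version
long <- raw %>%
  pivot_longer(cols = starts_with("take"),
               names_to = "case", names_prefix = "take",
               values_to = "take") %>%          # amount taken in case X
  mutate(case = as.integer(case)) %>%
  arrange(id, case)   # id + case give the panel structure that xtset id declares in Stata

head(long)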
This README.txt file was generated on 2025-10-04 by Erik Nelson.
GENERAL INFORMATION
The Stata .do files in this depository generate the results that are plotted or presented in table format in the paper "The impact of light-rail stations on income sorting in US urban areas." All .do files load the needed datasets. All datasets are in .xlsx format. Each Excel file contains data for the urban area that is part of the file's name. The data in each Excel file is in panel form. Each observation in a dataset represents a treated or control area i in urban area u in year t. We observe each area i's average nominal per capita income and median household income in year t = 1990, 2000, 2010, 2017, 2019, 2021, and 2022 (thes...
This file describes the replication material for: Trajectories of mental health problems in childhood and adult voting behaviour: Evidence from the 1970s British Cohort Study. Authors: Lisa-Christine Girard & Martin Okolikj. Accepted in Political Behavior.
This dataverse holds the following 4 replication files:
1. data_cleaning_traj.R - This file is designed to load, merge and clean the datasets for the estimation of trajectories along with the rescaling of the age 10 Rutter scale. This file was prepared using R-4.1.1 version.
2. traj_estimation.do - With the dataset merged from data_cleaning_traj.R, we run this file in STATA to create and estimate trajectories, to be included in the full dataset. This file was prepared using STATA 17.0 version.
3. data_cleaning.R - This is the file designed to load, merge and clean all datasets in one for preparation of the main analysis following the trajectory estimation. This file was prepared using R-4.1.1 version.
4. POBE Analysis.do - The analysis file is designed to generate the results from the tables in the published paper along with all supplementary materials. This file was prepared using STATA 17.0 version.
The data can be accessed at the following address. It requires user registration under special licence conditions: http://discover.ukdataservice.ac.uk/series/?sn=200001. If you have any questions or spot any errors please contact g.lisachristine@gmail.com or martin.okolic@gmail.com.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
analyze the area resource file (arf) with r the arf is fun to say out loud. it's also a single county-level data table with about 6,000 variables, produced by the united states health services and resources administration (hrsa). the file contains health information and statistics for over 3,000 us counties. like many government agencies, hrsa provides only a sas importation script and an as cii file. this new github repository contains two scripts: 2011-2012 arf - download.R download the zipped area resource file directly onto your local computer load the entire table into a temporary sql database save the condensed file as an R data file (.rda), comma-separated value file (.csv), and/or stata-readable file (.dta). 2011-2012 arf - analysis examples.R limit the arf to the variables necessary for your analysis sum up a few county-level statistics merge the arf onto other data sets, using both fips and ssa county codes create a sweet county-level map click here to view these two scripts for mo re detail about the area resource file (arf), visit: the arf home page the hrsa data warehouse notes: the arf may not be a survey data set itself, but it's particularly useful to merge onto other survey data. confidential to sas, spss, stata, and sudaan users: time to put down the abacus. time to transition to r. :D
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
We provide the stata files that allow to reproduce the results presented in the paper
"Crop Prices and Deforestation in the Tropics" by Nicolas Berman, Mathieu Couttenier, Antoine Leblois and Raphaël Soubeyran.
The replication folder contains different files:
1- *.dta file: database
2- *.do file: do-file containing the codes to replicate the results (figures and tables)
3- * Ancillary data:
.csv file: data needed to produce a map of the initial forest cover (in 2000).
.dta additional files to run sensitivity analysis
Simply change the path to files (on line 25 of the replication_code.do file) to re-run the analysis:
** Change pathway to load and save the data
global dir ".../Replication_files_BCLS_2022"
Stata 17 was used for this work.
The files provided within this .zip file are meant to reproduce the tables and figures included in the article "Tabloid Media Campaigns and Public Opinion: Quasi-Experimental Evidence on Euroscepticism in England" by Florian Foos and Daniel Bischof in the APSR.
Notice:
- This is a fully reproducible archive written in Stata's project environment: https://www.statalist.org/forums/forum/general-stata-discussion/general/1302147-how-project-from-ssc-is-different-from-stata-built-in-project.
- As the code is written in a project environment, we advise all users to carefully read the README.TXT in order to understand how reproduction in Stata's project environment works.
- The largest part of our analyses is based on yearly attitudinal data from the British Social Attitudes Survey (BSA): https://www.bsa.natcen.ac.uk. The BSA does not allow researchers to upload these data as part of their replication files; we are also not allowed to upload a recoded version of the data file. However, all yearly BSA surveys are available via the UK Data Service. In order to reproduce the results reported in this paper, you will need to a) register with the UK Data Service (https://beta.ukdataservice.ac.uk/myaccount/login) and b) access and download the relevant .dta files and place them into the replication archive (data_original/BSA/*YEAR*).
analyze the national survey on drug use and health (nsduh) with r

the national survey on drug use and health (nsduh) monitors illicit drug, alcohol, and tobacco use with more detail than any other survey out there. if you wanna know the average age at first chewing tobacco dip, the prevalence of needle-sharing, the family structure of households with someone abusing pain relievers, even the health insurance coverage of peyote users, you are in the right place. the substance abuse and mental health services administration (samhsa) contracts with the north carolinians over at research triangle institute to run the survey, but the university of michigan's substance abuse and mental health data archive (samhda) holds the keys to this data castle. nsduh in its current form only goes back about a decade, when samhsa re-designed the methodology and started paying respondents thirty bucks a pop. before that, look for its predecessor - the national household survey on drug abuse (nhsda) - with public use files available back to 1979 (included in these scripts). be sure to read those changes in methodology carefully before you start trying to trend smokers' virginia slims brand loyalty back to 1999.

although (to my knowledge) only the national health interview survey contains r syntax examples in its documentation, the friendly folks at samhsa have shown promise. since their published data tables were run on a restricted-access data set, i requested that they run the same sudaan analysis code on the public use files to confirm that this new r syntax does what it should. they delivered, i matched, pats on the back all around. if you need a one-off data point, samhda is overflowing with options to analyze the data online. you even might find some restricted statistics that won't appear in the public use files. still, that's no substitute for getting your hands dirty. when you tire of menu-driven online query tools and you're ready to bark with the big data dogs, give these puppies a whirl. the national survey on drug use and health targets the civilian, noninstitutionalized population of the united states aged twelve and older.

this new github repository contains three scripts:

1979-2011 - download all microdata.R
- authenticate the university of michigan's "i agree with these terms" page
- download, import, save each available year of data (with documentation) back to 1979
- convert each pre-packaged stata do-file (.do) into r, run the damn thing, get NAs where they belong

2010 single-year - analysis examples.R
- load a single year of data
- limit the table to the variables needed for an example analysis
- construct the complex sample survey object
- run enough example analyses to make a kitchen sink jealous

replicate samhsa puf.R
- load a single year of data
- limit the table to the variables needed for an example analysis
- construct the complex sample survey object
- print statistics and standard errors matching the target replication table

click here to view these three scripts

for more detail about the national survey on drug use and health, visit:
- the substance abuse and mental health services administration's nsduh homepage
- research triangle institute's nsduh homepage
- the university of michigan's nsduh homepage

notes: the 'download all microdata' program intentionally breaks unless you complete the clearly-defined, one-step instruction to authenticate that you have read and agree with the download terms. the script will download the entire public use file archive, but only after this step has been completed. if you contact me for help without reading those instructions, i reserve the right to tease you mercilessly. also: thanks to the great hadley wickham for figuring out how to authenticate in the first place.

confidential to sas, spss, stata, and sudaan users: did you know that you don't have to stop reading just because you've run out of candlewax? maybe it's time to switch to r. :D
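a bare-bones sketch of the 'construct the complex sample survey object' step above, using a handful of toy rows; the design variable names (verep, vestr, analwt_c) and the cigmon example follow the public use file codebooks, but double-check them for your survey year.

library(survey)

# toy stand-in; real files come from the samhda download scripts described above
nsduh <- data.frame(
  verep = c(1, 2, 1, 2), vestr = c(10, 10, 11, 11),
  analwt_c = c(5000, 4800, 5200, 5100), cigmon = c(0, 1, 0, 0)
)

nsduh_design <- svydesign(id = ~verep, strata = ~vestr,
                          weights = ~analwt_c, data = nsduh, nest = TRUE)

# weighted share of past-month cigarette users (illustrative variable)
svymean(~cigmon, nsduh_design)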
Dataset Description:
Purpose: Enables replication of empirical findings presented in the manuscript "EFFECTS OF TRADE RESISTANCES ON THE CAPITAL GOODS SECTOR: EVIDENCE FROM A STRUCTURAL GRAVITY MODEL FOR BRAZIL (2008–2016)".
Nature and Scope: Quantitative panel dataset covering 2008-2016. Includes 144 countries with bilateral (exporter-importer) observations focused on the capital goods sector. Contains variables related to trade flows, trade policy, gravity determinants, macroeconomic indicators, and estimated model parameters.
Content: The Stata dataset (20251019_submitted manuscript_data.dta) includes: bilateral capital goods import flows (imp); applied tariffs (tau, t_imp_Bra, t_exp_Bra); gravity variables (ln_DIST, contig, comlang, colony, rta); country-level macro data (ll, lk, lrgdpna); and estimated Multilateral Resistance terms (OMR/IMR).
Origin: Data compiled from public sources: UN Comtrade, WITS, WTO (TRAINS, IDB, CTS), CEPII, PWT 9.1, Mario Larch's RTA Database. OMR/IMR terms were generated via the estimation procedure detailed in the accompanying paper.

Code Description:
Purpose: This Stata do-file (20251019_submitted manuscript_dofile.do) contains the complete code necessary to replicate the empirical results presented in the manuscript.
Nature and Scope: The file is a script written in Stata command language. Its scope covers the entire empirical workflow.
Content: The do-file executes the following main procedures:
- Loads the accompanying dataset (20251019_submitted manuscript_data.dta).
- Defines global macros and sets up the estimation environment.
- Runs the first-stage Poisson Pseudo-Maximum Likelihood (PPML) gravity model estimations with high-dimensional fixed effects (exporter-product-year, importer-product-year, bilateral pairs) using the ppmlhdfe command.
- Recovers the estimated fixed effects and constructs the Multilateral Resistance terms (OMR and IMR) based on the gravity model results.
- Merges the MR terms back into the main dataset.
- Runs the second-stage OLS regressions for the production function using the recovered OMR term.
- Runs the second-stage OLS regressions for the capital accumulation function using the recovered IMR term.
- Includes commands for diagnostic tests (e.g., RESET, MaMu variance tests, if applicable within the code).
Dependencies: Requires Stata statistical software (version specified in the do-file or compatible) and likely requires user-written packages such as ppmlhdfe. Requires the accompanying dataset (20251019_submitted manuscript_data.dta) to be in the Stata working directory.
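The estimation itself is done in Stata with ppmlhdfe, as described above. For readers working in R, the snippet below is only an illustrative analogue of a PPML gravity estimation with high-dimensional fixed effects, run on the toy trade data bundled with the fixest package rather than on the paper's dataset or specification.

library(fixest)

# bundled example data: bilateral trade flows with origin, destination, product, and year identifiers
data(trade, package = "fixest")

# poisson pseudo-maximum likelihood with high-dimensional fixed effects
gravity <- fepois(Euros ~ log(dist_km) | Origin + Destination + Product + Year,
                  data = trade)
summary(gravity)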
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This file contains Stata data used for these analyses. (DTA)
Code and data to produce final results and summary statistics using Stata 15.1. There are four code files (and a master do file to call them in sequence) and three data sets to produce the results in the paper. All data are included in a single .ZIP file (uncompressed files exceeded upload limits).
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
AbstractWe study the effect of inconsistent time preferences on actual and planned retirement timing decisions in two independent datasets. Theory predicts that hyperbolic time preferences can lead to dynamically inconsistent retirement timing. In an online experiment with more than 2,000 participants, we find that time-inconsistent participants retire on average 1.75 years earlier than time-consistent participants do. The planned retirement age of non-retired participants decreases with age. This negative age effect is about twice as strong among time-inconsistent participants. The temptation of early retirement seems to rise in the final years of approaching retirement. Consequently, time-inconsistent participants have a higher probability of regretting their retirement decision. We find similar results for a representative household survey (German SAVE panel). Using smoking behavior and overdraft usage as time preference proxies, we confirm that time-inconsistent participants retire earlier and that non-retirees reduce their planned retirement age within the panel.MethodsWe conduct an online experiment in cooperation with a large and well-circulated German newspaper, the Frankfurter Allgemeine Zeitung (FAZ). Participants are recruited via a link on the newspaper's website and two announcements in the print edition. In total, 3,077 participants complete the experiment, which takes them on average 11 minutes. Participants answer questions about retirement planning, time preferences, risk preferences, financial literacy, and demographics. The initial sample for this study consists of 256 retired participants and 2,173 non-retired participants.Usage NotesOur dataset: STATA Do File is attached Additional Datasets: In addition, a German Household Panle is used in this paper. The data cannot be uploaded by us but is available via the Max Planck Institute (https://www.mpisoc.mpg.de/en/social-policy-mea/research/save-2001-2013/). We upload the Do-Files used in the analysis and the results in an excel format (xlsx).
Two STATA files with code to replicate the duration analysis:
1. duration_estimation.do: This file estimates the first-stage OLS and second-stage duration model with dual IV. A log file for the estimation of the main model is included.
2. duration_simulation.do: The file bootstraps parameters for the main model, and performs the accept/reject for a baseline counterfactual with no changes. A log file is included.
Because the data are proprietary, we cannot upload the complete dataset. A 10% random sample can be provided upon request.
This study delves into the intricate political dynamics that influence legislators’ policy stances concerning the import of US meat into Taiwan over the last decade. It specifically centers on the instances of US beef importation in 2012 and US pork importation in 2021. Within this folder, you will find two datasets along with a Stata .do file, all of which are instrumental for the analysis of quantitative data as presented in the paper. Additionally, the folder encompasses a spreadsheet that facilitates the creation of Figure 5.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
For a comprehensive guide to this data and other UCR data, please see my book at ucrbook.comVersion 14 release notes:Adds .parquet file formatVersion 13 release notes:Adds 2023-2024 dataVersion 12 release notes:Adds 2022 dataVersion 11 release notes:Adds 2021 data.Version 10 release notes:Adds 2020 data. Please note that the FBI has retired UCR data ending in 2020 data so this will be the last arson data they release. Changes .rda file to .rds.Version 9 release notes:Changes release notes description, does not change data.Version 8 release notes:Adds 2019 data.Note that the number of months missing variable sharply changes starting in 2018. This is probably due to changes in UCR reporting of the column_2_type variable which is used to generate the months missing county (the code I used does not change). So pre-2018 and 2018+ years may not be comparable for this variable. Version 7 release notes:Adds a last_month_reported column which says which month was reported last. This is actually how the FBI defines number_of_months_reported so is a more accurate representation of that. Removes the number_of_months_reported variable as the name is misleading. You should use the last_month_reported or the number_of_months_missing (see below) variable instead.Adds a number_of_months_missing in the annual data which is the sum of the number of times that the agency reports "missing" data (i.e. did not report that month) that month in the card_2_type variable or reports NA in that variable. Please note that this variable is not perfect and sometimes an agency does not report data but this variable does not say it is missing. Therefore, this variable will not be perfectly accurate.Version 6 release notes:Adds 2018 dataVersion 5 release notes:Adds data in the following formats: SPSS and Excel.Changes project name to avoid confusing this data for the ones done by NACJD.Version 4 release notes: Adds 1979-2000, 2006, and 2017 dataAdds agencies that reported 0 months.Adds monthly data.All data now from FBI, not NACJD. Changes some column names so all columns are <=32 characters to be usable in Stata.Version 3 release notes: Add data for 2016.Order rows by year (descending) and ORI.Removed data from Chattahoochee Hills (ORI = "GA06059") from 2016 data. In 2016, that agency reported about 28 times as many vehicle arsons as their population (Total mobile arsons = 77762, population = 2754.Version 2 release notes: Fix bug where Philadelphia Police Department had incorrect FIPS county code. This Arson data set is an FBI data set that is part of the annual Uniform Crime Reporting (UCR) Program data. This data contains information about arsons reported in the United States. The information is the number of arsons reported, to have actually occurred, to not have occurred ("unfounded"), cleared by arrest of at least one arsoning, cleared by arrest where all offenders are under the age of 18, and the cost of the arson. This is done for a number of different arson location categories such as community building, residence, vehicle, and industrial/manufacturing structure. The yearly data sets here combine data from the years 1979-2018 into a single file for each group of crimes. Each monthly file is only a single year as my laptop can't handle combining all the years together. These files are quite large and may take some time to load. I also added state, county, and place FIPS code from the LEAIC (crosswalk).A small number of agencies had some months with clearly incorrect data. 
I changed the incorrect columns to NA and left the other columns unchanged for that agency. The following are data problems that I fixed - there are still likely issues remaining in the data so make sure to check yourself before running analyses. Oneida, New York (ORI = NY03200) had multiple years that reported single arsons costing over $700 million. I deleted this agency from all years of data.In January 1989 Union, North Carolina (ORI = NC09000) reported 30,000 arsons in uninhabited single occupancy buildings and none any other months. In December 1991 Gadsden, Florida (ORI = FL02000) reported that a single arson at a community/public building caused $99,999,999 in damages (the maximum possible).In April 2017 St. Paul, Minnesota (ORI = MN06209) reported 73,400 arsons in uninhabited storage buildings and 10,000 arsons in uninhabited community
This dataset includes a complete record of the 36,066 public comments submitted to the Commodity Futures Trading Commission (CFTC) in response to notices of proposed rule-making (NPRMs) implementing the Dodd-Frank Act over a 42-month period (January 14, 2010 to July 16, 2014). The data was exported from the agency’s internal database by the CFTC and provided to the authors by email correspondence following a cold call to the CFTC public relations department. The source internal database is maintained by the CFTC as part of its internal compliance with the Administrative Procedure Act (APA) and includes all rule-making notices that appear in the Federal Register. Owing to the salience and publicity of the Dodd-Frank Act, the CFTC made a special tag in its database for all comments submitted in response to rules proposed under the authority of the Dodd-Frank Act. This database thus includes all comments which the CFTC considers relevant to the Dodd-Frank reform. In short, the CFTC gave t...

This dataset was exported by the CFTC from their internal database of public comments in response to NPRMs. The uploaded file is the exact raw data generated by the CFTC and provided to the authors. An updated version of the data file including the author's classifications based on the organization value will be uploaded when the related work is accepted for publication.

Dodd Frank Financial Reform at the CFTC - Public Comments, January 14th, 2010 to July 16th, 2014
NOTE: The Comment Text (and related) variables are longer than the maximum character count of Microsoft Excel cells (32,767 characters). All analysis should take this into account and import the .txt file directly into your analysis program (R, Stata, etc.) rather than attempt to edit or modify the data in Excel before using computational analysis.
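Since the comment text exceeds Excel's cell limit, here is a minimal sketch of importing the raw .txt export directly into R; the file name, tab delimiter, and column name below are assumptions, so check the codebook before relying on them.

library(readr)

# read everything as character so long comment fields are preserved intact
comments <- read_tsv("cftc_comments.txt",
                     col_types = cols(.default = col_character()))

# confirm that some comments exceed excel's 32,767-character cell limit
# (the column name "Comment Text" is an assumption)
max(nchar(comments[["Comment Text"]]), na.rm = TRUE)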
There are two files provided:
Codebook:
| Variable | Explanation ...