License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
While fixed effects (FE) models are often employed to address potential omitted variables, we argue that these models’ real utility is in isolating a particular dimension of variance from panel data for analysis. In addition, we show through novel mathematical decomposition and simulation that only one-way FE models cleanly capture either the over-time or cross-sectional dimension in panel data, while the two-way FE model unhelpfully combines within-unit and cross-sectional variation in a way that produces uninterpretable answers. In fact, as we show in this paper, if we begin with the interpretation that many researchers wrongly assign to the two-way FE model (that it represents a single estimate of the effect of X on Y while accounting for unit-level heterogeneity and time shocks), the two-way FE specification is statistically unidentified, a fact that statistical software packages like R and Stata obscure through internal matrix processing.
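A minimal R sketch of the three specifications discussed above, on a simulated panel in which the over-time and cross-sectional slopes deliberately differ; the data-generating process and variable names are illustrative, not the paper's.

```r
# Illustrative DGP (not the paper's): the over-time slope is 1 and the
# cross-sectional slope is 3, so the two one-way FE models recover different
# quantities; the two-way specification is fit alongside for comparison.
set.seed(1)
N <- 50; T <- 20
df <- expand.grid(unit = 1:N, time = 1:T)
xi <- rnorm(N, sd = 2)                      # unit-level component of x
w  <- rnorm(N * T, sd = 0.5)                # within-unit (over-time) component
df$x <- xi[df$unit] + w
df$y <- 1 * w + 3 * xi[df$unit] + rnorm(N * T)

fe_unit <- lm(y ~ x + factor(unit), data = df)                 # within-unit variation
fe_time <- lm(y ~ x + factor(time), data = df)                 # cross-sectional variation
fe_two  <- lm(y ~ x + factor(unit) + factor(time), data = df)  # two-way specification

c(unit_FE   = unname(coef(fe_unit)["x"]),
  time_FE   = unname(coef(fe_time)["x"]),
  twoway_FE = unname(coef(fe_two)["x"]))
```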
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
This repository provides code and data used in "Social Equity of Bridge Management" (DOI: 10.1061/JMENEA/MEENG-5265). Both the dataset used in the analysis ("Panel.csv") and the R script that creates it ("Panel_Prep.R") are provided. The main results of the paper, as well as alternate specifications of the ordered probit with random effects models, can be replicated with "Models_OrderedProbit.R". Note that these models require substantial memory and computational resources. Additionally, we provide alternate model specifications in the "Robustness" R scripts: binomial probit with random effects, ordered probit without random effects, and ordinary least squares with random effects. An extended version of the supplemental materials is also provided.
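The replication scripts themselves are in the repository; as a hedged sketch of the model classes involved (toy data and variable names, not the actual scripts or the columns of Panel.csv), an ordered probit without random effects can be fit with MASS::polr and an ordered probit with a random intercept with ordinal::clmm.

```r
library(MASS)      # polr: ordered probit without random effects
library(ordinal)   # clmm: cumulative-link model with a random intercept

# Hypothetical toy data; names (county, deck_area, income, rating) are illustrative.
set.seed(1)
county_id <- rep(1:30, each = 10)
dat <- data.frame(county    = factor(county_id),
                  deck_area = rnorm(300),
                  income    = rnorm(300))
latent <- 0.8 * dat$deck_area - 0.5 * dat$income +
          rnorm(30)[county_id] + rnorm(300)
dat$rating <- cut(latent, c(-Inf, -1, 0, 1, Inf), labels = 1:4,
                  ordered_result = TRUE)

# Ordered probit without random effects (one of the robustness specifications)
m_pooled <- polr(rating ~ deck_area + income, data = dat, method = "probit")

# Ordered probit with a group-level random intercept (the main specification class)
m_re <- clmm(rating ~ deck_area + income + (1 | county), data = dat,
             link = "probit")
summary(m_re)
```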
In this paper we study neural networks and their approximating power in panel data models. We provide asymptotic guarantees on deep feed-forward neural network estimation of the conditional mean, building on the work of Farrell et al. (2021), and explore latent patterns in the cross-section. We use the proposed estimators to forecast the progression of new COVID-19 cases across the G7 countries during the pandemic. We find significant forecasting gains over both linear panel and nonlinear time-series models. Containment or lockdown policies, as instituted at the national level by governments, are found to have out-of-sample predictive power for new COVID-19 cases. We illustrate how the use of partial derivatives can help open the “black box” of neural networks and facilitate semi-structural analysis: school and workplace closures are found to have been effective policies at restricting the progression of the pandemic across the G7 countries. However, our methods also reveal significant heterogeneity and time variation in the effectiveness of specific containment policies.
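As a hedged toy sketch of the general approach (a single-hidden-layer network via nnet rather than the paper's deep architecture, with made-up variable names), one can fit a nonlinear conditional mean and inspect its partial derivatives by finite differences.

```r
# Toy sketch: small neural network on simulated panel-style data, plus a
# finite-difference partial derivative of the fitted conditional mean with
# respect to a policy variable. Names (policy, mobility, lag_cases, new_cases)
# are illustrative, not the paper's variables.
library(nnet)
set.seed(1)
n <- 1000
dat <- data.frame(policy = runif(n), mobility = runif(n), lag_cases = runif(n))
dat$new_cases <- sin(2 * dat$policy) + 0.5 * dat$mobility * dat$lag_cases +
                 rnorm(n, sd = 0.1)

fit <- nnet(new_cases ~ policy + mobility + lag_cases, data = dat,
            size = 8, linout = TRUE, decay = 1e-3, maxit = 500, trace = FALSE)

# Partial derivative of the fitted mean with respect to 'policy' at each point
eps <- 1e-4
up <- dn <- dat
up$policy <- up$policy + eps
dn$policy <- dn$policy - eps
d_policy <- (predict(fit, up) - predict(fit, dn)) / (2 * eps)
summary(d_policy)   # distribution of the estimated marginal policy effect
```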
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
This paper introduces a time-varying panel data model that incorporates latent group structures, designed to tackle both individual heterogeneity and smooth structural changes over time. We develop an innovative centre-augmented K-power means (KPM) methodology that promotes convergence of subjects toward their respective cluster centers, enabling the identification of latent group structures without requiring prior knowledge of group composition. This approach delivers both superior precision and computational efficiency. We provide rigorous theoretical foundations, demonstrating estimation consistency, accurate subgroup identification, and consistent selection of the number of groups. The efficacy of the proposed KPM method in accurately identifying the latent group structures in panel data is demonstrated through comprehensive numerical analysis, including simulation studies and two real-world applications.
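The centre-augmented K-power means algorithm is the paper's own; the sketch below only illustrates the underlying idea of clustering unit-level slope estimates, using ordinary k-means on simulated data rather than the KPM procedure.

```r
# Simplified illustration: recover latent slope groups by clustering
# unit-by-unit OLS slopes with ordinary k-means (NOT the paper's KPM method).
set.seed(1)
N <- 60; T <- 30
group  <- sample(1:3, N, replace = TRUE)         # true latent groups
beta_g <- c(-1, 0.5, 2)[group]                   # group-specific slopes

unit_slopes <- sapply(1:N, function(i) {
  x <- rnorm(T)
  y <- beta_g[i] * x + rnorm(T)
  coef(lm(y ~ x))["x"]                           # per-unit time-series OLS slope
})

km <- kmeans(unit_slopes, centers = 3, nstart = 20)
table(estimated = km$cluster, true = group)      # cross-tab of recovered groups
```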
We propose a new method for estimating dynamic panel data models with selection. The method uses backward substitution for the lagged dependent variable, which leads to an estimating equation that requires correcting for contemporaneous selection only. The estimator is valid under relatively weak assumptions about the errors and avoids the weak instruments problem associated with differencing. We also propose a simple test for selection bias based on adding a selection term to the first-difference equation and testing its significance. The methods are applied to estimating dynamic earnings equations for women.
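A hedged sketch of the spirit of that test, on simulated data with illustrative variable names (the paper's backward-substitution step and exact correction term are not reproduced here): fit a selection probit, add the implied inverse Mills ratio to the first-difference equation, and test its significance.

```r
# Illustrative selection-bias check: selection probit, inverse Mills ratio,
# then a t-test on the selection term in a first-difference regression.
set.seed(1)
N  <- 2000
z  <- rnorm(N); x1 <- rnorm(N); x0 <- rnorm(N)
u  <- rnorm(N)
observed <- (0.8 * z + u > 0)                      # selection rule
dy <- 1.5 * (x1 - x0) + 0.6 * u + rnorm(N)         # first-differenced outcome

sel   <- glm(observed ~ z, family = binomial(link = "probit"))
xb    <- predict(sel, type = "link")               # estimated probit index
mills <- dnorm(xb) / pnorm(xb)                     # inverse Mills ratio

dat <- data.frame(dy, dx = x1 - x0, mills)[observed, ]
summary(lm(dy ~ dx + mills, data = dat))           # significant 'mills' flags selection bias
```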
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
Summary: Fuel demand is shown to be influenced by fuel prices, income, and motorization rates. We explore the effects of electric vehicle motorization rates on gasoline demand using this panel dataset.
Files: dataset.csv - Panel dimensions are the Brazilian state (i) and year (t). The other columns are: gasoline sales per capita (ln_Sg_pc), prices of gasoline (ln_Pg) and ethanol (ln_Pe) and their lags, motorization rates of combustion vehicles (ln_Mi_c) and electric vehicles (ln_Mi_e), and GDP per capita (ln_gdp_pc). All variables are in natural logs, since the log-log specification yields demand elasticities in the regression model.
adjacency.csv - The adjacency matrix interacted with electric vehicles' motorization rates to capture spatial effects. It starts as a binary adjacency matrix: for each pair of states i and j, cell (i, j) is 0 if the states are not adjacent and 1 if they are. Each row is then normalized to sum to one (a short R sketch of this normalization appears after the sources list below).
regression.do - The Stata commands used to estimate the regression models of our study. dataset.csv must be imported for the script to run; see the comment section.
dataset_predictions.xlsx - Based on the estimates from Stata, we use this Excel file to make average predictions by year and by state. By including years beyond the last panel sample, we also forecast the model into the future and evaluate the effects of different policies that influence gasoline prices (taxation) and EV motorization rates (electrification). This file is primarily used to create figures, but it also documents how the forecasting scenarios are set up.
Sources:
Fuel prices and sales: ANP (https://www.gov.br/anp/en/access-information/what-is-anp/what-is-anp)
State population, GDP and vehicle fleet: IBGE (https://www.ibge.gov.br/en/home-eng.html?lang=en-GB)
State EV fleet: Anfavea (https://anfavea.com.br/en/site/anuarios/)
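As referenced in the adjacency.csv description above, here is a toy R sketch of the row-normalization and of forming a spatial interaction term, using three hypothetical states; the actual analysis lives in regression.do.

```r
# Toy example of the adjacency.csv construction: binary adjacency between
# three hypothetical states, row-normalized so each row sums to one, then
# used to form a spatial lag of EV motorization rates.
A <- matrix(c(0, 1, 0,
              1, 0, 1,
              0, 1, 0), nrow = 3, byrow = TRUE)
W <- sweep(A, 1, rowSums(A), "/")   # row-normalized weights
rowSums(W)                          # each row now sums to 1

ln_Mi_e <- c(0.2, 0.5, 0.9)         # illustrative EV motorization rates (logs)
spatial_lag <- W %*% ln_Mi_e        # neighbors' average EV rate for each state
spatial_lag
```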
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
In this article, we consider estimation of common structural breaks in panel data models with unobservable interactive fixed effects. We introduce a penalized principal component (PPC) estimation procedure with an adaptive group fused LASSO to detect the multiple structural breaks in the models. Under some mild conditions, we show that with probability approaching one the proposed method can correctly determine the unknown number of breaks and consistently estimate the common break dates. Furthermore, we estimate the regression coefficients through the post-LASSO method and establish the asymptotic distribution theory for the resulting estimators. The developed methodology and theory are applicable to the case of dynamic panel data models. Simulation results demonstrate that the proposed method works well in finite samples, with a low false-detection probability when there is no structural break and a high probability of correctly estimating the number of breaks when structural breaks exist. We finally apply our method to study the environmental Kuznets curve for 74 countries over 40 years and detect two breaks in the data. Supplementary materials for this article are available online.
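The PPC and adaptive group fused LASSO machinery is the article's contribution; the sketch below is deliberately much simpler, estimating a single common break date by grid search over candidate dates on simulated data, just to illustrate what a common structural break in a panel looks like.

```r
# Grid-search illustration of a single common break date (NOT the paper's
# PPC / adaptive group fused LASSO procedure): the slope shifts at t = 25,
# and we pick the candidate date that minimizes the pooled SSR.
set.seed(1)
N <- 40; T <- 50; break_true <- 25
dat <- expand.grid(unit = 1:N, time = 1:T)
dat$x <- rnorm(N * T)
dat$y <- ifelse(dat$time <= break_true, 1, 2) * dat$x + rnorm(N * T)

ssr_at <- function(b) {
  d <- dat
  d$post <- as.numeric(d$time > b)                 # indicator for after candidate b
  sum(resid(lm(y ~ x + x:post, data = d))^2)       # slope shift after candidate b
}
candidates <- 5:(T - 5)
ssr <- sapply(candidates, ssr_at)
candidates[which.min(ssr)]                         # estimated common break date
```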
License: CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
This is the replication file for 'Bayesian Sensitivity Analysis for Unmeasured Confounding in Causal Panel Data Models', including the package that implements the proposed method, as well as replication code for the Monte Carlo studies, the simulated example, and the empirical analysis.
We present a sequential approach to estimating a dynamic Hausman-Taylor model. We first estimate the coefficients of the time-varying regressors and subsequently regress the first-stage residuals on the time-invariant regressors. In comparison to estimating all coefficients simultaneously, this two-stage procedure is more robust against model misspecification, allows for a flexible choice of the first-stage estimator, and enables simple testing of the overidentifying restrictions. For correct inference, we derive analytical standard error adjustments. We evaluate the finite-sample properties with Monte Carlo simulations and apply the approach to a dynamic gravity equation for US outward foreign direct investment.
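A stylized sketch of the two-stage logic on simulated data, for the static and fully exogenous case only; the lagged dependent variable, the instruments, and the standard error adjustment derived in the paper are omitted.

```r
# Stage 1: within (fixed effects) estimate of the time-varying coefficient.
# Stage 2: regress unit means of the stage-1 residualized outcome on the
# time-invariant regressor. All names and the DGP are illustrative.
set.seed(1)
N <- 100; T <- 8
dat <- expand.grid(unit = 1:N, time = 1:T)
z_i <- rnorm(N)                                  # time-invariant regressor
a_i <- rnorm(N)                                  # individual effect
dat$z <- z_i[dat$unit]
dat$x <- rnorm(N * T)                            # time-varying regressor
dat$y <- 1.5 * dat$x + 0.7 * dat$z + a_i[dat$unit] + rnorm(N * T)

# Stage 1: fixed-effects estimate of the coefficient on x
stage1 <- lm(y ~ x + factor(unit), data = dat)
beta_x <- coef(stage1)["x"]

# Stage 2: unit means of (y - beta_x * x) regressed on the time-invariant z
dat$res <- dat$y - beta_x * dat$x
res_bar <- tapply(dat$res, dat$unit, mean)
stage2  <- lm(res_bar ~ z_i)
coef(stage2)["z_i"]                              # estimate of the z coefficient
```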
License: Attribution-NonCommercial 3.0 (CC BY-NC 3.0), https://creativecommons.org/licenses/by-nc/3.0/
Code for simulations
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
We address the estimation of factor-augmented panel data models that use observed measurements to proxy for unobserved factors or loadings, and we explore the use of internal instruments to address the resulting endogeneity. The main challenge is that economic theory rarely indicates which measurements to choose as proxies when several are available. To overcome this problem, we propose a new class of estimators that are linear combinations of instrumental variable estimators and establish large sample results. We also show that an optimal weighting scheme exists, leading to efficiency gains relative to a single instrumental variable estimator. Simulations show that the proposed approach performs better than existing methods. We illustrate the new method using data on test scores across U.S. school districts.
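As a generic illustration of combining instrumental variable estimators (a textbook just-identified IV setting with two valid instruments, not the paper's factor-proxy setup or its optimal weighting scheme), one can weight two single-instrument estimates by bootstrap inverse variances.

```r
# Toy illustration of a linear combination of IV estimators. The paper derives
# optimal weights accounting for the joint distribution of the estimators;
# here simple inverse-variance weights from a bootstrap are used instead.
set.seed(1)
n  <- 500
z1 <- rnorm(n); z2 <- rnorm(n)
v  <- rnorm(n)
x  <- z1 + z2 + v                    # endogenous regressor
y  <- 1 * x + v + rnorm(n)           # error shares v with x, so OLS is biased

iv_est <- function(y, x, z) cov(z, y) / cov(z, x)   # just-identified IV estimator

boot <- replicate(500, {
  i <- sample.int(n, replace = TRUE)
  c(iv_est(y[i], x[i], z1[i]), iv_est(y[i], x[i], z2[i]))
})
b <- c(iv_est(y, x, z1), iv_est(y, x, z2))
v_boot <- apply(boot, 1, var)
w <- (1 / v_boot) / sum(1 / v_boot)   # inverse-variance weights
sum(w * b)                            # combined estimate of the coefficient on x
```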
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
Datasets for a qualitative study of component regression models. The resources used for the application sections of a research essay are included.
This paper studies the estimation of a panel data model with latent structures where individuals can be classified into different groups with the slope parameters being homogeneous within the same group but heterogeneous across groups. To identify the unknown group structure of vector parameters, we design an algorithm called Panel-CARDS. We show that it can identify the true group structure asymptotically and estimate the model parameters consistently at the same time. Simulations evaluate the performance and corroborate the asymptotic theory in several practical design settings. The empirical application reveals the heterogeneous grouping effect of income on democracy.
Replication data for application. Visit https://dataone.org/datasets/sha256%3Ad1b60121aa674a5618dfe7e00ccaaae8beb063be28c982d294277dafeb21e5a6 for complete metadata about this dataset.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
This article considers linear panel data models where the dependence between the regressors and the unobservables is modeled through a factor structure. The number of time periods and the sample size both go to infinity. Unlike most existing methods for estimating this type of model, nonstrong factors are allowed and the number of factors can grow to infinity with the sample size. We study a class of two-step estimators of the regression coefficients. In the first step, factors and factor loadings are estimated. The second step is the panel regression of the outcome on the regressors and the estimates of the factors and factor loadings from the first step. The estimators enjoy double robustness. Different methods can be used in the first step while the second step is unique. We derive sufficient conditions on the first-step estimator and the data-generating process under which the two-step estimator is asymptotically normal. Assumptions under which an approach based on principal components analysis in the first step yields an asymptotically normal estimator are also given. The two-step procedure exhibits good finite-sample properties in simulations. The approach is illustrated by an empirical application on fiscal policy.
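A stylized one-factor sketch of the two-step idea: the first step estimates the factor and loadings from the regressor panel by principal components (via the SVD), and the second step is a pooled regression of the outcome on the regressor and the estimated common component. This is a toy reading of the procedure, not the paper's estimator or assumptions.

```r
# One-factor toy example of the two-step approach.
set.seed(1)
N <- 100; T <- 50
lambda <- rnorm(N); f <- rnorm(T)
X <- outer(lambda, f) + matrix(rnorm(N * T), N, T)        # regressor panel
Y <- 2 * X + 1.5 * outer(lambda, f) + matrix(rnorm(N * T), N, T)

# Step 1: first principal component of the regressor panel (SVD)
s <- svd(X)
lam_hat <- s$u[, 1] * sqrt(s$d[1])
f_hat   <- s$v[, 1] * sqrt(s$d[1])
common  <- outer(lam_hat, f_hat)                          # estimated lambda_i * f_t

# Step 2: panel regression of the outcome on x and the estimated common component
dat <- data.frame(y = as.vector(Y), x = as.vector(X), cf = as.vector(common))
coef(lm(y ~ x + cf, data = dat))["x"]                     # should be close to the true beta = 2
```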
This study identifies causal links between air pollutants originating from China and a high-PM2.5 episode that occurred in Korea between February 23 and March 12, 2019. Datasets on ground-based PM2.5 levels in Korea and China, airflows from back-trajectory models, and satellite images were investigated, and long-range transboundary transport (LRTT) effects were statistically analyzed using spatial panel-data models. The findings are: 1) visual presentations of the observed PM2.5 concentrations in China and Korea, back-trajectory air flows, and satellite images from the Moderate Resolution Imaging Spectroradiometer Aerosol Optical Depth and the Copernicus Atmosphere Monitoring Service clearly show that transboundary air pollutants from China affect PM2.5 concentrations in Korea; 2) the effect of LRTT from China is likely to intensify under certain meteorological conditions, such as westerly winds from China to Korea, the formation of high pressure in China and low pressure in Korea, relatively high temperatures, and stagnant air flow in Korea; 3) the results from the spatial panel-data models provide statistical evidence of the positive effect of LRTT from China on local PM2.5 concentrations in Korea. The nationwide average LRTT contribution to PM2.5 concentrations in Korea is 38.4%, while regional contributions are 41.3% for the Seoul Metropolitan Area, 38.6% for the northwest region, and 27.5% for the southeast region, indicating the greatest impact on the Seoul Metropolitan Area.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
In this study, we develop a novel estimation method for quantile treatment effects (QTE) under rank invariance and rank stationarity assumptions. Ishihara (2020) explores identification of the nonseparable panel data model under these assumptions and proposes a parametric estimator based on the minimum distance method. However, when the dimensionality of the covariates is large, minimum distance estimation of this kind is computationally demanding. To overcome this problem, we propose a two-step estimation method based on the quantile regression and minimum distance methods. We then show the uniform asymptotic properties of our estimator and the validity of the nonparametric bootstrap. Monte Carlo studies indicate that our estimator performs well in finite samples. Finally, we present two empirical illustrations, estimating the distributional effects of insurance provision on household production and of TV watching on child cognitive development.
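A hedged sketch of the first step only, the quantile-regression stage, on toy data with illustrative variable names; the minimum-distance stage and the bootstrap are omitted.

```r
# Toy sketch of the quantile-regression building block used in the two-step
# procedure. Variable names are illustrative, not from the empirical applications.
library(quantreg)
set.seed(1)
n <- 1000
d <- data.frame(treat = rbinom(n, 1, 0.5), x = rnorm(n))
d$y <- 1 + 0.5 * d$treat + d$x + (1 + 0.4 * d$treat) * rnorm(n)

taus <- c(0.25, 0.5, 0.75)
fit  <- rq(y ~ treat + x, tau = taus, data = d)
coef(fit)["treat", ]        # treatment coefficient across conditional quantiles
```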
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
This article considers a consistent test for serial correlation of unknown form in the residuals of panel data models with interactive fixed effects and possibly lagged dependent variables. Following the spirit of Hong, we construct a test statistic based on the comparison of a kernel-based spectral density estimator and the null spectral density. Under the null hypothesis, our test statistic is asymptotically N(0, 1) as both N and T tend to infinity. In contrast to existing tests for serial correlation, there is no need to specify the order of serial correlation under the alternative. We further examine the local and global power properties of the test. A simulation study shows that our test performs well in finite samples. In the empirical application, we apply the test to study the impact of the divorce law reform on divorce rates. We find strong evidence of serial correlation in the residuals, and our results show that the divorce law reform has permanent positive effects on divorce rates.
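A stylized version of a kernel-weighted residual-autocorrelation statistic in the spirit of Hong-type tests, applied to a single residual series; the centering and scaling constants below are the simplest textbook ones and do not reproduce the paper's statistic for interactive-fixed-effects panel residuals.

```r
# Simplified Hong-style statistic: kernel-weighted sum of squared sample
# autocorrelations, centered and scaled to be roughly N(0,1) under the null.
hong_stat <- function(e, p = 10) {
  T   <- length(e)
  rho <- acf(e, lag.max = T - 1, plot = FALSE)$acf[-1]   # sample autocorrelations
  k   <- pmax(1 - (1:(T - 1)) / p, 0)                    # Bartlett kernel weights
  S   <- T * sum(k^2 * rho^2)
  C   <- sum(k^2)                                        # simple centering constant
  D   <- 2 * sum(k^4)                                    # simple scaling constant
  (S - C) / sqrt(D)
}

set.seed(1)
hong_stat(rnorm(200))                         # white-noise residuals: small statistic
hong_stat(arima.sim(list(ar = 0.5), 200))     # AR(1) residuals: large statistic
```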
License: CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
Conventional OLS fixed-effects and GLS random-effects estimators of dynamic models that control for individual effects are known to be biased when applied to short panel data (T <= 10). GMM estimators are the most commonly used alternative but are known to have drawbacks. Transformed-likelihood estimators are unused in political science. Of these, orthogonal reparameterization estimators are only tangentially referred to in any discipline. We introduce these estimators and test their performance, demonstrating that the little-used orthogonal reparameterization transformed-likelihood estimator in particular performs very well and improves on the commonly used GMM estimators. When T and/or N are small, it provides efficiency gains and overcomes the issues GMM estimators encounter in estimating long-run effects when the coefficient on the lagged dependent variable is close to one.
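The orthogonal reparameterization estimator requires the transformed-likelihood machinery described in the paper; the sketch below only illustrates the problem it addresses, the downward (Nickell) bias of the OLS fixed-effects estimator of the lagged dependent variable in a short panel.

```r
# Illustration of the motivating problem: Nickell bias of the OLS fixed-effects
# (within) estimator of the lagged dependent variable when T = 5. This is NOT
# the orthogonal reparameterization estimator discussed in the text.
set.seed(1)
N <- 200; T <- 5; phi <- 0.8

one_draw <- function() {
  a <- rnorm(N)                                   # individual effects
  y <- matrix(0, N, T + 1)
  y[, 1] <- a + rnorm(N)                          # rough initial condition
  for (t in 2:(T + 1)) y[, t] <- phi * y[, t - 1] + a + rnorm(N)
  dat <- data.frame(unit = rep(1:N, T),
                    y    = as.vector(y[, 2:(T + 1)]),
                    ylag = as.vector(y[, 1:T]))
  coef(lm(y ~ ylag + factor(unit), data = dat))["ylag"]
}

est <- replicate(200, one_draw())
mean(est)   # substantially below the true phi = 0.8: downward Nickell bias
```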
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
This file describes and lists the Stata and data files that are used to produce the results of the paper titled "Revisiting Neoclassical Growth Theory: A Primary Role for Inflation and Capacity Utilization". Step-by-step instructions are found in Readme.pdf.