49 datasets found
  1.

    Interpretation and identification of within-unit and cross-sectional...

    • plos.figshare.com
    pdf
    Updated May 31, 2023
    Jonathan Kropko; Robert Kubinec (2023). Interpretation and identification of within-unit and cross-sectional variation in panel data models [Dataset]. http://doi.org/10.1371/journal.pone.0231349
    Dataset provided by
    PLOS ONE
    Authors
    Jonathan Kropko; Robert Kubinec
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    While fixed effects (FE) models are often employed to address potential omitted variables, we argue that these models’ real utility is in isolating a particular dimension of variance from panel data for analysis. In addition, we show through novel mathematical decomposition and simulation that only one-way FE models cleanly capture either the over-time or cross-sectional dimensions in panel data, while the two-way FE model unhelpfully combines within-unit and cross-sectional variation in a way that produces un-interpretable answers. In fact, as we show in this paper, if we begin with the interpretation that many researchers wrongly assign to the two-way FE model—that it represents a single estimate of X on Y while accounting for unit-level heterogeneity and time shocks—the two-way FE specification is statistically unidentified, a fact that statistical software packages like R and Stata obscure through internal matrix processing.
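
    To make the one-way case concrete, the following is a minimal Python sketch of a unit fixed-effects estimate computed two ways: by demeaning within units, and by regressing on unit dummies. It is not taken from the paper's replication material. The simulated data, the variable names, and the true slope of 2.0 are assumptions chosen for illustration.

```python
import numpy as np

# Simulated balanced panel (all values here are illustrative assumptions).
rng = np.random.default_rng(0)
n_units, n_periods = 5, 8
unit = np.repeat(np.arange(n_units), n_periods)   # unit index per row
alpha = rng.normal(size=n_units)                  # unobserved unit effects
x = rng.normal(size=unit.size) + alpha[unit]      # x is correlated with alpha
y = 2.0 * x + 3.0 * alpha[unit] + rng.normal(scale=0.1, size=unit.size)


def demean_by(values, groups):
    """Subtract each group's mean from its members (within transformation)."""
    means = np.bincount(groups, weights=values) / np.bincount(groups)
    return values - means[groups]


# Within estimator: OLS slope on unit-demeaned data.
x_w, y_w = demean_by(x, unit), demean_by(y, unit)
beta_within = (x_w @ y_w) / (x_w @ x_w)

# Dummy-variable estimator: regress y on x plus one dummy per unit.
dummies = (unit[:, None] == np.arange(n_units)).astype(float)
design = np.column_stack([x, dummies])
beta_lsdv = np.linalg.lstsq(design, y, rcond=None)[0][0]

print(beta_within, beta_lsdv)
```

    The two slopes coincide because demeaning by unit removes exactly the variation that the unit dummies absorb, so only over-time variation within each unit is left to identify the coefficient.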

  2.

    Replication data for: Inferring Transition Probabilities from Repeated Cross...

    • dataverse.harvard.edu
    Updated Feb 18, 2010
    Ben Pelzer; Rob Eisinga; Philip Hans Franses (2010). Replication data for: Inferring Transition Probabilities from Repeated Cross Sections [Dataset]. http://doi.org/10.7910/DVN/MKJ5EN
    Dataset provided by
    Harvard Dataverse
    Authors
    Ben Pelzer; Rob Eisinga; Philip Hans Franses
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This paper discusses a nonstationary, heterogeneous Markov model designed to estimate entry and exit transition probabilities at the micro level from a time series of independent cross-sectional samples with a binary outcome variable. The model has its origins in the work of Moffitt and shares features with standard statistical methods for ecological inference. We outline the methodological framework proposed by Moffitt and present several extensions of the model to increase its potential application in a wider array of research contexts. We also discuss the relationship with previous lines of related research in political science. The example illustration uses survey data on American presidential vote intentions from a five-wave panel study conducted by Patterson in 1976. We treat the panel data as independent cross sections and compare the estimates of the Markov model with both dynamic panel parameter estimates and the actual observations in the panel. The results suggest that the proposed model provides a useful framework for the analysis of transitions in repeated cross sections. Open problems requiring further study are discussed.
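
    The link between micro-level entry and exit probabilities and the marginal shares observed in each cross section can be sketched as below. This is only the accounting identity that follows from the law of total probability, not the estimator described in the paper. The function name and the numeric values are hypothetical.

```python
def propagate_share(p0, entry, exit_):
    """Propagate the share of units in state 1 across waves.

    p0     -- share in state 1 in the first cross section
    entry  -- per-wave probability of moving from state 0 to state 1
    exit_  -- per-wave probability of moving from state 1 to state 0
    """
    shares = [p0]
    for mu, lam in zip(entry, exit_):
        p_prev = shares[-1]
        # Stayers in state 1 plus entrants from state 0.
        shares.append(p_prev * (1.0 - lam) + (1.0 - p_prev) * mu)
    return shares


# Hypothetical example: 40% start in state 1, followed by two transitions.
shares = propagate_share(0.40, entry=[0.10, 0.20], exit_=[0.05, 0.10])
print(shares)
```

    Repeated cross sections reveal only the sequence of shares, so recovering the entry and exit probabilities from them requires additional structure, such as the covariate-based modelling the abstract describes.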

  3.

    National Panel Survey- Universal Panel Questionnaire, 2008-2015 - United...

    • microdata.fao.org
    Updated Nov 8, 2022
    National Bureau of Statistics (2022). National Panel Survey- Universal Panel Questionnaire, 2008-2015 - United Republic of Tanzania [Dataset]. https://microdata.fao.org/index.php/catalog/1772
    Dataset authored and provided by
    National Bureau of Statistics
    Time period covered
    2008 - 2015
    Area covered
    Tanzania
    Description

    Abstract

    Panel data possess several advantages over conventional cross-sectional and time-series data, including their power to isolate the effects of specific actions, treatments, and general policies often at the core of large-scale econometric development studies. While the concept of panel data alone provides the capacity for modelling the complexities of human behaviour, the notion of universal panel data - in which time- and situation-driven variances leading to variations in tools, and thus results, are mitigated - can further enhance exploitation of the richness of panel information. The NPS Universal Panel Questionnaire (UPQ) consists of both survey instruments and datasets, meticulously aligned and engineered with the aim of facilitating the use of and improving access to the wealth of panel data offered by the NPS. The NPS-UPQ provides a consistent and straightforward means of conducting not only user-driven analyses using convenient, standardized tools, but also for monitoring MKUKUTA, FYDP II, and other national level development indicators reported by the NPS.

    The design of the NPS-UPQ combines the four completed rounds of the NPS - NPS 2008/09 (R1), NPS 2010/11 (R2), NPS 2012/13 (R3), and NPS 2014/15 (R4) - into pooled, module-specific survey instruments and datasets. The panel survey instruments offer the ease of comparability over time, with modifications and variances easily identifiable as well as those aspects of the questionnaire which have remained identical and offer consistent information. By providing all module-specific data over time within compact, pooled datasets, panel datasets eliminate the need for user-generated merges between rounds and present data in a clear, logical format, increasing both the usability and comprehension of complex data.

    Geographic coverage

    Regional coverage

    Analysis unit

    Households

    Universe

    The universe includes all households and individuals in Tanzania with the exception of those residing in military barracks or other institutions.

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    While the same sample of respondents was maintained over the first three rounds of the NPS, longitudinal surveys tend to suffer from bias introduced by households leaving the survey over time, i.e. attrition. Although the NPS maintains a highly successful recapture rate (roughly 96% retention at the household level), minimizing the escalation of this selection bias, a refresh of longitudinal cohorts was done for the NPS 2014/15 to ensure proper representativeness of estimates while maintaining a sufficient primary sample to maintain cohesion within panel analysis. A newly completed Population and Housing Census (PHC) in 2012, which provided updated population figures along with changes in administrative boundaries, created the opportunity to realign the NPS sample and abate collective bias potentially introduced through attrition.

    To maintain the panel concept of the NPS, the sample design for NPS 2014/2015 consisted of a combination of the original NPS sample and a new NPS sample. A nationally representative sub-sample was selected to continue as part of the “Extended Panel” while an entirely new sample, “Refresh Panel”, was selected to represent national and sub-national domains. Similar to the sample in NPS 2008/2009, the sample design for the “Refresh Panel” allows analysis at four primary domains of inference, namely: Dar es Salaam, other urban areas on mainland Tanzania, rural mainland Tanzania, and Zanzibar. This new cohort in NPS 2014/2015 will be maintained and tracked in all future rounds between national censuses.

    Mode of data collection

    Face-to-face [f2f]

    Research instrument

    The format of the NPS-UPQ survey instrument is similar to previously disseminated NPS survey instruments. Each module has a questionnaire and clearly identifies if the module collects information at the individual or household level. Within each module-specific questionnaire of the NPS-UPQ survey instrument, there are five distinct sections, arranged vertically: (1) the UPQ - “U” on the survey instrument, (2) R4, (3) R3, (4) R2, and (5) R1 – the latter four sections presenting each questionnaire in its original form at time of its respective dissemination.

    The uppermost section of each module’s questionnaire (“U”) represents the model universal panel questionnaire, with questions generated from the comprehensive listing of questions across all four rounds of the NPS and codes generated from the comprehensive collection of codes. The following sections are arranged vertically by round, considering R4 as most recent. While not all rounds will have data reported for each question in the UPQ and not each question will have reports for each of the UPQ codes listed, the NPS-UPQ survey instrument represents the visual, all-inclusive set of information collected by the NPS over time.

    The four round-specific sections (R4, R3, R2, R1) are aligned with their UPQ-equivalent question, visually presenting their contribution to compatibility with the UPQ. Each round-specific section includes the original round-specific variable names, response codes and skip patterns (corresponding to their respective round-specific NPS data sets, and despite their variance from other rounds or from the comprehensive UPQ code listing).

    • Household identification;
    • Survey staff details;
    • Household member roster;
    • Education;
    • Health;
    • Labour;
    • Food outside the household;
    • Subjective welfare;
    • Food security;
    • Housing, water and sanitation;
    • Consumption of food over the past one week;
    • Non-food expenditures (past one week & one month);
    • Non-food expenditures (past twelve months);
    • Household assets;
    • Family/household non-farm enterprises;
    • Assistance and groups;
    • Credit;
    • Finance;
    • Recent shocks to household welfare;
    • Deaths in the household;
    • Household recontact information;
    • Filter questions;
    • Anthropometry.

  4.

    Data from: Quantile Co-Movement in Financial Markets: A Panel Quantile Model...

    • tandf.figshare.com
    zip
    Updated May 30, 2023
    Tomohiro Ando; Jushan Bai (2023). Quantile Co-Movement in Financial Markets: A Panel Quantile Model With Unobserved Heterogeneity [Dataset]. http://doi.org/10.6084/m9.figshare.7461701.v3
    Dataset provided by
    Taylor & Francis
    Authors
    Tomohiro Ando; Jushan Bai
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This article introduces a new procedure for analyzing the quantile co-movement of a large number of financial time series based on a large-scale panel data model with factor structures. The proposed method attempts to capture the unobservable heterogeneity of each of the financial time series based on sensitivity to explanatory variables and to the unobservable factor structure. In our model, the dimension of the common factor structure varies across quantiles, and the explanatory variables are allowed to depend on the factor structure. The proposed method allows for both cross-sectional and serial dependence, and heteroscedasticity, which are common in financial markets. We propose new estimation procedures for both frequentist and Bayesian frameworks. Consistency and asymptotic normality of the proposed estimator are established. We also propose a new model selection criterion for determining the number of common factors together with theoretical support. We apply the method to analyze the returns for over 6000 international stocks from over 60 countries during the subprime crisis, European sovereign debt crisis, and subsequent period. The empirical analysis indicates that the common factor structure varies across quantiles. We find that the common factors for the quantiles and the common factors for the mean are different. Supplementary materials for this article are available online.

  5.

    General Household Survey 2010-2019 - Nigeria

    • microdata.worldbank.org
    • catalog.ihsn.org
    Updated May 18, 2023
    National Bureau of Statistics (NBS) (2023). General Household Survey 2010-2019 - Nigeria [Dataset]. https://microdata.worldbank.org/index.php/catalog/5835
    Dataset authored and provided by
    National Bureau of Statistics (NBS)
    Time period covered
    2010 - 2019
    Area covered
    Nigeria
    Description

    Abstract

    Panel data possess several advantages over conventional cross-sectional and time-series data, including their power to isolate the effects of specific actions, treatments, and general policies often at the core of large-scale econometric development studies. While the concept of panel data alone provides the capacity for modeling the complexities of human behavior, the notion of universal panel data – in which time- and situation-driven variances leading to variations in tools, and thus results, are mitigated – can further enhance exploitation of the richness of panel information.

    The Basic Information Document (BID) provides a brief overview of the Nigerian General Household Survey (GHS) but focuses primarily on the theoretical development and application of panel data, as well as key elements of the universal panel survey instrument and datasets generated by the four rounds of the GHS. As the BID does not describe in detail the background, development, or use of the GHS itself, the wave-specific GHS BIDs should supplement the information provided here.

    The Nigeria Universal Panel Data (NUPD) consists of both survey instruments and datasets from the two survey visits of the GHS - Post-Planting (PP) and Post-Harvest (PH) - meticulously aligned and engineered with the aim of facilitating the use of and improving access to the wealth of panel data offered by the GHS. The NUPD provides a consistent and straightforward means of conducting user-driven analyses using convenient, standardized tools.

    The design of the NUPD combines the four completed Waves of the GHS Household Post-Planting and Post-Harvest Surveys – Wave 1 (2010/11), Wave 2 (2012/13), Wave 3 (2015/16), and Wave 4 (2018/19) – into pooled, module-specific survey instruments and datasets. The panel survey instruments offer the ease of comparability over time, with modifications and variances easily identifiable as well as those aspects of the questionnaire which have remained identical and offer consistent information. By providing all module-specific data over time within compact, pooled datasets, panel datasets eliminate the need for user-generated merges between rounds and present data in a clear, logical format, increasing both the usability and comprehension of complex data.

    Geographic coverage

    National

    Analysis unit

    • Households
    • Individuals

    Universe

    The survey covered all de jure households excluding prisons, hospitals, military barracks, and school dormitories.

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    Please see the GHS BIDs for each round for detailed descriptions of the sample design used in each round and their respective implementation efforts as this is a compilation of datasets from all previous waves.

    Mode of data collection

    Face-to-face [f2f]

    Research instrument

    The larger GHS-Panel project consists of three questionnaires (Household Questionnaire, Agriculture Questionnaire, Community Questionnaire) for each of the two visits (Post-Planting and Post-Harvest). The GHS-NUPD only consists of the Household Questionnaire.

    GHS-Panel Household Questionnaire: The Household Questionnaire provides information on demographics; education; health (including anthropometric measurement for children); labor; food and non-food expenditure; household nonfarm income-generating activities; food security and shocks; safety nets; housing conditions; assets; information and communication technology; and other sources of household income.

    The Household Questionnaire is slightly different for the two visits. Some information was collected only in the post-planting visit, some only in the post-harvest visit, and some in both visits.

    Cleaning operations

    Please see the GHS BIDs for each round for detailed descriptions of data editing and additional data processing efforts as this is a compilation of datasets from all previous waves.

  6.

    Data from: Panel Data Cointegration Testing with Structural Instabilities

    • tandf.figshare.com
    bin
    Updated Dec 18, 2024
    Anindya Banerjee; Josep Lluís Carrion-i-Silvestre (2024). Panel Data Cointegration Testing with Structural Instabilities [Dataset]. http://doi.org/10.6084/m9.figshare.25365593.v2
    Dataset provided by
    Taylor & Francis
    Authors
    Anindya Banerjee; Josep Lluís Carrion-i-Silvestre
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Spurious regression analysis in panel data when the time series are cross-section dependent is analyzed in the article. The set-up includes (possibly unknown) multiple structural breaks that can affect both the deterministic and the common factor components. We show that consistent estimation of the long-run average parameter is possible once cross-section dependence is controlled using cross-section averages in the spirit of Pesaran’s common correlated effects approach. This result is used to design individual and panel cointegration test statistics that accommodate the presence of structural breaks that can induce parameter instabilities in the deterministic component, the cointegration vector and the common factor loadings.

  7.

    Data from: A Bayesian Approach to Dynamic Panel Models with Endogenous...

    • dataverse.harvard.edu
    bin, pdf +1
    Updated Mar 11, 2016
    Harvard Dataverse (2016). A Bayesian Approach to Dynamic Panel Models with Endogenous Rarely Changing Variables [Dataset]. http://doi.org/10.7910/DVN/08RCPK
    Dataset provided by
    Harvard Dataverse
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Whether democratic and nondemocratic regimes perform differently in social provision policy is an important issue to social scientists and policy makers. Since political regimes are rarely changing, their long-term or dynamic effects on the outcome are of concern to researchers when they evaluate how political regimes affect social policy. However, estimating the dynamic effects of rarely changing variables in the analysis of time-series cross-sectional (TSCS) data by conventional estimators may be problematic when the unit effects are included in the model specification. This article proposes a model to account for and estimate the correlation between the unit effects and explanatory variables. Applying the proposed model to 18 Latin American countries, this article finds evidence that democracy has a positive effect on social spending both in the short and long term.

  8.

    Data from: Estimation in a semiparametric panel data model with...

    • tandf.figshare.com
    pdf
    Updated May 31, 2023
    Chaohua Dong; Jiti Gao; Bin Peng (2023). Estimation in a semiparametric panel data model with nonstationarity [Dataset]. http://doi.org/10.6084/m9.figshare.7335209.v1
    Dataset provided by
    Taylor & Francis
    Authors
    Chaohua Dong; Jiti Gao; Bin Peng
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    In this paper, we consider a partially linear panel data model with nonstationarity and certain cross-sectional dependence. Accounting for the explosive feature of the nonstationary time series, we particularly employ Hermite orthogonal functions in this study. Under a general spatial error dependence structure, we then establish some consistent closed-form estimates for both the unknown parameters and the unknown functions for the cases where N and T go jointly to infinity. Rates of convergence and asymptotic normalities are established for the proposed estimators. Both the finite sample performance and the empirical applications show that the proposed estimation methods work well.

  9.

    Replication Data for: A Practical Guide to Counterfactual Estimators for...

    • search.dataone.org
    Updated Nov 9, 2023
    Liu, Licheng; Wang, Ye; Xu, Yiqing (2023). Replication Data for: A Practical Guide to Counterfactual Estimators for Causal Inference with Time-Series Cross-Sectional Data [Dataset]. http://doi.org/10.7910/DVN/ZVC9W5
    Dataset provided by
    Harvard Dataverse
    Authors
    Liu, Licheng; Wang, Ye; Xu, Yiqing
    Description

    This paper introduces a simple framework of counterfactual estimation for causal inference with time-series cross-sectional data, in which we estimate the average treatment effect on the treated by directly imputing counterfactual outcomes for treated observations. We discuss several novel estimators under this framework, including the fixed effects counterfactual estimator, interactive fixed effects counterfactual estimator, and matrix completion estimator. They provide more reliable causal estimates than conventional two-way fixed effects models when treatment effects are heterogeneous or unobserved time-varying confounders exist. Moreover, we propose a new dynamic treatment effects plot, along with several diagnostic tests, to help researchers gauge the validity of the identifying assumptions. We illustrate these methods with two political economy examples and develop an open-source package, fect, in both R and Stata to facilitate implementation.
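
    The imputation idea in the abstract can be sketched in a few lines of Python. This is a hand-rolled illustration under simple assumptions (additive unit and period effects, a balanced simulated panel, a constant effect of 1.5). It does not use or reproduce the fect package, and all names and values are hypothetical.

```python
import numpy as np

# Simulated balanced panel (all values here are illustrative assumptions).
rng = np.random.default_rng(1)
n_units, n_periods, true_att = 6, 5, 1.5
unit, time = np.divmod(np.arange(n_units * n_periods), n_periods)
treated = (unit < 3) & (time >= 3)          # units 0-2 treated from period 3
alpha = rng.normal(size=n_units)            # unit effects
gamma = rng.normal(size=n_periods)          # period effects
noise = rng.normal(scale=0.05, size=unit.size)
y = alpha[unit] + gamma[time] + true_att * treated + noise

# Design matrix: one dummy per unit, plus period dummies with the first
# period dropped so the columns are not collinear.
unit_d = (unit[:, None] == np.arange(n_units)).astype(float)
time_d = (time[:, None] == np.arange(1, n_periods)).astype(float)
X = np.column_stack([unit_d, time_d])

# Step 1: fit the additive model on untreated observations only.
coef, *_ = np.linalg.lstsq(X[~treated], y[~treated], rcond=None)

# Step 2: impute the untreated outcome for each treated observation.
y0_hat = X[treated] @ coef

# Step 3: average the observed-minus-imputed gaps over treated observations.
att_hat = np.mean(y[treated] - y0_hat)
print(att_hat)
```

    Because the model is fitted only on untreated observations, the treated outcomes never influence the estimated unit and period effects, which is what distinguishes this approach from running a single two-way fixed effects regression on the full sample.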

  10. Replication Data for: Democratization and Gini index: Panel data analysis...

    • search.datacite.org
    Updated 2019
    LEIZHEN ZANG; Xiong Feng (2019). Replication Data for: Democratization and Gini index: Panel data analysis based on random forest method [Dataset]. http://doi.org/10.7910/dvn/w2cxvu
    Dataset provided by
    DataCite (https://www.datacite.org/)
    Harvard Dataverse
    Authors
    LEIZHEN ZANG; Xiong Feng
    Description

    The mechanism linking democratic development and the wealth gap has long been a focus of political and economic research, yet no consistent conclusion has emerged. The reasons are often 1) the difficulty of generalizing results obtained from single-country time-series studies or multinational cross-sectional analyses, and 2) deviations in research results caused by missing values or variable selection in panel data analysis. Two factors contribute to the latter. One is that missing values in variables impair the accuracy of estimation; the other is that researchers must exercise subjective discretion in selecting suitable proxies among many candidates, which is likely to cause variable selection bias. To address these problems, this study pioneers the use of a machine learning method on this topic, interpolating missing values efficiently with a random forest model, and analyzes cross-country data from 151 countries covering the period 1993–2017. Since this paper measures the importance of different variables to the dependent variable, more appropriate and important variables can be selected to construct a complete regression model. Results from different models agree that the promotion of democracy can significantly narrow the gap between the rich and the poor, with a marginally decreasing effect with respect to wealth. In addition, the study finds that this mechanism exists only in non-colonial nations or presidential states. Finally, this paper discusses the potential theoretical and policy implications of the results.

  11.

    Replication Data for: Getting Time Right: Using Cox Models and Probabilities...

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Nov 19, 2023
    Metzger, Shawna; Jones, Benjamin (2023). Replication Data for: Getting Time Right: Using Cox Models and Probabilities to Interpret Binary Panel Data [Dataset]. http://doi.org/10.7910/DVN/FEW2JP
    Explore at:
    Dataset updated
    Nov 19, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Metzger, Shawna; Jones, Benjamin
    Description

    Replication material for Metzger and Jones' "Getting Time Right" (forthcoming, Political Analysis). See "readme.html" in /code folder for further documentation. The CO capsule does not rerun the main simulations, but does provide the raw simulation results from those simulations. Abstract: Logit and probit (L/P) models are a mainstay of binary time-series cross-sectional analyses (BTSCS). Researchers include cubic splines or time polynomials to acknowledge the temporal element inherent in these data. However, L/P models cannot easily accommodate three other aspects of the data’s temporality: whether covariate effects are conditional on time, whether the process of interest is causally complex, and whether our functional form assumption regarding time’s effect is correct. Failing to account for any of these issues amounts to misspecification bias, threatening our inferences’ validity. We argue scholars should consider using Cox duration models when analyzing BTSCS data, as they create fewer opportunities for such misspecification bias, while also having the ability to assess the same hypotheses as L/P. We use Monte Carlo simulations to bring new evidence to light showing Cox models perform just as well—and sometimes better—than logit models in a basic BTSCS setting, and perform considerably better in more complex BTSCS situations. In addition, we highlight a new interpretation technique for Cox models—transition probabilities—to make Cox model results more readily interpretable. We use an application from interstate conflict to demonstrate our points.

  12. General Social Survey 2010 Cross-Section and Panel Combined

    • thearda.com
    Updated 2010
    + more versions
    Cite
    The Association of Religion Data Archives (2010). General Social Survey 2010 Cross-Section and Panel Combined [Dataset]. http://doi.org/10.17605/OSF.IO/C6G27
    Explore at:
    Dataset updated
    2010
    Dataset provided by
    Association of Religion Data Archives
    Dataset funded by
    National Science Foundation
    Description

    The General Social Surveys (GSS) have been conducted by the National Opinion Research Center (NORC) annually since 1972, except for the years 1979, 1981, and 1992 (a supplement was added in 1992), and biennially beginning in 1994. The GSS are designed to be part of a program of social indicator research, replicating questionnaire items and wording in order to facilitate time-trend studies. This data file has all cases and variables asked on the 2010 GSS. There are a total of 4,901 cases in the data set but their initial sampling years vary because the GSS now contains panel cases. Sampling years can be identified with the variable SAMPTYPE.

    The 2010 GSS featured special modules on aging, the Internet, shared capitalism, gender roles, intergroup relations, immigration, meeting spouse, knowledge about and attitudes toward science, religious identity, religious trends, genetics, veterans, crime and victimization, social networks and group membership, and sexual behavior (continuing the series started in 1988).

    The GSS has switched from a repeating, cross-section design to a combined repeating cross-section and panel-component design. The 2006 GSS was the base year for the first panel. A sub-sample of 2,000 GSS cases from 2006 was selected for reinterview in 2008 and again in 2010 as part of the GSSs in those years. The 2008 GSS consists of a new cross-section plus the reinterviews from 2006. The 2010 GSS consists of a new cross-section of 2,044, the first reinterview wave of the 2,023 2008 panel cases with 1,581 completed cases, and the second and final reinterview of the 2006 panel with 1,276 completed cases. Altogether, the 2010 GSS had 4,901 cases (2,044 in the new 2010 panel, 1,581 in the 2008 panel, and 1,276 in the 2006 panel). The 2010 GSS is the first round to fully implement the new, rolling panel design. In 2012 and later GSSs, there will likewise be a fresh cross-section (wave one of a new panel), wave two panel cases from the immediately preceding GSS, and wave three panel cases from the next earlier GSS.

    To download syntax files for the GSS that reproduce well-known religious group recodes, including RELTRAD, please visit ARDA's Syntax Repository (/research/syntax-repository-list).

  13. Health and Retirement Study (HRS)

    • search.dataone.org
    Updated Nov 21, 2023
    Cite
    Damico, Anthony (2023). Health and Retirement Study (HRS) [Dataset]. http://doi.org/10.7910/DVN/ELEKOY
    Explore at:
    Dataset updated
    Nov 21, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Damico, Anthony
    Description

    analyze the health and retirement study (hrs) with r

    the hrs is the one and only longitudinal survey of american seniors. with a panel starting its third decade, the current pool of respondents includes older folks who have been interviewed every two years as far back as 1992. unlike cross-sectional or shorter panel surveys, respondents keep responding until, well, death do us part. paid for by the national institute on aging and administered by the university of michigan's institute for social research, if you apply for an interviewer job with them, i hope you like werther's original. figuring out how to analyze this data set might trigger your fight-or-flight synapses if you just start clicking around on michigan's website. instead, read pages numbered 10-17 (pdf pages 12-19) of this introduction pdf and don't touch the data until you understand figure a-3 on that last page. if you start enjoying yourself, here's the whole book. after that, it's time to register for access to the (free) data. keep your username and password handy, you'll need it for the top of the download automation r script. next, look at this data flowchart to get an idea of why the data download page is such a righteous jungle. but wait, good news: umich recently farmed out its data management to the rand corporation, who promptly constructed a giant consolidated file with one record per respondent across the whole panel. oh so beautiful. the rand hrs files make much of the older data and syntax examples obsolete, so when you come across stuff like instructions on how to merge years, you can happily ignore them - rand has done it for you.

    the health and retirement study only includes noninstitutionalized adults when new respondents get added to the panel (as they were in 1992, 1993, 1998, 2004, and 2010) but once they're in, they're in - respondents have a weight of zero for interview waves when they were nursing home residents; but they're still responding and will continue to contribute to your statistics so long as you're generalizing about a population from a previous wave (for example: it's possible to compute "among all americans who were 50+ years old in 1998, x% lived in nursing homes by 2010"). my source for that 411? page 13 of the design doc. wicked.

    this new github repository contains five scripts:

    1992 - 2010 download HRS microdata.R
    • loop through every year and every file, download, then unzip everything in one big party

    import longitudinal RAND contributed files.R
    • create a SQLite database (.db) on the local disk
    • load the rand, rand-cams, and both rand-family files into the database (.db) in chunks (to prevent overloading ram)

    longitudinal RAND - analysis examples.R
    • connect to the sql database created by the 'import longitudinal RAND contributed files' program
    • create two database-backed complex sample survey object, using a taylor-series linearization design
    • perform a mountain of analysis examples with wave weights from two different points in the panel

    import example HRS file.R
    • load a fixed-width file using only the sas importation script directly into ram with SAScii (http://blog.revolutionanalytics.com/2012/07/importing-public-data-with-sas-instructions-into-r.html)
    • parse through the IF block at the bottom of the sas importation script, blank out a number of variables
    • save the file as an R data file (.rda) for fast loading later

    replicate 2002 regression.R
    • connect to the sql database created by the 'import longitudinal RAND contributed files' program
    • create a database-backed complex sample survey object, using a taylor-series linearization design
    • exactly match the final regression shown in this document provided by analysts at RAND as an update of the regression on pdf page B76 of this document.

    click here to view these five scripts

    for more detail about the health and retirement study (hrs), visit:
    • michigan's hrs homepage
    • rand's hrs homepage
    • the hrs wikipedia page
    • a running list of publications using hrs

    notes: exemplary work making it this far. as a reward, here's the detailed codebook for the main rand hrs file. note that rand also creates 'flat files' for every survey wave, but really, most every analysis you can think of is possible using just the four files imported with the rand importation script above. if you must work with the non-rand files, there's an example of how to import a single hrs (umich-created) file, but if you wish to import more than one, you'll have to write some for loops yourself.

    confidential to sas, spss, stata, and sudaan users: a tidal wave is coming. you can get water up your nose and be dragged out to sea, or you can grab a surf board. time to transition to r. :D

  14. Data_Sheet_1_Impact of Alcohol Outlet Density on Reported Cases of Child...

    • frontiersin.figshare.com
    docx
    Updated May 31, 2023
    Cite
    Yuna Koyama; Takeo Fujiwara (2023). Data_Sheet_1_Impact of Alcohol Outlet Density on Reported Cases of Child Maltreatment in Japan: Fixed Effects Analysis.docx [Dataset]. http://doi.org/10.3389/fpubh.2019.00265.s001
    Explore at:
    Available download formats: docx
    Dataset updated
    May 31, 2023
    Dataset provided by
    Frontiers
    Authors
    Yuna Koyama; Takeo Fujiwara
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Japan
    Description

    Background: Parental drinking habits and binge drinking are known risk factors for child maltreatment. Though drinking habits are affected by alcohol outlet density, the direct association between alcohol outlet density and child maltreatment is still controversial. Purpose: This study aimed to examine the impact of off-premises alcohol outlet density on child maltreatment cases reported to Child Guidance Centers in Japan. Methods: A fixed effects model was used to investigate the association between a change in off-premises alcohol outlet density and a change in child maltreatment cases in each unit. Time-series of cross-sectional ecological data collected from across Japan over 16 years (2000 to 2015) was used, and maltreatment cases were further sub-grouped by type of maltreatment (physical, sexual, psychological abuse and neglect) and by perpetrators (father, stepfather, mother, and stepmother). Results: The association between alcohol outlet density and total cases of child maltreatment was not observed (coefficient = 0.98, 95% confidence interval: −6.30, 8.25). However, alcohol outlet density was shown to be positively associated with neglect (coefficient = 3.08, 95% confidence interval: 0.54, 5.62), which indicates that an increase of 1 alcohol outlet per 1,000 adults would lead to 3 more neglect cases per 10,000 children. Also, a negative association was observed between a change in the incidence of total child maltreatment by father and a change in alcohol outlet density (coefficient = −3.03, 95% confidence interval: −5.78, −0.28). Conclusion: The findings suggest that off-premises alcohol outlet density may have a causal effect on the increasing cases of neglect and decrease in maltreatment by father in Japan.
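
    As a rough illustration of what a one-way fixed effects model does with "a change in each unit", the sketch below demeans x and y within each unit and fits a slope on the demeaned values. It is not the authors' code: the function name `within_estimator`, the toy panel, the unit effects, and the true slope of 2.0 are all hypothetical and chosen only so the result is easy to check.

```python
from collections import defaultdict


def within_estimator(units, x, y):
    """Slope of a one-way fixed effects model with a single regressor,
    computed by demeaning x and y within each unit (illustrative only)."""
    # Accumulate per-unit sums of x, sums of y, and observation counts.
    totals = defaultdict(lambda: [0.0, 0.0, 0])
    for u, xi, yi in zip(units, x, y):
        t = totals[u]
        t[0] += xi
        t[1] += yi
        t[2] += 1

    # Regress demeaned y on demeaned x (no intercept needed after demeaning).
    num = 0.0
    den = 0.0
    for u, xi, yi in zip(units, x, y):
        sum_x, sum_y, n = totals[u]
        dx = xi - sum_x / n
        dy = yi - sum_y / n
        num += dx * dy
        den += dx * dx
    return num / den


# Hypothetical toy panel: 3 units observed 3 times, y = unit effect + 2.0 * x.
units = ["A", "A", "A", "B", "B", "B", "C", "C", "C"]
x = [1.0, 2.0, 3.0, 2.0, 4.0, 6.0, 0.0, 1.0, 5.0]
unit_effects = {"A": 10.0, "B": -5.0, "C": 100.0}
y = [unit_effects[u] + 2.0 * xi for u, xi in zip(units, x)]

beta = within_estimator(units, x, y)
print(round(beta, 6))
```

    Because demeaning removes anything constant within a unit, the large differences between the hypothetical unit effects do not affect the estimated slope.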

  15. ANES 2012 Time Series Study

    • icpsr.umich.edu
    ascii, delimited, r +3
    Updated May 17, 2016
    + more versions
    Cite
    The American National Election Studies (ANES) (2016). ANES 2012 Time Series Study [Dataset]. http://doi.org/10.3886/ICPSR35157.v1
    Explore at:
    Available download formats: delimited, spss, stata, sas, r, ascii
    Dataset updated
    May 17, 2016
    Dataset provided by
    Inter-university Consortium for Political and Social Research (https://www.icpsr.umich.edu/web/pages/)
    Authors
    The American National Election Studies (ANES)
    License

    https://www.icpsr.umich.edu/web/ICPSR/studies/35157/terms

    Time period covered
    Sep 2012 - Jan 2013
    Area covered
    United States
    Description

    This study is part of the American National Election Study (ANES), a time-series collection of national surveys fielded continuously since 1948. The American National Election Studies are designed to present data on Americans' social backgrounds, enduring political predispositions, social and political values, perceptions and evaluations of groups and candidates, opinions on questions of public policy, and participation in political life. As with all Time Series studies conducted during years of presidential elections, respondents were interviewed during the two months preceding the November election (Pre-election interview), and then re-interviewed during the two months following the election (Post-election interview). Like its predecessors, the 2012 ANES was divided between questions necessary for tracking long-term trends and questions necessary to understand the particular political moment of 2012. The study maintains and extends the ANES time-series 'core' by collecting data on Americans' basic political beliefs, allegiances, and behaviors, which are so critical to a general understanding of politics that they are monitored at every election, no matter the nature of the specific campaign or the broader setting. For the first time in the ANES Time Series history, face-to-face interviewing was supplemented in 2012 with data collection on the Internet. Data collection was conducted in the two modes independently, using separate samples. While face-to-face (FTF) respondents were administered the single pre-election interview and single post-election interview traditional to Time Series presidential-election-year studies, for the internet sample the same questions were administered over a total of four shorter online interviews, two pre-election and two post-election. Web-administered cases constituted a representative sample separate from the face-to-face sample and were drawn from panel members of GfK Knowledge Networks. 
The face-to-face (FTF) sample of fresh cross-section cases featured oversamples of African-Americans and Hispanics. For the first time in the ANES Time Series, FTF respondents were administered CAPI interviews programmed as instruments on handheld tablets, which were employed by interviewers using touchscreen, stylus, attached keyboard or any combination of entry modes according to interviewer preference. In both the pre-election and post-election FTF interviews a special CASI (Computer Assisted Self-Interviewing) segment was conducted. In addition to content on electoral participation, voting behavior, and public opinion, the 2012 ANES Time Series Study contains questions about areas such as media exposure, cognitive style, and values and predispositions. Several items were measured on the ANES for the first time, including "Big Five" personality traits using the Ten Item Personality Inventory (TIPI), skin tone observations made by interviewers in the face-to-face study, and a vocabulary test from the General Social Survey called "Wordsum." The Post-Election interview also included Module 4 from the Comparative Study of Electoral Systems (CSES). Demographic variables include respondent age, education level, political affiliation, race/ethnicity, marital status, and family composition.

  16. Data from: FNETS: Factor-Adjusted Network Estimation and Forecasting for...

    • tandf.figshare.com
    rtf
    Updated Oct 4, 2023
    Cite
    Matteo Barigozzi; Haeran Cho; Dom Owens (2023). FNETS: Factor-Adjusted Network Estimation and Forecasting for High-Dimensional Time Series [Dataset]. http://doi.org/10.6084/m9.figshare.24138670.v2
    Explore at:
    Available download formats: rtf
    Dataset updated
    Oct 4, 2023
    Dataset provided by
    Taylor & Francis
    Authors
    Matteo Barigozzi; Haeran Cho; Dom Owens
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    We propose FNETS, a methodology for network estimation and forecasting of high-dimensional time series exhibiting strong serial- and cross-sectional correlations. We operate under a factor-adjusted vector autoregressive (VAR) model which, after accounting for pervasive co-movements of the variables by common factors, models the remaining idiosyncratic dynamic dependence between the variables as a sparse VAR process. Network estimation of FNETS consists of three steps: (i) factor-adjustment via dynamic principal component analysis, (ii) estimation of the latent VAR process via l1-regularized Yule-Walker estimator, and (iii) estimation of partial correlation and long-run partial correlation matrices. In doing so, we learn three networks underpinning the VAR process, namely a directed network representing the Granger causal linkages between the variables, an undirected one embedding their contemporaneous relationships and finally, an undirected network that summarizes both lead-lag and contemporaneous linkages. In addition, FNETS provides a suite of methods for forecasting the factor-driven and the idiosyncratic VAR processes. Under general conditions permitting tails heavier than the Gaussian one, we derive uniform consistency rates for the estimators in both network estimation and forecasting, which hold as the dimension of the panel and the sample size diverge. Simulation studies and real data application confirm the good performance of FNETS.

  17. Data from: A new panel dataset for cross-country analyses of national...

    • explore.openaire.eu
    Updated Jan 1, 2011
    Cite
    Fulvio Castellacci; Jose Miguel Natera (2011). A new panel dataset for cross-country analyses of national systems, growth and development (CANA) [Dataset]. https://explore.openaire.eu/search/other?orpId=od_1201::b46bcf887628f58767c9a2d4079db514
    Explore at:
    Dataset updated
    Jan 1, 2011
    Authors
    Fulvio Castellacci; Jose Miguel Natera
    Description

    Missing data represent an important limitation for cross-country analyses of national systems, growth and development. This paper presents a new cross-country panel dataset with no missing values. We make use of a new method of multiple imputation that has recently been developed by Honaker and King (2010) to deal specifically with time-series cross-section data at the country level. We apply this method to construct a large dataset containing a great number of indicators measuring six key country-specific dimensions: innovation and technological capabilities, education system and human capital, infrastructures, economic competitiveness, political-institutional factors, and social capital. The CANA panel dataset thus obtained provides a rich and complete set of 41 indicators for 134 countries in the period 1980-2008 (for a total of 3886 country-year observations). The empirical analysis shows the reliability of the dataset and its usefulness for cross-country analyses of national systems, growth and development. The new dataset is publicly available.
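
    The country-year total quoted above can be checked with a few lines of arithmetic. This is only a sanity check, assuming the panel is balanced (every country observed in every year) and that the period 1980-2008 is inclusive of both end years; the variable names are illustrative.

```python
# Figures taken from the description: 134 countries, period 1980-2008.
first_year, last_year = 1980, 2008
countries = 134

# Inclusive year count, then country-year observations for a balanced panel.
years = last_year - first_year + 1
observations = countries * years

print(years, observations)
```

    Under those assumptions the product matches the 3886 country-year observations stated in the description.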

  18. The economic and demographic transition, mortality, and comparative...

    • b2find.eudat.eu
    Updated May 1, 2023
    Cite
    (2023). The economic and demographic transition, mortality, and comparative development 2010-2015 - Dataset - B2FIND [Dataset]. https://b2find.eudat.eu/dataset/4ce322e1-8a7f-5045-a5ff-82a005a9bdc8
    Explore at:
    Dataset updated
    May 1, 2023
    Description

    The authors propose a unified growth theory to explain demographic empirical regularities. They calibrate the model to match data moments for Sweden in 2000 and around 1800. The simulated data generated by the calibrated model are then compared to the historical time series for Sweden over the period 1750-2000 in order to investigate the fit of long-term development dynamics, as well as to cross-country panel data for the period 1960-2000 to analyze the relevance for cross-sectional patterns of comparative development. For the calibration, data was used from the OECD webpage, ERS Dataset, historical statistics from the Bank of Sweden, micro data from the ECHP dataset, Data from the Human Mortality Data Base, UN Population Statistics, or data from existing papers. For the time-series and cross section analysis, data was taken from the Human Mortality Database, World Development Indicators, Swedish Central Statistical Office, UN Population Statistics and existing literature.

  19. General Social Survey 2010 Cross-Section and Panel Combined, (Inapplicable...

    • thearda.com
    Updated 2010
    + more versions
    Cite
    The Association of Religion Data Archives (2010). General Social Survey 2010 Cross-Section and Panel Combined, (Inapplicable Responses Coded as Missing) [Dataset]. http://doi.org/10.17605/OSF.IO/AT5WV
    Explore at:
    Dataset updated
    2010
    Dataset provided by
    Association of Religion Data Archives
    Dataset funded by
    National Science Foundation
    Description

    This file differs from the General Social Survey 2010 in that all inapplicable values are set to system missing. The General Social Surveys (GSS) have been conducted by the National Opinion Research Center (NORC; https://www.norc.org/Pages/default.aspx) annually since 1972, except for the years 1979, 1981, and 1992 (a supplement was added in 1992), and biennially beginning in 1994. The GSS are designed to be part of a program of social indicator research, replicating questionnaire items and wording in order to facilitate time-trend studies. This data file has all cases and variables asked on the 2010 GSS. There are a total of 4,901 cases in the data set but their initial sampling years vary because the GSS now contains panel cases. Sampling years can be identified with the variable SAMPTYPE.

    The 2010 GSS featured special modules on aging, the Internet, shared capitalism, gender roles, intergroup relations, immigration, meeting spouse, knowledge about and attitudes toward science, religious identity, religious trends, genetics, veterans, crime and victimization, social networks and group membership, and sexual behavior (continuing the series started in 1988).

    The GSS has switched from a repeating, cross-section design to a combined repeating cross-section and panel-component design. The 2006 GSS was the base year for the first panel. A sub-sample of 2,000 GSS cases from 2006 was selected for reinterview in 2008 and again in 2010 as part of the GSSs in those years. The 2008 GSS consists of a new cross-section plus the reinterviews from 2006. The 2010 GSS consists of a new cross-section of 2,044, the first reinterview wave of the 2,023 2008 panel cases with 1,581 completed cases, and the second and final reinterview of the 2006 panel with 1,276 completed cases. Altogether, the 2010 GSS had 4,901 cases (2,044 in the new 2010 panel, 1,581 in the 2008 panel, and 1,276 in the 2006 panel). The 2010 GSS is the first round to fully implement the new, rolling panel design. In 2012 and later GSSs, there will likewise be a fresh cross-section (wave one of a new panel), wave two panel cases from the immediately preceding GSS, and wave three panel cases from the next earlier GSS.

    To download syntax files for the GSS that reproduce well-known religious group recodes, including RELTRAD, please visit ARDA's Syntax Repository (/research/syntax-repository-list).

  20. Dataset of companies’ profitability, government debt, Financial Statements'...

    • search.dataone.org
    Updated Nov 8, 2023
    Cite
    Mgammal, Mahfoudh; Al-Matari, Ebrahim (2023). Dataset of companies’ profitability, government debt, Financial Statements' Key Indicators and earnings in an emerging market: Developing a panel and time series database of value-added tax rate increase impacts [Dataset]. http://doi.org/10.7910/DVN/HEL3YG
    Explore at:
    Dataset updated
    Nov 8, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Mgammal, Mahfoudh; Al-Matari, Ebrahim
    Description

    The dataset included with this article contains three files describing and defining the sample and variables for VAT impact. Excel file 1 consists of all raw and filtered data for the variables in the panel data sample. Excel file 2 contains time-series and cross-sectional data for nonfinancial firms listed on the Saudi market for the second and third quarters of 2019 and the third and fourth quarters of 2020. Excel file 3 presents the raw data for the variables used to measure company profitability in the panel data sample.

