CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Multifactor error structures utilize factor analysis to deal with complex cross-sectional dependence in Time-Series Cross-Sectional data caused by cross-level interactions. The multifactor error structure specification is a generalization of the fixed-effects model. This paper extends the existing multifactor error models from panel econometrics to multilevel modeling, from linear setups to generalized linear models with the probit and logistic links, and from assuming serial independence to modeling the error dynamics with an autoregressive process. I develop Markov Chain Monte Carlo algorithms mixed with a rejection sampling scheme to estimate the multilevel multifactor error structure model with a p-th order autoregressive process in linear, probit, and logistic specifications. I conduct several Monte Carlo studies to compare the performance of alternative specifications and approaches with varying degrees of data complication and different sample sizes. The Monte Carlo studies provide guidance on when and how to apply the proposed model. An empirical application to sovereign default demonstrates how the proposed approach can accommodate a complex pattern of cross-sectional dependence and helps answer research questions related to units' sensitivity or vulnerability to systemic shocks.
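A minimal sketch of the data-generating process the model targets may help fix ideas: a linear TSCS outcome whose errors combine unit-specific loadings on common factors with an AR(2) idiosyncratic term. All dimensions, parameter values, and names below are illustrative assumptions, not the paper's estimation code.

import numpy as np

rng = np.random.default_rng(0)
N, T, R = 30, 40, 2                      # units, periods, number of factors (assumed)
beta = 1.5                               # regression slope (assumed)
phi = np.array([0.5, 0.2])               # AR(2) coefficients of the error (assumed)

f = rng.normal(size=(T, R))              # common factors f_t
lam = rng.normal(size=(N, R))            # unit-specific loadings lambda_i
x = rng.normal(size=(N, T))              # a single covariate

e = np.zeros((N, T))                     # AR(2) idiosyncratic errors
for t in range(T):
    for p, coef in enumerate(phi, start=1):
        if t - p >= 0:
            e[:, t] += coef * e[:, t - p]
    e[:, t] += rng.normal(scale=0.5, size=N)

y = beta * x + lam @ f.T + e             # lam @ f.T induces cross-sectional dependence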
Companion files for: Jessica Fortin-Rittberger. 2014. "Time-Series Cross-Section." In Henning Best and Christof Wolf (eds.), The SAGE Handbook of Regression Analysis and Causal Inference. Sage Publishers. DOI: http://dx.doi.org/10.4135/9781446288146.n17. Includes the data file (Norris, P. (2009). Democracy timeseries data release 3.0. http://www.pippanorris.com/) and a Stata do file.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
When estimating hedonic models of housing prices, the use of time series cross-section repeat sales data can provide improvements in estimator efficiency and correct for unobserved characteristics. However, in cases where serial correlation is present, the irregular timing of sales should also be considered. In this paper we develop a model that uses information on the timing of sales to account for their sporadic occurrence. The model presumes that the serial correlation process can be decomposed into a time-independent (event-wise) component and a time-dependent (time-wise) component. Empirical tests cannot reject the presence of sporadic correlation patterns, while simulations show that the failure to account for sporadic correlation leads to significant losses in efficiency, and that the losses from ignoring sporadic correlation when it exists are larger than the losses when sporadic correlation is falsely assumed.
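One way to picture the decomposition is as a covariance in which the correlation between two sales' errors is the product of an event-wise component and a time-wise component that decays with the gap between sale dates. This is a minimal sketch under assumed parameter names and functional form, not the paper's estimator.

import numpy as np

def repeat_sale_cov(sale_times, sigma2=1.0, rho_event=0.6, rho_time=0.9):
    # Correlation = event-wise part (time-independent) x time-wise part (decays with gap)
    t = np.asarray(sale_times, dtype=float)
    gaps = np.abs(t[:, None] - t[None, :])          # years between sales
    corr = rho_event * rho_time ** gaps
    np.fill_diagonal(corr, 1.0)
    return sigma2 * corr

print(repeat_sale_cov([0.0, 0.5, 4.0]))             # closely spaced sales correlate more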
https://dataverse.harvard.edu/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.7910/DVN/ZTDHVE
Matching methods improve the validity of causal inference by reducing model dependence and offering intuitive diagnostics. While they have become a part of the standard tool kit across disciplines, matching methods are rarely used when analyzing time-series cross-sectional data. We fill this methodological gap. In the proposed approach, we first match each treated observation with control observations from other units in the same time period that have an identical treatment history up to the pre-specified number of lags. We use standard matching and weighting methods to further refine this matched set so that the treated and matched control observations have similar covariate values. Assessing the quality of matches is done by examining covariate balance. Finally, we estimate both short-term and long-term average treatment effects using the difference-in-differences estimator, accounting for a time trend. We illustrate the proposed methodology through simulation and empirical studies. An open-source software package is available for implementing the proposed methods.
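A minimal sketch of the first step (matching on treatment history) is below, assuming a long-format data frame with columns unit, time, and a binary treatment D; the refinement, balance checking, and difference-in-differences steps are omitted, and the authors' own package should be preferred in practice.

import pandas as pd

def matched_sets(df, L=3, unit="unit", time="time", treat="D"):
    # For each unit newly treated at time t, collect control units whose
    # treatment history over the previous L periods is identical.
    wide = df.pivot(index=unit, columns=time, values=treat)
    sets = {}
    for i in wide.index:
        for t in wide.columns:
            hist = wide.loc[i, t - L:t]                    # history incl. period t
            if len(hist) < L + 1 or hist.isna().any():
                continue
            if not (hist.iloc[-1] == 1 and hist.iloc[-2] == 0):
                continue                                    # unit i not newly treated at t
            same = wide.loc[:, t - L:t - 1].eq(hist.iloc[:-1].values).all(axis=1)
            controls = wide.index[same & (wide[t] == 0) & (wide.index != i)]
            sets[(i, t)] = list(controls)
    return sets

demo = pd.DataFrame({"unit": [1, 1, 1, 2, 2, 2, 3, 3, 3],
                     "time": [1, 2, 3] * 3,
                     "D":    [0, 0, 1, 0, 0, 0, 0, 1, 1]})
print(matched_sets(demo, L=1))   # {(1, 3): [2], (3, 2): [1, 2]}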
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Regression results for baseline model and alternative specifications.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
(C, V, S, K) index time series data generated using 0.5-Hz GEMS time series data from Taiwan. In this data set, C is a modified autocorrelation function, V is the variance, S the skewness, and K the kurtosis of the GEMS geo-electric field time series.
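As a rough illustration, the four indices can be computed over fixed windows of a 0.5-Hz series as follows; since the data set's C is described as a modified autocorrelation function whose exact form is not given here, the plain lag-1 autocorrelation below is a stand-in assumption.

import numpy as np
from scipy.stats import skew, kurtosis

def cvsk(series, window=1800):                      # 1800 samples = 1 hour at 0.5 Hz
    out = []
    for start in range(0, len(series) - window + 1, window):
        w = series[start:start + window]
        c = np.corrcoef(w[:-1], w[1:])[0, 1]        # lag-1 autocorrelation (stand-in for C)
        out.append((c, np.var(w), skew(w), kurtosis(w)))   # (C, V, S, K) per window
    return np.array(out)

x = np.random.default_rng(1).normal(size=36000)     # 20 h of synthetic 0.5-Hz data
print(cvsk(x)[:3])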
The QoG Institute is an independent research institute within the Department of Political Science at the University of Gothenburg. The main objective of our research is to address the theoretical and empirical problem of how political institutions of high quality can be created and maintained.
To achieve this goal, the QoG Institute makes comparative data on QoG and its correlates publicly available. We have compiled several datasets that draw on a number of freely available data sources, including aggregated individual-level data.
The QoG OECD Datasets focus exclusively on OECD member countries and have high data coverage in terms of geography and time. In the QoG OECD Time-Series (TS) dataset, data from 1946 to 2021 are included and the unit of analysis is country-year (e.g., Sweden-1946, Sweden-1947, etc.).
In the QoG OECD Cross-Section (CS) dataset, data from and around 2018 are included. Data from 2018 are prioritized; however, if no data are available for a country for 2018, data for 2019 are included. If no data exist for 2019, data for 2017 are included, and so on up to a maximum of +/- 3 years.
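The year-selection rule described above amounts to searching outward from the target year, preferring later over earlier years at each distance, out to +/- 3 years. A small sketch (the later-before-earlier tie-break follows the 2019-before-2017 example in the text):

def fill_value(values_by_year, target=2018, max_offset=3):
    # Search order for target 2018: 2018, 2019, 2017, 2020, 2016, 2021, 2015
    for offset in range(max_offset + 1):
        candidates = [target] if offset == 0 else [target + offset, target - offset]
        for year in candidates:
            if values_by_year.get(year) is not None:
                return year, values_by_year[year]
    return None, None

print(fill_value({2017: 4.2, 2020: 4.5}))           # -> (2017, 4.2)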
https://dataverse.harvard.edu/api/datasets/:persistentId/versions/5.1/customlicense?persistentId=doi:10.7910/DVN/GGUR0P
Applications of modern methods for analyzing data with missing values, based primarily on multiple imputation, have in the last half-decade become common in American politics and political behavior. Scholars in these fields have thus increasingly avoided the biases and inefficiencies caused by ad hoc methods like listwise deletion and best guess imputation. However, researchers in much of comparative politics and international relations, and others with similar data, have been unable to do the same because the best available imputation methods work poorly with the time-series cross-section data structures common in these fields. We attempt to rectify this situation. First, we build a multiple imputation model that allows smooth time trends, shifts across cross-sectional units, and correlations over time and space, resulting in far more accurate imputations. Second, we build nonignorable missingness models by enabling analysts to incorporate knowledge from area studies experts via priors on individual missing cell values, rather than on difficult-to-interpret model parameters. Third, since these tasks could not be accomplished within existing imputation algorithms, in that they cannot handle as many variables as needed even in the simpler cross-sectional data for which they were designed, we also develop a new algorithm that substantially expands the range of computationally feasible data types and sizes for which multiple imputation can be used. These developments also made it possible to implement the methods introduced here in freely available open source software, Amelia II: A Program for Missing Data, that is considerably more reliable than existing strategies. See also: Missing Data
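To make the first idea concrete, here is a heavily simplified sketch of imputation with smooth time trends and cross-sectional shifts, using a generic iterative imputer with stochastic draws in place of Amelia's EM-with-bootstrapping algorithm; all column names are assumptions and this is not the Amelia implementation.

import numpy as np
import pandas as pd
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

def impute_tscs(df, vars_, unit="country", time="year", m=5, degree=2):
    X = pd.get_dummies(df[unit], prefix=unit).astype(float)  # cross-sectional shifts
    for d in range(1, degree + 1):                            # smooth polynomial time trends
        X[f"t{d}"] = (df[time] - df[time].mean()) ** d
    X = pd.concat([df[vars_].reset_index(drop=True), X.reset_index(drop=True)], axis=1)
    draws = []
    for seed in range(m):                                     # m stochastic imputations
        imp = IterativeImputer(sample_posterior=True, random_state=seed)
        completed = pd.DataFrame(imp.fit_transform(X), columns=X.columns)
        draws.append(completed[vars_])
    return draws

rng = np.random.default_rng(6)
demo = pd.DataFrame({"country": np.repeat(list("ABC"), 20),
                     "year": np.tile(np.arange(1990, 2010), 3),
                     "gdp": rng.normal(size=60),
                     "polity": rng.normal(size=60)})
demo.loc[rng.random(60) < 0.2, "gdp"] = np.nan                # inject missingness
imputations = impute_tscs(demo, ["gdp", "polity"])            # list of m completed datasets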
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is a data set related to a bedload tracer field study in an alpine section of the Drava River between 11 May 2017 and 11 June 2018. A time series of bed shear stress is provided for the seeding site of the tracers in the time span of the entire tracer study. The shear stress was calculated from water depths that were modelled with a one-dimensional hydrodynamic-numerical model and based on a channel slope obtained from the analysis of cross-sections. The shear stress can also be calculated for cross-sections downstream of the seeding location by using the functions available in the corresponding publication.
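The release does not spell out the formula, but the conventional way to obtain bed shear stress from a modelled depth and a channel slope is the depth-slope product, sketched here under that assumption:

import numpy as np

RHO, G = 1000.0, 9.81                               # water density (kg/m3), gravity (m/s2)

def bed_shear_stress(depth_m, slope):
    # Depth-slope product: tau = rho * g * h * S, returned in Pa
    return RHO * G * np.asarray(depth_m) * slope

print(bed_shear_stress([0.8, 1.2, 2.0], slope=0.004))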
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
R scripts containing statistical data analyses for streamflow and sediment data, including Flow Duration Curves, Double Mass Analysis, Nonlinear Regression Analysis for Suspended Sediment Rating Curves, and Stationarity Tests, as well as several plots.
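For orientation, a flow duration curve is simply each discharge plotted against its exceedance probability. A minimal sketch (the release's own R scripts will differ in detail):

import numpy as np

def flow_duration_curve(q):
    # Rank flows from largest to smallest; Weibull plotting position gives P(Q >= q_i)
    q = np.sort(np.asarray(q))[::-1]
    exceedance = np.arange(1, len(q) + 1) / (len(q) + 1)
    return exceedance, q

p, q = flow_duration_curve(np.random.default_rng(2).lognormal(2.0, 0.8, 3650))
print(p[:3], q[:3])                                 # rare, high flows at low exceedance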
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Researchers typically analyze time-series-cross-section data with a binary dependent variable (BTSCS) using ordinary logit or probit. However, BTSCS observations are likely to violate the independence assumption of the ordinary logit or probit statistical model. It is well known that if the observations are temporally related, the results of an ordinary logit or probit analysis may be misleading. In this paper, we provide a simple diagnostic for temporal dependence and a simple remedy. Our remedy is based on the idea that BTSCS data are identical to grouped duration data. This remedy does not require the BTSCS analyst to acquire any further methodological skills, and it can be easily implemented in any standard statistical software package. While our approach is suitable for any type of BTSCS data, we provide examples and applications from the field of International Relations, where BTSCS data are frequently used. We use our methodology to re-assess Oneal and Russett's (1997) findings regarding the relationship between economic interdependence, democracy, and peace. Our analyses show that 1) their finding that economic interdependence is associated with peace is an artifact of their failure to account for temporal dependence, and 2) their finding that democracy inhibits conflict is upheld even after taking duration dependence into account.
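The remedy can be illustrated as follows: add a counter of periods since the last event to an otherwise ordinary logit. The paper enters this duration flexibly (via temporal splines); the cubic polynomial below is a common simpler stand-in, and all column names and the synthetic data are assumptions.

import numpy as np
import pandas as pd
import statsmodels.api as sm

def peace_years(events):
    # Periods elapsed since the last event, reset to zero after each event
    t, out = 0, []
    for e in events:
        out.append(t)
        t = 0 if e == 1 else t + 1
    return out

def btscs_logit(df, y="dispute", xvars=("trade", "democracy"),
                unit="dyad", time="year"):
    df = df.sort_values([unit, time]).copy()
    df["t1"] = df.groupby(unit)[y].transform(lambda s: peace_years(s.tolist()))
    df["t2"], df["t3"] = df["t1"] ** 2, df["t1"] ** 3   # cubic duration terms
    X = sm.add_constant(df[list(xvars) + ["t1", "t2", "t3"]])
    return sm.Logit(df[y], X).fit(disp=0)

rng = np.random.default_rng(5)
demo = pd.DataFrame({"dyad": np.repeat(np.arange(50), 40),
                     "year": np.tile(np.arange(1950, 1990), 50),
                     "trade": rng.normal(size=2000),
                     "democracy": rng.normal(size=2000),
                     "dispute": (rng.random(2000) < 0.08).astype(int)})
print(btscs_logit(demo).params)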
This data release includes cross section survey data collected during site visits to USGS gaging stations located throughout the Willamette and Delaware River Basins and multispectral images of these locations acquired as close in time as possible to the date of each site visit. In addition, MATLAB source code developed for the Bathymetric Mapping using Gage Records and Image Databases (BaMGRID) framework is also provided.

The site visit data were obtained from the Aquarius Time Series database, part of the USGS National Water Information System (NWIS), using the Publish Application Programming Interface (API). More specifically, a custom MATLAB function was used to query the FieldVisitDataByLocationServiceRequest endpoint of the Aquarius API by specifying the gaging station ID number and the date range of interest and then retrieve the QRev XML attachments associated with site visits meeting these criteria. These XML files were then parsed using another custom MATLAB function that served to extract the cross section survey data collected during the site visit. Note that because many of the site visits involved surveying cross sections using instrumentation that was not GPS-enabled, latitude and longitude coordinates were not available and no data values (NaN) are used in the site visit files provided in this data release.

Remotely sensed data acquired as close as possible to the date of each site visit were also retrieved via APIs. Multispectral satellite images from the PlanetScope constellation were obtained using custom MATLAB functions developed to interact with the Planet Orders API, which provided tools for clipping the images to a specified area of interest focused on the gaging station and harmonizing the pixel values to be consistent across the different satellites within the PlanetScope constellation. The data product retrieved was the PlanetScope orthorectified 8-band surface reflectance bundle. PlanetScope images are acquired with high frequency, often multiple times per day at a given location, and so the search was restricted to a time window spanning from three days prior to three days after the site visit. All images meeting these criteria were downloaded and manually inspected; the highest quality image closest in time to the site visit date was retained for further analysis.

For the gaging stations within the Willamette River Basin, digital aerial photography acquired through the National Agricultural Imagery Program (NAIP) in 2022 was obtained using a similar set of MATLAB functions developed to access the USGS EarthExplorer Machine-to-Machine (M2M) API. The NAIP quarter-quadrangle image encompassing each gaging station was downloaded and then clipped to a smaller area centered on the gaging station. Only one NAIP image at each gaging station was acquired in 2022, so differences in streamflow between the image acquisition date and the date of the site visit closest in time were accounted for by performing separate NWIS web queries to retrieve the stage and discharge recorded at the gaging station on the date the image was acquired and on the date of the site visit.

These data sets were used as an example application of the framework for Bathymetric Mapping using Gage Records and Image Databases (BaMGRID) and this data release also provides MATLAB source code developed to implement this approach.
The code is packaged in a zip archive that includes the following individual .m files:
1) getSiteVisit.m, for retrieving data collected during site visits to USGS gaging stations through the Aquarius API;
2) Qrev2depth.m, for parsing the XML file from the site visit and extracting depth measurements surveyed along a channel cross section during a direct discharge measurement;
3) orderPlanet.m, for searching for and ordering PlanetScope images via the Planet Orders API;
4) pollThenGrabPlanet.m, for querying the status of an order and then downloading PlanetScope images requested through the Planet Orders API;
5) organizePlanet.m, for file management and cleanup of the original PlanetScope image data obtained via the previous two functions;
6) ingestNaip.m, for searching for, ordering, and downloading NAIP data via the USGS Machine-to-Machine (M2M) API;
7) naipExtractClip.m, for clipping the downloaded NAIP images to the specified area of interest and performing file management and cleanup; and
8) crossValObra.m, for performing spectrally based depth retrieval via the Optimal Band Ratio Analysis (OBRA) algorithm using a k-fold cross-validation approach intended for small sample sizes.

The files provided through this data release include:
1) A zipped shapefile with polygons delineating the Willamette and Delaware River basins;
2) .csv text files with information on site visits within each basin during 2022;
3) .csv text files with information on PlanetScope images of each gaging station close in time to the date of each site visit that can be used to obtain the image data through the Planet Orders API or Planet Explorer web interface;
4) A .csv text file with information on NAIP images of each gaging station in the Willamette River Basin as close in time as possible to the date of each site visit, along with the stage and discharge recorded at the gaging station on the date of image acquisition and the date of the site visit;
5) A zip archive of the clipped NAIP images of each gaging station in the Willamette River Basin in GeoTIFF format; and
6) A zip archive with source code (MATLAB *.m files) developed to implement the Bathymetric Mapping using Gage Records and Image Databases (BaMGRID) framework.
In the ANES Time Series Cumulative Data File, the project staff have merged into a single file all cross-section cases and variables for select questions from the ANES Time Series studies conducted since 1948. Questions that have been asked in three or more Time Series studies are eligible for inclusion, with variables recoded as necessary for comparability across years.
The data track political attitudes and behaviors across the decades, including attitudes about religion. This dataset is unique given its size and comprehensive assessment of politics and religion over time. For information about the structure of the cumulative file, please see the notes listed on this page.
Abstract copyright UK Data Service and data collection copyright owner.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Using sequential trend break and panel data models, we investigate the unit root hypothesis for the inflation rates of thirteen OECD countries. With individual country tests, we find evidence of stationarity in only four of the thirteen countries. The results are more striking with the panel data model. We can strongly reject the unit root hypothesis both for a panel of all thirteen countries and for a number of smaller panels consisting of as few as three countries. The non-rejection of the unit root hypothesis for inflation is very fragile to even a small amount of cross-section variation.
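To see why single-series tests struggle, consider the individual-country step: an augmented Dickey-Fuller test of the unit root null on each inflation series. The sketch below uses synthetic random walks and only illustrates the mechanics, not the paper's sequential trend-break or panel procedures.

import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(3)
inflation = {c: np.cumsum(rng.normal(size=200)) * 0.1 + 2.0
             for c in ["US", "UK", "JP"]}           # random walks: unit root holds

for country, series in inflation.items():
    stat, pval, *_ = adfuller(series, regression="c")   # ADF test with constant
    print(f"{country}: ADF = {stat:.2f}, p = {pval:.3f}")  # expect non-rejection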
Liberalization is a perennial topic in politics and political science. We first review a broad scholarly debate, showing that the mainstream theories make rival and contradictory claims regarding the role of political parties in (de)liberalization reforms. We then develop a framework of conditional partisan influence, arguing that, and under what conditions, parties matter. We test our (and rival) propositions with a new dataset on (de)liberalization reforms in 23 democracies since 1973 covering several policy areas. Methodologically, we argue that existing quantitative studies are problematic: they rely on time-series cross-section models using country-year observations; but governments do not change annually, so the number of observations is artificially inflated, resulting in incorrect estimates. We propose mixed-effects models instead, with country-year observations nested in cabinets, which are nested in countries and years. The results show under what conditions parties matter for (de)liberalization. More generally, the paper argues that mixed-effects models should become the new standard for studying partisan influences.
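A minimal sketch of the proposed specification: country-year rows with a random intercept for country and a nested variance component for cabinet, so observations sharing a government are not treated as independent. Variable names and the synthetic data are assumptions.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(4)
rows = []
for country in range(10):
    cab = 0
    for year in range(1973, 2003):
        if rng.random() < 0.25:
            cab += 1                          # a new cabinet roughly every four years
        rows.append(dict(country=country, year=year, cabinet=f"{country}-{cab}",
                         left_seat_share=rng.uniform(0, 60),
                         reform=rng.normal()))
df = pd.DataFrame(rows)

# Random intercept for country; cabinets enter as a variance component nested
# within country, so country-years from the same government share an error term.
model = smf.mixedlm("reform ~ left_seat_share", df, groups="country",
                    vc_formula={"cabinet": "0 + C(cabinet)"})
print(model.fit().summary())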
The QoG Institute is an independent research institute within the Department of Political Science at the University of Gothenburg. In total, 30 researchers conduct and promote research on the causes, consequences and nature of Good Governance and the Quality of Government - that is, trustworthy, reliable, impartial, uncorrupted and competent government institutions.
The main objective of our research is to address the theoretical and empirical problem of how political institutions of high quality can be created and maintained. A second objective is to study the effects of Quality of Government on a number of policy areas, such as health, the environment, social policy, and poverty.
The QoG Standard Dataset is our largest dataset, consisting of more than 2,000 variables from sources related to the Quality of Government. The data exist in both time-series (1946 and onwards) and cross-section (2020) versions. Many of the variables are available in both datasets, but some are not. The datasets draw on a number of freely available data sources related to QoG and its correlates.
In the QoG Standard CS dataset, data from and around 2020 is included. Data from 2020 is prioritized; however, if no data is available for a country for 2020, data for 2021 is included. If no data exists for 2021, data for 2019 is included, and so on up to a maximum of +/- 3 years.
In the QoG Standard TS dataset, data from 1946 and onwards is included and the unit of analysis is country-year (e.g., Sweden-1946, Sweden-1947, etc.).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Correlation matrix for different ratio measures.
https://www.icpsr.umich.edu/web/ICPSR/studies/7215/terms
This study is part of a time-series collection of national surveys fielded continuously since 1948. The election studies are designed to present data on Americans' social backgrounds, enduring political predispositions, social and political values, perceptions and evaluations of groups and candidates, opinions on questions of public policy, and participation in political life. The 1958 study may be analyzed both on its own, as a cross-section survey representative of the U.S. population of voting age, and as the second wave of a panel study that started with the ANES 1956 Time Series Study (ICPSR 7214) and ended with the ANES 1960 Time Series Study (ICPSR 7216). Each respondent was interviewed only once, after the election. Respondents who had not been interviewed in 1956 were selected from dwelling units vacated by 1956 respondents (movers). The questionnaires contained both closed and open-ended questions covering a wide range of topics. In addition to general political attitudes, the study obtained information about the more specific attitudes and behaviors pertinent to the 1958 Congressional Election, like the respondents' actual vote and reasons for the vote, attitudes toward political parties and candidates, and the respondents' political history. Data were also collected on specific domestic and foreign policy issues such as government involvement in housing and public utilities, and United States aid to anti-Communist nations. The study also ascertained the financial situation of the family unit and other demographic information.
The U.S. Geological Survey (USGS), in cooperation with the Texas Department of Transportation (TxDOT), deployed RQ-30 surface velocimetry sensors (hereinafter referred to as "RQ-30 sensors") made by Sommer Messtechnik to collect radar gage-height data, cross section area, surface velocity, learned surface velocity, discharge, and learned discharge at 80 streamgages located in stream reaches with varying hydrologic and hydraulic characteristics. Land-use types in the contributing drainage basins included agricultural, forest, mixed, and coastal types common in central, east, and southeast Texas. Many of the drainage basins and streams have relatively low gradients. To test the efficacy of the remote-sensing methods, the RQ-30 sensors were deployed for 1 to 3 years to capture and compute data over a range of hydraulic conditions. Continuous time series of radar-measured gage-height and surface velocity and radar-derived cross-sectional area, learned surface velocity, discharge, and learned discharge were recorded at 5-minute intervals. Discharge data were computed by using radar-derived cross-sectional area and surface velocity data, whereas learned discharge was computed by using radar-derived cross-sectional area and learned surface velocity. The two types of discharge data obtained with the RQ-30 sensors were compared to discharge data computed by using the standard USGS stage-discharge methods. For each of the 80 streamgages, information regarding the USGS site number, station name, location, datum, installation date, and data start date can be found in the file named "Station Metadata.csv" included with this release.
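The discharge computation described above can be sketched as the product of radar-derived cross-sectional area and a mean velocity obtained by scaling the radar surface velocity; the 0.85 surface-to-mean coefficient below is a commonly cited default, not a value taken from this release.

import numpy as np

def radar_discharge(area_m2, surface_velocity_ms, k=0.85):
    # Q = k * v_surface * A; k converts surface velocity to mean velocity (assumed)
    return k * np.asarray(surface_velocity_ms) * np.asarray(area_m2)  # m3/s

print(radar_discharge(area_m2=[12.0, 15.5], surface_velocity_ms=[0.6, 0.9]))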