This study contains script files to create teaching versions of Understanding Society: Waves 1-3, the new UK household panel survey. Specifically, the user can focus on individual waves, or can create a panel survey dataset for use in teaching undergraduates and postgraduates. Core areas of focus are attitudes to voting and political parties, to the environment, and to ethnicity and migration. Script files are available for SPSS, Stata and R. Individuals wishing to make use of this resource will need to apply separately to the UK Data Archive for access to the original datasets: http://discover.ukdataservice.ac.uk/catalogue/?sn=6614&type=Data%20catalogue
Political scientists often find themselves analyzing datasets with a large number of observations, a large number of variables, or both. Yet, traditional statistical techniques fail to take full advantage of the opportunities inherent in "big data" as they are too rigid to recover nonlinearities and do not facilitate the easy exploration of interactions in high-dimensional datasets. In this paper, we introduce a family of tree-based nonparametric techniques that may, in some circumstances, be more appropriate than traditional methods for confronting these data challenges. In particular, tree models are very effective for detecting nonlinearities and interactions, even in datasets with many (potentially irrelevant) covariates. We introduce the basic logic of tree-based models, provide an overview of the most prominent methods in the literature, and conduct three analyses that illustrate how the methods can be implemented while highlighting both their advantages and limitations.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
These data include information on methodology, field, and empirical coverage of every article published in the American Political Science Review, American Journal of Political Science, Journal of Politics, World Politics, Comparative Political Studies, and Comparative Politics in five-year increments from 1965 to 2015, plus 2017.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The E-DEM dataset is a micro-level online panel survey of the Spanish voting age population comprising four waves carried out over a six-month period between late October 2018 and May 2019 (the detailed timing of each wave will be presented in Table 1). The survey waves coincide with key moments in Spanish political life (including local, regional, national, and European elections, as well as the conviction of Catalan secessionist leaders). It also covers the six-month period of the surge of Spain’s new radical right party, Vox, spanning from shortly before its first major electoral success in Spain’s most populous region, Andalusia, to its consolidation in the May 2019 European elections. In addition, the project comprises a series of survey experiments, embedded in the different waves, regarding measures of confidence in institutions and exposure to media and social networks, as well as of political behavior and polarization attitudes based on passive data, captured with software that the interviewees installed on their mobile devices. Three datasets are provided: the four-wave dataset, the experimental dataset, and the integrated dataset, which includes the former two. Each dataset is provided in three formats: tab-separated delimited text, tab-separated rawtext, and Stata 15.0 (.dta).
The Data for Undergraduate Political Science Courses datasets have been derived from three major public opinion studies: Eurobarometer 64.2: the European Constitution, Globalization, Energy Resources, and Agricultural Policy, October - November, 2005 (held at the UKDA under SN 5505); British Election Study, 2005 (BES) (held under SNs 5494-5496); and the British Social Attitudes Survey, 2005 (BSA) (held under SN 5618), for the purpose of teaching data analysis to undergraduates in political science. The datasets have been 'cleaned' in order to aid students using data for the first time. Some variables have been removed, many variable names have been changed to enable more substantive meaning to be taken from them, and new codebooks have been created for each of the three derived datasets.
Further information may be found on the Development of Undergraduate Curricula in Quantitative Methods project web site, and the ESRC award web page.
Each R script replicates all of the example code from one chapter from the book. All required data for each script are also uploaded, as are all data used in the practice problems at the end of each chapter. The data are drawn from a wide array of sources, so please cite the original work if you ever use any of these data sets for research purposes.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Stata dataset of 2,118 Lobbying Disclosure Act reports from 574 organizations active on the 2014 Farm Bill. Data originally collected and coded by the Center for Responsive Politics. Includes name of organization, dollar amount of reported expenses, number of lobbyists, lobbyists with previous government experience (revolving door), description of issue, sector and industry of organization, and topic codes (created by author). The Excel file includes lobbying scores for each organization in the data set (n=574). Scores are based on a Principal Components Analysis of resources devoted to lobbying. There are separate worksheets for the overall ranking and by topic. The associated Stata do-file includes commands to replicate the ranking of organizations (overall and by topic).
Dataset, code, and codebook. Visit https://dataone.org/datasets/sha256%3A3b1dcf51c8cad8d91760cb8eef8783d8f3f302a91c8f1e0142ae2af7061689b3 for complete metadata about this dataset.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Database on Party Membership Figures
Principal Investigator: Emilie van Haute
Funding: FNRS (http://www.frs-fnrs.be)
Coverage:
Period: 1946-2014
Countries: Australia, Austria, Belgium, Brazil, Canada, Croatia, Cyprus, Czech Republic, Denmark, Estonia, Finland, France, Germany, Hungary, Iceland, Ireland, Israel, Italy, Lithuania, Mexico, the Netherlands, Norway, Poland, Portugal, Romania, Slovakia, Slovenia, Spain, Sweden, Switzerland, the United Kingdom
Spatial Units: National level
Unit of analysis: Parties
Unit of observation: Number of party members (M) per party per year
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is the data used for the development of the Index Index model.
The QoG Institute is an independent research institute within the Department of Political Science at the University of Gothenburg. Overall 30 researchers conduct and promote research on the causes, consequences and nature of Good Governance and the Quality of Government - that is, trustworthy, reliable, impartial, uncorrupted and competent government institutions.
The main objective of our research is to address the theoretical and empirical problem of how political institutions of high quality can be created and maintained. A second objective is to study the effects of Quality of Government on a number of policy areas, such as health, the environment, social policy, and poverty.
The QoG Standard Dataset is the largest dataset, consisting of more than 2,000 variables from sources related to the Quality of Government. The data exist in both time-series (year 1946 and onwards) and cross-section (year 2020) versions. Many of the variables are available in both datasets, but some are not. The datasets draw on a number of freely available data sources related to QoG and its correlates.
In the QoG Standard CS dataset, data from and around 2020 is included. Data from 2020 is prioritized; however, if no data is available for a country for 2020, data for 2021 is included. If no data exists for 2021, data for 2019 is included, and so on up to a maximum of +/- 3 years.
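The year-selection rule described above can be sketched in a few lines of Python. This is an illustrative reading of the rule, not the QoG Institute's own code: the function names are hypothetical, and the ordering beyond 2019 (2022, then 2018, then 2023, then 2017) is an assumption extrapolated from the "and so on" in the description.

```python
from typing import List, Mapping, Optional, Tuple


def candidate_years(target: int = 2020, max_offset: int = 3) -> List[int]:
    """Years in priority order: target, target+1, target-1, target+2, ...

    Assumption: the later year is tried before the earlier year at every
    offset, mirroring the stated 2020 -> 2021 -> 2019 sequence.
    """
    years = [target]
    for offset in range(1, max_offset + 1):
        years.append(target + offset)
        years.append(target - offset)
    return years


def pick_cross_section_value(
    series: Mapping[int, Optional[float]],
    target: int = 2020,
    max_offset: int = 3,
) -> Optional[Tuple[int, float]]:
    """Return (year, value) for the first candidate year with data, else None."""
    for year in candidate_years(target, max_offset):
        value = series.get(year)
        if value is not None:
            return year, value
    return None


# Toy series for one country and one variable (illustrative values only).
toy_series = {2018: 0.40, 2019: 0.42, 2022: 0.45}
print(pick_cross_section_value(toy_series))
```

With the toy series above, 2020 and 2021 are missing, so the 2019 observation is the one selected. A series whose nearest observation lies more than three years from 2020 yields `None`.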
In the QoG Standard TS dataset, data from 1946 and onwards is included and the unit of analysis is country-year (e.g., Sweden-1946, Sweden-1947, etc.).
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Replication data and scripts to reproduce all plots, tables, and analyses reported in the paper and Online Supporting Information. The file 00000_README.pdf contains detailed information on each script and explains how to run the files. The PolNos datasets are available at: Müller, S. and S.-O. Proksch (2023). PolNos: Political Nostalgia in Party Manifestos. Harvard Dataverse, V1. URL: https://doi.org/10.7910/DVN/L198GI.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Cline Center Global News Index is a searchable database of textual features extracted from millions of news stories, specifically designed to provide comprehensive coverage of events around the world. In addition to searching documents for keywords, users can query metadata and features such as named entities extracted using Natural Language Processing (NLP) methods and variables that measure sentiment and emotional valence. Archer is a web application purpose-built by the Cline Center to enable researchers to access data from the Global News Index. Archer provides a user-friendly interface for querying the Global News Index (with the back-end indexing still handled by Solr). By default, queries are built using icons and drop-down menus. More technically-savvy users can use Lucene/Solr query syntax via a ‘raw query’ option. Archer allows users to save and iterate on their queries, and to visualize faceted query results, which can be helpful for users as they refine their queries.
Additional Resources:
- Access to Archer and the Global News Index is limited to account-holders. If you are interested in signing up for an account, you can fill out the Archer User Information Form.
- Current users who would like to provide feedback, such as reporting a bug or requesting a feature, can fill out the Archer User Feedback Form.
- The Cline Center sends out periodic email newsletters to the Archer Users Group. Please fill out this form to subscribe to Archer Users Group.
Citation Guidelines:
1) To cite the GNI codebook (or any other documentation associated with the Global News Index and Archer) please use the following citation: Cline Center for Advanced Social Research. 2020. Global News Index and Extracted Features Repository [codebook]. Champaign, IL: University of Illinois. doi:10.13012/B2IDB-5649852_V1
2) To cite data from the Global News Index (accessed via Archer or otherwise) please use the following citation (filling in the correct date of access): Cline Center for Advanced Social Research. 2020. Global News Index and Extracted Features Repository [database]. Champaign, IL: University of Illinois. Accessed Month, DD, YYYY. doi:10.13012/B2IDB-5649852_V1
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Cline Center Historical Phoenix Event Data covers the period 1945-2019 and includes 8.2 million events extracted from 21.2 million news stories. This data was produced using the state-of-the-art PETRARCH-2 software to analyze content from the New York Times (1945-2018), the BBC Monitoring's Summary of World Broadcasts (1979-2019), the Wall Street Journal (1945-2005), and the Central Intelligence Agency’s Foreign Broadcast Information Service (1995-2004). It documents the agents, locations, and issues at stake in a wide variety of conflict, cooperation and communicative events in the Conflict and Mediation Event Observations (CAMEO) ontology. The Cline Center produced these data with the generous support of Linowes Fellow and Faculty Affiliate Prof. Dov Cohen and help from our academic and private sector collaborators in the Open Event Data Alliance (OEDA).
For details on the CAMEO framework, see:
Schrodt, Philip A., Ömür Yilmaz, Deborah J. Gerner, and Dennis Hermreck. "The CAMEO (conflict and mediation event observations) actor coding framework." In 2008 Annual Meeting of the International Studies Association. 2008. http://eventdata.parusanalytics.com/papers.dir/APSA.2005.pdf
Gerner, D.J., Schrodt, P.A. and Yilmaz, O., 2012. Conflict and mediation event observations (CAMEO) Codebook. http://eventdata.parusanalytics.com/cameo.dir/CAMEO.Ethnic.Groups.zip
For more information about PETRARCH and OEDA, see: http://openeventdata.org/
A characteristic of recent decades of scholarly work in the social sciences has been the increased amounts of empirical research. Access and availability of data are prerequisites for further research, replication work, and scientific development. As international peer-reviewed journals have gradually become the central forum for research debate, moves towards data sharing are dependent upon the policies of journals regarding data availability. This dataset presents data availability policies in political science in the year 2011.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This archive contains materials to replicate the analysis reported in: Doherty, David, Peter J. Schraeder, and Kirstie L. Dobbs. "Do Democratic Revolutions 'Activate' Participants?: The Case of Tunisia"
The root directory includes five Stata DO files (run on Stata 14.1). The file replication.do calls the other four DO files, each of which conducts analysis specific to a particular dataset. The files arab_barometer_w2.do and afrobarometer.do simply do the necessary recoding and output summary statistics. The remaining two DO files process the two datasets used in the statistical analysis reported in the paper, and recode them to ensure that variables from these two datasets are coded similarly. The file replication.do then stacks the recoded data to complete the core analysis reported.
The directory includes four folders:
1) prepped_data: this folder is where the two recoded datasets that are stacked for the core analysis are deposited. It is empty in this archive.
2) private_data: Empty folder referred to in commented out code. The only file originally included in this folder was the full dataset from the original survey used in the analysis. The commented out code (top of "orig_survey.do") stripped out variables not used in the analysis and saved the resulting dataset in the raw_data folder.
3) raw_data: Contains all datasets used in the analysis. The tunisia_2012_survey.dta file is from our original survey. The remaining files were downloaded from the Arab Barometer and AfroBarometer websites.
4) tables: Empty folder where tables and figures are saved.
To run the analysis, users should simply set the directory at the top of the replication.do file.
Attribution-NonCommercial 3.0 (CC BY-NC 3.0) https://creativecommons.org/licenses/by-nc/3.0/
License information was derived automatically
This is a do-file that processes publicly available CCES and Census data, available at the URL cited in the references. The second file is the correct one; it is not clear how to delete the duplicate.
NYU Libraries has licensed access to the L2 Political Academic Voter File. The file is a continuously updated dataset consisting of public information for every registered voter in the United States and includes basic socio-demographic indicators (some of which are modeled), consumer preferences, political party affiliation, voting history, and more.
The data consists of .tab files organized into individual state folders (all states and DC). Each state folder contains two files: demographics data and voter history data, with a data dictionary for each dataset. The size of the folders varies by state, and data for all states adds up to approximately 40 GB. The data is organized into releases, generally two per year (spring and fall), which represent a snapshot of the country's voters at the time of the dataset creation.
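As a rough illustration of working with the layout described above, the following Python sketch lists the .tab files in each state folder and streams rows from one file at a time rather than loading a whole state into memory. The function names are hypothetical, and the sketch assumes that each .tab file is tab-delimited with a header row and UTF-8 encoding; neither the actual file names nor the encoding are specified in this description, so check the data dictionaries before relying on it.

```python
import csv
from pathlib import Path
from typing import Dict, Iterator, List


def list_state_files(root: Path) -> Dict[str, List[Path]]:
    """Map each state folder name under `root` to its sorted .tab files."""
    return {
        state_dir.name: sorted(state_dir.glob("*.tab"))
        for state_dir in sorted(root.iterdir())
        if state_dir.is_dir()
    }


def iter_rows(tab_file: Path) -> Iterator[Dict[str, str]]:
    """Stream rows from one tab-delimited file as dicts keyed by column name.

    Assumptions: the first line is a header row and the file is UTF-8 encoded.
    """
    with tab_file.open(newline="", encoding="utf-8") as handle:
        yield from csv.DictReader(handle, delimiter="\t")
```

Streaming row by row is a deliberate choice here, given that a full release is described as roughly 40 GB.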
NYU has also licensed access to L2 Political's historical backlog of data. This backlog includes versions of the L2 Processed voter file going back to 2008 (for most U.S. states) and unprocessed "raw" state voter rolls, also going back to 2008 for most U.S. states.
This collection is available to NYU faculty and students only, and requires users to first submit a data management plan to account for how access and storage of the data will be handled. Information on how to submit a request to use this data and create a data management plan is available at https://guides.nyu.edu/l2political.
Stata 13 format data of state population and area ratios 1790-2010
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The ICOW Project has also collected several supplementary data sets to help in subsequent data collection and analysis. While not directly involving issues, these data sets are important for testing various issue-related hypotheses (involving the impact of the regional or global institutional context, for example) and for collecting data using historical reference sources that may refer to states or other entities by non-current names.
Historical State Names
The ICOW historical state names data set includes alternative names (or alternative spellings of names) for each nation-state in the COW interstate system. The primary purpose of this data set is to assist data coders (or other researchers using historical sources), who can often be confused by references to entity names that no longer exist or are no longer used (leading to a risk of ignored or miscoded data). The data set attempts to list all relatively common alternative names that have been used to refer to each state over the past two centuries, so that the researcher can determine easily that "New Granada" actually refers to Colombia rather than risking data loss or errors.