This dataset was used to conduct the NYC Campaign Finance Board's voter participation research, published in the 2019-2020 Voter Analysis Report. Each row contains information about an active voter in 2018 and their voting history dating back to 2008, along with geographical information from their place of residence for each year they were registered voters. Because this dataset contains only active voters in the year 2018, this dataset cannot be used to calculate election turnout.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Voter Registration Data’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://catalog.data.gov/dataset/4595f141-7b3f-45a8-b09d-093b01eb783f on 28 January 2022.
--- Dataset description provided by original source is as follows ---
All registered voters in Oregon. Find more elections and voter statistics for Oregon at https://sos.oregon.gov/elections/Pages/electionsstatistics.aspx
--- Original source retains full ownership of the source dataset ---
https://www.icpsr.umich.edu/web/ICPSR/studies/5904/terms
This study contains data on legislative assembly general and mid-term election returns for all states in India in the period 1952-1967. The legislative constituency is the unit of analysis. Data are provided for the year of election, state and party names, eligible voters, number of seats, number of candidates, total votes, valid votes, caste or tribe indicator, and ranking of parties according to total votes cast.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘US non-voters poll data’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/yamqwe/us-non-voters-poll-datae on 28 January 2022.
--- Dataset description provided by original source is as follows ---
This dataset contains the data behind Why Many Americans Don't Vote.
Data presented here comes from polling done by Ipsos for FiveThirtyEight, using Ipsos’s KnowledgePanel, a probability-based online panel that is recruited to be representative of the U.S. population. The poll was conducted from Sept. 15 to Sept. 25 among a sample of U.S. citizens that oversampled young, Black and Hispanic respondents, with 8,327 respondents, and was weighted according to general population benchmarks for U.S. citizens from the U.S. Census Bureau’s Current Population Survey March 2019 Supplement. The voter file company Aristotle then matched respondents to a voter file to more accurately understand their voting history using the panelist’s first name, last name, zip code, and eight characters of their address, using the National Change of Address program if applicable. Sixty-four percent of the sample (5,355 respondents) matched, although we also included respondents who did not match the voter file but described themselves as voting “rarely” or “never” in our survey, so as to avoid underrepresenting nonvoters, who are less likely to be included in the voter file to begin with. We dropped respondents who were only eligible to vote in three elections or fewer. We defined those who almost always vote as those who voted in all (or all but one) of the national elections (presidential and midterm) they were eligible to vote in since 2000; those who vote sometimes as those who voted in at least two elections, but fewer than all the elections they were eligible to vote in (or all but one); and those who rarely or never vote as those who voted in no elections, or just one.
The data included here is the final sample we used: 5,239 respondents who matched to the voter file and whose verified vote history we have, and 597 respondents who did not match to the voter file and described themselves as voting "rarely" or "never," all of whom have been eligible for at least 4 elections.
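The three voter categories defined above can be sketched as a small function. This is a minimal illustration of the stated rules, not the code FiveThirtyEight used; the function name and its two arguments are hypothetical.

```python
def classify_voter(elections_eligible: int, elections_voted: int) -> str:
    """Assign a voter category from counts of national elections since 2000.

    Hypothetical helper: the argument names are assumptions, not dataset columns.
    """
    if elections_eligible < 4:
        # Respondents eligible for three or fewer elections were dropped.
        raise ValueError("respondent would have been dropped from the sample")
    if not 0 <= elections_voted <= elections_eligible:
        raise ValueError("elections_voted must be between 0 and elections_eligible")
    if elections_voted >= elections_eligible - 1:
        return "almost always"   # voted in all, or all but one
    if elections_voted >= 2:
        return "sometimes"       # at least two, but fewer than all but one
    return "rarely or never"     # voted in none, or just one


print(classify_voter(10, 9))
print(classify_voter(10, 5))
print(classify_voter(4, 1))
```

Because everyone in the final sample was eligible for at least four elections, the "all but one" threshold is always at least three, so the categories do not overlap.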
If you find this information useful, please let us know.
License: Creative Commons Attribution 4.0 International License
Source: https://github.com/fivethirtyeight/data/tree/master/non-voters
This dataset was created by data.world's Admin and contains around 6000 samples along with Race, Q27 6, technical information and other features such as Q4 6, Q8 3, and more.
- Analyze Q10 3 in relation to Q8 6
- Study the influence of Q6 on Q10 4
If you use this dataset in your research, please credit data.world's Admin
--- Original source retains full ownership of the source dataset ---
https://www.icpsr.umich.edu/web/ICPSR/studies/36853/terms
Voting Behavior, The 2016 Election is an instructional module designed to offer students the opportunity to analyze a dataset drawn from the American National Election (ANES) 2016 Time Series Study [ICPSR 36824]. This instructional module is part of the SETUPS (Supplementary Empirical Teaching Units in Political Science) series and differs from previous modules in that it is completely online, including the data analysis system components.
https://www.icpsr.umich.edu/web/ICPSR/studies/1304/terms
The research addresses the evolution of electoral sentiment over the campaign cycle. The researchers translate general arguments about the role of election campaigns into a set of formal, statistical expectations, then outline an empirical analysis and examine trial-heat poll results for the 15 United States presidential elections between 1944 and 2000. The analysis focuses specifically on two questions. First, to what extent does the observable variation in aggregate poll results represent real movement in electoral preferences (if the election were held the day of the poll) as opposed to mere survey error? Second, to the extent polls register true movement of preferences owing to the shocks of campaign events, do the effects last or do they decay? Answers to these questions tell us whether and the extent to which campaign events have effects on preferences and whether these effects persist until Election Day. The answers thus inform about whether campaigns have any real impact on the final election outcome.
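The first question, separating real movement from survey error, can be illustrated with a simple variance decomposition. The sketch below is an assumption about one way to frame it, not the researchers' actual procedure: it treats each poll as a simple random sample, estimates the average sampling variance as p(1 - p)/n, and subtracts that from the observed variance across polls. The function name and the example poll values are hypothetical.

```python
from statistics import mean, variance


def decompose_poll_variance(polls):
    """Split observed poll variance into sampling error and a residual.

    polls: list of (share, sample_size) pairs, with share between 0 and 1.
    Assumes simple random sampling, which real polls only approximate.
    """
    shares = [share for share, _ in polls]
    observed = variance(shares)  # sample variance across polls
    sampling = mean([share * (1 - share) / n for share, n in polls])
    true = max(0.0, observed - sampling)  # residual attributed to real movement
    return {"observed": observed, "sampling_error": sampling, "true": true}


# Hypothetical trial-heat results: (candidate share, respondents)
example = [(0.52, 1000), (0.48, 1000), (0.55, 1000), (0.50, 1000)]
print(decompose_poll_variance(example))
```

If the residual is close to zero, the poll-to-poll variation is about what sampling error alone would produce.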
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Agency Voter Registration Activity’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://catalog.data.gov/dataset/998f045b-4270-4800-92aa-1434e6f98550 on 27 January 2022.
--- Dataset description provided by original source is as follows ---
Section 1057-a of the New York City Charter requires certain agencies to engage in and report on voter registration activities. This dataset captures how many voter registration applications each agency has distributed, how many applications agency staff sent to the Board of Elections, how many staff each agency trained to distribute voter registration applications, whether or not the agency hosts a link to voting.nyc on its website and if so, how many clicks that link received during the reporting period. Some agencies distribute voter registration applications during the course of a direct interaction with a member of the public and other agencies distribute applications passively, such as making the applications available in waiting rooms. This makes it difficult in some cases to ascertain the exact number of voter registration forms distributed. Applications sent to the Board of Elections only captures those sent by agency staff. Individuals may also choose to send in the application themselves. These applications are not counted towards the total number of applications sent to the Board of Elections. Data is reported by agencies to the Mayor’s Office of Operations twice a year. The first reporting period covers January 1 to June 30; the second reporting period covers July 1 to December 31.
--- Original source retains full ownership of the source dataset ---
No description is available. Visit https://dataone.org/datasets/sha256%3A831ec24f99cfff6083ed3a3e5cd6edf17d61e783d659bcd1edf777fa3f4ab198 for complete metadata about this dataset.
https://www.icpsr.umich.edu/web/ICPSR/studies/1/terms
Please read the collection notes below; there are many points to be aware of for this collection prior to analysis. This collection of historical election data contains state files that list county-level returns for over 90 percent of all elections to the offices of president, governor, United States senator, and United States representative from 1824 through 1968. The data files include returns for all parties and candidates (as well as write-in and scattering votes if available for individual states), and for special elections as well as regularly-scheduled contests. Over 1,000 individual party names and many additional unaffiliated candidates are included.
https://www.icpsr.umich.edu/web/ICPSR/studies/7757/terms
These data are derived from CANDIDATE NAME AND CONSTITUENCY TOTALS, 1788-1990 (ICPSR 0002). They consist of returns for two-thirds of all elections from 1788 to 1823 to the offices of president, governor, and United States representative, and over 90 percent of all elections to those offices since 1824. They also include information on United States Senate elections since 1912. Returns for one additional statewide office are included beginning with the 1968 election. This file provides a set of derived measures describing the vote totals for candidates and the pattern of contest in each constituency. These measures include the total number of votes cast for all candidates in the election, each candidate's percentage of the vote received, and several measures of the relative performance of each candidate. They are appended to the individual candidate records and permit extensive analysis of electoral contests over time. This dataset contains returns for all parties and candidates (as well as scattering vote) for general elections and special elections, including information on elections for which returns were available only at the constituency level. Included in this edition are data from the District of Columbia election for United States senator and United States representative. The offices of two senators and one representative were created by the "District of Columbia Statehood Constitutional Convention Initiative," which was approved by District voters in 1980. Elections for these offices were postponed until the 1990 general election. The three offices are currently local District positions, which will turn into federal offices if the District becomes a state.
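The kind of derived measures described above (total votes cast, each candidate's percentage, and a relative-performance measure) can be sketched as follows. The field names and the choice of rank as the relative measure are assumptions for illustration; the ICPSR file's actual variable names and definitions are in its codebook.

```python
def derive_measures(votes):
    """Compute per-candidate derived measures for one contest.

    votes: dict mapping candidate name to raw vote count (hypothetical input shape).
    Returns one record per candidate, ordered from most to fewest votes.
    """
    total = sum(votes.values())
    if total <= 0:
        raise ValueError("contest has no recorded votes")
    ranked = sorted(votes.items(), key=lambda item: item[1], reverse=True)
    records = []
    for rank, (candidate, count) in enumerate(ranked, start=1):
        records.append({
            "candidate": candidate,
            "votes": count,
            "total_votes": total,            # same value on every record
            "percent": 100 * count / total,  # candidate's share of the vote
            "rank": rank,                    # 1 = plurality winner
        })
    return records


for record in derive_measures({"A": 600, "B": 300, "C": 100}):
    print(record)
```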
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This study uses prediction market data from the nation’s historical election betting markets to measure electoral competition in the American states during the era before the advent of scientific polling. Betting odds data capture ex ante expectations of electoral closeness in the aggregate, and as such improve upon existing measures of competition based on election returns data. Situated in an analysis of the 1896 presidential election and its associated realignment, I argue that the market odds data show that people were able to anticipate the realignment and that expectations on the outcome in the states influenced voter turnout. Findings show that a month ahead of the election betting markets accurately forecast a McKinley victory in most states. This study further demonstrates that the market predictions identify those states where electoral competition would increase or decline that year and the consequences of these expected partisanship shifts on turnout. In places where the anticipation was for a close race, voter expectations account for a turnout increase of as much as 6%. Participation dropped by 1% to 6% in states perceived as becoming electorally uncompetitive. The results support the conversion and dealignment theories from the realignment literature.
https://www.icpsr.umich.edu/web/ICPSR/studies/3356/terms
This Supplementary Empirical Teaching Units in Political Science (SETUPS) module is designed as an introduction to the study of elections, voting behavior, and survey data through the analysis of the 2000 United States general election. The data are taken from the AMERICAN NATIONAL ELECTION STUDY, 2000: PRE- AND POST-ELECTION SURVEY (ICPSR 3131), conducted by Nancy Burns, Donald R. Kinder, Steven J. Rosenstone, Virginia Sapiro, and the National Election Studies. A subset of items was drawn from the full election survey, including questions on voting behavior, political involvement, media involvement, candidate images, presidential approval and government performance, economic conditions, ideology, general spending and taxation, social welfare policy, foreign policy and defense issues, social and other domestic issues, civil rights and equality, and general orientations toward government. A number of social and demographic characteristics such as gender, race, age, marital status, education, occupation, income, religious affiliation, region, and employment status are also included.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
September 1, 2016
REPLICATION FILES FOR «THE IMPACT OF STATE TELEVISION ON VOTER TURNOUT», TO BE PUBLISHED BY THE BRITISH JOURNAL OF POLITICAL SCIENCE
The replication files consist of two datasets and corresponding STATA do-files. Please note the following:
1. The data used in the current microanalysis are based on the National Election Surveys of 1965, 1969, and 1973. The Institute of Social Research (ISF) was responsible for the original studies, and data was made available by the NSD (Norwegian Center for Research Data). Neither ISF nor NSD is responsible for the analyses/interpretations of the data presented here.
2. Some of the data used in the municipality-level analyses are taken from NSD’s local government database (“Kommunedatabasen”). The NSD is not responsible for the analysis presented here or the interpretation offered in the BJPS paper.
3. Note that the municipality identification has been anonymized to avoid identification of individual respondents.
4. Most of the analyses generate Word files that are produced by the outreg2 facility in STATA. These tables can be compared with those presented in the paper. The graphs are directly comparable to those in the paper. In a few cases, the results are only generated in the STATA output window.
The paper employs two sets of data:
I. Municipal-level data in STATA format (AggregateReplicationTVData.dta), with a corresponding dataset of map coordinates (muncoord.dta). The STATA code is in a do-file (ReplicationOfAggregateAnalysis.do).
II. The survey data are in a STATA file (ReplicationofIndividualLevelPanel.dta) with a corresponding do-file (ReplicationOfIndividualLevelAnalysis 25.08.2016.do).
Please remember to change the file reference (i.e. use-statement) to execute the do-files.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Code
This figshare repository hosts a collection of tools and scripts for Twitter data analysis, focusing on election prediction using sentiment analysis and tweet processing. The repository includes four key files:
- twitter_data_collection.py: This Python script is designed for collecting tweets from Twitter in JSON format. It provides a robust method for gathering data from the Twitter platform.
- EP.ipynb: This notebook is designed for sentiment analysis and tweet processing. It features three sentiment analysis methods: VADER, BERT, and BERTweet. It includes a US states dictionary for geolocating and categorizing tweets by state, providing sentiment analysis results in both volumetric and percentage formats. Furthermore, it offers time-series analysis options, particularly on a monthly basis. It also includes a feature for filtering COVID-19-related tweets. Additionally, it conducts election analysis at both state and country levels, giving insights into public sentiment and engagement regarding political elections.
Dataset
- biden.csv and trump.csv: The "biden.csv" and "trump.csv" files together constitute an extensive dataset of tweets related to two prominent U.S. political figures, Joe Biden and Donald Trump.
These files contain detailed information about each tweet, including the following key attributes:
- create_date: The date the tweet was created.
- id: A unique identifier for each tweet.
- tweet_text: The actual text content of the tweet.
- user_id: The unique identifier for the Twitter user who posted the tweet.
- user_name: The name of the Twitter user.
- user_screen_name: The Twitter handle of the user.
- user_location: The location provided by the user in their Twitter profile.
- state (location): The U.S. state associated with the user's provided location.
- text_clean: The tweet text after preprocessing, making it suitable for analysis.
Additionally, sentiment analysis has been applied to these tweets using two different methods:
- VADER Sentiment Analysis: Each tweet has been assigned a sentiment score and a sentiment category (positive, negative, or neutral) using VADER sentiment analysis. The sentiment scores are provided in the "Vader_score" column, and the sentiment categories are in the "Vader_sentiment" column.
- BERTweet Sentiment Analysis: The files also feature sentiment labels assigned using the BERTweet sentiment analysis method, along with associated sentiment scores. The sentiment labels can be found in the "Sentiment" column, and the cleaned sentiment labels are available in the "Sentiment_clean" column.
This combined dataset offers a valuable resource for exploring sentiment trends, conducting research on public sentiment, and analyzing Twitter users' opinions related to Joe Biden and Donald Trump. Researchers, data analysts, and sentiment analysis practitioners can utilize this data for a wide range of studies and projects.
This repository serves as a resource for collecting, processing, and analyzing Twitter data with a focus on sentiment analysis. It offers a range of tools and datasets to support research and experimentation in this area.
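As an illustration of how a VADER score maps to a category, and how results can be reported in both volumetric (count) and percentage form, here is a minimal sketch. It assumes the score is VADER's compound score and uses the commonly documented cutoffs of +0.05 and -0.05; the repository's notebook may use different thresholds, and the function names are hypothetical.

```python
from collections import Counter


def vader_category(compound, threshold=0.05):
    """Map a compound score in [-1, 1] to positive, negative, or neutral.

    The 0.05 cutoff is the conventional VADER default, assumed here.
    """
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def category_breakdown(scores):
    """Return {category: (count, percent)} for a list of compound scores."""
    counts = Counter(vader_category(score) for score in scores)
    total = sum(counts.values())
    return {label: (count, 100 * count / total) for label, count in counts.items()}


# Hypothetical scores standing in for the "Vader_score" column
print(category_breakdown([0.6, -0.3, 0.0, 0.2]))
```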
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘United Nations voting database’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/guybarash/un-resolutions on 13 February 2022.
--- Dataset description provided by original source is as follows ---
This database contains all logged UN resolutions that have votes.
This database contains 7855 records. Each record is either a General Assembly (GA) vote, or a Security Council (SC) vote. The votes are from 1946 until 2021.
Each nation can vote:
- Y (YES)
- N (NO)
- X (Chose not to vote)
- A (Absent)
- [EMPTY]: the nation is not relevant to the vote
The data was scraped from the very well organized UN digital library at: https://digitallibrary.un.org/
This database was constructed to better understand relationships and biases between nations.
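One simple way to explore relationships between nations with these vote codes is a pairwise agreement rate. The sketch below is an illustration only: it assumes each nation's votes are available as a list aligned by resolution, and it treats A (absent) and empty entries as missing. Both choices are assumptions, not part of the dataset's documentation.

```python
CAST_CODES = {"Y", "N", "X"}  # codes treated as an expressed position (assumption)


def agreement_rate(votes_a, votes_b):
    """Share of resolutions on which two nations recorded the same position.

    votes_a, votes_b: lists of vote codes aligned by resolution.
    Resolutions where either nation is absent ("A") or not relevant ("") are skipped.
    Returns None when the two nations share no comparable resolutions.
    """
    pairs = [(a, b) for a, b in zip(votes_a, votes_b)
             if a in CAST_CODES and b in CAST_CODES]
    if not pairs:
        return None
    return sum(a == b for a, b in pairs) / len(pairs)


# Hypothetical vote histories for two nations across six resolutions
nation_a = ["Y", "N", "X", "A", "", "Y"]
nation_b = ["Y", "Y", "X", "Y", "N", "N"]
print(agreement_rate(nation_a, nation_b))
```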
--- Original source retains full ownership of the source dataset ---
https://dataful.in/terms-and-conditions
The Dataset contains constituency type and gender-wise Nominations Withdrawn, Forfeited Deposits, Contesting Candidates, Nominations Filed, Nominations Rejected in the assembly elections for each state
https://www.icpsr.umich.edu/web/ICPSR/studies/9249/terms
This SETUPS is designed as an introduction to the study of elections, voting behavior, and survey data through the analysis of the 1988 United States general election. The data are taken from the AMERICAN NATIONAL ELECTION STUDY, 1988: PRE- AND POST-ELECTION SURVEY (ICPSR 9196), conducted by Warren E. Miller and the National Election Studies. A subset of items including behavioral, attitudinal, and sociodemographic data were drawn from the full election survey.
This is partial replication data for a paper analyzing voting irregularities in Bolivia's 2019 election. This dataset includes municipal-level data for the following presidential elections: 2002, 2005, 2009, 2014, 2019, and 2020. The data includes voter turnout, MAS vote share, the share of the largest opposition party, effective number of parties, and select socioeconomic and demographic indicators.
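The description does not say how the effective number of parties is computed. A common definition is the Laakso-Taagepera index, the reciprocal of the sum of squared vote shares; the sketch below assumes that definition and uses hypothetical shares.

```python
def effective_number_of_parties(shares):
    """Laakso-Taagepera index: 1 / sum of squared vote shares.

    shares: vote shares as fractions; they are renormalized to sum to 1.
    Assumes this is the definition intended, which the dataset does not confirm.
    """
    total = sum(shares)
    if total <= 0:
        raise ValueError("shares must sum to a positive number")
    normalized = [share / total for share in shares]
    return 1 / sum(share ** 2 for share in normalized)


print(effective_number_of_parties([0.5, 0.5]))
print(effective_number_of_parties([0.6, 0.3, 0.1]))
```

Two equally sized parties give a value of 2; a dominant party pulls the index toward 1.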
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘BOE - Precinct Voter Counts’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://catalog.data.gov/dataset/4317aa05-035d-47d0-b1e1-39527678f11c on 11 February 2022.
--- Dataset description provided by original source is as follows ---
This dataset contains Voter Counts by Precinct and political parties. Update frequency: Monthly
--- Original source retains full ownership of the source dataset ---
AP VoteCast is a survey of the American electorate conducted by NORC at the University of Chicago for Fox News, NPR, PBS NewsHour, Univision News, USA Today Network, The Wall Street Journal and The Associated Press.
AP VoteCast combines interviews with a random sample of registered voters drawn from state voter files with self-identified registered voters selected using nonprobability approaches. In general elections, it also includes interviews with self-identified registered voters conducted using NORC’s probability-based AmeriSpeak® panel, which is designed to be representative of the U.S. population.
Interviews are conducted in English and Spanish. Respondents may receive a small monetary incentive for completing the survey. Participants selected as part of the random sample can be contacted by phone and mail and can take the survey by phone or online. Participants selected as part of the nonprobability sample complete the survey online.
In the 2020 general election, the survey of 133,103 interviews with registered voters was conducted between Oct. 26 and Nov. 3, concluding as polls closed on Election Day. AP VoteCast delivered data about the presidential election in all 50 states as well as all Senate and governors’ races in 2020.
This is survey data and must be properly weighted during analysis: DO NOT REPORT THIS DATA AS RAW OR AGGREGATE NUMBERS!!
Instead, use statistical software such as R or SPSS to weight the data.
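To show why weighting matters, here is a minimal sketch of a weighted share computed in plain Python rather than R or SPSS. The answer labels and weights are hypothetical; the real file's weight variable and question names should be taken from the AP VoteCast documentation.

```python
from collections import defaultdict


def weighted_shares(responses, weights):
    """Return each answer's share of the total survey weight."""
    totals = defaultdict(float)
    for answer, weight in zip(responses, weights):
        totals[answer] += weight
    grand_total = sum(totals.values())
    return {answer: total / grand_total for answer, total in totals.items()}


# Hypothetical respondents: the raw split is 50/50, the weighted split is not.
answers = ["A", "A", "B", "B"]
weights = [0.5, 0.5, 1.5, 1.5]
shares = weighted_shares(answers, weights)
print(shares)
```

In this toy case the unweighted count would report answer A at 50 percent, while the weighted estimate is 25 percent.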
National Survey
The national AP VoteCast survey of voters and nonvoters in 2020 is based on the results of the 50 state-based surveys and a nationally representative survey of 4,141 registered voters conducted between Nov. 1 and Nov. 3 on the probability-based AmeriSpeak panel. It included 41,776 probability interviews completed online and via telephone, and 87,186 nonprobability interviews completed online. The margin of sampling error is plus or minus 0.4 percentage points for voters and 0.9 percentage points for nonvoters.
State Surveys
In 20 states in 2020, AP VoteCast is based on roughly 1,000 probability-based interviews conducted online and by phone, and roughly 3,000 nonprobability interviews conducted online. In these states, the margin of sampling error is about plus or minus 2.3 percentage points for voters and 5.5 percentage points for nonvoters.
In an additional 20 states, AP VoteCast is based on roughly 500 probability-based interviews conducted online and by phone, and roughly 2,000 nonprobability interviews conducted online. In these states, the margin of sampling error is about plus or minus 2.9 percentage points for voters and 6.9 percentage points for nonvoters.
In the remaining 10 states, AP VoteCast is based on about 1,000 nonprobability interviews conducted online. In these states, the margin of sampling error is about plus or minus 4.5 percentage points for voters and 11.0 percentage points for nonvoters.
Although there is no statistically agreed upon approach for calculating margins of error for nonprobability samples, these margins of error were estimated using a measure of uncertainty that incorporates the variability associated with the poll estimates, as well as the variability associated with the survey weights as a result of calibration. After calibration, the nonprobability sample yields approximately unbiased estimates.
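The reported margins are larger than the textbook simple-random-sample formula would give for the same number of interviews, because weighting adds variability. The sketch below illustrates that effect with Kish's approximate design effect. It is not AP VoteCast's method, which is only described in general terms above, and the function names are hypothetical.

```python
import math


def kish_design_effect(weights):
    """Kish's approximation: n * sum(w^2) / (sum(w))^2. Equals 1 for equal weights."""
    n = len(weights)
    return n * sum(w * w for w in weights) / sum(weights) ** 2


def margin_of_error(n, deff=1.0, p=0.5, z=1.96):
    """Half-width of an approximate 95% interval for a proportion.

    deff inflates the simple-random-sample variance to reflect unequal weights.
    """
    return z * math.sqrt(deff * p * (1 - p) / n)


print(margin_of_error(1000))            # simple random sample of 1,000
print(margin_of_error(1000, deff=2.0))  # same size, with a design effect of 2
```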
As with all surveys, AP VoteCast is subject to multiple sources of error, including from sampling, question wording and order, and nonresponse.
Sampling Details
Probability-based Registered Voter Sample
In each of the 40 states in which AP VoteCast included a probability-based sample, NORC obtained a sample of registered voters from Catalist LLC’s registered voter database. This database includes demographic information, as well as addresses and phone numbers for registered voters, allowing potential respondents to be contacted via mail and telephone. The sample is stratified by state, partisanship, and a modeled likelihood to respond to the postcard based on factors such as age, race, gender, voting history, and census block group education. In addition, NORC attempted to match sampled records to a registered voter database maintained by L2, which provided additional phone numbers and demographic information.
Prior to dialing, all probability sample records were mailed a postcard inviting them to complete the survey either online using a unique PIN or via telephone by calling a toll-free number. Postcards were addressed by name to the sampled registered voter if that individual was under age 35; postcards were addressed to “registered voter” in all other cases. Telephone interviews were conducted with the adult that answered the phone following confirmation of registered voter status in the state.
Nonprobability Sample
Nonprobability participants include panelists from Dynata or Lucid, including members of its third-party panels. In addition, some registered voters were selected from the voter file, matched to email addresses by V12, and recruited via an email invitation to the survey. Digital fingerprint software and panel-level ID validation are used to prevent respondents from completing the AP VoteCast survey multiple times.
AmeriSpeak Sample
During the initial recruitment phase of the AmeriSpeak panel, randomly selected U.S. households were sampled with a known, non-zero probability of selection from the NORC National Sample Frame and then contacted by mail, email, telephone and field interviewers (face-to-face). The panel provides sample coverage of approximately 97% of the U.S. household population. Those excluded from the sample include people with P.O. Box-only addresses, some addresses not listed in the U.S. Postal Service Delivery Sequence File and some newly constructed dwellings. Registered voter status was confirmed in field for all sampled panelists.
Weighting Details
AP VoteCast employs a four-step weighting approach that combines the probability sample with the nonprobability sample and refines estimates at a subregional level within each state. In a general election, the 50 state surveys and the AmeriSpeak survey are weighted separately and then combined into a survey representative of voters in all 50 states.
State Surveys
First, weights are constructed separately for the probability sample (when available) and the nonprobability sample for each state survey. These weights are adjusted to population totals to correct for demographic imbalances in age, gender, education and race/ethnicity of the responding sample compared to the population of registered voters in each state. In 2020, the adjustment targets are derived from a combination of data from the U.S. Census Bureau’s November 2018 Current Population Survey Voting and Registration Supplement, Catalist’s voter file and the Census Bureau’s 2018 American Community Survey. Prior to adjusting to population totals, the probability-based registered voter list sample weights are adjusted for differential non-response related to factors such as availability of phone numbers, age, race and partisanship.
Second, all respondents receive a calibration weight. The calibration weight is designed to ensure the nonprobability sample is similar to the probability sample in regard to variables that are predictive of vote choice, such as partisanship or direction of the country, which cannot be fully captured through the prior demographic adjustments. The calibration benchmarks are based on regional level estimates from regression models that incorporate all probability and nonprobability cases nationwide.
Third, all respondents in each state are weighted to improve estimates for substate geographic regions. This weight combines the weighted probability (if available) and nonprobability samples, and then uses a small area model to improve the estimate within subregions of a state.
Fourth, the survey results are weighted to the actual vote count following the completion of the election. This weighting is done in 10–30 subregions within each state.
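The first step above, adjusting weights to population totals on several demographics at once, is commonly done by raking (iterative proportional fitting). The sketch below illustrates that general technique on two variables. It is an assumption that raking resembles what NORC does, and it does not cover the calibration, small-area, or vote-count steps; all names and targets are hypothetical.

```python
def rake(rows, weights, targets, iterations=50):
    """Adjust weights so weighted totals match each variable's population targets.

    rows: list of dicts of categorical values, one per respondent.
    weights: starting weights, one per respondent.
    targets: {variable: {category: population_total}}.
    """
    weights = list(weights)
    for _ in range(iterations):
        for variable, target in targets.items():
            # Current weighted total in each category of this variable
            current = {}
            for row, weight in zip(rows, weights):
                category = row[variable]
                current[category] = current.get(category, 0.0) + weight
            # Scale every respondent so the category totals hit the targets
            weights = [weight * target[row[variable]] / current[row[variable]]
                       for row, weight in zip(rows, weights)]
    return weights


respondents = [
    {"age": "young", "gender": "f"},
    {"age": "young", "gender": "m"},
    {"age": "old", "gender": "f"},
    {"age": "old", "gender": "m"},
]
targets = {"age": {"young": 40, "old": 60}, "gender": {"f": 55, "m": 45}}
raked = rake(respondents, [1.0, 1.0, 1.0, 1.0], targets)
print(raked)
```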
National Survey
In a general election, the national survey is weighted to combine the 50 state surveys with the nationwide AmeriSpeak survey. Each of the state surveys is weighted as described. The AmeriSpeak survey receives a nonresponse-adjusted weight that is then adjusted to national totals for registered voters that in 2020 were derived from the U.S. Census Bureau’s November 2018 Current Population Survey Voting and Registration Supplement, the Catalist voter file and the Census Bureau’s 2018 American Community Survey. The state surveys are further adjusted to represent their appropriate proportion of the registered voter population for the country and combined with the AmeriSpeak survey. After all votes are counted, the national data file is adjusted to match the national popular vote for president.
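The idea of adjusting state surveys to their proportion of the national registered voter population can be sketched as a rescaling of each state's weights. This is a simplified illustration under assumed inputs, not the actual AP VoteCast procedure; it leaves out the AmeriSpeak survey and the final adjustment to the popular vote, and the state labels and counts are hypothetical.

```python
def scale_to_state_shares(state_weights, registered_voters):
    """Rescale weights so each state's weight total matches its share of registered voters.

    state_weights: {state: [respondent weights]}.
    registered_voters: {state: number of registered voters}.
    The overall sum of weights is preserved.
    """
    total_registered = sum(registered_voters.values())
    total_weight = sum(sum(ws) for ws in state_weights.values())
    scaled = {}
    for state, ws in state_weights.items():
        target = total_weight * registered_voters[state] / total_registered
        factor = target / sum(ws)
        scaled[state] = [w * factor for w in ws]
    return scaled


# Two hypothetical states with equal sample sizes but unequal populations
combined = scale_to_state_shares(
    {"X": [1.0, 1.0], "Y": [1.0, 1.0]},
    {"X": 3_000_000, "Y": 1_000_000},
)
print(combined)
```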