MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
This dataset contains the county-level vote share for the 2020 United States presidential election and, in the future, the 2024 election. Its main advantage is that it also contains important county statistics: racial composition, median and mean income, income inequality, population density, education level, population and occupational distribution.
_Important: this dataset will be updated as the 2024 results come in. I will also be adding more county demographic data. If you have any queries or suggestions, please feel free to comment._
There were many reasons for constructing this dataset, but the main one was to bring county data and election results together in one place for easy analysis. I noticed that Kaggle contains no datasets with detailed county information, and that extracting data from the US Census Bureau site is difficult and time-consuming, so a pre-prepared table of data seemed more useful.
The Cumulative Report includes complete official election results for the 2020 Presidential General Election as of November 29, 2020. Results are released in three separate reports: The Vote By Mail 1 report contains complete results for ballots received by the Board of Elections on or before October 21, 2020, that could be accepted and opened before Election Day. The Vote By Mail 2 Canvass report contains complete results for all remaining Vote By Mail ballots that were received in a drop box or in person at the Board of Elections by 8:00pm on November 3, or were postmarked by November 3 and received timely by the Board of Elections by 10:00am on Friday, November 13. The Vote By Mail 2 Canvass begins on Thursday, November 5. The Provisional Canvass contains complete results for all provisional ballots issued to voters at Early Voting or on Election Day. For more information on this process, please visit the 2020 Presidential General Election Ballot Canvass webpage at https://www.montgomerycountymd.gov/Elections/2020GeneralElection/general-ballot-canvass.html. For turnout information, please visit the Maryland State Board of Elections Press Room webpage at https://elections.maryland.gov/press_room/index.html.
PROBLEM AND OPPORTUNITY
In the United States, voting is largely a private matter. A registered voter is given a randomized ballot form or machine to prevent linkage between their voting choices and their identity. This disconnect supports confidence in the election process, but it creates obstacles to analyzing an election. A common solution is to field exit polls, interviewing voters immediately after they leave their polling location. This method is rife with bias, however, and functionally limited in the direct demographic data it collects.
For the 2020 general election, though, most states published their election results for each voting location. These publications were additionally supported by the geographical areas assigned to each location, the voting precincts. As a result, geographic processing can now be applied to project precinct election results onto Census block groups. While precincts have few demographic traits directly, their geographies have characteristics that make them projectable onto U.S. Census geographies. Both state voting precincts and U.S. Census block groups:
- are exclusive, and do not overlap
- are adjacent, fully covering their corresponding state and potentially county
- have roughly the same size in area, population and voter presence
Analytically, a projection of local demographics does not allow conclusions about voters themselves. However, the dataset does allow statements related to the geographies that yield voting behavior. One could say, for example, that an area dominated by a particular voting pattern would have mean traits of age, race, income or household structure.
The dataset that results from this programming provides voting results allocated by Census block groups. The block group identifier can be joined to Census Decennial and American Community Survey demographic estimates.
DATA SOURCES
The state election results and geographies have been compiled by the Voting and Election Science Team on Harvard's Dataverse.
State voting precincts lie within state and county boundaries. The Census Bureau, on the other hand, publishes its estimates across a variety of geographic definitions including a hierarchy of states, counties, census tracts and block groups. Their definitions can be found here. The geometric shapefiles for each block group are available here. The lowest level of this geography changes often and can obsolesce before the next census survey (Decennial or American Community Survey programs). The second-to-lowest census level, block groups, has the benefit of both granularity and stability, however. The 2020 Decennial survey details US demographics into 217,740 block groups with between a few hundred and a few thousand people.
Dataset Structure
The dataset's columns include:
| Column | Definition |
|---|---|
| BLOCKGROUP_GEOID | 12-digit primary key. Census GEOID of the block group row. This code concatenates: 2-digit state, 3-digit county within state, 6-digit Census Tract identifier, 1-digit Census Block Group identifier within tract |
| STATE | State abbreviation, redundant with the 2-digit state FIPS code above |
| REP | Votes for Republican party candidate for president |
| DEM | Votes for Democratic party candidate for president |
| LIB | Votes for Libertarian party candidate for president |
| OTH | Votes for presidential candidates other than Republican, Democratic or Libertarian |
| AREA | Square kilometers of area associated with this block group |
| GAP | Total area of the block group, net of area attributed to voting precincts |
| PRECINCTS | Number of voting precincts that intersect this block group |
ASSUMPTIONS, NOTES AND CONCERNS:
- Votes are attributed based upon the proportion of the precinct's area that intersects the corresponding block group. Alternative methods are left to the analyst's initiative.
- The 50 states and the District of Columbia are in scope, as these are the U.S. jurisdictions that vote in the general election for the U.S. presidency.
- Three states did not report their results at the precinct level: South Dakota, Kentucky and West Virginia. A dummy block group is added for each of these states to maintain national totals. These states represent 2.1% of all votes cast.
- Counties are commonly coded using FIPS codes. However, each election result file may have the county field named differently. Also, three states and the District of Columbia do not share county definitions: Delaware, Massachusetts and Alaska.
- Block groups may be used to capture geographies that do not have population, like bodies of water. As a result, block groups without intersecting voting precincts are not uncommon.
In the U.S., elections are administered at a state level with the Federal Election Commission compiling state totals against the Electoral College weights. The states have liberty, though, to define and change their own voting precincts (https://en.wikipedia.org/wiki/Electoral_precinct). The Census Bureau...
Visit https://dataone.org/datasets/sha256%3A05707c1dc04a814129f751937a6ea56b08413546b18b351a85bc96da16a7f8b5 for complete metadata about this dataset.
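The area-proportional attribution and the 12-digit GEOID layout described above can be illustrated with a small Python sketch. It is not the dataset author's code: the function names, the precinct IDs, the vote counts and the overlap areas are all hypothetical, and a real pipeline would compute the overlap areas from precinct and block group shapefiles with a GIS library.

```python
from collections import defaultdict


def split_geoid(geoid):
    """Split a 12-digit block group GEOID into its concatenated parts."""
    if len(geoid) != 12 or not geoid.isdigit():
        raise ValueError(f"expected a 12-digit GEOID, got {geoid!r}")
    return {
        "state": geoid[:2],         # 2-digit state FIPS
        "county": geoid[2:5],       # 3-digit county within state
        "tract": geoid[5:11],       # 6-digit census tract
        "block_group": geoid[11],   # 1-digit block group within tract
    }


def allocate_votes(precinct_votes, precinct_area, intersections):
    """Attribute precinct votes to block groups by share of precinct area.

    precinct_votes: {precinct_id: {"REP": n, "DEM": n, ...}}
    precinct_area:  {precinct_id: total precinct area in sq km}
    intersections:  iterable of (precinct_id, blockgroup_geoid, overlap in sq km)
    """
    allocated = defaultdict(lambda: defaultdict(float))
    for precinct_id, geoid, overlap in intersections:
        share = overlap / precinct_area[precinct_id]
        for party, votes in precinct_votes[precinct_id].items():
            allocated[geoid][party] += votes * share
    return {geoid: dict(votes) for geoid, votes in allocated.items()}


# Hypothetical inputs: two precincts overlapping two block groups.
precinct_votes = {"P1": {"REP": 100, "DEM": 300}, "P2": {"REP": 200, "DEM": 100}}
precinct_area = {"P1": 4.0, "P2": 2.0}
intersections = [
    ("P1", "240317001011", 1.0),  # a quarter of P1 falls in this block group
    ("P1", "240317001012", 3.0),  # the other three quarters of P1
    ("P2", "240317001012", 2.0),  # all of P2
]

result = allocate_votes(precinct_votes, precinct_area, intersections)
parts = split_geoid("240317001011")
```

Because each precinct's votes are split by area share, the allocated counts are generally fractional, and national totals are preserved only when every precinct is fully covered by block groups.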
AP VoteCast is a survey of the American electorate conducted by NORC at the University of Chicago for Fox News, NPR, PBS NewsHour, Univision News, USA Today Network, The Wall Street Journal and The Associated Press.
AP VoteCast combines interviews with a random sample of registered voters drawn from state voter files with self-identified registered voters selected using nonprobability approaches. In general elections, it also includes interviews with self-identified registered voters conducted using NORC’s probability-based AmeriSpeak® panel, which is designed to be representative of the U.S. population.
Interviews are conducted in English and Spanish. Respondents may receive a small monetary incentive for completing the survey. Participants selected as part of the random sample can be contacted by phone and mail and can take the survey by phone or online. Participants selected as part of the nonprobability sample complete the survey online.
In the 2020 general election, the survey of 133,103 interviews with registered voters was conducted between Oct. 26 and Nov. 3, concluding as polls closed on Election Day. AP VoteCast delivered data about the presidential election in all 50 states as well as all Senate and governors’ races in 2020.
This is survey data and must be properly weighted during analysis: DO NOT REPORT THIS DATA AS RAW OR AGGREGATE NUMBERS!!
Instead, use statistical software such as R or SPSS to weight the data.
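As a rough illustration of why weighting matters, the Python sketch below compares an unweighted share with a weighted one. The responses and weights are made up, and the name of the actual weight variable in the AP VoteCast files is not given here, so treat this as a sketch of the idea only.

```python
def weighted_share(responses, weights, value):
    """Return the weighted share of respondents whose response equals value."""
    total = sum(weights)
    if total == 0:
        raise ValueError("weights sum to zero")
    matched = sum(w for r, w in zip(responses, weights) if r == value)
    return matched / total


# Hypothetical respondents and survey weights (not real AP VoteCast data).
votes = ["Biden", "Trump", "Biden", "Trump"]
weights = [0.5, 2.0, 1.0, 0.5]

unweighted = votes.count("Biden") / len(votes)
weighted = weighted_share(votes, weights, "Biden")
```

In this toy sample the raw count suggests an even split, while the weighted estimate is noticeably lower for one candidate, which is why raw tallies should not be reported.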
National Survey
The national AP VoteCast survey of voters and nonvoters in 2020 is based on the results of the 50 state-based surveys and a nationally representative survey of 4,141 registered voters conducted between Nov. 1 and Nov. 3 on the probability-based AmeriSpeak panel. It included 41,776 probability interviews completed online and via telephone, and 87,186 nonprobability interviews completed online. The margin of sampling error is plus or minus 0.4 percentage points for voters and 0.9 percentage points for nonvoters.
State Surveys
In 20 states in 2020, AP VoteCast is based on roughly 1,000 probability-based interviews conducted online and by phone, and roughly 3,000 nonprobability interviews conducted online. In these states, the margin of sampling error is about plus or minus 2.3 percentage points for voters and 5.5 percentage points for nonvoters.
In an additional 20 states, AP VoteCast is based on roughly 500 probability-based interviews conducted online and by phone, and roughly 2,000 nonprobability interviews conducted online. In these states, the margin of sampling error is about plus or minus 2.9 percentage points for voters and 6.9 percentage points for nonvoters.
In the remaining 10 states, AP VoteCast is based on about 1,000 nonprobability interviews conducted online. In these states, the margin of sampling error is about plus or minus 4.5 percentage points for voters and 11.0 percentage points for nonvoters.
Although there is no statistically agreed upon approach for calculating margins of error for nonprobability samples, these margins of error were estimated using a measure of uncertainty that incorporates the variability associated with the poll estimates, as well as the variability associated with the survey weights as a result of calibration. After calibration, the nonprobability sample yields approximately unbiased estimates.
As with all surveys, AP VoteCast is subject to multiple sources of error, including from sampling, question wording and order, and nonresponse.
Sampling Details
Probability-based Registered Voter Sample
In each of the 40 states in which AP VoteCast included a probability-based sample, NORC obtained a sample of registered voters from Catalist LLC’s registered voter database. This database includes demographic information, as well as addresses and phone numbers for registered voters, allowing potential respondents to be contacted via mail and telephone. The sample is stratified by state, partisanship, and a modeled likelihood to respond to the postcard based on factors such as age, race, gender, voting history, and census block group education. In addition, NORC attempted to match sampled records to a registered voter database maintained by L2, which provided additional phone numbers and demographic information.
Prior to dialing, all probability sample records were mailed a postcard inviting them to complete the survey either online using a unique PIN or via telephone by calling a toll-free number. Postcards were addressed by name to the sampled registered voter if that individual was under age 35; postcards were addressed to “registered voter” in all other cases. Telephone interviews were conducted with the adult that answered the phone following confirmation of registered voter status in the state.
Nonprobability Sample
Nonprobability participants include panelists from Dynata or Lucid, including members of its third-party panels. In addition, some registered voters were selected from the voter file, matched to email addresses by V12, and recruited via an email invitation to the survey. Digital fingerprint software and panel-level ID validation is used to prevent respondents from completing the AP VoteCast survey multiple times.
AmeriSpeak Sample
During the initial recruitment phase of the AmeriSpeak panel, randomly selected U.S. households were sampled with a known, non-zero probability of selection from the NORC National Sample Frame and then contacted by mail, email, telephone and field interviewers (face-to-face). The panel provides sample coverage of approximately 97% of the U.S. household population. Those excluded from the sample include people with P.O. Box-only addresses, some addresses not listed in the U.S. Postal Service Delivery Sequence File and some newly constructed dwellings. Registered voter status was confirmed in field for all sampled panelists.
Weighting Details
AP VoteCast employs a four-step weighting approach that combines the probability sample with the nonprobability sample and refines estimates at a subregional level within each state. In a general election, the 50 state surveys and the AmeriSpeak survey are weighted separately and then combined into a survey representative of voters in all 50 states.
State Surveys
First, weights are constructed separately for the probability sample (when available) and the nonprobability sample for each state survey. These weights are adjusted to population totals to correct for demographic imbalances in age, gender, education and race/ethnicity of the responding sample compared to the population of registered voters in each state. In 2020, the adjustment targets are derived from a combination of data from the U.S. Census Bureau’s November 2018 Current Population Survey Voting and Registration Supplement, Catalist’s voter file and the Census Bureau’s 2018 American Community Survey. Prior to adjusting to population totals, the probability-based registered voter list sample weights are adjusted for differential non-response related to factors such as availability of phone numbers, age, race and partisanship.
Second, all respondents receive a calibration weight. The calibration weight is designed to ensure the nonprobability sample is similar to the probability sample in regard to variables that are predictive of vote choice, such as partisanship or direction of the country, which cannot be fully captured through the prior demographic adjustments. The calibration benchmarks are based on regional level estimates from regression models that incorporate all probability and nonprobability cases nationwide.
Third, all respondents in each state are weighted to improve estimates for substate geographic regions. This weight combines the weighted probability (if available) and nonprobability samples, and then uses a small area model to improve the estimate within subregions of a state.
Fourth, the survey results are weighted to the actual vote count following the completion of the election. This weighting is done in 10–30 subregions within each state.
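The fourth step can be pictured as a simple post-stratification of weights to known vote shares. The Python sketch below is a one-variable simplification written for illustration; it is not NORC's procedure, which works in subregions and follows the earlier calibration and small area steps. The function name, candidates and numbers are hypothetical.

```python
def poststratify(weights, choices, actual_share):
    """Rescale weights so each choice's weighted share matches actual_share.

    The total weight is preserved; only the balance between choices changes.
    """
    total = sum(weights)
    weight_by_choice = {}
    for w, c in zip(weights, choices):
        weight_by_choice[c] = weight_by_choice.get(c, 0.0) + w
    factors = {
        c: actual_share[c] * total / weight_by_choice[c]
        for c in weight_by_choice
    }
    return [w * factors[c] for w, c in zip(weights, choices)]


# Hypothetical subregion: candidate B is over-represented before adjustment.
weights = [1.0, 1.0, 2.0]
choices = ["A", "B", "B"]
actual_share = {"A": 0.5, "B": 0.5}  # assumed certified result for the subregion

adjusted = poststratify(weights, choices, actual_share)
share_a = adjusted[0] / sum(adjusted)
```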
National Survey
In a general election, the national survey is weighted to combine the 50 state surveys with the nationwide AmeriSpeak survey. Each of the state surveys is weighted as described. The AmeriSpeak survey receives a nonresponse-adjusted weight that is then adjusted to national totals for registered voters that in 2020 were derived from the U.S. Census Bureau’s November 2018 Current Population Survey Voting and Registration Supplement, the Catalist voter file and the Census Bureau’s 2018 American Community Survey. The state surveys are further adjusted to represent their appropriate proportion of the registered voter population for the country and combined with the AmeriSpeak survey. After all votes are counted, the national data file is adjusted to match the national popular vote for president.
This data comes from the Associated Press - the AP has been tracking vote counts in US elections since 1848 and their data is widely considered to be accurate.
The variables in this dataset are:
- state: State to which the vote count corresponds
- state_abr: Two-letter abbreviation of state name
- trump_pct: Percentage of the vote won by Donald Trump
- biden_pct: Percentage of the vote won by Joe Biden
- trump_win: Binary variable denoting whether Donald Trump won the vote in a state
- biden_win: Binary variable denoting whether Joe Biden won the vote in a state
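A minimal Python sketch of how the two win flags relate to the percentage columns is shown below. It assumes the flags are coded 0/1 and that a candidate "won" a state when their percentage is the higher of the two; the source only says the flags are binary. The rows are invented placeholders, not values from the dataset.

```python
def add_win_flags(row):
    """Return a copy of row with trump_win and biden_win derived from the percentages."""
    out = dict(row)
    out["trump_win"] = int(row["trump_pct"] > row["biden_pct"])
    out["biden_win"] = int(row["biden_pct"] > row["trump_pct"])
    return out


# Placeholder rows with made-up percentages.
rows = [
    {"state": "Example A", "state_abr": "XA", "trump_pct": 52.0, "biden_pct": 46.5},
    {"state": "Example B", "state_abr": "XB", "trump_pct": 44.1, "biden_pct": 54.2},
]

flagged = [add_win_flags(r) for r in rows]
trump_states = [r["state_abr"] for r in flagged if r["trump_win"] == 1]
```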
https://www.usa.gov/government-works/
Voter turnout is the percentage of eligible voters who cast a ballot in an election. Eligibility varies by country, and the voting-eligible population should not be confused with the total adult population. Age and citizenship status are often among the criteria used to determine eligibility, but some countries further restrict eligibility based on sex, race, or religion.
The historical trends in voter turnout in the United States presidential elections have been determined by the gradual expansion of voting rights from the initial restriction to white male property owners aged 21 or older in the early years of the country's independence, to all citizens aged 18 or older in the mid-20th century. Voter turnout in United States presidential elections has historically been higher than the turnout for midterm elections.
https://upload.wikimedia.org/wikipedia/commons/a/a7/U.S._Vote_for_President_as_Population_Share.png
Turnout rates by demographic breakdown come from the Census Bureau's Current Population Survey, November Voting and Registration Supplement (or CPS for short). This table is corrected for vote overreporting bias. For uncorrected weights see the source link.
Original source: https://data.world/government/vep-turnout
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Terms of Access: By downloading the data, you agree to use the data only for academic research, agree not to share the data with outside parties, and agree not to attempt to re-identify individuals in the data set. We require this in order to protect the privacy of individuals in the data set and to comply with agreements made with TargetSmart.
Abstract: We present the results of a large, $8.9 million campaign-wide field experiment, conducted among 2 million moderate and low-information “persuadable” voters in five battleground states during the 2020 US Presidential election. Treatment group subjects were exposed to an eight-month-long advertising program delivered via social media, designed to persuade people to vote against Donald Trump and for Joe Biden. We find no evidence that the program increased or decreased turnout on average. We find evidence of differential turnout effects by modeled level of Trump support: the campaign increased voting among Biden leaners by 0.4 percentage points (SE: 0.2pp) and decreased voting among Trump leaners by 0.3 percentage points (SE: 0.3pp), for a difference-in-CATEs of 0.7 points that is just distinguishable from zero (t(1035571) = −2.09, p = 0.036, DIC = 0.7 points, 95% CI = [−0.014, −0.00]). An important but exploratory finding is that the strongest differential effects appear in early voting data, which may inform future work on early campaigning in a post-COVID electoral environment. Our results indicate that differential mobilization effects of even large digital advertising campaigns in presidential elections are likely to be modest.
Data from https://github.com/TheUpshot/presidential-precinct-map-2020 released under MIT license: https://github.com/TheUpshot/presidential-precinct-map-2020/blob/main/LICENSE. For more detail, see https://www.nytimes.com/interactive/2021/upshot/2020-election-map.html.
The Upshot scraped and standardized precinct-level election results from around the country, and joined this tabular data to precinct GIS data to create a nationwide election map. This map does not have full coverage for every state: data availability and caveats for each state are listed below, and statistics about data coverage are available here. We are releasing this map's data for attributed re-use under the MIT license in this repository.
The GeoJSON dataset can be downloaded at: https://int.nyt.com/newsgraphics/elections/map-data/2020/national/precincts-with-results.geojson.gz
Properties on each precinct polygon:
- GEOID: unique identifier for the precinct, formed from the five-digit county FIPS code followed by the precinct name/ID (e.g., 30003-08 or 39091-WEST MANSFIELD)
- votes_dem: votes received by Joseph Biden
- votes_rep: votes received by Donald Trump
- votes_total: total votes in the precinct, including for third-party candidates and write-ins
- votes_per_sqkm: total votes divided by the area of the precinct, rounded to one decimal place
- pct_dem_lead: (votes_dem - votes_rep) / (votes_dem + votes_rep), rounded to one decimal place (e.g., -21.3)
Due to licensing restrictions, we are unable to include the 2016 election results that appear in our interactive map.
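The derived properties can be reproduced from the raw counts with a few lines of Python. This is a sketch rather than The Upshot's code: the function names and input numbers are hypothetical, and the multiplication by 100 in pct_dem_lead is an assumption inferred from the example value -21.3, since the stated formula alone would yield a fraction.

```python
def split_precinct_geoid(geoid):
    """Split a precinct GEOID into (county FIPS, precinct name/ID)."""
    county_fips, _, precinct = geoid.partition("-")  # split on the first hyphen only
    return county_fips, precinct


def precinct_stats(votes_dem, votes_rep, votes_total, area_sqkm):
    """Compute votes_per_sqkm and pct_dem_lead as described in the property list."""
    two_party = votes_dem + votes_rep
    pct_dem_lead = None
    if two_party > 0:
        # Assumed to be expressed in percentage points (see note above).
        pct_dem_lead = round(100 * (votes_dem - votes_rep) / two_party, 1)
    votes_per_sqkm = round(votes_total / area_sqkm, 1) if area_sqkm > 0 else None
    return {"votes_per_sqkm": votes_per_sqkm, "pct_dem_lead": pct_dem_lead}


# Hypothetical precinct: 300 Democratic votes, 462 Republican, 775 total, 2.5 sq km.
stats = precinct_stats(votes_dem=300, votes_rep=462, votes_total=775, area_sqkm=2.5)
county, precinct = split_precinct_geoid("39091-WEST MANSFIELD")
```

Note that pct_dem_lead uses the two-party total as its denominator, so third-party and write-in votes affect votes_total and votes_per_sqkm but not the lead.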
Please contact dear.upshot@nytimes.com if you have any questions about data quality or sourcing, beyond the caveats we describe below.
| symbol | meaning |
|---|---|
| ✅ | have gathered data, no significant caveats |
| ⚠️ | have gathered data, but doesn't cover entire state or has other significant caveats |
| ❌ | precinct data not usable |
| ❓ | precinct data not yet available |
Note: One of the most common causes of precinct data being unusable is "countywide" tabulations. This occurs when a county reports, say, all of its absentee ballots together as a single row in its Excel download (instead of precinct-by-precinct); because we can't attribute those ballots to specific precincts, all precincts in the county will be missing an indeterminate number of votes, and therefore can't be reliably mapped. In these cases, we drop the entire county from our GeoJSON.
- AL: ❌ absentee and provisional results are reported countywide
- AK: ❌ absentee, early, and provisional results are reported district-wide
- AZ: ✅
- AR: ⚠️ we could not generate or procure precinct maps for Jefferson County or Phillips County
- CA: ⚠️ only certain counties report results at the precinct level, additional collection is in progress
- CO: ✅
- CT: ⚠️ township-level results rather than precinct-level results
- DE: ✅
- DC: ✅
- FL: ⚠️ precinct results not yet available statewide
- GA: ✅
- HI: ✅
- ID: ⚠️ many counties ...
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This data set consists of all Fulton County Election results from April 2012 to present. Included with each record is the race, candidate, precinct, number of election day votes, number of absentee by mail votes, number of advance in person votes, number of provisional votes, total number of votes, name of election, and date of election. This data set is updated after each election.
According to exit polling in ten key states of the 2024 presidential election in the United States, 46 percent of voters with a 2023 household income of 30,000 U.S. dollars or less reported voting for Donald Trump. In comparison, 51 percent of voters with a total family income of 100,000 to 199,999 U.S. dollars reported voting for Kamala Harris.
This web map displays data from the voter registration database as the percent of registered voters by census tract in King County, Washington. The data for this web map is compiled from King County Elections voter registration data for the years 2013-2019. The total number of registered voters is based on the geo-location of the voter's registered address at the time of the general election for each year.
The eligible voting population, age 18 and over, is based on the estimated population increase from the US Census Bureau and the Washington Office of Financial Management and was calculated as a projected 6 percent population increase for the years 2010-2013, 7 percent population increase for the years 2010-2014, 9 percent population increase for the years 2010-2015, 11 percent population increase for the years 2010-2016 & 2017, 14 percent population increase for the years 2010-2018 and 17 percent population increase for the years 2010-2019. The total population 18 and over in 2010 was 1,517,747 in King County, Washington.
The percentage of registered voters represents the number of people who are registered to vote as compared to the eligible voting population, age 18 and over. The voter registration data by census tract was grouped into six percentage range estimates: 50% or below, 51-60%, 61-70%, 71-80%, 81-90% and 91% or above, with an overall 84 percent registration rate. In the map the lighter colors represent a relatively low percentage range of voter registration and the darker colors represent a relatively high percentage range of voter registration. PDF maps of these data can be viewed at King County Elections downloadable voter registration maps.
The 2019 General Election Voter Turnout layer is voter turnout data by historical precinct boundaries for the corresponding year. The data is grouped into six percentage ranges: 0-30%, 31-40%, 41-50%, 51-60%, 61-70%, and 71-100%.
The lighter colors represent lower turnout and the darker colors represent higher turnout. The King County Demographics Layer is census data for language, income, poverty, race and ethnicity at the census tract level and is based on the 2010-2014 American Community Survey 5-year average provided by the United States Census Bureau. Since the data are based on a survey, they are estimates and should be used with that understanding. The demographic data sets were developed and are maintained by King County staff to support the King County Equity and Social Justice program. Other data for this map is located in the King County GIS Spatial Data Catalog, where data is managed by the King County GIS Center, a multi-department enterprise GIS in King County, Washington. King County has nearly 1.3 million registered voters and is the largest jurisdiction in the United States to conduct all elections by mail. In the map you can view the percent of registered voters by census tract, compare registration within political districts, compare registration and demographic data, and verify your voter registration or register to vote through a link to VoteWA, the Washington State Online Voter Registration web page.
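The registration percentage described for this map is simple arithmetic on the 2010 population and the projected growth rates, and can be sketched in Python as follows. The growth rates and the 2010 county total come from the description; the function names, the example tract figures, the idea of applying the county growth rate to a tract's 2010 population, and the rounding to a whole percent before assigning a range are all assumptions made for illustration.

```python
POP_18_PLUS_2010 = 1_517_747  # King County population aged 18 and over in 2010

# Projected population increase since 2010, by year, as listed in the description.
GROWTH_SINCE_2010 = {
    2013: 0.06, 2014: 0.07, 2015: 0.09, 2016: 0.11,
    2017: 0.11, 2018: 0.14, 2019: 0.17,
}


def eligible_population(year, base_2010=POP_18_PLUS_2010):
    """Project the 18-and-over population for a year from the 2010 base."""
    return base_2010 * (1 + GROWTH_SINCE_2010[year])


def registration_pct(registered, year, base_2010=POP_18_PLUS_2010):
    """Registered voters as a percentage of the projected eligible population."""
    return 100 * registered / eligible_population(year, base_2010)


def registration_range(pct):
    """Assign one of the six map ranges (assumes rounding to a whole percent)."""
    whole = round(pct)
    if whole <= 50:
        return "50% or below"
    if whole >= 91:
        return "91% or above"
    lower = ((whole - 1) // 10) * 10 + 1
    return f"{lower}-{lower + 9}%"


# Hypothetical tract: 4,000 adults in 2010 and 3,900 registered voters in 2019.
tract_pct = registration_pct(3_900, 2019, base_2010=4_000)
tract_range = registration_range(tract_pct)
county_eligible_2019 = eligible_population(2019)
```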
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
EPILOGUE:
https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F360751%2Fa5eefdb31428bd5ce99cdf76fa484a63%2Fmap.jpg?generation=1733007717460285&alt=media
FINAL UPDATE: It's election night, and the results are coming in. The final update includes the latest poll data from 538, which is from two days ago. Thanks all for following the development of this dataset.
OCTOBER UPDATE: The past month has been typical of the final weeks before the election - rallies, interviews, and advertising. This update includes a transcript of the VP debate between Walz and Vance, and the latest poll summaries.
SEPTEMBER UPDATE: Trump and Harris had their first debate. This update includes the transcript and recent poll results. Also, there was a second attempt to kill former President Trump! No shots fired though on this one. You'll see aerial diagrams of both attempts in the dataset.
https://external-content.duckduckgo.com/iu/?u=https%3A%2F%2Ftse4.mm.bing.net%2Fth%3Fid%3DOIF.edyLiGntLZbwC9fBkg8TsQ%26pid%3DApi&f=1&ipt=a1096b37cf3eced7dff70d362a2c76f8876422f53c47856cadf09f9fa18b367e&ipo=images
https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F360751%2F0ecedf88421c303e0112734a30de9e29%2Frouth.jpg?generation=1726701011377683&alt=media
LATE AUGUST UPDATE: The Democratic Party replaced President Biden with his VP, Kamala Harris. It's now Trump v Harris along with one nominee from each of the smaller factions.
https://external-content.duckduckgo.com/iu/?u=https%3A%2F%2Fmedia.cnn.com%2Fapi%2Fv1%2Fimages%2Fstellar%2Fprod%2F240122181719-trump-kamala-vpx-split-2.jpg%3Fc%3D16x9%26q%3Dw_850%2Cc_fill&f=1&nofb=1&ipt=984b6cf55cf55e1539003ca1c1beaa359625f6e5b08b511b3b018c9d2c959ae5&ipo=imagesg
https://upload.wikimedia.org/wikipedia/commons/thumb/e/e7/Chase_Oliver%2C_Jill_Stein_%26_Randall_Terry_%2853866448015%29.jpg/1280px-Chase_Oliver%2C_Jill_Stein_%26_Randall_Terry_%2853866448015%29.jpg
AUGUST UPDATE: This election season just gets crazier and crazier. You'll see new data related to the assassination attempt on former President Trump. There are transcripts of Secret Service hearings and an annotated image of the rally area.
https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F360751%2F75dd20a00c2ac6d81c6d6e1f83cbd941%2Fdonald-trump-rally-shooting-2024-113.webp?generation=1722800392288670&alt=media
JULY UPDATE: Added the transcript of the debate between Trump and Biden.
MAY UPDATE: Added some new polls and also a meta-poll assessing the quality of select pollsters.
APRIL UPDATE: The dataset now contains approval ratings for sitting presidents, which includes Biden and Trump.
MARCH UPDATE: As of last week, the presumptive nominees are Joe Biden (D) and Donald Trump (R). They also ran against each other in 2020. Robert F. Kennedy Jr. is running as an independent.
Presidential elections occur quadrennially in years evenly divisible by 4, on the first Tuesday after November 1. Presidential candidates from the major political parties usually declare their intentions to run as early as the spring of the previous calendar year before the election. The two major parties each nominate one candidate through a process of primary elections and nominating conventions during the election year. (source: Wikipedia)
This dataset contains data on candidates, primary/caucus results, polls, and debate transcripts. Updates and additional data will be added as the landscape develops.
Note: Version 3 of this dataset contains previous coverage of the 2022 Congressional Mid-term Elections.
Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Electoral registrations for parliamentary and local government elections as recorded in electoral registers for England, Wales, Scotland and Northern Ireland.
This is the dataset I used to figure out which sociodemographic factors, including the COVID-19 pandemic status of each state, had the most significant impact on the result of the 2020 US Presidential election. I also included sentiment scores of tweets created from 2020-10-15 to 2020-11-02, in order to figure out the effect of positive/negative emotion toward each candidate - Donald Trump and Joe Biden - on the result of the election.
Details for each variable are as follows:
- state: name of each state in the United States, including the District of Columbia
- elec16, elec20: dummy variables indicating whether Trump won the electoral votes of each state. If the electors cast their votes for Trump, the value is 1; otherwise the value is 0
- elecchange: dummy variable indicating whether the state's result in 2020 flipped to the other party compared with 2016
- demvote16: the share of votes that the Democratic candidate, i.e. Hillary Clinton, earned in the 2016 presidential election
- repvote16: the share of votes that the Republican candidate, i.e. Donald Trump, earned in the 2016 presidential election
- demvote20: the share of votes that the Democratic candidate, i.e. Joe Biden, earned in the 2020 presidential election
- repvote20: the share of votes that the Republican candidate, i.e. Donald Trump, earned in the 2020 presidential election
- demvotedif: the difference between demvote20 and demvote16
- repvotedif: the difference between repvote20 and repvote16
- pop: the population of each state
- cumulcases: the cumulative COVID-19 cases on Election Day
- caseMar ~ caseOct: the cumulative COVID-19 cases during each month
- Marper10k ~ Octper10k: the cumulative COVID-19 cases during each month per 10,000 people
- unemp20: the unemployment rate of each state in 2020 before the election
- unempdif: the difference between the unemployment rate of 2019 and that of 2020
- jan20unemp ~ oct20unemp: the unemployment rate of each month
- cumulper10k: the cumulative COVID-19 cases on Election Day per 10,000 people
- b_str_poscount_total: the total number of positive tweets on Biden measured by SentiStrength
- b_str_negcount_total: the total number of negative tweets on Biden measured by SentiStrength
- t_str_poscount_total: the total number of positive tweets on Trump measured by SentiStrength
- t_str_negcount_total: the total number of negative tweets on Trump measured by SentiStrength
- b_str_posprop_total: the proportion of positive tweets on Biden measured by SentiStrength
- b_str_negprop_total: the proportion of negative tweets on Biden measured by SentiStrength
- t_str_posprop_total: the proportion of positive tweets on Trump measured by SentiStrength
- t_str_negprop_total: the proportion of negative tweets on Trump measured by SentiStrength
- white: the proportion of white people
- colored: the proportion of people of color
- secondary: the proportion of people who have attained secondary education
- tertiary: the proportion of people who have attained tertiary education
- q3gdp20: GDP of the 3rd quarter of 2020
- q3gdprate: the GDP growth rate of the 3rd quarter of 2020, compared with the same quarter of 2019
- 3qsgdp20: GDP of 3 quarters of 2020
- 3qsrate20: the growth rate of GDP compared with that of the 3 quarters of 2019
- q3gdpdif: the difference in the level of GDP of the 3rd quarter compared with the previous quarter
- q3rate: the growth rate of the 3rd quarter compared with the previous quarter
- access: the proportion of households with Internet access
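The derived columns in the list above follow directly from the base columns. The sketch below shows one way they could be recomputed from a single state record; it assumes vote shares are stored as fractions and that each record is a plain dict, neither of which the description confirms. The function name `derive_fields` and the sample numbers are hypothetical.

```python
def derive_fields(row):
    """Recompute the derived columns for one state record (a dict)."""
    return {
        # change in each party's vote share between 2016 and 2020
        "demvotedif": row["demvote20"] - row["demvote16"],
        "repvotedif": row["repvote20"] - row["repvote16"],
        # 1 if the electoral outcome differs between 2016 and 2020, else 0
        "elecchange": int(row["elec16"] != row["elec20"]),
        # cumulative cases on Election Day per 10,000 residents
        "cumulper10k": row["cumulcases"] * 10_000 / row["pop"],
    }


# Made-up numbers for illustration only; they are not taken from the dataset.
example = {
    "elec16": 1, "elec20": 0,
    "demvote16": 0.45, "demvote20": 0.50,
    "repvote16": 0.50, "repvote20": 0.49,
    "pop": 1_000_000, "cumulcases": 25_000,
}
derived = derive_fields(example)
```

Recomputing these fields is a cheap consistency check before modelling: a mismatch with the stored values would suggest a different unit (for example percentages rather than fractions).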
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Our code provides a tool for researchers using any part of the integrated datasets of the European Social Survey (European Social Survey Cumulative File, ESS 1-9, 2020) project to easily differentiate between respondents based on their political affiliation, dividing them into pro-government and pro-opposition groups. Individuals are coded as “government supporters”, “opposition supporters” and “non-identifiers” according to their survey response; refusals are excluded. The database includes data for 422 985 respondents from eight data rounds between 2002 and 2020 from 33 European countries, organized into 215 country-years in total.
There are two data files attached.
a. The variable “votedforwinner” differentiates between government voters (1), opposition voters (0) and non-voters (missing values); thus it defines the government-opposition status of European voters based on their vote in the previous election.
b. The variable “closetowinner” differentiates between government partisans (1), opposition partisans (0) and non-partisans (missing values); thus it defines the government-opposition status of European party identifiers based on their partisan attachment.
c. The variable “cseqno” is a unique identification number for European Social Survey (ESS) respondents included in the integrated data sets of the ESS project.
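The 1 / 0 / missing coding used by “votedforwinner” and “closetowinner” can be illustrated with a short sketch. This is not the authors' code: the function name `code_against_cabinet`, the party labels, and the idea of passing the cabinet as a set of party names are assumptions made for illustration. It also assumes refusals have already been dropped.

```python
from typing import Optional, Set


def code_against_cabinet(party: Optional[str], cabinet: Set[str]) -> Optional[int]:
    """Return 1 for a government party, 0 for an opposition party,
    and None (missing) when the respondent named no party."""
    if party is None:
        return None  # non-voter or non-partisan
    return 1 if party in cabinet else 0


# Hypothetical cabinet composition at the time of one interview.
cabinet_at_interview = {"Party A", "Party B"}

government = code_against_cabinet("Party A", cabinet_at_interview)
opposition = code_against_cabinet("Party C", cabinet_at_interview)
missing = code_against_cabinet(None, cabinet_at_interview)
```

The same function would serve both variables: pass the party voted for to obtain a “votedforwinner”-style value, or the party the respondent feels close to for a “closetowinner”-style value.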
The European Government-Opposition Voters Data Set was produced using the following pieces of information from the European Social Survey Cumulative File (ESS 1-9, 2020), the Comparative Political Data Sets (CPDS; Armingeon, Isler, Knöpfel, Weisstanner, et al., 2016) and the ParlGov (Döring and Manow, 2019) data sets:
- partisan preferences, that is, respondents’ vote on the last general election (164 variables, ESS) and respondents’ partisan identity (167 variables, ESS)
- date of the interview (year, month, day, ESS)
- date of national elections and investitures in each country-case (CPDS and ParlGov)
- cabinet composition (CPDS and ParlGov)
- official sites on information on national elections for clarification, if necessary
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Description:
This dataset combines data from three sources to provide a comprehensive overview of county-level socioeconomic indicators, educational attainment, and voting outcomes in the United States. The dataset includes variables such as unemployment rates, median household income, urban influence codes, education levels, and voting percentages for the 2020 U.S. presidential election. By integrating this data, the dataset enables analysis of how factors like income, education, and unemployment correlate with political preferences, offering insights into regional voting behaviors across the country.
References:
The following reference datasets were used to construct this dataset.
[1] Harvard Dataverse, Voting Data Set by County. Available: https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/VOQCHQ
[2] USDA Economic Research Service, Educational Attainment and Unemployment Data. Available: https://www.ers.usda.gov/data-products/county-level-data-sets/county-level-data-sets-download-data/
http://opendatacommons.org/licenses/dbcl/1.0/
The dataset comes from The Center for Public Integrity. You can read more about the data and how it was collected in their September 2020 article "National data release sheds light on past polling place changes".
Note: Some states do not have data in this dataset. Several states (Colorado, Hawaii, Oregon, Washington and Utah) vote primarily by mail and have little or no data in this collection, and others were not available for other reasons.
For states with data for multiple elections, how have polling location counts per county changed over time?
| variable | class | description |
| --- | --- | --- |
| election_date | date | date of the election as YYYY-MM-DD |
| state | character | 2-letter abbreviation of the state |
| county_name | character | county name, if available |
| jurisdiction | character | jurisdiction, if available |
| jurisdiction_type | character | type of jurisdiction, if available; one of "county", "borough", "town", "municipality", "city", "parish", or "county_municipality" |
| precinct_id | character | unique ID of the precinct, if available |
| precinct_name | character | name of the precinct, if available |
| polling_place_id | character | unique ID of the polling place, if available |
| location_type | character | type of polling location, if available; one of "early_vote", "early_vote_site", "election_day", "polling_location", "polling_place", or "vote_center" |
| name | character | name of the polling place, if available |
| address | character | address of the polling place, if available |
| notes | character | optional notes about the polling place |
| source | character | source of the polling place data; one of "ORR", "VIP", "website", or "scraper" |
| source_date | date | date that the source was compiled |
| source_notes | character | optional notes about the source |
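One possible way to approach the question of how polling location counts per county changed over time is sketched below. It assumes each row of the data describes one polling place and uses only the `state`, `county_name` and `election_date` columns from the schema above. The helper names and the sample rows are hypothetical, and rows are represented as plain dicts rather than any particular data-frame type.

```python
from collections import Counter, defaultdict


def polling_place_counts(rows):
    """Count rows per (state, county, election) and group them by county."""
    counts = Counter(
        (r["state"], r["county_name"], r["election_date"]) for r in rows
    )
    by_county = defaultdict(list)
    for (state, county, date), n in counts.items():
        by_county[(state, county)].append((date, n))
    # YYYY-MM-DD strings sort chronologically
    return {key: sorted(series) for key, series in by_county.items()}


def changes(series):
    """Difference in count between each election and the one before it."""
    return [(d2, n2 - n1) for (_, n1), (d2, n2) in zip(series, series[1:])]


# Made-up rows for illustration only.
sample = (
    [{"state": "XX", "county_name": "EXAMPLE", "election_date": "2016-11-08"}] * 3
    + [{"state": "XX", "county_name": "EXAMPLE", "election_date": "2018-11-06"}] * 2
)
counts = polling_place_counts(sample)
delta = changes(counts[("XX", "EXAMPLE")])
```

Counting rows is a simplification: where `location_type` mixes early-vote sites with election-day polling places, filtering on that column first may give a fairer comparison across elections.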
This dataset lists the total population 18 years and older by census block in Connecticut before and after population adjustments were made pursuant to Public Act 21-13. PA 21-13 creates a process to adjust the U.S. Census Bureau population data to allow for most individuals who are incarcerated to be counted at their address before incarceration. Prior to enactment of the act, these inmates were counted at their correctional facility address.
The act requires the CT Office of Policy and Management (OPM) to prepare and publish the adjusted and unadjusted data by July 1 in the year after the U.S. census is taken or 30 days after the U.S. Census Bureau’s publication of the state’s data.
A report documenting the population adjustment process was prepared by a team at OPM composed of the Criminal Justice Policy and Planning Division (OPM CJPPD) and the Data and Policy Analytics (DAPA) unit. The report is available here: https://portal.ct.gov/-/media/OPM/CJPPD/CjAbout/SAC-Documents-from-2021-2022/PA21-13_OPM_Summary_Report_20210921.pdf
Note: On September 21, 2021, following the initial publication of the report, OPM and DOC revised the count of juveniles, reallocating 65 eighteen-year-old individuals who were incorrectly designated as being under age 18. After the DOC released the updated data to OPM, the report and this dataset were updated to reflect the revision.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset shows a record of the cumulative number of people who have participated in advance voting for the August 11 runoff election in each county of Georgia.
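Because the figures are cumulative, per-day turnout has to be recovered by differencing consecutive totals. A minimal sketch follows, assuming one county's totals are available as a date-ordered list; the function name and the numbers are hypothetical.

```python
def daily_from_cumulative(cumulative):
    """Convert a date-ordered list of running totals into per-day counts."""
    daily = []
    previous = 0
    for total in cumulative:
        daily.append(total - previous)
        previous = total
    return daily


# Made-up running totals for one county, for illustration only.
daily = daily_from_cumulative([100, 250, 400])
```

A negative value in the result would point to a correction in the source data rather than a real decrease, since a cumulative count should never fall.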
The 119th Congressional Districts dataset reflects boundaries from January 3rd, 2025 from the United States Census Bureau (USCB), and the attributes are updated every Sunday from the United States House of Representatives and is part of the U.S. Department of Transportation (USDOT)/Bureau of Transportation Statistics (BTS) National Transportation Atlas Database (NTAD). The TIGER/Line shapefiles and related database files (.dbf) are an extract of selected geographic and cartographic information from the U.S. Census Bureau's Master Address File / Topologically Integrated Geographic Encoding and Referencing (MAF/TIGER) Database (MTDB). The MTDB represents a seamless national file with no overlaps or gaps between parts, however, each TIGER/Line shapefile is designed to stand alone as an independent data set, or they can be combined to cover the entire nation. Information for each member of Congress is appended to the Census Congressional District shapefile using information from the Office of the Clerk, U.S. House of Representatives' website https://clerk.house.gov/xml/lists/MemberData.xml and its corresponding XML file. Congressional districts are the 435 areas from which people are elected to the U.S. House of Representatives. This dataset also includes 9 geographies for non-voting at large delegate districts, resident commissioner districts, and congressional districts that are not defined. After the apportionment of congressional seats among the states based on census population counts, each state is responsible for establishing congressional districts for the purpose of electing representatives. Each congressional district is to be as equal in population to all other congressional districts in a state as practicable. The 119th Congress is seated from January 3, 2025 through January 3, 2027. In Connecticut, Illinois, and New Hampshire, the Redistricting Data Program (RDP) participant did not define the CDs to cover all of the state or state equivalent area. 
In these areas with no CDs defined, the code "ZZ" has been assigned, which is treated as a single CD for purposes of data presentation. The TIGER/Line shapefiles for the District of Columbia, Puerto Rico, and the Island Areas (American Samoa, Guam, the Commonwealth of the Northern Mariana Islands, and the U.S. Virgin Islands) each contain a single record for the non-voting delegate district in these areas. The boundaries of all other congressional districts reflect information provided to the Census Bureau by the states by May 31, 2024. A data dictionary, or other source of attribute information, is accessible at https://doi.org/10.21949/1529006