PROBLEM AND OPPORTUNITY

In the United States, voting is largely a private matter. A registered voter is given a randomized ballot form or machine to prevent linkage between their voting choices and their identity. This disconnect supports confidence in the election process, but it creates obstacles to analyzing an election. A common solution is to field exit polls, interviewing voters immediately after they leave their polling location. This method is rife with bias, however, and limited in the demographic data it collects directly.

For the 2020 general election, though, most states published their election results for each voting location. These publications were accompanied by the geographic areas assigned to each location, the voting precincts. As a result, geographic processing can now be applied to project precinct election results onto Census block groups. While precincts have few demographic traits directly, their geographies have characteristics that make them projectable onto U.S. Census geographies. Both state voting precincts and U.S. Census block groups:

- are exclusive, and do not overlap
- are adjacent, fully covering their corresponding state and potentially county
- have roughly the same size in area, population and voter presence

Analytically, a projection of local demographics does not allow conclusions about voters themselves. However, the dataset does allow statements about the geographies that yield voting behavior. One could say, for example, that an area dominated by a particular voting pattern would have mean traits of age, race, income or household structure.

The dataset that results from this programming provides voting results allocated by Census block groups. The block group identifier can be joined to Census Decennial and American Community Survey demographic estimates.

DATA SOURCES

The state election results and geographies have been compiled by the Voting and Election Science Team on Harvard's Dataverse.
State voting precincts lie within state and county boundaries. The Census Bureau, on the other hand, publishes its estimates across a variety of geographic definitions including a hierarchy of states, counties, census tracts and block groups. Their definitions can be found here. The geometric shapefiles for each block group are available here. The lowest level of this geography changes often and can become obsolete before the next census survey (Decennial or American Community Survey programs). The second-lowest census level, block groups, offers both granularity and stability. The 2020 Decennial survey divides US demographics into 217,740 block groups with between a few hundred and a few thousand people.

Dataset Structure

The dataset's columns include:

- BLOCKGROUP_GEOID: 12-digit primary key. Census GEOID of the block group row. This code concatenates the 2-digit state, the 3-digit county within state, the 6-digit Census Tract identifier and the 1-digit Census Block Group identifier within tract
- STATE: State abbreviation, redundant with the 2-digit state FIPS code above
- REP: Votes for Republican party candidate for president
- DEM: Votes for Democratic party candidate for president
- LIB: Votes for Libertarian party candidate for president
- OTH: Votes for presidential candidates other than Republican, Democratic or Libertarian
- AREA: Square kilometers of area associated with this block group
- GAP: Total area of the block group, net of area attributed to voting precincts
- PRECINCTS: Number of voting precincts that intersect this block group

ASSUMPTIONS, NOTES AND CONCERNS:

- Votes are attributed based upon the proportion of the precinct's area that intersects the corresponding block group. Alternative methods are left to the analyst's initiative.
- The 50 states and the District of Columbia are in scope, as these are the U.S. jurisdictions that vote in the general election for the U.S. presidency.
- Three states did not report their results at the precinct level: South Dakota, Kentucky and West Virginia. A dummy block group is added for each of these states to maintain national totals. These states represent 2.1% of all votes cast.
- Counties are commonly coded using FIPS codes. However, each election result file may name the county field differently. Also, Delaware, Massachusetts, Alaska and the District of Columbia do not share county definitions.
- Block groups may be used to capture geographies that do not have population, like bodies of water. As a result, block groups without intersecting voting precincts are not uncommon.
- In the U.S., elections are administered at a state level, with the Federal Election Commission compiling state totals against the Electoral College weights. The states have liberty, though, to define and change their own voting precincts (https://en.wikipedia.org/wiki/Electoral_precinct).
- The Census Bureau...

Visit https://dataone.org/datasets/sha256%3A05707c1dc04a814129f751937a6ea56b08413546b18b351a85bc96da16a7f8b5 for complete metadata about this dataset.
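The area-proportional attribution described in the notes above can be illustrated with a minimal sketch. It is not the dataset's actual processing code: real precincts and block groups are arbitrary polygons, while this example assumes axis-aligned rectangles so that it needs no GIS library. The names `Rect`, `split_geoid` and `allocate_votes` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle standing in for a real precinct or block group polygon."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float

    @property
    def area(self) -> float:
        return max(0.0, self.xmax - self.xmin) * max(0.0, self.ymax - self.ymin)

    def intersection_area(self, other: "Rect") -> float:
        width = min(self.xmax, other.xmax) - max(self.xmin, other.xmin)
        height = min(self.ymax, other.ymax) - max(self.ymin, other.ymin)
        return width * height if width > 0 and height > 0 else 0.0


def split_geoid(geoid: str) -> dict:
    """Split a 12-digit block group GEOID into state, county, tract and block group."""
    if len(geoid) != 12 or not geoid.isdigit():
        raise ValueError("expected a 12-digit block group GEOID")
    return {
        "state": geoid[:2],
        "county": geoid[2:5],
        "tract": geoid[5:11],
        "block_group": geoid[11],
    }


def allocate_votes(precincts, block_groups):
    """Attribute each precinct's votes to block groups by share of precinct area.

    precincts: list of (Rect, {"REP": votes, "DEM": votes, ...})
    block_groups: dict mapping GEOID -> Rect
    """
    result = {geoid: {} for geoid in block_groups}
    for shape, votes in precincts:
        if shape.area == 0:
            continue  # a degenerate precinct has no area to apportion
        for geoid, block_group in block_groups.items():
            share = shape.intersection_area(block_group) / shape.area
            if share == 0:
                continue
            for party, count in votes.items():
                result[geoid][party] = result[geoid].get(party, 0.0) + count * share
    return result
```

In this sketch, a precinct that lies half in one block group and half in another contributes half of each party's votes to each. Allocated votes are fractional, so an analyst would decide separately whether and how to round them.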
This web map displays data from the voter registration database as the percent of registered voters by census tract in King County, Washington. The data for this web map is compiled from King County Elections voter registration data for the years 2013-2019. The total number of registered voters is based on the geo-location of the voter's registered address at the time of the general election for each year.

The eligible voting population, age 18 and over, is based on the estimated population increase from the US Census Bureau and the Washington Office of Financial Management. It was calculated as a projected population increase of 6 percent for the years 2010-2013, 7 percent for the years 2010-2014, 9 percent for the years 2010-2015, 11 percent for the years 2010-2016 and 2017, 14 percent for the years 2010-2018 and 17 percent for the years 2010-2019. The total population 18 and over in 2010 was 1,517,747 in King County, Washington. The percentage of registered voters represents the number of people who are registered to vote as compared to the eligible voting population, age 18 and over.

The voter registration data by census tract was grouped into six percentage range estimates: 50% or below, 51-60%, 61-70%, 71-80%, 81-90% and 91% or above, with an overall 84 percent registration rate. In the map, the lighter colors represent a relatively low percentage range of voter registration and the darker colors represent a relatively high percentage range of voter registration. PDF maps of these data can be viewed at King County Elections downloadable voter registration maps.

The 2019 General Election Voter Turnout layer is voter turnout data by historical precinct boundaries for the corresponding year. The data is grouped into six percentage ranges: 0-30%, 31-40%, 41-50%, 51-60%, 61-70% and 71-100%. The lighter colors represent lower turnout and the darker colors represent higher turnout.

The King County Demographics Layer is census data for language, income, poverty, race and ethnicity at the census tract level and is based on the 2010-2014 American Community Survey 5 year Average provided by the United States Census Bureau. Because the data is based on a survey, the values are estimates and should be used with that understanding. The demographic data sets were developed and are maintained by King County staff to support the King County Equity and Social Justice program. Other data for this map is located in the King County GIS Spatial Data Catalog, where data is managed by the King County GIS Center, a multi-department enterprise GIS in King County, Washington.

King County has nearly 1.3 million registered voters and is the largest jurisdiction in the United States to conduct all elections by mail. In the map you can view the percent of registered voters by census tract, compare registration within political districts, compare registration and demographic data, verify your voter registration or register to vote through a link to the VoteWA, Washington State Online Voter Registration web page.
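The eligible-population projection and the six registration ranges described for this map can be sketched as follows. This is an illustration under stated assumptions, not King County's own calculation: it reads "2010-2016 & 2017" as 11 percent for both 2016 and 2017, rounds percentages to whole numbers before assigning a range, and uses the hypothetical function names `projected_eligible`, `registration_pct` and `registration_range`.

```python
# 2010 population aged 18 and over in King County, as stated in the description.
BASE_POP_18_PLUS_2010 = 1_517_747

# Projected percent increase since 2010 for each year, as stated in the description.
# Assumption: the 11 percent figure applies to both 2016 and 2017.
GROWTH_PCT_SINCE_2010 = {2013: 6, 2014: 7, 2015: 9, 2016: 11, 2017: 11, 2018: 14, 2019: 17}


def projected_eligible(year: int) -> int:
    """Projected county-wide eligible voting population (age 18+) for a year."""
    return round(BASE_POP_18_PLUS_2010 * (100 + GROWTH_PCT_SINCE_2010[year]) / 100)


def registration_pct(registered: int, eligible: int) -> float:
    """Registered voters as a percentage of the eligible population."""
    return 100 * registered / eligible


def registration_range(pct: float) -> str:
    """Assign a percentage to one of the six map ranges (rounded to a whole percent)."""
    whole = round(pct)
    if whole <= 50:
        return "50% or below"
    if whole >= 91:
        return "91% or above"
    lower = ((whole - 1) // 10) * 10 + 1
    return f"{lower}-{lower + 9}%"
```

For example, `projected_eligible(2013)` applies the 6 percent increase to the 2010 base, and `registration_range(84)` places the overall 84 percent rate in the 81-90% range. Tract-level figures would need tract-level eligible populations, which the description does not provide.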
According to exit polling in ten key states of the 2024 presidential election in the United States, 46 percent of voters with a 2023 household income of 30,000 U.S. dollars or less reported voting for Donald Trump. In comparison, 51 percent of voters with a total family income of 100,000 to 199,999 U.S. dollars reported voting for Kamala Harris.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description:
This dataset combines data from three sources to provide a comprehensive overview of county-level socioeconomic indicators, educational attainment, and voting outcomes in the United States. The dataset includes variables such as unemployment rates, median household income, urban influence codes, education levels, and voting percentages for the 2020 U.S. presidential election. By integrating this data, the dataset enables analysis of how factors like income, education, and unemployment correlate with political preferences, offering insights into regional voting behaviors across the country.
References:
The following reference datasets were used to construct this dataset.
[1] Harvard Dataverse, Voting Data Set by County. Available: https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/VOQCHQ
[2] USDA Economic Research Service, Educational Attainment and Unemployment Data. Available: https://www.ers.usda.gov/data-products/county-level-data-sets/county-level-data-sets-download-data/
AP VoteCast is a survey of the American electorate conducted by NORC at the University of Chicago for Fox News, NPR, PBS NewsHour, Univision News, USA Today Network, The Wall Street Journal and The Associated Press.
AP VoteCast combines interviews with a random sample of registered voters drawn from state voter files with self-identified registered voters selected using nonprobability approaches. In general elections, it also includes interviews with self-identified registered voters conducted using NORC’s probability-based AmeriSpeak® panel, which is designed to be representative of the U.S. population.
Interviews are conducted in English and Spanish. Respondents may receive a small monetary incentive for completing the survey. Participants selected as part of the random sample can be contacted by phone and mail and can take the survey by phone or online. Participants selected as part of the nonprobability sample complete the survey online.
In the 2020 general election, the survey of 133,103 interviews with registered voters was conducted between Oct. 26 and Nov. 3, concluding as polls closed on Election Day. AP VoteCast delivered data about the presidential election in all 50 states as well as all Senate and governors’ races in 2020.
This is survey data and must be properly weighted during analysis: DO NOT REPORT THIS DATA AS RAW OR AGGREGATE NUMBERS!!
Instead, use statistical software such as R or SPSS to weight the data.
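To illustrate why weighting matters, here is a minimal sketch of a weighted share next to its unweighted counterpart. The field names `vote` and `weight` are hypothetical, since the actual AP VoteCast variable names are not given in this description, and the three respondents are made up for the example.

```python
def weighted_share(rows, answer_key, weight_key, answer):
    """Share of the total survey weight held by respondents giving `answer`."""
    total = sum(row[weight_key] for row in rows)
    if total <= 0:
        raise ValueError("total weight must be positive")
    matching = sum(row[weight_key] for row in rows if row[answer_key] == answer)
    return matching / total


# Made-up respondents: two of three chose "A", but they carry 2.5 of 4.0 total weight.
sample = [
    {"vote": "A", "weight": 0.5},
    {"vote": "B", "weight": 1.5},
    {"vote": "A", "weight": 2.0},
]
unweighted = sum(1 for row in sample if row["vote"] == "A") / len(sample)
weighted = weighted_share(sample, "vote", "weight", "A")
```

Here the unweighted share of "A" is two thirds while the weighted share is 0.625, which is the kind of gap that makes raw counts misleading.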
National Survey
The national AP VoteCast survey of voters and nonvoters in 2020 is based on the results of the 50 state-based surveys and a nationally representative survey of 4,141 registered voters conducted between Nov. 1 and Nov. 3 on the probability-based AmeriSpeak panel. It included 41,776 probability interviews completed online and via telephone, and 87,186 nonprobability interviews completed online. The margin of sampling error is plus or minus 0.4 percentage points for voters and 0.9 percentage points for nonvoters.
State Surveys
In 20 states in 2020, AP VoteCast is based on roughly 1,000 probability-based interviews conducted online and by phone, and roughly 3,000 nonprobability interviews conducted online. In these states, the margin of sampling error is about plus or minus 2.3 percentage points for voters and 5.5 percentage points for nonvoters.
In an additional 20 states, AP VoteCast is based on roughly 500 probability-based interviews conducted online and by phone, and roughly 2,000 nonprobability interviews conducted online. In these states, the margin of sampling error is about plus or minus 2.9 percentage points for voters and 6.9 percentage points for nonvoters.
In the remaining 10 states, AP VoteCast is based on about 1,000 nonprobability interviews conducted online. In these states, the margin of sampling error is about plus or minus 4.5 percentage points for voters and 11.0 percentage points for nonvoters.
Although there is no statistically agreed upon approach for calculating margins of error for nonprobability samples, these margins of error were estimated using a measure of uncertainty that incorporates the variability associated with the poll estimates, as well as the variability associated with the survey weights as a result of calibration. After calibration, the nonprobability sample yields approximately unbiased estimates.
As with all surveys, AP VoteCast is subject to multiple sources of error, including from sampling, question wording and order, and nonresponse.
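The description does not spell out how the margins of error were computed. As a rough illustration of how variability in survey weights can widen a margin of error, the sketch below uses Kish's approximate design effect, a common textbook approximation. It is an assumption chosen for illustration and is not presented as the AP VoteCast method.

```python
import math


def kish_design_effect(weights):
    """Kish's approximation: n * sum(w^2) / (sum(w))^2. Equals 1.0 for equal weights."""
    n = len(weights)
    total = sum(weights)
    return n * sum(w * w for w in weights) / (total * total)


def margin_of_error(p, weights, z=1.96):
    """Approximate margin of error for a proportion p, inflated by the design effect."""
    n = len(weights)
    deff = kish_design_effect(weights)
    return z * math.sqrt(deff * p * (1 - p) / n)
```

With equal weights the design effect is 1 and the formula reduces to the familiar simple-random-sample margin; the more the weights vary, the larger the margin becomes.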
Sampling Details
Probability-based Registered Voter Sample
In each of the 40 states in which AP VoteCast included a probability-based sample, NORC obtained a sample of registered voters from Catalist LLC’s registered voter database. This database includes demographic information, as well as addresses and phone numbers for registered voters, allowing potential respondents to be contacted via mail and telephone. The sample is stratified by state, partisanship, and a modeled likelihood to respond to the postcard based on factors such as age, race, gender, voting history, and census block group education. In addition, NORC attempted to match sampled records to a registered voter database maintained by L2, which provided additional phone numbers and demographic information.
Prior to dialing, all probability sample records were mailed a postcard inviting them to complete the survey either online using a unique PIN or via telephone by calling a toll-free number. Postcards were addressed by name to the sampled registered voter if that individual was under age 35; postcards were addressed to “registered voter” in all other cases. Telephone interviews were conducted with the adult that answered the phone following confirmation of registered voter status in the state.
Nonprobability Sample
Nonprobability participants include panelists from Dynata or Lucid, including members of their third-party panels. In addition, some registered voters were selected from the voter file, matched to email addresses by V12, and recruited via an email invitation to the survey. Digital fingerprint software and panel-level ID validation are used to prevent respondents from completing the AP VoteCast survey multiple times.
AmeriSpeak Sample
During the initial recruitment phase of the AmeriSpeak panel, randomly selected U.S. households were sampled with a known, non-zero probability of selection from the NORC National Sample Frame and then contacted by mail, email, telephone and field interviewers (face-to-face). The panel provides sample coverage of approximately 97% of the U.S. household population. Those excluded from the sample include people with P.O. Box-only addresses, some addresses not listed in the U.S. Postal Service Delivery Sequence File and some newly constructed dwellings. Registered voter status was confirmed in field for all sampled panelists.
Weighting Details
AP VoteCast employs a four-step weighting approach that combines the probability sample with the nonprobability sample and refines estimates at a subregional level within each state. In a general election, the 50 state surveys and the AmeriSpeak survey are weighted separately and then combined into a survey representative of voters in all 50 states.
State Surveys
First, weights are constructed separately for the probability sample (when available) and the nonprobability sample for each state survey. These weights are adjusted to population totals to correct for demographic imbalances in age, gender, education and race/ethnicity of the responding sample compared to the population of registered voters in each state. In 2020, the adjustment targets are derived from a combination of data from the U.S. Census Bureau’s November 2018 Current Population Survey Voting and Registration Supplement, Catalist’s voter file and the Census Bureau’s 2018 American Community Survey. Prior to adjusting to population totals, the probability-based registered voter list sample weights are adjusted for differential non-response related to factors such as availability of phone numbers, age, race and partisanship.
Second, all respondents receive a calibration weight. The calibration weight is designed to ensure the nonprobability sample is similar to the probability sample in regard to variables that are predictive of vote choice, such as partisanship or direction of the country, which cannot be fully captured through the prior demographic adjustments. The calibration benchmarks are based on regional level estimates from regression models that incorporate all probability and nonprobability cases nationwide.
Third, all respondents in each state are weighted to improve estimates for substate geographic regions. This weight combines the weighted probability (if available) and nonprobability samples, and then uses a small area model to improve the estimate within subregions of a state.
Fourth, the survey results are weighted to the actual vote count following the completion of the election. This weighting is done in 10–30 subregions within each state.
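The fourth step, weighting to the actual vote count within subregions, can be illustrated with a simple post-stratification sketch. The description does not give the exact procedure, so this is an assumption: each respondent's weight is scaled by the ratio of the candidate's actual vote share to the candidate's weighted sample share in that subregion. The field names `subregion`, `vote` and `weight` and the function name `calibrate_to_vote` are hypothetical.

```python
from collections import defaultdict


def calibrate_to_vote(respondents, actual_votes):
    """Rescale weights so weighted vote shares match actual shares in each subregion.

    respondents: list of dicts with "subregion", "vote" and "weight"
    actual_votes: {subregion: {candidate: vote_count}}
    """
    cell_weight = defaultdict(float)    # weight per (subregion, candidate)
    region_weight = defaultdict(float)  # weight per subregion
    for r in respondents:
        cell_weight[(r["subregion"], r["vote"])] += r["weight"]
        region_weight[r["subregion"]] += r["weight"]

    adjusted = []
    for r in respondents:
        sub, cand = r["subregion"], r["vote"]
        actual_share = actual_votes[sub][cand] / sum(actual_votes[sub].values())
        sample_share = cell_weight[(sub, cand)] / region_weight[sub]
        adjusted.append({**r, "weight": r["weight"] * actual_share / sample_share})
    return adjusted
```

When every candidate with actual votes appears in the subregion's sample, the total weight in the subregion is unchanged and only its split between candidates moves. Nonvoters would need separate handling, which this sketch leaves out.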
National Survey
In a general election, the national survey is weighted to combine the 50 state surveys with the nationwide AmeriSpeak survey. Each of the state surveys is weighted as described. The AmeriSpeak survey receives a nonresponse-adjusted weight that is then adjusted to national totals for registered voters that in 2020 were derived from the U.S. Census Bureau’s November 2018 Current Population Survey Voting and Registration Supplement, the Catalist voter file and the Census Bureau’s 2018 American Community Survey. The state surveys are further adjusted to represent their appropriate proportion of the registered voter population for the country and combined with the AmeriSpeak survey. After all votes are counted, the national data file is adjusted to match the national popular vote for president.
https://creativecommons.org/publicdomain/zero/1.0/
This file shows all election related data by state and county (i.e. total votes, Republican votes, Democratic votes, Republican voting percentage, Democratic voting percentage) for both the 2020 and 2024 U.S. Presidential Elections.
https://www.usa.gov/government-works/
Why Millions Of Americans Don’t Vote
Data presented here comes from polling done by Ipsos for FiveThirtyEight, using Ipsos’s KnowledgePanel, a probability-based online panel that is recruited to be representative of the U.S. population. The poll was conducted from Sept. 15 to Sept. 25 among a sample of U.S. citizens that oversampled young, Black and Hispanic respondents, with 8,327 respondents, and was weighted according to general population benchmarks for U.S. citizens from the U.S. Census Bureau’s Current Population Survey March 2019 Supplement. The voter file company Aristotle then matched respondents to a voter file to more accurately understand their voting history using the panelist’s first name, last name, zip code, and eight characters of their address, using the National Change of Address program if applicable. Sixty-four percent of the sample (5,355 respondents) matched, although we also included respondents who did not match the voter file but described themselves as voting “rarely” or “never” in our survey, so as to avoid underrepresenting nonvoters, who are less likely to be included in the voter file to begin with. We dropped respondents who were only eligible to vote in three elections or fewer. We defined those who almost always vote as those who voted in all (or all but one) of the national elections (presidential and midterm) they were eligible to vote in since 2000; those who vote sometimes as those who voted in at least two elections, but fewer than all the elections they were eligible to vote in (or all but one); and those who rarely or never vote as those who voted in no elections, or just one.
The data included here is the final sample we used: 5,239 respondents who matched to the voter file and whose verified vote history we have, and 597 respondents who did not match to the voter file and described themselves as voting "rarely" or "never," all of whom have been eligible for at least 4 elections.
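The three voter categories defined above can be expressed as a small function. This is a sketch of the stated rules applied to a verified vote history; the function name `voter_category` and the category labels are hypothetical, and respondents who did not match the voter file were classified from their own survey answers instead.

```python
def voter_category(elections_voted: int, elections_eligible: int):
    """Classify a respondent by verified vote history in national elections since 2000.

    Returns None for respondents eligible in three elections or fewer,
    who were dropped from the sample.
    """
    if not 0 <= elections_voted <= elections_eligible:
        raise ValueError("elections_voted must be between 0 and elections_eligible")
    if elections_eligible <= 3:
        return None
    if elections_voted >= elections_eligible - 1:
        return "almost always"    # voted in all, or all but one
    if elections_voted >= 2:
        return "sometimes"        # at least two, but fewer than all but one
    return "rarely or never"      # none, or just one
```

For someone eligible in ten elections, voting in nine or ten counts as "almost always", two through eight as "sometimes", and zero or one as "rarely or never".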
Open Database License (ODbL) v1.0: https://www.opendatacommons.org/licenses/odbl/1.0/
License information was derived automatically
The Voter Participation indicator presents voter turnout in Champaign County as a percentage, calculated using two different methods.
In the first method, the voter turnout percentage is calculated using the number of ballots cast compared to the total population in the county that is eligible to vote. In the second method, the voter turnout percentage is calculated using the number of ballots cast compared to the number of registered voters in the county.
Since both methods are in use by other agencies, and since there are real differences in the figures that both methods return, we have provided the voter participation rate for Champaign County using each method.
Voter participation is a solid illustration of a community’s engagement in the political process at the federal and state levels. One can infer a high level of political engagement from high voter participation rates.
The voter participation rate calculated using the total eligible population is consistently lower than the voter participation rate calculated using the number of registered voters, since the number of registered voters is smaller than the total eligible population.
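The two calculation methods can be compared side by side in a short sketch. The figures in the usage line are invented for illustration and are not Champaign County data; the function name `turnout_rates` is hypothetical.

```python
def turnout_rates(ballots_cast: int, eligible_population: int, registered_voters: int) -> dict:
    """Voter turnout as a percentage under both methods described above."""
    return {
        "by_eligible_population": 100 * ballots_cast / eligible_population,
        "by_registered_voters": 100 * ballots_cast / registered_voters,
    }


# Invented figures: 60,000 ballots, 150,000 eligible residents, 120,000 registered voters.
example = turnout_rates(60_000, 150_000, 120_000)
```

With these invented figures the rate is 40 percent of the eligible population but 50 percent of registered voters, showing how the smaller denominator produces the higher rate.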
There are consistent trends in both sets of data: the voter participation rate, no matter how it is calculated, shows large spikes in presidential election years (e.g., 2008, 2012, 2016, 2020, 2024) and smaller spikes in intermediary even years (e.g., 2010, 2014, 2018, 2022). The lowest levels of voter participation can be seen in odd years (e.g., 2015, 2017, 2019, 2021, 2023).
This data primarily comes from the election results resources on the Champaign County Clerk website. Election results resources from Champaign County include the number of ballots cast and the number of registered voters. The results are published frequently, following each election.
Data on the total eligible population for Champaign County was sourced from the U.S. Census Bureau, using American Community Survey (ACS) 1-Year Estimates for each year starting in 2005, when the American Community Survey was created. The estimates are released annually by the Census Bureau.
Due to the impact of the COVID-19 pandemic, instead of providing the standard 1-year data products, the Census Bureau released experimental estimates from the 1-year data in 2020. This includes a limited number of data tables for the nation, states, and the District of Columbia. The Census Bureau states that the 2020 ACS 1-year experimental tables use an experimental estimation methodology and should not be compared with other ACS data. For these reasons, and because this data is not available for Champaign County, the eligible voting population for 2020 is not included in this Indicator.
For interested data users, the 2020 ACS 1-Year Experimental data release includes datasets on Population by Sex and Population Under 18 Years by Age.
Sources:
Champaign County Clerk Historical Election Data.
Champaign County Clerk Election History.
U.S. Census Bureau; American Community Survey, 2024 American Community Survey 1-Year Estimates, Table B05003; generated by CCRPC staff; using data.census.gov; (24 November 2025).
U.S. Census Bureau; American Community Survey, 2023 American Community Survey 1-Year Estimates, Table B05003; generated by CCRPC staff; using data.census.gov; (10 October 2024).
U.S. Census Bureau; American Community Survey, 2022 American Community Survey 1-Year Estimates, Table B05003; generated by CCRPC staff; using data.census.gov; (5 October 2023).
U.S. Census Bureau; American Community Survey, 2021 American Community Survey 1-Year Estimates, Table B05003; generated by CCRPC staff; using data.census.gov; (7 October 2022).
U.S. Census Bureau; American Community Survey, 2019 American Community Survey 1-Year Estimates, Table B05003; generated by CCRPC staff; using data.census.gov; (8 June 2021).
U.S. Census Bureau; American Community Survey, 2018 American Community Survey 1-Year Estimates, Table B05003; generated by CCRPC staff; using data.census.gov; (8 June 2021).
U.S. Census Bureau; American Community Survey, 2017 American Community Survey 1-Year Estimates, Table B05003; generated by CCRPC staff; using American FactFinder; (13 May 2019).
U.S. Census Bureau; American Community Survey, 2016 American Community Survey 1-Year Estimates, Table B05003; generated by CCRPC staff; using American FactFinder; (13 May 2019).
U.S. Census Bureau; American Community Survey, American Community Survey 1-Year Estimates, Table B05003; generated by CCRPC staff; using American FactFinder; (6 March 2017).
U.S. Census Bureau; American Community Survey, 2014 American Community Survey 1-Year Estimates, Table B05003; generated by CCRPC staff; using American FactFinder; (15 March 2016).
U.S. Census Bureau; American Community Survey, 2013 American Community Survey 1-Year Estimates, Table B05003; generated by CCRPC staff; using American FactFinder; (15 March 2016).
U.S. Census Bureau; American Community Survey, 2012 American Community Survey 1-Year Estimates, Table B05003; generated by CCRPC staff; using American FactFinder; (15 March 2016).
U.S. Census Bureau; American Community Survey, 2011 American Community Survey 1-Year Estimates, Table B05003; generated by CCRPC staff; using American FactFinder; (15 March 2016).
U.S. Census Bureau; American Community Survey, 2010 American Community Survey 1-Year Estimates, Table B05003; generated by CCRPC staff; using American FactFinder; (15 March 2016).
U.S. Census Bureau; American Community Survey, 2009 American Community Survey 1-Year Estimates, Table B05003; generated by CCRPC staff; using American FactFinder; (15 March 2016).
U.S. Census Bureau; American Community Survey, 2008 American Community Survey 1-Year Estimates, Table B05003; generated by CCRPC staff; using American FactFinder; (15 March 2016).
U.S. Census Bureau; American Community Survey, 2007 American Community Survey 1-Year Estimates, Table B05003; generated by CCRPC staff; using American FactFinder; (15 March 2016).
U.S. Census Bureau; American Community Survey, 2006 American Community Survey 1-Year Estimates, Table B05003; generated by CCRPC staff; using American FactFinder; (15 March 2016).
U.S. Census Bureau; American Community Survey, 2005 American Community Survey 1-Year Estimates, Table B05003; generated by CCRPC staff; using American FactFinder; (15 March 2016).
This data comes from the Associated Press. The AP has been tracking vote counts in US elections since 1848, and its data is widely considered to be accurate.
The variables in this dataset are:
- state: State to which the vote count corresponds
- state_abr: Two-letter abbreviation of state name
- trump_pct: Percentage of the vote won by Donald Trump
- biden_pct: Percentage of the vote won by Joe Biden
- trump_win: Binary variable denoting whether Donald Trump won the vote in a state
- biden_win: Binary variable denoting whether Joe Biden won the vote in a state
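A quick consistency check over these variables might look like the sketch below. It rests on assumptions the description does not confirm: that the percentages are on a 0-100 scale, that the binary variables are coded 1 and 0, and that a win means the larger of the two percentages. The function name `check_state_row` and the example row are hypothetical.

```python
def check_state_row(row: dict) -> list:
    """Return a list of problems found in one state row (empty if none)."""
    problems = []
    if len(row["state_abr"]) != 2:
        problems.append("state_abr is not a two-letter abbreviation")
    for column in ("trump_pct", "biden_pct"):
        if not 0 <= row[column] <= 100:
            problems.append(f"{column} is outside 0-100")
    if row["trump_pct"] + row["biden_pct"] > 100 + 1e-9:
        problems.append("trump_pct and biden_pct sum to more than 100")
    # Assumption: a win flag is 1 when that candidate's percentage is the larger one.
    if row["trump_win"] != int(row["trump_pct"] > row["biden_pct"]):
        problems.append("trump_win disagrees with the percentages")
    if row["biden_win"] != int(row["biden_pct"] > row["trump_pct"]):
        problems.append("biden_win disagrees with the percentages")
    return problems


# Made-up row for illustration only.
example_row = {"state": "Example", "state_abr": "EX", "trump_pct": 48.0,
               "biden_pct": 50.5, "trump_win": 0, "biden_win": 1}
```

Calling `check_state_row(example_row)` returns an empty list; flipping one of the win flags would produce a message naming the inconsistent column.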
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset covers U.S. elections from 1824 to 2020. Its columns are:
- Year: The year in which the U.S. election took place. Type: Numeric (Integer). Example: 1824, 1860, 1920, 2020
- Candidate: The name of the candidate participating in the election. Type: String (Candidate's name). Example: John Adams, Abraham Lincoln, Franklin D. Roosevelt, Joe Biden
- Party: The political party affiliation of the candidate. Type: String (Party name or abbreviation). Example: Democratic, Republican, Whig, Libertarian
- Popular Vote: The total number of votes that the candidate received in the popular vote. Type: Numeric (Integer). Example: 500,000, 5,000,000, 70,000,000
- Result: The outcome of the election for the specified candidate. Type: String (e.g., "Winner," "Runner-up," "Withdrew"). Example: Winner, Runner-up, Withdrew, Conceded
- Percentage: The percentage of the total popular vote received by the candidate. Type: Numeric (Float). Example: 25.3%, 49.8%, 60.5%
This dataset appears to capture essential information about U.S. elections over time, including details about the candidates, their political party affiliations, the number of popular votes they received, the outcome of the election, and the percentage of the total popular vote they secured. This comprehensive dataset allows for the analysis of historical U.S. election trends and outcomes.
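A typed record for one row of this dataset could be sketched as follows. The column keys mirror the names listed above, but whether the file stores vote counts with thousands separators or percentages with a trailing "%" is an assumption, so the parser tolerates both. The names `ElectionRow` and `parse_row` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ElectionRow:
    year: int
    candidate: str
    party: str
    popular_vote: int
    result: str
    percentage: float


def parse_row(raw: dict) -> ElectionRow:
    """Convert one raw row (string values) into a typed record."""
    return ElectionRow(
        year=int(raw["Year"]),
        candidate=raw["Candidate"].strip(),
        party=raw["Party"].strip(),
        # Tolerate thousands separators such as "5,000,000".
        popular_vote=int(str(raw["Popular Vote"]).replace(",", "")),
        result=raw["Result"].strip(),
        # Tolerate a trailing percent sign such as "49.8%".
        percentage=float(str(raw["Percentage"]).strip().rstrip("%")),
    )
```

Parsing into typed fields up front makes later steps, such as summing the popular vote per year or comparing percentages, less error-prone than working with raw strings.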
TwitterThis is the dataset I used to figure out which sociodemographic factor including the current pandemic status of each state has the most significan impace on the result of the US Presidential election last year. I also included sentiment scores of tweets created from 2020-10-15 to 2020-11-02 as well, in order to figure out the effect of positive/negative emotion for each candidate - Donald Trump and Joe Biden - on the result of the election.
Details for each variable are as follows:
- state: name of each state in the United States, including the District of Columbia
- elec16, elec20: dummy variables indicating whether Trump won the electoral votes of each state. The value is 1 if the electors cast their votes for Trump and 0 otherwise
- elecchange: dummy variable indicating whether the state's result flipped to the other party in 2020 compared with 2016
- demvote16: the share of votes that the Democratic candidate, i.e. Hillary Clinton, earned in the 2016 presidential election
- repvote16: the share of votes that the Republican candidate, i.e. Donald Trump, earned in the 2016 presidential election
- demvote20: the share of votes that the Democratic candidate, i.e. Joe Biden, earned in the 2020 presidential election
- repvote20: the share of votes that the Republican candidate, i.e. Donald Trump, earned in the 2020 presidential election
- demvotedif: the difference between demvote20 and demvote16
- repvotedif: the difference between repvote20 and repvote16
- pop: the population of each state
- cumulcases: the cumulative COVID-19 cases on Election Day
- caseMar ~ caseOct: the cumulative COVID-19 cases during each month
- Marper10k ~ Octper10k: the cumulative COVID-19 cases during each month per 10 thousand people
- unemp20: the unemployment rate of each state this year before the election
- unempdif: the difference between last year's unemployment rate and this year's
- jan20unemp ~ oct20unemp: the unemployment rate of each month
- cumulper10k: the cumulative COVID-19 cases on Election Day per 10 thousand people
- b_str_poscount_total: the total number of positive tweets on Biden measured by SentiStrength
- b_str_negcount_total: the total number of negative tweets on Biden measured by SentiStrength
- t_str_poscount_total: the total number of positive tweets on Trump measured by SentiStrength
- t_str_negcount_total: the total number of negative tweets on Trump measured by SentiStrength
- b_str_posprop_total: the proportion of positive tweets on Biden measured by SentiStrength
- b_str_negprop_total: the proportion of negative tweets on Biden measured by SentiStrength
- t_str_posprop_total: the proportion of positive tweets on Trump measured by SentiStrength
- t_str_negprop_total: the proportion of negative tweets on Trump measured by SentiStrength
- white: the proportion of white people
- colored: the proportion of people of color
- secondary: the proportion of people who have attained secondary education
- tertiary: the proportion of people who have attained tertiary education
- q3gdp20: GDP of the 3rd quarter of 2020
- q3gdprate: the GDP growth rate of the 3rd quarter of 2020, compared with the same quarter of the previous year
- 3qsgdp20: GDP over three quarters of 2020
- 3qsrate20: the GDP growth rate compared with the same three quarters of the previous year
- q3gdpdif: the difference in the level of GDP of the 3rd quarter compared with the previous quarter
- q3rate: the GDP growth rate of the 3rd quarter compared with the previous quarter
- access: the proportion of households with Internet access
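Several of the variables above are simple functions of the others. The following is a minimal sketch of how they might be derived, not the author's actual code. It assumes that the differences are taken as the 2020 value minus the 2016 value and that `elecchange` is 1 whenever `elec16` and `elec20` disagree; the function name and the sample record are hypothetical.

```python
def derive_state_fields(row):
    """Compute derived fields for one state record (a plain dict).

    Assumptions (not confirmed by the dataset description):
    - differences are 2020 value minus 2016 value
    - elecchange is 1 when the state's winner differs between years
    """
    return {
        "demvotedif": row["demvote20"] - row["demvote16"],
        "repvotedif": row["repvote20"] - row["repvote16"],
        "elecchange": int(row["elec16"] != row["elec20"]),
        # cumulative cases per 10 thousand residents
        "cumulper10k": row["cumulcases"] * 10_000 / row["pop"],
    }


# Hypothetical record, for illustration only.
example = {
    "elec16": 1, "elec20": 0,
    "demvote16": 0.47, "demvote20": 0.50,
    "repvote16": 0.48, "repvote20": 0.49,
    "pop": 1_000_000, "cumulcases": 25_000,
}
derived = derive_state_fields(example)
```

Whether the vote shares are stored as fractions or percentages does not affect the logic, only the scale of the differences.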
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset was developed by the Research & Analytics Group at the Atlanta Regional Commission using data from the U.S. Census Bureau.

For a deep dive into the data model including every specific metric, see the Infrastructure Manifest. The manifest details ARC-defined naming conventions, field names/descriptions and topics, summary levels, source tables, notes and so forth for all metrics.

Naming conventions:

Prefixes:
- None: Count
- p: Percent
- r: Rate
- m: Median
- a: Mean (average)
- t: Aggregate (total)
- ch: Change in absolute terms (value in t2 - value in t1)
- pch: Percent change ((value in t2 - value in t1) / value in t1)
- chp: Change in percent (percent in t2 - percent in t1)
- s: Significance flag for change: 1 = statistically significant with a 90% CI, 0 = not statistically significant, blank = cannot be computed

Suffixes:
- _e19: Estimate from 2014-19 ACS
- _m19: Margin of Error from 2014-19 ACS
- _00_v19: Decennial 2000, re-estimated to 2019 geography
- _00_19: Change, 2000-19
- _e10_v19: 2006-10 ACS, re-estimated to 2019 geography
- _m10_v19: Margin of Error from 2006-10 ACS, re-estimated to 2019 geography
- _e10_19: Change, 2010-19

The user should note that American Community Survey data represent estimates derived from a surveyed sample of the population, which creates some level of uncertainty, as opposed to an exact measure of the entire population (the full census count is only conducted once every 10 years and does not cover as many detailed characteristics of the population). Therefore, any measure reported by ACS should not be taken as an exact number, which is why a corresponding margin of error (MOE) is also given for ACS measures. The size of the MOE relative to its corresponding estimate value provides an indication of confidence in the accuracy of each estimate. Each MOE is expressed in the same units as its corresponding measure; for example, if the estimate value is expressed as a number, then its MOE will also be a number; if the estimate value is expressed as a percent, then its MOE will also be a percent.

The user should also note that for relatively small geographic areas, such as census tracts shown here, ACS only releases combined 5-year estimates, meaning these estimates represent rolling averages of survey results that were collected over a 5-year span (in this case 2015-2019). Therefore, these data do not represent any one specific point in time or even one specific year. For geographic areas with larger populations, 3-year and 1-year estimates are also available. For further explanation of ACS estimates and margin of error, visit the Census ACS website.

Source: U.S. Census Bureau, Atlanta Regional Commission
Date: 2015-2019
Data License: Creative Commons Attribution 4.0 International (CC by 4.0)
Link to the manifest: https://www.arcgis.com/sharing/rest/content/items/3d489c725bb24f52a987b302147c46ee/data
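The three change prefixes (`ch`, `pch`, `chp`) are easy to confuse, so a small worked sketch may help. It follows the formulas given in the naming conventions; the function names and sample values are hypothetical, and whether ARC stores `pch` as a fraction or multiplies it by 100 is not stated in this description, so the sketch returns the plain ratio.

```python
def absolute_change(value_t1, value_t2):
    """ch: change in absolute terms (value in t2 - value in t1)."""
    return value_t2 - value_t1


def percent_change(value_t1, value_t2):
    """pch: (value in t2 - value in t1) / value in t1.

    Returns None when the base value is zero, since the ratio is undefined.
    """
    if value_t1 == 0:
        return None
    return (value_t2 - value_t1) / value_t1


def change_in_percent(percent_t1, percent_t2):
    """chp: difference between two percentages, in percentage points."""
    return percent_t2 - percent_t1


# Hypothetical tract: a count grows from 2000 to 2500, while a share of
# the population moves from 40.0% to 46.5%.
ch = absolute_change(2000, 2500)
pch = percent_change(2000, 2500)
chp = change_in_percent(40.0, 46.5)
```

Here `pch` describes relative growth of a count, while `chp` is a difference in percentage points between two shares.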
https://creativecommons.org/publicdomain/zero/1.0/
This folder contains the data behind the stories:
This project looks at patterns in open Democratic and Republican primary elections for the U.S. Senate, U.S. House and governor in 2018.
dem_candidates.csv contains information about the 811 candidates who have appeared on the ballot this year in Democratic primaries for Senate, House and governor, not counting races featuring a Democratic incumbent, as of August 7, 2018.
rep_candidates.csv contains information about the 774 candidates who have appeared on the ballot this year in Republican primaries for Senate, House and governor, not counting races featuring a Republican incumbent, through September 13, 2018.
Here is a description and source for each column in the accompanying datasets.
dem_candidates.csv and rep_candidates.csv include:
| Column | Description |
|---|---|
| Candidate | All candidates who received votes in 2018’s Democratic primary elections for U.S. Senate, U.S. House and governor in which no incumbent ran. Supplied by Ballotpedia. |
| State | The state in which the candidate ran. Supplied by Ballotpedia. |
| District | The office and, if applicable, congressional district number for which the candidate ran. Supplied by Ballotpedia. |
| Office Type | The office for which the candidate ran. Supplied by Ballotpedia. |
| Race Type | Whether it was a “regular” or “special” election. Supplied by Ballotpedia. |
| Race Primary Election Date | The date on which the primary was held. Supplied by Ballotpedia. |
| Primary Status | Whether the candidate lost (“Lost”) the primary or won/advanced to a runoff (“Advanced”). Supplied by Ballotpedia. |
| Primary Runoff Status | “None” if there was no runoff; “On the Ballot” if the candidate advanced to a runoff but it hasn’t been held yet; “Advanced” if the candidate won the runoff; “Lost” if the candidate lost the runoff. Supplied by Ballotpedia. |
| General Status | “On the Ballot” if the candidate won the primary or runoff and has advanced to November; otherwise, “None.” Supplied by Ballotpedia. |
| Primary % | The percentage of the vote received by the candidate in his or her primary. In states that hold runoff elections, we looked only at the first round (the regular primary). In states that hold all-party primaries (e.g., California), a candidate’s primary percentage is the percentage of the total Democratic vote they received. Unopposed candidates and candidates nominated by convention (not primary) are given a primary percentage of 100 but were excluded from our analysis involving vote share. Numbers come from official results posted by the secretary of state or local elections authority; if those were unavailable, we used unofficial election results from the New York Times. |
| Won Primary | “Yes” if the candidate won his or her primary and has advanced to November; “No” if he or she lost. |
dem_candidates.csv includes:
| Column | Description |
|---|---|
| Gender | “Male” or “Female.” Supplied by Ballotpedia. |
| Partisan Lean | The FiveThirtyEight partisan lean of the district or state in which the election was held. Partisan leans are calculated by finding the average difference between how a state or district voted in the past two presidential elections and how the country voted overall, with 2016 results weighted 75 percent and 2012 results weighted 25 percent. |
| Race | “White” if we identified the candidate as non-Hispanic white; “Nonwhite” if we identified the candidate as Hispanic and/or any nonwhite race; blank if we could not identify the candidate’s race or ethnicity. To determine race and ethnicity, we checked each candidate’s website to see if he or she identified as a certain race. If not, we spent no more than two minutes searching online news reports for references to the candidate’s race. |
| Veteran? | If the candidate’s website says that he or she served in the armed forces, we put “Yes.” If the website is silent on the subject (or explicitly says he or she didn’t serve), we put “No.” If the field was left blank, no website was available. |
| LGBTQ? | If the candidate’s website says that he or she is LGBTQ (including indirect references like to a same-sex partner), we put “Yes.” If the website is silent on the subject (or explicitly says he or she is straight), we put “No.” If the field was... |
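The Partisan Lean description above amounts to a weighted average of two differences. The sketch below illustrates that arithmetic only; it assumes each result is expressed as a margin in percentage points (for example, Democratic share minus Republican share), which the description does not specify, and the function name and inputs are hypothetical rather than FiveThirtyEight's code.

```python
def partisan_lean(local_2016, national_2016, local_2012, national_2012):
    """Weighted average of local-minus-national margins.

    2016 is weighted 75 percent and 2012 is weighted 25 percent, as stated
    in the column description. All inputs are margins in percentage points;
    the sign convention is an assumption of this sketch.
    """
    diff_2016 = local_2016 - national_2016
    diff_2012 = local_2012 - national_2012
    return 0.75 * diff_2016 + 0.25 * diff_2012


# Hypothetical district: 8 points more favorable than the nation in 2016
# and 2 points more favorable in 2012.
lean = partisan_lean(local_2016=10.0, national_2016=2.0,
                     local_2012=6.0, national_2012=4.0)
```

With these inputs the result is 0.75 × 8 + 0.25 × 2 = 6.5 points.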
Age, Sex, Race, Ethnicity, Total Housing Units, and Voting Age Population. This service is updated annually with American Community Survey (ACS) 5-year data.

Contact: District of Columbia, Office of Planning. Email: planning@dc.gov.
Geography: District-wide.
Current Vintage: 2019-2023.
ACS Table(s): DP05.
Data downloaded from: Census Bureau's API for American Community Survey.
Date of API call: January 2, 2025.
National Figures: data.census.gov.

Please cite the Census and ACS when using this data.

Data Note from the Census: Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see Accuracy of the Data). The effect of nonsampling error is not represented in these tables.

Data Processing Notes: This layer is updated automatically when the most current vintage of ACS data is released each year, usually in December. The layer always contains the latest available ACS 5-year estimates. It is updated annually within days of the Census Bureau's release schedule. Boundaries come from the US Census TIGER geodatabases. Boundaries are updated at the same time as the data updates (annually), and the boundary vintage appropriately matches the data vintage as specified by the Census. These are Census boundaries with water and/or coastlines clipped for cartographic purposes. For census tracts, the water cutouts are derived from a subset of the 2020 AWATER (Area Water) boundaries offered by TIGER.
For state and county boundaries, the water and coastlines are derived from the coastlines of the 500k TIGER Cartographic Boundary Shapefiles. The original AWATER and ALAND fields are still available as attributes within the data table (units are square meters). Field alias names were created based on the Table Shells file available from the American Community Survey Summary File Documentation page. Data processed using R statistical package and ArcGIS Desktop. Margin of Error was not included in this layer but is available from the Census Bureau. Contact the Office of Planning for more information about obtaining Margin of Error values.
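The Census data note describes a 90 percent interval of estimate minus and plus the margin of error. As a rough illustration, assuming the margin-of-error values have been obtained separately (they are not part of this layer), the bounds and a relative-reliability measure could be computed as below. The 1.645 factor is the standard z-value for a 90 percent interval; the function names and numbers are hypothetical.

```python
Z_90 = 1.645  # z-value corresponding to a 90 percent confidence level


def confidence_bounds(estimate, moe):
    """Lower and upper 90 percent confidence bounds for an ACS estimate."""
    return estimate - moe, estimate + moe


def coefficient_of_variation(estimate, moe, z=Z_90):
    """Standard error (moe / z) relative to the estimate.

    Returns None for a zero estimate, where the ratio is undefined.
    """
    if estimate == 0:
        return None
    standard_error = moe / z
    return standard_error / estimate


# Hypothetical estimate of 1,200 people with a margin of error of 150.
lower, upper = confidence_bounds(1200, 150)
cv = coefficient_of_variation(1200, 150)
```

A smaller coefficient of variation indicates a more reliable estimate; the threshold at which an estimate is considered usable is a judgment left to the analyst.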
Age, Sex, Race, Ethnicity, Total Housing Units, and Voting Age Population. This service is updated annually with American Community Survey (ACS) 1-year data.

Contact: District of Columbia, Office of Planning. Email: planning@dc.gov.
Geography: District-wide.
Current Vintage: 2023.
ACS Table(s): DP05.
Data downloaded from: Census Bureau's API for American Community Survey.
Date of API call: January 3, 2025.
National Figures: data.census.gov.

Please cite the Census and ACS when using this data.

Data Note from the Census: Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see Accuracy of the Data). The effect of nonsampling error is not represented in these tables.

Data Processing Notes: This layer is updated automatically when the most current vintage of ACS data is released each year, usually in December. The layer always contains the latest available ACS 5-year estimates. It is updated annually within days of the Census Bureau's release schedule. Boundaries come from the US Census TIGER geodatabases. Boundaries are updated at the same time as the data updates (annually), and the boundary vintage appropriately matches the data vintage as specified by the Census. These are Census boundaries with water and/or coastlines clipped for cartographic purposes. For census tracts, the water cutouts are derived from a subset of the 2020 AWATER (Area Water) boundaries offered by TIGER.
For state and county boundaries, the water and coastlines are derived from the coastlines of the 500k TIGER Cartographic Boundary Shapefiles. The original AWATER and ALAND fields are still available as attributes within the data table (units are square meters). Field alias names were created based on the Table Shells file available from the American Community Survey Summary File Documentation page. Data processed using R statistical package and ArcGIS Desktop. Margin of Error was not included in this layer but is available from the Census Bureau. Contact the Office of Planning for more information about obtaining Margin of Error values.
Geo-data science applications for critical mineral analysis, development acceleration, economic impact assessment, and project efficiency.

Coverage: 10 years (2014-2023), 3 geographic levels (county, state, tract), 45 features

Data Categories:
- census: 30 features (e.g., asian population percentage, black population percentage, citizen voting age population percentage)
- ejscreen: 3 features (e.g., P_DSLPM, P_PM25, P_PWDIS)
- energyexpenditure: 12 features (e.g., Housing adjustment factor, Income adjustment factor, Monthly electricity cost)
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Age, Sex, Race, Ethnicity, Total Housing Units, and Voting Age Population. This service is updated annually with American Community Survey (ACS) 5-year data. Contact: District of Columbia, Office of Planning. Email: planning@dc.gov. Geography: 2022 Wards (State Legislative Districts [Upper Chamber]). Current Vintage: 2019-2023. ACS Table(s): DP05. Data downloaded from: Census Bureau's API for American Community Survey. Date of API call: January 2, 2025. National Figures: data.census.gov. Please cite the Census and ACS when using this data. Data Note from the Census: Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see Accuracy of the Data). The effect of nonsampling error is not represented in these tables. Data Processing Notes: This layer is updated automatically when the most current vintage of ACS data is released each year, usually in December. The layer always contains the latest available ACS 5-year estimates. It is updated annually within days of the Census Bureau's release schedule. Boundaries come from the US Census TIGER geodatabases. Boundaries are updated at the same time as the data updates (annually), and the boundary vintage appropriately matches the data vintage as specified by the Census. These are Census boundaries with water and/or coastlines clipped for cartographic purposes. 
For census tracts, the water cutouts are derived from a subset of the 2020 AWATER (Area Water) boundaries offered by TIGER. For state and county boundaries, the water and coastlines are derived from the coastlines of the 500k TIGER Cartographic Boundary Shapefiles. The original AWATER and ALAND fields are still available as attributes within the data table (units are square meters). Field alias names were created based on the Table Shells file available from the American Community Survey Summary File Documentation page. Data processed using R statistical package and ArcGIS Desktop. Margin of Error was not included in this layer but is available from the Census Bureau. Contact the Office of Planning for more information about obtaining Margin of Error values.
Election Data Attribute Field Definitions | Wisconsin Cities, Towns, & Villages Data Attributes

Ward Data Overview: July 2020 municipal wards were collected by LTSB through the WISE-Decade system. Current statutes require each county clerk, or board of election commissioners, no later than January 15 and July 15 of each year, to transmit to the LTSB, in an electronic format (approved by LTSB), a report confirming the boundaries of each municipality, ward and supervisory district within the county as of the preceding “snapshot” date of January 1 or July 1 respectively. Population totals for 2011 wards are carried over to the 2020 dataset for existing wards. New wards created since 2011 due to annexations, detachments, and incorporation are allocated population from Census 2010 collection blocks. LTSB has topologically integrated the data, but there may still be errors.

Election Data Overview: The 2012-2020 Wisconsin election data that is included in this file was collected by LTSB from the *Wisconsin Elections Commission (WEC) after each general election. A disaggregation process was performed on this election data based on the municipal ward layer that was available at the time of the election.

Disaggregation of Election Data: Election data is first disaggregated from reporting units to wards, and then to census blocks. Next, the election data is aggregated back up to wards, municipalities, and counties. The disaggregation of election data to census blocks is done based on total population.

Detailed Methodology: Data is disaggregated first from the reporting unit (i.e. multiple wards) to the ward level, proportionate to the population of each ward. The data is then distributed down to the block level, again based on total population. When data is disaggregated to block or ward, we constrain vote totals so that they do not exceed the population aged 18 and over, unless absolutely required.

This methodology results in the following:
- Election data totals reported to the WEC at the state, county, municipal and reporting unit level should match the disaggregated election data total at the same levels.
- Election data totals reported to the WEC at ward level may not match the ward totals in the disaggregated election data file.
- Some wards may have more election data allocated than voting-age population. This will occur if a change to the geography results in more voters than the 2010 historical population limits.

Other things of note:

We use a static, official ward layer (in this case created in 2020) to disaggregate election data to blocks. Using this ward layer creates some challenges. New wards are created every year due to annexations and incorporations. When these new wards are reported with election data, an issue arises wherein election data is being reported for wards that do not exist in our official ward layer. For example, if Cityville has four wards in the official ward layer, the election data may be reported for five wards, including a new ward from an annexation. There are two different scenarios and courses of action for these issues:

When a single new ward is present in the election data but there is no ward geometry present in the official ward layer, the votes attributed to this new ward are distributed to all the other wards in the municipality based on population percentage. Distributing based on population percentage means that each ward receives the same proportion of the new ward's votes as its proportion of the municipality's population.
In the example of Cityville explained above, the fifth ward may have five votes reported, but since there is no corresponding fifth ward in the official layer, these five votes will be assigned to the other wards in Cityville according to their percentage of the population.

Another case is when a new ward is reported, but its votes are part of a reporting unit. In this case, the votes for the new ward are assigned to the other wards in the reporting unit by population percentage, and not to the wards in the municipality as a whole. For example, Cityville’s ward 5 was reported as part of a reporting unit together with wards 1 and 4. In this case, the votes in ward 5 are assigned to wards 1 and 4 according to population percentage.

Outline of Ward-by-Ward Election Results

The process of collecting election data and disaggregating to municipal wards occurs after a general election, so disaggregation has occurred with different ward layers and different population totals. We have outlined (to the best of our knowledge) what layer and population totals were used to produce these ward-by-ward election results.

Election data disaggregates from WEC Reporting Unit -> Ward [Variant year outlined below]
- Elections 1990 – 2000: Wards 1991 (Census 1990 totals used for disaggregation)
- Elections 2002 – 2010: Wards 2001 (Census 2000 totals used for disaggregation)
- Elections 2012: Wards 2011 (Census 2010 totals used for disaggregation)
- Elections 2014 – 2016: Wards 2018 (Census 2010 totals used for disaggregation)
- Elections 2018: Wards 2018
- Elections 2020: Wards 2020

Blocks 2011 -> Centroid geometry and spatially joined with Wards [All Versions]
- Each block has an assignment to each of the ward versions outlined above.
- In the event that a ward exists now in which no block exists (this occurred with spring 2020) due to annexations, a block centroid was created with a population of 0 and encoded with the proper Census IDs.

Wards [All Versions] disaggregate -> Blocks 2011
- This yields a block centroid layer that contains all elections from 1990 to 2018.

Blocks 2011 [with all election data] -> Wards 2020 (then MCD 2020, and County 2020)
- All election data (including later elections) is aggregated to the Wards 2020 assignment of the blocks.

Notes:
- Populations of municipal wards 1991, 2001 and 2011 used for disaggregation were determined by their respective Census.
- Population and election data will be contained within a county boundary. This means that even though MCD and ward boundaries vary greatly between versions of the wards, county boundaries have stayed the same, so data should total the same within a county between wards 2011 and wards 2020.
- Election data may be different for the same legislative district, for the same election, due to changes in the wards between 2011 and 2020. This is due to boundary corrections in the data from 2011 to 2020, and annexations, where a block may have been reassigned.

*WEC replaced the previous Government Accountability Board (GAB) in 2016, which replaced the previous State Elections Board in 2008.
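The population-proportional disaggregation described above can be sketched as follows. This is an illustration under stated assumptions, not LTSB's implementation: the largest-remainder rounding used to keep whole votes is a choice made for this sketch, the voting-age cap is not modeled, and the ward populations are hypothetical.

```python
def disaggregate(total_votes, populations):
    """Split whole votes across units in proportion to population.

    populations maps a unit id (e.g. a ward number) to its population.
    Fractional shares are rounded with the largest-remainder method so the
    allocated votes always sum to total_votes (an assumption of this sketch).
    """
    total_pop = sum(populations.values())
    if total_pop == 0:
        return {unit: 0 for unit in populations}
    exact = {u: total_votes * p / total_pop for u, p in populations.items()}
    allocated = {u: int(share) for u, share in exact.items()}
    leftover = total_votes - sum(allocated.values())
    # Hand the remaining votes to the units with the largest fractional parts.
    by_remainder = sorted(populations,
                          key=lambda u: exact[u] - allocated[u],
                          reverse=True)
    for unit in by_remainder[:leftover]:
        allocated[unit] += 1
    return allocated


# Hypothetical Cityville ward populations (wards 1-4 in the official layer).
cityville = {1: 420, 2: 310, 3: 170, 4: 100}

# Scenario 1: five votes reported for a new ward 5 with no geometry are
# spread over every ward in the municipality.
municipality_wide = disaggregate(5, cityville)

# Scenario 2: ward 5 shares a reporting unit with wards 1 and 4, so its
# votes go only to those two wards.
within_reporting_unit = disaggregate(5, {1: cityville[1], 4: cityville[4]})
```

The same function could be applied a second time to push ward totals down to census blocks, using block populations as the weights.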
PROBLEM AND OPPORTUNITY

In the United States, voting is largely a private matter. A registered voter is given a randomized ballot form or machine to prevent linkage between their voting choices and their identity. This disconnect supports confidence in the election process, but it creates obstacles to an election's analysis. A common solution is to field exit polls, interviewing voters immediately after they leave their polling location. This method is rife with bias, however, and functionally limited in the direct demographic data collected.

For the 2020 general election, though, most states published their election results for each voting location. These publications were additionally supported by the geographical areas assigned to each location, the voting precincts. As a result, geographic processing can now be applied to project precinct election results onto Census block groups. While precincts have few demographic traits directly, their geographies have characteristics that make them projectable onto U.S. Census geographies. Both state voting precincts and U.S. Census block groups:
- are exclusive, and do not overlap
- are adjacent, fully covering their corresponding state and potentially county
- have roughly the same size in area, population and voter presence

Analytically, a projection of local demographics does not allow conclusions about voters themselves. However, the dataset does allow statements related to the geographies that yield voting behavior. One could say, for example, that an area dominated by a particular voting pattern would have mean traits of age, race, income or household structure.

The dataset that results from this programming provides voting results allocated by Census block groups. The block group identifier can be joined to Census Decennial and American Community Survey demographic estimates.

DATA SOURCES

The state election results and geographies have been compiled by the Voting and Election Science team on Harvard's dataverse. State voting precincts lie within state and county boundaries.

The Census Bureau, on the other hand, publishes its estimates across a variety of geographic definitions including a hierarchy of states, counties, census tracts and block groups. Their definitions can be found here. The geometric shapefiles for each block group are available here.

The lowest level of this geography changes often and can become obsolete before the next census survey (Decennial or American Community Survey programs). The second-to-lowest census level, block groups, has the benefit of both granularity and stability, however. The 2020 Decennial survey details US demographics into 217,740 block groups with between a few hundred and a few thousand people.

Dataset Structure

The dataset's columns include:

| Column | Definition |
|---|---|
| BLOCKGROUP_GEOID | 12 digit primary key. Census GEOID of the block group row. This code concatenates: 2 digit state, 3 digit county within state, 6 digit Census Tract identifier, 1 digit Census Block Group identifier within tract |
| STATE | State abbreviation, redundant with the 2 digit state FIPS code above |
| REP | Votes for Republican party candidate for president |
| DEM | Votes for Democratic party candidate for president |
| LIB | Votes for Libertarian party candidate for president |
| OTH | Votes for presidential candidates other than Republican, Democratic or Libertarian |
| AREA | Square kilometers of area associated with this block group |
| GAP | Total area of the block group, net of area attributed to voting precincts |
| PRECINCTS | Number of voting precincts that intersect this block group |

ASSUMPTIONS, NOTES AND CONCERNS:
- Votes are attributed based upon the proportion of the precinct's area that intersects the corresponding block group. Alternative methods are left to the analyst's initiative.
- 50 states and the District of Columbia are in scope as those U.S. possessions voting in the general election for the U.S. Presidency.
- Three states did not report their results at the precinct level: South Dakota, Kentucky and West Virginia. A dummy block group is added for each of these states to maintain national totals. These states represent 2.1% of all votes cast.
- Counties are commonly coded using FIPS codes. However, each election result file may have the county field named differently. Also, three states and the District of Columbia do not share county definitions: Delaware, Massachusetts and Alaska.
- Block groups may be used to capture geographies that do not have population, like bodies of water. As a result, block groups without intersecting voting precincts are not uncommon.

In the U.S., elections are administered at a state level with the Federal Election Commission compiling state totals against the Electoral College weights. The states have liberty, though, to define and change their own voting precincts https://en.wikipedia.org/wiki/Electoral_precinct. The Census Bureau...

Visit https://dataone.org/datasets/sha256%3A05707c1dc04a814129f751937a6ea56b08413546b18b351a85bc96da16a7f8b5 for complete metadata about this dataset.
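The area-proportional attribution rule and the GEOID layout described above can be illustrated with a short sketch. It assumes the precinct and block group intersection areas have already been computed by a GIS tool; the function names, identifiers and numbers are hypothetical, and this is not the dataset author's code.

```python
from collections import defaultdict


def allocate_votes(precinct_votes, precinct_areas, intersections):
    """Attribute precinct votes to block groups by share of precinct area.

    precinct_votes: {precinct_id: {"REP": n, "DEM": n, ...}}
    precinct_areas: {precinct_id: total precinct area}
    intersections:  iterable of (precinct_id, blockgroup_geoid, shared_area)
    """
    totals = defaultdict(lambda: defaultdict(float))
    for precinct_id, geoid, shared_area in intersections:
        share = shared_area / precinct_areas[precinct_id]
        for party, votes in precinct_votes[precinct_id].items():
            totals[geoid][party] += votes * share
    return {geoid: dict(parties) for geoid, parties in totals.items()}


def split_geoid(geoid):
    """Break a 12 digit block group GEOID into its documented parts."""
    return {
        "state": geoid[:2],
        "county": geoid[2:5],
        "tract": geoid[5:11],
        "block_group": geoid[11:12],
    }


# Hypothetical inputs: precinct P1 straddles two block groups and
# precinct P2 lies entirely inside the second one.
votes = {"P1": {"REP": 100, "DEM": 60}, "P2": {"REP": 10, "DEM": 30}}
areas = {"P1": 4.0, "P2": 2.0}
overlaps = [
    ("P1", "010010201001", 3.0),
    ("P1", "010010201002", 1.0),
    ("P2", "010010201002", 2.0),
]
by_block_group = allocate_votes(votes, areas, overlaps)
parts = split_geoid("010010201001")
```

Area weighting implicitly treats voters as evenly spread across each precinct; weighting by population instead is one of the alternative methods an analyst could substitute.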