PROBLEM AND OPPORTUNITY

In the United States, voting is largely a private matter. A registered voter is given a randomized ballot form or machine to prevent linkage between their voting choices and their identity. This disconnect supports confidence in the election process, but it creates obstacles to analyzing an election. A common solution is to field exit polls, interviewing voters immediately after they leave their polling location. This method is rife with bias, however, and limited in the demographic data it collects directly.

For the 2020 general election, though, most states published their election results for each voting location. These publications were accompanied by the geographical areas assigned to each location, the voting precincts. As a result, geographic processing can now be applied to project precinct election results onto Census block groups. While precincts have few demographic traits directly, their geographies have characteristics that make them projectable onto U.S. Census geographies. Both state voting precincts and U.S. Census block groups:

- are exclusive, and do not overlap
- are adjacent, fully covering their corresponding state and potentially county
- have roughly the same size in area, population and voter presence

Analytically, a projection of local demographics does not allow conclusions about voters themselves. However, the dataset does allow statements related to the geographies that yield voting behavior. One could say, for example, that an area dominated by a particular voting pattern would have mean traits of age, race, income or household structure.

The dataset that results from this programming provides voting results allocated by Census block groups. The block group identifier can be joined to Census Decennial and American Community Survey demographic estimates.

DATA SOURCES

The state election results and geographies have been compiled by the Voting and Election Science Team on Harvard's Dataverse. State voting precincts lie within state and county boundaries. The Census Bureau, on the other hand, publishes its estimates across a variety of geographic definitions including a hierarchy of states, counties, census tracts and block groups. Their definitions can be found here. The geometric shapefiles for each block group are available here.

The lowest level of this geography, the census block, changes often and can become obsolete before the next census survey (Decennial or American Community Survey programs). The second-lowest census level, the block group, has the benefit of both granularity and stability. The 2020 Decennial survey details US demographics into 217,740 block groups with between a few hundred and a few thousand people.

Dataset Structure

The dataset's columns include:

| Column | Definition |
|---|---|
| BLOCKGROUP_GEOID | 12 digit primary key. Census GEOID of the block group row. This code concatenates: 2 digit state, 3 digit county within state, 6 digit Census Tract identifier, 1 digit Census Block Group identifier within tract |
| STATE | State abbreviation, redundant with the 2 digit state FIPS code above |
| REP | Votes for Republican party candidate for president |
| DEM | Votes for Democratic party candidate for president |
| LIB | Votes for Libertarian party candidate for president |
| OTH | Votes for presidential candidates other than Republican, Democratic or Libertarian |
| AREA | Square kilometers of area associated with this block group |
| GAP | Total area of the block group, net of area attributed to voting precincts |
| PRECINCTS | Number of voting precincts that intersect this block group |

ASSUMPTIONS, NOTES AND CONCERNS:

- Votes are attributed based upon the proportion of the precinct's area that intersects the corresponding block group. Alternative methods are left to the analyst's initiative.
- The 50 states and the District of Columbia are in scope, as they are the U.S. jurisdictions that vote in the general election for the U.S. Presidency.
- Three states did not report their results at the precinct level: South Dakota, Kentucky and West Virginia. A dummy block group is added for each of these states to maintain national totals. These states represent 2.1% of all votes cast.
- Counties are commonly coded using FIPS codes. However, each election result file may name the county field differently. Also, Delaware, Massachusetts, Alaska and the District of Columbia do not share county definitions.
- Block groups may be used to capture geographies without population, such as bodies of water. As a result, block groups without intersecting voting precincts are not uncommon.
- In the U.S., elections are administered at a state level with the Federal Election Commission compiling state totals against the Electoral College weights. The states have liberty, though, to define and change their own voting precincts https://en.wikipedia.org/wiki/Electoral_precinct.
- The Census Bureau...

Visit https://dataone.org/datasets/sha256%3A05707c1dc04a814129f751937a6ea56b08413546b18b351a85bc96da16a7f8b5 for complete metadata about this dataset.
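As a rough illustration of the area-proportional attribution described above, the sketch below splits a precinct's votes across block groups by the share of the precinct's area that each overlap covers, and splits a 12-digit GEOID into its four parts. It is not the dataset author's code: the function names, the input layout (precomputed overlap areas rather than real geometries) and the sample identifiers are assumptions made for the example.

```python
from collections import defaultdict


def split_geoid(geoid):
    """Split a 12-digit block group GEOID into state, county, tract, block group."""
    if len(geoid) != 12 or not geoid.isdigit():
        raise ValueError(f"expected 12 digits, got {geoid!r}")
    return {
        "state": geoid[0:2],         # 2 digit state
        "county": geoid[2:5],        # 3 digit county within state
        "tract": geoid[5:11],        # 6 digit census tract
        "block_group": geoid[11],    # 1 digit block group within tract
    }


def allocate_votes(precinct_votes, precinct_areas, intersections):
    """Attribute precinct votes to block groups in proportion to overlap area.

    precinct_votes: {precinct_id: {party: votes}}
    precinct_areas: {precinct_id: total precinct area}
    intersections:  iterable of (precinct_id, blockgroup_geoid, overlap_area)
    All three inputs are hypothetical structures for this sketch.
    """
    totals = defaultdict(lambda: defaultdict(float))
    for precinct_id, geoid, overlap_area in intersections:
        # Share of the precinct's area that falls inside this block group.
        share = overlap_area / precinct_areas[precinct_id]
        for party, votes in precinct_votes[precinct_id].items():
            totals[geoid][party] += votes * share
    return {geoid: dict(parties) for geoid, parties in totals.items()}


# Made-up precinct "P1" straddling two made-up block groups.
example = allocate_votes(
    {"P1": {"REP": 100, "DEM": 60}},
    {"P1": 4.0},
    [("P1", "160010001001", 1.0), ("P1", "160010001002", 3.0)],
)
```

In practice the overlap areas would come from a geometric intersection of precinct and block group shapefiles; allocated votes are fractional, and their sum per precinct equals the precinct total only when the overlaps cover the whole precinct.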
https://creativecommons.org/publicdomain/zero/1.0/
This file shows all election related data by state and county (i.e. total votes, Republican votes, Democratic votes, Republican voting percentage, Democratic voting percentage) for both the 2020 and 2024 U.S. Presidential Elections.
https://creativecommons.org/publicdomain/zero/1.0/
This folder contains the data behind the stories:
This project looks at patterns in open Democratic and Republican primary elections for the U.S. Senate, U.S. House and governor in 2018.
dem_candidates.csv contains information about the 811 candidates who have appeared on the ballot this year in Democratic primaries for Senate, House and governor, not counting races featuring a Democratic incumbent, as of August 7, 2018.
rep_candidates.csv contains information about the 774 candidates who have appeared on the ballot this year in Republican primaries for Senate, House and governor, not counting races featuring a Republican incumbent, through September 13, 2018.
Here is a description and source for each column in the accompanying datasets.
dem_candidates.csv and rep_candidates.csv include:
| Column | Description |
|---|---|
| Candidate | All candidates who received votes in 2018’s Democratic primary elections for U.S. Senate, U.S. House and governor in which no incumbent ran. Supplied by Ballotpedia. |
| State | The state in which the candidate ran. Supplied by Ballotpedia. |
| District | The office and, if applicable, congressional district number for which the candidate ran. Supplied by Ballotpedia. |
| Office Type | The office for which the candidate ran. Supplied by Ballotpedia. |
| Race Type | Whether it was a “regular” or “special” election. Supplied by Ballotpedia. |
| Race Primary Election Date | The date on which the primary was held. Supplied by Ballotpedia. |
| Primary Status | Whether the candidate lost (“Lost”) the primary or won/advanced to a runoff (“Advanced”). Supplied by Ballotpedia. |
| Primary Runoff Status | “None” if there was no runoff; “On the Ballot” if the candidate advanced to a runoff but it hasn’t been held yet; “Advanced” if the candidate won the runoff; “Lost” if the candidate lost the runoff. Supplied by Ballotpedia. |
| General Status | “On the Ballot” if the candidate won the primary or runoff and has advanced to November; otherwise, “None.” Supplied by Ballotpedia. |
| Primary % | The percentage of the vote received by the candidate in his or her primary. In states that hold runoff elections, we looked only at the first round (the regular primary). In states that hold all-party primaries (e.g., California), a candidate’s primary percentage is the percentage of the total Democratic vote they received. Unopposed candidates and candidates nominated by convention (not primary) are given a primary percentage of 100 but were excluded from our analysis involving vote share. Numbers come from official results posted by the secretary of state or local elections authority; if those were unavailable, we used unofficial election results from the New York Times. |
| Won Primary | “Yes” if the candidate won his or her primary and has advanced to November; “No” if he or she lost. |
dem_candidates.csv includes:
| Column | Description |
|---|---|
| Gender | “Male” or “Female.” Supplied by Ballotpedia. |
| Partisan Lean | The FiveThirtyEight partisan lean of the district or state in which the election was held. Partisan leans are calculated by finding the average difference between how a state or district voted in the past two presidential elections and how the country voted overall, with 2016 results weighted 75 percent and 2012 results weighted 25 percent. |
| Race | “White” if we identified the candidate as non-Hispanic white; “Nonwhite” if we identified the candidate as Hispanic and/or any nonwhite race; blank if we could not identify the candidate’s race or ethnicity. To determine race and ethnicity, we checked each candidate’s website to see if he or she identified as a certain race. If not, we spent no more than two minutes searching online news reports for references to the candidate’s race. |
| Veteran? | If the candidate’s website says that he or she served in the armed forces, we put “Yes.” If the website is silent on the subject (or explicitly says he or she didn’t serve), we put “No.” If the field was left blank, no website was available. |
| LGBTQ? | If the candidate’s website says that he or she is LGBTQ (including indirect references like to a same-sex partner), we put “Yes.” If the website is silent on the subject (or explicitly says he or she is straight), we put “No.” If the field was... |
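The Partisan Lean definition in the table above amounts to a weighted average of two local-minus-national differences. The sketch below shows that arithmetic only; the function name, the use of vote margins in percentage points as the input, and the sample numbers are assumptions for illustration, not FiveThirtyEight's published code.

```python
def partisan_lean(local_2016, national_2016, local_2012, national_2012):
    """Weighted average of how far a state or district voted from the country.

    Inputs are assumed to be margins in percentage points measured the same
    way for the local area and the nation (for example Democratic minus
    Republican vote share). 2016 is weighted 75 percent, 2012 25 percent.
    """
    diff_2016 = local_2016 - national_2016
    diff_2012 = local_2012 - national_2012
    return 0.75 * diff_2016 + 0.25 * diff_2012


# Hypothetical district: 8 points more Democratic than the country in 2016
# and 2 points more Democratic in 2012.
lean = partisan_lean(10.0, 2.0, 6.0, 4.0)  # 0.75 * 8 + 0.25 * 2 = 6.5
```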
https://www.icpsr.umich.edu/web/ICPSR/studies/38506/terms
This dataset contains counts of voter registration and voter turnout for all counties in the United States for the years 2004-2022. It also contains measures of each county's Democratic and Republican partisanship, including six-year longitudinal partisan indices for 2006-2022.
https://dataverse-staging.rdmc.unc.edu/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.15139/S3/YCSYUN
Washington and California adopted the Top-Two Primary in 2008 and 2012, respectively. Under this new system, all candidates regardless of party affiliation run against each other, narrowing the field down to the top two for the general election. In some jurisdictions, the general election features two candidates from the same party. Ten percent of California voters chose not to vote in the 2016 U.S. Senate election, which featured two Democrats. Using data from the Cooperative Congressional Election Study (2012-2016), I find that among those who vote in the national November elections, orphans, or voters without a copartisan candidate on the ballot, are more likely to undervote, opting out of voting in their congressional race. Levels of undervoting are nearly 20 percentage points higher for orphaned voters than for non-orphaned voters. Additionally, voters who abstain perceive more ideological distance between themselves and the candidates than voters who cast a vote. These findings support a multi-step framework for vote decisions in same-party matchups: voters are more likely to undervote if they are unable to vote for a candidate from their party (partisan model), but all voters are more likely to vote for a candidate when they perceive more ideological proximity (ideological model).
https://www.icpsr.umich.edu/web/ICPSR/studies/33/terms
This data collection contains three files of county-level electoral returns for Ohio, Michigan, Nebraska, and New York for 1912 and 1920-1940. The data files were prepared for instructional use in the ICPSR Training Program and for graduate-level social science courses at the University of Michigan and other university campuses. They contain social, demographic, electoral, and economic data for various areas of the United States, usually for an extended period of time. Part 1, Ohio Referenda Counties as Units, and Part 2, Ohio Referenda as Units, consist of county-level returns for 42 referenda in the 1912 general election in Ohio. Data are provided for the names of counties, votes in the affirmative, total number of votes, and percentage of the "yes" votes for referenda on issues such as civil juries, capital punishment, governor's veto, workmen's compensation, 8-hour day, removal of elected officials, prison labor, women's suffrage, and taxes. The referenda included many questions considered important in the Progressive Movement. Part 3, Data Sets for Three States (Michigan, Nebraska, and New York), consists of electoral returns for the offices of president, governor, and United States representative, as well as ecological and population characteristics data in the period 1920-1940. Data are provided for the raw votes and percentage of the total votes received by the Democratic, Republican, Progressive, and other parties. Items also provide information on population characteristics, such as total population, voting age population, urban population, persons of other races, school attendance, and religion. Economic variables provide information on local government expenditures and revenues, agriculture and manufacturing, employment and unemployment, and the total number of banks and bank deposits.
This is the dataset I used to figure out which sociodemographic factor, including the current pandemic status of each state, has the most significant impact on the result of the US Presidential election last year. I also included sentiment scores of tweets created from 2020-10-15 to 2020-11-02, in order to figure out the effect of positive/negative emotion toward each candidate, Donald Trump and Joe Biden, on the result of the election.
Details for each variable are as below:
- state: name of each state in the United States, including District of Columbia
- elec16, elec20: dummy variable indicating whether Trump gained the electoral votes of each state or not. If the electors cast their votes for Trump, the value is 1; otherwise the value is 0
- elecchange: dummy variable indicating whether each party flipped the result in 2020 compared to that of 2016
- demvote16: the rate of votes that the Democrats, i.e. Hillary Clinton, earned in the 2016 Presidential election
- repvote16: the rate of votes that the Republicans, i.e. Donald Trump, earned in the 2016 Presidential election
- demvote20: the rate of votes that the Democrats, i.e. Joe Biden, earned in the 2020 Presidential election
- repvote20: the rate of votes that the Republicans, i.e. Donald Trump, earned in the 2020 Presidential election
- demvotedif: the difference between demvote20 and demvote16
- repvotedif: the difference between repvote20 and repvote16
- pop: the population of each state
- cumulcases: the cumulative COVID-19 cases on Election Day
- caseMar ~ caseOct: the cumulative COVID-19 cases during each month
- Marper10k ~ Octper10k: the cumulative COVID-19 cases during each month per 10 thousand people
- unemp20: the unemployment rate of each state this year before the election
- unempdif: the difference between the unemployment rate of the last year and that of this year
- jan20unemp ~ oct20unemp: the unemployment rate of each month
- cumulper10k: the cumulative COVID-19 cases on Election Day per 10 thousand people
- b_str_poscount_total: the total number of positive tweets on Biden measured by SentiStrength
- b_str_negcount_total: the total number of negative tweets on Biden measured by SentiStrength
- t_str_poscount_total: the total number of positive tweets on Trump measured by SentiStrength
- t_str_negcount_total: the total number of negative tweets on Trump measured by SentiStrength
- b_str_posprop_total: the proportion of positive tweets on Biden measured by SentiStrength
- b_str_negprop_total: the proportion of negative tweets on Biden measured by SentiStrength
- t_str_posprop_total: the proportion of positive tweets on Trump measured by SentiStrength
- t_str_negprop_total: the proportion of negative tweets on Trump measured by SentiStrength
- white: the proportion of white people
- colored: the proportion of nonwhite people
- secondary: the proportion of people who have attained secondary education
- tertiary: the proportion of people who have attained tertiary education
- q3gdp20: GDP of the 3rd quarter 2020
- q3gdprate: the growth rate of the 3rd quarter 2020, compared to that of the same quarter last year
- 3qsgdp20: GDP of 3 quarters 2020
- 3qsrate20: the growth rate of GDP compared to that of the 3 quarters last year
- q3gdpdif: the difference in the level of GDP of the 3rd quarter compared to the last quarter
- q3rate: the growth rate of the 3rd quarter compared to the last quarter
- access: the proportion of households having Internet access
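Several of the variables listed above are simple functions of others. The sketch below recomputes a few of them from one state's row as a consistency check. The sign convention (2020 minus 2016), the coding of elecchange as 1 for a flipped state, and the sample values are assumptions, since the description does not spell them out.

```python
def derive_fields(row):
    """Recompute derived variables for one state's row (a plain dict)."""
    out = dict(row)
    # Assumed convention: difference is the 2020 value minus the 2016 value.
    out["demvotedif"] = row["demvote20"] - row["demvote16"]
    out["repvotedif"] = row["repvote20"] - row["repvote16"]
    # Assumed coding: 1 if the state's electoral votes changed party, else 0.
    out["elecchange"] = int(row["elec16"] != row["elec20"])
    # Cumulative cases on Election Day per 10 thousand residents.
    out["cumulper10k"] = row["cumulcases"] * 10_000 / row["pop"]
    return out


# Made-up numbers for a hypothetical state.
sample = derive_fields({
    "elec16": 1, "elec20": 0,
    "demvote16": 0.5, "demvote20": 0.75,
    "repvote16": 0.5, "repvote20": 0.25,
    "pop": 100_000, "cumulcases": 500,
})
```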
https://dataverse.harvard.edu/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.7910/DVN/TZVDUQ
We build a model of American presidential voting in which the cumulative impression left by political events determines the preferences of voters. The impression varies by voter, depending on their age at the time the events took place. We find the Gallup presidential approval rating time series reflects the major events that influence voter preferences, with the most influential occurring during a voter’s teenage and early adult years. Our fitted model is predictive, explaining more than eighty percent of the variation in voting trends over the last half-century. It is also interpretable, dividing voters into five meaningful generations: New Deal Democrats, Eisenhower Republicans, 1960s Liberals, Reagan Conservatives, and Millennials. We present each generation in context of the political events that shaped its preferences, beginning in 1940 and ending with the 2016 election.
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset covers U.S. elections from 1824 to 2020. Its columns are:
1. Year: Description: The year in which the U.S. election took place. Type: Numeric (Integer). Example: 1824, 1860, 1920, 2020
2. Candidate: Description: The name of the candidate participating in the election. Type: String (Candidate's name). Example: John Adams, Abraham Lincoln, Franklin D. Roosevelt, Joe Biden
3. Party: Description: The political party affiliation of the candidate. Type: String (Party name or abbreviation). Example: Democratic, Republican, Whig, Libertarian
4. Popular Vote: Description: The total number of votes that the candidate received in the popular vote. Type: Numeric (Integer). Example: 500,000, 5,000,000, 70,000,000
5. Result: Description: The outcome of the election for the specified candidate. Type: String (e.g., "Winner," "Runner-up," "Withdrew"). Example: Winner, Runner-up, Withdrew, Conceded
6. Percentage: Description: The percentage of the total popular vote received by the candidate. Type: Numeric (Float). Example: 25.3%, 49.8%, 60.5%
This dataset appears to capture essential information about U.S. elections over time, including details about the candidates, their political party affiliations, the number of popular votes they received, the outcome of the election, and the percentage of the total popular vote they secured. This comprehensive dataset allows for the analysis of historical U.S. election trends and outcomes.
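The examples given for Popular Vote ("5,000,000") and Percentage ("49.8%") carry thousands separators and a percent sign. If the file stores them that way, which is an assumption based only on those examples, they need cleaning before numeric use. A minimal sketch with hypothetical helper names:

```python
def parse_popular_vote(text):
    """Convert a vote count such as '5,000,000' to an integer."""
    return int(text.replace(",", "").strip())


def parse_percentage(text):
    """Convert a percentage such as '49.8%' to a float in percent units."""
    return float(text.strip().rstrip("%"))


votes = parse_popular_vote("5,000,000")
share = parse_percentage("49.8%")
```

If the columns are already numeric in the file, these helpers are unnecessary.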
https://creativecommons.org/publicdomain/zero/1.0/
The dataset contains U.S. House of Representatives election results for all 435 legislative districts from 2002 through 2018. It includes final race ratings from Cook Political Report beginning in 2008.
At-large districts are coded as the 1st district, e.g. WY-01.
In races where multiple members of the same party are on the ballot, the party vote percentages represent only the highest-vote-getting candidate from each party.
There are two rows where the winner was neither a Democrat nor a Republican. They are Bernie Sanders in 2002 and 2004 (VT-01).
Most of the Cook race rating data was scraped from their site here. Everything else was scraped from Wikipedia so errors are possible.
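Two of the coding rules above, at-large districts written as the 1st district and party percentages taken from each party's highest-vote-getting candidate, can be sketched as follows. The function names and input shapes are hypothetical; the dataset's own build script is not shown in the description.

```python
def district_code(state, district_number=None):
    """Build a code like 'WY-01'; at-large seats (None or 0) become district 1."""
    number = district_number if district_number else 1
    return f"{state}-{number:02d}"


def top_share_by_party(candidates):
    """Keep only the highest vote percentage per party.

    candidates: iterable of (party, vote_percentage) pairs for one race.
    """
    best = {}
    for party, pct in candidates:
        if party not in best or pct > best[party]:
            best[party] = pct
    return best


# Hypothetical race with two Democrats and one Republican on the ballot.
race = top_share_by_party([("D", 40.0), ("D", 25.0), ("R", 35.0)])
```

Under this rule the party percentages in such a race no longer sum to 100, which matters when computing margins from the file.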
The downloadable ZIP file contains Esri shapefiles and PDF maps. It contains the information used to determine the location of the new legislative and congressional district boundaries for the state of Idaho as adopted by Idaho's first Commission on Redistricting on March 9, 2002, including viewable and printable legislative and congressional district maps, viewable and printable reports, and importable geographic data files.

These data were contributed to INSIDE Idaho at the University of Idaho Library in 2001. CD/DVD-ROM availability: https://alliance-primo.hosted.exlibrisgroup.com/permalink/f/m1uotc/CP71156191150001451

These files were created by a six-member bipartisan commission of three Democrats and three Republicans. This commission was given 90 days to redraw congressional and legislative district boundaries for the state of Idaho. Due to lawsuits, the process was extended. This legislative plan was approved by the commission on March 9, 2002 and was previously called L97. All digital data originates from TIGER/Line files and 2000 U.S. Census data.

Frequently asked questions:

How often are Idaho's legislative and congressional districts redrawn?
Once every ten years after each census, as required by law, or when directed by the Idaho Supreme Court. The most recent redistricting followed the 2000 census. Redistricting is not expected to occur again in Idaho until after the 2010 census.

Who redrew Idaho's legislative and congressional districts?
In 2001, for the first time, Idaho used a citizens' commission to redraw its legislative and congressional district boundaries. Before Idaho voters amended the state Constitution in 1994 to create a Redistricting Commission, redistricting was done by a committee of the Idaho Legislature. The committee's new district plans then had to pass the Legislature before becoming law.

Who was on the Redistricting Commission?
Idaho's first Commission on Redistricting was composed of Co-Chairmen Kristi Sellers of Chubbuck and Tom Stuart of Boise and Stanley. The other members were Raymond Givens of Coeur d'Alene, Dean Haagenson of Hayden Lake, Karl Shurtliff of Boise, and John Hepworth of Buhl, who resigned effective December 4, 2001 and was replaced by Derlin Taylor of Burley.

What are the requirements for being a Redistricting Commissioner?
According to Idaho law, no person may serve on the commission who:
1. Is not a registered voter of the state at the time of selection; or
2. Is or has been within one (1) year a registered lobbyist; or
3. Is or has been within two (2) years prior to selection an elected official or elected legislative district, county or state party officer. (This requirement does not apply to precinct committeepersons.)
The individual appointing authorities may consider additional criteria beyond these statutory requirements. Idaho law also prohibits a person who has served on the Redistricting Commission from serving in either house of the legislature for five years following their service on the commission.

When did Idaho's first Commission on Redistricting meet?
Idaho law allows the Commission only 90 days to conduct its business. The Redistricting Commission was formed on June 5, 2001. Its 90-day time period would expire on September 3, 2001. After holding hearings around the state in June and July, a majority of the Commission voted to adopt new legislative and congressional districts on August 22, 2001. On November 29, 2001, the Idaho Supreme Court ruled the Commission's legislative redistricting plan unconstitutional and directed the Commission to reconvene and adopt an alternative plan. The Commission did so, adopting a new plan on January 8, 2002. The Idaho Supreme Court found the Commission's second legislative map unconstitutional on March 1, 2002 and ordered the Commission to try again. The Commission adopted a third plan on March 9, 2002. The Supreme Court denied numerous challenges to this third map. It then became the basis for the 2002 primary and general elections and is expected to be used until the 2012 elections.

What is the basic timetable for Idaho to redraw its legislative and congressional districts?
Typically, and according to Idaho law, the Redistricting Commission cannot be formally convened until after Idaho has received the official census counts and not before June 1 of a year ending in one. Idaho's first Commission on Redistricting was officially created on June 5, 2001. By law, a Commission then has 90 days (or until September 3, 2001 in the case of Idaho's first Commission) to approve new legislative and congressional district boundaries based on the most recent census figures. If at least four of the six commissioners fail to approve new legislative and congressional district plans before that 90-day time period expires, the Commission will cease to exist. The law is silent as to what happens next.

Could you summarize the important dates for Idaho's first Commission on Redistricting one more time please?
After January 1, 2001 but before April 1, 2001: As required by federal law, the Census Bureau must deliver to the states the small area population counts upon which redistricting is based. The Census Bureau determines the exact date within this window when Idaho will get its population figures. Idaho's were delivered on March 23, 2001.

Why conduct a census anyway?
The original and still primary reason for conducting a national census every ten years is to determine how the 435 seats in the United States House of Representatives are to be apportioned among the 50 states. Each state receives its share of the 435 seats in the U.S. House based on the proportion of its population to that of the total U.S. population. For example, the population shifts during the 1990's resulted in the Northeastern states losing population and therefore seats in Congress to the Southern and the Western states.

What is reapportionment?
Reapportionment is a federal issue that applies only to Congress. It is the process of dividing up the 435 seats in the U.S. House of Representatives among the 50 states based on each state's proportion of the total U.S. population as determined by the most recent census. Apportionment determines each state's power in Congress, as expressed by the size of its congressional delegation, and, through the electoral college, directly affects the selection of the president (each state's number of votes in the electoral college equals the number of its representatives and senators in Congress). Like all states, Idaho has two U.S. senators. Based on our 1990 population of 1,006,000 people and our 2000 population of 1,293,953, and relative to the populations of the other 49 states, Idaho will have two seats in the U.S. House of Representatives. Even with the state's 28.5% population increase from 1990 to 2000, Idaho will not be getting a third seat in the U.S. House of Representatives. Assuming Idaho keeps growing at the same rate it did through the decade of the 1990's, it will likely be 30 or 40 years (after 3 or 4 more censuses) before Idaho gets a third congressional seat.

What is redistricting?
Redistricting is the process of redrawing the boundaries of legislative and congressional districts within each state to achieve population equality among all congressional districts and among all legislative districts. The U.S. Constitution requires this be done for all congressional districts after each decennial census. The Idaho Constitution also requires that this be done for all legislative districts after each census. The democratic principle behind redistricting is "one person, one vote." Requiring that districts be of equal population ensures that every elected state legislator or U.S. congressman represents very close to the same number of people in that state; therefore, each citizen's vote will carry the same weight.

How are reapportionment and redistricting related to the census?
The original and still primary reason for conducting a census every ten years is to apportion the (now) 435 seats in the U.S. House of Representatives among the several states. The census records population changes and is the legally recognized basis for redrawing electoral districts of equal population.

Why is redistricting so important?
In a democracy, it is important for all citizens to have equal representation. The political parties also see redistricting as an opportunity to draw districts that favor electing their members and, conversely, that are unfavorable for electing their political opposition. (It's for this reason that redistricting has been described as "the purest form of political bloodsport.")

What is PL 94-171?
Public Law (PL) 94-171 (Title 13, United States Code) was enacted by Congress in 1975. It was intended to provide state legislatures with small-area census population totals for use in redistricting. The law's origins lie with the "one person, one vote" court decisions in the 1960's. State legislatures needed to reconcile the Census Bureau's small geographic area boundaries with voting tabulation district (precinct) boundaries to create legislative districts with balanced populations. The Census Bureau worked with state legislatures and others to meet this need beginning with the 1980 census. The resulting Public Law 94-171 allows states to work voluntarily with the Census Bureau to match voting district boundaries with small-area census boundaries. With this done, the Bureau can report to those participating states the census population totals broken down by major race group and Hispanic origin for the total population and for persons aged 18 years and older for each census subdivision. Idaho participated in the Bureau's Census 2000 Redistricting Data Program and, where counties used visible features to delineate precinct boundaries, matched those boundaries with census reporting areas. In those instances where counties did not use visible features to