12 datasets found
  1. AP VoteCast 2020 - General Election

    • data.world
    csv, zip
    Updated Mar 29, 2024
    Cite
    The Associated Press (2024). AP VoteCast 2020 - General Election [Dataset]. https://data.world/associatedpress/ap-votecast
    Explore at:
    Available download formats: csv, zip
    Dataset updated
    Mar 29, 2024
    Authors
    The Associated Press
    Description

    AP VoteCast is a survey of the American electorate conducted by NORC at the University of Chicago for Fox News, NPR, PBS NewsHour, Univision News, USA Today Network, The Wall Street Journal and The Associated Press.

    AP VoteCast combines interviews with a random sample of registered voters drawn from state voter files with self-identified registered voters selected using nonprobability approaches. In general elections, it also includes interviews with self-identified registered voters conducted using NORC’s probability-based AmeriSpeak® panel, which is designed to be representative of the U.S. population.

    Interviews are conducted in English and Spanish. Respondents may receive a small monetary incentive for completing the survey. Participants selected as part of the random sample can be contacted by phone and mail and can take the survey by phone or online. Participants selected as part of the nonprobability sample complete the survey online.

    In the 2020 general election, the survey of 133,103 interviews with registered voters was conducted between Oct. 26 and Nov. 3, concluding as polls closed on Election Day. AP VoteCast delivered data about the presidential election in all 50 states as well as all Senate and governors’ races in 2020.

    Using this Data - IMPORTANT

    This is survey data and must be properly weighted during analysis: DO NOT REPORT THIS DATA AS RAW OR AGGREGATE NUMBERS!!

    Instead, use statistical software such as R or SPSS to weight the data.
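
    The warning above can be illustrated with a minimal sketch, written here in Python rather than R or SPSS. The respondent rows and the field names `candidate` and `weight` are hypothetical, not the actual AP VoteCast column names; the sketch only shows that each respondent should count in proportion to their survey weight, and that the raw tally can differ sharply from the weighted one.

```python
from collections import defaultdict

def weighted_shares(rows, answer_key, weight_key):
    """Return each answer's share of the total survey weight."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[answer_key]] += row[weight_key]
    grand_total = sum(totals.values())
    return {answer: w / grand_total for answer, w in totals.items()}

# Made-up respondents for illustration only; real files use their own columns.
respondents = [
    {"candidate": "A", "weight": 0.5},
    {"candidate": "A", "weight": 0.5},
    {"candidate": "A", "weight": 1.0},
    {"candidate": "B", "weight": 2.0},
]

shares = weighted_shares(respondents, "candidate", "weight")

# Unweighted share of "A": three of four respondents.
raw_share_a = sum(r["candidate"] == "A" for r in respondents) / len(respondents)
```

    In this toy example candidate A is named by three of four respondents, but holds only half of the total weight, which is why the raw count should not be reported.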

    National Survey

    The national AP VoteCast survey of voters and nonvoters in 2020 is based on the results of the 50 state-based surveys and a nationally representative survey of 4,141 registered voters conducted between Nov. 1 and Nov. 3 on the probability-based AmeriSpeak panel. It included 41,776 probability interviews completed online and via telephone, and 87,186 nonprobability interviews completed online. The margin of sampling error is plus or minus 0.4 percentage points for voters and 0.9 percentage points for nonvoters.

    State Surveys

    In 20 states in 2020, AP VoteCast is based on roughly 1,000 probability-based interviews conducted online and by phone, and roughly 3,000 nonprobability interviews conducted online. In these states, the margin of sampling error is about plus or minus 2.3 percentage points for voters and 5.5 percentage points for nonvoters.

    In an additional 20 states, AP VoteCast is based on roughly 500 probability-based interviews conducted online and by phone, and roughly 2,000 nonprobability interviews conducted online. In these states, the margin of sampling error is about plus or minus 2.9 percentage points for voters and 6.9 percentage points for nonvoters.

    In the remaining 10 states, AP VoteCast is based on about 1,000 nonprobability interviews conducted online. In these states, the margin of sampling error is about plus or minus 4.5 percentage points for voters and 11.0 percentage points for nonvoters.

    Although there is no statistically agreed upon approach for calculating margins of error for nonprobability samples, these margins of error were estimated using a measure of uncertainty that incorporates the variability associated with the poll estimates, as well as the variability associated with the survey weights as a result of calibration. After calibration, the nonprobability sample yields approximately unbiased estimates.
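
    The description does not give the formula behind these margins of error. As a rough illustration of why variable survey weights widen the margin, the sketch below uses Kish's effective sample size, a common textbook approximation that is assumed here and is not necessarily the measure NORC used. The weights are invented.

```python
import math

def kish_effective_n(weights):
    """Kish's approximate effective sample size: (sum w)^2 / sum(w^2)."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)

def margin_of_error(weights, p=0.5, z=1.96):
    """Approximate 95% margin of error for a weighted proportion p."""
    n_eff = kish_effective_n(weights)
    return z * math.sqrt(p * (1 - p) / n_eff)

# Two invented samples of 1,000 interviews each.
equal = [1.0] * 1000                    # every respondent counts the same
unequal = [0.5] * 500 + [1.5] * 500     # weights vary after adjustment

moe_equal = margin_of_error(equal)
moe_unequal = margin_of_error(unequal)  # larger, because n_eff drops to 800
```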

    As with all surveys, AP VoteCast is subject to multiple sources of error, including from sampling, question wording and order, and nonresponse.

    Sampling Details

    Probability-based Registered Voter Sample

    In each of the 40 states in which AP VoteCast included a probability-based sample, NORC obtained a sample of registered voters from Catalist LLC’s registered voter database. This database includes demographic information, as well as addresses and phone numbers for registered voters, allowing potential respondents to be contacted via mail and telephone. The sample is stratified by state, partisanship, and a modeled likelihood to respond to the postcard based on factors such as age, race, gender, voting history, and census block group education. In addition, NORC attempted to match sampled records to a registered voter database maintained by L2, which provided additional phone numbers and demographic information.

    Prior to dialing, all probability sample records were mailed a postcard inviting them to complete the survey either online using a unique PIN or via telephone by calling a toll-free number. Postcards were addressed by name to the sampled registered voter if that individual was under age 35; postcards were addressed to “registered voter” in all other cases. Telephone interviews were conducted with the adult who answered the phone following confirmation of registered voter status in the state.

    Nonprobability Sample

    Nonprobability participants include panelists from Dynata or Lucid, including members of their third-party panels. In addition, some registered voters were selected from the voter file, matched to email addresses by V12, and recruited via an email invitation to the survey. Digital fingerprint software and panel-level ID validation are used to prevent respondents from completing the AP VoteCast survey multiple times.

    AmeriSpeak Sample

    During the initial recruitment phase of the AmeriSpeak panel, randomly selected U.S. households were sampled with a known, non-zero probability of selection from the NORC National Sample Frame and then contacted by mail, email, telephone and field interviewers (face-to-face). The panel provides sample coverage of approximately 97% of the U.S. household population. Those excluded from the sample include people with P.O. Box-only addresses, some addresses not listed in the U.S. Postal Service Delivery Sequence File and some newly constructed dwellings. Registered voter status was confirmed in field for all sampled panelists.

    Weighting Details

    AP VoteCast employs a four-step weighting approach that combines the probability sample with the nonprobability sample and refines estimates at a subregional level within each state. In a general election, the 50 state surveys and the AmeriSpeak survey are weighted separately and then combined into a survey representative of voters in all 50 states.

    State Surveys

    First, weights are constructed separately for the probability sample (when available) and the nonprobability sample for each state survey. These weights are adjusted to population totals to correct for demographic imbalances in age, gender, education and race/ethnicity of the responding sample compared to the population of registered voters in each state. In 2020, the adjustment targets are derived from a combination of data from the U.S. Census Bureau’s November 2018 Current Population Survey Voting and Registration Supplement, Catalist’s voter file and the Census Bureau’s 2018 American Community Survey. Prior to adjusting to population totals, the probability-based registered voter list sample weights are adjusted for differential non-response related to factors such as availability of phone numbers, age, race and partisanship.
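
    Adjusting weights to population totals can be sketched for a single variable as follows. This is a simplified post-stratification example under stated assumptions: the field names, groups and totals are hypothetical, and the actual AP VoteCast procedure adjusts on several demographics at once and applies nonresponse adjustments first, which this sketch does not attempt.

```python
from collections import defaultdict

def poststratify(rows, group_key, weight_key, population_totals):
    """Scale weights so each group's weighted total equals its population total."""
    sample_totals = defaultdict(float)
    for row in rows:
        sample_totals[row[group_key]] += row[weight_key]

    adjusted = []
    for row in rows:
        group = row[group_key]
        factor = population_totals[group] / sample_totals[group]
        adjusted.append({**row, weight_key: row[weight_key] * factor})
    return adjusted

# Invented sample that over-represents the younger group.
sample = [
    {"age_group": "18-44", "weight": 1.0},
    {"age_group": "18-44", "weight": 1.0},
    {"age_group": "18-44", "weight": 1.0},
    {"age_group": "45+", "weight": 1.0},
]
# Invented registered-voter totals for the same groups.
targets = {"18-44": 600, "45+": 400}

weighted = poststratify(sample, "age_group", "weight", targets)
```

    Each younger respondent ends up with a weight of 200 and the single older respondent with 400, so the weighted sample matches the target totals.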

    Second, all respondents receive a calibration weight. The calibration weight is designed to ensure the nonprobability sample is similar to the probability sample in regard to variables that are predictive of vote choice, such as partisanship or direction of the country, which cannot be fully captured through the prior demographic adjustments. The calibration benchmarks are based on regional level estimates from regression models that incorporate all probability and nonprobability cases nationwide.

    Third, all respondents in each state are weighted to improve estimates for substate geographic regions. This weight combines the weighted probability (if available) and nonprobability samples, and then uses a small area model to improve the estimate within subregions of a state.

    Fourth, the survey results are weighted to the actual vote count following the completion of the election. This weighting is done in 10–30 subregions within each state.

    National Survey

    In a general election, the national survey is weighted to combine the 50 state surveys with the nationwide AmeriSpeak survey. Each of the state surveys is weighted as described. The AmeriSpeak survey receives a nonresponse-adjusted weight that is then adjusted to national totals for registered voters that in 2020 were derived from the U.S. Census Bureau’s November 2018 Current Population Survey Voting and Registration Supplement, the Catalist voter file and the Census Bureau’s 2018 American Community Survey. The state surveys are further adjusted to represent their appropriate proportion of the registered voter population for the country and combined with the AmeriSpeak survey. After all votes are counted, the national data file is adjusted to match the national popular vote for president.

  2. Associated Press Poll #843N: Congressional Election

    • dataverse-staging.rdmc.unc.edu
    • dataverse.unc.edu
    • +1 more
    pdf, txt
    Updated Nov 30, 2007
    Cite
    UNC Dataverse (2007). Associated Press Poll #843N: Congressional Election [Dataset]. https://dataverse-staging.rdmc.unc.edu/dataset.xhtml?persistentId=hdl:1902.29/D-31425
    Explore at:
    Available download formats: txt (87870), pdf (224198)
    Dataset updated
    Nov 30, 2007
    Dataset provided by
    UNC Dataverse
    License

    https://dataverse-staging.rdmc.unc.edu/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=hdl:1902.29/D-31425

    Description

    This survey focuses on the congressional election. Issues addressed include approval of President Clinton, Monica Lewinsky, likelihood of voting in the election, and the most important issues in the election. Demographic variables include sex, age, education, race, income, and party affiliation.

  3. Johns Hopkins COVID-19 Case Tracker

    • data.world
    csv, zip
    Updated Mar 25, 2025
    Cite
    The Associated Press (2025). Johns Hopkins COVID-19 Case Tracker [Dataset]. https://data.world/associatedpress/johns-hopkins-coronavirus-case-tracker
    Explore at:
    Available download formats: zip, csv
    Dataset updated
    Mar 25, 2025
    Authors
    The Associated Press
    Time period covered
    Jan 22, 2020 - Mar 9, 2023
    Area covered
    Description

    Updates

    • Notice of data discontinuation: Since the start of the pandemic, AP has reported case and death counts from data provided by Johns Hopkins University. Johns Hopkins University has announced that they will stop their daily data collection efforts after March 10. As Johns Hopkins stops providing data, the AP will also stop collecting daily numbers for COVID cases and deaths. The HHS and CDC now collect and visualize key metrics for the pandemic. AP advises using those resources when reporting on the pandemic going forward.

    • April 9, 2020

      • The population estimate data for New York County, NY has been updated to include all five New York City counties (Kings County, Queens County, Bronx County, Richmond County and New York County). This has been done to match the Johns Hopkins COVID-19 data, which aggregates counts for the five New York City counties to New York County.
    • April 20, 2020

      • Johns Hopkins death totals in the US now include confirmed and probable deaths in accordance with CDC guidelines as of April 14. One significant result of this change was an increase of more than 3,700 deaths in the New York City count. This change will likely result in increases for death counts elsewhere as well. The AP does not alter the Johns Hopkins source data, so probable deaths are included in this dataset as well.
    • April 29, 2020

      • The AP is now providing timeseries data for counts of COVID-19 cases and deaths. The raw counts are provided here unaltered, along with a population column with Census ACS-5 estimates and calculated daily case and death rates per 100,000 people. Please read the updated caveats section for more information.
    • September 1, 2020

      • Johns Hopkins is now providing counts for the five New York City counties individually.
    • February 12, 2021

      • The Ohio Department of Health recently announced that as many as 4,000 COVID-19 deaths may have been underreported through the state’s reporting system, and that the "daily reported death counts will be high for a two to three-day period."
      • Because deaths data will be anomalous for consecutive days, we have chosen to freeze Ohio's rolling average for daily deaths at the last valid measure until Johns Hopkins is able to back-distribute the data. The raw daily death counts, as reported by Johns Hopkins and including the backlogged death data, will still be present in the new_deaths column.
    • February 16, 2021

      • Johns Hopkins has reconciled Ohio's historical deaths data with the state.

    Overview

    The AP is using data collected by the Johns Hopkins University Center for Systems Science and Engineering as our source for outbreak caseloads and death counts for the United States and globally.

    The Hopkins data is available at the county level in the United States. The AP has paired this data with population figures and county rural/urban designations, and has calculated caseload and death rates per 100,000 people. Be aware that caseloads may reflect the availability of tests -- and the ability to turn around test results quickly -- rather than actual disease spread or true infection rates.
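
    The per-100,000 rates mentioned above follow from simple arithmetic. A minimal sketch, assuming a helper name (`rate_per_100k`) and example figures that are illustrative and not taken from the dataset:

```python
def rate_per_100k(count, population):
    """Return cases or deaths per 100,000 residents.

    Returns None when no usable population figure exists, for example
    for rows that are not assigned to a single county.
    """
    if population is None or population <= 0:
        return None
    return count / population * 100_000

# Invented county: 250 cases among 50,000 residents.
county_rate = rate_per_100k(250, 50_000)

# A row without a population estimate yields no rate.
unassigned_rate = rate_per_100k(40, None)
```

    Rates make counties of different sizes comparable, but they inherit the testing caveat above: a low rate may reflect limited testing rather than limited spread.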

    This data is from the Hopkins dashboard that is updated regularly throughout the day. Like all organizations dealing with data, Hopkins is constantly refining and cleaning up their feed, so there may be brief moments where data does not appear correctly. At this link, you’ll find the Hopkins daily data reports, and a clean version of their feed.

    The AP is updating this dataset hourly at 45 minutes past the hour.

    To learn more about AP's data journalism capabilities for publishers, corporations and financial institutions, go here or email kromano@ap.org.

    Queries

    Use AP's queries to filter the data or to join to other datasets we've made available to help cover the coronavirus pandemic.

    Interactive

    The AP has designed an interactive map to track COVID-19 cases reported by Johns Hopkins.

    https://datawrapper.dwcdn.net/nRyaf/15/

    Interactive Embed Code

    <iframe title="USA counties (2018) choropleth map Mapping COVID-19 cases by county" aria-describedby="" id="datawrapper-chart-nRyaf" src="https://datawrapper.dwcdn.net/nRyaf/10/" scrolling="no" frameborder="0" style="width: 0; min-width: 100% !important;" height="400"></iframe><script type="text/javascript">(function() {'use strict';window.addEventListener('message', function(event) {if (typeof event.data['datawrapper-height'] !== 'undefined') {for (var chartId in event.data['datawrapper-height']) {var iframe = document.getElementById('datawrapper-chart-' + chartId) || document.querySelector("iframe[src*='" + chartId + "']");if (!iframe) {continue;}iframe.style.height = event.data['datawrapper-height'][chartId] + 'px';}}});})();</script>
    

    Caveats

    • This data represents the number of cases and deaths reported by each state and has been collected by Johns Hopkins from a number of sources cited on their website.
    • In some cases, deaths or cases of people who've crossed state lines -- either to receive treatment or because they became sick and couldn't return home while traveling -- are reported in a state they aren't currently in, because of state reporting rules.
    • In some states, there are a number of cases not assigned to a specific county -- for those cases, the county name is "unassigned to a single county".
    • This data should be credited to Johns Hopkins University's COVID-19 tracking project. The AP is simply making it available here for ease of use for reporters and members.
    • Caseloads may reflect the availability of tests -- and the ability to turn around test results quickly -- rather than actual disease spread or true infection rates.
    • Population estimates at the county level are drawn from 2014-18 5-year estimates from the American Community Survey.
    • The Urban/Rural classification scheme is from the Centers for Disease Control and Prevention's National Center for Health Statistics. It puts each county into one of six categories -- from Large Central Metro to Non-Core -- according to population and other characteristics. More details about the classifications can be found here.

    Johns Hopkins timeseries data

    • Johns Hopkins pulls data regularly to update their dashboard. Once a day, around 8pm EDT, Johns Hopkins adds the counts for all areas they cover to the timeseries file. These counts are snapshots of the latest cumulative counts provided by the source on that day. This can lead to inconsistencies if a source updates their historical data for accuracy, either increasing or decreasing the latest cumulative count.
    • Johns Hopkins periodically edits their historical timeseries data for accuracy. They provide a file documenting all errors in their timeseries files that they have identified and fixed here.
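
    Because the timeseries stores cumulative snapshots, daily counts have to be derived by differencing, and a downward revision by a source shows up as a negative daily value. The sketch below illustrates this with invented numbers; the function names are hypothetical, and the rolling-average window is an assumption, since the description does not state the window AP uses.

```python
def daily_new(cumulative):
    """Difference consecutive cumulative counts to get per-day changes."""
    return [later - earlier for earlier, later in zip(cumulative, cumulative[1:])]

def rolling_mean(values, window):
    """Trailing mean over `window` consecutive values."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]

# Invented cumulative death counts; the drop from 125 to 120 mimics a
# source revising its historical total downward.
cumulative_deaths = [100, 110, 125, 120, 140]

new_deaths = daily_new(cumulative_deaths)         # [10, 15, -5, 20]
revised_days = [i for i, n in enumerate(new_deaths) if n < 0]
smoothed = rolling_mean(new_deaths, window=3)     # window chosen for brevity
```

    Negative daily values are a signal to check for revisions rather than an error in the arithmetic.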

    Attribution

    This data should be credited to Johns Hopkins University COVID-19 tracking project.

  4. NFL season MVPs by player 2024

    • statista.com
    Updated Mar 12, 2024
    Cite
    Statista (2024). NFL season MVPs by player 2024 [Dataset]. https://www.statista.com/statistics/1202200/ap-nfl-mvp/
    Explore at:
    Dataset updated
    Mar 12, 2024
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    United States
    Description

    The Associated Press NFL Most Valuable Player Award is an annual award which has been presented since 1957 to the NFL player deemed to have been the best during the regular football season. Since the award was first presented, a total of 11 players have won the trophy more than once. Legendary quarterback Peyton Manning won the award a record five times during his career, including four times while playing for the Indianapolis Colts and most recently in 2013 with the Denver Broncos.

  5. REPLICATION DATA for: "The Costs of Voting and Voter Confidence,” Political...

    • dataverse.harvard.edu
    Updated Aug 28, 2024
    Cite
    REPLICATION DATA for: "The Costs of Voting and Voter Confidence," Political Research Quarterly [Dataset]. https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi%3A10.7910%2FDVN%2FYRIXUW&version=DRAFT
    Explore at:
    Dataset updated
    Aug 28, 2024
    Dataset provided by
    Harvard Dataverse
    Authors
    Lonna Atkeson
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    In this paper, we revisit the effect of ballot access laws on voter confidence in the outcome of elections. We argue that voter confidence is conditioned by partisanship. Democrats and Republicans view election laws through a partisan lens, which is especially triggered when coalitions lose. We used The Integrity of Voting data set, along with other data sets, to test our hypotheses. The sample frame for the Integrity of Voting Survey was eligible persons who voted in the 2020 Presidential election with accessible internet email addresses. Surveys were conducted with 17,526 voters, drawing on two independent samples, from two different vendors, of registered voters who reported voting in the 2020 Presidential election. For the first sample, email addresses for registered voters in each state were purchased from L2, a commercial vendor specializing in obtaining email addresses for registered voters. Interviews were solicited from one million voters in all 50 states, with 10,770 completed interviews for a response rate of roughly 1.1% (.011 as a proportion). A second sample of internet interviews was solicited and completed with 6,756 2020 voters using Dynata’s proprietary select-in survey of voters in selected states with smaller populations of registered voters. A minimum of roughly 100 2020 election voters were interviewed in each state. Our state samples were weighted using a raking technique on age, race, gender, education, and vote mode demographics from the U.S. Census Bureau’s 2020 Voting and Registration in the Election of November 2020 supplement to the Current Population Survey (2021), as well as party identification totals from post-election exit polls conducted by the Associated Press (2020). Surveys were conducted between the first week in December 2020 and the first week in February 2021.
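
    Raking, the technique named in the description above, can be sketched as iterative proportional fitting: weights are rescaled one variable at a time until the weighted shares match every target margin. The example below is a generic illustration, not the authors' code; the respondents, variable names and target shares are invented.

```python
def rake(rows, targets, iterations=50):
    """Rake unit starting weights to the target shares of each variable.

    `targets` maps a variable name to {category: target share}, with the
    shares of each variable summing to 1.
    """
    weights = [1.0] * len(rows)
    for _ in range(iterations):
        for variable, shares in targets.items():
            total = sum(weights)
            current = {}
            for row, w in zip(rows, weights):
                current[row[variable]] = current.get(row[variable], 0.0) + w
            # Scale every weight so this variable's margin matches its target.
            weights = [
                w * shares[row[variable]] * total / current[row[variable]]
                for row, w in zip(rows, weights)
            ]
    return weights

# Invented respondents and target margins.
people = [
    {"gender": "F", "age": "young"},
    {"gender": "F", "age": "old"},
    {"gender": "M", "age": "young"},
    {"gender": "M", "age": "old"},
    {"gender": "M", "age": "old"},
]
margins = {
    "gender": {"F": 0.5, "M": 0.5},
    "age": {"young": 0.4, "old": 0.6},
}

raked = rake(people, margins)
```

    Raking needs every category in the targets to appear in the sample; an empty category would cause a division by zero in this sketch.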

  6. Mass Killings in America, 2006 - present

    • data.world
    csv, zip
    Updated Mar 25, 2025
    Cite
    The Associated Press (2025). Mass Killings in America, 2006 - present [Dataset]. https://data.world/associatedpress/mass-killings-public
    Explore at:
    Available download formats: zip, csv
    Dataset updated
    Mar 25, 2025
    Authors
    The Associated Press
    Time period covered
    Jan 1, 2006 - Feb 21, 2025
    Area covered
    Description

    THIS DATASET WAS LAST UPDATED AT 8:10 PM EASTERN ON MARCH 24

    OVERVIEW

    2019 had the most mass killings since at least the 1970s, according to the Associated Press/USA TODAY/Northeastern University Mass Killings Database.

    In all, there were 45 mass killings, defined as incidents in which four or more people are killed, excluding the perpetrator. Of those, 33 were mass shootings. This summer was especially violent, with three high-profile public mass shootings occurring in the span of just four weeks, leaving 38 killed and 66 injured.

    A total of 229 people died in mass killings in 2019.

    The AP's analysis found that more than 50% of the incidents were family annihilations, which is similar to prior years. Although they are far less common, the 9 public mass shootings during the year were the most deadly type of mass murder, resulting in 73 people's deaths, not including the assailants.

    One-third of the offenders died at the scene of the killing or soon after, half from suicides.

    About this Dataset

    The Associated Press/USA TODAY/Northeastern University Mass Killings database tracks all U.S. homicides since 2006 involving four or more people killed (not including the offender) over a short period of time (24 hours) regardless of weapon, location, victim-offender relationship or motive. The database includes information on these and other characteristics concerning the incidents, offenders, and victims.

    The AP/USA TODAY/Northeastern database represents the most complete tracking of mass murders by the above definition currently available. Other efforts, such as the Gun Violence Archive or Everytown for Gun Safety, may include events that do not meet our criteria, but a review of these sites and others indicates that this database contains every event that matches the definition, including some not tracked by other organizations.

    This data will be updated periodically and can be used as an ongoing resource to help cover these events.

    Using this Dataset

    To get basic counts of incidents of mass killings and mass shootings by year nationwide, use these queries:

    Mass killings by year

    Mass shootings by year

    To get these counts just for your state:

    Filter killings by state

    Definition of "mass murder"

    Mass murder is defined as the intentional killing of four or more victims by any means within a 24-hour period, excluding the deaths of unborn children and the offender(s). The standard of four or more dead was initially set by the FBI.

    This definition does not exclude cases based on method (e.g., shootings only), type or motivation (e.g., public only), victim-offender relationship (e.g., strangers only), or number of locations (e.g., one). The time frame of 24 hours was chosen to eliminate conflation with spree killers, who kill multiple victims in quick succession in different locations or incidents, and to satisfy the traditional requirement of occurring in a “single incident.”

    Offenders who commit mass murder during a spree (before or after committing additional homicides) are included in the database, and all victims within seven days of the mass murder are included in the victim count. Negligent homicides related to driving under the influence or accidental fires are excluded due to the lack of offender intent. Only incidents occurring within the 50 states and Washington D.C. are considered.
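
    The inclusion criteria above can be expressed as a small predicate. This is a sketch under stated assumptions: the `Incident` record and its field names are hypothetical and do not correspond to columns in the published dataset.

```python
from dataclasses import dataclass

@dataclass
class Incident:
    victims_killed_24h: int   # deaths within 24 hours, excluding offenders and unborn children
    intentional: bool         # False for negligent deaths such as DUI crashes or accidental fires
    in_us: bool               # occurred in one of the 50 states or Washington D.C.

def is_mass_murder(incident):
    """Apply the definition: four or more intentional killings within 24 hours."""
    return (
        incident.intentional
        and incident.in_us
        and incident.victims_killed_24h >= 4
    )

# Invented incidents for illustration.
qualifies = is_mass_murder(Incident(victims_killed_24h=4, intentional=True, in_us=True))
too_few = is_mass_murder(Incident(victims_killed_24h=3, intentional=True, in_us=True))
negligent = is_mass_murder(Incident(victims_killed_24h=5, intentional=False, in_us=True))
```

    The predicate deliberately ignores weapon, motive, location type and victim-offender relationship, mirroring the statement that the definition does not exclude cases on those grounds.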

    Methodology

    Project researchers first identified potential incidents using the Federal Bureau of Investigation’s Supplementary Homicide Reports (SHR). Homicide incidents in the SHR were flagged as potential mass murder cases if four or more victims were reported on the same record, and the type of death was murder or non-negligent manslaughter.

    Cases were subsequently verified utilizing media accounts, court documents, academic journal articles, books, and local law enforcement records obtained through Freedom of Information Act (FOIA) requests. Each data point was corroborated by multiple sources, which were compiled into a single document to assess the quality of information.

    In case(s) of contradiction among sources, official law enforcement or court records were used, when available, followed by the most recent media or academic source.

    Case information was subsequently compared with every other known mass murder database to ensure reliability and validity. Incidents listed in the SHR that could not be independently verified were excluded from the database.

    Project researchers also conducted extensive searches for incidents not reported in the SHR during the time period, utilizing internet search engines, Lexis-Nexis, and Newspapers.com. Search terms include: [number] dead, [number] killed, [number] slain, [number] murdered, [number] homicide, mass murder, mass shooting, massacre, rampage, family killing, familicide, and arson murder. Offender, victim, and location names were also directly searched when available.

    This project started at USA TODAY in 2012.

    Contacts

    Contact AP Data Editor Justin Myers with questions, suggestions or comments about this dataset at jmyers@ap.org. The Northeastern University researcher working with AP and USA TODAY is Professor James Alan Fox, who can be reached at j.fox@northeastern.edu or 617-416-4400.

  7. Public Health Official Departures

    • data.world
    csv, zip
    Updated Jun 7, 2022
    Cite
    The Associated Press (2022). Public Health Official Departures [Dataset]. https://data.world/associatedpress/public-health-official-departures
    Explore at:
    Available download formats: csv, zip
    Dataset updated
    Jun 7, 2022
    Authors
    The Associated Press
    Description

    Changelog:

    Update September 20, 2021: Data and overview updated to reflect data used in the September 15 story Over Half of States Have Rolled Back Public Health Powers in Pandemic. It includes 303 state or local public health leaders who resigned, retired or were fired between April 1, 2020 and Sept. 12, 2021. Previous versions of this dataset reflected data used in the Dec. 2020 and April 2021 stories.

    Overview

    Across the U.S., state and local public health officials have found themselves at the center of a political storm as they combat the worst pandemic in a century. Amid a fractured federal response, the usually invisible army of workers charged with preventing the spread of infectious disease has become a public punching bag.

    In the midst of the coronavirus pandemic, at least 303 state or local public health leaders in 41 states have resigned, retired or been fired since April 1, 2020, according to an ongoing investigation by The Associated Press and KHN.

    According to experts, that is the largest exodus of public health leaders in American history.

    Many left due to political blowback or pandemic pressure, as they became the target of groups that have coalesced around a common goal — fighting and even threatening officials over mask orders and well-established public health activities like quarantines and contact tracing. Some left to take higher profile positions, or due to health concerns. Others were fired for poor performance. Dozens retired. An untold number of lower level staffers have also left.

    The result is a further erosion of the nation’s already fragile public health infrastructure, which KHN and the AP documented beginning in 2020 in the Underfunded and Under Threat project.

    Findings

    The AP and KHN found that:

    • One in five Americans live in a community that has lost its local public health department leader during the pandemic
    • Top public health officials in 28 states have left state-level departments

    Using this data

    To filter for data specific to your state, use this query

    To get total numbers of exits by state, broken down by state and local departments, use this query
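
    The linked query is not reproduced in this listing. As a rough equivalent, the sketch below counts departures by state and department level for anyone working from the downloaded CSV. The field names `state` and `level` and the sample rows are hypothetical.

```python
from collections import Counter

def exits_by_state(records):
    """Count departures for each (state, department level) pair."""
    return Counter((r["state"], r["level"]) for r in records)

# Invented rows for illustration; the real file has its own columns.
departures = [
    {"state": "State A", "level": "local"},
    {"state": "State A", "level": "local"},
    {"state": "State A", "level": "state"},
    {"state": "State B", "level": "local"},
]

counts = exits_by_state(departures)
state_a_total = sum(n for (state, _), n in counts.items() if state == "State A")
```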

    Methodology

    KHN and AP counted how many state and local public health leaders have left their jobs between April 1, 2020 and Sept. 12, 2021.

    The government tasks public health workers with improving the health of the general population, through their work to encourage healthy living and prevent infectious disease. To that end, public health officials do everything from inspecting water and food safety to testing the nation’s babies for metabolic diseases and contact tracing cases of syphilis.

    Many parts of the country have a health officer and a health director/administrator by statute. The analysis counted both of those positions if they existed. For state-level departments, the count tracks people in the top and second-highest-ranking job.

    The analysis includes exits of top department officials regardless of reason, because no matter the reason, each left a vacancy at the top of a health agency during the pandemic. Reasons for departures include political pressure, health concerns and poor performance. Others left to take higher profile positions or to retire. Some departments had multiple top officials exit over the course of the pandemic; each is included in the analysis.

    Reporters compiled the exit list by reaching out to public health associations and experts in every state and interviewing hundreds of public health employees. They also received information from the National Association of County and City Health Officials, and combed news reports and records.

    Public health departments can be found at multiple levels of government. Each state has a department that handles these tasks, but most states also have local departments that operate under either local or state control. The population served by each local health department is calculated using the U.S. Census Bureau 2019 Population Estimates based on each department’s jurisdiction.

    KHN and the AP have worked since the spring on a series of stories documenting the funding, staffing and problems around public health. A previous data distribution detailed a decade's worth of cuts to state and local spending and staffing on public health. That data can be found here.

    Attribution

    Findings and the data should be cited as: "According to a KHN and Associated Press report."

    Is Data Missing?

    If you know of a public health official in your state or area who has left that position between April 1, 2020 and Sept. 12, 2021 and isn't currently in our dataset, please contact authors Anna Maria Barry-Jester annab@kff.org, Hannah Recht hrecht@kff.org, Michelle Smith mrsmith@ap.org and Lauren Weber laurenw@kff.org.

  8. Most active press media relating to the war in Ukraine in Poland 2022-2023

    • statista.com
    Cite
    Statista, Most active press media relating to the war in Ukraine in Poland 2022-2023 [Dataset]. https://www.statista.com/statistics/1370112/poland-most-active-press-media-relating-to-the-war-in-ukraine/
    Explore at:
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Feb 24, 2022 - Feb 19, 2023
    Area covered
    Poland
    Description

    During the period under review, the press outlet with the highest number of publications on the war in Ukraine was the Rzeczpospolita newspaper, followed by Gazeta Polska Codziennie and Gazeta Wyborcza.

  9. COVID Tracking Project — Testing in States

    • data.world
    csv, zip
    Updated Oct 14, 2024
    Cite
    The Associated Press (2024). COVID Tracking Project — Testing in States [Dataset]. https://data.world/associatedpress/covid-tracking-project-testing-in-states
    Explore at:
    zip, csv (available download formats)
    Dataset updated
    Oct 14, 2024
    Authors
    The Associated Press
    Time period covered
    Jan 13, 2020 - Mar 8, 2021
    Description

    Updates

    April 29, 2020

    • The AP is now providing historical time series data for testing counts and death counts from The COVID Tracking Project. The counts are provided here unaltered, along with a population column with Census ACS-1 estimates and calculated testing rate and death rate columns.

    October 13, 2020

    The COVID Tracking Project is releasing more precise total testing counts, and has changed the way it is distributing the data that ends up on this site. Previously, total testing had been represented by positive tests plus negative tests. As states are beginning to report more specific testing counts, The COVID Tracking Project is moving toward reporting those numbers directly.

    This may make it more difficult to compare your state against others in terms of positivity rate, but the net effect is we now have more precise counts:

    • Total Test Encounters: Total tests increase by one for every individual that is tested that day. Additional tests for that individual on that day (i.e., multiple swabs taken at the same time) are not included

    • Total PCR Specimens: Total tests increase by one for every testing sample retrieved from an individual. Multiple samples from an individual on a single day can be included in the count

    • Unique People Tested: Total tests increase by one the first time an individual is tested. The count will not increase in later days if that individual is tested again – even months later

    These three totals are not all available for every state. The COVID Tracking Project prioritizes the different count types for each state in this order:

    1. Total Test Encounters

    2. Total PCR Specimens

    3. Unique People Tested

    If the state does not provide any of those totals directly, The COVID Tracking Project falls back to the initial calculation of total tests that it has provided up to this point: positive + negative tests.

    One of the above total counts will be the number present in the cumulative_total_test_results and total_test_results_increase columns.

    • The positivity rates provided on this site will divide confirmed cases by one of these total_test_results columns.

      • Due to these changes, we advise comparing positivity rates between states only if the states being compared have the same type of total test count.
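The prioritization and the positivity caveat above can be sketched roughly as follows. This is an illustration only: the field names (`total_test_encounters`, `total_pcr_specimens`, `unique_people_tested`, `positive`, `negative`) are hypothetical placeholders rather than confirmed column names, and the helper functions are not part of any published tooling.

```python
from typing import Optional, Tuple

# Priority order described above. The field names are hypothetical placeholders,
# not the dataset's actual column names.
PRIORITY = ("total_test_encounters", "total_pcr_specimens", "unique_people_tested")


def pick_total_tests(row: dict) -> Tuple[Optional[int], str]:
    """Return (count, count_type) using the first total the state reports."""
    for field in PRIORITY:
        value = row.get(field)
        if value is not None:
            return value, field
    # Fallback: the original calculation, positive + negative tests.
    positive, negative = row.get("positive"), row.get("negative")
    if positive is not None and negative is not None:
        return positive + negative, "positive_plus_negative"
    return None, "unavailable"


def positivity(confirmed_cases: int, row: dict) -> Tuple[Optional[float], str]:
    """Divide confirmed cases by whichever total count the state provides."""
    total, count_type = pick_total_tests(row)
    if not total:
        return None, count_type
    return confirmed_cases / total, count_type


def comparable(row_a: dict, row_b: dict) -> bool:
    """Positivity rates are comparable only when both states use the same count type."""
    type_a = pick_total_tests(row_a)[1]
    type_b = pick_total_tests(row_b)[1]
    return type_a == type_b and type_a != "unavailable"
```

Returning the count type alongside the number makes it easy to check, before publishing a state-to-state comparison, that both rates rest on the same kind of total.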

    Overview

    The AP is using data collected by the COVID Tracking Project to measure COVID-19 testing across the United States.

    The COVID Tracking Project data is available at the state level in the United States. The AP has paired this data with population figures and has calculated testing rates and death rates per 1,000 people.

    This data is from The COVID Tracking Project API that is updated regularly throughout the day. Like all organizations dealing with data, The COVID Tracking Project is constantly refining and cleaning up their feed, so there may be brief moments where data does not appear correctly. At this link, you’ll find The COVID Tracking Project daily data reports, and a clean version of their feed.

    A note on timing: The COVID Tracking Project updates regularly throughout the day, but state numbers will come in at different times. The entire Tracking Project dataset will be updated between 4 and 5 p.m. EDT daily. Keep this time in mind when reporting on stories comparing states. At certain times of day, one state may be more up to date than another. We have included the date_modified timestamp for state-level data, which represents the last time the state updated its data. The date_checked value in the state-level data reflects the last time The COVID Tracking Project checked the state source. We have also included the last_modified timestamp for the national-level data, which marks the last time the national data was updated.

    The AP is updating this dataset hourly at 45 minutes past the hour.

    To learn more about AP's data journalism capabilities for publishers, corporations and financial institutions, go here or email kromano@ap.org.

    About the data

    Caveats

    • The total_people_tested counts do not include pending tests. They are the total number of tests that have returned positive or negative.
    • The process for collecting testing data is different for each state. The COVID Tracking Project makes note of the difficulties specific to each state on their main data page.

    Attribution

    This data should be credited to The COVID Tracking Project

    Contact

    Nicky Forster — nforster@ap.org

  10. Major news brands and Gen Z awareness in the U.S. 2022

    • statista.com
    Updated Oct 25, 2023
    Cite
    Major news brands and Gen Z awareness in the U.S. 2022 [Dataset]. https://www.statista.com/statistics/1374838/gen-z-news-brand-awareness-us/
    Explore at:
    Dataset updated
    Oct 25, 2023
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Jul 27, 2022 - Jul 28, 2022
    Area covered
    United States
    Description

    A U.S. survey found that Gen Z adults had, for the most part, lower awareness of established news brands than U.S. adults in general, with 16 percent of Gen Z saying they had never heard of NBC and around a third admitting they were not aware of The New Yorker. MSNBC, The Associated Press, Bloomberg, and Breitbart also fared poorly in this respect, though Gen Z's news consumption habits (predominantly online and via social media) mean that this lack of awareness of major brands is less surprising than it may seem.

  11. The Marshall Project: COVID Cases in Prisons

    • data.world
    csv, zip
    Updated Apr 6, 2023
    Cite
    The Associated Press (2023). The Marshall Project: COVID Cases in Prisons [Dataset]. https://data.world/associatedpress/marshall-project-covid-cases-in-prisons
    Explore at:
    csv, zip (available download formats)
    Dataset updated
    Apr 6, 2023
    Dataset provided by
    data.world, Inc.
    Authors
    The Associated Press
    Time period covered
    Jul 31, 2019 - Aug 1, 2021
    Description

    Overview

    The Marshall Project, the nonprofit investigative newsroom dedicated to the U.S. criminal justice system, has partnered with The Associated Press to compile data on the prevalence of COVID-19 infection in prisons across the country. The Associated Press is sharing this data as the most comprehensive current national source of COVID-19 outbreaks in state and federal prisons.

    Lawyers, criminal justice reform advocates and families of the incarcerated have worried about what was happening in prisons across the nation as coronavirus began to take hold in the communities outside. Data collected by The Marshall Project and AP shows that hundreds of thousands of prisoners, workers, correctional officers and staff have caught the illness as prisons became the center of some of the country’s largest outbreaks. And thousands of people — most of them incarcerated — have died.

    In December, as COVID-19 cases spiked across the U.S., the news organizations also shared cumulative rates of infection among prison populations, to better gauge the total effects of the pandemic on prison populations. The analysis found that by mid-December, one in five state and federal prisoners in the United States had tested positive for the coronavirus -- a rate more than four times higher than the general population.

    This data, which is updated weekly, is an effort to track how those people have been affected and where the crisis has hit the hardest.

    Methodology and Caveats

    The data tracks the number of COVID-19 tests administered to people incarcerated in all state and federal prisons, as well as the staff in those facilities. It is collected on a weekly basis by Marshall Project and AP reporters who contact each prison agency directly and verify published figures with officials.

    Each week, the reporters ask every prison agency for the total number of coronavirus tests administered to its staff members and prisoners, the cumulative number who tested positive among staff and prisoners, and the numbers of deaths for each group.

    The time series data is aggregated to the system level; there is one record for each prison agency on each date of collection. Not all departments could provide data for the exact date requested, and the data indicates the date for the figures.

    To estimate the rate of infection among prisoners, we collected population data for each prison system before the pandemic, roughly in mid-March, in April, June, July, August, September and October. Beginning the week of July 28, we updated all prisoner population numbers, reflecting the number of incarcerated adults in state or federal prisons. Prior to that, population figures may have included additional populations, such as prisoners housed in other facilities, which were not captured in our COVID-19 data. In states with unified prison and jail systems, we include both detainees awaiting trial and sentenced prisoners.

    To estimate the rate of infection among prison employees, we collected staffing numbers for each system. Where current data was not publicly available, we acquired other numbers through our reporting, including calling agencies or from state budget documents. In six states, we were unable to find recent staffing figures: Alaska, Hawaii, Kentucky, Maryland, Montana, Utah.

    To calculate the cumulative COVID-19 impact on prisoner and prison worker populations, we aggregated prisoner and staff COVID case and death data up through Dec. 15. Because population snapshots do not account for movement in and out of prisons since March, and because many systems have significantly slowed the number of new people being sent to prison, it’s difficult to estimate the total number of people who have been held in a state system since March. To be conservative, we calculated our rates of infection using the largest prisoner population snapshots we had during this time period.
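The conservative choice of denominator described above can be illustrated with a short sketch. The function name and the example figures are hypothetical; only the idea of dividing cumulative cases by the largest available population snapshot comes from the methodology.

```python
def conservative_rate(cumulative_cases, population_snapshots, per=100_000):
    """Cumulative cases per `per` people, using the largest population snapshot.

    A larger denominator yields a lower rate, so the estimate errs on the
    side of understating infection rather than overstating it.
    """
    snapshots = [p for p in population_snapshots if p]  # skip blanks and zeros
    if not snapshots:
        raise ValueError("no population snapshot available")
    return cumulative_cases * per / max(snapshots)


# Hypothetical system: 2,000 cumulative cases, three population snapshots.
rate = conservative_rate(2_000, [10_000, 9_200, 8_500])
print(f"{rate:,.0f} cases per 100,000 prisoners")
```

Using the most recent (smaller) snapshot instead would produce a higher rate from the same case count, which is the effect the methodology is guarding against.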

    As with all COVID-19 data, our understanding of the spread and impact of the virus is limited by the availability of testing. Epidemiology and public health experts say that aside from a few states that have recently begun aggressively testing in prisons, it is likely that there are more cases of COVID-19 circulating undetected in facilities. Sixteen prison systems, including the Federal Bureau of Prisons, would not release information about how many prisoners they are testing.

    Corrections departments in Indiana, Kansas, Montana, North Dakota and Wisconsin report coronavirus testing and case data for juvenile facilities; West Virginia reports figures for juvenile facilities and jails. For consistency of comparison with other state prison systems, we removed those facilities from our data that had been included prior to July 28. For these states we have also removed staff data. Similarly, Pennsylvania’s coronavirus data includes testing and cases for those who have been released on parole. We removed these tests and cases for prisoners from the data prior to July 28. The staff cases remain.

    About the Data

    There are four tables in this data:

    • covid_prison_cases.csv contains weekly time series data on tests, infections and deaths in prisons. The first dates in the table are on March 26. Any questions that a prison agency could not or would not answer are left blank.

    • prison_populations.csv contains snapshots of the population of people incarcerated in each of these prison systems for whom data on COVID testing and cases are available. This varies by state and may not always be the entire number of people incarcerated in each system. In some states, it may include other populations, such as those on parole or held in state-run jails. This data is primarily for use in calculating rates of testing and infection, and we would not recommend using these numbers to compare the change in how many people are being held in each prison system.

    • staff_populations.csv contains a one-time, recent snapshot of the headcount of workers for each prison agency, collected as close to April 15 as possible.

    • covid_prison_rates.csv contains the rates of cases and deaths for prisoners. There is one row for every state and federal prison system and an additional row with the National totals.

    Queries

    The Associated Press and The Marshall Project have created several queries to help you use this data:

    Get your state's prison COVID data: Provides each week's data from just your state and calculates a cases-per-100000-prisoners rate, a deaths-per-100000-prisoners rate, a cases-per-100000-workers rate and a deaths-per-100000-workers rate here

    Rank all systems' most recent data by cases per 100,000 prisoners here

    Find what percentage of your state's total cases and deaths -- as reported by Johns Hopkins University -- occurred within the prison system here
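For readers working from the CSV files rather than the hosted queries, the per-100,000 rates and the ranking can be approximated as below. This is a sketch under assumptions: the keys `name`, `total_prisoner_cases` and `prisoner_population` are hypothetical, and blanks in the source are represented here as `None`.

```python
def per_100k(count, population):
    """Rate per 100,000 people, or None when either input is blank."""
    if count is None or not population:
        return None
    return count * 100_000 / population


def rank_by_case_rate(latest_rows):
    """Sort systems by prisoner cases per 100,000, highest first.

    Systems that did not report a figure are left out instead of being
    treated as zero.
    """
    rated = [
        (row["name"], per_100k(row["total_prisoner_cases"], row["prisoner_population"]))
        for row in latest_rows
    ]
    rated = [item for item in rated if item[1] is not None]
    return sorted(rated, key=lambda item: item[1], reverse=True)
```

Dropping unreported systems matters because, as noted above, questions an agency could not or would not answer are left blank, and a blank is not the same as zero cases.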

    Attribution

    In stories, attribute this data to: “According to an analysis of state prison cases by The Marshall Project, a nonprofit investigative newsroom dedicated to the U.S. criminal justice system, and The Associated Press.”

    Contributors

    Many reporters and editors at The Marshall Project and The Associated Press contributed to this data, including: Katie Park, Tom Meagher, Weihua Li, Gabe Isman, Cary Aspinwall, Keri Blakinger, Jake Bleiberg, Andrew R. Calderón, Maurice Chammah, Andrew DeMillo, Eli Hager, Jamiles Lartey, Claudia Lauer, Nicole Lewis, Humera Lodhi, Colleen Long, Joseph Neff, Michelle Pitcher, Alysia Santo, Beth Schwartzapfel, Damini Sharma, Colleen Slevin, Christie Thompson, Abbie VanSickle, Adria Watson, Andrew Welsh-Huggins.

    Questions

    If you have questions about the data, please email The Marshall Project at info+covidtracker@themarshallproject.org or file a Github issue.

    To learn more about AP's data journalism capabilities for publishers, corporations and financial institutions, go here or email kromano@ap.org.

  12. Refugee Admission to the US Ending FY 2018

    • data.world
    csv, zip
    Updated Nov 20, 2022
    Cite
    Refugee Admission to the US Ending FY 2018 [Dataset]. https://data.world/associatedpress/refugee-admissions-to-us-end-fy-2018
    Explore at:
    zip, csv (available download formats)
    Dataset updated
    Nov 20, 2022
    Authors
    The Associated Press
    Time period covered
    2009 - 2018
    Description

    Overview

    At the end of the 2018 fiscal year, the U.S. had resettled 22,491 refugees -- a small fraction of the number of people who had entered in prior years. This is the smallest annual number of refugees since Congress passed a law in 1980 creating the modern resettlement system.

    It's also well below the cap of 45,000 set by the administration for 2018, and less than thirty percent of the number granted entry in the final year of Barack Obama’s presidency. It's also significantly below the cap for 2019 announced by President Trump's administration, which is 30,000.

    The Associated Press is updating its data on refugees through fiscal year 2018, which ended Sept. 30, to help reporters continue coverage of this story. Previous Associated Press data on refugees can be found here.

    Data obtained from the State Department's Bureau of Population, Refugees and Migration show the mix of refugees also has changed substantially:

    • The numbers of Iraqi, Somali and Syrian refugees -- who made up more than a third of all resettlements in the U.S. in the prior five years -- have almost entirely disappeared. Refugees from those three countries comprise about two percent of the 2018 resettlements.
    • In 2018, Christians have made up more than sixty percent of the refugee population, while the share of Muslims has dropped from roughly 45 percent of refugees in fiscal year 2016 to about 15 percent. (This data is not available at the city or state level.)
    • Of the states that usually average at least 100 resettlements, Maine, Louisiana, Michigan, Florida, California, Oklahoma and Texas have seen the largest percentage decreases in refugees. All have had their refugee caseloads drop more than 75% when comparing 2018 to the average over the previous five years (2013-2017).
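The comparison of 2018 against the previous five-year average can be reproduced with arithmetic along these lines. It is a minimal sketch: the function name, the `min_average` cutoff parameter and the example counts are illustrative assumptions, not values taken from the dataset.

```python
def pct_change_vs_average(current, prior_years, min_average=100):
    """Percent change of `current` relative to the mean of `prior_years`.

    Returns None when the prior average is below `min_average`, since
    percentages computed on small raw numbers can be misleading.
    """
    average = sum(prior_years) / len(prior_years)
    if average < min_average:
        return None
    return (current - average) * 100 / average


# Hypothetical state: arrivals for 2013-2017, then 2018.
change = pct_change_vs_average(100, [400, 500, 600, 500, 500])
print(f"{change:.0f}%")  # a drop of more than 75% would meet the threshold above
```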

    The past fiscal year marks a dramatic change in the refugee program, with only a fraction as many people entering. That affects refugees currently in the U.S., who may be waiting on relatives to arrive. It affects refugees in other countries, hoping to get to the United States for safety or other reasons. And it affects the organizations that work to house and resettle these refugees, who only a few years ago were dealing with record numbers of people. Several agencies have already closed their doors; others have laid off workers and cut back their programs.

    Because there are wide geographic variations in resettlement depending on refugees' country of origin, some U.S. cities have been more affected by this than others. For instance, in past years, Iraqis have resettled most often in San Diego, Calif., or Houston. Now, with only a handful of Iraqis being admitted in 2018, those cities have seen some of the biggest drop-offs in resettlement numbers.

    About This Data

    Datasheets include:

    • Annual_refugee_data: This provides the rawest form of the data from Oct. 1, 2008 – Sept. 30, 2018, where each record is a combination of fiscal year, destination city and state, and origin. Also provides annual totals for the state.
    • City_refugees: This provides data grouped by city for refugee arrivals to a specific city and state and from a specific origin, showing totals for each year next to each other in different columns, so you can quickly see trends over time. Data is from Oct. 1, 2008 – Sept. 30, 2018, grouped by fiscal year. It also compares 2018 numbers to a five-year average from 2013-2017.
    • City_refugees_and_foreign_born_proportions: This provides the data in City_refugees along with data that gives context to the origins of the foreign born populations living in each city. There are regional columns, sub-regional columns and a column specific to the origin listed in the refugee data. Data is from the American Community Survey 5-year 2013-2017 Table B05006: PLACE OF BIRTH FOR THE FOREIGN-BORN POPULATION.

    Caveats

    According to the State Department: "This data tracks the movement of refugees from various countries around the world to the U.S. for resettlement under the U.S. Refugee Admissions Program." The data does not include other types of immigration or visits to the U.S.

    The data tracks the refugees' stated destination in the United States. In many cases, this is where the refugees first lived, although many may have since moved.

    Be aware that some cities with particularly high totals may be the locations of refugee resettlement programs -- for instance, Glendale, Calif., is home to both Catholic Charities of Los Angeles and the International Rescue Committee of Los Angeles, which work at resettling refugees.

    About Refugee Resettlement

    The data for refugees from other countries - or for any particular timeframe since 2002 - can be accessed through the State Department's Refugee Processing Center's site by clicking on "Arrivals by Destination and Nationality."

    The Refugee Processing Center used to publish a state-by-state list of affiliate refugee organizations -- the groups that help refugees settle in the U.S. That list was last updated in January 2017, so it may now be out of date. It can be found here.

    For general information about the U.S. refugee resettlement program, see this State Department description. For more detailed information about the program and proposed 2018 caps and changes, see the FY 2018 Report to Congress.

    Queries

    The Associated Press has set up a number of pre-written queries to help you filter this data and find local stories. Queries can be accessed by clicking on their names in the upper right hand bar.

    • Find Cities Impacted - Most Change -- Use this query to see the cities that have seen the largest drop-offs in refugee resettlements. Creates a five-year average of how many refugees of a certain origin have come in the past, and then measures 2018 by that. Be wary of small raw numbers when considering the percentages!
    • Total Refugees for Each City in Your State -- Use this query to get the number of total refugees who've resettled in your state's cities by year.
    • Total Refugees in Your State -- Use this query to get the number of total refugees who've resettled in your state by year.
    • Changes in Origin over Time -- Use this query to track how many refugees are coming from each origin by year. The initial query provides national numbers, but can be filtered for state or even for city.
    • Extract Raw Data for Your State -- Use this query to type in your state name to extract and download just the data in your state. This is the raw data from the State Department, so it may be slightly more difficult to see changes over time.

    Contact AP Data Journalist Michelle Minkoff with questions, mminkoff@ap.org

Cite
The Associated Press (2024). AP VoteCast 2020 - General Election [Dataset]. https://data.world/associatedpress/ap-votecast

AP VoteCast 2020 - General Election

AP VoteCast provides all the data you need to tell the story of who voted and why in the 2020 U.S. general election.

Explore at:
csv, zip (available download formats)
Dataset updated
Mar 29, 2024
Authors
The Associated Press
Description

AP VoteCast is a survey of the American electorate conducted by NORC at the University of Chicago for Fox News, NPR, PBS NewsHour, Univision News, USA Today Network, The Wall Street Journal and The Associated Press.

AP VoteCast combines interviews with a random sample of registered voters drawn from state voter files with self-identified registered voters selected using nonprobability approaches. In general elections, it also includes interviews with self-identified registered voters conducted using NORC’s probability-based AmeriSpeak® panel, which is designed to be representative of the U.S. population.

Interviews are conducted in English and Spanish. Respondents may receive a small monetary incentive for completing the survey. Participants selected as part of the random sample can be contacted by phone and mail and can take the survey by phone or online. Participants selected as part of the nonprobability sample complete the survey online.

In the 2020 general election, the survey of 133,103 interviews with registered voters was conducted between Oct. 26 and Nov. 3, concluding as polls closed on Election Day. AP VoteCast delivered data about the presidential election in all 50 states as well as all Senate and governors’ races in 2020.

Using this Data - IMPORTANT

This is survey data and must be properly weighted during analysis: DO NOT REPORT THIS DATA AS RAW OR AGGREGATE NUMBERS!!

Instead, use statistical software such as R or SPSS to weight the data.
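To show why weighting matters, here is a minimal Python sketch of a weighted share next to a raw share. The response values and weights are made up for illustration, and a real analysis would use the weight column documented for the data file.

```python
def raw_share(responses, answer):
    """Unweighted share of respondents giving `answer` (not suitable for reporting)."""
    return sum(1 for r in responses if r == answer) / len(responses)


def weighted_share(responses, weights, answer):
    """Share of the weighted total attributable to respondents giving `answer`."""
    total = sum(weights)
    matched = sum(w for r, w in zip(responses, weights) if r == answer)
    return matched / total


# Hypothetical respondents: group "A" is over-represented and weighted down.
responses = ["A", "B", "A", "B"]
weights = [0.5, 1.5, 0.5, 1.5]
print(raw_share(responses, "A"), weighted_share(responses, weights, "A"))
```

In this made-up example the raw share and the weighted share differ by a factor of two, which is the kind of distortion the warning above is meant to prevent.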

National Survey

The national AP VoteCast survey of voters and nonvoters in 2020 is based on the results of the 50 state-based surveys and a nationally representative survey of 4,141 registered voters conducted between Nov. 1 and Nov. 3 on the probability-based AmeriSpeak panel. It included 41,776 probability interviews completed online and via telephone, and 87,186 nonprobability interviews completed online. The margin of sampling error is plus or minus 0.4 percentage points for voters and 0.9 percentage points for nonvoters.

State Surveys

In 20 states in 2020, AP VoteCast is based on roughly 1,000 probability-based interviews conducted online and by phone, and roughly 3,000 nonprobability interviews conducted online. In these states, the margin of sampling error is about plus or minus 2.3 percentage points for voters and 5.5 percentage points for nonvoters.

In an additional 20 states, AP VoteCast is based on roughly 500 probability-based interviews conducted online and by phone, and roughly 2,000 nonprobability interviews conducted online. In these states, the margin of sampling error is about plus or minus 2.9 percentage points for voters and 6.9 percentage points for nonvoters.

In the remaining 10 states, AP VoteCast is based on about 1,000 nonprobability interviews conducted online. In these states, the margin of sampling error is about plus or minus 4.5 percentage points for voters and 11.0 percentage points for nonvoters.

Although there is no statistically agreed upon approach for calculating margins of error for nonprobability samples, these margins of error were estimated using a measure of uncertainty that incorporates the variability associated with the poll estimates, as well as the variability associated with the survey weights as a result of calibration. After calibration, the nonprobability sample yields approximately unbiased estimates.
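The exact uncertainty measure used for AP VoteCast is not spelled out here. As a rough stand-in, and explicitly not the method used for AP VoteCast, the Kish design effect is one common way to see how unequal weights widen a margin of error. The function names below are illustrative.

```python
import math


def kish_design_effect(weights):
    """Kish approximation: n * sum(w^2) / (sum(w))^2. Equals 1 for equal weights."""
    n = len(weights)
    return n * sum(w * w for w in weights) / sum(weights) ** 2


def approx_margin_of_error(p, weights, z=1.96):
    """Approximate 95% margin of error for a proportion p under weighting.

    Assumption: simple random sampling variance inflated by the Kish design
    effect. This is a textbook approximation, not the AP VoteCast procedure.
    """
    n_effective = len(weights) / kish_design_effect(weights)
    return z * math.sqrt(p * (1 - p) / n_effective)


equal = [1.0] * 1000
unequal = [0.5] * 500 + [1.5] * 500
print(approx_margin_of_error(0.5, equal), approx_margin_of_error(0.5, unequal))
```

The more the weights vary, the smaller the effective sample size and the wider the margin, which is consistent with the statement above that weight variability is part of the reported uncertainty.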

As with all surveys, AP VoteCast is subject to multiple sources of error, including from sampling, question wording and order, and nonresponse.

Sampling Details

Probability-based Registered Voter Sample

In each of the 40 states in which AP VoteCast included a probability-based sample, NORC obtained a sample of registered voters from Catalist LLC’s registered voter database. This database includes demographic information, as well as addresses and phone numbers for registered voters, allowing potential respondents to be contacted via mail and telephone. The sample is stratified by state, partisanship, and a modeled likelihood to respond to the postcard based on factors such as age, race, gender, voting history, and census block group education. In addition, NORC attempted to match sampled records to a registered voter database maintained by L2, which provided additional phone numbers and demographic information.

Prior to dialing, all probability sample records were mailed a postcard inviting them to complete the survey either online using a unique PIN or via telephone by calling a toll-free number. Postcards were addressed by name to the sampled registered voter if that individual was under age 35; postcards were addressed to “registered voter” in all other cases. Telephone interviews were conducted with the adult that answered the phone following confirmation of registered voter status in the state.

Nonprobability Sample

Nonprobability participants include panelists from Dynata or Lucid, including members of its third-party panels. In addition, some registered voters were selected from the voter file, matched to email addresses by V12, and recruited via an email invitation to the survey. Digital fingerprint software and panel-level ID validation is used to prevent respondents from completing the AP VoteCast survey multiple times.


AmeriSpeak Sample

During the initial recruitment phase of the AmeriSpeak panel, randomly selected U.S. households were sampled with a known, non-zero probability of selection from the NORC National Sample Frame and then contacted by mail, email, telephone and field interviewers (face-to-face). The panel provides sample coverage of approximately 97% of the U.S. household population. Those excluded from the sample include people with P.O. Box-only addresses, some addresses not listed in the U.S. Postal Service Delivery Sequence File and some newly constructed dwellings. Registered voter status was confirmed in field for all sampled panelists.

Weighting Details

AP VoteCast employs a four-step weighting approach that combines the probability sample with the nonprobability sample and refines estimates at a subregional level within each state. In a general election, the 50 state surveys and the AmeriSpeak survey are weighted separately and then combined into a survey representative of voters in all 50 states.

State Surveys

First, weights are constructed separately for the probability sample (when available) and the nonprobability sample for each state survey. These weights are adjusted to population totals to correct for demographic imbalances in age, gender, education and race/ethnicity of the responding sample compared to the population of registered voters in each state. In 2020, the adjustment targets are derived from a combination of data from the U.S. Census Bureau’s November 2018 Current Population Survey Voting and Registration Supplement, Catalist’s voter file and the Census Bureau’s 2018 American Community Survey. Prior to adjusting to population totals, the probability-based registered voter list sample weights are adjusted for differential non-response related to factors such as availability of phone numbers, age, race and partisanship.
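The idea of adjusting weights to population totals can be illustrated with single-variable post-stratification. This is a simplified sketch: the source does not say which adjustment algorithm is used, the real adjustment spans several demographics at once, and the group labels and population shares below are invented.

```python
from collections import Counter


def poststratify(groups, population_shares):
    """Return one weight per respondent so that weighted group shares
    match `population_shares` (a dict of group -> share summing to 1)."""
    counts = Counter(groups)
    n = len(groups)
    # weight = population share / sample share for the respondent's group
    return [population_shares[g] / (counts[g] / n) for g in groups]


# Hypothetical sample: younger respondents are under-represented.
groups = ["18-44"] * 2 + ["45+"] * 6
weights = poststratify(groups, {"18-44": 0.5, "45+": 0.5})
print(weights[0], weights[-1])
```

After this adjustment each age group contributes half of the total weight, matching the assumed population shares, while the weights still sum to the sample size.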

Second, all respondents receive a calibration weight. The calibration weight is designed to ensure the nonprobability sample is similar to the probability sample in regard to variables that are predictive of vote choice, such as partisanship or direction of the country, which cannot be fully captured through the prior demographic adjustments. The calibration benchmarks are based on regional level estimates from regression models that incorporate all probability and nonprobability cases nationwide.

Third, all respondents in each state are weighted to improve estimates for substate geographic regions. This weight combines the weighted probability (if available) and nonprobability samples, and then uses a small area model to improve the estimate within subregions of a state.

Fourth, the survey results are weighted to the actual vote count following the completion of the election. This weighting is done in 10–30 subregions within each state.

National Survey

In a general election, the national survey is weighted to combine the 50 state surveys with the nationwide AmeriSpeak survey. Each of the state surveys is weighted as described. The AmeriSpeak survey receives a nonresponse-adjusted weight that is then adjusted to national totals for registered voters that in 2020 were derived from the U.S. Census Bureau’s November 2018 Current Population Survey Voting and Registration Supplement, the Catalist voter file and the Census Bureau’s 2018 American Community Survey. The state surveys are further adjusted to represent their appropriate proportion of the registered voter population for the country and combined with the AmeriSpeak survey. After all votes are counted, the national data file is adjusted to match the national popular vote for president.
