5 datasets found
  1. Webis-Web-Errors-19

    • webis.de
    • data.niaid.nih.gov
    Updated 2019
    Cite
    Johannes Kiesel; Martin Potthast; Matthias Hagen; Benno Stein; Florian Kneist (2019). Webis-Web-Errors-19 [Dataset]. http://doi.org/10.5281/zenodo.2549837
    Dataset updated
    2019
    Dataset provided by
    The Web Technology & Information Systems Network
    University of Kassel, hessian.AI, and ScaDS.AI
    Bauhaus-Universität Weimar
    Friedrich Schiller University Jena
    Authors
    Johannes Kiesel; Martin Potthast; Matthias Hagen; Benno Stein; Florian Kneist
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The Webis-Web-Errors-19 dataset comprises various annotations for the 10,000 web page archives of the Webis-Web-Archive-17. The annotations record whether the page is (1) mostly advertisement, (2) cut off, (3) still loading, or (4) pornographic; and whether it shows (not/a bit/very) (5) pop-ups, (6) CAPTCHAs, or (7) error messages.

  2. High-Frequency Monitoring of COVID-19 Impacts on Households 2021-2022,...

    • microdata.worldbank.org
    • catalog.ihsn.org
    Updated Jul 11, 2023
    Cite
    World Bank (2023). High-Frequency Monitoring of COVID-19 Impacts on Households 2021-2022, Rounds 1-3 - Malaysia [Dataset]. https://microdata.worldbank.org/index.php/catalog/4449
    Dataset updated
    Jul 11, 2023
    Dataset authored and provided by
    World Bank (http://worldbank.org/)
    Time period covered
    2021 - 2022
    Area covered
    Malaysia
    Description

    Abstract

    The World Bank has launched a fast-deploying, high-frequency, phone-based survey of households to generate near-real-time insights into the socio-economic impact of COVID-19 on households, which are to be used to support evidence-based policy responses to the crisis. At a time when conventional modes of data collection are not feasible, this phone-based rapid data collection method offers a way to gather granular information on the transmission mechanisms of the crisis on the population, to identify gaps in policy responses, and to generate insights to inform scaling up or redirection of resources as the crisis unfolds.

    Geographic coverage

    National

    Analysis unit

    Individual, Household-level

    Sampling procedure

    A mobile frame was generated via random digit dialing (RDD), based on the National Numbering Plans from the Malaysian Communications and Multimedia Commission (MCMC). All possible subscriber combinations were generated in DRUID (D Force Sampling's Reactive User Interface Database), an SQL database interface which houses the complete sampling frame. From this database, completely random telephone numbers were sampled. For Round 1, a sample of 33,894 phone numbers was drawn (without replacement within the survey wave) from a total of 102,780,000 possible mobile numbers from more than 18 mobile providers in the sampling frame, which were not stratified. Once the sample was drawn in the form of replicates (subsamples) of n = 10,000, the numbers were filtered by D-Force Sampling using an auto-dialer to determine each number's working status. All numbers that yielded a working call disposition in at least one of the two filtering attempts were then passed to the CATI center's human interviewing team. Mobile devices were assumed to be personal, and therefore the person who answered the call was the selected respondent. Screening questions were used to ensure that the respondent was at least 18 years old and either contributed to, made decisions on, or had knowledge of household finances. Respondents who had participated in Round 1 were sampled for Round 2. Fresh respondents were introduced in Round 3 in addition to panel respondents from Round 2; fresh respondents in Round 3 were selected using the same procedure used to sample respondents in Round 1.
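
    The Round 1 draw described above can be sketched roughly as follows. This is only an illustration under stated assumptions: integer indices stand in for phone numbers, the function name draw_replicates and the seed are hypothetical, and the final replicate is assumed to hold whatever remains after the full replicates of 10,000 are filled. It does not reproduce DRUID or the auto-dialer filtering.

```python
import random

FRAME_SIZE = 102_780_000   # possible mobile numbers in the frame (from the description)
SAMPLE_SIZE = 33_894       # Round 1 sample size (from the description)
REPLICATE_SIZE = 10_000    # replicate (subsample) size (from the description)


def draw_replicates(frame_size, sample_size, replicate_size, seed=None):
    """Draw an unstratified sample without replacement and split it into replicates.

    Integer indices are hypothetical stand-ins for phone numbers in the frame.
    """
    rng = random.Random(seed)
    # random.sample draws without replacement, so no index can repeat.
    sample = rng.sample(range(frame_size), sample_size)
    # Assumption: the last replicate is partial if the sample size is not
    # a multiple of the replicate size.
    return [
        sample[start:start + replicate_size]
        for start in range(0, sample_size, replicate_size)
    ]


replicates = draw_replicates(FRAME_SIZE, SAMPLE_SIZE, REPLICATE_SIZE, seed=2021)
print([len(r) for r in replicates])
```

    Releasing the sample in replicates lets fieldwork stop after any replicate without biasing the draw, since each replicate is itself a random subsample.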

    Mode of data collection

    Computer Assisted Telephone Interview [cati]

    Research instrument

    The questionnaire is available in three languages, including English, Bahasa Melayu, and Mandarin Chinese. It can be downloaded from the Downloads section.

    Response rate

    In Round 1, the survey successfully interviewed 2,210 individuals out of 33,894 sampled phone numbers. In Round 2, the survey successfully re-interviewed 1,047 individuals, recording a 47% response rate. In Round 3, the survey successfully re-interviewed 667 respondents who had been previously interviewed in Round 2, recording a 64% response rate. In Round 3, the panel respondents were supplemented with 446 fresh respondents.
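
    As a quick check of the arithmetic, the quoted panel response rates appear to be computed against the number of respondents completed in the preceding round. That denominator is an assumption, since the description does not state it; the function name response_rate is hypothetical.

```python
def response_rate(completed, eligible):
    """Return completed interviews as a percentage of eligible respondents."""
    return 100 * completed / eligible


round1_completed = 2_210   # Round 1 interviews
round2_completed = 1_047   # Round 2 re-interviews
round3_panel = 667         # Round 3 re-interviews of Round 2 respondents
round3_fresh = 446         # Round 3 fresh respondents

# Assumption: each panel rate uses the previous round's completes as denominator.
round2_rate = response_rate(round2_completed, round1_completed)
round3_rate = response_rate(round3_panel, round2_completed)
round3_total = round3_panel + round3_fresh

print(f"Round 2 panel response rate: {round2_rate:.0f}%")
print(f"Round 3 panel response rate: {round3_rate:.0f}%")
print(f"Round 3 total respondents: {round3_total}")
```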

    Sampling error estimates

    In Round 1, a simple random sample with p=0.5 and n=2,210 at the 95% CI level yields a margin of sampling error (MOE) of 2.09 percentage points. Incorporating the design effect into this estimate yields a margin of sampling error of 2.65 percentage points.

    In Round 2, the complete weight for the entire sample was adjusted to the 2021 population estimates from DOSM’s annual intercensal population projections. A simple random sample with p=0.5 and n=1,047 at the 95% CI level yields a margin of sampling error (MOE) of 3.03 percentage points. Incorporating the design effect into this estimate yields a margin of sampling error of 3.54 percentage points.

    Among both fresh and panel samples in Round 3, a simple random sample with p=0.5 and n=1,113 at the 95% CI level yields a margin of sampling error (MOE) of 2.94 percentage points. Incorporating the design effect into this estimate yields a margin of sampling error of 3.34 percentage points.

    Among panel samples in Round 3, a simple random sample with p=0.5 and n=667 at the 95% CI level yields a margin of sampling error (MOE) of 3.80 percentage points. Incorporating the design effect into this estimate yields a margin of sampling error of 4.16 percentage points.
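
    The simple-random-sample figures above can be approximated with the textbook formula MOE = z * sqrt(p(1-p)/n). The sketch below assumes z = 1.96 for the 95% level and assumes that the design-adjusted figure equals the simple-random-sample figure multiplied by the square root of the design effect. Neither detail is stated in the description, so results may differ from the reported values in the second decimal place. The function names are hypothetical.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Simple-random-sample margin of error, in percentage points."""
    return 100 * z * math.sqrt(p * (1 - p) / n)


def implied_design_effect(moe_srs, moe_design):
    """Design effect implied by two margins of error.

    Assumption: moe_design = moe_srs * sqrt(design effect).
    """
    return (moe_design / moe_srs) ** 2


samples = {
    "Round 1": 2_210,
    "Round 2": 1_047,
    "Round 3 (fresh + panel)": 1_113,
    "Round 3 (panel only)": 667,
}

for label, n in samples.items():
    print(f"{label}: n={n}, MOE ~ {margin_of_error(n):.2f} percentage points")

# Design effect implied by the Round 1 figures quoted in the description.
print(f"Round 1 implied design effect ~ {implied_design_effect(2.09, 2.65):.2f}")
```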

  3. Coronavirus (Covid-19) Data in the United States

    • nytimes.com
    • openicpsr.org
    Cite
    New York Times, Coronavirus (Covid-19) Data in the United States [Dataset]. https://www.nytimes.com/interactive/2020/us/coronavirus-us-cases.html
    Dataset provided by
    New York Times
    Description

    The New York Times is releasing a series of data files with cumulative counts of coronavirus cases in the United States, at the state and county level, over time. We are compiling this time series data from state and local governments and health departments in an attempt to provide a complete record of the ongoing outbreak.

    Since late January, The Times has tracked cases of coronavirus in real time as they were identified after testing. Because of the widespread shortage of testing, however, the data is necessarily limited in the picture it presents of the outbreak.

    We have used this data to power our maps and reporting tracking the outbreak, and it is now being made available to the public in response to requests from researchers, scientists and government officials who would like access to the data to better understand the outbreak.

    The data begins with the first reported coronavirus case in Washington State on Jan. 21, 2020. We will publish regular updates to the data in this repository.

  4. Allegheny County COVID-19 Tests, Cases and Deaths (Archive)

    • data.wprdc.org
    csv, html
    Updated Jun 13, 2024
    Cite
    Allegheny County (2024). Allegheny County COVID-19 Tests, Cases and Deaths (Archive) [Dataset]. https://data.wprdc.org/dataset/allegheny-county-covid-19-tests-cases-and-deaths
    Available download formats: html, csv(34046863), csv(339166949), csv, csv(277234), csv(16109), csv(14904), csv(840)
    Dataset updated
    Jun 13, 2024
    Dataset provided by
    Allegheny County
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Area covered
    Allegheny County
    Description

    COVID-19 case information is reported through the Pennsylvania State Department’s National Electronic Disease Surveillance System (PA-NEDSS). As new cases are passed to the Allegheny County Health Department, they are investigated by case investigators. During investigation, some cases initially determined by the State to be in the Allegheny County jurisdiction may change, which can account for differences between publications of the files in the number of cases, deaths and tests. Additionally, information is not always reported to the State in a timely manner; delays can range from days to weeks, which can also account for discrepancies between previous and current files. Test and Case information will be updated daily. This resource contains individuals who received a COVID-19 test and individuals who are probable cases. Every day, these records are overwritten with updates. Each row in the data reflects a person who was tested, not a test that was conducted. People who are tested more than once will have their testing and case data updated using the following rules:

    1. Positive tests overwrite negative tests.
    2. Polymerase chain reaction (PCR) tests overwrite antibody or antigen (AG) tests.
    3. The first positive PCR test is never overwritten. Data collected from additional tests do not replace the first positive PCR test.
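
    The three rules above can be sketched as a per-person resolution function. This is an illustration only, not the county's actual pipeline. The record fields and function names are hypothetical, and two points the description leaves open are filled in as assumptions: rule 1 is checked before rule 2 when they conflict, and an existing record is kept when none of the rules applies.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class TestRecord:
    test_date: date
    test_type: str   # "PCR", "AG" or "antibody" (hypothetical labels)
    positive: bool


def should_overwrite(current, new):
    """Decide whether a later test replaces the person's current record."""
    # Rule 3: the first positive PCR test is never overwritten.
    if current.test_type == "PCR" and current.positive:
        return False
    # Rule 1: positive tests overwrite negative tests.
    # Assumption: this rule takes precedence over rule 2.
    if new.positive != current.positive:
        return new.positive
    # Rule 2: PCR tests overwrite antibody or antigen tests.
    if (new.test_type == "PCR") != (current.test_type == "PCR"):
        return new.test_type == "PCR"
    # Assumption: when no rule applies, the existing record is kept.
    return False


def resolve_person(tests):
    """Return the single record that represents a person after all updates."""
    if not tests:
        raise ValueError("at least one test is required")
    ordered = sorted(tests, key=lambda t: t.test_date)
    current = ordered[0]
    for later in ordered[1:]:
        if should_overwrite(current, later):
            current = later
    return current


history = [
    TestRecord(date(2021, 1, 5), "PCR", False),
    TestRecord(date(2021, 2, 1), "AG", True),
    TestRecord(date(2021, 2, 3), "PCR", True),
    TestRecord(date(2021, 6, 9), "PCR", True),
]
print(resolve_person(history))
```

    In this example the record dated 2021-02-03 is kept: the positive AG test replaces the earlier negative, the first positive PCR replaces the AG test, and the later positive PCR is ignored.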

    Note: As of April 4, 2022, the Pennsylvania Department of Health no longer required labs to report negative AG tests. Therefore, aggregated counts that included AG tests have been removed from the Municipality/Neighborhood files going forward. Versions of this data up to this cut-off have been retained as archived files.

    Individual Test information is also updated daily. This resource contains the details and results of individual tests along with demographic information of the individual tested. Only PCR and AG tests are included. Every day, these records are overwritten with updates. This resource should be used to determine positivity rates.

    The remaining datasets provide statistics on death demographics. Demographic, municipality and neighborhood information for deaths is reported on a weekly schedule and is not included with individual cases or tests. This has been done to protect the privacy and security of individuals and their families in accordance with the Health Insurance Portability and Accountability Act (HIPAA). Municipality or City of Pittsburgh Neighborhood is based on the geocoded home address of the individual tested.

    Individuals whose home address is incomplete or may not be in Allegheny County, but whose temporary residency, work or other mitigating circumstances are determined by the Pennsylvania Department of Health to be in Allegheny County, are counted as "Undefined".

    Since the start of the pandemic, the ACHD has mapped every day’s COVID tests, cases, and deaths to their Allegheny County municipality and neighborhood. Tests were mapped to patient address, and if this was not available, to the provider location. This has recently resulted in apparent testing rates that exceeded the populations of various municipalities -- mostly those with healthcare providers. As this was brought to our attention, the health department and our data partners began researching and comparing methods to most accurately display the data. This has led us to leave those with missing home addresses off the map. Although these data will still appear in test, case and death counts, there will be over 20,000 fewer tests and almost 1000 fewer cases on the map. In addition to these map changes, we have identified specific health systems and laboratories that had data uploading errors that resulted in missing locations, and are working with them to correct these errors.

    Due to minor discrepancies in the Municipal boundary and the City of Pittsburgh Neighborhood files, individuals whose City Neighborhood cannot be identified are counted as “Undefined (Pittsburgh)”.

    On May 19, 2023, with the rescinding of the COVID-19 public health emergency, changes in data and reporting mechanisms prompted a change to an annual data sharing schedule for tests, cases, hospitalizations, and deaths. Dates for annual release are TBD. The weekly municipal counts and individual data produced before this change are maintained as archive files.

    Support for Health Equity datasets and tools provided by Amazon Web Services (AWS) through their Health Equity Initiative.

  5. CDC COVID-19 Community Levels by County

    • opendata.ramseycounty.us
    application/rdfxml +5
    Updated Nov 4, 2024
    Cite
    Centers for Disease Control and Prevention (2024). CDC COVID-19 Community Levels by County [Dataset]. https://opendata.ramseycounty.us/Public-Health/CDC-COVID-19-Community-Levels-by-County/uazb-iwdp
    Available download formats: application/rdfxml, json, xml, csv, tsv, application/rssxml
    Dataset updated
    Nov 4, 2024
    Dataset provided by
    Centers for Disease Control and Prevention (http://www.cdc.gov/)
    Authors
    Centers for Disease Control and Prevention
    License

    https://www.usa.gov/government-works

    Description

    This public use dataset has 11 data elements reflecting United States COVID-19 community levels for all available counties. This dataset contains the same values used to display information available on the COVID Data Tracker at: https://covid.cdc.gov/covid-data-tracker/#county-view?list_select_state=all_states&list_select_county=all_counties&data-type=CommunityLevels The data are updated weekly.

    CDC looks at the combination of three metrics — new COVID-19 admissions per 100,000 population in the past 7 days, the percent of staffed inpatient beds occupied by COVID-19 patients, and total new COVID-19 cases per 100,000 population in the past 7 days — to determine the COVID-19 community level. The COVID-19 community level is determined by the higher of the new admissions and inpatient beds metrics, based on the current level of new cases per 100,000 population in the past 7 days. New COVID-19 admissions and the percent of staffed inpatient beds occupied represent the current potential for strain on the health system. Data on new cases acts as an early warning indicator of potential increases in health system strain in the event of a COVID-19 surge. Using these data, the COVID-19 community level is classified as low, medium, or high. COVID-19 Community Levels can help communities and individuals make decisions based on their local context and their unique needs. Community vaccination coverage and other local information, like early alerts from surveillance, such as through wastewater or the number of emergency department visits for COVID-19, when available, can also inform decision making for health officials and individuals.
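
    The classification logic can be sketched as below. The numeric cut-offs are not given in this description; they are assumptions recalled from CDC's published community-levels table (200 new cases per 100,000; 10 and 20 new admissions per 100,000; 10% and 15% bed occupancy) and should be verified against the CDC community-levels page before use. The function names are hypothetical.

```python
LEVELS = ("low", "medium", "high")


def _steps_reached(value, cutoffs):
    """Count how many cut-offs the value meets or exceeds."""
    return sum(value >= cutoff for cutoff in cutoffs)


def community_level(cases_per_100k, admissions_per_100k, bed_occupancy_pct):
    """Classify a county as low, medium or high.

    All thresholds are assumptions, not taken from this dataset description.
    """
    if cases_per_100k < 200:
        # Fewer new cases: three tiers are possible for each metric.
        admissions_tier = _steps_reached(admissions_per_100k, (10.0, 20.0))
        beds_tier = _steps_reached(bed_occupancy_pct, (10.0, 15.0))
    else:
        # More new cases: the floor is "medium" and a single cut-off gives "high".
        admissions_tier = 1 + _steps_reached(admissions_per_100k, (10.0,))
        beds_tier = 1 + _steps_reached(bed_occupancy_pct, (10.0,))
    # The community level is the higher of the two health-system metrics.
    return LEVELS[max(admissions_tier, beds_tier)]


print(community_level(150, 5.0, 5.0))
print(community_level(150, 12.0, 5.0))
print(community_level(250, 5.0, 5.0))
```

    Taking the maximum of the two tiers mirrors the statement that the level is determined by the higher of the admissions and inpatient-bed metrics, with the case rate selecting which set of cut-offs applies.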

    See https://www.cdc.gov/coronavirus/2019-ncov/science/community-levels.html for more information.

    For the most accurate and up-to-date data for any county or state, visit the relevant health department website. COVID Data Tracker may display data that differ from state and local websites. This can be due to differences in how data were collected, how metrics were calculated, or the timing of web updates.

    For more details on the Minnesota Department of Health COVID-19 thresholds, see COVID-19 Public Health Risk Measures: Data Notes (Updated 4/13/22). https://mn.gov/covid19/assets/phri_tcm1148-434773.pdf

    Note: This dataset was renamed from "United States COVID-19 Community Levels by County as Originally Posted" to "United States COVID-19 Community Levels by County" on March 31, 2022.
    • March 31, 2022: Column name for county population was changed to “county_population”. No change was made to the data points previously released.
    • March 31, 2022: New column, “health_service_area_population”, was added to the dataset to denote the total population in the designated Health Service Area based on 2019 Census estimate.
    • March 31, 2022: FIPS codes for territories American Samoa, Guam, Commonwealth of the Northern Mariana Islands, and United States Virgin Islands were re-formatted to 5-digit numeric for records released on 3/3/2022 to be consistent with other records in the dataset.
    • March 31, 2022: Changes were made to the text fields in variables “county”, “state”, and “health_service_area” so the formats are consistent across releases.
    • March 31, 2022: The “%” sign was removed from the text field in column “covid_inpatient_bed_utilization”. No change was made to the data. As indicated in the column description, values in this column represent the percentage of staffed inpatient beds occupied by COVID-19 patients (7-day average).
    • March 31, 2022: Data values for columns “county_population”, “health_service_area_number”, and “health_service_area” were backfilled for records released on 2/24/2022. These columns were added in the week of 3/3/2022, so the values were previously missing for records released the week prior.
    • April 7, 2022: Updates made to data released on 3/24/2022 for Guam, Commonwealth of the Northern Mariana Islands, and United States Virgin Islands to correct a data mapping error.

