33 datasets found
  1. Webis-Web-Errors-19

    • zenodo.org
    • webis.de
    • +1 more
    csv, png, txt
    Updated Sep 21, 2020
  2. Allegheny County COVID-19 Tests, Cases and Deaths

    • data.wprdc.org
    csv, html
    Updated May 26, 2023
    + more versions
  3. Coronavirus (Covid-19) Data in the United States

    • nytimes.com
    • openicpsr.org
    • +3 more
  4. High-Frequency Monitoring of COVID-19 Impacts on Households 2021-2022,...

    • microdata.worldbank.org
    • catalog.ihsn.org
    Updated Jul 11, 2023
    + more versions
  5. COVID-19 FDA dataset v1. Bilingual (EN, ES)

    • live.european-language-grid.eu
    tmx
    Updated Apr 30, 2020
  6. COVID-19 FDA dataset v2. Multilingual (EN, ES, KO, VI, TL)

    • live.european-language-grid.eu
    tmx
    Updated Aug 24, 2020
  7. COVID-19 CDC dataset v2. Multilingual (EN, ES, FR, PT, IT, DE, KO, RU, ZH,...

    • live.european-language-grid.eu
    tmx
    Updated Aug 15, 2020
    + more versions
  8. COVID-19 HEALTH-AU dataset. Multilingual (EN, ES, IT, EL)

    • live.european-language-grid.eu
    tmx
    Updated Jan 5, 2020
    + more versions
  9. COVID-19 BUSINESS-NL dataset. Bilingual (EN, NL)

    • live.european-language-grid.eu
    tmx
    Updated Nov 5, 2020
  10. COVID-19 New Zealand dataset. Multilingual (EN, KO, IN, ES)

    • live.european-language-grid.eu
    tmx
    Updated Nov 5, 2020
  11. COVID-19 GOV-FR dataset v1. Bilingual (EN, FR)

    • live.european-language-grid.eu
    tmx
    Updated Aug 25, 2020
  12. COVID-19 Voltaire dataset v2. Multilingual (EN, AR, CS, DE, EL, ES, FA, FR,...

    • live.european-language-grid.eu
    tmx
    Updated Jun 8, 2020
    + more versions
  13. COVID-19 CDC dataset v1. Bilingual (EN-FR)

    • live.european-language-grid.eu
    tmx
    Updated Apr 25, 2020
    + more versions
  14. COVID-19 Government of Sweden dataset v1. Bilingual (EN, SV)

    • live.european-language-grid.eu
    tmx
    Updated Sep 18, 2020
  15. COVID-19 New Zealand dataset. Bilingual (EN-ES)

    • live.european-language-grid.eu
    tmx
    Updated Nov 5, 2020
    + more versions
  16. Manually Classified Errors in En->Sk Translation

    • live.european-language-grid.eu
    • lindat.cz
    binary format
    Updated May 14, 2012
    + more versions
  17. Spanish-Italian website parallel corpus

    • live.european-language-grid.eu
    • data.europa.eu
    tmx
    Updated Jan 23, 2017
    + more versions
  18. Data from: Manually Classified Errors in Cs->Sk Translation

    • live.european-language-grid.eu
    • lindat.cz
    binary format
    Updated May 14, 2012
    + more versions
  19. COVID-19 Voltaire dataset v1. Bilingual (EN-PL)

    • live.european-language-grid.eu
    tmx
    + more versions
  20. COVID-19 SAM-LT dataset. Bilingual (EN, LT)

    • live.european-language-grid.eu
    tmx
    Updated May 5, 2020
Cite
Johannes Kiesel; Fabienne Hubricht; Benno Stein; Martin Potthast (2020). Webis-Web-Errors-19 [Dataset]. http://doi.org/10.5281/zenodo.2640364

Webis-Web-Errors-19

Available download formats
csv, png, txt
Dataset updated
Sep 21, 2020
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Johannes Kiesel; Fabienne Hubricht; Benno Stein; Martin Potthast
License

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

The Webis-Web-Errors-19 dataset comprises various annotations for the 10,000 web page archives of the Webis-Web-Archive-17. The annotations indicate whether a page is (1) mostly advertisement, (2) cut off, (3) still loading, or (4) pornographic, and to what degree (not / a bit / very) it shows (5) pop-ups, (6) CAPTCHAs, or (7) error messages. If you use this dataset in your research, please cite it using this paper.
