2 datasets found
  1. PAN-WQF-12

    • anthology.aicmu.ac.cn
    • webis.de
    Updated 2012
    Cite
    Maik Anderka; Benno Stein (2012). PAN-WQF-12 [Dataset]. http://doi.org/10.5281/zenodo.3250135
    Dataset provided by
    The Web Technology & Information Systems Network
    Bauhaus-Universität Weimar
    Diebold Nixdorf
    Authors
    Maik Anderka; Benno Stein
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The PAN Wikipedia Quality Flaw Corpus 2012, PAN-WQF-12, provides human-labeled English Wikipedia articles that contain specific quality flaws.

  2. PAN Wikipedia Quality Flaw Corpus 2012 (PAN-WQF-12)

    • live.european-language-grid.eu
    • data.niaid.nih.gov
    • +1 more
    Updated Apr 24, 2024
    Cite
    (2024). PAN Wikipedia Quality Flaw Corpus 2012 (PAN-WQF-12) [Dataset]. https://live.european-language-grid.eu/catalogue/corpus/7531
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The PAN Wikipedia Quality Flaw Corpus 2012, PAN-WQF-12, provides human-labeled English Wikipedia articles that contain specific quality flaws. The corpus comprises 1,592,226 articles extracted from the English Wikipedia snapshot of January 4th, 2012. A subset of 208,228 articles is labeled with ten specific quality flaws, which are listed in the following table. The labeling is based on human-defined cleanup tags. In addition, the corpus comprises 1,383,998 articles that have not been tagged with any cleanup tag.



PAN-WQF-12

5 scholarly articles cite this dataset.