2 datasets found
  1.

    Paderborn Genre Analysis Corpus 2012 (PaGA-12)

    • live.european-language-grid.eu
    • data.niaid.nih.gov
    • +1 more
    mysql
    Updated May 10, 2024
    Cite
    (2024). Paderborn Genre Analysis Corpus 2012 (PaGA-12) [Dataset]. https://live.european-language-grid.eu/catalogue/corpus/7527
    Available download formats: mysql
    Dataset updated
    May 10, 2024
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The Paderborn Genre Analysis 2012 corpus (PaGA-12) contains 1,639 HTML documents of 26 genres. All documents were collected from 2009-10-18 to 2009-11-20, and each document is manually assigned to exactly one genre. For each genre, the corpus provides at least 50 documents. All HTML documents contain German text only, and framesets are removed. The corpus is delivered in the form of a MySQL database dump; the database structure is detailed in a README file delivered with the corpus.

  2.

    Paderborn Genre Analysis Corpus 2012

    • webis.de
    Updated 2012
    Cite
    Baumann, Michael; Lettmann, Theodor; Stein, Benno (2012). Paderborn Genre Analysis Corpus 2012 [Dataset]. http://doi.org/10.5281/zenodo.3250070
    Dataset updated
    2012
    Dataset provided by
    The Web Technology & Information Systems Network
    Bauhaus-Universität Weimar
    Authors
    Baumann, Michael; Lettmann, Theodor; Stein, Benno
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Paderborn
    Description

    The Paderborn Genre Analysis 2012 corpus (PaGA-12) contains 1,639 HTML documents of 26 genres. All documents were collected from 2009-10-18 to 2009-11-20, and each document is manually assigned to exactly one genre. For each genre, the corpus provides at least 50 documents.

