
    PAN12 Originality: Text-Alignment

    • data.niaid.nih.gov
    Updated Apr 21, 2020
    Cite
    Potthast, Martin; Gollub, Tim; Hagen, Matthias; Graßegger, Jan; Kiesel, Johannes; Michel, Maximilian; Oberländer, Arnd; Tippmann, Martin; Barrón-Cedeño, Alberto; Gupta, Parth; Rosso, Paolo; Stein, Benno (2020). PAN12 Originality: Text-Alignment [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3715851
    Dataset updated
    Apr 21, 2020
    Dataset provided by
    Universität Leipzig
    Martin-Luther-Universität Halle-Wittenberg
    Bauhaus-Universität Weimar
    Authors
    Potthast, Martin; Gollub, Tim; Hagen, Matthias; Graßegger, Jan; Kiesel, Johannes; Michel, Maximilian; Oberländer, Arnd; Tippmann, Martin; Barrón-Cedeño, Alberto; Gupta, Parth; Rosso, Paolo; Stein, Benno
    Description

    We provide a training corpus that consists of pairs of documents, one of which may contain passages of text reused from the other. The reused text is subject to various kinds of (automatic) obfuscation to hide the fact that it has been reused.


