1 dataset found
  1. Z

    PAN12 Originality: Source Retrieval

    • data.niaid.nih.gov
    Updated Jun 11, 2022
    Cite
    Potthast, Martin; Gollub, Tim; Hagen, Matthias; Graßegger, Jan; Kiesel, Johannes; Michel, Maximilian; Oberländer, Arnd; Tippmann, Martin; Barrón-Cedeño, Alberto; Gupta, Parth; Rosso, Paolo; Stein, Benno (2022). PAN12 Originality: Source Retrieval [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3713287
    Dataset updated
    Jun 11, 2022
    Dataset provided by
    Bauhaus-Universität Weimar
    Martin-Luther-Universität Halle-Wittenberg
    Universität Leipzig
    Authors
    Potthast, Martin; Gollub, Tim; Hagen, Matthias; Graßegger, Jan; Kiesel, Johannes; Michel, Maximilian; Oberländer, Arnd; Tippmann, Martin; Barrón-Cedeño, Alberto; Gupta, Parth; Rosso, Paolo; Stein, Benno
    Description

    We provide a training corpus of suspicious documents. Each suspicious document covers a specific topic and may contain plagiarized passages taken from web pages on that topic in the ClueWeb09 corpus.

