4 datasets found
  1. PAN22 Authorship Analysis: Authorship Verification

    • data.niaid.nih.gov
    Updated Nov 30, 2022
  2. PAN22 Authorship Analysis: Authorship Verification

    • zenodo.org
    Updated Nov 30, 2022
  3. PAN22 Authorship Analysis: Style Change Detection

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Dec 6, 2023
    + more versions
  4. PAN25 Multi-Author Writing Style Analysis

    • zenodo.org
    zip
    Updated Feb 19, 2025

Cite
Stamatatos, Efstathios (2022). PAN22 Authorship Analysis: Authorship Verification [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6337136

PAN22 Authorship Analysis: Authorship Verification

Dataset updated
Nov 30, 2022
Dataset provided by
Pezik, Piotr
Kestemont, Mike
Kredens, Krzysztof
Heini, Annina
Bevendorff, Janek
Potthast, Martin
Stamatatos, Efstathios
Stein, Benno
Description

Download

Access to our corpus can be requested via the Aston Institute for Forensic Linguistics Databank: https://fold.aston.ac.uk/handle/123456789/17

Task

Authorship verification is the task of deciding whether two texts were written by the same author by comparing their writing styles. In previous editions of PAN, we explored the effectiveness of authorship verification technology in several languages and text genres. The two most recent editions examined cross-domain authorship verification using fanfiction texts. Despite certain differences between fandoms, cross-fandom authorship verification proved to be relatively feasible. In the current edition, we focus on more challenging scenarios where each authorship verification case considers two texts that belong to different discourse types (DTs), that is, cross-DT authorship verification. This will allow us to study the ability of stylometric approaches to capture authorial characteristics that remain stable across DTs even when the DT norms impose very different forms of expression.

Based on a new corpus in English, we provide cross-DT authorship verification cases using the following DTs:

Essays

Emails

Text messages

Business memos

The corpus comprises texts by around 100 individuals. All individuals are of similar age (18-22) and are native English speakers. The topic of the text samples is not restricted, while the level of formality can vary within a given DT (e.g., text messages may be addressed to family members or non-familial acquaintances).
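
To make the verification task described above concrete, here is a minimal Python sketch of one common stylometric baseline: compare the character n-gram profiles of two texts with cosine similarity and apply a threshold. It is an illustration only, not the organizers' method or an official baseline. The function names, the n-gram size of 3, and the threshold of 0.5 are assumptions chosen for the example; a usable threshold would need to be tuned on labelled text pairs.

```python
import math
from collections import Counter


def char_ngrams(text: str, n: int = 3) -> Counter:
    """Count overlapping character n-grams after lowercasing and
    collapsing runs of whitespace (hypothetical preprocessing choice)."""
    normalized = " ".join(text.lower().split())
    return Counter(normalized[i:i + n] for i in range(len(normalized) - n + 1))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # at least one text is too short to yield any n-gram
    return dot / (norm_a * norm_b)


def same_author_score(text_a: str, text_b: str, n: int = 3) -> float:
    """Similarity in [0, 1]; higher suggests more similar writing style."""
    return cosine_similarity(char_ngrams(text_a, n), char_ngrams(text_b, n))


def verify(text_a: str, text_b: str, threshold: float = 0.5, n: int = 3) -> bool:
    """Answer 'same author' when the score reaches the assumed threshold."""
    return same_author_score(text_a, text_b, n) >= threshold


if __name__ == "__main__":
    essay = "In this essay I will argue that remote work improves focus."
    message = "hey mum, running late, will argue about dinner later lol"
    print(round(same_author_score(essay, message), 3))
    print(verify(essay, message))
```

In a cross-DT setting, a surface-similarity baseline like this would likely struggle, since an essay and a text message by the same person can share few character patterns. That gap is what the task is designed to study.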

More information at: Authorship Verification 2022
