Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is the dataset for the shared task on Multi-Author Writing Style Analysis PAN@CLEF2025. Please consult the task's page for further details on the format, the dataset's creation, and links to baselines and utility code.
The goal of the style change detection task is to identify the text positions within a given multi-author document at which the author switches. Hence, a fundamental question is the following: if multiple authors have written a text together, can we find evidence for this fact, i.e., do we have a means to detect variations in the writing style? Answering this question is among the most difficult and most interesting challenges in author identification: style change detection is the only means to detect plagiarism in a document if no comparison texts are given; likewise, style change detection can help to uncover gift authorships, to verify a claimed authorship, or to develop new technology for writing support.
Previous editions of the multi-author writing style analysis task aimed at, for example, detecting whether a document is single- or multi-authored (2018), the actual number of authors within a document (2019), whether there was a style change between two consecutive paragraphs (2020, 2021, 2022), and where the actual style changes were located (2021, 2022). In 2022, style changes also had to be detected on the sentence level. The previously used datasets exhibited high topic diversity, which allowed the participants to leverage topic information as a style change signal. In the 2023 and 2024 editions of the writing style analysis task, special attention was paid to this issue.
We ask participants to solve the following intrinsic style change detection task: for a given text, find all positions of writing style change on the sentence level (i.e., for each pair of consecutive sentences, assess whether there was a style change). The simultaneous change of authorship and topic will be carefully controlled, and we will provide participants with datasets of three difficulty levels: easy, medium, and hard.
All documents are provided in English and may contain an arbitrary number of style changes. However, style changes may only occur between sentences (i.e., a single sentence is always authored by a single author and contains no style changes).
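As a rough illustration of the sentence-level formulation, the sketch below derives one binary label per pair of consecutive sentences from a per-sentence author list. The function names and the 0/1 list representation are assumptions made for this example only; the actual input and ground-truth format is described on the task's page and may differ.

```python
from typing import List, Sequence


def changes_from_authors(authors: Sequence[str]) -> List[int]:
    """Return one 0/1 label per pair of consecutive sentences.

    authors[i] is the (hypothetical) author id of sentence i. A document
    with n sentences yields n - 1 labels, because changes can only occur
    between sentences.
    """
    return [int(a != b) for a, b in zip(authors, authors[1:])]


def change_positions(labels: Sequence[int]) -> List[int]:
    """Return each index i where a change occurs between sentence i and i + 1."""
    return [i for i, flag in enumerate(labels) if flag == 1]


# Toy document with five sentences written by two authors.
authors = ["A", "A", "B", "B", "A"]
labels = changes_from_authors(authors)
print(labels)                    # [0, 1, 0, 1]
print(change_positions(labels))  # [1, 3]
```

A detector for this task would have to predict such a label sequence from the text alone, without access to the author ids used here.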
To develop and then test your algorithms, three datasets including ground truth information are provided (easy for the easy task, medium for the medium task, and hard for the hard task).
Each dataset is split into three parts:
You are free to use additional external data for training your models. However, we ask you to make the additional data utilized freely available under a suitable license.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
These are the most recent Data Dictionary (pop-ups) and Panarctic Species List (PASL) zip files for all the vegetation plot data entered into Turboveg for the Alaska AVA. These files are necessary to use the coded data in the Turboveg datasets correctly. The Data Dictionary file will be updated whenever new datasets entered into Turboveg add to the coded data, such as references, author code, habitat type, surficial geology, etc. Updates to the PASL will occur less frequently. Check the dates in the file names to be certain that you are using the most current files.
Our data model is a set of tables that make up our relational database. The Excel spreadsheet included in the resources below provides information about each field in our database, such as data type, description, whether it is a required field, whether the information within the field is selected from a pop-up list, and whether the field is a standard within Turboveg or is specific to the AVA.
Using Turboveg:
1) Download the installation file from the official Turboveg webpage, using the link at the Alaska Arctic Geoecological Atlas portal. This is the general installation file for worldwide users; some adjustments will be needed after installation when using data from the AAVA.
2) Open the Turboveg program and restore the most recent Data Dictionary and PASL zipped files into it by using the function 'Database-Backup/Restore-Restore.' All previous versions of the Data Dictionary files and PASL that are already in the program will be overwritten.
3) Use the Alaska AVA following the manual for Turboveg for Windows, which is available at http://www.synbiosys.alterra.nl/turboveg/tvwin.pdf
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
People sometimes give thanks as a true expression of their feelings, but sometimes also because they know that expressing gratitude helps to make a certain social impression. That is, gratitude may be expressed because of intrinsic or extrinsic motivations, and such motivations affect the outcomes of behavior. The present work assessed gratitude, the trait tendency to manage socially desirable expressions, and well-being across two studies (combined n = 398). Motivations to express gratitude were also measured, and impression management goals were manipulated in Study 2. Results show that gratitude expression is highest when people want to make a good impression and that extrinsic motives to express gratitude can moderate the relationship between gratitude and well-being. Implications for the measurement of gratitude and the theoretical understanding of gratitude’s social function are discussed.