  1. webis-clickbait-22

    • webis.de
    jsonl
    Updated 2022
  2. Webis Clickbait Spoiling Corpus 2022

    • explore.openaire.eu
    • zenodo.org
    Updated Jan 16, 2022

Cite
Hagen, Matthias; Fröbe, Maik; Jurk, Artur; Potthast, Martin (2022). webis-clickbait-22 [Dataset]. https://webis.de/data/webis-clickbait-22

webis-clickbait-22

Available download formats
jsonl
Dataset updated
2022
Dataset provided by
Leipzig University (http://www.uni-leipzig.de/)
Martin-Luther-University Halle-Wittenberg (http://www.uni-halle.de/)
The Web Technology & Information Systems Network
Authors
Hagen, Matthias; Fröbe, Maik; Jurk, Artur; Potthast, Martin
License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

The Webis Clickbait Spoiling Corpus 2022 contains 5,000 spoiled clickbait posts crawled from Facebook, Reddit, and Twitter. The corpus supports the task of clickbait spoiling: generating a short text that satisfies the curiosity induced by a clickbait post. It contains the clickbait posts, manually cleaned versions of the linked documents, and the extracted spoiler for each post. The spoilers are additionally categorized into three types: short phrase spoilers, longer passage spoilers, and spoilers made up of multiple non-consecutive pieces of text.
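
Since the corpus is distributed as jsonl (one JSON object per line), a first step is often to tally how many posts fall under each spoiler type. The following is a minimal sketch only: the field name `tags` and the label values `phrase`, `passage`, and `multi` are assumptions not confirmed by this page, so check them against the downloaded files before relying on them.

```python
import json
from collections import Counter
from typing import Iterable


def count_spoiler_types(lines: Iterable[str], type_field: str = "tags") -> Counter:
    """Count spoiler-type labels across JSONL records (one JSON object per line).

    `type_field` is an assumed field name; adjust it to the real schema.
    """
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        value = record.get(type_field, [])
        # Accept either a single label or a list of labels.
        labels = [value] if isinstance(value, str) else value
        counts.update(labels)
    return counts


# Hypothetical in-memory records standing in for lines of a downloaded file.
sample_lines = [
    json.dumps({"tags": ["phrase"]}),
    json.dumps({"tags": ["passage"]}),
    json.dumps({"tags": ["phrase"]}),
    json.dumps({"tags": ["multi"]}),
]

for label, n in sorted(count_spoiler_types(sample_lines).items()):
    print(label, n)
```

To run this on a real file, pass an open file handle instead of the sample list, for example `with open(path, encoding="utf-8") as f: count_spoiler_types(f)`, where `path` points to whichever jsonl file you downloaded.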
