2 datasets found
  1. Webis-Clickbait-17

    • webis.de
    Updated 2017
  2. Webis Clickbait Corpus 2017 (Webis-Clickbait-17)

    • zenodo.org
    zip
    Updated Jun 11, 2018
Cite
Potthast, Martin; Gollub, Tim; Wiegmann, Matti; Stein, Benno; Hagen, Matthias (2017) Webis-Clickbait-17. [Dataset] http://doi.org/10.5281/zenodo.3346491
Webis-Clickbait-17

5 scholarly articles cite this dataset
Dataset updated 2017
Dataset provided by
Bauhaus-Universität Weimar (https://www.uni-weimar.de/)
The Web Technology & Information Systems Network
Authors
Potthast, Martin; Gollub, Tim; Wiegmann, Matti; Stein, Benno; Hagen, Matthias
License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

The Webis Clickbait Corpus 2017 (Webis-Clickbait-17) comprises a total of 38,517 Twitter posts from 27 major US news publishers. In addition to the posts, information about the articles linked in the posts is included. The posts were published between November 2016 and June 2017. To avoid publisher and topical biases, a maximum of ten posts per day and publisher were sampled. All posts were annotated on a 4-point scale [not click baiting (0.0), slightly click baiting (0.33), considerably click baiting (0.66), heavily click baiting (1.0)] by five annotators from Amazon Mechanical Turk. A total of 9,276 posts are considered clickbait by the majority of annotators. In terms of size, this corpus exceeds the Webis Clickbait Corpus 2016 by one order of magnitude.

The corpus is divided into two logical parts, a training and a test dataset. The training dataset has been released in the course of the Clickbait Challenge and a download link is provided below. To allow for an objective evaluation of clickbait detection systems, the test dataset is currently available only through the Evaluation-as-a-Service platform TIRA. On TIRA, developers can deploy clickbait detection systems and execute them against the test dataset. The performance of the submitted systems can be viewed on the TIRA page of the Clickbait Challenge.
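The annotation scheme described above (five judgments per post on the 4-point scale) can be illustrated with a minimal Python sketch. It is not the corpus authors' code. The function name `aggregate_judgments` and the rule that a judgment of 0.66 or higher counts as a clickbait vote are assumptions made for illustration; the description does not state how the majority decision was derived.

```python
from statistics import mean

# The four scale values named in the dataset description.
SCALE = (0.0, 0.33, 0.66, 1.0)


def aggregate_judgments(judgments, vote_threshold=0.66):
    """Summarize one post's annotator judgments.

    Assumption (not from the dataset description): a judgment at or above
    `vote_threshold` counts as a "clickbait" vote, and a post is labeled
    clickbait when more than half of its annotators cast such a vote.
    """
    if not judgments:
        raise ValueError("at least one judgment is required")
    for j in judgments:
        if j not in SCALE:
            raise ValueError(f"unexpected scale value: {j!r}")

    votes = sum(1 for j in judgments if j >= vote_threshold)
    return {
        "mean": mean(judgments),
        "clickbait_votes": votes,
        "majority_clickbait": votes > len(judgments) / 2,
    }


# Hypothetical judgments for two posts, five annotators each.
post_a = aggregate_judgments([0.0, 0.33, 0.66, 1.0, 1.0])
post_b = aggregate_judgments([0.0, 0.0, 0.33, 0.66, 1.0])
print(post_a["majority_clickbait"], post_b["majority_clickbait"])
```

Under these assumptions, the first post receives three clickbait votes out of five and is labeled clickbait, while the second receives only two and is not. A different threshold, or thresholding the mean instead of counting votes, would give a different labeling.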
