3 datasets found

Webis-Clickbait-16
webis.de
3251557
Updated 2016
Cite
Martin Potthast; Benno Stein; Matthias Hagen; Sebastian Köpsel (2016). Webis-Clickbait-16 [Dataset]. http://doi.org/10.5281/zenodo.3251557
Unique identifier
https://doi.org/10.5281/zenodo.3251557
Dataset updated
2016
Dataset provided by
University of Kassel, hessian.AI, and ScaDS.AI
Friedrich Schiller University Jena
Bauhaus-Universität Weimar
The Web Technology & Information Systems Network
Authors
Martin Potthast; Benno Stein; Matthias Hagen; Sebastian Köpsel
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The Webis Clickbait Corpus 2016 (Webis-Clickbait-16) comprises 2,992 tweets sampled from the top 20 news publishers by retweets in 2014. Three independent annotators manually judged whether each tweet can be considered clickbait. A total of 767 tweets are considered clickbait by the majority of annotators. The annotators' majority vote can be used as ground truth for building clickbait detection technology. This corpus is the first of its kind and supports the development of technology to tackle clickbait.
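The majority vote described above can be sketched as follows. This is a minimal illustration, not the corpus authors' code: the tweet ids, the label strings, and the `majority_label` helper are all hypothetical, and the actual file layout of the corpus is not assumed here.

```python
from collections import Counter


def majority_label(votes):
    """Return the label chosen by a strict majority of annotators, or None if there is none."""
    label, count = Counter(votes).most_common(1)[0]
    return label if count > len(votes) / 2 else None


# Hypothetical annotations: tweet id -> judgments of three annotators.
annotations = {
    "tweet-1": ["clickbait", "clickbait", "no-clickbait"],
    "tweet-2": ["no-clickbait", "no-clickbait", "no-clickbait"],
}

# Ground truth derived from the majority vote of the annotators.
ground_truth = {tweet_id: majority_label(v) for tweet_id, v in annotations.items()}
```

With three annotators and two labels a strict majority always exists, so the `None` branch only matters if the helper is reused with an even number of judgments.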
Webis-Clickbait-17
webis.de
5530410
Updated 2017
Cite
Martin Potthast; Tim Gollub; Matti Wiegmann; Benno Stein; Matthias Hagen (2017). Webis-Clickbait-17 [Dataset]. http://doi.org/10.5281/zenodo.5530410
Unique identifier
https://doi.org/10.5281/zenodo.5530410
Dataset updated
2017
Dataset provided by
University of Kassel, hessian.AI, and ScaDS.AI
Friedrich Schiller University Jena
Bauhaus-Universität Weimar
The Web Technology & Information Systems Network
Authors
Martin Potthast; Tim Gollub; Matti Wiegmann; Benno Stein; Matthias Hagen
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The Webis Clickbait Corpus 2017 (Webis-Clickbait-17) comprises a total of 38,517 Twitter posts from 27 major US news publishers. In addition to the posts, information about the articles linked in the posts is included. The posts were published between November 2016 and June 2017. To avoid publisher and topical biases, a maximum of ten posts per day and publisher were sampled. All posts were annotated on a 4-point scale [not click baiting (0.0), slightly click baiting (0.33), considerably click baiting (0.66), heavily click baiting (1.0)] by five annotators from Amazon Mechanical Turk. A total of 9,276 posts are considered clickbait by the majority of annotators. In terms of size, this corpus exceeds the Webis Clickbait Corpus 2016 by one order of magnitude. The corpus is divided into two logical parts, a training and a test dataset. The training dataset has been released in the course of the Clickbait Challenge and a download link is provided below. To allow for an objective evaluation of clickbait detection systems, the test dataset is currently available only through the Evaluation-as-a-Service platform TIRA. On TIRA, developers can deploy clickbait detection systems and execute them against the test dataset. The performance of the submitted systems can be viewed on the TIRA page of the Clickbait Challenge.
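One way to turn five judgments on the 4-point scale into a single label is sketched below. The binarization rule is an assumption, not taken from the corpus documentation: a post counts as clickbait when a strict majority of annotators rated it above 0.5, that is, as considerably or heavily click baiting. The `aggregate` function and its `threshold` parameter are hypothetical names.

```python
from statistics import mean

# The four scale values named in the description.
SCALE = (0.0, 0.33, 0.66, 1.0)


def aggregate(judgments, threshold=0.5):
    """Summarize the judgments for one post.

    Assumption: a judgment above `threshold` is a vote for "clickbait",
    and the post is clickbait if a strict majority of annotators vote that way.
    """
    if any(j not in SCALE for j in judgments):
        raise ValueError("judgments must use the 4-point scale")
    votes_for = sum(1 for j in judgments if j > threshold)
    return {
        "mean": mean(judgments),
        "is_clickbait": votes_for > len(judgments) / 2,
    }


# Hypothetical post: three of five annotators rate it 0.66 or higher.
summary = aggregate([1.0, 0.66, 0.66, 0.0, 0.33])
```

With five annotators this rule is equivalent to checking whether the median judgment exceeds the threshold; other aggregation rules, such as thresholding the mean, can give different labels for the same post.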
Webis Clickbait Corpus 2016 (Webis-Clickbait-16)
live.european-language-grid.eu
data.niaid.nih.gov
json
Updated Apr 30, 2024
Cite
(2024). Webis Clickbait Corpus 2016 (Webis-Clickbait-16) [Dataset]. https://live.european-language-grid.eu/catalogue/corpus/7534
Available download formats
json
Dataset updated
Apr 30, 2024
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The Webis Clickbait Corpus 2016 (Webis-Clickbait-16) comprises 2,992 tweets sampled from the top 20 news publishers by retweets in 2014. Three independent annotators manually judged whether each tweet can be considered clickbait. A total of 767 tweets are considered clickbait by the majority of annotators. The annotators' majority vote can be used as ground truth for building clickbait detection technology. This corpus is the first of its kind and supports the development of technology to tackle clickbait.