Cite
Khalid Al-Khatib; Henning Wachsmuth; Matthias Hagen; Benno Stein; Kevin Lang; Jakob Herpel (2022). Webis-WikiDebate-18 [Dataset]. http://doi.org/10.5281/zenodo.3339136

Webis-WikiDebate-18

2 scholarly articles cite this dataset
Available download formats
application/gzip
Dataset updated
Aug 29, 2022
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Khalid Al-Khatib; Henning Wachsmuth; Matthias Hagen; Benno Stein; Kevin Lang; Jakob Herpel
License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

Webis-WikiDebate-18 is a large-scale corpus for the argumentation model. The corpus was generated automatically from the metadata in discussions and then partly verified by an expert.

The table has the following attributes:

  • comment-ID
  • discussion-ID
  • reference comment-ID
  • comment (wiki format with tag)
  • comment (plain text without tag)
  • username
  • timestamp
  • hierarchy level in discussion

The IDs consist of up to three numbers. For example, the comment-ID "3203277-22-11" consists of the page-ID "3203277", the discussion index 22 (the 23rd discussion) and the comment index 11 (the 12th comment inside that discussion). Please note that counting starts at 0. The page-ID is MediaWiki's internal article ID and can be looked up via the curid parameter (e.g. http://en.wikipedia.org/?curid=3203277). An article on Wikipedia and its corresponding talk page have two different IDs.
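The ID scheme described above can be unpacked with a short Python sketch. The names `CommentId`, `parse_comment_id` and `article_url` are illustrative and not part of the dataset, and the sketch assumes the numbers are always joined by "-" as in the example.

```python
from typing import NamedTuple, Optional


class CommentId(NamedTuple):
    """Hypothetical container for the up to three numbers of an ID."""
    page_id: int
    discussion_index: Optional[int]  # zero-based, None if absent
    comment_index: Optional[int]     # zero-based, None if absent


def parse_comment_id(raw: str) -> CommentId:
    """Split an ID such as "3203277-22-11" into its numeric parts."""
    parts = [int(p) for p in raw.split("-")]
    if not 1 <= len(parts) <= 3:
        raise ValueError(f"expected 1 to 3 numbers, got {raw!r}")
    # Pad shorter IDs (page-ID or discussion-ID only) with None.
    padded = parts + [None] * (3 - len(parts))
    return CommentId(*padded)


def article_url(page_id: int) -> str:
    """Build the curid lookup URL in the form shown in the description."""
    return f"http://en.wikipedia.org/?curid={page_id}"


cid = parse_comment_id("3203277-22-11")
# Counting starts at 0, so add 1 for the human-readable position.
print(cid.discussion_index + 1, cid.comment_index + 1)  # 23 12
print(article_url(cid.page_id))
```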
A value is "\N" when the comment or the structure of the discussion was ill-formed and no previous comment, user or timestamp could be determined. This happens quite often because no editor checks the metadata; everything has to be managed by the users themselves.
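When loading the table, the "\N" marker can be mapped to a proper missing value. The following Python sketch assumes every field is read as a string; the helper name `clean_field` and the sample row are made up for illustration and are not taken from the corpus.

```python
from typing import Dict, Optional

# Literal backslash followed by "N", as described for ill-formed entries.
MISSING = r"\N"


def clean_field(value: str) -> Optional[str]:
    """Return None for the missing-value marker, the value itself otherwise."""
    return None if value == MISSING else value


# Made-up row for illustration only.
row: Dict[str, str] = {
    "username": r"\N",
    "timestamp": r"\N",
    "hierarchy level": "1",
}
cleaned = {key: clean_field(value) for key, value in row.items()}
print(cleaned)
```

Mapping the marker to `None` early keeps later checks simple, since a missing user or timestamp can then be tested with `is None` instead of a string comparison.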
The reference comment-ID is the ID of the last comment on the higher hierarchy level, i.e. the comment that the current one responds to.
The hierarchy level of a comment in a discussion is given by the number of ":" characters at the beginning of the comment in wiki format and shows how deeply the comment is nested in the discussion.
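The two rules above (counting leading ":" characters and linking a comment to the last comment on a higher level) can be sketched in Python as follows. This is one possible reading, not the authors' extraction code: it assumes "higher level" means any earlier comment with fewer leading colons, and that top-level comments have no reference. The function names and sample comments are hypothetical.

```python
from typing import List, Optional


def hierarchy_level(wiki_comment: str) -> int:
    """Count the ':' characters at the very beginning of a wiki-format comment."""
    return len(wiki_comment) - len(wiki_comment.lstrip(":"))


def reference_indices(levels: List[int]) -> List[Optional[int]]:
    """For each comment, find the most recent earlier comment with a smaller level.

    Assumption: a comment with no earlier, less-indented comment gets None.
    """
    refs: List[Optional[int]] = []
    for i, level in enumerate(levels):
        ref = None
        # Walk backwards until a less-indented comment is found.
        for j in range(i - 1, -1, -1):
            if levels[j] < level:
                ref = j
                break
        refs.append(ref)
    return refs


# Made-up discussion for illustration only.
comments = [
    "Top-level remark",
    ":First reply",
    "::Reply to the first reply",
    ":Second reply to the remark",
]
levels = [hierarchy_level(c) for c in comments]
print(levels)                     # [0, 1, 2, 1]
print(reference_indices(levels))  # [None, 0, 1, 0]
```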
