3 datasets found
  1. Webis-Editorials-16

    • webis.de
    Cite
    Steve Göring; Henning Wachsmuth; Johannes Kiesel; Matthias Hagen; Benno Stein (2016). Webis-Editorials-16 [Dataset]. http://doi.org/10.5281/zenodo.3254405
    Explore at: http://doi.org/10.5281/zenodo.3254405
    Dataset updated
    2016
    Dataset provided by
    Leibniz Universität Hannover
    The Web Technology & Information Systems Network
    University of Groningen
    Friedrich Schiller University Jena
    Bauhaus-Universität Weimar
    Authors
    Steve Göring; Henning Wachsmuth; Johannes Kiesel; Matthias Hagen; Benno Stein
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The Webis-Editorials-16 corpus comprises 300 news editorials evenly selected from three diverse online news portals: Al Jazeera, Fox News, and The Guardian. The aim of the corpus is to study (1) the mining and classification of fine-grained types of argumentative discourse units and (2) the analysis of argumentation strategies pursued in editorials to achieve persuasion. To this end, each editorial contains manual type annotations of all units, capturing the role each unit plays in the argumentative discourse, such as assumption or statistics. The corpus consists of 14,313 units of six different types, each annotated by three professional annotators from the crowdsourcing platform upwork.com.
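Since every unit carries labels from three independent annotators, a common way to derive a single gold label is majority voting. A minimal sketch of that aggregation step (the unit labels shown are illustrative examples, not records from the corpus release):

```python
from collections import Counter

def majority_label(labels):
    """Return the label chosen by most annotators, or None on a full tie."""
    label, freq = Counter(labels).most_common(1)[0]
    return label if freq > 1 else None

# Hypothetical unit with three crowd annotations using the corpus' type names.
unit_annotations = ["assumption", "assumption", "statistics"]
print(majority_label(unit_annotations))  # -> assumption
```

Note that the corpus documentation does not state how (or whether) the three annotations were merged; majority voting is only one standard option, and keeping all three labels preserves information about annotator disagreement.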

  2. Webis-Editorials-16

    • live.european-language-grid.eu
    • data.niaid.nih.gov
    Cite
    (2024). Webis-Editorials-16 [Dataset]. https://live.european-language-grid.eu/catalogue/corpus/7538
    Explore at: https://live.european-language-grid.eu/catalogue/corpus/7538
    Available download formats: txt
    Dataset updated
    Apr 30, 2024
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    A newer (2018) version of this corpus is available at https://doi.org/10.5281/zenodo.1340629.

    (The description is identical to that of the Zenodo record in entry 1 above.)

  3. Webis ChangeMyView Corpus 2020 (Webis-CMV-20)

    • live.european-language-grid.eu
    Cite
    (2022). Webis ChangeMyView Corpus 2020 (Webis-CMV-20) [Dataset]. https://live.european-language-grid.eu/catalogue/corpus/7596
    Explore at: https://live.european-language-grid.eu/catalogue/corpus/7596
    Available download formats: json
    Dataset updated
    Mar 23, 2022
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The Webis-CMV-20 dataset comprises all available posts and comments in the ChangeMyView subreddit from the foundation of the subreddit in 2005 until September 2017. From these, we have derived two sub-datasets for the tasks of persuasiveness prediction and opinion malleability prediction. In addition, the corpus comprises historical posts by CMV authors and derived personal characteristics.

    Dataset specification

    All files are in bzip2-compressed JSON Lines format.

    threads.jsonl: contains all the selected discussion threads from CMV
    pairs.jsonl: each record contains a submission, a delta-comment, a non-delta-comment, and the comments' similarity score
    posts-malleability.jsonl: contains posts for opinion malleability prediction, in the format provided in the original Reddit Crawl dataset
    author_entity_category.jsonl: each record contains the author and the list of Wikipedia entities mentioned by the author in messages across all subreddits. For each mentioned entity we provide the following data:

    [title, wikidata_id, wikipedia_page_id, mentioned_entity_title, wikifier_score, subreddit_name, subreddit_id, subreddit_category_name, subreddit_topcategory_name]

    author_liwc.jsonl: personality-trait features computed with LIWC for the authors from the pairs.jsonl and posts-malleability.jsonl datasets
    author_subreddit.jsonl: for each author, statistics of the number of posts (submissions/comments) across all subreddits
    author_subreddit_category.jsonl: like author_subreddit.jsonl, but the author post statistics are grouped by top-categories and categories according to snoopsnoo.com
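Since all files are bzip2-compressed JSON Lines, each line decompresses to one JSON record. A minimal reader sketch using only the Python standard library; the sample records written here are made up for the round-trip demonstration and do not reflect the actual Webis-CMV-20 schema:

```python
import bz2
import json

def read_jsonl_bz2(path):
    """Yield one parsed record per line from a bzip2-compressed JSON Lines file."""
    with bz2.open(path, "rt", encoding="utf-8") as fh:
        for line in fh:
            if line.strip():
                yield json.loads(line)

# Round-trip demonstration on a tiny synthetic file (fields are illustrative).
sample = [
    {"id": "t3_abc", "title": "CMV: example thread"},
    {"id": "t3_def", "title": "CMV: another thread"},
]
with bz2.open("threads_sample.jsonl.bz2", "wt", encoding="utf-8") as fh:
    for rec in sample:
        fh.write(json.dumps(rec) + "\n")

records = list(read_jsonl_bz2("threads_sample.jsonl.bz2"))
print(len(records))  # -> 2
```

Streaming line by line like this avoids loading an entire file (e.g. the full threads.jsonl) into memory, which matters for a corpus covering a decade of subreddit activity.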


12 scholarly articles cite the Webis-Editorials-16 dataset (per Google Scholar).