11 datasets found

Webis-Editorial-Quality-18
webis.de
anthology.aicmu.ac.cn
1340629
Updated 2018
Cite
Roxanne El Baff; Henning Wachsmuth; Khalid Al-Khatib; Benno Stein (2018). Webis-Editorial-Quality-18 [Dataset]. http://doi.org/10.5281/zenodo.1340629
Unique identifier
https://doi.org/10.5281/zenodo.1340629
Dataset updated
2018
Dataset provided by
University of Groningen
Bauhaus-Universität Weimar
Leibniz Universität Hannover
The Web Technology & Information Systems Network
Deutsches Zentrum für Luft- und Raumfahrt
Authors
Roxanne El Baff; Henning Wachsmuth; Khalid Al-Khatib; Benno Stein
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The Webis-Editorial-Quality-18 corpus comprises 1000 news editorials annotated according to a new notion of argumentation quality. The notion concerns whether an editorial brings readers of opposing beliefs closer together or instead widens the gap between them. In particular, we label each editorial in the corpus as challenging, reinforcing, or no-effect. To account for the political ideology of the target readers, each editorial is labelled by three liberals and three conservatives.
Webis-Editorial-Quality-18 corpus
live.european-language-grid.eu
csv
Updated Apr 30, 2024
+ more versions
Cite
(2024). Webis-Editorial-Quality-18 corpus [Dataset]. https://live.european-language-grid.eu/catalogue/corpus/7424
Dataset updated
Apr 30, 2024
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
Challenge or Empower: Revisiting Argumentation Quality in a News Editorial Corpus. The Webis-Editorial-Quality-18 corpus is a novel corpus of 1000 news editorials. Its aim is to support the study of a new notion of news editorial quality. It contains quality assessments of the 1000 editorials, each annotated by three liberals and three conservatives. The annotators also reported free-text reasons for the effects they observed.
Webis EditorialSum Corpus 2020
live.european-language-grid.eu
csv
Updated Oct 19, 2020
Cite
(2020). Webis EditorialSum Corpus 2020 [Dataset]. https://live.european-language-grid.eu/catalogue/corpus/7658
Dataset updated
Oct 19, 2020
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The Webis EditorialSum Corpus consists of 1330 manually curated extractive summaries for 266 news editorials spanning three diverse portals: Al-Jazeera, Guardian and Fox News. Each editorial has 5 summaries, each labeled for overall quality and fine-grained properties such as thesis-relevance, persuasiveness, reasonableness, self-containedness.

The files are organized as follows:

corpus.csv - Contains all the editorials and their acquired summaries
Note: (X = [1,5] for five summaries)
- article_id : Article ID in the corpus
- title : Title of the editorial
- article_text : Plain text of the editorial
- summary_{X}_text : Plain text of the corresponding summary
- thesis_{X}_text : Plain text of the thesis from the corresponding summary
- lead : top 15% of the editorial's segments
- body : segments between lead and conclusion sections
- conclusion : bottom 15% of the editorial's segments
- article_segments : Collection of paragraphs, each further divided into collection of segments containing: { "number": segment order in the editorial, "text" : segment text, "label": ADU type }
- summary_{X}_segments : Collection of summary segments containing: { "number": segment order in the editorial, "text" : segment text, "adu_label": ADU type from the editorial, "summary_label": can be 'thesis' or 'justification' }

quality-groups.csv - Contains the IDs for high (and low)-quality summaries for each quality dimension per editorial
For example: article_id 2 has four high_quality summaries (summary_1, summary_2, summary_3, summary_4) and one low_quality summary (summary_5) in terms of overall quality.
The summary texts can be obtained from corpus.csv respectively.
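The column layout described for corpus.csv can be read roughly as follows. This is a minimal sketch, not the corpus authors' loader: it assumes a comma-delimited file with a header row using the column names listed above, and it runs on a made-up in-memory sample rather than the real file. The function name summaries_per_article is hypothetical.

```python
import csv
import io

# Made-up stand-in for corpus.csv; the rows are illustrative, not corpus data.
SAMPLE = (
    "article_id,title,summary_1_text,summary_2_text\n"
    "1,Example title,First summary.,Second summary.\n"
)


def summaries_per_article(handle, max_summaries=5):
    """Map each article_id to the list of its non-empty summary texts.

    Assumes columns named summary_{X}_text for X = 1..max_summaries,
    as in the description; missing columns are skipped.
    """
    result = {}
    for row in csv.DictReader(handle):
        texts = []
        for x in range(1, max_summaries + 1):
            text = row.get(f"summary_{x}_text")
            if text:
                texts.append(text)
        result[row["article_id"]] = texts
    return result


summaries = summaries_per_article(io.StringIO(SAMPLE))
print(summaries)
```

For the real file, the same function could be given an open file handle, for example one obtained with open("corpus.csv", newline="", encoding="utf-8"); the delimiter, encoding and quoting of the actual download are assumptions that would need checking.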
Webis Abstractive Snippet Corpus 2020
live.european-language-grid.eu
json
Updated Aug 19, 2023
+ more versions
Cite
(2023). Webis Abstractive Snippet Corpus 2020 [Dataset]. https://live.european-language-grid.eu/catalogue/corpus/7817
Dataset updated
Aug 19, 2023
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The Webis Abstractive Snippet Corpus 2020 (Webis-Snippet-20) comprises four abstractive snippet datasets from ClueWeb09, ClueWeb12, and DMOZ descriptions. More than 10 million
COVID-19 CDC dataset v1. Bilingual (EN-FR)
live.european-language-grid.eu
tmx
Updated Apr 25, 2020
+ more versions
Cite
(2020). COVID-19 CDC dataset v1. Bilingual (EN-FR) [Dataset]. https://live.european-language-grid.eu/catalogue/corpus/21051
Dataset updated
Apr 25, 2020
License
https://elrc-share.eu/terms/publicDomain.html
Description
EN-FR bilingual COVID-19-related corpus acquired from the website (https://www.cdc.gov/) of the Centers for Disease Control and Prevention of the US government (25th April 2020).
COVID-19 CDC dataset v2. Multilingual (EN, ES, FR, PT, IT, DE, KO, RU, ZH,...
live.european-language-grid.eu
tmx
Updated Aug 15, 2020
+ more versions
Cite
(2020). COVID-19 CDC dataset v2. Multilingual (EN, ES, FR, PT, IT, DE, KO, RU, ZH, UK, VI) [Dataset]. https://live.european-language-grid.eu/catalogue/corpus/21340
Dataset updated
Aug 15, 2020
License
https://elrc-share.eu/terms/publicDomain.html
Description
Multilingual (EN, ES, FR, PT, IT, DE, KO, RU, ZH, UK, VI) COVID-19-related corpus acquired from the website (https://www.cdc.gov/) of the Centers for Disease Control and Prevention of the US government (11th August 2020). It contains 51202 translation units (TUs) in total.
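A TU count like the one quoted above can be checked by parsing the TMX file. The sketch below is an illustration under stated assumptions: it uses a made-up two-unit sample instead of the real download, assumes the common TMX 1.4 layout (tu elements holding tuv elements with an xml:lang attribute and a seg child), and the helper name read_units is hypothetical.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Made-up two-unit TMX sample; the sentences are illustrative only.
SAMPLE_TMX = """<tmx version="1.4">
  <header srclang="en" datatype="plaintext" segtype="sentence"
          adminlang="en" creationtool="example" creationtoolversion="1"
          o-tmf="example"/>
  <body>
    <tu>
      <tuv xml:lang="en"><seg>Wash your hands.</seg></tuv>
      <tuv xml:lang="fr"><seg>Lavez-vous les mains.</seg></tuv>
    </tu>
    <tu>
      <tuv xml:lang="en"><seg>Stay home when sick.</seg></tuv>
      <tuv xml:lang="fr"><seg>Restez chez vous si vous êtes malade.</seg></tuv>
    </tu>
  </body>
</tmx>"""


def read_units(tmx_text):
    """Return one dict per translation unit, mapping language code to text."""
    root = ET.fromstring(tmx_text)
    units = []
    for tu in root.iter("tu"):
        unit = {}
        for tuv in tu.findall("tuv"):
            seg = tuv.find("seg")
            # itertext() also collects text nested inside inline markup.
            unit[tuv.get(XML_LANG)] = "".join(seg.itertext()) if seg is not None else ""
        units.append(unit)
    return units


units = read_units(SAMPLE_TMX)
print(len(units), "translation units")
```

For a file with tens of thousands of TUs, streaming with ET.iterparse would likely use less memory than loading the whole document; that variant is not shown here.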
BMVI Website (Processed)
live.european-language-grid.eu
data.europa.eu
tmx
Updated Mar 1, 2018
Cite
(2018). BMVI Website (Processed) [Dataset]. https://live.european-language-grid.eu/catalogue/corpus/3028
Dataset updated
Mar 1, 2018
License
https://elrc-share.eu/terms/openUnderPSI.html
Description
TMX file with 2718 TUs, bilingual German/English, containing texts from the website of the Federal Ministry of Transport and Digital Infrastructure (BMVI) on transport issues. The original TMX file was corrected and stripped.
German-French website parallel corpus from the Federal Foreign Office Berlin...
live.european-language-grid.eu
data.europa.eu
tmx
Updated Jan 11, 2022
+ more versions
Cite
(2022). German-French website parallel corpus from the Federal Foreign Office Berlin [Dataset]. https://live.european-language-grid.eu/catalogue/corpus/2874
Dataset updated
Jan 11, 2022
License
https://elrc-share.eu/terms/openUnderPSI.html
Area covered
French
Description
German-French texts extracted from the website of the Federal Foreign Office Berlin. The corpus includes 11,852 pairs that were translated between October 2013 and the beginning of November 2015 and converted into the .TMX file format.
German-Portuguese website parallel corpus from the Federal Foreign Office...
live.european-language-grid.eu
data.europa.eu
tmx
Updated Jan 12, 2022
+ more versions
Cite
(2022). German-Portuguese website parallel corpus from the Federal Foreign Office Berlin [Dataset]. https://live.european-language-grid.eu/catalogue/corpus/2875
Dataset updated
Jan 12, 2022
License
https://elrc-share.eu/terms/openUnderPSI.html
Description
German-Portuguese texts extracted from the website of the Federal Foreign Office Berlin. The corpus includes 415 pairs that were translated between September 2013 and the beginning of December 2015 and converted into the .TMX file format.
Croatian-English parallel corpus from the website of the Croatian Journal of...
live.european-language-grid.eu
catalog.elra.info
+1 more
tmx
Updated Nov 19, 2018
Cite
(2018). Croatian-English parallel corpus from the website of the Croatian Journal of Fisheries (Processed) [Dataset]. https://live.european-language-grid.eu/catalogue/corpus/3178
Dataset updated
Nov 19, 2018
License
Attribution-NoDerivs 3.0 (CC BY-ND 3.0): https://creativecommons.org/licenses/by-nd/3.0/
License information was derived automatically
Description
Croatian-English parallel corpus from the website of the Croatian Journal of Fisheries (https://ribarstvo.agr.hr/)
BMI Brochures and Website 2016
live.european-language-grid.eu
data.europa.eu
tmx
Updated Jan 16, 2022
+ more versions
Cite
(2022). BMI Brochures and Website 2016 [Dataset]. https://live.european-language-grid.eu/catalogue/corpus/2886
Dataset updated
Jan 16, 2022
License
https://elrc-share.eu/terms/openUnderPSI.html
Description
Bilingual tmx file of German to English translations of the Federal Ministry of the Interior's website and brochures. Topics include terrorism, cyber security, asylum, cultural property, public administration and sport.

Webis-Editorial-Quality-18

3 scholarly articles cite this dataset (View in Google Scholar)
