3 datasets found

Credibility Corpus with several datasets (Twitter, Web database) in French...
data.europa.eu
data.wu.ac.at
rar
Updated Dec 1, 2016
Cite
DELETED DELETED (2016). Credibility Corpus with several datasets (Twitter, Web database) in French and English [Dataset]. https://data.europa.eu/data/datasets/5840066288ee38426dc65bb3?locale=fr
Explore at:
Available download formats: rar(33261), rar(680351), rar(40693), rar(102374), rar(77120), rar(212274)
Dataset updated
Dec 1, 2016
Dataset authored and provided by
DELETED DELETED
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
French
Description
Description of the corpora

These datasets were built to analyze information credibility in general (rumor and disinformation in English and French documents), in particular as it occurs on the social web. Dedicated databases about rumors, hoaxes and disinformation helped to collect clear cases of misinformation. A set of topics (with keywords) was used to build corpora from the microblogging platform Twitter, a major source of rumors and disinformation.

1 corpus of texts from the web database about rumors and disinformation.
4 corpora from Twitter about specific rumors (2 in English, 2 in French).
4 corpora from Twitter built randomly (2 in English, 2 in French).
4 corpora from Twitter about specific events (2 in English, 2 in French).

Size of the different corpora:

Social Web Rumorous corpus: 1,612

French Hollande Rumorous corpus (Twitter): 371
French Lemon Rumorous corpus (Twitter): 270
English Pin Rumorous corpus (Twitter): 679
English Swine Rumorous corpus (Twitter): 1024

French 1st Random corpus (Twitter): 1000
French 2nd Random corpus (Twitter): 1000
English 3rd Random corpus (Twitter): 1000
English 4th Random corpus (Twitter): 1000

French Rihanna Event corpus (Twitter): 543
English Rihanna Event corpus (Twitter): 1000
French Euro2016 Event corpus (Twitter): 1000
English Euro2016 Event corpus (Twitter): 1000

A matrix links the tweets to the 50 most frequent words.

Text data:

_id: message id
body text: string text data

Matrix data:

52 columns: the first column is the message id, the second column is the rumor indicator (1 or -1), and each remaining column corresponds to one word, with value 1 if the message contains the word and 0 if it does not.
11,102 lines (each line is a message)
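The column layout described above can be checked row by row. The following is a minimal Python sketch, not the author's own tooling: the comma delimiter, the function name `parse_matrix_row` and the `MatrixRow` type are assumptions made for illustration, since the description does not state how the matrix file is delimited.

```python
from typing import List, NamedTuple


class MatrixRow(NamedTuple):
    """One message from the matrix file (hypothetical container type)."""
    message_id: str
    rumor_label: int        # 1 or -1, as stated in the description
    word_flags: List[int]   # 50 values, each 0 or 1


def parse_matrix_row(line: str, delimiter: str = ",") -> MatrixRow:
    """Parse one line of the matrix into id, rumor indicator and word flags.

    The delimiter is an assumption; adjust it to the actual file.
    """
    fields = [field.strip() for field in line.rstrip("\n").split(delimiter)]
    if len(fields) != 52:
        raise ValueError(f"expected 52 columns, got {len(fields)}")

    label = int(fields[1])
    if label not in (1, -1):
        raise ValueError(f"rumor indicator must be 1 or -1, got {label}")

    flags = [int(value) for value in fields[2:]]
    if any(value not in (0, 1) for value in flags):
        raise ValueError("word columns must contain only 0 or 1")

    return MatrixRow(fields[0], label, flags)


if __name__ == "__main__":
    # Synthetic example row, not taken from the dataset.
    example = ",".join(["msg-1", "-1"] + ["0"] * 49 + ["1"])
    row = parse_matrix_row(example)
    print(row.message_id, row.rumor_label, sum(row.word_flags))
```

Validating the column count and the allowed values up front makes a wrong delimiter guess fail immediately instead of producing silently misaligned features.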

Hidalgo corpus: lines range 1:75
Lemon corpus: lines range 76:467
Pin rumor: lines range 468:656
Swine: lines range 657:1311
Random messages: lines range 1312:11103
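The line ranges above can be turned into a small lookup that maps a matrix line to its corpus. This is a sketch under the assumption that the ranges are 1-based and inclusive at both ends; the names `CORPUS_LINE_RANGES`, `range_size` and `corpus_for_line` are hypothetical.

```python
from typing import Dict, Optional, Tuple

# Line ranges copied from the description; assumed 1-based and inclusive.
CORPUS_LINE_RANGES: Dict[str, Tuple[int, int]] = {
    "Hidalgo": (1, 75),
    "Lemon": (76, 467),
    "Pin rumor": (468, 656),
    "Swine": (657, 1311),
    "Random messages": (1312, 11103),
}


def range_size(first: int, last: int) -> int:
    """Number of lines in an inclusive range."""
    return last - first + 1


def corpus_for_line(line_number: int) -> Optional[str]:
    """Return the corpus whose range contains line_number, or None."""
    for name, (first, last) in CORPUS_LINE_RANGES.items():
        if first <= line_number <= last:
            return name
    return None


if __name__ == "__main__":
    for name, (first, last) in CORPUS_LINE_RANGES.items():
        print(f"{name}: {range_size(first, last)} lines")
    print(corpus_for_line(500))
```

Under that reading, the Pin rumor range covers 189 lines, which matches the 189 lines reported for the sample matrix. The ranges end at line 11103 while the matrix is described as having 11,102 lines; the description does not explain the one-line difference.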

Sample contains:

French Pin Rumorous corpus (Twitter): 679

Matrix data:

52 columns (first column is id, second column is rumor indicator 1 or -1, other columns are words, with value 1 if the message contains the word and 0 if it does not)
189 lines (each line is a message)
Credibility Corpus with several datasets (Twitter, Web database) in French...
live.european-language-grid.eu
txt
Updated Apr 10, 2024
Cite
(2024). Credibility Corpus with several datasets (Twitter, Web database) in French and English [Dataset]. https://live.european-language-grid.eu/catalogue/corpus/7468
Explore at:
Available download formats: txt
Dataset updated
Apr 10, 2024
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
French
Credibility Corpus with several datasets (Twitter, Web database) in French...
data.europa.eu
rar
Updated Dec 1, 2016
Cite
Nicolas Turenne (2016). Credibility Corpus with several datasets (Twitter, Web database) in French and English [Dataset]. https://data.europa.eu/data/datasets/5840066288ee38426dc65bb3/embed
Explore at:
Available download formats: rar(33261), rar(77120), rar(212274), rar(680351), rar(40693), rar(102374)
Dataset updated
Dec 1, 2016
Dataset authored and provided by
Nicolas Turenne
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
French