2 datasets found

PAN19 Authorship Analysis: Celebrity Profiling
data.niaid.nih.gov
zenodo.org
Updated Oct 24, 2023
Cite
Potthast, Martin (2023). PAN19 Authorship Analysis: Celebrity Profiling [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3530252
Dataset updated
Oct 24, 2023
Dataset provided by
Potthast, Martin
Wiegmann, Matti
Stein, Benno
Description
Paper: https://webis.de/publications.html?q=wiegmann_2019a Source Dataset: https://files.webis.de/data-in-progress/data-research/social-media-analysis/acl19-celebrity-profiling/

Celebrities are among the most prolific users of social media, promoting their personas and rallying followers. This activity is closely tied to genuine writing samples, rendering them worthy research subjects in many respects, not least author profiling.

The Celebrity Profiling task this year is to predict four traits of a celebrity from their social media communication. The traits are the degree of fame, occupation, age, and gender. The social media communication is given as the teaser messages from past tweets. The goal is to develop a piece of software which predicts celebrity traits from the teaser history.

The training dataset contains two files: a feeds.ndjson as input and a labels.ndjson as output. Each file lists all celebrities as JSON objects, one per line and identified by the id key.

The input file contains the celebrity id (the id key) and a list of all teaser messages for each celebrity.

{"id": 1234, "text": ["a tweet", "another tweet", ...]}

The output file contains the celebrity id (the id key) and a value for each trait for each celebrity from the input file.

{"id": 1234, "fame": "star", "occupation": "sports", "gender": "female", "birthyear": 2002}

The following values are possible for each of the traits:

fame := {rising, star, superstar}
occupation := {sports, performer, creator, politics, manager, science, professional, religious}
birthyear := {1940, ..., 2012}
gender := {male, female, nonbinary}
PAN19 Authorship Analysis: Celebrity Profiling
zenodo.org
zip
Updated Nov 14, 2023
Cite
Matti Wiegmann; Benno Stein; Martin Potthast (2023). PAN19 Authorship Analysis: Celebrity Profiling [Dataset]. http://doi.org/10.5281/zenodo.3530253
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.3530253
Dataset updated
Nov 14, 2023
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Matti Wiegmann; Benno Stein; Martin Potthast
Description
Celebrities are among the most prolific users of social media, promoting their personas and rallying followers. This activity is closely tied to genuine writing samples, rendering them worthy research subjects in many respects, not least author profiling.

The Celebrity Profiling task this year is to predict four traits of a celebrity from their social media communication. The traits are the degree of fame, occupation, age, and gender. The social media communication is given as the teaser messages from past tweets. The goal is to develop a piece of software which predicts celebrity traits from the teaser history.

The training dataset contains two files: a feeds.ndjson as input and a labels.ndjson as output. Each file lists all celebrities as JSON objects, one per line and identified by the id key.

The input file contains the celebrity id (the id key) and a list of all teaser messages for each celebrity.

{"id": 1234, "text": ["a tweet", "another tweet", ...]}
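A minimal sketch of reading the input file in Python, assuming every line is one complete JSON object with the id and text keys shown above. The function name read_feeds and the in-memory sample are illustrative only and are not part of the dataset.

```python
import io
import json


def read_feeds(lines):
    """Map each celebrity id to its list of teaser messages."""
    feeds = {}
    for line in lines:
        line = line.strip()
        if not line:  # tolerate blank lines between records
            continue
        record = json.loads(line)
        feeds[record["id"]] = record["text"]
    return feeds


# Hypothetical two-line sample standing in for feeds.ndjson.
sample = io.StringIO(
    '{"id": 1234, "text": ["a tweet", "another tweet"]}\n'
    '{"id": 5678, "text": ["hello"]}\n'
)
feeds = read_feeds(sample)
```

For the real file, the same function could be given an open file handle, for example open("feeds.ndjson", encoding="utf-8"), since it only iterates over lines.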

The output file contains the celebrity id (the id key) and a value for each trait for each celebrity from the input file.

{"id": 1234, "fame": "star", "occupation": "sports", "gender": "female", "birthyear": 2002}
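Because both files identify celebrities by the id key, messages and labels can be paired with a simple lookup. The following is a sketch under that assumption; join_by_id is a hypothetical helper name, and the two sample lines mirror the examples above rather than real data.

```python
import json


def join_by_id(feed_lines, label_lines):
    """Pair each celebrity's messages with its labels via the shared id key."""
    labels = {}
    for line in label_lines:
        if line.strip():
            record = json.loads(line)
            labels[record["id"]] = record

    pairs = []
    for line in feed_lines:
        if line.strip():
            feed = json.loads(line)
            label = labels.get(feed["id"])
            if label is not None:  # skip feeds that have no label record
                pairs.append((feed["text"], label))
    return pairs


# Hypothetical one-record samples standing in for the two files.
feed_lines = ['{"id": 1234, "text": ["a tweet", "another tweet"]}']
label_lines = [
    '{"id": 1234, "fame": "star", "occupation": "sports", '
    '"gender": "female", "birthyear": 2002}'
]
pairs = join_by_id(feed_lines, label_lines)
```

Loading the labels into a dictionary first keeps the join independent of the order in which celebrities appear in either file.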

The following values are possible for each of the traits:

fame := {rising, star, superstar}
occupation := {sports, performer, creator, politics, manager, science, professional, religious}
birthyear := {1940, ..., 2012}
gender := {male, female, nonbinary}
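The value sets listed above can be turned into a simple sanity check for label records. This is only a sketch: the names ALLOWED, BIRTHYEARS, and invalid_traits are hypothetical, and it assumes birthyear is stored as an integer, as in the example record.

```python
# Allowed categorical values, copied from the dataset description.
ALLOWED = {
    "fame": {"rising", "star", "superstar"},
    "occupation": {
        "sports", "performer", "creator", "politics",
        "manager", "science", "professional", "religious",
    },
    "gender": {"male", "female", "nonbinary"},
}
BIRTHYEARS = range(1940, 2013)  # 1940 through 2012 inclusive


def invalid_traits(label):
    """Return the names of traits whose value is missing or not allowed."""
    bad = [
        trait for trait, allowed in ALLOWED.items()
        if label.get(trait) not in allowed
    ]
    if label.get("birthyear") not in BIRTHYEARS:
        bad.append("birthyear")
    return bad


example = {
    "id": 1234, "fame": "star", "occupation": "sports",
    "gender": "female", "birthyear": 2002,
}
problems = invalid_traits(example)
```

An empty list means the record uses only the documented values; any other result names the traits worth inspecting.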