MASH: A Multiplatform Annotated Dataset for Societal Impact of Hurricane
zenodo.org
Updated May 24, 2025
Cite
Anonymous; Anonymous (2025). MASH: A Multiplatform Annotated Dataset for Societal Impact of Hurricane [Dataset]. http://doi.org/10.5281/zenodo.15401479
Unique identifier
https://doi.org/10.5281/zenodo.15401479
Dataset updated
May 24, 2025
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Anonymous; Anonymous
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
MASH: A Multiplatform Annotated Dataset for Societal Impact of Hurricane

We present the Multiplatform Annotated Dataset for Societal Impact of Hurricane (MASH), which includes 98,662 relevant social media posts from Reddit, X, TikTok, and YouTube.

In addition, all relevant posts are annotated on three dimensions (Humanitarian Classes, Bias Classes, and Information Integrity Classes) using a multimodal approach that considers both textual and visual content (text, images, and videos), providing a richly labeled dataset for in-depth analysis.

The dataset is complemented by an Online Analytics Platform (https://hurricane.web.illinois.edu/) that allows users not only to view hurricane-related posts and articles, but also to explore high-frequency keywords, user sentiment, and the locations where posts were made.

To the best of our knowledge, MASH is the first large-scale, multi-platform, multimodal, and multi-dimensionally annotated hurricane dataset. We envision that MASH can contribute to the study of hurricanes' impact on society, supporting tasks such as disaster severity classification, event detection, public sentiment analysis, and bias identification.

Usage Notice

This dataset includes four annotation files:

• reddit_anno_publish.csv

• tiktok_anno_publish.csv

• twitter_anno_publish.csv

• youtube_anno_publish.csv

Each file contains post IDs and corresponding annotations on three dimensions: Humanitarian Classes, Bias Classes, and Information Integrity Classes.
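The four files can be read into a single collection tagged by platform. The following is a minimal sketch using only the Python standard library; the function name load_annotations and the added "platform" key are illustrative assumptions, and the column names are taken from whatever header each CSV actually contains (they are not listed in this description).

```python
import csv
from pathlib import Path

# File names as listed in the dataset description.
ANNOTATION_FILES = {
    "reddit": "reddit_anno_publish.csv",
    "tiktok": "tiktok_anno_publish.csv",
    "twitter": "twitter_anno_publish.csv",
    "youtube": "youtube_anno_publish.csv",
}


def load_annotations(data_dir):
    """Read every annotation file found in data_dir into one list of dicts.

    Each row keeps the columns defined by its CSV header and gains a
    "platform" key (an assumption of this sketch; it would overwrite a
    column of the same name). Missing files are skipped.
    """
    rows = []
    for platform, filename in ANNOTATION_FILES.items():
        path = Path(data_dir) / filename
        if not path.exists():
            continue
        with path.open(newline="", encoding="utf-8") as handle:
            for row in csv.DictReader(handle):
                row["platform"] = platform
                rows.append(row)
    return rows
```

Calling load_annotations on the directory that holds the downloaded files returns one dict per annotated post, with all values still stored as strings.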

To protect user privacy, only post IDs are released. We recommend retrieving the full post content via the official APIs of each platform, in accordance with their respective terms of service.

- Reddit API (https://www.reddit.com/dev/api)

- TikTok API (https://developers.tiktok.com/products/research-api)

- X/Twitter API (https://developer.x.com/en/docs/x-api)

- YouTube API (https://developers.google.com/youtube/v3)
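When retrieving content by ID, it can help to remove duplicate IDs and group the rest into fixed-size batches before calling a platform API. The sketch below is platform-agnostic and makes no network requests; the helper name batch_ids is hypothetical, and the appropriate batch size depends on the endpoint used, so check each platform's documentation and rate limits.

```python
def batch_ids(post_ids, batch_size):
    """Yield lists of at most batch_size unique IDs, in first-seen order."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    # dict.fromkeys drops duplicates while preserving insertion order.
    unique_ids = list(dict.fromkeys(post_ids))
    for start in range(0, len(unique_ids), batch_size):
        yield unique_ids[start:start + batch_size]


# Example with placeholder IDs (not real post IDs).
for batch in batch_ids(["a", "b", "a", "c"], batch_size=2):
    print(batch)
```

Each yielded batch can then be passed to whichever official client or endpoint is appropriate for that platform.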

Humanitarian Classes

Each post is annotated with seven binary humanitarian classes. For each class, the label is either:

• True – the post contains this humanitarian information

• False – the post does not contain this information

These seven humanitarian classes include:

• Casualty: The post reports people or animals who are killed, injured, or missing during the hurricane.

• Evacuation: The post describes the evacuation, relocation, rescue, or displacement of individuals or animals due to the hurricane.

• Damage: The post reports damage to infrastructure or public utilities caused by the hurricane.

• Advice: The post provides advice, guidance, or suggestions related to hurricanes, including how to stay safe, protect property, or prepare for the disaster.

• Request: The post requests help, support, or resources due to the hurricane.

• Assistance: The post describes assistance provided by individuals, communities, or organizations, including both physical aid and emotional or psychological support.

• Recovery: The post describes efforts or activities related to the recovery and rebuilding process after the hurricane.

Note: A single post may be labeled as True for multiple humanitarian categories.
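Because the seven labels are independent, each post is effectively a multi-label record. The sketch below shows one way to list the classes that are True for a row; the lowercase column names in HUMANITARIAN_CLASSES are assumptions for illustration and should be replaced with the actual headers in the CSV files.

```python
# Hypothetical column names; adjust to match the real CSV headers.
HUMANITARIAN_CLASSES = [
    "casualty", "evacuation", "damage", "advice",
    "request", "assistance", "recovery",
]


def parse_bool(value):
    """Interpret a CSV cell such as "True" or "False" as a Python bool."""
    text = str(value).strip().lower()
    if text == "true":
        return True
    if text == "false":
        return False
    raise ValueError(f"unexpected label value: {value!r}")


def active_classes(row, class_columns=HUMANITARIAN_CLASSES):
    """Return the names of all classes labeled True for one post."""
    return [name for name in class_columns if parse_bool(row[name])]


# Example row with a placeholder ID; two classes are True at once.
example = {
    "post_id": "abc", "casualty": "False", "evacuation": "True",
    "damage": "True", "advice": "False", "request": "False",
    "assistance": "False", "recovery": "False",
}
print(active_classes(example))
```

The same helper can be reused for other binary label groups by passing a different list of column names as class_columns.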

Bias Classes

Each post is annotated with five binary bias classes. For each class, the label is either:

• True – the post contains this bias information

• False – the post does not contain this information

These five bias classes include:

• Linguistic Bias: The post contains biased, inappropriate, or offensive language, with a focus on word choice, tone, or expression.

• Political Bias: The post expresses political ideology, showing favor or disapproval toward specific political actors, parties, or policies.

• Gender Bias: The post contains biased, stereotypical, or discriminatory language or viewpoints related to gender.

• Hate Speech: The post contains language that expresses hatred, hostility, or dehumanization toward a specific group or individual, especially those belonging to minority or marginalized communities.

• Racial Bias: The post contains biased, discriminatory, or stereotypical statements directed toward one or more racial or ethnic groups.

Note: A single post may be labeled as True for multiple bias categories.

Information Integrity Classes

Each post is also annotated with a single information integrity class, represented by an integer:

• -1 → False information (i.e., misinformation or disinformation)

• 0 → Unverifiable information (unclear or lacking sufficient evidence)

• 1 → True information (verifiable and accurate)
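The integer codes can be mapped to readable names when summarizing a file. This is a minimal sketch; the short names in INTEGRITY_LABELS and the function integrity_distribution are illustrative choices, and the input is assumed to be an iterable of integer-like values (for example the strings read from a CSV column).

```python
from collections import Counter

# Integer codes as defined in the dataset description.
INTEGRITY_LABELS = {-1: "false", 0: "unverifiable", 1: "true"}


def integrity_distribution(values):
    """Count posts per integrity class; raises KeyError on an unknown code."""
    counts = Counter(INTEGRITY_LABELS[int(value)] for value in values)
    # Report every class, including those with zero posts.
    return {label: counts.get(label, 0) for label in INTEGRITY_LABELS.values()}


print(integrity_distribution(["1", "-1", "0", "1"]))
```

Values stored as decimals such as "1.0" would need to be converted before calling int.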

Key Notes

This dataset is also available at https://huggingface.co/datasets/YRC10/MASH.

Version 1 is no longer available.