  1. MASH: A Multiplatform Annotated Dataset for Societal Impact of Hurricane

    • zenodo.org
    Updated May 24, 2025
    Anonymous; Anonymous (2025). MASH: A Multiplatform Annotated Dataset for Societal Impact of Hurricane [Dataset]. http://doi.org/10.5281/zenodo.15401479
    Dataset updated
    May 24, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Anonymous; Anonymous
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    MASH: A Multiplatform Annotated Dataset for Societal Impact of Hurricane

    We present a Multiplatform Annotated Dataset for Societal Impact of Hurricane (MASH) that includes 98,662 relevant social media posts from Reddit, X, TikTok, and YouTube.
    All relevant posts are annotated along three dimensions (Humanitarian Classes, Bias Classes, and Information Integrity Classes) using a multimodal approach that considers both textual and visual content (text, images, and videos), providing a richly labeled dataset for in-depth analysis.
    The dataset is complemented by an Online Analytics Platform (https://hurricane.web.illinois.edu/) that lets users view hurricane-related posts and articles and explore high-frequency keywords, user sentiment, and the locations where posts were made.
    To the best of our knowledge, MASH is the first large-scale, multi-platform, multimodal, and multi-dimensionally annotated hurricane dataset. We envision that MASH can contribute to the study of hurricanes' impact on society through tasks such as disaster severity classification, event detection, public sentiment analysis, and bias identification.

    Usage Notice

    This dataset includes four annotation files:
    • reddit_anno_publish.csv
    • tiktok_anno_publish.csv
    • twitter_anno_publish.csv
    • youtube_anno_publish.csv
    Each file contains post IDs and corresponding annotations on three dimensions: Humanitarian Classes, Bias Classes, and Information Integrity Classes.
    To protect user privacy, only post IDs are released. We recommend retrieving the full post content via the official APIs of each platform, in accordance with their respective terms of service.
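
    As a minimal sketch of reading one of these annotation files, the snippet below uses Python's standard csv module. The file names are taken from the list above. The column name `post_id`, the helper `read_annotations`, and the sample rows are assumptions for illustration only, since the actual CSV header is not documented here; check the header of the downloaded files before relying on them.

```python
import csv
import io

# File names as listed in the dataset description.
ANNOTATION_FILES = {
    "reddit": "reddit_anno_publish.csv",
    "tiktok": "tiktok_anno_publish.csv",
    "twitter": "twitter_anno_publish.csv",
    "youtube": "youtube_anno_publish.csv",
}


def read_annotations(stream, id_column="post_id"):
    """Return {post_id: row_dict} from an open CSV text stream.

    `id_column` is an assumed header name, not confirmed by the dataset.
    """
    reader = csv.DictReader(stream)
    return {row[id_column]: row for row in reader}


# Made-up sample standing in for a real file such as reddit_anno_publish.csv.
sample = io.StringIO(
    "post_id,Casualty,Damage\n"
    "abc123,False,True\n"
    "xyz789,True,True\n"
)
annotations = read_annotations(sample)

# IDs that would then be looked up through each platform's official API.
post_ids = sorted(annotations)
```

    With a downloaded file, the same helper could be called as `read_annotations(open(ANNOTATION_FILES["reddit"], newline="", encoding="utf-8"))`, adjusting `id_column` to whatever the real header uses.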

    Humanitarian Classes

    Each post is annotated with seven binary humanitarian classes. For each class, the label is either:
    • True – the post contains this humanitarian information
    • False – the post does not contain this information
    These seven humanitarian classes include:
    • Casualty: The post reports people or animals who are killed, injured, or missing during the hurricane.
    • Evacuation: The post describes the evacuation, relocation, rescue, or displacement of individuals or animals due to the hurricane.
    • Damage: The post reports damage to infrastructure or public utilities caused by the hurricane.
    • Advice: The post provides advice, guidance, or suggestions related to hurricanes, including how to stay safe, protect property, or prepare for the disaster.
    • Request: The post requests help, support, or resources due to the hurricane.
    • Assistance: The post describes assistance, including both physical aid and emotional or psychological support, provided by individuals, communities, or organizations.
    • Recovery: The post describes efforts or activities related to the recovery and rebuilding process after the hurricane.
    Note: A single post may be labeled as True for multiple humanitarian categories.
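
    Because the seven labels are independent, each post is effectively a multi-label record. The sketch below shows one way to collect the classes marked True for a single row. It assumes, hypothetically, that each class is stored in a column named after the class and that values are the text "True" or "False"; the helper `active_classes` and the sample row are illustrative, not part of the dataset.

```python
HUMANITARIAN_CLASSES = [
    "Casualty", "Evacuation", "Damage", "Advice",
    "Request", "Assistance", "Recovery",
]


def active_classes(row, classes=HUMANITARIAN_CLASSES):
    """Return the classes whose label is True for one annotation row.

    Assumes each class is a column holding the text "True" or "False";
    missing columns are treated as False.
    """
    return [
        name for name in classes
        if str(row.get(name, "")).strip().lower() == "true"
    ]


# Made-up row: one post can carry several humanitarian labels at once.
row = {
    "Casualty": "False", "Evacuation": "True", "Damage": "True",
    "Advice": "False", "Request": "False", "Assistance": "False",
    "Recovery": "False",
}
labels = active_classes(row)
```

    The same helper could be reused for the five bias classes by passing their names as `classes`, again assuming matching column names.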

    Bias Classes

    Each post is annotated with five binary bias classes. For each class, the label is either:
    • True – the post contains this bias information
    • False – the post does not contain this information
    These five bias classes include:
    • Linguistic Bias: The post contains biased, inappropriate, or offensive language, with a focus on word choice, tone, or expression.
    • Political Bias: The post expresses political ideology, showing favor or disapproval toward specific political actors, parties, or policies.
    • Gender Bias: The post contains biased, stereotypical, or discriminatory language or viewpoints related to gender.
    • Hate Speech: The post contains language that expresses hatred, hostility, or dehumanization toward a specific group or individual, especially those belonging to minority or marginalized communities.
    • Racial Bias: The post contains biased, discriminatory, or stereotypical statements directed toward one or more racial or ethnic groups.
    Note: A single post may be labeled as True for multiple bias categories.

    Information Integrity Classes

    Each post is also annotated with a single information integrity class, represented by an integer:
    • -1 → False information (i.e., misinformation or disinformation)
    • 0 → Unverifiable information (unclear or lacking sufficient evidence)
    • 1 → True information (verifiable and accurate)
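
    The integer coding above can be turned into readable labels before analysis. The following is a minimal sketch: the codes -1, 0, and 1 come from the description, while the helper name `integrity_label`, the short label strings, and the sample values are assumptions made for illustration.

```python
from collections import Counter

# Integer codes as defined in the dataset description.
INTEGRITY_LABELS = {-1: "false", 0: "unverifiable", 1: "true"}


def integrity_label(value):
    """Map a raw CSV value such as "-1" to a readable label."""
    code = int(value)
    if code not in INTEGRITY_LABELS:
        raise ValueError(f"unexpected information integrity code: {code}")
    return INTEGRITY_LABELS[code]


# Made-up raw values as they might appear in one column of an annotation file.
raw_values = ["1", "0", "-1", "1"]
counts = Counter(integrity_label(v) for v in raw_values)
```

    Raising on unknown codes, rather than silently mapping them to a default, makes it easier to notice if a file uses a coding different from the one described here.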

    Key Notes

    1. This dataset is also available at https://huggingface.co/datasets/YRC10/MASH.
    2. Version 1 is no longer available.