34 datasets found
  1. PERSONA

    • huggingface.co
    Updated Apr 16, 2025
    Cite
    SynthLabs (2025). PERSONA [Dataset]. https://huggingface.co/datasets/SynthLabsAI/PERSONA
    Explore at:
    Dataset updated
    Apr 16, 2025
    Dataset provided by
    Synth Labs
    Authors
    SynthLabs
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    Dataset Card for PERSONAS (Prism Filter)

    PERSONAS (Prism filter) is one of the largest datasets of synthetic preferences, with over 200k preferences over thousands of questions and 1k personas. Details on the PERSONAS dataset can be found in the paper linked from the dataset card. Note that you MUST also fill out the form on our site to receive access to the full dataset; the form is linked from the dataset card.

    Dataset Details

    Dataset Description

    The personas dataset is a pluralistic… See the full description on the dataset page: https://huggingface.co/datasets/SynthLabsAI/PERSONA.

  2. persona-bias

    • huggingface.co
    Cite
    Ai2, persona-bias [Dataset]. https://huggingface.co/datasets/allenai/persona-bias
    Explore at:
    Dataset provided by
    Allen Institute for AI http://allenai.org/
    Authors
    Ai2
    License

    https://choosealicense.com/licenses/other/

    Description

    Persona-bias

    Data accompanying the paper Bias Runs Deep: Implicit Reasoning Biases in Persona-Assigned LLMs at ICLR 2024. Paper || Code || Project website || License

      Motivation
    

    This is a dataset of model outputs supporting our extensive study of biases in persona-assigned LLMs. These model outputs can be used for many purposes, for instance:

    developing a deeper understanding of persona-induced biases, e.g. by analyzing the inhibiting assumptions underlying model… See the full description on the dataset page: https://huggingface.co/datasets/allenai/persona-bias.

  3. persona-driven-dataset

    • huggingface.co
    Updated Jun 1, 2025
    Cite
    Gauri K (2025). persona-driven-dataset [Dataset]. https://huggingface.co/datasets/gourik/persona-driven-dataset
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jun 1, 2025
    Authors
    Gauri K
    Description

    Datasets for paper "Socio-Culturally Aware Evaluation Framework for LLM-Based Content Moderation" https://arxiv.org/abs/2412.13578

  4. Persona list by category.

    • plos.figshare.com
    xls
    Updated Jun 30, 2025
    Cite
    Pedro Henrique Luz de Araujo; Benjamin Roth (2025). Persona list by category. [Dataset]. http://doi.org/10.1371/journal.pone.0325664.t002
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 30, 2025
    Dataset provided by
    PLOS ONE
    Authors
    Pedro Henrique Luz de Araujo; Benjamin Roth
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    One way to steer generations from large language models (LLMs) is to assign a persona: a role that describes how the user expects the LLM to behave (e.g., a helpful assistant, a teacher, a woman). This paper investigates how personas affect diverse aspects of model behavior. We assign to seven LLMs 162 personas from 12 categories spanning variables like gender, sexual orientation, and occupation. We prompt them to answer questions from five datasets covering objective (e.g., questions about math and history) and subjective tasks (e.g., questions about beliefs and values). We also compare personas’ generations to two baseline settings: a control persona setting with 30 paraphrases of “a helpful assistant” to control for models’ prompt sensitivity, and an empty persona setting where no persona is assigned. We find that for all models and datasets, personas show greater variability than the control setting and that some measures of persona behavior generalize across models.

  5. Persona group average ranks (out of 193—162 personas + 30 control personas +...

    • plos.figshare.com
    xls
    Updated Jun 30, 2025
    Cite
    Pedro Henrique Luz de Araujo; Benjamin Roth (2025). Persona group average ranks (out of 193—162 personas + 30 control personas + no persona baseline—lower is better) for each knowledge domain. The rank of the best persona in each group is shown in parenthesis. We show in bold the top persona group for each domain and we underline the best domain of each persona group. The top ranked persona for social sciences was the social scientist persona. [Dataset]. http://doi.org/10.1371/journal.pone.0325664.t003
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 30, 2025
    Dataset provided by
    PLOS ONE
    Authors
    Pedro Henrique Luz de Araujo; Benjamin Roth
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Persona group average ranks (out of 193—162 personas + 30 control personas + no persona baseline—lower is better) for each knowledge domain. The rank of the best persona in each group is shown in parenthesis. We show in bold the top persona group for each domain and we underline the best domain of each persona group. The top ranked persona for social sciences was the social scientist persona.

  6. Synthetic-Persona-Chat

    • huggingface.co
    Updated Dec 20, 2023
    Cite
    Google (2023). Synthetic-Persona-Chat [Dataset]. https://huggingface.co/datasets/google/Synthetic-Persona-Chat
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Dec 20, 2023
    Dataset authored and provided by
    Google http://google.com/
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dataset Card for SPC: Synthetic-Persona-Chat Dataset

    Abstract from the paper introducing this dataset:

    High-quality conversational datasets are essential for developing AI models that can communicate with users. One way to foster deeper interactions between a chatbot and its user is through personas, aspects of the user's character that provide insights into their personality, motivations, and behaviors. Training Natural Language Processing (NLP) models on a diverse and… See the full description on the dataset page: https://huggingface.co/datasets/google/Synthetic-Persona-Chat.

  7. PEC Dataset

    • paperswithcode.com
    Cite
    Peixiang Zhong; Chen Zhang; Hao Wang; Yong Liu; Chunyan Miao, PEC Dataset [Dataset]. https://paperswithcode.com/dataset/pec
    Explore at:
    Authors
    Peixiang Zhong; Chen Zhang; Hao Wang; Yong Liu; Chunyan Miao
    Description

    A novel large-scale multi-domain dataset for persona-based empathetic conversations.

  8. Persona ranks (out of 193, lower is better) for increasingly specialized...

    • plos.figshare.com
    xls
    Updated Jun 30, 2025
    Cite
    Pedro Henrique Luz de Araujo; Benjamin Roth (2025). Persona ranks (out of 193, lower is better) for increasingly specialized domains. For persona groups with multiple personas we show, in addition to the average rank, the rank of the best persona in the category between parentheses. [Dataset]. http://doi.org/10.1371/journal.pone.0325664.t004
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 30, 2025
    Dataset provided by
    PLOS ONE
    Authors
    Pedro Henrique Luz de Araujo; Benjamin Roth
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Persona ranks (out of 193, lower is better) for increasingly specialized domains. For persona groups with multiple personas we show, in addition to the average rank, the rank of the best persona in the category between parentheses.

  9. Data from: Mapping and Influencing the Political Ideology of Large Language...

    • zenodo.org
    bin, json
    Updated Feb 16, 2025
    Cite
    Pietro Bernardelle; Leon Fröhling; Stefano Civelli; Riccardo Lunardi; Kevin Roitero; Gianluca Demartini (2025). Mapping and Influencing the Political Ideology of Large Language Models using Synthetic Personas [Dataset]. http://doi.org/10.5281/zenodo.14816665
    Explore at:
    Available download formats: bin, json
    Dataset updated
    Feb 16, 2025
    Dataset provided by
    Zenodo http://zenodo.org/
    Authors
    Pietro Bernardelle; Leon Fröhling; Stefano Civelli; Riccardo Lunardi; Kevin Roitero; Gianluca Demartini
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This repository contains the datasets and materials used to analyze and replicate the results presented in our paper investigating how persona-based prompting affects the political orientations of Large Language Models (LLMs).

    Contents

    The repository includes files organized by model (Mistral, Llama, Qwen, and Zephyr) and experimental condition (base, right-authoritarian [ra], and left-libertarian [ll]):

    Model Response Data

    • *_persona_compass_base.pqt: Political compass test responses for each model using baseline persona descriptions
    • *_persona_compass_ra.pqt: Responses after injecting right-authoritarian descriptors
    • *_persona_compass_ll.pqt: Responses after injecting left-libertarian descriptors

    Configuration and Input Files

    • personas.json: Collection of synthetic persona descriptions from PersonaHub used in the experiments
    • token_personas.json: Tokenized versions of the persona descriptions
    • political_compass_statements.json: The 62 statements from the Political Compass Test used for evaluation
    • prompts.json: Prompt templates used for model interactions
    • baseLLMsPoliticalView.json: Default political orientations of the models without persona prompting

    Related Code Repository

    The code used to analyze this data and reproduce the results presented in the paper can be found at: https://github.com/d-lab/llm-political-personas

    File Placement Instructions

    After downloading, organize the files as follows:

    Configuration and Input Files

    Place all the configuration files in the data/raw/ directory.

    Model Response Files

    Rename all model-specific .pqt files to persona_compass.pqt and place them in their respective directories:

    • Base condition files:
      • data/processed/Llama-3.1-8B-Instruct/base/persona_compass.pqt
      • data/processed/Mistral-7B-Instruct-v0.3/base/persona_compass.pqt
      • data/processed/Qwen2.5-7B-Instruct/base/persona_compass.pqt
      • data/processed/zephyr-7b-beta/base/persona_compass.pqt
    • Right-authoritarian condition files:
      • data/processed/Llama-3.1-8B-Instruct/right_authoritarian_personas/persona_compass.pqt
      • data/processed/Mistral-7B-Instruct-v0.3/right_authoritarian_personas/persona_compass.pqt
      • data/processed/Qwen2.5-7B-Instruct/right_authoritarian_personas/persona_compass.pqt
      • data/processed/zephyr-7b-beta/right_authoritarian_personas/persona_compass.pqt
    • Left-libertarian condition files:
      • data/processed/Llama-3.1-8B-Instruct/left_libertarian_personas/persona_compass.pqt
      • data/processed/Mistral-7B-Instruct-v0.3/left_libertarian_personas/persona_compass.pqt
      • data/processed/Qwen2.5-7B-Instruct/left_libertarian_personas/persona_compass.pqt
      • data/processed/zephyr-7b-beta/left_libertarian_personas/persona_compass.pqt
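
The renaming and placement steps above can be sketched in Python. This is a minimal sketch, not the repository's own tooling: the directory names are copied from the list above, but the function names are hypothetical, and it assumes that the `*` prefix of each downloaded file contains the model family name (llama, mistral, qwen, or zephyr), which the description does not state.

```python
from pathlib import Path
import shutil

# Directory names copied from the placement list above.
MODEL_DIRS = {
    "llama": "Llama-3.1-8B-Instruct",
    "mistral": "Mistral-7B-Instruct-v0.3",
    "qwen": "Qwen2.5-7B-Instruct",
    "zephyr": "zephyr-7b-beta",
}
CONDITION_DIRS = {
    "base": "base",
    "ra": "right_authoritarian_personas",
    "ll": "left_libertarian_personas",
}


def target_path(filename: str, root: Path = Path("data/processed")) -> Path:
    """Map a '*_persona_compass_<condition>.pqt' name to its destination path."""
    stem = Path(filename).stem
    prefix, sep, condition = stem.rpartition("_persona_compass_")
    if not sep or condition not in CONDITION_DIRS:
        raise ValueError(f"unexpected file name: {filename}")
    # ASSUMPTION: the '*' prefix contains the model family name.
    for keyword, model_dir in MODEL_DIRS.items():
        if keyword in prefix.lower():
            return root / model_dir / CONDITION_DIRS[condition] / "persona_compass.pqt"
    raise ValueError(f"cannot infer model from: {filename}")


def place_response_files(download_dir: Path, root: Path = Path("data/processed")) -> list[Path]:
    """Copy every matching .pqt file from download_dir into the expected layout."""
    placed = []
    for source in sorted(download_dir.glob("*_persona_compass_*.pqt")):
        destination = target_path(source.name, root)
        destination.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, destination)  # copy, so the download stays intact
        placed.append(destination)
    return placed
```

Copying rather than moving keeps the original downloads available if the inferred model name turns out to be wrong.
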
  10. ConvAI2 Dataset

    • paperswithcode.com
    • library.toponeai.link
    Updated Feb 19, 2021
    Cite
    Emily Dinan; Varvara Logacheva; Valentin Malykh; Alexander Miller; Kurt Shuster; Jack Urbanek; Douwe Kiela; Arthur Szlam; Iulian Serban; Ryan Lowe; Shrimai Prabhumoye; Alan W. Black; Alexander Rudnicky; Jason Williams; Joelle Pineau; Mikhail Burtsev; Jason Weston (2021). ConvAI2 Dataset [Dataset]. https://paperswithcode.com/dataset/convai2
    Explore at:
    Dataset updated
    Feb 19, 2021
    Authors
    Emily Dinan; Varvara Logacheva; Valentin Malykh; Alexander Miller; Kurt Shuster; Jack Urbanek; Douwe Kiela; Arthur Szlam; Iulian Serban; Ryan Lowe; Shrimai Prabhumoye; Alan W. Black; Alexander Rudnicky; Jason Williams; Joelle Pineau; Mikhail Burtsev; Jason Weston
    Description

    The ConvAI2 NeurIPS competition aimed at finding approaches to creating high-quality dialogue agents capable of meaningful open-domain conversation. The ConvAI2 dataset for training models is based on the PERSONA-CHAT dataset. The speaker pairs each have assigned profiles coming from a set of 1155 possible personas (at training time), each consisting of at least 5 profile sentences, setting aside 100 never-before-seen personas for validation. As the original PERSONA-CHAT test set was released, a new hidden test set consisting of 100 new personas and over 1,015 dialogs was created by crowdsourced workers.

    To avoid modeling that takes advantage of trivial word overlap, additional rewritten sets of the same train and test personas were crowdsourced, with related sentences that are rephrases, generalizations or specializations, rendering the task much more challenging. For example, “I just got my nails done” is revised as “I love to pamper myself on a regular basis” and “I am on a diet now” is revised as “I need to lose weight.”

    The training, validation and hidden test sets consist of 17,878, 1,000 and 1,015 dialogues, respectively.

  11. Persona ranks for self-bias (out of 193), self-accuracy, overall bias, and...

    • plos.figshare.com
    • figshare.com
    xls
    Updated Jun 30, 2025
    Cite
    Pedro Henrique Luz de Araujo; Benjamin Roth (2025). Persona ranks for self-bias (out of 193), self-accuracy, overall bias, and overall accuracy. [Dataset]. http://doi.org/10.1371/journal.pone.0325664.t005
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 30, 2025
    Dataset provided by
    PLOS ONE
    Authors
    Pedro Henrique Luz de Araujo; Benjamin Roth
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Persona ranks for self-bias (out of 193), self-accuracy, overall bias, and overall accuracy.

  12. PERSONA_subset

    • huggingface.co
    Updated Apr 16, 2025
    Cite
    SynthLabs (2025). PERSONA_subset [Dataset]. https://huggingface.co/datasets/SynthLabsAI/PERSONA_subset
    Explore at:
    Dataset updated
    Apr 16, 2025
    Dataset provided by
    Synth Labs
    Authors
    SynthLabs
    License

    https://choosealicense.com/licenses/cc/

    Description

    Dataset Card for PERSONAS (Prism Filter)

    PERSONAS (Prism filter) is one of the largest datasets of synthetic preferences, with over 200k preferences over thousands of questions and 1k personas. Details on the PERSONAS dataset can be found in the paper linked from the dataset card. Note that this subset is 5% of the training split of PERSONAS. The full dataset is strictly available for academic use, and you MUST request access to it through the form linked from the dataset card.

      Dataset Details… See the full description on the dataset page: https://huggingface.co/datasets/SynthLabsAI/PERSONA_subset.
    
  13. USR-PersonaChat Dataset

    • paperswithcode.com
    Updated Feb 18, 2022
    Cite
    Shikib Mehri; Maxine Eskenazi (2022). USR-PersonaChat Dataset [Dataset]. https://paperswithcode.com/dataset/usr-personachat
    Explore at:
    Dataset updated
    Feb 18, 2022
    Authors
    Shikib Mehri; Maxine Eskenazi
    Description

    This dataset was collected with the goal of assessing dialog evaluation metrics. In the paper, USR: An Unsupervised and Reference Free Evaluation Metric for Dialog (Mehri and Eskenazi, 2020), the authors collect this data to measure the quality of several existing word-overlap and embedding-based metrics, as well as their newly proposed USR metric.

  14. SynthPAI Dataset

    • paperswithcode.com
    Updated Jun 10, 2024
    Cite
    Hanna Yukhymenko; Robin Staab; Mark Vero; Martin Vechev (2024). SynthPAI Dataset [Dataset]. https://paperswithcode.com/dataset/synthpai
    Explore at:
    Dataset updated
    Jun 10, 2024
    Authors
    Hanna Yukhymenko; Robin Staab; Mark Vero; Martin Vechev
    Description

    SynthPAI was created to provide a dataset that can be used to investigate the personal attribute inference (PAI) capabilities of LLMs on online texts. Due to associated privacy concerns with real-world data, open datasets are rare (non-existent) in the research community. SynthPAI is a synthetic dataset that aims to fill this gap.

    Dataset Details

    Dataset Description

    SynthPAI was created using 300 GPT-4 agents seeded with individual personalities interacting with each other in a simulated online forum and consists of 103 threads and 7823 comments. For each profile, we further provide a set of personal attributes that a human could infer from the profile. We additionally conducted a user study to evaluate the quality of the synthetic comments, establishing that humans can barely distinguish between real and synthetic comments.

    • Curated by: The dataset was created by SRILab at ETH Zurich. It was not created on behalf of any outside entity.
    • Funded by: Two authors of this work are supported by the Swiss State Secretariat for Education, Research and Innovation (SERI) (SERI-funded ERC Consolidator Grant). This project did not, however, receive explicit funding from SERI and was devised independently. Views and opinions expressed are those of the authors only and do not necessarily reflect those of the SERI-funded ERC Consolidator Grant.
    • Shared by: SRILab at ETH Zurich
    • Language(s) (NLP): English
    • License: CC-BY-NC-SA-4.0

    Dataset Sources

    • Repository: https://github.com/eth-sri/SynthPAI
    • Paper: https://arxiv.org/abs/2406.07217

    Uses

    The dataset is intended to be used as a privacy-preserving method of (i) evaluating PAI capabilities of language models and (ii) aiding the development of potential defenses against such automated inferences.

    Direct Use

    As in the associated paper, where we include an analysis of the personal attribute inference (PAI) capabilities of 18 state-of-the-art LLMs across different attributes and on anonymized texts.

    Out-of-Scope Use

    The dataset shall not be used as part of any system that performs attribute inferences on real natural persons without their consent or otherwise maliciously.

    Dataset Structure

    We provide the instance descriptions below. Each data point consists of a single comment (that can be a top-level post):

    Comment

    author str: unique identifier of the person writing

    username str: corresponding username

    parent_id str: unique identifier of the parent comment

    thread_id str: unique identifier of the thread

    children list[str]: unique identifiers of children comments

    profile Profile: profile making the comment - described below

    text str: text of the comment

    guesses list[dict]: Dict containing model estimates of attributes based on the comment. Only contains attributes for which a prediction exists.

    reviews dict: Dict containing human estimates of attributes based on the comment. Each guess contains a corresponding hardness rating (and certainty rating). Contains all attributes.

    The associated profiles are structured as follows

    Profile

    username str: identifier

    attributes: set of personal attributes that describe the user (directly listed below)
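
The Comment and Profile layouts described above can be mirrored as Python dataclasses. This is a minimal sketch: the field names and types follow the listing, while the helper `comment_from_dict` and the assumption that each record arrives as a nested dict whose `profile` key holds `username` and `attributes` are illustrative and are not taken from the SynthPAI code. The released files may be laid out differently.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Profile:
    username: str  # identifier
    attributes: dict[str, Any] = field(default_factory=dict)  # personal attributes


@dataclass
class Comment:
    author: str           # unique identifier of the person writing
    username: str         # corresponding username
    parent_id: str        # unique identifier of the parent comment
    thread_id: str        # unique identifier of the thread
    children: list[str]   # unique identifiers of children comments
    profile: Profile      # profile making the comment
    text: str             # text of the comment
    guesses: list[dict[str, Any]]  # model estimates of attributes
    reviews: dict[str, Any]        # human estimates of attributes


def comment_from_dict(record: dict[str, Any]) -> Comment:
    """Build a Comment from a nested dict (ASSUMED record layout)."""
    raw_profile = record["profile"]
    profile = Profile(
        username=raw_profile["username"],
        attributes=dict(raw_profile.get("attributes", {})),
    )
    return Comment(
        author=record["author"],
        username=record["username"],
        parent_id=record["parent_id"],
        thread_id=record["thread_id"],
        children=list(record.get("children", [])),
        profile=profile,
        text=record["text"],
        guesses=list(record.get("guesses", [])),
        reviews=dict(record.get("reviews", {})),
    )
```
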

    The corresponding attributes and values are

    Attributes

    Age continuous [18-99] The age of a user in years.

    Place of Birth tuple [city, country] The place of birth of a user. We create tuples jointly for city and country in free-text format. (field name: birth_city_country)

    Location tuple [city, country] The current location of a user. We create tuples jointly for city and country in free-text format. (field name: city_country)

    Education free-text We use a free-text field to describe the user's education level. This includes additional details such as the degree and major. To ensure comparability with the evaluation of prior work, we later map these to a categorical scale: high school, college degree, master's degree, PhD.

    Income Level free-text [low, medium, high, very high] The income level of a user. We first generate a continuous income level in the profile's local currency. In our code, we map this to a categorical value considering the distribution of income levels in the respective profile location. For this, we roughly follow the local equivalents of the following reference levels for the US: Low (<30k USD), Middle (30-60k USD), High (60-150k USD), Very High (>150k USD).

    Occupation free-text The occupation of a user, described as a free-text field.

    Relationship Status categorical [single, In a Relationship, married, divorced, widowed] The relationship status of a user as one of 5 categories.

    Sex categorical [Male, Female] Biological Sex of a profile.
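
The income-level mapping described above can be illustrated for the US reference levels. This is a sketch under stated assumptions, not the authors' code: the function name is hypothetical, the handling of the exact boundary values (30k, 60k, 150k) is a guess because the description does not specify it, and the label "medium" follows the categorical list even though the reference levels call that band "Middle". The per-country adjustment mentioned in the text is not modelled.

```python
def us_income_level(annual_income_usd: float) -> str:
    """Map a yearly income in USD to one of the four categorical levels."""
    if annual_income_usd < 0:
        raise ValueError("income must be non-negative")
    if annual_income_usd < 30_000:      # Low (<30k USD)
        return "low"
    if annual_income_usd < 60_000:      # Middle (30-60k USD); ASSUMED half-open
        return "medium"
    if annual_income_usd <= 150_000:    # High (60-150k USD); ASSUMED inclusive top
        return "high"
    return "very high"                  # Very High (>150k USD)
```

For example, `us_income_level(45_000)` falls in the 30-60k band and yields `"medium"`.
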

    Dataset Creation

    Curation Rationale

    SynthPAI was created to provide a dataset that can be used to investigate the personal attribute inference (PAI) capabilities of LLMs on online texts. Due to associated privacy concerns with real-world data, open datasets are rare (non-existent) in the research community. SynthPAI is a synthetic dataset that aims to fill this gap. We additionally conducted a user study to evaluate the quality of the synthetic comments, establishing that humans can barely distinguish between real and synthetic comments.

    Source Data

    The dataset is fully synthetic and was created using GPT-4 agents (version gpt-4-1106-preview) seeded with individual personalities interacting with each other in a simulated online forum.

    Data Collection and Processing

    The dataset was created by sampling comments from the agents in threads. A human then inferred a set of personal attributes from sets of comments associated with each profile. Further, it was manually reviewed to remove any offensive or inappropriate content. We give a detailed overview of our dataset-creation procedure in the corresponding paper.

    Annotations

    Annotations are provided by authors of the paper.

    Personal and Sensitive Information

    All contained personal information is purely synthetic and does not relate to any real individual.

    Bias, Risks, and Limitations

    All profiles are synthetic and do not correspond to any real subpopulations. We provide a distribution of the personal attributes of the profiles in the accompanying paper. As the dataset has been created synthetically, data points can inherit limitations (e.g., biases) from the underlying model, GPT-4. While we manually reviewed comments individually, we cannot provide respective guarantees.

    Citation

    BibTeX:

    @misc{2406.07217,
      Author = {Hanna Yukhymenko and Robin Staab and Mark Vero and Martin Vechev},
      Title = {A Synthetic Dataset for Personal Attribute Inference},
      Year = {2024},
      Eprint = {arXiv:2406.07217},
    }

    APA:

    Hanna Yukhymenko, Robin Staab, Mark Vero, Martin Vechev: “A Synthetic Dataset for Personal Attribute Inference”, 2024; arXiv:2406.07217.

    Dataset Card Authors

    Hanna Yukhymenko, Robin Staab, Mark Vero

  15. persona-chat

    • huggingface.co
    Updated Jul 3, 2025
    Cite
    Awsaf (2025). persona-chat [Dataset]. https://huggingface.co/datasets/awsaf49/persona-chat
    Explore at:
    Dataset updated
    Jul 3, 2025
    Authors
    Awsaf
    License

    MIT License https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset Card for PersonaChat

      Dataset Description
    

    PersonaChat is a multi-turn dialogue dataset introduced by Zhang et al. (2018) for training and evaluating persona-grounded conversational agents. Each conversation is between two crowdworkers, each assigned a randomly selected persona consisting of several simple facts. The dataset aims to assess whether models can maintain consistent character traits throughout a conversation.

    Original Paper: Personalizing Dialogue… See the full description on the dataset page: https://huggingface.co/datasets/awsaf49/persona-chat.

  16. Example prompts (with an example persona) for all datasets.

    • plos.figshare.com
    xls
    Updated Jun 30, 2025
    Cite
    Pedro Henrique Luz de Araujo; Benjamin Roth (2025). Example prompts (with an example persona) for all datasets. [Dataset]. http://doi.org/10.1371/journal.pone.0325664.t001
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 30, 2025
    Dataset provided by
    PLOS ONE
    Authors
    Pedro Henrique Luz de Araujo; Benjamin Roth
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Example prompts (with an example persona) for all datasets.

  17. FoCus Dataset

    • paperswithcode.com
    Updated Feb 12, 2024
    Cite
    Yoonna Jang; Jungwoo Lim; Yuna Hur; Dongsuk Oh; Suhyune Son; Yeonsoo Lee; Donghoon Shin; Seungryong Kim; Heuiseok Lim (2024). FoCus Dataset [Dataset]. https://paperswithcode.com/dataset/focus
    Explore at:
    Dataset updated
    Feb 12, 2024
    Authors
    Yoonna Jang; Jungwoo Lim; Yuna Hur; Dongsuk Oh; Suhyune Son; Yeonsoo Lee; Donghoon Shin; Seungryong Kim; Heuiseok Lim
    Description

    We introduce a new dataset, called FoCus, that supports knowledge-grounded answers that reflect user’s persona. One of the situations in which people need different types of knowledge, based on their preferences, occurs when they travel around the world.

  18. Differences between the average accuracy (across all personas) and the...

    • figshare.com
    • plos.figshare.com
    xls
    Updated Jun 30, 2025
    Cite
    Pedro Henrique Luz de Araujo; Benjamin Roth (2025). Differences between the average accuracy (across all personas) and the accuracy of personas when answering questions involving their own demographic. [Dataset]. http://doi.org/10.1371/journal.pone.0325664.t006
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 30, 2025
    Dataset provided by
    PLOS ONE
    Authors
    Pedro Henrique Luz de Araujo; Benjamin Roth
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Differences between the average accuracy (across all personas) and the accuracy of personas when answering questions involving their own demographic.

  19. Differences between the frequency that each demographic is selected as the...

    • plos.figshare.com
    xls
    Updated Jun 30, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Pedro Henrique Luz de Araujo; Benjamin Roth (2025). Differences between the frequency that each demographic is selected as the answer by the persona of the same demographic and on average (across all personas). [Dataset]. http://doi.org/10.1371/journal.pone.0325664.t007
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 30, 2025
    Dataset provided by
    PLOS ONE
    Authors
    Pedro Henrique Luz de Araujo; Benjamin Roth
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Differences between the frequency that each demographic is selected as the answer by the persona of the same demographic and on average (across all personas).

  20. Zhi Yubo_URL of Master thesis_2025

    • data.mendeley.com
    Updated Apr 25, 2025
    Cite
    Yubo Zhi (2025). Zhi Yubo_URL of Master thesis_2025 [Dataset]. http://doi.org/10.17632/8z8thsd2d7.1
    Explore at:
    Dataset updated
    Apr 25, 2025
    Authors
    Yubo Zhi
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Current research focuses on understanding influencer marketing, the theories behind it, and the factors that contribute to the success of such campaigns. Although many articles and research papers acknowledge that the relationship between the two parties is essential for influencer marketing, very few studies, if any, empirically analyze whether, and to what extent, the relationship between key opinion leaders (KOLs) and their followers influences the success of influencer marketing campaigns and, ultimately, a brand’s choice of marketing tactics or KOLs. This case study of KOLs on the Instagram and Red platforms helps fill this gap by providing a deep dive into the relationship between influencers and their followers and its impact on those followers. The paper also examines other factors that affect the effectiveness of influencer marketing. Empirical evidence from this research confirms that KOLs’ ability to influence their followers affects the outcome of influencer marketing, but only through certain mechanisms. Specifically, focusing on two of the largest social platforms, Instagram and Red, the paper finds that the alignment of post content with the KOL’s persona and the interactivity of the write-up or message are two factors that determine the success of influencer marketing. Other factors, such as the relationship built between the KOL and followers, do not seem to influence the outcome of future campaigns, which potentially suggests that a past relationship between a KOL and their followers has only a short-horizon influence, as the benefits of a strong relationship with followers do not seem to carry forward. The findings offer marketers and KOLs theoretical guidance for conducting influencer marketing campaigns on Instagram and Red, as well as in the global and Chinese markets.
    Keywords: Brand marketing strategy, Influencer Marketing, key opinion leader, Social Media Platforms, Consumer Behavior
