9 datasets found

ConvAI2 Dataset (Conversational Intelligence Challenge 2)
paperswithcode.com
library.toponeai.link
Updated Jun 14, 2022
Cite
Emily Dinan; Varvara Logacheva; Valentin Malykh; Alexander Miller; Kurt Shuster; Jack Urbanek; Douwe Kiela; Arthur Szlam; Iulian Serban; Ryan Lowe; Shrimai Prabhumoye; Alan W. Black; Alexander Rudnicky; Jason Williams; Joelle Pineau; Mikhail Burtsev; Jason Weston (2022). ConvAI2 Dataset [Dataset]. https://paperswithcode.com/dataset/convai2
Authors
Emily Dinan; Varvara Logacheva; Valentin Malykh; Alexander Miller; Kurt Shuster; Jack Urbanek; Douwe Kiela; Arthur Szlam; Iulian Serban; Ryan Lowe; Shrimai Prabhumoye; Alan W. Black; Alexander Rudnicky; Jason Williams; Joelle Pineau; Mikhail Burtsev; Jason Weston
Description
The ConvAI2 NeurIPS competition aimed to find approaches to creating high-quality dialogue agents capable of meaningful open-domain conversation. The ConvAI2 dataset for training models is based on the PERSONA-CHAT dataset. Each speaker pair is assigned profiles drawn from a set of 1,155 possible personas (at training time), each consisting of at least 5 profile sentences, with 100 never-before-seen personas set aside for validation. Because the original PERSONA-CHAT test set had already been released, a new hidden test set consisting of 100 new personas and over 1,015 dialogs was created by crowdsourced workers.

To avoid modeling that takes advantage of trivial word overlap, additional rewritten sets of the same train and test personas were crowdsourced, with related sentences that are rephrases, generalizations or specializations, rendering the task much more challenging. For example “I just got my nails done” is revised as “I love to pamper myself on a regular basis” and “I am on a diet now” is revised as “I need to lose weight.”

The training, validation, and hidden test sets consist of 17,878, 1,000, and 1,015 dialogues, respectively.
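
As a rough illustration of why the revised personas are harder, the sketch below measures token overlap between the original and revised sentences quoted above. Jaccard similarity over lowercased word sets is an assumption chosen here for simplicity; it is not a metric the ConvAI2 organizers are stated to have used, and the function names are hypothetical.

```python
import re

def tokens(sentence):
    """Lowercase a sentence and return its set of word tokens."""
    return set(re.findall(r"[a-z']+", sentence.lower()))

def jaccard(a, b):
    """Jaccard similarity between the word sets of two sentences."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)

# Original and revised persona sentences quoted in the description above.
pairs = [
    ("I just got my nails done", "I love to pamper myself on a regular basis"),
    ("I am on a diet now", "I need to lose weight."),
]

for original, revised in pairs:
    print(f"{jaccard(original, revised):.3f}  {original!r} -> {revised!r}")
```

In both pairs the only shared token is "I", so a model that matches personas to utterances by surface word overlap gets almost no signal from the revised sentences.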
BPersona-chat Dataset
paperswithcode.com
Updated Aug 1, 2023
Cite
Yunmeng Li; Jun Suzuki; Makoto Morishita; Kaori Abe; Ryoko Tokuhisa; Ana Brassard; Kentaro Inui (2023). BPersona-chat Dataset [Dataset]. https://paperswithcode.com/dataset/bpersona-chat
Authors
Yunmeng Li; Jun Suzuki; Makoto Morishita; Kaori Abe; Ryoko Tokuhisa; Ana Brassard; Kentaro Inui
Description
BPersona-chat is an evaluation dataset based on the English multiturn chat corpus Persona-chat and the Japanese multiturn chat corpus JPersona-chat.

Each chat was performed between two crowd workers assuming artificial personas. The speakers discuss a given personality trait, including but not limited to self-introductions and hobbies. (Note that the English and Japanese chats are not translations of each other.)

Chats are translated into Japanese/English by professional translators, a low-quality machine translation model A, and a high-quality machine translation model B.

Translations are evaluated by crowdworkers as either good or bad, depending on the correctness and coherence.

Each chat is included in one .xlsx file with the following structure:

person - the speaker of the current utterance
source - the utterance in the source language
translation - the translation in the target language
evaluation: is this a good translation? - the evaluation of the translation's quality
    y - the current translation is a correct translation of the source utterance
    n - the current translation is an erroneous translation of the source utterance
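
The column structure above can be sketched as a small tally of good and bad translations per chat. This is a minimal sketch, not the dataset authors' tooling: it assumes each row has already been read from the .xlsx file into a dict keyed by the column names listed above (the exact header spellings are an assumption), and the example rows are invented for illustration.

```python
from collections import Counter

# Header spelling taken from the description above; the exact text in the
# real .xlsx files is an assumption.
EVAL_COLUMN = "evaluation: is this a good translation?"

def summarize_chat(rows):
    """Count good ('y') and bad ('n') translations in one chat."""
    counts = Counter(row[EVAL_COLUMN].strip().lower() for row in rows)
    total = counts["y"] + counts["n"]
    return {
        "good": counts["y"],
        "bad": counts["n"],
        "good_ratio": counts["y"] / total if total else 0.0,
    }

# Hypothetical rows, invented for illustration (not taken from the dataset).
example_chat = [
    {"person": "A", "source": "Hello!", "translation": "こんにちは！", EVAL_COLUMN: "y"},
    {"person": "B", "source": "I like hiking.", "translation": "私はハイキングが好きです。", EVAL_COLUMN: "y"},
    {"person": "A", "source": "I have two dogs.", "translation": "私は猫を二匹飼っています。", EVAL_COLUMN: "n"},
]

summary = summarize_chat(example_chat)
print(summary)
```

Reading the spreadsheet itself would need a third-party .xlsx reader, which is left out here to keep the sketch dependency-free.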
PMPC (Persona Match on Persona-Chat)
opendatalab.com
zip
Updated Sep 22, 2022
Cite
iFlytek Research (2022). PMPC (Persona Match on Persona-Chat) [Dataset]. https://opendatalab.com/OpenDataLab/PMPC
Available download formats: zip (141185672 bytes)
Dataset provided by
iFlytek (科大讯飞), http://www.iflytek.com/
Microsoft Research Asia
Queen’s University
University of Science and Technology of China
Description
PMPC (Persona Match on Persona-Chat) is a dataset for Speaker Persona Detection (SPD), a task that aims to detect speaker personas from plain conversational text.
persona-chat-en2bn-azure
huggingface.co
Updated Mar 26, 2025
Cite
Intelsense AI (2025). persona-chat-en2bn-azure [Dataset]. https://huggingface.co/datasets/intelsense/persona-chat-en2bn-azure
Dataset authored and provided by
Intelsense AI
Description
intelsense/persona-chat-en2bn-azure dataset hosted on Hugging Face and contributed by the HF Datasets community
USR-PersonaChat - Dataset - LDM
service.tib.eu
Updated Jan 2, 2025
Cite
(2025). USR-PersonaChat - Dataset - LDM [Dataset]. https://service.tib.eu/ldmservice/dataset/usr-personachat
Description
This dataset is used for dialogue response evaluation.
persona.chat - Historical whois Lookup
whoisdatacenter.com
csv
Cite
AllHeart Web Inc, persona.chat - Historical whois Lookup [Dataset]. https://whoisdatacenter.com/domain/persona.chat/
Available download formats: csv
Dataset authored and provided by
AllHeart Web Inc
License
https://whoisdatacenter.com/terms-of-use/
Time period covered
Mar 15, 1985 - Mar 27, 2025
Description
Explore the historical Whois records related to persona.chat (Domain). Get insights into ownership history and changes over time.
persona-chat-en2bn
huggingface.co
Updated Mar 26, 2025
Cite
Intelsense AI (2025). persona-chat-en2bn [Dataset]. https://huggingface.co/datasets/intelsense/persona-chat-en2bn
Dataset authored and provided by
Intelsense AI
Description
intelsense/persona-chat-en2bn dataset hosted on Hugging Face and contributed by the HF Datasets community
PersonalDialog Dataset
paperswithcode.com
opendatalab.com
Updated Dec 16, 2021
Cite
Yinhe Zheng; Guanyi Chen; Minlie Huang; Song Liu; Xuan Zhu (2021). PersonalDialog Dataset [Dataset]. https://paperswithcode.com/dataset/personaldialog
Authors
Yinhe Zheng; Guanyi Chen; Minlie Huang; Song Liu; Xuan Zhu
Description
PersonalDialog is a large-scale multi-turn dialogue dataset containing various traits from a large number of speakers. The dataset consists of 20.83M sessions and 56.25M utterances from 8.47M speakers. Each utterance is associated with a speaker who is marked with traits like Age, Gender, Location, Interest Tags, etc. Several anonymization schemes are designed to protect the privacy of each speaker.
Ranking of messaging apps by monthly active users...
es.statista.com
Updated Jul 31, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Statista (2024). Ranking de aplicaciones de mensajería según usuarios activos mensuales mundiales 2024 [Ranking of messaging apps by worldwide monthly active users 2024] [Dataset]. https://es.statista.com/estadisticas/599043/aplicaciones-de-mensajeria-mas-populares-a-nivel-mundial-de/
Dataset authored and provided by
Statista, http://statista.com/
Time period covered
2024
Area covered
Worldwide
Description
In January 2024, 2 billion users accessed WhatsApp chat each month. Use of the app is particularly strong in markets in the United States, although it is worth noting that it is one of the most popular mobile social apps worldwide. In February 2014, the social network Facebook acquired the mobile app for 19 billion US dollars.