5 datasets found

convAI
huggingface.co
Updated Feb 4, 2024
Cite
saikrishna (2024). convAI [Dataset]. https://huggingface.co/datasets/saikrishna759/convAI
Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Feb 4, 2024
Authors
saikrishna
Description
saikrishna759/convAI dataset hosted on Hugging Face and contributed by the HF Datasets community
ConvAI
huggingface.co
Updated Jul 12, 2023
Cite
causal-lm-dataset (2023). ConvAI [Dataset]. https://huggingface.co/datasets/causal-lm/ConvAI
Dataset updated
Jul 12, 2023
Dataset authored and provided by
causal-lm-dataset
Description
causal-lm/ConvAI dataset hosted on Hugging Face and contributed by the HF Datasets community
EPO-RL-data
huggingface.co
Updated Apr 28, 2025
Cite
Tongyi-ConvAI (2025). EPO-RL-data [Dataset]. https://huggingface.co/datasets/Tongyi-ConvAI/EPO-RL-data
Dataset updated
Apr 28, 2025
Dataset authored and provided by
Tongyi-ConvAI
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
Dataset Card for EPO-RL-data

This is the official data collection for the paper "EPO: Explicit Policy Optimization for Strategic Reasoning in LLMs via Reinforcement Learning". See the paper and code for more information:

paper: https://arxiv.org/abs/2502.12486
code: https://github.com/AlibabaResearch/DAMO-ConvAI/tree/main/EPO

Uses

This dataset was used to train the strategic reasoning model (Llama-3-8B-Instruct) via RL, as reported in our paper. Note that dump.rdb was… See the full description on the dataset page: https://huggingface.co/datasets/Tongyi-ConvAI/EPO-RL-data.
oasst1-guanaco-damo-convai-pro
huggingface.co
Updated Oct 25, 2023
Cite
Chandeepa Dissanayake (2023). oasst1-guanaco-damo-convai-pro [Dataset]. https://huggingface.co/datasets/chansurgeplus/oasst1-guanaco-damo-convai-pro
Dataset updated
Oct 25, 2023
Authors
Chandeepa Dissanayake
Description
Dataset Card for "oasst1-guanaco-damo-convai-pro"

More Information needed
Nadi_Indic466k_Instruct
huggingface.co
Updated Mar 12, 2024
Cite
Convai Innovations (2024). Nadi_Indic466k_Instruct [Dataset]. https://huggingface.co/datasets/convaiinnovations/Nadi_Indic466k_Instruct
Dataset updated
Mar 12, 2024
Authors
Convai Innovations
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
Nadi_Indic466K_Instruct Dataset

The Nadi_Indic466K_Instruct dataset is the world's first coding dataset with support for 18 Indian languages, with 466k rows and 142 million total tokens. Developers can use this dataset to build Indian-language coding LLMs for various programming languages. Q-LoRA-based SFT/PPO/DPO fine-tuning can be done on the dataset with LLAMA-2, Mistral, or any open-source LLM for text generation. The dataset was carefully curated such that the coding part… See the full description on the dataset page: https://huggingface.co/datasets/convaiinnovations/Nadi_Indic466k_Instruct.