Datasheet for the dataset: multilingual-NLI-26lang-2mil7
Dataset Summary
This dataset contains 2 730 000 NLI text pairs in 26 languages spoken by more than 4 billion people. The dataset can be used to train models for multilingual NLI (Natural Language Inference) or zero-shot classification. The dataset is based on the English datasets MultiNLI, Fever-NLI, ANLI, LingNLI and WANLI and was created using the latest open-source machine translation models. The dataset is… See the full description on the dataset page: https://huggingface.co/datasets/MoritzLaurer/multilingual-NLI-26lang-2mil7.
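As a rough illustration of how an NLI model can serve zero-shot classification, the sketch below pairs an input text (the premise) with one hypothesis per candidate label and picks the label whose hypothesis scores highest for entailment. The hypothesis template, the function names, and the example scores are assumptions made for illustration. They do not come from the dataset card, and no real model is called here.

```python
from typing import List, Tuple

# Hypothetical hypothesis template; not taken from the dataset card.
HYPOTHESIS_TEMPLATE = "This text is about {}."


def build_nli_pairs(text: str, candidate_labels: List[str]) -> List[Tuple[str, str]]:
    """Pair the input text (premise) with one hypothesis per candidate label."""
    return [(text, HYPOTHESIS_TEMPLATE.format(label)) for label in candidate_labels]


def pick_label(entailment_scores: List[float], candidate_labels: List[str]) -> str:
    """Return the label whose hypothesis received the highest entailment score."""
    best_index = max(range(len(candidate_labels)), key=lambda i: entailment_scores[i])
    return candidate_labels[best_index]


labels = ["politics", "sports"]
pairs = build_nli_pairs("Der Bundestag stimmt über das neue Gesetz ab.", labels)

# Placeholder scores standing in for an NLI model's entailment probabilities.
example_scores = [0.9, 0.1]
predicted = pick_label(example_scores, labels)
```

In practice the placeholder scores would come from a multilingual NLI model, which scores each (premise, hypothesis) pair for entailment.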