9 datasets found
  1. Nekochu_Llama-3.1-8B-German-ORPO-details

    • huggingface.co
    Updated Jul 30, 2025
    Cite
    Open LLM Leaderboard (2025). Nekochu_Llama-3.1-8B-German-ORPO-details [Dataset]. https://huggingface.co/datasets/open-llm-leaderboard/Nekochu_Llama-3.1-8B-German-ORPO-details
    Dataset updated
    Jul 30, 2025
    Dataset authored and provided by
    Open LLM Leaderboard
    Description

    Dataset Card for Evaluation run of Nekochu/Llama-3.1-8B-German-ORPO

    Dataset automatically created during the evaluation run of model Nekochu/Llama-3.1-8B-German-ORPO. The dataset is composed of 38 configuration(s), each one corresponding to one of the evaluated tasks. The dataset has been created from 1 run(s). Each run can be found as a specific split in each configuration, the split being named using the timestamp of the run. The "train" split is always pointing to the latest… See the full description on the dataset page: https://huggingface.co/datasets/open-llm-leaderboard/Nekochu_Llama-3.1-8B-German-ORPO-details.
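
    The configuration-and-split layout described above can be inspected with the Hugging Face datasets library. A minimal sketch, assuming only the repository id from the citation; everything else is generic datasets usage:

    ```python
    # Minimal sketch: list the per-task configurations of this evaluation-run
    # dataset and load the latest run of one of them.
    from datasets import get_dataset_config_names, load_dataset

    repo = "open-llm-leaderboard/Nekochu_Llama-3.1-8B-German-ORPO-details"

    configs = get_dataset_config_names(repo)  # the 38 task configurations
    print(len(configs), configs[:3])

    # The "train" split of each configuration points to the latest run;
    # timestamped splits hold individual runs.
    ds = load_dataset(repo, configs[0], split="train")
    print(ds)
    ```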

  2. wsdm - open models - nbroad

    • kaggle.com
    Updated Jan 21, 2025
    Cite
    Nicholas Broad (2025). wsdm - open models - nbroad [Dataset]. https://www.kaggle.com/datasets/nbroad/wsdm-open-models-nbroad
    Explore at: Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jan 21, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Nicholas Broad
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    8.5k + 13k + 5k rows (v1, v2, v3) of multilingual prompts and responses. Prompts are taken from the LMSYS 1M dataset, in the same format as the host-provided dataset. A download sketch follows the count tables below.

    **No winner column**

    v1 Model response counts:

    | Model | Count |
    | --- | --- |
    | mistralai/Mistral-Nemo-Instruct-2407 | 1867 |
    | meta-llama/Meta-Llama-3-8B-Instruct | 1702 |
    | mistralai/Mixtral-8x7B-Instruct-v0.1 | 1506 |
    | mistralai/Mistral-7B-Instruct-v0.3 | 1424 |
    | NousResearch/Hermes-3-Llama-3.1-8B | 1408 |
    | meta-llama/Llama-3.3-70B-Instruct | 1344 |
    | Qwen/Qwen2.5-72B-Instruct | 1322 |
    | 01-ai/Yi-1.5-34B-Chat | 1322 |
    | HuggingFaceH4/starchat2-15b-v0.1 | 1302 |
    | microsoft/Phi-3.5-mini-instruct | 1294 |
    | google/gemma-2-27b-it | 1230 |
    | Qwen/QwQ-32B-Preview | 1117 |

    v1 Language Counts

    | Language | Count |
    | --- | --- |
    | Portuguese | 1079 |
    | Russian | 966 |
    | Chinese | 909 |
    | English | 883 |
    | Spanish | 779 |
    | German | 615 |
    | French | 585 |
    | Italian | 493 |
    | unknown | 383 |
    | Japanese | 319 |
    | Korean | 201 |
    | Polish | 132 |
    | Indonesian | 104 |
    | Arabic | 75 |
    | Vietnamese | 57 |
    | Turkish | 57 |
    | Dutch | 50 |
    | Latin | 40 |
    | Hungarian | 37 |
    | Ukrainian | 36 |
    | Persian | 34 |
    | Danish | 33 |
    | Greek | 33 |
    | Czech | 29 |
    | Swedish | 25 |
    | Romanian | 24 |
    | Galician | 22 |
    | Hebrew | 19 |
    | Serbian | 18 |
    | Scots | 17 |
    | Norwegian | 17 |
    | Bulgarian | 15 |
    | Finnish | 14 |
    | Catalan | 14 |
    | Hawaiian | 13 |
    | Corsican | 13 |
    | Malay | 12 |
    | Slovak | 11 |
    | Thai | 10 |
    | Occitan | 9 |
    | Norwegian Nynorsk | 8 |
    | Afrikaans | 8 |
    | Haitian Creole | 8 |
    | Quechua | 8 |
    | Samoan | 7 |
    | Breton | 7 |
    | Uzbek | 7 |
    | Bangla | 7 |
    | Hausa | 6 |
    | Luxembourgish | 6 |
    | Tsonga | 6 |
    | Esperanto | 6 |
    | Interlingua | 5 |
    | Somali | 5 |
    | Basque | 5 |
    | Aymara | 5 |
    | Tatar | 5 |
    | Nauru | 4 |
    | Tagalog | 4 |
    | Tswana | 4 |
    | Wolof | 4 |
    | Guarani | 4 |
    | Faroese | 4 |
    | Croatian | 4 |
    | Malagasy | 4 |
    | Estonian | 4 |
    | Lithuanian | 3 |
    | Khasi | 3 |
    | Tongan | 3 |
    | Akan | 3 |
    | Manx | 3 |
    | Javanese | 3 |
    | Swahili | 3 |
    | Seselwa Creole French | 3 |
    | Oromo | 3 |
    | Latvian | 3 |
    | Lingala | 2 |
    | Interlingue | 2 |
    | Bosnian | 2 |
    | Yoruba | 2 |
    | Kazakh | 2 |
    | zzp | 2 |
    | Macedonian | 2 |
    | Tajik | 2 |
    | Southern Sotho | 2 |
    | Welsh | 2 |
    | Scottish Gaelic | 2 |
    | Northern Sotho | 2 |
    | Kinyarwanda | 2 |
    | Irish | 2 |
    | Fijian | 2 |
    | Amharic | 2 |
    | Bislama | 2 |
    | Hmong | 2 |
    | Hindi | 2 |
    | Waray | 2 |
    | Volapük | 2 |
    | Marathi | 1 |
    | Sundanese | 1 |
    | Kalaallisut | 1 |
    | Ganda | 1 |
    | Afar | 1 |
    | Rundi | 1 |
    | Sanskrit | 1 |
    | Bashkir | 1 |
    | Cebuano | 1 |
    | Zulu | 1 |
    | Sinhala | 1 |
    | Romansh | 1 |
    | Nepali | 1 |
    | Xhosa | 1 |
    | Tamil | 1 |
    | Māori | 1 |
    | Albanian | 1 |
    | Icelandic | 1 |
    | Slovenian | 1 |
    | xx | 1 |

    v2 Model Counts

    | Model Name | Count |
    | --- | --- |
    | google/gemma-2-9b-it | 1242 |
    | 01-ai/Yi-1.5-34B-Chat | 1229 |
    | microsoft/phi-4 | 1195 |
    | microsoft/Phi-3.5-mini-instruct | 1187 |
    | NousResearch/Hermes-3-Llama-3.1-8B | 1179 |
    | meta-llama/Llama-2-7b-chat-hf | 1179 |
    | mistralai/Mixtral-8x7B-Instruct-v0.1 | 1177 |
    | mistralai/Mistral-Nemo-Instruct-2407 | 1163 |
    | meta-llama/Meta-Llama-3-8B-Instruct | 1158 |
    | meta-llama/Llama-3.1-70B-Instruct | 1146 |
    | meta-llama/Llama-3.3-70B-Instruct | 1142 |
    | microsoft/Phi-3-mini-4k-instruct | 1141 |
    | Qwen/Qwen2.5-0.5B-Instruct | 1138 |
    | google/gemma-2-2b-it | 1133 |
    | google/gemma-1.1-7b-it | 1130 |
    | meta-llama/Llama-3.2-1B-Instruct | 1115 |
    | mistralai/Mistral-7B-Instruct-v0.3 | 1115 |
    | HuggingFaceH4/starchat2-15b-v0.1 | 1112 |
    | meta-llama/Llama-3.2-3B-Instruct | 1097 |
    | HuggingFaceTB/SmolLM2-1.7B-Instruct | 1092 |
    | Qwen/Qwen2.5-72B-Instruct | 1088 |
    | tiiuae/falcon-7b-instruct | 1064 |
    | Qwen/QwQ-32B-Preview | 964 |

    v2 Language Counts

    | Language | Count |
    | --- | --- |
    | English | 2724 |
    | Portuguese | 1482 |
    | Russian | 1410 |
    | Chinese | 1121 |
    | Spanish | 1088 |
    | French | 859 |
    | German | 814 |
    | Italian | 725 |
    | unknown | 502 |
    | Japanese | 378 |
    | Korean | 270 |
    | Polish | 151 |
    | Indonesian | 132 |
    | Arabic | 114 |
    | Vietnamese | 98 |
    | Latin | ... |
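
    For reference, a minimal sketch of fetching this dataset with the kagglehub client. The dataset handle comes from the citation URL; the file names inside the download are not listed here, so the sketch only enumerates them:

    ```python
    # Minimal sketch: download the Kaggle dataset and list its files.
    from pathlib import Path

    import kagglehub

    path = kagglehub.dataset_download("nbroad/wsdm-open-models-nbroad")
    for f in sorted(Path(path).rglob("*")):
        print(f.relative_to(path))
    ```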
  3. dolly-15k_de

    • huggingface.co
    Updated Aug 31, 2000
    Cite
    Mayflower GmbH (2000). dolly-15k_de [Dataset]. https://huggingface.co/datasets/mayflowergmbh/dolly-15k_de
    Explore at: Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Aug 31, 2000
    Dataset authored and provided by
    Mayflower GmbH
    Description

    A reformatted version of the DRXD1000/Dolly-15k-German dataset. Available for finetuning in hiyouga/LLaMA-Factory.

  4. Aleph-Alpha-GermanWeb

    • huggingface.co
    Updated Apr 25, 2025
    Cite
    Aleph Alpha (2025). Aleph-Alpha-GermanWeb [Dataset]. https://huggingface.co/datasets/Aleph-Alpha/Aleph-Alpha-GermanWeb
    Dataset updated
    Apr 25, 2025
    Dataset authored and provided by
    Aleph Alpha (https://aleph-alpha.com/)
    License

    Other: https://choosealicense.com/licenses/other/

    Description

    Aleph-Alpha-GermanWeb

    Aleph-Alpha-GermanWeb is a new German-language dataset that combines heuristic and model-based filtering techniques with synthetic data generation to achieve SOTA performance in German-language benchmarks. The dataset draws from three sources: (1) Common Crawl web data, (2) FineWeb2, and (3) synthetically-generated data conditioned on actual, organic web data. In our accompanying paper, we evaluated our dataset by training both a 1B Llama-style model and an 8B… See the full description on the dataset page: https://huggingface.co/datasets/Aleph-Alpha/Aleph-Alpha-GermanWeb.
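
    Given the scale of a web corpus like this, streaming is the practical way to sample it. A minimal sketch; "fineweb2" is an assumed configuration name (the dataset draws on FineWeb2 per the description), so list the real configurations first:

    ```python
    # Minimal sketch: stream a few records rather than downloading the corpus.
    from datasets import get_dataset_config_names, load_dataset

    repo = "Aleph-Alpha/Aleph-Alpha-GermanWeb"
    print(get_dataset_config_names(repo))  # check the real configuration names

    ds = load_dataset(repo, "fineweb2", split="train", streaming=True)  # assumed name
    for i, row in enumerate(ds):
        print(row)
        if i == 2:
            break
    ```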

  5. wiki_qa_de

    • huggingface.co
    Updated Jan 23, 2025
    + more versions
    Cite
    Mayflower GmbH (2025). wiki_qa_de [Dataset]. https://huggingface.co/datasets/mayflowergmbh/wiki_qa_de
    Explore at: Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jan 23, 2025
    Dataset authored and provided by
    Mayflower GmbH
    Description

    A German translation of the wiki_qa dataset, extracted from seedboxventures/multitask_german_examples_32k. Translation created by seedbox ai for KafkaLM ❤️. Available for finetuning in hiyouga/LLaMA-Factory.

  6. 350M Model

    • figshare.com
    json
    Updated May 23, 2025
    Cite
    Pavel Chizhov (2025). 350M Model [Dataset]. http://doi.org/10.6084/m9.figshare.29135096.v1
    Available download formats: json
    Dataset updated
    May 23, 2025
    Dataset provided by
    figshare
    Authors
    Pavel Chizhov
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    350M Model

    **RAG-350M** is a 350-million-parameter Small Reasoning Model, trained for retrieval-augmented generation (RAG), search and source summarization. Along with RAG-1B it belongs to our family of specialized reasoning models. RAG-350M outperforms most SLMs (4 billion parameters and below) on standardized benchmarks for retrieval-augmented generation (HotPotQA, 2wiki) and is a highly cost-effective alternative to popular larger models, including Qwen-2.5-7B, Llama-3.1-8B and Gemma-3-4B. It is the only SLM to date to maintain consistent RAG performance across leading European languages and to ensure systematic reference grounding for statements. Thanks to its small size, its ease of deployment on constrained infrastructure (including mobile phones) and its built-in support for factual and accurate information, RAG-350M unlocks a range of new use cases for generative AI.

    Features

    RAG-350M is a specialized language model that uses a series of special tokens to process a structured input (query and sources) and generate a structured output (reasoning sequence and answer with sources). For easier implementation, we encourage using the associated API library.

    Citation support

    RAG-350M natively generates grounded answers on the basis of excerpts and citations extracted from the provided sources, using a custom syntax inspired by Wikipedia. It is one of a handful of open-weights models to date developed with this feature, and the first one designed for actual deployment. In contrast with Anthropic's approach (citation mode), citations are generated entirely by the model rather than produced by external chunking. As a result, we can provide another desirable feature that simplifies source checking: citation shortening for longer excerpts (using "(…)").

    RAG reasoning

    RAG-350M generates a specific reasoning sequence incorporating several proto-agentic abilities for RAG applications. The model is able to make a series of decisions directly:

    * Assessing whether the query is understandable.
    * Assessing whether the query is trivial enough not to require a lengthy pre-analysis (adjustable reasoning).
    * Assessing whether the sources contain enough input to generate a grounded answer.

    The structured reasoning trace includes the following steps:

    * Language detection of the query. The model will always strive to answer in the language of the original query.
    * Query analysis and associated query report. The analysis can lead to a standard answer, a shortened reasoning trace/answer for trivial questions, a reformulated query, or a refusal (which could, in the context of the application, be turned into a request for user input).
    * Source analysis and associated source report. This step evaluates the coverage and depth of the provided sources with regard to the query.
    * Draft of the final answer.

    Multilinguality

    RAG-350M is able to read and write in the main European languages: French, German, Italian, Spanish and, to a lesser extent, Polish, Latin and Portuguese. To date, it is the only small language model with negligible loss of performance in leading European languages for RAG-related tasks: on a translated set of HotPotQA we observed a significant performance drop in most SLMs, from 10% up to 30-35% for sub-1B models. We expect the results of any standard English evaluation of our RAG models to be largely transferable to the main European languages, limiting the costs of evaluation and deployment in multilingual settings.

    Training

    RAG-350M is trained on a large synthetic dataset emulating retrieval over a wide variety of multilingual open sources from Common Corpus, with native support for citation and grounding with literal quotes. Following the latest trends in agentification, the model reintegrates multiple features associated with RAG workflows, such as query routing, query reformulation and source reranking.

    Evaluation

    RAG-350M was evaluated on three standard RAG benchmarks: 2wiki, HotpotQA and MuSique. All the benchmarks assess only the "trivial" mode, on questions requiring some form of multi-hop reasoning over sources (the answer is disseminated across different sources) as well as discrimination of distractor sources. RAG-350M is not simply a cost-effective version of larger models: we found it was able to answer correctly several hundred questions from HotPotQA that neither Llama-3-8b nor Qwen-2.5-7b could solve. Consequently, we encourage its use as part of multi-model RAG systems.
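
    As a rough illustration of the structured query-plus-sources input described above, a sketch with Hugging Face transformers. The model identifier and the prompt layout are assumptions; the model's own API library and special-token syntax are the documented path:

    ```python
    # Rough sketch of RAG-style structured prompting with transformers.
    # The model id is a placeholder and the query/source layout is an
    # assumption; consult the associated API library for the real syntax.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "org/RAG-350M"  # hypothetical identifier
    tok = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    # One query plus one source; the model is meant to answer in the
    # language of the query and cite the source.
    prompt = (
        "Query: Wann wurde die Deutsche Bahn AG gegruendet?\n"
        "Source 1: Die Deutsche Bahn AG entstand 1994 aus der Bahnreform.\n"
        "Answer: "
    )
    inputs = tok(prompt, return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=200)
    print(tok.decode(out[0], skip_special_tokens=True))
    ```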

  7. deutsche_bahn_faq_128

    • huggingface.co
    Updated Aug 8, 2024
    + more versions
    Cite
    adesso SE (2024). deutsche_bahn_faq_128 [Dataset]. https://huggingface.co/datasets/islam-hajosman/deutsche_bahn_faq_128
    Explore at: Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Aug 8, 2024
    Authors
    adesso SE
    Description

    Dataset Name: Deutsche Bahn FAQ in Llama 3 Format

    Dataset Description: This dataset contains 1000 question-answer pairs extracted from the official Deutsche Bahn (German Railways) FAQ section. The data has been specifically formatted to be compatible with the Llama 3 instruct models for supervised fine-tuning (SFT).

    Dataset Purpose: The primary purpose of this dataset is to facilitate the fine-tuning of Llama 3 instruct models for tasks related to customer service and information retrieval in… See the full description on the dataset page: https://huggingface.co/datasets/islam-hajosman/deutsche_bahn_faq_128.
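
    For orientation, a minimal sketch of the Llama 3 instruct chat format that such question-answer pairs are typically serialized into for SFT; the question-answer pair below is invented for illustration:

    ```python
    # Minimal sketch of the Llama 3 instruct serialization; the question and
    # answer are invented examples, not rows from this dataset.
    def to_llama3_instruct(question: str, answer: str) -> str:
        return (
            "<|begin_of_text|>"
            "<|start_header_id|>user<|end_header_id|>\n\n"
            f"{question}<|eot_id|>"
            "<|start_header_id|>assistant<|end_header_id|>\n\n"
            f"{answer}<|eot_id|>"
        )

    print(to_llama3_instruct(
        "Kann ich mein Sparpreis-Ticket stornieren?",
        "Sparpreis-Tickets sind nur eingeschraenkt erstattbar ...",
    ))
    ```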

  8. Data from: LLM-Supported Workflow for Processing Faulty OCR

    • pub.uni-bielefeld.de
    Updated Jul 11, 2025
    Cite
    Christian Wachter; Patrick Jentsch (2025). LLM-Supported Workflow for Processing Faulty OCR [Dataset]. https://pub.uni-bielefeld.de/record/3003406
    Dataset updated
    Jul 11, 2025
    Authors
    Christian Wachter; Patrick Jentsch
    Description

    Notebook based on Sarah Oberbichler's (oberbichler@ieg-mainz.de) notebook "Researching German Historical Newspapers with Llama AI Model" (https://github.com/soberbichler/Notebooks4Historical_Newspapers/blob/main/Llama3_OCR.ipynb), edited by Christian Wachter (christian.wachter@uni-bielefeld.de) and Patrick Jentsch (p.jentsch@uni-bielefeld.de). This notebook shows how LLMs can be used to support research with historical newspapers. In this example, the Llama 3.1 model is used to correct the OCR output of previously OCR'd historical newspaper pages. OCR quality has been a long-standing issue in digitization efforts: historical newspapers are particularly affected due to their complexity, historical fonts and degradation, and OCR technology has long faced limitations when dealing with historical scripts.
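
    The core step of such a workflow can be sketched as a single correction prompt to a chat model. A minimal illustration against an OpenAI-compatible endpoint; the deployment name, client configuration and the faulty OCR line are all assumptions (the notebook itself uses Llama 3.1):

    ```python
    # Minimal sketch of LLM-based OCR post-correction via an OpenAI-compatible
    # endpoint. Model name, endpoint configuration, and the OCR sample are
    # assumptions, not taken from the notebook.
    from openai import OpenAI

    client = OpenAI()  # assumes base_url/API key configured for a Llama 3.1 host

    ocr_text = "Dle Zeitunq berlchtet von elnem groBen Brande."  # invented OCR noise

    resp = client.chat.completions.create(
        model="llama-3.1-8b-instruct",  # hypothetical deployment name
        messages=[
            {
                "role": "system",
                "content": (
                    "Correct the OCR errors in the following historical German "
                    "newspaper text. Keep historical spellings; fix only obvious "
                    "recognition errors."
                ),
            },
            {"role": "user", "content": ocr_text},
        ],
    )
    print(resp.choices[0].message.content)
    ```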


    License: GNU GPLv3

  9. HARD-REASONING-DE

    • huggingface.co
    Updated Jul 16, 2025
    Cite
    Embraceable Technology GmbH (2025). HARD-REASONING-DE [Dataset]. https://huggingface.co/datasets/embraceableAI/HARD-REASONING-DE
    Dataset updated
    Jul 16, 2025
    Dataset authored and provided by
    Embraceable Technology GmbH
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    HARD-REASONING-DE

    The original dataset was obtained from German-RAG LLM-HARD BENCHMARK and was further cleaned, filtered and re-evaluated.

      Methodology: Reasoning-DE
    

    * Providing persona descriptions and rewriting them in a similar style with a different focus area and name, in German/English.
    * Generating simple logical problems out of persona-specific views & language.
    * Generating approaches, thinking steps & solutions, separately verified by Llama-3.1-405B-Instruct.
    * Quality… See the full description on the dataset page: https://huggingface.co/datasets/embraceableAI/HARD-REASONING-DE.

