Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Dataset Card for Processed FinanceBench
Dataset Description
This dataset is derived from the PatronusAI/financebench-test dataset. It contains only the PASS examples, processed into a clean format for question-answering tasks in the financial domain.
Dataset Summary
The dataset contains financial questions, their corresponding document contexts, and human-written answers that have been verified as faithful to the source documents.
Columns:
question:… See the full description on the dataset page: https://huggingface.co/datasets/virattt/financebench.
FinanceMTEB/FinanceBench dataset hosted on Hugging Face and contributed by the HF Datasets community
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
DataRobot-Research/financebench dataset hosted on Hugging Face and contributed by the HF Datasets community
sunitha-ravi/mistral-original-financebench dataset hosted on Hugging Face and contributed by the HF Datasets community
financebench/submissions_public dataset hosted on Hugging Face and contributed by the HF Datasets community
jbyhring/Financebench dataset hosted on Hugging Face and contributed by the HF Datasets community
financebench/finance-events-latest dataset hosted on Hugging Face and contributed by the HF Datasets community
gx-ai-architect/financebench-numerical-fixed dataset hosted on Hugging Face and contributed by the HF Datasets community
StarNeit/finance-bench-stock-predictions dataset hosted on Hugging Face and contributed by the HF Datasets community
mlfoundations-dev/r1_annotated_finqa_OT7B_eval_7e69
Precomputed model outputs for evaluation.
Evaluation Results
FinanceBench
Average Accuracy: 50.97% ± 0.00%
Number of Runs: 1
Run | Accuracy | Questions Solved | Total Questions
1 | 50.97% | 210 | 412
mlfoundations-dev/OpenThinker2-7B_eval_7e69
Precomputed model outputs for evaluation.
Evaluation Results
FinanceBench
Average Accuracy: 43.69% ± 0.00%
Number of Runs: 1
Run | Accuracy | Questions Solved | Total Questions
1 | 43.69% | 180 | 412
mlfoundations-dev/Qwen2.5-7B-Instruct_eval_7e69
Precomputed model outputs for evaluation.
Evaluation Results
FinanceBench
Average Accuracy: 30.58% ± 0.00%
Number of Runs: 1
Run | Accuracy | Questions Solved | Total Questions
1 | 30.58% | 126 | 412
mlfoundations-dev/DeepSeek-R1-Distill-Qwen-7B_eval_7e69
Precomputed model outputs for evaluation.
Evaluation Results
FinanceBench
Average Accuracy: 34.22% ± 0.00%
Number of Runs: 1
Run | Accuracy | Questions Solved | Total Questions
1 | 34.22% | 141 | 412
mlfoundations-dev/OpenR1-Math-Raw-all-correct-sharegpt_1744357704_eval_7e69
Precomputed model outputs for evaluation.
Evaluation Results
FinanceBench
Average Accuracy: 0.00% ± 0.00%
Number of Runs: 1
Run | Accuracy | Questions Solved | Total Questions
1 | 0.00% | 0 | 10
mlfoundations-dev/OpenR1-Math-Raw-all-correct-5k_OT7B_eval_7e69
Precomputed model outputs for evaluation.
Evaluation Results
FinanceBench
Average Accuracy: 45.15% ± 0.00%
Number of Runs: 1
Run | Accuracy | Questions Solved | Total Questions
1 | 45.15% | 186 | 412
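The accuracy figures reported above appear to be the number of questions solved divided by the total number of questions, expressed as a percentage. The following is a minimal sketch that recomputes them from the reported counts. The dictionary keys are shortened model names and the helper `accuracy_percent` is a hypothetical name chosen for illustration; neither comes from the evaluation code itself, which is not shown here.

```python
# (questions solved, total questions) pairs copied from the results above.
# Keys are shortened model names used only for this illustration.
runs = {
    "r1_annotated_finqa_OT7B": (210, 412),
    "OpenThinker2-7B": (180, 412),
    "Qwen2.5-7B-Instruct": (126, 412),
    "DeepSeek-R1-Distill-Qwen-7B": (141, 412),
    "OpenR1-Math-Raw-all-correct-sharegpt_1744357704": (0, 10),
    "OpenR1-Math-Raw-all-correct-5k_OT7B": (186, 412),
}


def accuracy_percent(solved: int, total: int) -> float:
    """Return solved/total as a percentage, rounded to two decimals."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * solved / total, 2)


# Assumption: accuracy = solved / total, which matches every reported value.
accuracies = {name: accuracy_percent(s, t) for name, (s, t) in runs.items()}

# Print the models from highest to lowest accuracy.
for name, acc in sorted(accuracies.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {acc:.2f}%")
```

Each result comes from a single run, so the reported ± 0.00% reflects no run-to-run variation rather than a measured spread. The 0.00% entry was evaluated on 10 questions instead of 412, so it is not directly comparable with the others.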