MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
🎯 DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving
📝 Paper@arXiv | 🤗 Datasets&Models@HF | 🐱 Code@GitHub | 🐦 Thread@X(Twitter) | 🐶 Chinese Blog@Zhihu | 📊 Leaderboard@PapersWithCode | 📃 BibTeX
[!IMPORTANT] 🔥 Excited to find our DART-Math-DSMath-7B (Prop2Diff) trained on DART-Math-Hard comparable to the AIMO winner NuminaMath-7B on CoT, but based solely on the MATH & GSM8K prompt set, leaving much room to improve! Besides, our DART method is also fully compatible… See the full description on the dataset page: https://huggingface.co/datasets/hkust-nlp/dart-math-hard.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
🎯 DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving
📝 Paper@arXiv | 🤗 Datasets&Models@HF | 🐱 Code@GitHub | 🐦 Thread@X(Twitter) | 🐶 Chinese Blog@Zhihu | 📊 Leaderboard@PapersWithCode | 📃 BibTeX
Datasets: DART-Math
DART-Math datasets are the state-of-the-art and data-efficient open-source instruction tuning datasets for mathematical reasoning.
Figure 1: Left: Average accuracy on 6 mathematical benchmarks. We compare with models… See the full description on the dataset page: https://huggingface.co/datasets/hkust-nlp/dart-math-uniform.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
[!NOTE] This dataset is the data pool synthesized from the query set of the MATH training set, containing all answer-correct samples and other metadata produced during the work. DART-Math-* datasets are extracted from dart-math-pool-* data pools, as sketched at the end of this entry.
🎯 DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving
📝 Paper@arXiv | 🤗 Datasets&Models@HF | 🐱 Code@GitHub | 🐦 Thread@X(Twitter) | 🐶 Chinese Blog@Zhihu | 📊 Leaderboard@PapersWithCode | 📃 BibTeX
Datasets: … See the full description on the dataset page: https://huggingface.co/datasets/hkust-nlp/dart-math-pool-math.
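The note above describes the pool-to-dataset relationship. Below is a minimal sketch of how a DART-Math-* style subset might be extracted from such a pool; the `query` field name and the per-query cap are illustrative assumptions, not the dataset's documented schema.

```python
# Illustrative only: extract up to k answer-correct samples per query from a
# dart-math-pool-* style pool. The "query" field name is an assumption.
from collections import defaultdict

from datasets import load_dataset

pool = load_dataset("hkust-nlp/dart-math-pool-math", split="train")

per_query = defaultdict(list)
for sample in pool:  # pool samples are all answer-correct by construction
    per_query[sample["query"]].append(sample)

k = 4  # uniform variant: keep the same number of responses per query
subset = [s for samples in per_query.values() for s in samples[:k]]
```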
Litzy619/dart-math-diff dataset hosted on Hugging Face and contributed by the HF Datasets community
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
[!NOTE] This dataset is the synthesis information for queries from the GSM8K training set, such as the number of raw/correct samples for each synthesis job. Usually used together with dart-math-pool-gsm8k; a difficulty-estimation sketch follows at the end of this entry.
🎯 DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving
📝 Paper@arXiv | 🤗 Datasets&Models@HF | 🐱 Code@GitHub | 🐦 Thread@X(Twitter) | 🐶 Chinese Blog@Zhihu | 📊 Leaderboard@PapersWithCode | 📃 BibTeX
Datasets: DART-Math
DART-Math datasets are the… See the full description on the dataset page: https://huggingface.co/datasets/hkust-nlp/dart-math-pool-gsm8k-query-info.
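As a sketch of how the raw/correct counts above could translate into a difficulty signal, the snippet below estimates a per-query fail rate; the column names `n_raw_samples` and `n_correct_samples` are assumptions and should be checked against the actual schema.

```python
# Hedged sketch: derive per-query difficulty (fail rate) from a
# dart-math-pool-*-query-info style table. Column names are assumptions.
from datasets import load_dataset

info = load_dataset("hkust-nlp/dart-math-pool-gsm8k-query-info", split="train")

def fail_rate(row: dict) -> float:
    raw, correct = row["n_raw_samples"], row["n_correct_samples"]
    return 1.0 - correct / raw if raw else 1.0

# Harder queries fail more often; difficulty-aware sampling gives them
# a larger share of the synthesis budget.
hardest = sorted(info, key=fail_rate, reverse=True)[:10]
```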
Dataset Card for "dart-math-uniform"
More Information needed
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
[!NOTE] This dataset is the synthesis information for queries from the MATH training set, such as the number of raw/correct samples for each synthesis job. Usually used together with dart-math-pool-math.
🎯 DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving
📝 Paper@arXiv | 🤗 Datasets&Models@HF | 🐱 Code@GitHub | 🐦 Thread@X(Twitter) | 🐶 Chinese Blog@Zhihu | 📊 Leaderboard@PapersWithCode | 📃 BibTeX
Datasets: DART-Math
DART-Math datasets are the state-of-the-art… See the full description on the dataset page: https://huggingface.co/datasets/hkust-nlp/dart-math-pool-math-query-info.
Dart-math-hard_scored - with OpenDataArena Scores
This dataset is a scored version of the original hkust-nlp/dart-math-hard dataset. The scoring was performed using the OpenDataArena-Tool, a comprehensive suite of automated evaluation methods for assessing instruction-following datasets. This version of the dataset includes rich, multi-dimensional scores for both the instructions (questions) and the instruction-response pairs, allowing for highly granular data analysis and… See the full description on the dataset page: https://huggingface.co/datasets/OpenDataArena/hkust-nlp_dart-math-hard_scored.
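One plausible use of such scores is threshold filtering, as in the sketch below; the column name `score` and its 0-1 scale are assumptions, so consult the dataset card for the actual fields emitted by OpenDataArena-Tool.

```python
# Hypothetical usage: keep only high-scoring instruction-response pairs.
# The "score" column name and 0-1 scale are assumptions, not the real schema.
from datasets import load_dataset

scored = load_dataset("OpenDataArena/hkust-nlp_dart-math-hard_scored", split="train")
top = scored.filter(lambda row: row["score"] >= 0.8)
print(f"{len(top)} of {len(scored)} examples pass the threshold")
```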
Dataset Card for "rlhflow_mixture_mod_scalebiosampled-20k"
weight = { 'MathInstruct': 0.17918314039707184, 'SlimOrca': 0.1572466790676117, 'Magicoder-Evol-Instruct-110K': 0.1262860894203186, 'dart-math-uniform': 0.10912656784057617, 'GPTeacher-General-Instruct': 0.10593341290950775, 'GPT4-LLM-Cleaned': 0.09206369519233704, 'WizardLM_evol_instruct_V2_196k': 0.07409033179283142, 'UltraInteract_sft': 0.056401610374450684, 'orca-math-word-problems-200k': 0.054032713174819946… See the full description on the dataset page: https://huggingface.co/datasets/pxyyy/rlhflow_mixture_mod_scalebiosampled-20k.
Dataset Card for "rlhflow_mixture_intuitive_sampled-20k"
weight = { 'MathInstruct': 0.1, 'SlimOrca': 0.2, 'Magicoder-Evol-Instruct-110K': 0.1, 'dart-math-uniform': 0.3, 'GPTeacher-General-Instruct': 0.05, 'GPT4-LLM-Cleaned': 0.03, 'WizardLM_evol_instruct_V2_196k': 0.05, 'UltraInteract_sft': 0.1, 'orca-math-word-problems-200k': 0.05, 'ShareGPT_V3_unfiltered_cleaned_split_no_imsorry': 0.02, }
More Information needed
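The weights above fully specify the mixture. The following is a minimal sketch, under the assumption that each source is available as an in-memory pool of examples (the stubbed `sources` mapping is illustrative), of turning such weights into a 20k sample.

```python
# Minimal sketch: draw a 20k mixture according to the card's weights.
import random

weight = {
    "MathInstruct": 0.1, "SlimOrca": 0.2, "Magicoder-Evol-Instruct-110K": 0.1,
    "dart-math-uniform": 0.3, "GPTeacher-General-Instruct": 0.05,
    "GPT4-LLM-Cleaned": 0.03, "WizardLM_evol_instruct_V2_196k": 0.05,
    "UltraInteract_sft": 0.1, "orca-math-word-problems-200k": 0.05,
    "ShareGPT_V3_unfiltered_cleaned_split_no_imsorry": 0.02,
}

TOTAL = 20_000
counts = {name: round(w * TOTAL) for name, w in weight.items()}

# Stub source pools; in practice these would be the loaded datasets.
sources = {name: [f"{name}-ex{i}" for i in range(25_000)] for name in weight}

mixture = [ex for name, n in counts.items()
           for ex in random.sample(sources[name], n)]  # without replacement
random.shuffle(mixture)
assert len(mixture) == TOTAL  # exact for these weights; round() leaves no drift
```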
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
GSM8K (Fixed)
Some erroneous labels exist in the GSM8K dataset. This dataset is fixed from https://github.com/openai/grade-school-math/blob/master/grade_school_math/data/train.jsonl with the code appended at the end. The errors were located by investigating unreasonably low pass rates achieved by the strong DeepSeekMath-7B-RL, and the fixes should hopefully be exhaustive. This dataset is used by the 🎯 DART-Math project to synthesize data.
[!WARNING] ⚠️ Only the training set has been fixed so far.
for… See the full description on the dataset page: https://huggingface.co/datasets/hkust-nlp/gsm8k-fix.
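A hedged sketch of the error-locating idea described above: flag training items on which a strong model's sampled solutions almost never match the reference answer, then inspect those labels by hand. The `sample_final_answers` helper is a hypothetical stand-in for actually running DeepSeekMath-7B-RL, and the `question`/`answer` field names are assumptions.

```python
from datasets import load_dataset

def sample_final_answers(question: str, n: int = 32) -> list[str]:
    """Stub: sample n solutions from a strong model (the card used
    DeepSeekMath-7B-RL) and extract their final answers."""
    raise NotImplementedError  # model inference is out of scope here

def is_suspect(question: str, reference: str, threshold: float = 0.05) -> bool:
    answers = sample_final_answers(question)
    pass_rate = sum(a == reference for a in answers) / len(answers)
    return pass_rate < threshold  # unreasonably low => label worth inspecting

# Scan once the stub is wired to a real model; field names are assumptions:
# train = load_dataset("hkust-nlp/gsm8k-fix", split="train")
# suspects = [r for r in train if is_suspect(r["question"], r["answer"])]
```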
Dataset Card for "rlhflow_mixture_scalebio_sampled-nolisa-600k"
weight = { 'SlimOrca': 0.34525978565216064, 'dart-math-uniform': 0.23386941850185394, 'GPT4-LLM-Cleaned': 0.19111572206020355, 'MathInstruct': 0.16642746329307556, 'GPTeacher-General-Instruct': 0.042891550809144974, 'ShareGPT_V3_unfiltered_cleaned_split_no_imsorry': 0.006720397621393204, 'UltraInteract_sft': 0.0042861211113631725, 'WizardLM_evol_instruct_V2_196k': 0.004021201748400927, 'Magicoder-Evol-Instruct-110K': … See the full description on the dataset page: https://huggingface.co/datasets/pxyyy/rlhflow_mixture_scalebio_sampled-nolisa-250k.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
[!NOTE] This dataset is the VRT baseline dataset used to train the baseline models *-VRT in Table 2 of the paper.
Another ablation baseline to DART is vanilla rejection tuning (VRT), where we synthesize a dataset of the same size (0.59M examples) with DeepSeekMath-7B-RL, using vanilla rejection sampling as described in §2.1; a sketch contrasting the two allocation strategies follows at the end of this entry.
🎯 DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving
📝 Paper@arXiv | 🤗 Datasets&Models@HF | 🐱 Code@GitHub | 🐦… See the full description on the dataset page: https://huggingface.co/datasets/hkust-nlp/vrt-baseline.
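To make the contrast with vanilla rejection sampling concrete, here is an illustrative sketch (not the paper's implementation) of how the two strategies allocate a fixed budget of correct samples across queries of varying fail rate:

```python
def vanilla_targets(fail_rates: list[float], budget: int) -> list[int]:
    # Vanilla rejection sampling: drawing uniformly over queries yields
    # correct samples in proportion to each query's pass rate, so the
    # resulting dataset is skewed toward easy queries.
    passes = [1.0 - f for f in fail_rates]
    total = sum(passes)
    return [round(budget * p / total) for p in passes]

def difficulty_aware_targets(fail_rates: list[float], budget: int) -> list[int]:
    # Difficulty-aware (Prop2Diff-style): allocate the budget roughly in
    # proportion to difficulty, so hard queries get more correct samples.
    total = sum(fail_rates)
    return [round(budget * f / total) for f in fail_rates]

print(vanilla_targets([0.1, 0.5, 0.9], budget=100))           # [60, 33, 7]
print(difficulty_aware_targets([0.1, 0.5, 0.9], budget=100))  # [7, 33, 60]
```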