5 datasets found
  1. gsm8k

    • huggingface.co
    Updated Aug 11, 2022
    OpenAI (2022). gsm8k [Dataset]. https://huggingface.co/datasets/openai/gsm8k
    Dataset authored and provided by
    OpenAI (http://openai.com/)
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Dataset Card for GSM8K

      Dataset Summary
    

    GSM8K (Grade School Math 8K) is a dataset of 8.5K high-quality, linguistically diverse grade school math word problems. The dataset was created to support the task of question answering on basic mathematical problems that require multi-step reasoning.

    These problems take between 2 and 8 steps to solve. Solutions primarily involve performing a sequence of elementary calculations using basic arithmetic operations (+ − × ÷) to reach the… See the full description on the dataset page: https://huggingface.co/datasets/openai/gsm8k.
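
    To make "multi-step" concrete, here is a minimal sketch of a three-step solution built only from basic arithmetic. The word problem, the numbers, and the function name `solve` are all hypothetical and are not taken from GSM8K; the sketch also does not reproduce the dataset's own answer format.

```python
# Hypothetical word problem (invented for illustration, not from GSM8K):
# "A shop sells pencils in boxes of 12. Mia buys 3 boxes, gives 8 pencils
#  to a friend, and splits the rest equally between 2 pencil cases.
#  How many pencils go in each case?"

def solve() -> list[tuple[str, int]]:
    """Return each intermediate calculation as (expression, result)."""
    steps: list[tuple[str, int]] = []

    total = 3 * 12                      # step 1: pencils bought
    steps.append(("3 * 12", total))

    remaining = total - 8               # step 2: pencils left after the gift
    steps.append((f"{total} - 8", remaining))

    per_case = remaining // 2           # step 3: equal split between 2 cases
    steps.append((f"{remaining} / 2", per_case))

    return steps


if __name__ == "__main__":
    for expression, result in solve():
        print(f"{expression} = {result}")
```

    Keeping every intermediate result, rather than only the final number, mirrors the step-by-step character of the solutions described above.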

  2. gsm8k-synthetic-diverse-8b

    • huggingface.co
    Updated Oct 29, 2024
    gsm8k-synthetic-diverse-8b [Dataset]. https://huggingface.co/datasets/gretelai/gsm8k-synthetic-diverse-8b
    Dataset provided by
    Gretel Labs, Inc.
    License

    https://choosealicense.com/licenses/llama3.1/

    Description

    gretelai/gsm8k-synthetic-diverse-8b

    This dataset is a synthetically generated dataset inspired by GSM8K (https://huggingface.co/datasets/openai/gsm8k), created entirely using Gretel Navigator with meta-llama/Meta-Llama-3.1-8B as the agent LLM. It contains ~1500 grade-school-level math word problems with step-by-step solutions, focusing on age group, difficulty, and domain diversity.

      Key Features:
    

    Synthetically Generated: Math problems created using… See the full description on the dataset page: https://huggingface.co/datasets/gretelai/gsm8k-synthetic-diverse-8b.

  3. gsm8k-thai

    • huggingface.co
    gsm8k-thai [Dataset]. https://huggingface.co/datasets/VISAI-AI/gsm8k-thai
    Dataset provided by
    Visai AI Co., Ltd.
    Authors
    VISAI AI
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    gsm8k-thai

    This dataset is a Thai translation of the GSM8k benchmark (https://huggingface.co/datasets/openai/gsm8k), a dataset of grade school math word problems. The translation was performed using Claude 3.5 Sonnet. It is intended for evaluating the performance of language models on mathematical reasoning in the Thai language. The split of training and test data follows the original GSM8k dataset.

      Annotations
    

    source: claude-3.5-sonnet language: en -> th… See the full description on the dataset page: https://huggingface.co/datasets/VISAI-AI/gsm8k-thai.

  4. gms8k_fr_500

    • huggingface.co
    Updated Mar 13, 2025
    gms8k_fr_500 [Dataset]. https://huggingface.co/datasets/cmh/gms8k_fr_500
    Authors
    cmh
    Description

    The openai/gsm8k dataset translated to French using quickmt/quickmt-en-fr.

    Question tokens: 256 maximum. Answer tokens: 1024 maximum. Lines: 500.

  5. gsm8k_cz

    • huggingface.co
    Updated Mar 25, 2025
    honza hroch (2025). gsm8k_cz [Dataset]. https://huggingface.co/datasets/hroch/gsm8k_cz
    Authors
    honza hroch
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    This dataset contains an automatic Czech translation of OpenAI's GSM8K dataset.

    question: The original field containing the question in English.
    answer: The original field containing the answer in English.
    translated_question: The question field translated into Czech using the Helsinki-NLP/opus-mt-en-cs model.
    translated_answer: The answer field translated into Czech using the Helsinki-NLP/opus-mt-en-cs model.

    The translation was based on the main split of the GSM8K dataset. Model used for translation:… See the full description on the dataset page: https://huggingface.co/datasets/hroch/gsm8k_cz.
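
    The four fields named in the description above can be sketched as a typed record. This is only an illustration: the class name `GsmCzRecord`, the helper `missing_fields`, and the placeholder row are assumptions made here, and the assumption that every field is a plain string is not confirmed by the dataset page.

```python
from typing import TypedDict


class GsmCzRecord(TypedDict):
    """One row, using the four field names given in the description."""
    question: str             # original English question
    answer: str               # original English answer
    translated_question: str  # Czech machine translation of the question
    translated_answer: str    # Czech machine translation of the answer


def missing_fields(row: dict) -> list[str]:
    """Return the expected field names that are absent from `row`."""
    return [name for name in GsmCzRecord.__annotations__ if name not in row]


# Hypothetical placeholder row; these strings are not taken from the dataset.
example_row: GsmCzRecord = {
    "question": "<English question>",
    "answer": "<English answer>",
    "translated_question": "<Czech question>",
    "translated_answer": "<Czech answer>",
}

if __name__ == "__main__":
    print(missing_fields(example_row))
```

    A check like `missing_fields` is one way to confirm that a loaded row carries both the original and the translated columns before using it for evaluation.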


