100+ datasets found
  1. Ranking of LLM tools in solving math problems 2024

    • statista.com
    Updated Jun 25, 2025
    Cite
    Statista (2025). Ranking of LLM tools in solving math problems 2024 [Dataset]. https://www.statista.com/statistics/1458141/leading-math-llm-tools/
    Explore at:
    Dataset updated
    Jun 25, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Mar 2024
    Area covered
    Worldwide
    Description

    As of March 2024, OpenAI o1 was the large language model (LLM) tool that had the best benchmark score in solving math problems, with a score of **** percent. Close behind, in second place, was OpenAI o1-mini, followed by GPT-4o.

  2. MathInstruct Dataset: Hybrid Math Instruction

    • kaggle.com
    zip
    Updated Nov 30, 2023
    Cite
    The Devastator (2023). MathInstruct Dataset: Hybrid Math Instruction [Dataset]. https://www.kaggle.com/datasets/thedevastator/mathinstruct-dataset-hybrid-math-instruction-tun
    Explore at:
    Available download formats: zip (60239940 bytes)
    Dataset updated
    Nov 30, 2023
    Authors
    The Devastator
    License

    CC0 1.0 Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/

    Description

    MathInstruct Dataset: Hybrid Math Instruction Tuning

    A curated dataset for math instruction tuning models

    By TIGER-Lab (from Hugging Face) [source]

    About this dataset

    MathInstruct is a comprehensive and meticulously curated dataset specifically designed to facilitate the development and evaluation of models for math instruction tuning. This dataset consists of a total of 13 different math rationale datasets, out of which six have been exclusively curated for this project, ensuring a diverse range of instructional materials. The main objective behind creating this dataset is to provide researchers with an easily accessible and manageable resource that aids in enhancing the effectiveness and precision of math instruction.

    One noteworthy feature of MathInstruct is its lightweight nature, making it highly convenient for researchers to utilize without any hassle. With carefully selected columns such as source and output, users can readily identify the origin or reference material from which each math instruction was obtained, and they can refer to the expected output or solution corresponding to each specific math problem or exercise.

    Overall, MathInstruct offers immense potential in refining hybrid math instruction by facilitating meticulous model development and rigorous evaluation processes. Researchers can leverage this diverse dataset to gain deeper insights into effective teaching methodologies while exploring innovative approaches towards enhancing mathematical learning experiences.

    How to use the dataset

    Title: How to Use the MathInstruct Dataset for Hybrid Math Instruction Tuning

    Introduction: The MathInstruct dataset is a comprehensive collection of math instruction examples, designed to assist in developing and evaluating models for math instruction tuning. This guide will provide an overview of the dataset and explain how to make effective use of it.

    • Understanding the Dataset Structure: The dataset consists of a file named train.csv. This CSV file contains the training data, which includes columns such as source and output. The source column represents the source of the math instruction (textbook, online resource, or teacher), while the output column represents the expected output or solution to a particular math problem or exercise.

    • Accessing the Dataset: To access the MathInstruct dataset, you can download it from Kaggle's website. Once downloaded, you can read and manipulate the data using programming languages like Python with libraries such as pandas (see the loading sketch after this list).

    • Exploring the Columns: a) Source Column: The source column provides information about where each math instruction comes from. It may include references to specific textbooks, online resources, or even teachers who provided instructional material. b) Output Column: The output column specifies what students are expected to achieve as a result of each math instruction. It contains solutions or expected outputs for different math problems or exercises.

    • Utilizing Source Information: By analyzing the different sources mentioned in this dataset, researchers can understand which instructional materials are more effective in teaching specific topics within mathematics. They can also identify common strategies used by teachers across multiple sources.

    • Analyzing Expected Outputs: Researchers can study variations in expected outputs for similar types of problems across different sources. This analysis may help identify differences in approaches across textbooks/resources and enrich our understanding of various teaching methods.

    • Model Development and Evaluation: Researchers can utilize this dataset to develop machine learning models that automatically assess whether a given math instruction leads to the expected output. By training models on this data, one can create automated systems that provide feedback on math problems or suggest alternative instruction sources.

    • Scaling the Dataset: Due to its lightweight nature, the MathInstruct dataset is easily accessible and manageable. Researchers can scale up their training data by combining it with other instructional datasets or expand it further by labeling more examples based on similar guidelines.
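
    As a concrete starting point, here is a minimal loading sketch in Python with pandas. It is an illustration rather than official documentation, and it assumes the Kaggle zip has been extracted so that train.csv sits in the working directory with the source and output columns described above.

     import pandas as pd

     # Load the MathInstruct training file (path assumes the Kaggle zip was
     # extracted into the current working directory).
     df = pd.read_csv("train.csv")

     # Inspect the columns described above.
     print(df.columns.tolist())                    # expected to include 'source' and 'output'
     print(df["source"].value_counts().head(10))   # which rationale datasets dominate
     print(df.iloc[0]["output"][:200])             # preview one expected solution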

    Conclusion: The MathInstruct dataset serves as a valuable resource for developing and evaluating models related to math instruction tuning. By analyzing the source information and expected outputs, researchers can gain insights into effective teaching methods and build automated assessment systems.

    Research Ideas

    • Model development: This dataset can be used for developing and training models for math instruction...
  3. Australian and New Zealand journal of statistics Impact Factor 2024-2025 - ResearchHelpDesk

    • researchhelpdesk.org
    Updated Feb 23, 2022
    Cite
    Research Help Desk (2022). Australian and New Zealand journal of statistics Impact Factor 2024-2025 - ResearchHelpDesk [Dataset]. https://www.researchhelpdesk.org/journal/impact-factor-if/211/australian-and-new-zealand-journal-of-statistics
    Explore at:
    Dataset updated
    Feb 23, 2022
    Dataset authored and provided by
    Research Help Desk
    Description

    Australian and New Zealand journal of statistics Impact Factor 2024-2025 - ResearchHelpDesk - The Australian & New Zealand Journal of Statistics is an international journal managed jointly by the Statistical Society of Australia and the New Zealand Statistical Association. Its purpose is to report significant and novel contributions in statistics, ranging across articles on statistical theory, methodology, applications and computing. The journal has a particular focus on statistical techniques that can be readily applied to real-world problems, and on application papers with an Australasian emphasis. Outstanding articles submitted to the journal may be selected as Discussion Papers, to be read at a meeting of either the Statistical Society of Australia or the New Zealand Statistical Association.

    The main body of the journal is divided into three sections. The Theory and Methods Section publishes papers containing original contributions to the theory and methodology of statistics, econometrics and probability, and seeks papers motivated by a real problem that demonstrate the proposed theory or methodology in that situation. There is a strong preference for papers motivated by, and illustrated with, real data. The Applications Section publishes papers demonstrating applications of statistical techniques to problems faced by users of statistics in the sciences, government and industry. A particular focus is the application of newly developed statistical methodology to real data and the demonstration of better use of established statistical methodology in an area of application. It seeks to aid teachers of statistics by placing statistical methods in context. The Statistical Computing Section publishes papers containing new algorithms, code snippets, or software descriptions (for open source software only) which enhance the field through the application of computing. Preference is given to papers featuring publicly available code and/or data, and to those motivated by statistical methods for practical problems. In addition, suitable review papers and articles of historical and general interest will be considered. The journal also publishes book reviews on a regular basis.

    Abstracting and Indexing Information:

    • Academic Search (EBSCO Publishing)
    • Academic Search Alumni Edition (EBSCO Publishing)
    • Academic Search Elite (EBSCO Publishing)
    • Academic Search Premier (EBSCO Publishing)
    • CompuMath Citation Index (Clarivate Analytics)
    • Current Index to Statistics (ASA/IMS)
    • Journal Citation Reports/Science Edition (Clarivate Analytics)
    • Mathematical Reviews/MathSciNet/Current Mathematical Publications (AMS)
    • RePEc: Research Papers in Economics
    • Science Citation Index Expanded (Clarivate Analytics)
    • SCOPUS (Elsevier)
    • Statistical Theory & Method Abstracts (Zentralblatt MATH)
    • ZBMATH (Zentralblatt MATH)

  4. Data from: Data Fission: Splitting a Single Data Point

    • tandf.figshare.com
    txt
    Updated Dec 14, 2023
    Cite
    James Leiner; Boyan Duan; Larry Wasserman; Aaditya Ramdas (2023). Data Fission: Splitting a Single Data Point [Dataset]. http://doi.org/10.6084/m9.figshare.24328745.v2
    Explore at:
    Available download formats: txt
    Dataset updated
    Dec 14, 2023
    Dataset provided by
    Taylor & Francis (https://taylorandfrancis.com/)
    Authors
    James Leiner; Boyan Duan; Larry Wasserman; Aaditya Ramdas
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Suppose we observe a random vector X from some distribution in a known family with unknown parameters. We ask the following question: when is it possible to split X into two pieces f(X) and g(X) such that neither part is sufficient to reconstruct X by itself, but both together can recover X fully, and their joint distribution is tractable? One common solution to this problem when multiple samples of X are observed is data splitting, but Rasines and Young offer an alternative approach that uses additive Gaussian noise—this enables post-selection inference in finite samples for Gaussian distributed data and asymptotically when errors are non-Gaussian. In this article, we offer a more general methodology for achieving such a split in finite samples by borrowing ideas from Bayesian inference to yield a (frequentist) solution that can be viewed as a continuous analog of data splitting. We call our method data fission, as an alternative to data splitting, data carving and p-value masking. We exemplify the method on several prototypical applications, such as post-selection inference for trend filtering and other regression problems, and effect size estimation after interactive multiple testing. Supplementary materials for this article are available online.
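
    The additive-Gaussian splitting idea can be illustrated numerically. The sketch below is only an illustration (not code from the paper): for X ~ N(mu, sigma^2) it forms f(X) = X + tau*Z and g(X) = X - Z/tau with independent Z ~ N(0, sigma^2), so the two pieces are uncorrelated (and, being jointly Gaussian, independent), yet X = (f(X) + tau^2 * g(X)) / (1 + tau^2) recovers the original data exactly; the choice of tau and the variances here are illustrative assumptions.

     import numpy as np

     rng = np.random.default_rng(0)
     mu, sigma, tau, n = 3.0, 2.0, 1.0, 100_000

     X = rng.normal(mu, sigma, n)    # observed data
     Z = rng.normal(0.0, sigma, n)   # auxiliary Gaussian noise, independent of X

     fX = X + tau * Z                # first piece
     gX = X - Z / tau                # second piece

     # The pieces are uncorrelated (jointly Gaussian, hence independent)...
     print(np.corrcoef(fX, gX)[0, 1])            # approximately 0
     # ...yet together they reconstruct X exactly:
     X_rec = (fX + tau**2 * gX) / (1 + tau**2)
     print(np.allclose(X_rec, X))                # True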

  5. A Review of Published Analyses of Case-Cohort Studies and Recommendations for Future Reporting

    • plos.figshare.com
    docx
    Updated May 31, 2023
    Cite
    Stephen J. Sharp; Manon Poulaliou; Simon G. Thompson; Ian R. White; Angela M. Wood (2023). A Review of Published Analyses of Case-Cohort Studies and Recommendations for Future Reporting [Dataset]. http://doi.org/10.1371/journal.pone.0101176
    Explore at:
    Available download formats: docx
    Dataset updated
    May 31, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Stephen J. Sharp; Manon Poulaliou; Simon G. Thompson; Ian R. White; Angela M. Wood
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The case-cohort study design combines the advantages of a cohort study with the efficiency of a nested case-control study. However, unlike more standard observational study designs, there are currently no guidelines for reporting results from case-cohort studies. Our aim was to review recent practice in reporting these studies, and develop recommendations for the future. By searching papers published in 24 major medical and epidemiological journals between January 2010 and March 2013 using PubMed, Scopus and Web of Knowledge, we identified 32 papers reporting case-cohort studies. The median subcohort sampling fraction was 4.1% (interquartile range 3.7% to 9.1%). The papers varied in their approaches to describing the numbers of individuals in the original cohort and the subcohort, presenting descriptive data, and in the level of detail provided about the statistical methods used, so it was not always possible to be sure that appropriate analyses had been conducted. Based on the findings of our review, we make recommendations about reporting of the study design, subcohort definition, numbers of participants, descriptive information and statistical methods, which could be used alongside existing STROBE guidelines for reporting observational studies.

  6. Calculus Video Worked Example Data

    • data.mendeley.com
    Updated Apr 12, 2019
    Cite
    Jamison Judd (2019). Calculus Video Worked Example Data [Dataset]. http://doi.org/10.17632/t3xr5j67fd.1
    Explore at:
    Dataset updated
    Apr 12, 2019
    Authors
    Jamison Judd
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Summary data from a Calculus II class where students were required to watch an instructional video before or after lecture. Dataset includes gender (1=female; 2=male), vgroup (-1=before lecture; 1=after lecture), binary flag for 26 individual videos (1=watched 80% or more of length of video; 0=not watched), videosum (sum of number of videos watched), final_raw (raw grade student received on cumulative final course exam), sat_math (scaled SAT-Math score out of 800), math_place (institutional calculus readiness score out of 100), watched20 (grouping flag for students who watched 20 or more videos).
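
    A minimal pandas sketch for exploring these summary variables is shown below. It is an illustration only: the CSV file name is hypothetical, and it assumes the columns named in the description above.

     import pandas as pd

     # Hypothetical export of the summary data with the columns described above.
     df = pd.read_csv("calculus_video_data.csv")

     # Compare final exam scores for the before-lecture (-1) and after-lecture (1)
     # video groups, and for the watched20 grouping flag.
     print(df.groupby("vgroup")["final_raw"].agg(["mean", "std", "count"]))
     print(df.groupby("watched20")["final_raw"].mean())

     # Relationship between videos watched, readiness measures, and the final exam.
     print(df[["videosum", "final_raw", "sat_math", "math_place"]].corr())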

  7. gsm8k

    • huggingface.co
    Updated Aug 11, 2022
    Cite
    OpenAI (2022). gsm8k [Dataset]. https://huggingface.co/datasets/openai/gsm8k
    Explore at:
    Available download formats: Croissant (a machine-learning dataset format; learn more at mlcommons.org/croissant)
    Dataset updated
    Aug 11, 2022
    Dataset authored and provided by
    OpenAI (http://openai.com/)
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset Card for GSM8K

      Dataset Summary
    

    GSM8K (Grade School Math 8K) is a dataset of 8.5K high-quality, linguistically diverse grade school math word problems. The dataset was created to support the task of question answering on basic mathematical problems that require multi-step reasoning.

    These problems take between 2 and 8 steps to solve. Solutions primarily involve performing a sequence of elementary calculations using basic arithmetic operations (+, −, ×, ÷) to reach the… See the full description on the dataset page: https://huggingface.co/datasets/openai/gsm8k.
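
    A minimal loading sketch with the Hugging Face datasets library is shown below. The "main" configuration name is the commonly used one but should be checked against the dataset page; the final-answer convention (a line beginning with "####") follows the dataset's documented solution format.

     from datasets import load_dataset

     # Load GSM8K from the Hugging Face Hub.
     gsm8k = load_dataset("openai/gsm8k", "main")

     example = gsm8k["train"][0]
     print(example["question"])
     # Solutions end with a line of the form "#### <final answer>".
     print(example["answer"].split("####")[-1].strip())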

  8. Mathematics Dataset

    • github.com
    • opendatalab.com
    Updated Apr 3, 2019
    Cite
    DeepMind (2019). Mathematics Dataset [Dataset]. https://github.com/Wikidepia/mathematics_dataset_id
    Explore at:
    Dataset updated
    Apr 3, 2019
    Dataset provided by
    DeepMind (http://deepmind.com/)
    Description

    This dataset consists of mathematical question and answer pairs, from a range of question types at roughly school-level difficulty. This is designed to test the mathematical learning and algebraic reasoning skills of learning models.

    Example questions

     Question: Solve -42*r + 27*c = -1167 and 130*r + 4*c = 372 for r.
     Answer: 4
     
     Question: Calculate -841880142.544 + 411127.
     Answer: -841469015.544
     
     Question: Let x(g) = 9*g + 1. Let q(c) = 2*c + 1. Let f(i) = 3*i - 39. Let w(j) = q(x(j)). Calculate f(w(a)).
     Answer: 54*a - 30
    

    It contains 2 million (question, answer) pairs per module, with questions limited to 160 characters in length and answers to 30 characters in length. Note that the training data for each question type is split into "train-easy", "train-medium", and "train-hard", which allows training models via a curriculum. The data can also be mixed together uniformly from these training datasets to obtain the results reported in the paper. A short loading sketch follows the category list below. Categories:

    • algebra (linear equations, polynomial roots, sequences)
    • arithmetic (pairwise operations and mixed expressions, surds)
    • calculus (differentiation)
    • comparison (closest numbers, pairwise comparisons, sorting)
    • measurement (conversion, working with time)
    • numbers (base conversion, remainders, common divisors and multiples, primality, place value, rounding numbers)
    • polynomials (addition, simplification, composition, evaluating, expansion)
    • probability (sampling without replacement)
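
    The sketch below reads one module of the pre-generated text release. It assumes the common layout in which question and answer lines alternate within each module file; the file path shown is hypothetical, so adjust it to the extracted archive.

     from pathlib import Path

     def read_module(path):
         """Pair up consecutive lines as (question, answer) tuples."""
         lines = Path(path).read_text(encoding="utf-8").splitlines()
         return list(zip(lines[0::2], lines[1::2]))

     # Hypothetical path into the extracted pre-generated data.
     pairs = read_module("train-easy/algebra__linear_1d.txt")
     question, answer = pairs[0]
     print(question)
     print(answer)
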
  9. Math problems IMO

    • kaggle.com
    zip
    Updated Jan 15, 2025
    Cite
    Artem Goncharov (2025). Math problems IMO [Dataset]. https://www.kaggle.com/datasets/artemgoncarov/math-problems-imo
    Explore at:
    Available download formats: zip (66054740 bytes)
    Dataset updated
    Jan 15, 2025
    Authors
    Artem Goncharov
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Data with 100,000 diverse problems from international math olympiads (AIME, IMO, etc.).

    You can use it, for example, for RAG systems or to fine-tune a model. If you find it useful, please upvote. Enjoy working with the data!

  10. Australian and New Zealand journal of statistics Acceptance Rate - ResearchHelpDesk

    • researchhelpdesk.org
    Updated Mar 23, 2022
    Cite
    Research Help Desk (2022). Australian and New Zealand journal of statistics Acceptance Rate - ResearchHelpDesk [Dataset]. https://www.researchhelpdesk.org/journal/acceptance-rate/211/australian-and-new-zealand-journal-of-statistics
    Explore at:
    Dataset updated
    Mar 23, 2022
    Dataset authored and provided by
    Research Help Desk
    Description

    Australian and New Zealand journal of statistics Acceptance Rate - ResearchHelpDesk - The Australian & New Zealand Journal of Statistics is an international journal managed jointly by the Statistical Society of Australia and the New Zealand Statistical Association. Its purpose is to report significant and novel contributions in statistics, ranging across articles on statistical theory, methodology, applications and computing. The journal has a particular focus on statistical techniques that can be readily applied to real-world problems, and on application papers with an Australasian emphasis. Outstanding articles submitted to the journal may be selected as Discussion Papers, to be read at a meeting of either the Statistical Society of Australia or the New Zealand Statistical Association.

    The main body of the journal is divided into three sections. The Theory and Methods Section publishes papers containing original contributions to the theory and methodology of statistics, econometrics and probability, and seeks papers motivated by a real problem that demonstrate the proposed theory or methodology in that situation. There is a strong preference for papers motivated by, and illustrated with, real data. The Applications Section publishes papers demonstrating applications of statistical techniques to problems faced by users of statistics in the sciences, government and industry. A particular focus is the application of newly developed statistical methodology to real data and the demonstration of better use of established statistical methodology in an area of application. It seeks to aid teachers of statistics by placing statistical methods in context. The Statistical Computing Section publishes papers containing new algorithms, code snippets, or software descriptions (for open source software only) which enhance the field through the application of computing. Preference is given to papers featuring publicly available code and/or data, and to those motivated by statistical methods for practical problems. In addition, suitable review papers and articles of historical and general interest will be considered. The journal also publishes book reviews on a regular basis.

    Abstracting and Indexing Information:

    • Academic Search (EBSCO Publishing)
    • Academic Search Alumni Edition (EBSCO Publishing)
    • Academic Search Elite (EBSCO Publishing)
    • Academic Search Premier (EBSCO Publishing)
    • CompuMath Citation Index (Clarivate Analytics)
    • Current Index to Statistics (ASA/IMS)
    • Journal Citation Reports/Science Edition (Clarivate Analytics)
    • Mathematical Reviews/MathSciNet/Current Mathematical Publications (AMS)
    • RePEc: Research Papers in Economics
    • Science Citation Index Expanded (Clarivate Analytics)
    • SCOPUS (Elsevier)
    • Statistical Theory & Method Abstracts (Zentralblatt MATH)
    • ZBMATH (Zentralblatt MATH)

  11. Instructor Guide: Integrating Leadership Roles, Artificial Intelligence, PhET Simulation, HHMI-Biointeractive Data Explorer and Google Tools to understand Mathematics and Statistics

    • qubeshub.org
    Updated Jan 4, 2025
    Cite
    Dr Pankaj Mehrotra (2025). Instructor Guide: Integrating Leadership Roles, Artificial Intelligence, PhET Simulation, HHMI-Biointeractive Data Explorer and Google Tools to understand Mathematics and Statistics. [Dataset]. http://doi.org/10.25334/KMDZ-N209
    Explore at:
    Dataset updated
    Jan 4, 2025
    Dataset provided by
    QUBES
    Authors
    Dr Pankaj Mehrotra
    Description

    Mathematical and statistical analysis skills are important skills to include in the course curriculum. Together or individually, these skills can advance knowledge, critical thinking, and creativity. In this guide, I provide an overview of how leadership roles, AI skills, simulation-based learning, and Google tools can be integrated into class activities to help students understand applications of mathematical and statistical concepts such as sum, mean, data, and data analysis. Through these activities, students develop an understanding that mathematics and statistics are interdependent and cross disciplines. Students learn mathematical concepts and their real-life and research applications through PhET Simulation, then collect data and apply data organization, analysis, and statistics through the HHMI-Biointeractive Data Explorer, which introduces key concepts in both mathematics and statistics.

  12. Data from: Exploring Human-Like Mathematical Reasoning: Perspectives on Generalizability and Efficiency

    • curate.nd.edu
    pdf
    Updated Dec 3, 2024
    Cite
    Zhenwen Liang (2024). Exploring Human-Like Mathematical Reasoning: Perspectives on Generalizability and Efficiency [Dataset]. http://doi.org/10.7274/27895872.v1
    Explore at:
    Available download formats: pdf
    Dataset updated
    Dec 3, 2024
    Dataset provided by
    University of Notre Dame
    Authors
    Zhenwen Liang
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Mathematical reasoning, a fundamental aspect of human cognition, poses significant challenges for artificial intelligence (AI) systems. Despite recent advancements in natural language processing (NLP) and large language models (LLMs), AI's ability to replicate human-like reasoning, generalization, and efficiency remains an ongoing research challenge. In this dissertation, we address key limitations in math word problem (MWP) solving, focusing on the accuracy, generalization ability, and efficiency of AI-based mathematical reasoners by applying human-like reasoning methods and principles.

    This dissertation introduces several innovative approaches in mathematical reasoning. First, a numeracy-driven framework is proposed to enhance math word problem (MWP) solvers by integrating numerical reasoning into model training, surpassing human-level performance on benchmark datasets. Second, a novel multi-solution framework captures the diversity of valid solutions to math problems, improving the generalization capabilities of AI models. Third, a customized knowledge distillation technique, termed Customized Exercise for Math Learning (CEMAL), is developed to create tailored exercises for smaller models, significantly improving their efficiency and accuracy in solving MWPs. Additionally, a multi-view fine-tuning paradigm (MinT) is introduced to enable smaller models to handle diverse annotation styles from different datasets, improving their adaptability and generalization. To further advance mathematical reasoning, a benchmark, MathChat, is introduced to evaluate large language models (LLMs) in multi-turn reasoning and instruction-following tasks, demonstrating significant performance improvements. Finally, new inference-time verifiers, Math-Rev and Code-Rev, are developed to enhance reasoning verification, combining language-based and code-based solutions for improved accuracy in both math and code reasoning tasks.

    In summary, this dissertation provides a comprehensive exploration of these challenges and contributes novel solutions that push the boundaries of AI-driven mathematical reasoning. Potential future research directions are also discussed to further extend the impact of this dissertation.

  13. Data from: Dataset of the study: "Chatbots put to the test in math and logic problems: A preliminary comparison and assessment of ChatGPT-3.5, ChatGPT-4, and Google Bard"

    • researchdata.bath.ac.uk
    Updated May 20, 2023
    Cite
    Vagelis Plevris; George Papazafeiropoulos; Alejandro Jimenez Rios (2023). Dataset of the study: "Chatbots put to the test in math and logic problems: A preliminary comparison and assessment of ChatGPT-3.5, ChatGPT-4, and Google Bard" [Dataset]. http://doi.org/10.5281/zenodo.7940781
    Explore at:
    Dataset updated
    May 20, 2023
    Dataset provided by
    Zenodo
    Authors
    Vagelis Plevris; George Papazafeiropoulos; Alejandro Jimenez Rios
    Dataset funded by
    Oslo Metropolitan University
    Description

    This dataset contains the 30 questions that were posed to the chatbots (i) ChatGPT-3.5; (ii) ChatGPT-4; and (iii) Google Bard, in May 2023 for the study “Chatbots put to the test in math and logic problems: A preliminary comparison and assessment of ChatGPT-3.5, ChatGPT-4, and Google Bard”. These 30 questions describe mathematics and logic problems that have a unique correct answer. The questions are fully described with plain text only, without the need for any images or special formatting. The questions are divided into two sets of 15 questions each (Set A and Set B). The questions of Set A are 15 “Original” problems that cannot be found online, at least in their exact wording, while Set B contains 15 “Published” problems that one can find online by searching on the internet, usually with their solution. Each question is posed three times to each chatbot.

    This dataset contains the following: (i) The full set of the 30 questions, A01-A15 and B01-B15; (ii) the correct answer for each one of them; (iii) an explanation of the solution, for the problems where such an explanation is needed, (iv) the 30 (questions) × 3 (chatbots) × 3 (answers) = 270 detailed answers of the chatbots. For the published problems of Set B, we also provide a reference to the source where each problem was taken from.

  14. Math CoT Arabic English Reasoning

    • kaggle.com
    zip
    Updated May 16, 2025
    Cite
    Miscovery (2025). Math CoT Arabic English Reasoning [Dataset]. https://www.kaggle.com/datasets/miscovery/math-cot-arabic-english-reasoning
    Explore at:
    Available download formats: zip (920398 bytes)
    Dataset updated
    May 16, 2025
    Authors
    Miscovery
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Math CoT Arabic English Dataset

    A high-quality, bilingual (English & Arabic) dataset for Chain-of-Thought (COT) reasoning in mathematics and related disciplines, developed by Miscovery AI.

    Overview

    Math-COT is a unique dataset designed to facilitate and benchmark the development of chain-of-thought reasoning capabilities in language models across mathematical domains. With meticulously crafted examples, explicit reasoning steps, and bilingual support, this dataset offers a robust foundation for training and evaluating mathematical reasoning abilities.

    Key Features

    • 99% Clean & High-Quality Data: Human-reviewed, accurately annotated examples with verified solutions
    • Bilingual Support: Complete English and Arabic parallel content for cross-lingual research and applications
    • Structured Reasoning Steps: Each problem solution is broken down into explicit step-by-step reasoning
    • Diverse Subject Coverage: Spans 21 different categories within mathematics and adjacent fields
    • Comprehensive Format: Includes questions, answers, reasoning chains, and relevant metadata

    Dataset Structure

    Each entry in the dataset contains the following fields:

    {
     "en_question": "Question text in English",
     "ar_question": "Question text in Arabic",
     "en_answer": "Detailed step-by-step solution in English",
     "ar_answer": "Detailed step-by-step solution in Arabic",
     "category": "Mathematical category",
     "en_q_word": "Word count of English question",
     "ar_q_word": "Word count of Arabic question",
     "en_a_word": "Word count of English answer",
     "ar_a_word": "Word count of Arabic answer"
    }
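
    A minimal loading sketch for this structure is shown below. It is an illustration only: the file name is hypothetical (the Kaggle download may ship as CSV or JSON), and it assumes the fields listed above.

     import pandas as pd

     # Hypothetical file name for the extracted Kaggle download.
     df = pd.read_csv("math_cot_arabic_english.csv")

     # Keep only probability problems and compare question lengths across languages.
     prob = df[df["category"] == "Mathematics - Probability"]
     print(prob[["en_q_word", "ar_q_word"]].describe())

     # Show one bilingual question/answer pair.
     row = prob.iloc[0]
     print(row["en_question"])
     print(row["en_answer"][:300])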
    

    Categories

    The dataset covers 21 distinct categories:

    1. Mathematics - Arithmetic
    2. Mathematics - Algebra
    3. Mathematics - Geometry
    4. Mathematics - Trigonometry
    5. Mathematics - Calculus
    6. Mathematics - Linear Algebra
    7. Mathematics - Probability
    8. Mathematics - Statistics
    9. Mathematics - Set Theory
    10. Mathematics - Number Theory
    11. Mathematics - Discrete Math
    12. Mathematics - Topology
    13. Mathematics - Differential Equations
    14. Mathematics - Real Analysis
    15. Math Puzzles
    16. Linguistics
    17. Logic and Reasoning
    18. Philosophy
    19. Sports and Games
    20. Psychology
    21. Cultural Traditions

    Example

    Here's a sample entry from the dataset:

    {
     "en_question": "A bag contains only red and blue balls. If one ball is drawn at random, the probability that it is red is 2/5. If 8 more red balls are added, the probability of drawing a red ball becomes 4/5. How many blue balls are there in the bag?",
     "ar_question": "تحتوي الحقيبة على كرات حمراء وزرقاء فقط. إذا تم سحب كرة واحدة عشوائيًا ، فإن احتمال أن تكون حمراء هو 2/5. إذا تمت إضافة 8 كرات حمراء أخرى ، يصبح احتمال سحب كرة حمراء 4/5. كم عدد الكرات الزرقاء الموجودة في الحقيبة؟",
    

    Usage

    This dataset is especially valuable for:

    • Training and evaluating mathematical reasoning in language models
    • Research on step-by-step problem solving approaches
    • Developing educational AI assistants for mathematics
    • Cross-lingual research on mathematical reasoning
    • Benchmarking Chain-of-Thought (COT) capabilities

    Citation

    If you use this dataset in your research, please cite:

    @dataset{miscoveryai2025mathcot,
     title={Math CoT Arabic English Reasoning: A Bilingual Dataset for Chain-of-Thought Mathematical Reasoning},
     author={Miscovery AI},
     year={2025},
     publisher={Kaggle},
     url={https://www.kaggle.com/datasets/miscovery/math-cot-arabic-english-reasoning}
    }
    

    License

    This project is licensed under the MIT License - see the LICENSE file for details.

    Contact

    For questions, feedback, or issues related to this dataset, please contact Miscovery AI at info@miscovery.com.

  15. Math Dataset

    • kaggle.com
    • opendatalab.com
    zip
    Updated Mar 12, 2024
    Cite
    Awsaf (2024). Math Dataset [Dataset]. https://www.kaggle.com/datasets/awsaf49/math-dataset
    Explore at:
    Available download formats: zip (7412179 bytes)
    Dataset updated
    Mar 12, 2024
    Authors
    Awsaf
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset

    This dataset was created by Awsaf

    Released under MIT

    Contents

    Reference: https://github.com/hendrycks/math/
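
    In the upstream MATH release referenced above, problems are typically stored as small per-problem JSON files with problem, level, type, and solution fields. The sketch below assumes that layout and a hypothetical directory path, so verify both against the extracted archive.

     import json
     from pathlib import Path

     # Hypothetical path into the extracted archive; adjust subject/split as needed.
     problems = []
     for path in Path("MATH/train/algebra").glob("*.json"):
         with open(path, encoding="utf-8") as f:
             problems.append(json.load(f))

     print(len(problems))
     print(problems[0]["level"], problems[0]["type"])
     print(problems[0]["problem"])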

  16. ASSISTments Replication Study - 2019-2020 cohort

    • openicpsr.org
    delimited
    Updated Dec 22, 2022
    Cite
    Mingyu Feng; Neil Heffernan; Robert Murphy; Jeremy Roschelle (2022). ASSISTments Replication Study - 2019-2020 cohort [Dataset]. http://doi.org/10.3886/E183645V1
    Explore at:
    Available download formats: delimited
    Dataset updated
    Dec 22, 2022
    Dataset provided by
    Digital Promise
    SRI
    WestEd
    Worcester Polytechnic Institute
    Authors
    Mingyu Feng; Neil Heffernan; Robert Murphy; Jeremy Roschelle
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Area covered
    United States, North Carolina
    Description

    The purpose of the ASSISTments Replication Study is to conduct a replication study of the impact of a fully developed, widely adopted intervention called ASSISTments on middle school student mathematics outcomes. ASSISTments is an online formative assessment platform that provides immediate feedback to students and supports teachers in their use of homework to improve math instruction and learning. Findings from a previous IES-funded efficacy study, conducted in Maine, indicated this intervention led to beneficial impacts on student learning outcomes in 7th grade. The current study examined the impacts of this intervention with a more diverse sample and relied on trained local math coaches (instead of the intervention developers) to provide professional development and support to teachers. Participating schools (and all 7th grade math teachers in the school) in this study were randomly assigned to either a treatment or control group. Teachers participated in the project over a two-year period, the 2018-19 school year and the 2019-20 school year. The 2018-19 school year was to serve as a ramp-up year. Data used in the final analysis was collected during the second year of the study, the 2019-20 school year. The data contained in this project is primarily from the 2019-20 school year and includes student ASSISTments usage data, teacher ASSISTments usage data, student outcome data, and teacher instructional log data. Student outcome data is from the online Mathematics Readiness Test for Grade 8 developed by the Math Diagnostic Test Project (MDTP). The teacher instructional log asked teachers to answer questions about their daily instructional practices over the span of 5 consecutive days of instruction. They were asked to participate in 3 rounds of logs over the course of the 2019-2020 school year. Student and teacher usage data of ASSISTments were collected automatically as they used the system. The usage data was limited to the treatment group only. Other data (outcome data, teacher instructional log data) were collected from both treatment and control groups.

  17. Linear Regression (Excel) and Cellular Respiration for Biology, Chemistry and Mathematics

    • qubeshub.org
    Updated Jan 11, 2022
    Cite
    Irene Corriette; Beatriz Gonzalez; Daniela Kitanska; Henriette Mozsolits; Sheela Vemu (2022). Linear Regression (Excel) and Cellular Respiration for Biology, Chemistry and Mathematics [Dataset]. http://doi.org/10.25334/5PX5-H796
    Explore at:
    Dataset updated
    Jan 11, 2022
    Dataset provided by
    QUBES
    Authors
    Irene Corriette; Beatriz Gonzalez; Daniela Kitanska; Henriette Mozsolits; Sheela Vemu
    Description

    Students typically find linear regression analysis of data sets in a biology classroom challenging. These activities could be used in a Biology, Chemistry, Mathematics, or Statistics course. The collection provides student activity files with Excel instructions and Instructor Activity files with Excel instructions and solutions to problems.

    Students will be able to perform linear regression analysis, find the correlation coefficient, create a scatter plot, and find the r-squared value using MS Excel 365. Students will be able to interpret data sets, describe the relationship between biological variables, and predict the value of an output variable based on the input of a predictor variable.
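
    The same computation the Excel activity walks through (fitted line, correlation coefficient, r-squared, and prediction) can also be scripted. The Python sketch below is not part of the published activity and uses made-up illustrative data.

     import numpy as np
     from scipy import stats

     # Illustrative data only (e.g. substrate concentration vs. reaction rate).
     x = np.array([0.5, 1.0, 1.5, 2.0, 2.5, 3.0])
     y = np.array([1.1, 1.9, 3.2, 3.9, 5.1, 5.8])

     result = stats.linregress(x, y)
     print(f"slope = {result.slope:.3f}, intercept = {result.intercept:.3f}")
     print(f"correlation coefficient r = {result.rvalue:.3f}")
     print(f"r-squared = {result.rvalue**2:.3f}")

     # Predict the output for a new value of the predictor variable.
     print(result.slope * 1.75 + result.intercept)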

  18. Data from: Learning Mathematics for Life A Perspective from PISA

    • catalog.data.gov
    • datasets.ai
    Updated Mar 30, 2021
    Cite
    U.S. Department of State (2021). Learning Mathematics for Life A Perspective from PISA [Dataset]. https://catalog.data.gov/dataset/learning-mathematics-for-life-a-perspective-from-pisa
    Explore at:
    Dataset updated
    Mar 30, 2021
    Dataset provided by
    United States Department of State (http://state.gov/)
    Area covered
    Pisa
    Description

    People from many countries have expressed interest in the tests students take for the Programme for International Student Assessment (PISA). Learning Mathematics for Life examines the link between the PISA test requirements and student performance. It focuses specifically on the proportions of students who answer questions correctly across a range of difficulty. The questions are classified by content, competencies, context and format, and the connections between these and student performance are then analysed. This analysis has been carried out in an effort to link PISA results to curricular programmes and structures in participating countries and economies. Results from the student assessment reflect differences in country performance in terms of the test questions. These findings are important for curriculum planners, policy makers and in particular teachers – especially mathematics teachers of intermediate and lower secondary school classes.

  19. Data from: MLFMF: Data Sets for Machine Learning for Mathematical Formalization

    • data.niaid.nih.gov
    Updated Oct 26, 2023
    Cite
    Bauer, Andrej; Petković, Matej; Todorovski, Ljupčo (2023). MLFMF: Data Sets for Machine Learning for Mathematical Formalization [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10041074
    Explore at:
    Dataset updated
    Oct 26, 2023
    Dataset provided by
    University of Ljubljana
    Institute of Mathematics, Physics, and Mechanics
    Authors
    Bauer, Andrej; Petković, Matej; Todorovski, Ljupčo
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    MLFMF

    MLFMF (Machine Learning for Mathematical Formalization) is a collection of data sets for benchmarking recommendation systems used to support the formalization of mathematics with proof assistants. These systems help humans identify which previous entries (theorems, constructions, datatypes, and postulates) are relevant in proving a new theorem or carrying out a new construction. The MLFMF data sets provide solid benchmarking support for further investigation of the numerous machine learning approaches to formalized mathematics. With more than 250,000 entries in total, this is currently the largest collection of formalized mathematical knowledge in machine-learnable format. In addition to benchmarking recommendation systems, the data sets can also be used for benchmarking node classification and link prediction algorithms.

    The four data sets

    Each data set is derived from a library of formalized mathematics written in the proof assistants Agda or Lean. The collection includes:

    • the largest Lean 4 library, Mathlib, and
    • the three largest Agda libraries: the standard library, the library of univalent mathematics Agda-unimath, and the TypeTopology library.

    Each data set represents the corresponding library in two ways: as a heterogeneous network, and as a list of syntax trees of all the entries in the library. The network contains the (modular) structure of the library and the references between entries, while the syntax trees give complete and easily parsed information about each entry. The Lean library data set was obtained by converting .olean files into s-expressions (see the lean2sexp tool). The Agda data sets were obtained with an s-expression extension of the official Agda repository (use either the master-sexp or the release-2.6.3-sexp branch). For more details, see our arXiv copy of the paper.

    Directory structure

    First, the mlfmf.zip archive needs to be unzipped. It contains a separate directory for every library (for example, the standard library of Agda can be found in the stdlib directory) and some auxiliary files. Every library directory contains:

    • the network file from which the heterogeneous network can be loaded, and
    • a zip of the entries directory that contains (many) files with abstract syntax trees; each of those files describes a single entry of the library.

    In addition to the auxiliary files which are used for loading the data (and described below), the zipped sources of lean2sexp and the Agda s-expression extension are present.

    Loading the data

    In addition to the data files, there is also a simple Python script main.py for loading the data. To run it, you will have to install the packages listed in the file requirements.txt: tqdm and networkx. The easiest way to do so is calling pip install -r requirements.txt. When running main.py for the first time, the script will unzip the entry files into the directory named entries. After that, the script loads the syntax trees of the entries (see the Entry class) and the network (as a networkx.MultiDiGraph object).

    Note: the entry files have the extension .dag (directed acyclic graph), since Lean uses node sharing, which breaks the tree structure (a shared node has more than one parent node).

    More information

    For more information about the data collection process, a detailed data (and data format) description, and baseline experiments that were already performed with these data, see our arXiv copy of the paper. For the code that was used to perform the experiments and the data format description, visit our GitHub repository https://github.com/ul-fmf/mlfmf-data.

    Funding

    Since not all the funders are available in Zenodo's database, we list them here:

    This material is based upon work supported by the Air Force Office of Scientific Research under award number FA9550-21-1-0024. The authors also acknowledge the financial support of the Slovenian Research Agency via the research core funding No. P2-0103 and No. P1-0294.
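
    Returning to the loading step above: once main.py has loaded a library's network as a networkx.MultiDiGraph, typical first steps are simple structural queries. The sketch below uses a small toy graph as a stand-in for the loaded object (the node names and edge labels are made up), since the actual loader API is documented in the project's repository.

     import networkx as nx

     # Toy stand-in for the MultiDiGraph that main.py loads for a library;
     # node names and edge attributes here are made up for illustration only.
     G = nx.MultiDiGraph()
     G.add_edge("Nat.add_comm", "Nat.add", kind="reference")
     G.add_edge("Nat.add_comm", "Nat.succ_add", kind="reference")
     G.add_edge("Nat.succ_add", "Nat.add", kind="reference")

     print(G.number_of_nodes(), G.number_of_edges())

     # Entries most often referenced by others: a crude relevance signal of the
     # kind the recommendation benchmarks are built around.
     by_in_degree = sorted(G.in_degree(), key=lambda kv: kv[1], reverse=True)
     print(by_in_degree[:5])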

  20. Data from: Impacts of lessons management based on Mathematics words problems on learning

    • scielo.figshare.com
    jpeg
    Updated Jun 2, 2023
    Cite
    Maria Alice Veiga Ferreira de Souza (2023). Impacts of lessons management based on Mathematics words problems on learning [Dataset]. http://doi.org/10.6084/m9.figshare.5720452.v1
    Explore at:
    Available download formats: jpeg
    Dataset updated
    Jun 2, 2023
    Dataset provided by
    SciELO (http://www.scielo.org/)
    Authors
    Maria Alice Veiga Ferreira de Souza
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    ABSTRACT: This article presents potential successes and constraints observed in lessons based on written mathematics word problems and their impact on the learning of eighth-year students in Portuguese elementary school classes. The problems were proposed by future teachers during a supervised internship at the University of Lisbon. The data emerged from excerpts of interaction/intervention between a teacher-coach and three interns regarding their lessons based on written mathematics word problems. Successes identified include associating geometric figures with their algebraic expressions and guiding explanations through direct questions on the subject; constraints include confused mathematical concepts, written prompts with no meaning for students, and terms not properly contextualized for the mathematical setting. The research is grounded in the work of authors and researchers in the fields of problem solving, the comprehension of math problem statements, and training in/of teaching practice.
