35 datasets found
  1. Handwritten Math Expressions Dataset

    • kaggle.com
    Updated Dec 31, 2024
    Cite
    GOVINDARAM SRIRAM (2024). Handwritten Math Expressions Dataset [Dataset]. https://www.kaggle.com/datasets/govindaramsriram/handwritten-math-expressions-dataset/code
    Dataset updated
    Dec 31, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    GOVINDARAM SRIRAM
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Description:
    This dataset contains images of handwritten mathematical expressions paired with their corresponding textual representations and answers. The expressions include various arithmetic operations such as addition (+), subtraction (-), multiplication (*), division (÷), and parentheses for grouping operations. The dataset is designed to support tasks such as Optical Character Recognition (OCR), handwritten text recognition, and sequence modeling for solving mathematical expressions.

    Key Features:

    • Images: Contains high-quality images of handwritten mathematical equations.
    • Annotations: A CSV file with two columns:
      • Expression: The mathematical expression in text form.
      • Answer: The evaluated result of the expression.
    • Complexity: Includes basic operations, grouped expressions with parentheses, and diverse handwriting styles to simulate real-world challenges.
    • Applications: Ideal for developing and benchmarking OCR systems, training deep learning models, and fine-tuning pretrained models for handwritten text recognition.

    This dataset serves as a valuable resource for researchers and practitioners working on handwriting recognition and mathematical problem-solving automation.
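    The two-column annotation file described above lends itself to a quick consistency check. The following is a minimal sketch, assuming the column names Expression and Answer from the description, that division may appear as "÷", and that answers are numeric; the sample rows are made up and the helper names (evaluate, mismatched_rows) are hypothetical.

```python
import ast
import csv
import io
import operator

# Operators named in the dataset description (+, -, *, ÷).
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def evaluate(expression: str) -> float:
    """Evaluate a plain arithmetic expression without calling eval()."""
    # Assumption: division may be written as "÷" in the annotations.
    tree = ast.parse(expression.replace("÷", "/"), mode="eval")

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)
        ):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
            return _BIN_OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](walk(node.operand))
        raise ValueError(f"unsupported syntax in {expression!r}")

    return walk(tree)


def mismatched_rows(csv_text: str, tolerance: float = 1e-6) -> list[dict]:
    """Return rows whose Answer differs from the evaluated Expression."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        row
        for row in reader
        if abs(evaluate(row["Expression"]) - float(row["Answer"])) > tolerance
    ]


# Made-up sample rows; the real file name and values are not given in the listing.
sample = "Expression,Answer\n(2+3)*4,20\n8÷2-1,3\n7-2*3,2\n"
bad = mismatched_rows(sample)
```

    Walking the syntax tree instead of calling eval() keeps the check limited to the arithmetic operators the description lists.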

  2. Handwritten Math Equation Solver Dataset

    • universe.roboflow.com
    zip
    Updated Apr 24, 2024
    Cite
    Main Workspace (2024). Handwritten Math Equation Solver Dataset [Dataset]. https://universe.roboflow.com/main-workspace-ilksa/handwritten-math-equation-solver
    Available download formats: zip
    Dataset updated
    Apr 24, 2024
    Dataset authored and provided by
    Main Workspace
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Handwritten Math Numbers Bounding Boxes
    Description

    Handwritten Math Equation Solver

    ## Overview
    
    Handwritten Math Equation Solver is a dataset for object detection tasks - it contains Handwritten Math Numbers annotations for 420 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
    ## License
    
    This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  3. handwritten math symbol

    • kaggle.com
    Updated Dec 5, 2024
    Cite
    Băngf Hải (2024). handwritten math symbol [Dataset]. https://www.kaggle.com/datasets/banghai/handwritten-math-symbol/code
    Dataset updated
    Dec 5, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Băngf Hải
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Dataset

    This dataset was created by Băngf Hải

    Released under Apache 2.0


  4. Aida-Calculus-Math-Handwriting

    • huggingface.co
    Updated Jul 12, 2025
    Cite
    deep copy (2025). Aida-Calculus-Math-Handwriting [Dataset]. https://huggingface.co/datasets/deepcopy/Aida-Calculus-Math-Handwriting
    Dataset updated
    Jul 12, 2025
    Authors
    deep copy
    License

    https://choosealicense.com/licenses/cdla-sharing-1.0/

    Description

    Aida Calculus Math Handwriting Recognition Dataset (Downsampled Image-to-LaTeX Version)

    Synthetic handwritten calculus math expressions with downsampled images for LaTeX OCR and handwriting recognition tasks.

    Dataset Summary

    This is a processed version of the original Aida Calculus Math Handwriting Recognition Dataset, tailored specifically for image-to-LaTeX modeling. The dataset comprises synthetic handwritten calculus expressions, with each image annotated by a ground… See the full description on the dataset page: https://huggingface.co/datasets/deepcopy/Aida-Calculus-Math-Handwriting.

  5. Math Equations Dataset (AidaV7 Modified)

    • kaggle.com
    Updated Sep 1, 2024
    Cite
    Atharva Mishra (2024). Math Equations Dataset (AidaV7 Modified) [Dataset]. https://www.kaggle.com/datasets/theseus200719/math-equations-dataset-aidav7-modified
    Dataset updated
    Sep 1, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Atharva Mishra
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    I’ve modified the 100K image dataset of handwritten math equations from the AidaV7 Dataset to improve its usability for training models. The original dataset was divided into 10 folders, each containing 10K images, which made it challenging to train models that require a large volume of data simultaneously. I combined all the images into a single folder to address this. Additionally, I restructured the annotations, which were originally spread across multiple JSON files and stored as an array of dictionaries. The annotations weren’t in order, requiring repeated iterations through the array to locate the correct annotation for each image. To streamline this process, I merged all the JSON files into one and converted the data into a CSV file. In this CSV, each row represents an image filename, and the columns contain the corresponding annotations, making annotation retrieval faster during training.
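    The restructuring described above (several JSON files, each an unordered array of dictionaries, merged into one CSV keyed by image filename) could look roughly like this. It is only a sketch under stated assumptions: the key names "filename" and "latex", the sample records, and the helper names are hypothetical, since the listing does not give the actual annotation schema.

```python
import csv
import io
import json


def merge_annotations(json_documents: list[str], key: str = "filename") -> dict[str, dict]:
    """Merge several JSON arrays of dicts into one mapping keyed by image filename."""
    merged = {}
    for document in json_documents:
        for record in json.loads(document):
            merged[record[key]] = record
    return merged


def to_csv(merged: dict[str, dict], key: str = "filename") -> str:
    """Write one row per image, sorted by filename, with the key column first."""
    other_columns = sorted({c for record in merged.values() for c in record} - {key})
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[key] + other_columns, lineterminator="\n")
    writer.writeheader()
    for name in sorted(merged):
        writer.writerow(merged[name])
    return buffer.getvalue()


# Two made-up batch files with out-of-order records (hypothetical schema).
batch_1 = '[{"filename": "b.jpg", "latex": "x^2"}, {"filename": "a.jpg", "latex": "y"}]'
batch_2 = '[{"filename": "c.jpg", "latex": "z"}]'
merged = merge_annotations([batch_1, batch_2])
csv_text = to_csv(merged)
```

    Keying the merged records by filename replaces the repeated scans through an unordered array with a single dictionary lookup per image.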

  6. Handwritten Maths Operators 2 Dataset

    • universe.roboflow.com
    zip
    Updated Jul 24, 2024
    Cite
    asd (2024). Handwritten Maths Operators 2 Dataset [Dataset]. https://universe.roboflow.com/asd-0nnic/handwritten-maths-operators-2
    Available download formats: zip
    Dataset updated
    Jul 24, 2024
    Dataset authored and provided by
    asd
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Equal 6zFb Bounding Boxes
    Description

    Handwritten Maths Operators 2

    ## Overview
    
    Handwritten Maths Operators 2 is a dataset for object detection tasks - it contains Equal 6zFb annotations for 853 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
    ## License
    
    This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  7. ICDAR 2023 CROHME: Competition on Recognition of Handwritten Mathematical...

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Oct 10, 2023
    Cite
    XIE Yejing; Mouchère Harold; Simistira Liwicki Foteini; Rakesh Sumit; Saini Rajkumar; Nakagawa Masaki; Nguyen Cuong Tuan; Truong Thanh-Nghia (2023). ICDAR 2023 CROHME: Competition on Recognition of Handwritten Mathematical Expressions [Dataset]. http://doi.org/10.5281/zenodo.8428035
    Available download formats: zip
    Dataset updated
    Oct 10, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    XIE Yejing; Mouchère Harold; Simistira Liwicki Foteini; Rakesh Sumit; Saini Rajkumar; Nakagawa Masaki; Nguyen Cuong Tuan; Truong Thanh-Nghia
    License

    Attribution-NonCommercial-ShareAlike 3.0 (CC BY-NC-SA 3.0): https://creativecommons.org/licenses/by-nc-sa/3.0/
    License information was derived automatically

    Description

    These are the datasets collected for the Competition on Recognition of Online Handwritten Mathematical Expressions, held in the competition session of ICDAR 2023.
    Three tasks are proposed with different modalities: on-line, off-line, and bi-modal.
    For the on-line task, we provide .inkml files (containing trace information, MathML, and a LaTeX string) and a symbol-level label graph (SymLG) as ground truth. In addition to the new data and the previous CROHME data, we also provide a large amount of artificial on-line data in the training set.
    For the off-line task, .png images (scanned from paper or rendered from InkML) and a symbol-level label graph (SymLG) are provided. In addition to the new data and the previous CROHME data, we use off-line images from OffHME to increase the size of the training set.
    For the bi-modal task, both .inkml files and .png images are provided as a two-channel input, with SymLG as ground truth.

    All three tasks inherit the data collected for the previous six CROHME competitions, together with the new 2023 collection from three sites: Nantes (France), Luleå (Sweden), and Tokyo (Japan).
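    For readers unfamiliar with the .inkml files mentioned above, the sketch below reads the trace points and a LaTeX ground-truth string from an InkML document using only the Python standard library. It assumes the W3C InkML namespace, traces written as comma-separated "x y" pairs, and a ground-truth annotation with type="truth"; these conventions are common in CROHME files but are not confirmed by this listing, and the sample document is made up.

```python
import xml.etree.ElementTree as ET

# W3C InkML namespace; assumed to be the one used by the competition files.
INKML_NS = {"ink": "http://www.w3.org/2003/InkML"}


def parse_inkml(xml_text: str):
    """Return (ground-truth string or None, list of traces as (x, y) tuples)."""
    root = ET.fromstring(xml_text)

    truth = None
    for annotation in root.findall("ink:annotation", INKML_NS):
        # Assumption: the LaTeX ground truth is stored with type="truth".
        if annotation.get("type") == "truth":
            truth = (annotation.text or "").strip()

    traces = []
    for trace in root.findall("ink:trace", INKML_NS):
        points = []
        for point in (trace.text or "").strip().split(","):
            coords = point.split()
            if len(coords) >= 2:  # ignore any extra channels such as time
                points.append((float(coords[0]), float(coords[1])))
        traces.append(points)
    return truth, traces


# Made-up two-stroke sample, not taken from the dataset.
sample = """<ink xmlns="http://www.w3.org/2003/InkML">
  <annotation type="truth">$x+1$</annotation>
  <trace id="0">10 20, 11 21, 12 23</trace>
  <trace id="1">30 20, 30 25</trace>
</ink>"""
truth, traces = parse_inkml(sample)
```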

  8. fastmath-handwritten-math-to-latex

    • huggingface.co
    Updated Apr 2, 2025
    Cite
    deep copy (2025). fastmath-handwritten-math-to-latex [Dataset]. https://huggingface.co/datasets/deepcopy/fastmath-handwritten-math-to-latex
    Dataset updated
    Apr 2, 2025
    Authors
    deep copy
    Description

    deepcopy/fastmath-handwritten-math-to-latex dataset hosted on Hugging Face and contributed by the HF Datasets community

  9. Persian Handwritten Math Solutions

    • gts.ai
    json
    Updated Feb 6, 2025
    Cite
    GTS (2025). Persian Handwritten Math Solutions [Dataset]. https://gts.ai/dataset-download/page/9/
    Available download formats: json
    Dataset updated
    Feb 6, 2025
    Dataset provided by
    GLOBOSE TECHNOLOGY SOLUTIONS PRIVATE LIMITED
    Authors
    GTS
    Description

    Explore the Persian Handwritten Math Solutions Dataset with images and JSON annotations for formula recognition.

  10. Aida Calculus Math Handwriting Recognition Dataset

    • kaggle.com
    Updated Oct 5, 2020
    Cite
    Aida by Pearson (2020). Aida Calculus Math Handwriting Recognition Dataset [Dataset]. https://www.kaggle.com/aidapearson/ocr-data/discussion
    Dataset updated
    Oct 5, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Aida by Pearson
    License

    https://cdla.io/sharing-1-0/

    Description

    Context

    The Aida Calculus Math Handwriting Recognition Dataset consists of 100,000 images in 10 batches. Each image contains a photo of a handwritten calculus math expression (specifically within the topic of limits) written with a dark utensil on plain paper. Each image is accompanied by ground truth math expression in LaTeX as well as bounding boxes and pixel-level masks per character. All images are synthetically generated.


    Motivation

    The complexity of handwriting recognition for math expressions can be decomposed into the following sources of variability:

    Image of Math = Math Expression x Math Characters x Location of Math Characters x Visual Qualities of the Math Characters (fonts, color) x Noise of Image (backgrounds, stray marks)

    It is the job of the recognition model to take the Image of Math as input and predict the Math Expression.
    Typical approaches to handwriting recognition tasks involve collecting and tagging large amounts of data, on which many iterations of models are trained. The "one dataset, many models" paradigm has specific drawbacks within the context of product development. As product requirements evolve, such as the addition of a new mathematical character to the prediction space, a new data collection and tagging effort must be undertaken. The cycle of adapting the handwriting recognition capability to new requirements is long and does not support agile product development.

    Here, we take a different approach by iteratively building a complex, synthetically generated dataset towards specific requirements. The generation process gives the developer exact control over the distribution of math expressions, characters, location of characters, specific visual qualities of the math, image noise, and image augmentations. The developer controls every aspect of the data, down to each pixel. In many ways, the data synthesis runs in the opposite direction to the handwriting recognition model, creating visual complexity that the model must then untangle to uncover the ground truth math expression. Thus, we arrive at a "many datasets, one model" paradigm in which, as product requirements change, the data can be iterated and adapted quickly on agile cycles.

    In addition to affording more control over the product development process, synthetic data allows for 100% correct pixel by pixel tagging that opens the door for new modeling possibilities. Every image is tagged with the ground truth LaTeX for the expressions, bounding boxes per math character, and exact pixel masks for each character.

    Our goal in releasing this dataset is to provide the data science and machine learning community with resources for undertaking the challenging computer vision task of extracting math expressions from images. The data offers something to all levels, from beginners building simple character recognition models to experts who wish to predict pixel-by-pixel masks and decode the complex structure of math expressions.

    Content

    The images contain math expressions of limits, a topic typically encountered by students learning Calculus I in the United States. Features of the writing such as font, writing utensils (type, color, pressure, consistency), angle and distance of photo, and size of writing are all simulated. Background features include shadows, various plain paper types, bleed-through, other distortions, and noise typical of students taking photos of their math.

    The strategy in defining the populations from which images are synthesized is to be a superset of what we expect students to submit. Therefore, the math expressions are not in themselves pedagogical, but aim to encompass the potential variety of student submissions, both mathematically correct and incorrect. The image features and augmentations are similarly designed to cover the range of possible student handwriting qualities.


    Data consis...

  11. MathWriting-human

    • huggingface.co
    Updated Jun 26, 2025
    Cite
    deep copy (2025). MathWriting-human [Dataset]. https://huggingface.co/datasets/deepcopy/MathWriting-human
    Dataset updated
    Jun 26, 2025
    Authors
    deep copy
    Description

    Dataset Card for MathWriting

    Dataset Summary

    The MathWriting dataset contains online handwritten mathematical expressions collected through a prompted interface and rendered to RGB images. It consists of 230,000 human-written expressions, each paired with its corresponding LaTeX string. The dataset is intended to support research in online and offline handwritten mathematical expression (HME) recognition. Key features:

    Online handwriting converted to rendered RGB images.… See the full description on the dataset page: https://huggingface.co/datasets/deepcopy/MathWriting-human.

  12. Handwritten Maths Operators Dataset

    • universe.roboflow.com
    zip
    Updated Jul 23, 2024
    Cite
    asd (2024). Handwritten Maths Operators Dataset [Dataset]. https://universe.roboflow.com/asd-0nnic/handwritten-maths-operators/dataset/1
    Available download formats: zip
    Dataset updated
    Jul 23, 2024
    Dataset authored and provided by
    asd
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Equal Bounding Boxes
    Description

    Handwritten Maths Operators

    ## Overview
    
    Handwritten Maths Operators is a dataset for object detection tasks - it contains Equal annotations for 853 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
    ## License
    
    This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  13. Offline Handwritten Mathematical Symbols

    • kaggle.com
    Updated Jul 31, 2021
    Cite
    Carlos Espa (2021). Offline Handwritten Mathematical Symbols [Dataset]. https://www.kaggle.com/datasets/carlosespa/offline-handwritten-mathematical-symbols
    Dataset updated
    Jul 31, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Carlos Espa
    Description

    Dataset

    This dataset was created by Carlos Espa


  14. Hwrt Database Of Handwritten Symbols

    • explore.openaire.eu
    • data.niaid.nih.gov
    • +1 more
    Updated Jan 28, 2015
    Cite
    Martin Thoma (2015). Hwrt Database Of Handwritten Symbols [Dataset]. http://doi.org/10.5281/zenodo.50022
    Dataset updated
    Jan 28, 2015
    Authors
    Martin Thoma
    Description

    The HWRT database of handwritten symbols contains on-line data of handwritten symbols such as all alphanumeric characters, arrows, Greek characters, and mathematical symbols like the integral symbol. The database can be downloaded in the form of bzip2-compressed tar files. Each tar file contains:

    • symbols.csv: A CSV file with the columns symbol_id, latex, training_samples, test_samples. The symbol ID is an integer, the latex column contains the LaTeX code of the symbol, and the training_samples and test_samples columns contain integers giving the number of labeled samples.
    • train-data.csv: A CSV file with the columns symbol_id, user_id, user_agent and data.
    • test-data.csv: A CSV file with the columns symbol_id, user_id, user_agent and data.

    All CSV files use ";" as the delimiter and "'" as the quotechar. The data is given in YAML format as a list of lists of dictionaries. Each dictionary has the keys "x", "y" and "time". (x, y) are coordinates and time is the UNIX time. About 90% of the data was made available by Daniel Kirsch via github.com/kirel/detexify-data. Thank you very much, Daniel!
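    Because the files use ";" as the delimiter and "'" as the quotechar, a default CSV reader would split them incorrectly. A minimal sketch with Python's standard csv module follows; the sample row is made up, the presence of a header row is an assumption, and the data field is left as a raw string (decoding its YAML content would need a separate YAML parser).

```python
import csv
import io


def read_hwrt_csv(text: str) -> list[dict]:
    """Parse HWRT-style CSV text that uses ';' as delimiter and "'" as quotechar."""
    reader = csv.DictReader(io.StringIO(text), delimiter=";", quotechar="'")
    return list(reader)


# Made-up sample; assumes the file starts with a header row naming the columns.
sample = (
    "symbol_id;user_id;user_agent;data\n"
    "31;10;'Mozilla/5.0 (X11; Linux x86_64)';"
    "'[[{\"x\": 1, \"y\": 2, \"time\": 100}]]'\n"
)
rows = read_hwrt_csv(sample)
```

    The quotechar matters here because user-agent strings routinely contain ";", which would otherwise be read as a field separator.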

  15. Handwritten Maths Operators 3 Dataset

    • universe.roboflow.com
    zip
    Updated Jul 24, 2024
    Cite
    asd (2024). Handwritten Maths Operators 3 Dataset [Dataset]. https://universe.roboflow.com/asd-0nnic/handwritten-maths-operators-3
    Available download formats: zip
    Dataset updated
    Jul 24, 2024
    Dataset authored and provided by
    asd
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Equal 6zFb Tq9W Bounding Boxes
    Description

    Handwritten Maths Operators 3

    ## Overview
    
    Handwritten Maths Operators 3 is a dataset for object detection tasks - it contains Equal 6zFb Tq9W annotations for 754 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
    ## License
    
    This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  16. Handwritten math evaluation

    • kaggle.com
    Updated Oct 21, 2020
    Cite
    CODER B (2020). Handwritten math evaluation [Dataset]. https://www.kaggle.com/datasets/coderb/handwritten-math-evaluation
    Dataset updated
    Oct 21, 2020
    Dataset provided by
    Kaggle
    Authors
    CODER B
    Description

    Dataset

    This dataset was created by CODER B

    Released under Data files © Original Authors


  17. HwMath: a dataset of handwritten digits and mathematical symbols

    • mostwiedzy.pl
    zip
    Updated Jun 30, 2025
    Cite
    Artur Poczobut; Hanna Wieloszewska (2025). HwMath: a dataset of handwritten digits and mathematical symbols [Dataset]. http://doi.org/10.34808/9b43-yh51
    Available download formats: zip (1283816)
    Dataset updated
    Jun 30, 2025
    Authors
    Artur Poczobut; Hanna Wieloszewska
    License

    https://creativecommons.org/licenses/zero/1.0

    Description

    The dataset contains samples of handwritten digits (0–9) and basic mathematical symbols: +, -, ÷, ×, (, ). Total number of samples: 2,183.

  18. HME100K

    • kaggle.com
    Updated May 4, 2025
    Cite
    CuteDeadu (2025). HME100K [Dataset]. https://www.kaggle.com/datasets/cutedeadu/hme100k/discussion
    Dataset updated
    May 4, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    CuteDeadu
    Description

    Dataset

    This dataset was created by CuteDeadu


  19. Handwritten Digit Dataset

    • universe.roboflow.com
    zip
    Updated Apr 3, 2023
    Cite
    RUET (2023). Handwritten Digit Dataset [Dataset]. https://universe.roboflow.com/ruet-tdttc/handwritten-digit/model/2
    Available download formats: zip
    Dataset updated
    Apr 3, 2023
    Dataset authored and provided by
    RUET
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Digits And Signs Bounding Boxes
    Description

    Here are a few use cases for this project:

    1. Math Education Tools: The model could be integrated into educational software to help students learn and visualize math problems. It can recognize and interpret handwritten equations, turning them into a digital format so that students can solve them more easily.

    2. Handwritten Data Digitization: This can be used in institutions like banks, where many data entries are still done by hand. This tool could transcribe these handwritten entries into digital numbers, helping to automate the digitization process.

    3. Automated Marking System: The model can be used to auto-grade written numerical assignments or exam answers, reducing redundancy for teachers and providing objective scoring.

    4. Invoice Processing: Companies dealing with large numbers of handwritten invoices could use the model to accurately transcribe these documents into a digital system for easy tracking and management.

    5. Handwriting Recognition in Health Sector: In healthcare, doctors' handwritten notes or prescriptions often cause issues. This model could digitize those notes, ensuring that errors due to illegibility are minimized.

  20. CROHME2019

    • kaggle.com
    Updated Jun 3, 2024
    Cite
    Cuong Nguyen (2024). CROHME2019 [Dataset]. https://www.kaggle.com/datasets/ntcuong2103/crohme2019/code
    Dataset updated
    Jun 3, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Cuong Nguyen
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    ICDAR 2019 Competition on Recognition of Handwritten Mathematical Expressions and Typeset Formula Detection (ICDAR2019-CROHME-TDF) - With temporal classification labeled data (generated from Label Graph)

    \cite{Mouchère, ICDAR 2019 Competition on Recognition of Handwritten Mathematical Expressions and Typeset Formula Detection (ICDAR2019-CROHME-TDF) ,1,ID:ICDAR2019-CROHME-TDF_1,URL:https://tc11.cvc.uab.es/datasets/ICDAR2019-CROHME-TDF_1}


Handwritten Math Expressions Dataset

Handwritten Mathematical Expressions for OCR and Deep Learning Tasks

33 scholarly articles cite this dataset.