100+ datasets found
  1. Handwritten Digits 0 - 9

    • kaggle.com
    Updated Dec 1, 2022
    Cite
    André Meier (2022). Handwritten Digits 0 - 9 [Dataset]. http://doi.org/10.34740/kaggle/dsv/4632848
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    André Meier
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Since the MNIST dataset contains only American-style digits, it is difficult to classify isolated European-style digits (especially 1 and 7). This dataset contains about 21,600 digits from 0 to 9 in European (Swiss) notation. Each image is a full-color .jpg of 90x140 px. Some images may have a small black border around the digit; please take this into account in your evaluations. Have fun :-)

  2. mnist

    • tensorflow.org
    • universe.roboflow.com
    • +3more
    Updated Jun 1, 2024
    Cite
    (2024). mnist [Dataset]. https://www.tensorflow.org/datasets/catalog/mnist
    Description

    The MNIST database of handwritten digits.

    To use this dataset:

    import tensorflow_datasets as tfds

    # Load the training split and print the first four examples.
    ds = tfds.load('mnist', split='train')
    for ex in ds.take(4):
        print(ex)
    

    See the guide for more information on tensorflow_datasets.

    Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/mnist-3.0.1.png

  3. Handwritten Digits Dataset

    • universe.roboflow.com
    zip
    Updated Apr 21, 2025
    + more versions
    Cite
    Mathocr (2025). Handwritten Digits Dataset [Dataset]. https://universe.roboflow.com/mathocr-jzmyo/handwritten-digits-h27w9/model/1
    Dataset authored and provided by
    Mathocr
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Digits Bounding Boxes
    Description

    Handwritten Digits

    ## Overview
    
    Handwritten Digits is a dataset for object detection tasks - it contains Digits annotations for 1,560 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
    ## License

    This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  4. Persian Handwritten Digits Dataset

    • gts.ai
    json
    Updated Aug 12, 2024
    Cite
    GTS (2024). Persian Handwritten Digits Dataset [Dataset]. https://gts.ai/dataset-download/persian-handwritten-digits-dataset/
    Dataset provided by
    GLOBOSE TECHNOLOGY SOLUTIONS PRIVATE LIMITED
    Authors
    GTS
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    The Persian Handwritten Digits Dataset contains 150,000 high-quality images of digits 0–9 generated with GANs. It is balanced across classes, culturally authentic, and ideal for OCR, digit recognition, handwriting analysis, and generative modeling.

  5. MNIST database of handwritten digits - Dataset - LDM

    • service.tib.eu
    Updated Dec 16, 2024
    + more versions
    Cite
    (2024). MNIST database of handwritten digits - Dataset - LDM [Dataset]. https://service.tib.eu/ldmservice/dataset/mnist-database-of-handwritten-digits
    Description

    The MNIST handwritten digit database is a dataset of 60,000 training and 10,000 test examples of handwritten digit images.

  6. Arabic Handwritten Digits Dataset

    • figshare.com
    bin
    Updated May 31, 2023
    Cite
    Mohamed Loey (2023). Arabic Handwritten Digits Dataset [Dataset]. http://doi.org/10.6084/m9.figshare.12236948.v1
    Dataset provided by
    figshare
    Authors
    Mohamed Loey
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Arabic Handwritten Digits Dataset

    Abstract

    In recent years, handwritten digit recognition has been an important area due to its applications in several fields. This work focuses on the recognition of handwritten Arabic digits, which faces several challenges, including the unlimited variation in human handwriting and the large public databases. The paper provides a deep learning technique that can be effectively applied to recognizing Arabic handwritten digits. LeNet-5, a Convolutional Neural Network (CNN), was trained and tested on the MADBase database (Arabic handwritten digit images), which contains 60,000 training and 10,000 testing images. The results are compared, and they show that the CNN led to significant improvements over different machine-learning classification algorithms. Moreover, the CNN gives an average recognition accuracy of 99.15%.

    Context

    The motivation of this study is to use cross knowledge learned from multiple works to enhance the performance of Arabic handwritten digit recognition. In recent years, Arabic handwritten digit recognition has also had to cope with different handwriting styles, making it important to find and work on a new and advanced solution for handwriting recognition. A deep learning system needs a huge amount of data (images) to be able to make good decisions.

    Content

    MADBase is a modified Arabic handwritten digits database containing 60,000 training images and 10,000 test images. MADBase was written by 700 writers. Each writer wrote each digit (from 0 to 9) ten times. To ensure that different writing styles are included, the database was gathered from different institutions: Colleges of Engineering and Law, School of Medicine, the Open University (whose students span a wide range of ages), a high school, and a governmental institution. MADBase is available for free and can be downloaded from http://datacenter.aucegypt.edu/shazeem/.

    Acknowledgements

    CNN for Handwritten Arabic Digits Recognition Based on LeNet-5
    http://link.springer.com/chapter/10.1007/978-3-319-48308-5_54
    Ahmed El-Sawy, Hazem El-Bakry, Mohamed Loey
    Proceedings of the International Conference on Advanced Intelligent Systems and Informatics 2016
    Volume 533 of the series Advances in Intelligent Systems and Computing, pp 566-575

    Inspiration

    Creating the proposed database presents more challenges because it deals with many issues such as style of writing, thickness, and the number and position of dots. Some characters have different shapes while written in the same position. For example, the teh character has different shapes in isolated position.

    Arabic Handwritten Characters Dataset: https://www.kaggle.com/mloey1/ahcd1
    Benha University: http://bu.edu.eg/staff/mloey
    https://mloey.github.io/

  7. MNIST Dataset

    • scidb.cn
    Updated Feb 16, 2023
    Cite
    Xuyu Zhang; Jingjing Gao; Yu Gan; Chunyuan Song; Dawei Zhang; Songlin Zhuang; Shensheng Han; Puxiang Lai; Honglin Liu (2023). MNIST Dataset [Dataset]. http://doi.org/10.57760/sciencedb.07421
    Dataset provided by
    Science Data Bank
    Authors
    Xuyu Zhang; Jingjing Gao; Yu Gan; Chunyuan Song; Dawei Zhang; Songlin Zhuang; Shensheng Han; Puxiang Lai; Honglin Liu
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    MNIST is an image dataset of handwritten digits, derived from data collected by the National Institute of Standards and Technology (NIST) of the United States. The handwriting was collected from about 250 writers, 50% of whom were high school students and 50% staff of the Census Bureau. The purpose of this dataset is to enable the recognition of handwritten digits by algorithms. The training set contains 60,000 images and labels, while the test set contains 10,000 images and labels. The first 5,000 and the last 5,000 test examples come from different parts of the original NIST data. The first 5,000 are more regular than the last 5,000, because the first 5,000 come from employees of the US Census Bureau and the last 5,000 come from high school students.

  8. MNIST Dataset

    • kaggle.com
    Updated Feb 4, 2024
    Cite
    Arnav Sharma (2024). MNIST Dataset [Dataset]. https://www.kaggle.com/datasets/arnavsharma45/mnist-dataset
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Arnav Sharma
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    The MNIST database (Modified National Institute of Standards and Technology database) is a large database of handwritten digits that is commonly used for training various image processing systems.

  9. MNIST Dataset

    • cubig.ai
    Updated Oct 12, 2024
    Cite
    CUBIG (2024). MNIST Dataset [Dataset]. https://cubig.ai/store/products/478/mnist-dataset
    Dataset authored and provided by
    CUBIG
    License

    https://cubig.ai/store/terms-of-service

    Measurement technique
    Synthetic data generation using AI techniques for model training, Privacy-preserving data transformation via differential privacy
    Description

    1) Data Introduction
    • The MNIST Dataset is a widely used benchmark for handwritten digit recognition, containing images of handwritten digits from 0 to 9.

    2) Data Utilization
    (1) Characteristics of the MNIST Dataset:
    • The dataset consists of grayscale images representing digits, collected from a diverse population, making it ideal for evaluating machine learning algorithms on image classification tasks.
    • It provides a standardized and easily accessible resource for comparing the performance of various classification models.

    (2) Applications of the MNIST Dataset:
    • Handwritten digit recognition model development: The MNIST dataset is commonly used for training and testing deep learning and machine learning models in tasks such as digit recognition, algorithm benchmarking, and educational demonstrations.

  10. Data from: Handwritten digits in fMRI ('69' data set)

    • data.ru.nl
    00112_485_v1
    Updated Apr 23, 2025
    Cite
    Marcel van Gerven; Floris de Lange; Tom Heskes (2025). Handwritten digits in fMRI ('69' data set) [Dataset]. http://doi.org/10.34973/tvp5-r364
    Available download formats: 00112_485_v1 (3225887 bytes)
    Dataset provided by
    Radboud University
    Authors
    Marcel van Gerven; Floris de Lange; Tom Heskes
    Description

    Functional MRI data of a single participant presented with 100 examples of MNIST handwritten digits 6 and 9 (with fixation). An anatomical scan, functional localizers for dorsal and ventral V1-V3 and the image prior used for reconstruction are included.

  11. Handwritten Digits and basic Operators

    • kaggle.com
    Updated Oct 5, 2024
    Cite
    Tanmay (2024). Handwritten Digits and basic Operators [Dataset]. https://www.kaggle.com/datasets/tanmayp2311/handwritten-digits-and-basic-operators
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Tanmay
    License

    https://cdla.io/permissive-1-0/

    Description

    This dataset contains 500k images across 13 labels: the digits 0-9 and the operators for addition, subtraction, and multiplication. The images are 28x28 with a white background and black ink.

  12. nmist simple dataset

    • dataverse.harvard.edu
    application/gzip
    Updated Dec 23, 2019
    + more versions
    Cite
    Harvard Dataverse (2019). nmist simple dataset [Dataset]. http://doi.org/10.7910/DVN/XW8JYK
    Available download formats: application/gzip (28881), application/gzip (1648877), application/gzip (4542), application/gzip (9912422)
    Dataset provided by
    Harvard Dataverse
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    The MNIST database of handwritten digits, available from this page, has a training set of 60,000 examples and a test set of 10,000 examples. It is a subset of a larger set available from NIST. The digits have been size-normalized and centered in a fixed-size image. It is a good database for people who want to try learning techniques and pattern recognition methods on real-world data while spending minimal effort on preprocessing and formatting.

  13. Data from: Written and spoken digits database for multimodal learning

    • zenodo.org
    • data.niaid.nih.gov
    bin
    Updated Jan 21, 2021
    + more versions
    Cite
    Lyes Khacef; Laurent Rodriguez; Benoit Miramond (2021). Written and spoken digits database for multimodal learning [Dataset]. http://doi.org/10.5281/zenodo.4452953
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Lyes Khacef; Laurent Rodriguez; Benoit Miramond
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Database description:

    The written and spoken digits database is not a new database but a constructed database from existing ones, in order to provide a ready-to-use database for multimodal fusion [1].

    The written digits database is the original MNIST handwritten digits database [2] with no additional processing. It consists of 70000 images (60000 for training and 10000 for test) of 28 x 28 = 784 dimensions.

    The spoken digits database was extracted from Google Speech Commands [3], an audio dataset of spoken words that was proposed to train and evaluate keyword spotting systems. It consists of 105829 utterances of 35 words, among which are 38908 utterances of the ten digits (34801 for training and 4107 for test). Pre-processing consisted of extracting the Mel Frequency Cepstral Coefficients (MFCC) with a framing window size of 50 ms and a frame shift size of 25 ms. Since the speech samples are approximately 1 s long, we end up with 39 time slots. For each one, we extract 12 MFCC coefficients with an additional energy coefficient. Thus, we have a final vector of 39 x 13 = 507 dimensions. Standardization and normalization were applied to the MFCC features.
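
    The frame and feature counts quoted above can be checked with a short calculation. This is a minimal sketch, not the authors' preprocessing code: the function name `frame_count` is hypothetical, and it assumes a clip of exactly 1000 ms with no padding, so only windows that lie fully inside the clip are counted.

```python
def frame_count(clip_ms: int, window_ms: int, shift_ms: int) -> int:
    """Number of full analysis windows that fit in a clip (no padding assumed)."""
    if clip_ms < window_ms:
        return 0
    return (clip_ms - window_ms) // shift_ms + 1


# Values taken from the description: ~1 s clips, 50 ms window, 25 ms shift.
frames = frame_count(clip_ms=1000, window_ms=50, shift_ms=25)

# 12 MFCC coefficients plus one energy coefficient per frame.
features_per_frame = 12 + 1
vector_length = frames * features_per_frame

print(frames, vector_length)
```

    Under these assumptions the numbers agree with the 39 time slots and 507 dimensions stated in the description.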

    To construct the multimodal digits dataset, we associated written and spoken digits of the same class, respecting the initial partitioning in [2] and [3] for the training and test subsets. Since we have fewer samples for the spoken digits, we duplicated some random samples to match the number of written digits and obtain a multimodal digits database of 70000 samples (60000 for training and 10000 for test).

    The dataset is provided in six files as described below. Therefore, if a shuffle is performed on the training or test subsets, it must be performed in unison with the same order for the written digits, spoken digits and labels.

    Files:

    • data_wr_train.npy: 60000 samples of 784-dimensional written digits for training;
    • data_sp_train.npy: 60000 samples of 507-dimensional spoken digits for training;
    • labels_train.npy: 60000 labels for the training subset;
    • data_wr_test.npy: 10000 samples of 784-dimensional written digits for test;
    • data_sp_test.npy: 10000 samples of 507-dimensional spoken digits for test;
    • labels_test.npy: 10000 labels for the test subset.
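
    The requirement to shuffle the written digits, spoken digits and labels in unison can be sketched as follows. This is an illustrative example only: `shuffle_in_unison` is a hypothetical helper, and short Python lists stand in for the arrays stored in the six files.

```python
import random


def shuffle_in_unison(*sequences, seed=None):
    """Apply one random permutation to several equally long sequences."""
    if not sequences:
        return ()
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all sequences must have the same length")
    order = list(range(length))
    random.Random(seed).shuffle(order)
    # Reorder every sequence with the same index order so rows stay aligned.
    return tuple([seq[i] for i in order] for seq in sequences)


# Toy stand-ins for data_wr_train, data_sp_train and labels_train.
written = ["w0", "w1", "w2", "w3"]
spoken = ["s0", "s1", "s2", "s3"]
labels = [0, 1, 2, 3]

written_s, spoken_s, labels_s = shuffle_in_unison(written, spoken, labels, seed=0)
```

    With the arrays loaded from the .npy files, the same idea applies: draw one permutation of the row indices and index all three arrays with it.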

    References:

    1. Khacef, L. et al. (2020), "Brain-Inspired Self-Organization with Cellular Neuromorphic Computing for Multimodal Unsupervised Learning".
    2. LeCun, Y. & Cortes, C. (1998), "MNIST handwritten digit database".
    3. Warden, P. (2018), "Speech Commands: A Dataset for Limited-Vocabulary Speech Recognition".

  14. seraiki-handwritten-numerals

    • huggingface.co
    Updated Jul 15, 2025
    Cite
    Taha Arif (2025). seraiki-handwritten-numerals [Dataset]. https://huggingface.co/datasets/tahaListens/seraiki-handwritten-numerals
    Authors
    Taha Arif
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    📊 Seraiki Handwritten Numbers (1–99)

    Dataset Summary

    This dataset contains handwritten Seraiki numbers from 1 to 99, written in words using the Perso-Arabic Seraiki script. It was created to support Optical Character Recognition (OCR) research and to promote AI development for the underrepresented Seraiki language.
    Unlike digit-only datasets (e.g., MNIST), this dataset includes full word forms of numbers, making it suitable for sequence-based recognition tasks (TrOCR… See the full description on the dataset page: https://huggingface.co/datasets/tahaListens/seraiki-handwritten-numerals.

  15. Handwritten Digits images Dataset

    • kaggle.com
    Updated Jan 28, 2020
    Cite
    UMEGS Hamza (2020). Handwritten Digits images Dataset [Dataset]. https://www.kaggle.com/datasets/umegshamza/handwritten-digits-images-dataset/code
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    UMEGS Hamza
    Description

    Dataset

    This dataset was created by UMEGS Hamza

    Released under Data files © Original Authors


  16. mnist_784

    • openml.org
    Updated Sep 29, 2014
    Cite
    Yann LeCun; Corinna Cortes; Christopher J.C. Burges (2014). mnist_784 [Dataset]. https://www.openml.org/search?type=data&sort=nr_of_likes&status=active&id=554
    Authors
    Yann LeCun; Corinna Cortes; Christopher J.C. Burges
    Description

    Author: Yann LeCun, Corinna Cortes, Christopher J.C. Burges
    Source: MNIST Website - Date unknown
    Please cite:

    The MNIST database of handwritten digits with 784 features, raw data available at: http://yann.lecun.com/exdb/mnist/. It can be split into a training set of the first 60,000 examples and a test set of 10,000 examples.
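
    The split described above amounts to slicing the rows by position. The following is a minimal sketch under the assumption that the table holds 70,000 rows in their original order; `split_train_test` is a hypothetical helper, and a list of integers stands in for the real rows of 784 pixel values plus a label.

```python
def split_train_test(examples, n_train=60_000):
    """Split an ordered collection into the first n_train rows and the rest."""
    return examples[:n_train], examples[n_train:]


# Stand-in for the 70,000 rows of the table.
rows = list(range(70_000))
train, test = split_train_test(rows)
print(len(train), len(test))
```

    If the rows have been shuffled after download, a positional split no longer reproduces the original training and test sets.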

    It is a subset of a larger set available from NIST. The digits have been size-normalized and centered in a fixed-size image. It is a good database for people who want to try learning techniques and pattern recognition methods on real-world data while spending minimal effort on preprocessing and formatting. The original black and white (bilevel) images from NIST were size normalized to fit in a 20x20 pixel box while preserving their aspect ratio. The resulting images contain grey levels as a result of the anti-aliasing technique used by the normalization algorithm. The images were centered in a 28x28 image by computing the center of mass of the pixels, and translating the image so as to position this point at the center of the 28x28 field.
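
    The center-of-mass step described above can be illustrated in a few lines. This is a sketch under stated assumptions, not the original MNIST tooling: the helper names are hypothetical, images are plain lists of lists, the shift is rounded to whole pixels, and the center of the field is taken to be index 13.5 on each axis.

```python
def center_of_mass(image):
    """Return the intensity-weighted (row, col) centroid of a 2-D list of pixel values."""
    total = sum(sum(row) for row in image)
    if total == 0:
        raise ValueError("image has no ink")
    r = sum(i * v for i, row in enumerate(image) for v in row) / total
    c = sum(j * v for row in image for j, v in enumerate(row)) / total
    return r, c


def center_in_field(image, size=28):
    """Paste image into a size x size field so its center of mass lands near the middle."""
    r, c = center_of_mass(image)
    # Whole-pixel shift that moves the centroid to the geometric center of the field.
    dr = round((size - 1) / 2 - r)
    dc = round((size - 1) / 2 - c)
    field = [[0] * size for _ in range(size)]
    for i, row in enumerate(image):
        for j, v in enumerate(row):
            ti, tj = i + dr, j + dc
            if 0 <= ti < size and 0 <= tj < size:
                field[ti][tj] = v
    return field


# A 20x20 box with a 2x2 blob of ink in its top-left corner.
box = [[0] * 20 for _ in range(20)]
for i in (0, 1):
    for j in (0, 1):
        box[i][j] = 255

centered = center_in_field(box)
print(center_of_mass(centered))
```

    Pixels that would fall outside the field after the shift are dropped in this sketch; a production pipeline would more likely interpolate sub-pixel shifts.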

    With some classification methods (particularly template-based methods, such as SVM and K-nearest neighbors), the error rate improves when the digits are centered by bounding box rather than center of mass. If you do this kind of pre-processing, you should report it in your publications. The MNIST database was constructed from NIST's Special Database 3 (SD-3) and Special Database 1 (SD-1). NIST originally designated SD-3 as their training set and SD-1 as their test set. However, SD-3 is much cleaner and easier to recognize than SD-1. The reason for this can be found in the fact that SD-3 was collected among Census Bureau employees, while SD-1 was collected among high-school students. Drawing sensible conclusions from learning experiments requires that the result be independent of the choice of training set and test set among the complete set of samples. Therefore it was necessary to build a new database by mixing NIST's datasets.

    The MNIST training set is composed of 30,000 patterns from SD-3 and 30,000 patterns from SD-1. Our test set was composed of 5,000 patterns from SD-3 and 5,000 patterns from SD-1. The 60,000 pattern training set contained examples from approximately 250 writers. We made sure that the sets of writers of the training set and test set were disjoint. SD-1 contains 58,527 digit images written by 500 different writers. In contrast to SD-3, where blocks of data from each writer appeared in sequence, the data in SD-1 is scrambled. Writer identities for SD-1 are available and we used this information to unscramble the writers. We then split SD-1 in two: characters written by the first 250 writers went into our new training set. The remaining 250 writers were placed in our test set. Thus we had two sets with nearly 30,000 examples each. The new training set was completed with enough examples from SD-3, starting at pattern # 0, to make a full set of 60,000 training patterns. Similarly, the new test set was completed with SD-3 examples starting at pattern # 35,000 to make a full set with 60,000 test patterns. Only a subset of 10,000 test images (5,000 from SD-1 and 5,000 from SD-3) is available on this site. The full 60,000 sample training set is available.

  17. Devanagari Handwritten Digit and Character Dataset

    • figshare.com
    application/x-rar
    Updated May 16, 2024
    Cite
    Rajiv Kumar (2024). Devanagari Handwritten Digit and Character Dataset [Dataset]. http://doi.org/10.6084/m9.figshare.25835629.v1
    Dataset provided by
    figshare
    Authors
    Rajiv Kumar
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    A benchmark dataset is required for the development of an efficient and reliable recognition system. Unfortunately, no comprehensive benchmark dataset exists for handwritten Devnagari optical document recognition research, at least in the public domain. This paper is an effort in this direction. Here, we introduce a comprehensive dataset, which we refer to as the CPAR-2012 dataset, for such benchmark studies, and also present some preliminary recognition results. The dataset includes 35,000 isolated handwritten numerals, 83,300 characters, 2,000 constrained and 2,000 unconstrained handwritten pangrams. It is organized in a relational data model that contains text images along with their writers' information and related handwriting attributes. We collected the handwriting samples from 2,000 subjects who were chosen from different age, ethnicity, educational background, regional and linguistic groups. The samples reflect expected variations in Devnagari handwriting. Digit recognition results using recognition schemes that use the simplest features, four neural network classifiers, KNN, and a classifier ensemble are also reported for benchmarking.

  18. Handwritten Hindko Digits Dataset (HHDD)

    • data.mendeley.com
    Updated Oct 29, 2024
    + more versions
    Cite
    tanveer Ahmed (2024). Handwritten Hindko Digits Dataset (HHDD) [Dataset]. http://doi.org/10.17632/gz8r3spkns.3
    Authors
    tanveer Ahmed
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The dataset consists of the Hindko numbers from 1 to 100 written in words. Each candidate was asked to write these 100 words twice on paper, so 200 samples were taken from every candidate. Every candidate signed an undertaking that he/she has no objection to the use of this writing for academic and research purposes. The pages were then scanned at 1200 dpi. The words were cropped from the scanned images with a cropping tool and saved into folders. A separate folder was created for every class and labelled from 1 to 100, and every sample was saved into its relevant folder, so 100 folders are used for 100 different words. Because the images differed in size, every image was resized to the same size, 50x50 pixels, for better results. The dataset consists of 224,782 samples. The image dataset takes 394 MB of storage and the CSV version of the dataset takes 1098 MB.

  19. In-Air Hand-Drawn Number and Shape Dataset

    • orda.shef.ac.uk
    zip
    Updated Jul 14, 2025
    Cite
    Basheer Alwaely; Charith Abhayaratne (2025). In-Air Hand-Drawn Number and Shape Dataset [Dataset]. http://doi.org/10.15131/shef.data.7381472.v2
    Dataset provided by
    The University of Sheffield
    Authors
    Basheer Alwaely; Charith Abhayaratne
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains in-air hand-written numbers and shapes data used in the paper:

    B. Alwaely and C. Abhayaratne, "Graph Spectral Domain Feature Learning With Application to in-Air Hand-Drawn Number and Shape Recognition," in IEEE Access, vol. 7, pp. 159661-159673, 2019, doi: 10.1109/ACCESS.2019.2950643.

    The dataset contains the following:

    - Readme.txt
    - InAirNumberShapeDataset.zip containing
      - Number Folder (with 2 sub folders for Matlab and Excel)
      - Shapes Folder (with 2 sub folders for Matlab and Excel)

    The datasets include the in-air drawn number and shape hand movement path captured by a Kinect sensor. The number sub dataset includes 500 instances of each number 0 to 9, resulting in a total of 5000 number data instances. Similarly, the shape sub dataset includes 500 instances of each of 10 different arbitrary 2D shapes, resulting in a total of 5000 shape instances. The dataset provides X, Y, Z coordinates of the hand movement path data in Matlab (M-file) and Excel formats and their corresponding labels.

    This dataset creation has received The University of Sheffield ethics approval under application #023005 granted on 19/10/2018.

  20. Data from: EMNIST: an extension of MNIST to handwritten letters

    • researchdata.edu.au
    Updated May 16, 2023
    Cite
    van Schaik Andre; Tapson Jonathan; Afshar Saeed; Cohen Gregory (2023). EMNIST: an extension of MNIST to handwritten letters [Dataset]. http://doi.org/10.26183/M9K1-ZR06
    Dataset provided by
    Western Sydney University
    Authors
    van Schaik Andre; Tapson Jonathan; Afshar Saeed; Cohen Gregory
    Description

    The MNIST dataset has become a standard benchmark for learning, classification and computer vision systems. Contributing to its widespread adoption are the understandable and intuitive nature of the task, its relatively small size and storage requirements and the accessibility and ease-of-use of the database itself. The MNIST database was derived from a larger dataset known as the NIST Special Database 19, which contains digits, uppercase and lowercase handwritten letters. This paper introduces a variant of the full NIST dataset, which we have called Extended MNIST (EMNIST), which follows the same conversion paradigm used to create the MNIST dataset. The result is a set of datasets that constitute more challenging classification tasks involving letters and digits, and that share the same image structure and parameters as the original MNIST task, allowing for direct compatibility with all existing classifiers and systems. Benchmark results are presented along with a validation of the conversion process through the comparison of the classification results on converted NIST digits and the MNIST digits.

    A Read Me file describing the database is included in the available attachments.
    Note: The available zip files are each > 500MB in size. Should these files become unavailable from the website provided, please contact Western Sydney University Library about this record.
