100+ datasets found

Handwritten Digits 0 - 9
kaggle.com
Updated Dec 1, 2022
André Meier (2022). Handwritten Digits 0 - 9 [Dataset]. http://doi.org/10.34740/kaggle/dsv/4632848
Unique identifier
https://doi.org/10.34740/kaggle/dsv/4632848
Dataset updated
Dec 1, 2022
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
André Meier
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Since the MNIST dataset contains only American-style digits, it is difficult to classify isolated digits written in other styles (especially 1 and 7). This dataset contains about 21,600 digits from 0 to 9 in European (Swiss) notation. Each image is a full-color .jpg with a size of 90x140 px. Some images may contain a small black border; please take this into account in your evaluations. Have fun :-)
mnist
tensorflow.org
universe.roboflow.com
+3 more
Updated Jun 1, 2024
(2024). mnist [Dataset]. https://www.tensorflow.org/datasets/catalog/mnist
Dataset updated
Jun 1, 2024
Description
The MNIST database of handwritten digits.

To use this dataset:

import tensorflow_datasets as tfds
ds = tfds.load('mnist', split='train')
for ex in ds.take(4):
    print(ex)

See the guide for more information on tensorflow_datasets.

Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/mnist-3.0.1.png
Handwritten Digits Dataset
universe.roboflow.com
zip
Updated Apr 21, 2025
+ more versions
Mathocr (2025). Handwritten Digits Dataset [Dataset]. https://universe.roboflow.com/mathocr-jzmyo/handwritten-digits-h27w9/model/1
Available download formats: zip
Dataset updated
Apr 21, 2025
Dataset authored and provided by
Mathocr
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Variables measured
Digits Bounding Boxes
Description
Handwritten Digits

## Overview
Handwritten Digits is a dataset for object detection tasks. It contains Digits annotations for 1,560 images.

## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.

## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
Persian Handwritten Digits Dataset
gts.ai
json
Updated Aug 12, 2024
GTS (2024). Persian Handwritten Digits Dataset [Dataset]. https://gts.ai/dataset-download/persian-handwritten-digits-dataset/
Available download formats: json
Dataset updated
Aug 12, 2024
Dataset provided by
GLOBOSE TECHNOLOGY SOLUTIONS PRIVATE LIMITED
Authors
GTS
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
The Persian Handwritten Digits Dataset contains 150,000 high-quality images of digits 0–9 generated with GANs. It is balanced across classes, culturally authentic, and ideal for OCR, digit recognition, handwriting analysis, and generative modeling.
MNIST database of handwritten digits - Dataset - LDM
service.tib.eu
Updated Dec 16, 2024
+ more versions
(2024). MNIST database of handwritten digits - Dataset - LDM [Dataset]. https://service.tib.eu/ldmservice/dataset/mnist-database-of-handwritten-digits
Dataset updated
Dec 16, 2024
Description
The MNIST handwritten digit database is a dataset of 60,000 training and 10,000 test examples of handwritten digit images.
Arabic Handwritten Digits Dataset
figshare.com
bin
Updated May 31, 2023
Mohamed Loey (2023). Arabic Handwritten Digits Dataset [Dataset]. http://doi.org/10.6084/m9.figshare.12236948.v1
Available download formats: bin
Unique identifier
https://doi.org/10.6084/m9.figshare.12236948.v1
Dataset updated
May 31, 2023
Dataset provided by
figshare
Authors
Mohamed Loey
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Arabic Handwritten Digits Dataset

Abstract

In recent years, handwritten digit recognition has been an important area due to its applications in several fields. This work focuses on the recognition part of handwritten Arabic digit recognition, which faces several challenges, including the unlimited variation in human handwriting and the large public databases. The paper provides a deep learning technique that can be effectively applied to recognizing Arabic handwritten digits. LeNet-5, a Convolutional Neural Network (CNN), was trained and tested on the MADBase database (Arabic handwritten digit images), which contains 60,000 training and 10,000 testing images. A comparison is made among the results, and it is shown that the use of a CNN led to significant improvements across different machine-learning classification algorithms. Moreover, the CNN gives an average recognition accuracy of 99.15%.

Context

The motivation of this study is to use cross knowledge learned from multiple works to enhance the performance of Arabic handwritten digit recognition. In recent years, Arabic handwritten digit recognition has also had to cope with different handwriting styles, making it important to find and work on a new and advanced solution for handwriting recognition. A deep learning system needs a large amount of data (images) to be able to make good decisions.

Content

MADBase is a modified Arabic handwritten digits database that contains 60,000 training images and 10,000 test images. MADBase was written by 700 writers. Each writer wrote each digit (from 0 to 9) ten times. To ensure that different writing styles are included, the database was gathered from different institutions: Colleges of Engineering and Law, School of Medicine, the Open University (whose students span a wide range of ages), a high school, and a governmental institution. MADBase is available for free and can be downloaded from http://datacenter.aucegypt.edu/shazeem/.

Acknowledgements

CNN for Handwritten Arabic Digits Recognition Based on LeNet-5
http://link.springer.com/chapter/10.1007/978-3-319-48308-5_54
Ahmed El-Sawy, Hazem El-Bakry, Mohamed Loey
Proceedings of the International Conference on Advanced Intelligent Systems and Informatics 2016
Volume 533 of the series Advances in Intelligent Systems and Computing, pp 566-575

Inspiration

Creating the proposed database presents more challenges because it deals with many issues such as style of writing, thickness, and the number and position of dots. Some characters have different shapes while written in the same position. For example, the teh character has different shapes in isolated position.

Arabic Handwritten Characters Dataset
https://www.kaggle.com/mloey1/ahcd1

Benha University
http://bu.edu.eg/staff/mloey
https://mloey.github.io/
MNIST Dataset
scidb.cn
Updated Feb 16, 2023
Xuyu Zhang; Jingjing Gao; Yu Gan; Chunyuan Song; Dawei Zhang; Songlin Zhuang; Shensheng Han; Puxiang Lai; Honglin Liu (2023). MNIST Dataset [Dataset]. http://doi.org/10.57760/sciencedb.07421
Unique identifier
https://doi.org/10.57760/sciencedb.07421
Dataset updated
Feb 16, 2023
Dataset provided by
Science Data Bank
Authors
Xuyu Zhang; Jingjing Gao; Yu Gan; Chunyuan Song; Dawei Zhang; Songlin Zhuang; Shensheng Han; Puxiang Lai; Honglin Liu
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
MNIST is an image dataset of handwritten digits, organized by the National Institute of Standards and Technology (NIST) of the United States. Handwriting samples were collected from about 250 writers, 50% of whom were high school students and 50% staff of the Census Bureau. The dataset was collected to support the recognition of handwritten digits by algorithms. The training set contains 60,000 images and labels, and the test set contains 10,000 images and labels. In the test set, the first 5,000 examples come from the original NIST training set and the last 5,000 come from the original NIST test set. The first 5,000 are more regular than the last 5,000, because the first 5,000 come from employees of the US Census Bureau and the last 5,000 come from high school students.
MNIST Dataset
kaggle.com
Updated Feb 4, 2024
Arnav Sharma (2024). MNIST Dataset [Dataset]. https://www.kaggle.com/datasets/arnavsharma45/mnist-dataset
Dataset updated
Feb 4, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Arnav Sharma
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
The MNIST database (Modified National Institute of Standards and Technology database) is a large database of handwritten digits that is commonly used for training various image processing systems.
MNIST Dataset
cubig.ai
Updated Oct 12, 2024
CUBIG (2024). MNIST Dataset [Dataset]. https://cubig.ai/store/products/478/mnist-dataset
Dataset updated
Oct 12, 2024
Dataset authored and provided by
CUBIG
License
https://cubig.ai/store/terms-of-service
Measurement technique
Synthetic data generation using AI techniques for model training, Privacy-preserving data transformation via differential privacy
Description
1) Data Introduction
• The MNIST Dataset is a widely used benchmark for handwritten digit recognition, containing images of handwritten digits from 0 to 9.

2) Data Utilization
(1) Characteristics of the MNIST Dataset:
• The dataset consists of grayscale images representing digits, collected from a diverse population, making it ideal for evaluating machine learning algorithms on image classification tasks.
• It provides a standardized and easily accessible resource for comparing the performance of various classification models.

(2) Applications of the MNIST Dataset:
• Handwritten digit recognition model development: The MNIST dataset is commonly used for training and testing deep learning and machine learning models in tasks such as digit recognition, algorithm benchmarking, and educational demonstrations.
Data from: Handwritten digits in fMRI ('69' data set)
data.ru.nl
00112_485_v1
Updated Apr 23, 2025
Marcel van Gerven; Floris de Lange; Tom Heskes (2025). Handwritten digits in fMRI ('69' data set) [Dataset]. http://doi.org/10.34973/tvp5-r364
Available download formats: 00112_485_v1 (3225887 bytes)
Unique identifier
https://doi.org/10.34973/tvp5-r364
Dataset updated
Apr 23, 2025
Dataset provided by
Radboud University
Authors
Marcel van Gerven; Floris de Lange; Tom Heskes
Description
Functional MRI data of a single participant presented with 100 examples of MNIST handwritten digits 6 and 9 (with fixation). An anatomical scan, functional localizers for dorsal and ventral V1-V3 and the image prior used for reconstruction are included.
Handwritten Digits and basic Operators
kaggle.com
Updated Oct 5, 2024
Tanmay (2024). Handwritten Digits and basic Operators [Dataset]. https://www.kaggle.com/datasets/tanmayp2311/handwritten-digits-and-basic-operators
Dataset updated
Oct 5, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Tanmay
License
https://cdla.io/permissive-1-0/
Description
This dataset contains 500k images across 13 labels: the digits 0-9 and the addition, subtraction, and multiplication operators. The images are 28x28 px with a white background and black ink.
nmist simple dataset
dataverse.harvard.edu
application/gzip
Updated Dec 23, 2019
+ more versions
Harvard Dataverse (2019). nmist simple dataset [Dataset]. http://doi.org/10.7910/DVN/XW8JYK
Available download formats: application/gzip (28881), application/gzip (1648877), application/gzip (4542), application/gzip (9912422)
Unique identifier
https://doi.org/10.7910/DVN/XW8JYK
Dataset updated
Dec 23, 2019
Dataset provided by
Harvard Dataverse
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
MNIST database of handwritten digits, available from this page, has a training set of 60,000 examples, and a test set of 10,000 examples. It is a subset of a larger set available from NIST. The digits have been size-normalized and centered in a fixed-size image. It is a good database for people who want to try learning techniques and pattern recognition methods on real-world data while spending minimal efforts on preprocessing and formatting.
Data from: Written and spoken digits database for multimodal learning
zenodo.org
data.niaid.nih.gov
bin
Updated Jan 21, 2021
+ more versions
Lyes Khacef; Laurent Rodriguez; Benoit Miramond (2021). Written and spoken digits database for multimodal learning [Dataset]. http://doi.org/10.5281/zenodo.4452953
Available download formats: bin
Unique identifier
https://doi.org/10.5281/zenodo.4452953
Dataset updated
Jan 21, 2021
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Lyes Khacef; Laurent Rodriguez; Benoit Miramond
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Database description:

The written and spoken digits database is not a new database but a constructed database from existing ones, in order to provide a ready-to-use database for multimodal fusion [1].

The written digits database is the original MNIST handwritten digits database [2] with no additional processing. It consists of 70000 images (60000 for training and 10000 for test) of 28 x 28 = 784 dimensions.

The spoken digits database was extracted from Google Speech Commands [3], an audio dataset of spoken words that was proposed to train and evaluate keyword spotting systems. It consists of 105829 utterances of 35 words, amongst which 38908 utterances of the ten digits (34801 for training and 4107 for test). Pre-processing was done by extracting the Mel Frequency Cepstral Coefficients (MFCC) with a framing window size of 50 ms and a frame shift size of 25 ms. Since the speech samples are approximately 1 s long, we end up with 39 time slots. For each one, we extract 12 MFCC coefficients with an additional energy coefficient. Thus, we have a final vector of 39 x 13 = 507 dimensions. Standardization and normalization were applied to the MFCC features.
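The frame arithmetic above can be checked with a short sketch. It assumes a signal of exactly 1000 ms and no padding of the last window, neither of which the description states; `num_frames` is a hypothetical helper, not part of the dataset's tooling.

```python
def num_frames(signal_ms: int, window_ms: int, shift_ms: int) -> int:
    """Count the full analysis windows that fit in the signal (no padding)."""
    if signal_ms < window_ms:
        return 0
    return (signal_ms - window_ms) // shift_ms + 1


# Assumption: the "approximately 1 s" samples are treated as exactly 1000 ms.
frames = num_frames(signal_ms=1000, window_ms=50, shift_ms=25)
coeffs_per_frame = 12 + 1  # 12 MFCCs plus one energy coefficient
feature_dim = frames * coeffs_per_frame

print(frames, feature_dim)  # 39 507
```

Under these assumptions the count reproduces the 39 time slots and the 507-dimensional vector quoted in the description.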

To construct the multimodal digits dataset, we associated written and spoken digits of the same class respecting the initial partitioning in [2] and [3] for the training and test subsets. Since we have fewer samples for the spoken digits, we duplicated some random samples to match the number of written digits and have a multimodal digits database of 70000 samples (60000 for training and 10000 for test).

The dataset is provided in six files as described below. Therefore, if a shuffle is performed on the training or test subsets, it must be performed in unison with the same order for the written digits, spoken digits and labels.

Files:

data_wr_train.npy: 60000 samples of 784-dimensional written digits for training;

data_sp_train.npy: 60000 samples of 507-dimensional spoken digits for training;

labels_train.npy: 60000 labels for the training subset;

data_wr_test.npy: 10000 samples of 784-dimensional written digits for test;

data_sp_test.npy: 10000 samples of 507-dimensional spoken digits for test;

labels_test.npy: 10000 labels for the test subset.
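Shuffling the three training (or test) files in unison can be sketched as follows. This is a minimal illustration on toy lists rather than the real arrays; `shuffle_in_unison` is a hypothetical helper. With NumPy arrays loaded from the .npy files, the same idea amounts to indexing each array with one shared permutation.

```python
import random


def shuffle_in_unison(written, spoken, labels, seed=0):
    """Return the three sequences reordered by one shared random permutation."""
    if not (len(written) == len(spoken) == len(labels)):
        raise ValueError("all three sequences must have the same length")
    order = list(range(len(labels)))
    random.Random(seed).shuffle(order)
    return (
        [written[i] for i in order],
        [spoken[i] for i in order],
        [labels[i] for i in order],
    )


# Toy stand-ins for the real data: sample k of each modality carries label k % 10.
written = [f"wr_{k}" for k in range(20)]
spoken = [f"sp_{k}" for k in range(20)]
labels = [k % 10 for k in range(20)]

w, s, y = shuffle_in_unison(written, spoken, labels, seed=42)
```

Because one permutation drives all three reorderings, sample i of the written digits still lines up with sample i of the spoken digits and with label i after the shuffle.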

References:

[1] Khacef, L. et al. (2020), "Brain-Inspired Self-Organization with Cellular Neuromorphic Computing for Multimodal Unsupervised Learning".

[2] LeCun, Y. & Cortes, C. (1998), "MNIST handwritten digit database".

[3] Warden, P. (2018), "Speech Commands: A Dataset for Limited-Vocabulary Speech Recognition".
seraiki-handwritten-numerals
huggingface.co
Updated Jul 15, 2025
Taha Arif (2025). seraiki-handwritten-numerals [Dataset]. https://huggingface.co/datasets/tahaListens/seraiki-handwritten-numerals
Dataset updated
Jul 15, 2025
Authors
Taha Arif
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
📊 Seraiki Handwritten Numbers (1–99)

Dataset Summary

This dataset contains handwritten Seraiki numbers from 1 to 99, written in words using the Perso-Arabic Seraiki script. It was created to support Optical Character Recognition (OCR) research and to promote AI development for the underrepresented Seraiki language.
Unlike digit-only datasets (e.g., MNIST), this dataset includes full word forms of numbers, making it suitable for sequence-based recognition tasks (TrOCR… See the full description on the dataset page: https://huggingface.co/datasets/tahaListens/seraiki-handwritten-numerals.
Handwritten Digits images Dataset
kaggle.com
Updated Jan 28, 2020
UMEGS Hamza (2020). Handwritten Digits images Dataset [Dataset]. https://www.kaggle.com/datasets/umegshamza/handwritten-digits-images-dataset/code
Dataset updated
Jan 28, 2020
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
UMEGS Hamza
Description
Dataset

This dataset was created by UMEGS Hamza

Released under Data files © Original Authors

mnist_784
openml.org
Updated Sep 29, 2014
Yann LeCun; Corinna Cortes; Christopher J.C. Burges (2014). mnist_784 [Dataset]. https://www.openml.org/search?type=data&sort=nr_of_likes&status=active&id=554
Dataset updated
Sep 29, 2014
Authors
Yann LeCun; Corinna Cortes; Christopher J.C. Burges
Description
Author: Yann LeCun, Corinna Cortes, Christopher J.C. Burges
Source: MNIST Website - Date unknown
Please cite:

The MNIST database of handwritten digits with 784 features, raw data available at: http://yann.lecun.com/exdb/mnist/. It can be split into a training set of the first 60,000 examples and a test set of 10,000 examples.

It is a subset of a larger set available from NIST. The digits have been size-normalized and centered in a fixed-size image. It is a good database for people who want to try learning techniques and pattern recognition methods on real-world data while spending minimal efforts on preprocessing and formatting. The original black and white (bilevel) images from NIST were size normalized to fit in a 20x20 pixel box while preserving their aspect ratio. The resulting images contain grey levels as a result of the anti-aliasing technique used by the normalization algorithm. The images were centered in a 28x28 image by computing the center of mass of the pixels, and translating the image so as to position this point at the center of the 28x28 field.
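The centering step described above can be sketched in plain Python. This is an illustration only, not the original MNIST preprocessing code: `center_by_mass` is a hypothetical helper, and rounding the shift to whole pixels is an assumption.

```python
def center_by_mass(image, size=28):
    """Place `image` (a list of rows of grey levels) in a size x size field
    so that its intensity center of mass lands near the field's center."""
    total = sum(sum(row) for row in image)
    if total == 0:
        raise ValueError("image has no ink")
    # Intensity-weighted mean row and column.
    cy = sum(r * sum(row) for r, row in enumerate(image)) / total
    cx = sum(c * v for row in image for c, v in enumerate(row)) / total
    # Whole-pixel shift that moves the center of mass to the middle of the field.
    mid = (size - 1) / 2
    dy, dx = round(mid - cy), round(mid - cx)
    field = [[0] * size for _ in range(size)]
    for r, row in enumerate(image):
        for c, v in enumerate(row):
            rr, cc = r + dy, c + dx
            if 0 <= rr < size and 0 <= cc < size:
                field[rr][cc] = v
    return field


# Toy 20x20 "digit": a 2x2 block of ink in the top-left corner.
digit = [[0] * 20 for _ in range(20)]
for r in (0, 1):
    for c in (0, 1):
        digit[r][c] = 255

centered = center_by_mass(digit)
```

In this toy case the 2x2 block ends up on rows and columns 13 and 14, so its center of mass sits at (13.5, 13.5), the middle of the 28x28 field.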

With some classification methods (particularly template-based methods, such as SVM and K-nearest neighbors), the error rate improves when the digits are centered by bounding box rather than center of mass. If you do this kind of pre-processing, you should report it in your publications. The MNIST database was constructed from NIST's Special Database 3 (SD-3) and Special Database 1 (SD-1). NIST originally designated SD-3 as their training set and SD-1 as their test set. However, SD-3 is much cleaner and easier to recognize than SD-1. The reason for this lies in the fact that SD-3 was collected among Census Bureau employees, while SD-1 was collected among high-school students. Drawing sensible conclusions from learning experiments requires that the result be independent of the choice of training set and test set among the complete set of samples. Therefore it was necessary to build a new database by mixing NIST's datasets.

The MNIST training set is composed of 30,000 patterns from SD-3 and 30,000 patterns from SD-1. Our test set was composed of 5,000 patterns from SD-3 and 5,000 patterns from SD-1. The 60,000 pattern training set contained examples from approximately 250 writers. We made sure that the sets of writers of the training set and test set were disjoint. SD-1 contains 58,527 digit images written by 500 different writers. In contrast to SD-3, where blocks of data from each writer appeared in sequence, the data in SD-1 is scrambled. Writer identities for SD-1 are available and we used this information to unscramble the writers. We then split SD-1 in two: characters written by the first 250 writers went into our new training set. The remaining 250 writers were placed in our test set. Thus we had two sets with nearly 30,000 examples each. The new training set was completed with enough examples from SD-3, starting at pattern # 0, to make a full set of 60,000 training patterns. Similarly, the new test set was completed with SD-3 examples starting at pattern # 35,000 to make a full set with 60,000 test patterns. Only a subset of 10,000 test images (5,000 from SD-1 and 5,000 from SD-3) is available on this site. The full 60,000 sample training set is available.
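The writer-disjoint split of SD-1 described above can be illustrated with a small sketch. The data here is synthetic, and `split_by_writer` is a hypothetical helper; only the rule (first 250 writers to training, remaining 250 to test) comes from the description.

```python
def split_by_writer(samples, train_writers):
    """Partition (writer_id, image) pairs so that no writer appears in both sets."""
    train_writers = set(train_writers)
    train = [s for s in samples if s[0] in train_writers]
    test = [s for s in samples if s[0] not in train_writers]
    return train, test


# Toy data: 500 hypothetical writer ids with three placeholder images each.
samples = [(writer, f"img_{writer}_{k}") for writer in range(500) for k in range(3)]

# The first 250 writers go to training, the remaining 250 to test.
train, test = split_by_writer(samples, train_writers=range(250))
```

Splitting on writer identity rather than on individual images is what keeps a writer's handwriting from appearing on both sides of the evaluation.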
Devanagari Handwritten Digit and Character Dataset
figshare.com
application/x-rar
Updated May 16, 2024
Rajiv Kumar (2024). Devanagari Handwritten Digit and Character Dataset [Dataset]. http://doi.org/10.6084/m9.figshare.25835629.v1
Available download formats: application/x-rar
Unique identifier
https://doi.org/10.6084/m9.figshare.25835629.v1
Dataset updated
May 16, 2024
Dataset provided by
figshare
Authors
Rajiv Kumar
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
A benchmark dataset is required for the development of an efficient and reliable recognition system. Unfortunately, no comprehensive benchmark dataset exists for handwritten Devnagari optical document recognition research, at least in the public domain. This paper is an effort in this direction. Here, we introduce a comprehensive dataset, which we refer to as the CPAR-2012 dataset, for such benchmark studies, and also present some preliminary recognition results. The dataset includes 35,000 isolated handwritten numerals, 83,300 characters, 2,000 constrained and 2,000 unconstrained handwritten pangrams. It is organized in a relational data model that contains text images along with their writer's information and related handwriting attributes. We collected the handwriting samples from 2,000 subjects who were chosen from different age, ethnicity, and educational background, regional and linguistic groups. The samples reflect expected variations in Devnagari handwriting. The digit recognition results using recognition schemes that use simple most features & four neural network classifiers & KNN, and classifier ensemble have also been reported for benchmarking.
Handwritten Hindko Digits Dataset (HHDD)
data.mendeley.com
Updated Oct 29, 2024
+ more versions
tanveer Ahmed (2024). Handwritten Hindko Digits Dataset (HHDD) [Dataset]. http://doi.org/10.17632/gz8r3spkns.3
Unique identifier
https://doi.org/10.17632/gz8r3spkns.3
Dataset updated
Oct 29, 2024
Authors
tanveer Ahmed
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The dataset consists of the Hindko numbers from 1 to 100 written in words. The words were written on pages, and every candidate was asked to write all 100 words twice, so 200 samples were taken from every candidate. Every candidate signed an undertaking that he/she has no objection to the use of this writing for academic and research purposes. The pages were then scanned at 1200 dpi, and the words were cropped from the scanned images with a cropping tool and saved into folders. A separate folder was created for every class and labelled from 1 to 100, and every sample was saved into its relevant folder, so 100 folders are used for the 100 different words. Because the images differed in size, every image was resized to the same size of 50x50 pixels. The dataset consists of 224,782 samples. The image dataset occupies 394 MB, and the CSV version of the dataset occupies 1098 MB.
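The folder-per-class layout described above suggests a simple way to pair images with labels. The sketch below assumes the folders are literally named "1" through "100" and contain only image files; the actual archive layout is not shown in the description, and `index_class_folders` is a hypothetical helper.

```python
from pathlib import Path


def index_class_folders(root, num_classes=100):
    """Return (image_path, label) pairs from folders named "1" .. str(num_classes)."""
    pairs = []
    for label in range(1, num_classes + 1):
        folder = Path(root) / str(label)
        if not folder.is_dir():
            continue  # tolerate missing classes, e.g. in a partial download
        for path in sorted(folder.iterdir()):
            if path.is_file():
                pairs.append((path, label))
    return pairs
```

Calling `index_class_folders("path/to/dataset")` on an extracted copy would yield one pair per sample, with the label taken from the folder name.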
In-Air Hand-Drawn Number and Shape Dataset
orda.shef.ac.uk
zip
Updated Jul 14, 2025
Basheer Alwaely; Charith Abhayaratne (2025). In-Air Hand-Drawn Number and Shape Dataset [Dataset]. http://doi.org/10.15131/shef.data.7381472.v2
Available download formats: zip
Unique identifier
https://doi.org/10.15131/shef.data.7381472.v2
Dataset updated
Jul 14, 2025
Dataset provided by
The University of Sheffield
Authors
Basheer Alwaely; Charith Abhayaratne
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset contains in-air hand-written numbers and shapes data used in the paper:

B. Alwaely and C. Abhayaratne, "Graph Spectral Domain Feature Learning With Application to in-Air Hand-Drawn Number and Shape Recognition," in IEEE Access, vol. 7, pp. 159661-159673, 2019, doi: 10.1109/ACCESS.2019.2950643.

The dataset contains the following:
- Readme.txt
- InAirNumberShapeDataset.zip containing
  - Number Folder (with 2 sub folders for Matlab and Excel)
  - Shapes Folder (with 2 sub folders for Matlab and Excel)

The datasets include the in-air drawn number and shape hand movement paths captured by a Kinect sensor. The number sub dataset includes 500 instances of each number 0 to 9, resulting in a total of 5000 number data instances. Similarly, the shape sub dataset includes 500 instances of each of 10 different arbitrary 2D shapes, resulting in a total of 5000 shape instances. The dataset provides X, Y, Z coordinates of the hand movement path data in Matlab (M-file) and Excel formats and their corresponding labels.

This dataset creation has received The University of Sheffield ethics approval under application #023005 granted on 19/10/2018.
Data from: EMNIST: an extension of MNIST to handwritten letters
researchdata.edu.au
Updated May 16, 2023
van Schaik Andre; Tapson Jonathan; Afshar Saeed; Cohen Gregory (2023). EMNIST: an extension of MNIST to handwritten letters [Dataset]. http://doi.org/10.26183/M9K1-ZR06
Unique identifier
https://doi.org/10.26183/M9K1-ZR06
Dataset updated
May 16, 2023
Dataset provided by
Western Sydney University
Authors
van Schaik Andre; Tapson Jonathan; Afshar Saeed; Cohen Gregory
Description
The MNIST dataset has become a standard benchmark for learning, classification and computer vision systems. Contributing to its widespread adoption are the understandable and intuitive nature of the task, its relatively small size and storage requirements and the accessibility and ease-of-use of the database itself. The MNIST database was derived from a larger dataset known as the NIST Special Database 19 which contains digits, uppercase and lowercase handwritten letters. This paper introduces a variant of the full NIST dataset, which we have called Extended MNIST (EMNIST), which follows the same conversion paradigm used to create the MNIST dataset. The result is a set of datasets that constitute more challenging classification tasks involving letters and digits, and that share the same image structure and parameters as the original MNIST task, allowing for direct compatibility with all existing classifiers and systems. Benchmark results are presented along with a validation of the conversion process through the comparison of the classification results on converted NIST digits and the MNIST digits.
A Read Me file describing the database is included in the available attachments.
Note: The available zip files are each > 500MB in size. Should these files become unavailable from the website provided, please contact Western Sydney University Library about this record.
