Facebook
TwitterMNIST is a subset of a larger set available from NIST (it's copied from http://yann.lecun.com/exdb/mnist/)
The MNIST database of handwritten digits has a training set of 60,000 examples, and a test set of 10,000 examples. . Four files are available:
Many methods have been tested with this training set and test set (see http://yann.lecun.com/exdb/mnist/ for more details)
Facebook
TwitterAttribution-ShareAlike 3.0 (CC BY-SA 3.0)https://creativecommons.org/licenses/by-sa/3.0/
License information was derived automatically
The MNIST dataset is a dataset of handwritten digits. It is a popular dataset for machine learning and artificial intelligence research. The dataset consists of 60,000 training images and 10,000 test images. Each image is a 28x28 pixel grayscale image of a handwritten digit. The digits are labeled from 0 to 9.
Facebook
TwitterMIT Licensehttps://opensource.org/licenses/MIT
License information was derived automatically
The MNIST database (Modified National Institute of Standards and Technology database) is a large collection of handwritten digits. It is a subset of a larger NIST Special Database 3 (digits written by employees of the United States Census Bureau) and Special Database 1 (digits written by high school students) which contain monochrome images of handwritten digits. The digits have been size-normalized and centered in a fixed-size image. The original black and white (bilevel) images from NIST were size normalized to fit in a 20x20 pixel box while preserving their aspect ratio. The resulting images contain grey levels as a result of the anti-aliasing technique used by the normalization algorithm. the images were centered in a 28x28 image by computing the center of mass of the pixels and translating the image so as to position this point at the center of the 28x28 field.
License: Yann LeCun and Corinna Cortes hold the copyright of MNIST dataset, which is a derivative work from original NIST datasets. MNIST dataset is made available under the terms of the Creative Commons Attribution-Share Alike 3.0 license.
Facebook
Twitterhttps://academictorrents.com/nolicensespecifiedhttps://academictorrents.com/nolicensespecified
The MNIST database of handwritten digits, available from this page, has a training set of 60,000 examples, and a test set of 10,000 examples. It is a subset of a larger set available from NIST. The digits have been size-normalized and centered in a fixed-size image. It is a good database for people who want to try learning techniques and pattern recognition methods on real-world data while spending minimal efforts on preprocessing and formatting. The original black and white (bilevel) images from NIST were size normalized to fit in a 20x20 pixel box while preserving their aspect ratio. The resulting images contain grey levels as a result of the anti-aliasing technique used by the normalization algorithm. the images were centered in a 28x28 image by computing the center of mass of the pixels, and translating the image so as to position this point at the center of the 28x28 field. With some classification methods (particuarly template-based methods, such as SVM and K-nearest neighbors),
Facebook
Twitterhttp://www.gnu.org/licenses/lgpl-3.0.htmlhttp://www.gnu.org/licenses/lgpl-3.0.html
The MNIST-100 dataset is a variation of the original MNIST dataset, consisting of 100 handwritten numbers extracted from the MNIST dataset. Unlike the traditional MNIST dataset, which contains 60,000 training images of digits from 0 to 9, the Modified MNIST-10 dataset focuses on 100 numbers.
Dataset Overview: - Dataset Name: MNIST-100 - Total Number of Images: train: 60000 test: 1000 - Classes: 100 (Numbers from 00 to 99) - Image Size: 28x56 pixels (grayscale)
Data Collection: The MNIST-100 dataset was created by randomly selecting 10 unique digits from the original MNIST dataset. For each selected digit, 10 representative images were extracted, resulting in a total of 100 images. These images were carefully chosen to represent a diverse range of handwriting styles for each digit.
Each image in the dataset is labeled with its corresponding numbers, ranging from 00 to 99, making it suitable for classification tasks. Researchers and practitioners can use this dataset to train and evaluate machine learning algorithms and neural networks for digit recognition and classification.
Please note that the Modified MNIST-100 dataset is not intended to replace the original MNIST dataset but serves as a complementary resource for specific applications requiring a smaller and more focused subset of the MNIST data.
Overall, the MNIST-100 dataset offers a compact and representative collection of 100 handwritten numbers, providing a convenient tool for experimentation and learning in computer vision and pattern recognition.
Label Distribution for training set:
| Label | Occurrences | Label | Occurrences | Label | Occurrences |
|---|---|---|---|---|---|
| 0 | 561 | 34 | 629 | 68 | 606 |
| 1 | 687 | 35 | 540 | 69 | 582 |
| 2 | 582 | 36 | 588 | 70 | 566 |
| 3 | 633 | 37 | 619 | 71 | 659 |
| 4 | 588 | 38 | 584 | 72 | 572 |
| 5 | 544 | 39 | 609 | 73 | 682 |
| 6 | 582 | 40 | 570 | 74 | 627 |
| 7 | 615 | 41 | 679 | 75 | 598 |
| 8 | 584 | 42 | 544 | 76 | 605 |
| 9 | 567 | 43 | 567 | 77 | 602 |
| 10 | 641 | 44 | 574 | 78 | 595 |
| 11 | 780 | 45 | 555 | 79 | 586 |
| 12 | 720 | 46 | 550 | 80 | 569 |
| 13 | 699 | 47 | 614 | 81 | 628 |
| 14 | 630 | 48 | 614 | 82 | 578 |
| 15 | 627 | 49 | 595 | 83 | 622 |
| 16 | 684 | 50 | 505 | 84 | 569 |
| 17 | 713 | 51 | 583 | 85 | 540 |
| 18 | 743 | 52 | 512 | 86 | 557 |
| 19 | 706 | 53 | 555 | 87 | 628 |
| 20 | 527 | 54 | 504 | 88 | 562 |
| 21 | 710 | 55 | 488 | 89 | 625 |
| 22 | 586 | 56 | 531 | 90 | 600 |
| 23 | 584 | 57 | 556 | 91 | 700 |
| 24 | 568 | 58 | 497 | 92 | 622 |
| 25 | 530 | 59 | 520 | 93 | 622 |
| 26 | 612 | 60 | 556 | 94 | 591 |
| 27 | 627 | 61 | 682 | 95 | 557 |
| 28 | 618 | 62 | 594 | 96 | 580 |
| 29 | 619 | 63 | 539 | 97 | 640 |
| 30 | 622 | 64 | 610 | 98 | 577 |
| 31 | 684 | 65 | 514 | 99 | 563 |
| 32 | 606 | 66 | 587 | ||
| 33 | 592 | 67 | 655 |
Test data:
https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F7193292%2Fac688f2526851734cb50be10f0a7bd7d%2Fpobrane%20(16).png?generation=1690276359580027&alt=media" alt="">
| Label | Occurrences | Label | Occurrences | Label | Occurrences |
|---|---|---|---|---|---|
| 00 | 96 | 34 | 100 | 68 | 90 |
| 01 | 108 | 35 | 91 | 69 | 92 |
| 02 | 91 | 36 | 107 | 70 | 102 |
| 03 | 96 | 37 | 112 | 71 | 116 |
| 04 | 75 | 38 | 97 | 72 | 101 |
| 05 | 85 | 39 | 96 | 73 | 106 |
| 06 | 88 | 40 | 103 | 74 | 98 |
| 07 | 96 | 41 | 123 | 75 ... |
Facebook
TwitterThe goal of introducing the Rescaled Fashion-MNIST dataset is to provide a dataset that contains scale variations (up to a factor of 4), to evaluate the ability of networks to generalise to scales not present in the training data.
The Rescaled Fashion-MNIST dataset was introduced in the paper:
[1] A. Perzanowski and T. Lindeberg (2025) "Scale generalisation properties of extended scale-covariant and scale-invariant Gaussian derivative networks on image datasets with spatial scaling variations”, Journal of Mathematical Imaging and Vision, 67(29), https://doi.org/10.1007/s10851-025-01245-x.
with a pre-print available at arXiv:
[2] Perzanowski and Lindeberg (2024) "Scale generalisation properties of extended scale-covariant and scale-invariant Gaussian derivative networks on image datasets with spatial scaling variations”, arXiv preprint arXiv:2409.11140.
Importantly, the Rescaled Fashion-MNIST dataset is more challenging than the MNIST Large Scale dataset, introduced in:
[3] Y. Jansson and T. Lindeberg (2022) "Scale-invariant scale-channel networks: Deep networks that generalise to previously unseen scales", Journal of Mathematical Imaging and Vision, 64(5): 506-536, https://doi.org/10.1007/s10851-022-01082-2.
The Rescaled Fashion-MNIST dataset is provided on the condition that you provide proper citation for the original Fashion-MNIST dataset:
[4] Xiao, H., Rasul, K., and Vollgraf, R. (2017) “Fashion-MNIST: A novel image dataset for benchmarking machine learning algorithms”, arXiv preprint arXiv:1708.07747
and also for this new rescaled version, using the reference [1] above.
The data set is made available on request. If you would be interested in trying out this data set, please make a request in the system below, and we will grant you access as soon as possible.
The Rescaled FashionMNIST dataset is generated by rescaling 28×28 gray-scale images of clothes from the original FashionMNIST dataset [4]. The scale variations are up to a factor of 4, and the images are embedded within black images of size 72x72, with the object in the frame always centred. The imresize() function in Matlab was used for the rescaling, with default anti-aliasing turned on, and bicubic interpolation overshoot removed by clipping to the [0, 255] range. The details of how the dataset was created can be found in [1].
There are 10 different classes in the dataset: “T-shirt/top”, “trouser”, “pullover”, “dress”, “coat”, “sandal”, “shirt”, “sneaker”, “bag” and “ankle boot”. In the dataset, these are represented by integer labels in the range [0, 9].
The dataset is split into 50 000 training samples, 10 000 validation samples and 10 000 testing samples. The training dataset is generated using the initial 50 000 samples from the original Fashion-MNIST training set. The validation dataset, on the other hand, is formed from the final 10 000 images of that same training set. For testing, all test datasets are built from the 10 000 images contained in the original Fashion-MNIST test set.
The training dataset file (~2.9 GB) for scale 1, which also contains the corresponding validation and test data for the same scale, is:
fashionmnist_with_scale_variations_tr50000_vl10000_te10000_outsize72-72_scte1p000_scte1p000.h5
Additionally, for the Rescaled FashionMNIST dataset, there are 9 datasets (~415 MB each) for testing scale generalisation at scales not present in the training set. Each of these datasets is rescaled using a different image scaling factor, 2k/4, with k being integers in the range [-4, 4]:
fashionmnist_with_scale_variations_te10000_outsize72-72_scte0p500.h5
fashionmnist_with_scale_variations_te10000_outsize72-72_scte0p595.h5
fashionmnist_with_scale_variations_te10000_outsize72-72_scte0p707.h5
fashionmnist_with_scale_variations_te10000_outsize72-72_scte0p841.h5
fashionmnist_with_scale_variations_te10000_outsize72-72_scte1p000.h5
fashionmnist_with_scale_variations_te10000_outsize72-72_scte1p189.h5
fashionmnist_with_scale_variations_te10000_outsize72-72_scte1p414.h5
fashionmnist_with_scale_variations_te10000_outsize72-72_scte1p682.h5
fashionmnist_with_scale_variations_te10000_outsize72-72_scte2p000.h5
These dataset files were used for the experiments presented in Figures 6, 7, 14, 16, 19 and 23 in [1].
The datasets are saved in HDF5 format, with the partitions in the respective h5 files named as
('/x_train', '/x_val', '/x_test', '/y_train', '/y_test', '/y_val'); which ones exist depends on which data split is used.
The training dataset can be loaded in Python as:
with h5py.File(`
x_train = np.array( f["/x_train"], dtype=np.float32)
x_val = np.array( f["/x_val"], dtype=np.float32)
x_test = np.array( f["/x_test"], dtype=np.float32)
y_train = np.array( f["/y_train"], dtype=np.int32)
y_val = np.array( f["/y_val"], dtype=np.int32)
y_test = np.array( f["/y_test"], dtype=np.int32)
We also need to permute the data, since Pytorch uses the format [num_samples, channels, width, height], while the data is saved as [num_samples, width, height, channels]:
x_train = np.transpose(x_train, (0, 3, 1, 2))
x_val = np.transpose(x_val, (0, 3, 1, 2))
x_test = np.transpose(x_test, (0, 3, 1, 2))
The test datasets can be loaded in Python as:
with h5py.File(`
x_test = np.array( f["/x_test"], dtype=np.float32)
y_test = np.array( f["/y_test"], dtype=np.int32)
The test datasets can be loaded in Matlab as:
x_test = h5read(`
The images are stored as [num_samples, x_dim, y_dim, channels] in HDF5 files. The pixel intensity values are not normalised, and are in a [0, 255] range.
There is also a closely related Fashion-MNIST with translations dataset, which in addition to scaling variations also comprises spatial translations of the objects.
Facebook
TwitterFashion-MNIST is a dataset of Zalando's article images consisting of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28x28 grayscale image, associated with a label from 10 classes.
To use this dataset:
import tensorflow_datasets as tfds
ds = tfds.load('fashion_mnist', split='train')
for ex in ds.take(4):
print(ex)
See the guide for more informations on tensorflow_datasets.
https://storage.googleapis.com/tfds-data/visualization/fig/fashion_mnist-3.0.1.png" alt="Visualization" width="500px">
Facebook
TwitterAttribution-NoDerivs 4.0 (CC BY-ND 4.0)https://creativecommons.org/licenses/by-nd/4.0/
License information was derived automatically
The EMNIST dataset is a set of handwritten character digits derived from the NIST Special Database 19 (https://www.nist.gov/srd/nist-special-database-19) and converted to a 28x28 pixel image format and dataset structure that directly matches the MNIST dataset (http://yann.lecun.com/exdb/mnist/). Further information on the dataset contents and conversion process can be found in the paper available at https://arxiv.org/abs/1702.05373v2
The MNIST dataset has become a standard benchmark for learning, classification and computer vision systems. Contributing to its widespread adoption are the understandable and intuitive nature of the task, its relatively small size and storage requirements and the accessibility and ease-of-use of the database itself. The MNIST database was derived from a larger dataset known as the NIST Special Database 19 which contains digits, uppercase and lowercase handwritten letters. This paper introduces a variant of the full NIST dataset, which we have called Extended MNIST (EMNIST), which follows the same conversion paradigm used to create the MNIST dataset. The result is a set of datasets that constitute a more challenging classification tasks involving letters and digits, and that shares the same image structure and parameters as the original MNIST task, allowing for direct compatibility with all existing classifiers and systems. Benchmark results are presented along with a validation of the conversion process through the comparison of the classification results on converted NIST digits and the MNIST digits.
The database is made available in original MNIST format and Matlab format.
Facebook
TwitterThe MNIST dataset is a collection of images of handwritten digits, with size n = 70,000 and D = 784.
Facebook
TwitterThis dataset was created by ᗪIYᗩ ᗩGᗩTᕼ
Facebook
TwitterThe MNIST-scale dataset is a dataset of images of handwritten digits, where each image is scaled to a different size.
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is derived from the Leaf repository (https://github.com/TalwalkarLab/leaf) pre-processing of the Extended MNIST dataset, grouping examples by writer. Details about Leaf were published in "LEAF: A Benchmark for Federated Settings" https://arxiv.org/abs/1812.01097Note: This dataset does not include some additional preprocessing that MNIST includes, such as size-normalization and centering. In the Federated EMNIST data, the value of 1.0 corresponds to the background, and 0.0 corresponds to the color of the digits themselves; this is the inverse of some MNIST representations, e.g. in tensorflow_datasets, where 0 corresponds to the background color, and 255 represents the color of the digit.Data set sizes:only_digits=True: 3,383 users, 10 label classestrain: 341,873 examplestest: 40,832 examplesonly_digits=False: 3,400 users, 62 label classestrain: 671,585 examplestest: 77,483 examplesRather than holding out specific users, each user's examples are split across train and test so that all users have at least one example in train and one example in test. Writers that had less than 2 examples are excluded from the data set.The tf.data.Datasets returned by tff.simulation.datasets.ClientData.create_tf_dataset_for_client will yield collections.OrderedDict objects at each iteration, with the following keys and values, in lexicographic order by key:'label': a tf.Tensor with dtype=tf.int32 and shape [1], the class label of the corresponding pixels. Labels [0-9] correspond to the digits classes, labels [10-35] correspond to the uppercase classes (e.g., label 11 is 'B'), and labels [36-61] correspond to the lowercase classes (e.g., label 37 is 'b').'pixels': a tf.Tensor with dtype=tf.float32 and shape [28, 28], containing the pixels of the handwritten digit, with values in the range [0.0, 1.0].Argsonly_digits(Optional) whether to only include examples that are from the digits [0-9] classes. If False, includes lower and upper case characters, for a total of 62 class labels.cache_dir(Optional) directory to cache the downloaded file. If None, caches in Keras' default cache directory.ReturnsTuple of (train, test) where the tuple elements are tff.simulation.datasets.ClientData objects.
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Database description:
The written and spoken digits database is not a new database but a constructed database from existing ones, in order to provide a ready-to-use database for multimodal fusion.
The written digits database is the original MNIST handwritten digits database [1] with no additional processing. It consists of 70000 images (60000 for training and 10000 for test) of 28 x 28 = 784 dimensions.
The spoken digits database was extracted from Google Speech Commands [2], an audio dataset of spoken words that was proposed to train and evaluate keyword spotting systems. It consists of 105829 utterances of 35 words, amongst which 38908 utterances of the ten digits (34801 for training and 4107 for test). A pre-processing was done via the extraction of the Mel Frequency Cepstral Coefficients (MFCC) with a framing window size of 50 ms and frame shift size of 25 ms. Since the speech samples are approximately 1 s long, we end up with 39 time slots. For each one, we extract 12 MFCC coefficients with an additional energy coefficient. Thus, we have a final vector of 39 x 13 = 507 dimensions. Standardization and normalization were applied on the MFCC features.
To construct the multimodal digits dataset, we associated written and spoken digits of the same class respecting the initial partitioning in [1] and [2] for the training and test subsets. Since we have less samples for the spoken digits, we duplicated some random samples to match the number of written digits and have a multimodal digits database of 70000 samples (60000 for training and 10000 for test).
The dataset is provided in six files as described below. Therefore, if a shuffle is performed on the training or test subsets, it must be performed in unison with the same order for the written digits, spoken digits and labels.
Files:
References:
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Code [GitHub] | Publication [Nature Scientific Data'23 / ISBI'21] | Preprint [arXiv]
Abstract
We introduce MedMNIST, a large-scale MNIST-like collection of standardized biomedical images, including 12 datasets for 2D and 6 datasets for 3D. All images are pre-processed into 28x28 (2D) or 28x28x28 (3D) with the corresponding classification labels, so that no background knowledge is required for users. Covering primary data modalities in biomedical images, MedMNIST is designed to perform classification on lightweight 2D and 3D images with various data scales (from 100 to 100,000) and diverse tasks (binary/multi-class, ordinal regression and multi-label). The resulting dataset, consisting of approximately 708K 2D images and 10K 3D images in total, could support numerous research and educational purposes in biomedical image analysis, computer vision and machine learning. We benchmark several baseline methods on MedMNIST, including 2D / 3D neural networks and open-source / commercial AutoML tools. The data and code are publicly available at https://medmnist.com/.
Disclaimer: The only official distribution link for the MedMNIST dataset is Zenodo. We kindly request users to refer to this original dataset link for accurate and up-to-date data.
Update: We are thrilled to release MedMNIST+ with larger sizes: 64x64, 128x128, and 224x224 for 2D, and 64x64x64 for 3D. As a complement to the previous 28-size MedMNIST, the large-size version could serve as a standardized benchmark for medical foundation models. Install the latest API to try it out!
Python Usage
We recommend our official code to download, parse and use the MedMNIST dataset:
% pip install medmnist% python
To use the standard 28-size (MNIST-like) version utilizing the downloaded files:
from medmnist import PathMNIST
train_dataset = PathMNIST(split="train")
To enable automatic downloading by setting download=True:
from medmnist import NoduleMNIST3D
val_dataset = NoduleMNIST3D(split="val", download=True)
Alternatively, you can access MedMNIST+ with larger image sizes by specifying the size parameter:
from medmnist import ChestMNIST
test_dataset = ChestMNIST(split="test", download=True, size=224)
Citation
If you find this project useful, please cite both v1 and v2 paper as:
Jiancheng Yang, Rui Shi, Donglai Wei, Zequan Liu, Lin Zhao, Bilian Ke, Hanspeter Pfister, Bingbing Ni. Yang, Jiancheng, et al. "MedMNIST v2-A large-scale lightweight benchmark for 2D and 3D biomedical image classification." Scientific Data, 2023.
Jiancheng Yang, Rui Shi, Bingbing Ni. "MedMNIST Classification Decathlon: A Lightweight AutoML Benchmark for Medical Image Analysis". IEEE 18th International Symposium on Biomedical Imaging (ISBI), 2021.
or using bibtex:
@article{medmnistv2, title={MedMNIST v2-A large-scale lightweight benchmark for 2D and 3D biomedical image classification}, author={Yang, Jiancheng and Shi, Rui and Wei, Donglai and Liu, Zequan and Zhao, Lin and Ke, Bilian and Pfister, Hanspeter and Ni, Bingbing}, journal={Scientific Data}, volume={10}, number={1}, pages={41}, year={2023}, publisher={Nature Publishing Group UK London} }
@inproceedings{medmnistv1, title={MedMNIST Classification Decathlon: A Lightweight AutoML Benchmark for Medical Image Analysis}, author={Yang, Jiancheng and Shi, Rui and Ni, Bingbing}, booktitle={IEEE 18th International Symposium on Biomedical Imaging (ISBI)}, pages={191--195}, year={2021} }
Please also cite the corresponding paper(s) of source data if you use any subset of MedMNIST as per the description on the project website.
License
The MedMNIST dataset is licensed under Creative Commons Attribution 4.0 International (CC BY 4.0), except DermaMNIST under Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0).
The code is under Apache-2.0 License.
Changelog
v3.0 (this repository): Released MedMNIST+ featuring larger sizes: 64x64, 128x128, and 224x224 for 2D, and 64x64x64 for 3D.
v2.2: Removed a small number of mistakenly included blank samples in OrganAMNIST, OrganCMNIST, OrganSMNIST, OrganMNIST3D, and VesselMNIST3D.
v2.1: Addressed an issue in the NoduleMNIST3D file (i.e., nodulemnist3d.npz). Further details can be found in this issue.
v2.0: Launched the initial repository of MedMNIST v2, adding 6 datasets for 3D and 2 for 2D.
v1.0: Established the initial repository (in a separate repository) of MedMNIST v1, featuring 10 datasets for 2D.
Note: This dataset is NOT intended for clinical use.
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Here we present a dataset, MNIST4OD, of large size (number of dimensions and number of instances) suitable for Outliers Detection task.The dataset is based on the famous MNIST dataset (http://yann.lecun.com/exdb/mnist/).We build MNIST4OD in the following way:To distinguish between outliers and inliers, we choose the images belonging to a digit as inliers (e.g. digit 1) and we sample with uniform probability on the remaining images as outliers such as their number is equal to 10% of that of inliers. We repeat this dataset generation process for all digits. For implementation simplicity we then flatten the images (28 X 28) into vectors.Each file MNIST_x.csv.gz contains the corresponding dataset where the inlier class is equal to x.The data contains one instance (vector) in each line where the last column represents the outlier label (yes/no) of the data point. The data contains also a column which indicates the original image class (0-9).See the following numbers for a complete list of the statistics of each datasets ( Name | Instances | Dimensions | Number of Outliers in % ):MNIST_0 | 7594 | 784 | 10MNIST_1 | 8665 | 784 | 10MNIST_2 | 7689 | 784 | 10MNIST_3 | 7856 | 784 | 10MNIST_4 | 7507 | 784 | 10MNIST_5 | 6945 | 784 | 10MNIST_6 | 7564 | 784 | 10MNIST_7 | 8023 | 784 | 10MNIST_8 | 7508 | 784 | 10MNIST_9 | 7654 | 784 | 10
Facebook
TwitterDataset Card for "notMNIST"
Overview
The notMNIST dataset is a collection of images of letters from A to J in various fonts. It is designed as a more challenging alternative to the traditional MNIST dataset, which consists of handwritten digits. The notMNIST dataset is commonly used in machine learning and computer vision tasks for character recognition.
Dataset Information
Number of Classes: 10 (A to J) Number of Samples: 187,24 Image Size: 28 x 28 pixels… See the full description on the dataset page: https://huggingface.co/datasets/anubhavmaity/notMNIST.
Facebook
TwitterThe goal of this dataset is to create a custom dataset for multi-digit recognition tasks by concatenating pairs of digits from the MNIST dataset into single 128x128 pixel images and assigning labels that represent two-digit numbers from '00' to '99'.
Image (128 x 128 pixel Numpy array): The dataset contains images of size 128 x 128 pixels. Each image is a composition of two pairs of MNIST digits. Each digit occupies a 28 x 28 pixel space within the larger 128 x 128 pixel canvas. The digits are randomly placed within the canvas to simulate real-world scenarios.
Label (Int): The labels represent two-digit numbers ranging from '00' to '99'. These labels are assigned based on the digits present in the image and their order. For example, an image with '7' and '2' as the first and second digits would be labeled as '72' ('7' * 10 + '2'). Leading zeros are added to ensure that all labels are two characters in length.
Training Data: 60,000 data points Test Data: 10,000 data points
Data Generation: To create this dataset, you would start with the MNIST dataset, which contains single-digit images of handwritten digits from '0' to '9'. For each data point in the new dataset, you would randomly select two pairs of digits from MNIST and place them on a 128 x 128 canvas. The digits are placed at random positions, and their order can also be random. After creating the multi-digit image, you assign a label by concatenating the labels of the individual digits while ensuring they are two characters in length.
Key Features of the 2-Digit Classification Dataset:
Multi-Digit Images: This dataset consists of multi-digit images, each containing two handwritten digits. The inclusion of multiple digits in a single image presents a unique and challenging classification task.
Labeling Complexity: Labels are represented as two-digit numbers, adding complexity to the classification problem. The labels range from '00' to '99,' encompassing a wide variety of possible combinations.
Diverse Handwriting Styles: The dataset captures diverse handwriting styles, making it suitable for testing the robustness and generalization capabilities of machine learning models.
128x128 Pixel Images: Images are provided in a high-resolution format of 128x128 pixels, allowing for fine-grained analysis and leveraging the increased image information.
Large-Scale Training and Test Sets: With 60,000 training data points and 10,000 test data points, this dataset provides ample data for training and evaluating classification models.
Potential Use Cases:
Multi-Digit Recognition: The dataset is ideal for developing and evaluating machine learning models that can accurately classify multi-digit sequences, which find applications in reading house numbers, license plates, and more.
OCR (Optical Character Recognition) Systems: Researchers and developers can use this dataset to train and benchmark OCR systems for recognizing handwritten multi-digit numbers.
Real-World Document Processing: In scenarios where documents contain multiple handwritten numbers, such as invoices, receipts, and forms, this dataset can be valuable for automating data extraction.
Address Parsing: It can be used to build systems capable of parsing handwritten addresses and extracting postal codes or other important information.
Authentication and Security: Multi-digit classification models can contribute to security applications by recognizing handwritten PINs, passwords, or access codes.
Education and Handwriting Analysis: Educational institutions can use this dataset to create handwriting analysis tools and assess the difficulty of recognizing different handwritten number combinations.
Benchmarking Deep Learning Models: Data scientists and machine learning practitioners can use this dataset as a benchmark for testing and improving deep learning models' performance in multi-digit classification tasks.
Data Augmentation: Researchers can employ data augmentation techniques to generate even more training data by introducing variations in digit placement and size.
Model Explainability: Developing models for interpreting and explaining the reasoning behind classifying specific multi-digit combinations can have applications in AI ethics and accountability.
Visualizations and Data Exploration: Researchers can use this dataset to explore visualizations and data analysis techniques to gain insights into the characteristics of handwritten multi-digit numbers.
In summary, the 2-Digit Classification Dataset offers a unique opportunity to work on a challenging multi-digit recognition problem with real-world applications, making it a valuable resource for researchers, developers, and data scientists.
Note: Creating this dataset would require a considerable amount of preprocessing and image manipulation. ...
Facebook
TwitterThis dataset was created by Tan Li Tung
Facebook
TwitterApache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
[!NOTE] This dataset card is based on the README file of the authors' GitHub repository: https://github.com/greydanus/mnist1d
The MNIST-1D Dataset
Most machine learning models get around the same ~99% test accuracy on MNIST. The MNIST-1D dataset is 100x smaller (default sample size: 4000+1000; dimensionality: 40) and does a better job of separating between models with/without nonlinearity and models with/without spatial inductive biases. MNIST-1D is a core teaching dataset in… See the full description on the dataset page: https://huggingface.co/datasets/christopher/mnist1d.
Facebook
TwitterAttribution-ShareAlike 4.0 (CC BY-SA 4.0)https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
This dataset contains a simplified version of the famous MNIST handwritten digits dataset. This version involves distinguishing between digits 3 and 5 rather than the full range 0-9.
Facebook
TwitterMNIST is a subset of a larger set available from NIST (it's copied from http://yann.lecun.com/exdb/mnist/)
The MNIST database of handwritten digits has a training set of 60,000 examples, and a test set of 10,000 examples. . Four files are available:
Many methods have been tested with this training set and test set (see http://yann.lecun.com/exdb/mnist/ for more details)