The MNIST database of handwritten digits.
To use this dataset:
import tensorflow_datasets as tfds
ds = tfds.load('mnist', split='train')
for ex in ds.take(4):
  print(ex)
See the guide for more information on tensorflow_datasets.
Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/mnist-3.0.1.png
The MNIST database (Modified National Institute of Standards and Technology database) is a large collection of handwritten digits. It has a training set of 60,000 examples and a test set of 10,000 examples. It is a subset of the larger NIST Special Database 3 (digits written by employees of the United States Census Bureau) and Special Database 1 (digits written by high school students), which contain monochrome images of handwritten digits. The digits have been size-normalized and centered in a fixed-size image. The original black and white (bilevel) images from NIST were size-normalized to fit in a 20x20 pixel box while preserving their aspect ratio. The resulting images contain grey levels as a result of the anti-aliasing technique used by the normalization algorithm. The images were then centered in a 28x28 field by computing the center of mass of the pixels and translating the image so as to position this point at the center of the 28x28 field.
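The centering step described above is straightforward to reproduce. The following sketch is only an illustration of the idea, not the original NIST/MNIST pipeline; it assumes a 20x20 NumPy array `digit` holding the size-normalized grayscale image and uses SciPy to compute the center of mass:
import numpy as np
from scipy import ndimage

def center_by_mass(digit):
    # Place a 20x20 grayscale digit into a 28x28 field so that its
    # pixel center of mass ends up at the center of the field.
    out = np.zeros((28, 28), dtype=digit.dtype)
    cy, cx = ndimage.center_of_mass(digit)
    top = int(round(13.5 - cy))
    left = int(round(13.5 - cx))
    top = min(max(top, 0), 28 - 20)    # keep the 20x20 crop inside the field
    left = min(max(left, 0), 28 - 20)
    out[top:top + 20, left:left + 20] = digit
    return out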
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The MNIST database (Modified National Institute of Standards and Technology database) is a large database of handwritten digits that is commonly used for training various image processing systems. The database is also widely used for training and testing in the field of machine learning. It was created by "re-mixing" the samples from NIST's original datasets. The creators felt that since NIST's training dataset was taken from American Census Bureau employees, while the testing dataset was taken from American high school students, it was not well-suited for machine learning experiments. Furthermore, the black and white images from NIST were normalized to fit into a 28x28 pixel bounding box and anti-aliased, which introduced grayscale levels.
Yann LeCun, Courant Institute, NYU; Corinna Cortes, Google Labs, New York; Christopher J.C. Burges, Microsoft Research, Redmond
Fashion-MNIST is a dataset of Zalando's article images consisting of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28x28 grayscale image, associated with a label from 10 classes.
To use this dataset:
import tensorflow_datasets as tfds
ds = tfds.load('fashion_mnist', split='train')
for ex in ds.take(4):
  print(ex)
See the guide for more information on tensorflow_datasets.
Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/fashion_mnist-3.0.1.png
Author: Yann LeCun, Corinna Cortes, Christopher J.C. Burges
Source: MNIST Website - Date unknown
Please cite:
The MNIST database of handwritten digits with 784 features, raw data available at: http://yann.lecun.com/exdb/mnist/. It can be split into a training set of the first 60,000 examples and a test set of the remaining 10,000 examples.
It is a subset of a larger set available from NIST. The digits have been size-normalized and centered in a fixed-size image. It is a good database for people who want to try learning techniques and pattern recognition methods on real-world data while spending minimal effort on preprocessing and formatting. The original black and white (bilevel) images from NIST were size-normalized to fit in a 20x20 pixel box while preserving their aspect ratio. The resulting images contain grey levels as a result of the anti-aliasing technique used by the normalization algorithm. The images were then centered in a 28x28 field by computing the center of mass of the pixels and translating the image so as to position this point at the center of the 28x28 field.
With some classification methods (particularly template-based methods, such as SVM and K-nearest neighbors), the error rate improves when the digits are centered by bounding box rather than center of mass. If you do this kind of pre-processing, you should report it in your publications. The MNIST database was constructed from NIST's Special Database 3 (SD-3) and Special Database 1 (SD-1). NIST originally designated SD-3 as their training set and SD-1 as their test set. However, SD-3 is much cleaner and easier to recognize than SD-1. The reason for this can be found in the fact that SD-3 was collected among Census Bureau employees, while SD-1 was collected among high-school students. Drawing sensible conclusions from learning experiments requires that the result be independent of the choice of training set and test set among the complete set of samples. Therefore it was necessary to build a new database by mixing NIST's datasets.
The MNIST training set is composed of 30,000 patterns from SD-3 and 30,000 patterns from SD-1. Our test set was composed of 5,000 patterns from SD-3 and 5,000 patterns from SD-1. The 60,000-pattern training set contained examples from approximately 250 writers. We made sure that the sets of writers of the training set and test set were disjoint. SD-1 contains 58,527 digit images written by 500 different writers. In contrast to SD-3, where blocks of data from each writer appeared in sequence, the data in SD-1 is scrambled. Writer identities for SD-1 are available, and we used this information to unscramble the writers. We then split SD-1 in two: characters written by the first 250 writers went into our new training set; the remaining 250 writers were placed in our test set. Thus we had two sets with nearly 30,000 examples each. The new training set was completed with enough examples from SD-3, starting at pattern #0, to make a full set of 60,000 training patterns. Similarly, the new test set was completed with SD-3 examples starting at pattern #35,000 to make a full set of 60,000 test patterns. Only a subset of 10,000 test images (5,000 from SD-1 and 5,000 from SD-3) is available on this site. The full 60,000-sample training set is available.
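For experiments outside TensorFlow, this 784-feature version can be fetched from OpenML with scikit-learn. The snippet below is a sketch assuming scikit-learn is installed and that, as stated above, the first 60,000 rows form the training set and the last 10,000 the test set:
from sklearn.datasets import fetch_openml

# 70,000 rows, 784 columns (28x28 pixels flattened)
X, y = fetch_openml('mnist_784', version=1, return_X_y=True, as_frame=False)
X_train, y_train = X[:60000], y[:60000]
X_test, y_test = X[60000:], y[60000:]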
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The not-MNIST dataset is a more challenging counterpart to MNIST intended for machine learning and artificial intelligence research. It consists of 100,000 images of letter glyphs (the classes A through J), drawn from a wide variety of fonts and styles, which makes it harder than the MNIST digits. The images are divided into a training set of 60,000 images and a test set of 40,000 images. The images are 28x28 pixels in size and grayscale. The dataset is available under the Creative Commons Zero Public Domain Dedication license.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
CI-MNIST (Correlated and Imbalanced MNIST) is a variant of the MNIST dataset that introduces different types of correlations between attributes, dataset features, and an artificial eligibility criterion. For an input image x, the label y ∈ {1, 0} indicates eligibility or ineligibility, respectively, according to whether x is even or odd. The dataset defines the background color as the protected or sensitive attribute s ∈ {0, 1}, where blue denotes the unprivileged group and red denotes the privileged group. The dataset was designed to evaluate bias-mitigation approaches in challenging setups and to allow control over different dataset configurations.
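A much-simplified sketch of this construction is shown below. It is not the authors' code; the colouring scheme and the variable names are assumptions, but it illustrates how a parity-based eligibility label and a colour-based sensitive attribute can be attached to each MNIST digit:
import numpy as np
import tensorflow_datasets as tfds

def make_ci_example(example, background_is_red):
    # Digit pixels stay white; background pixels take the group colour.
    img = example['image'].numpy().squeeze().astype(np.float32) / 255.0
    bg = np.zeros(3, dtype=np.float32)
    bg[0 if background_is_red else 2] = 1.0          # red (privileged) or blue (unprivileged)
    rgb = img[..., None] + (1.0 - img[..., None]) * bg
    y = int(example['label'].numpy() % 2 == 0)       # eligible (1) if the digit is even
    s = int(background_is_red)                       # sensitive attribute
    return rgb, y, s

ds = tfds.load('mnist', split='train')
for ex in ds.take(1):
    rgb, y, s = make_ci_example(ex, background_is_red=True)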
Kuzushiji-MNIST is a drop-in replacement for the MNIST dataset (28x28 grayscale, 70,000 images), provided in the original MNIST format as well as a NumPy format. Since MNIST restricts us to 10 classes, we chose one character to represent each of the 10 rows of Hiragana when creating Kuzushiji-MNIST.
To use this dataset:
import tensorflow_datasets as tfds
ds = tfds.load('kmnist', split='train')
for ex in ds.take(4):
  print(ex)
See the guide for more information on tensorflow_datasets.
Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/kmnist-3.0.1.png
This dataset was created by Cairo
https://creativecommons.org/publicdomain/zero/1.0/
MNIST-MIX is a multi-language handwritten digit recognition dataset. It contains digits from 10 different languages.
A specific binarization of the MNIST images originally used in (Salakhutdinov & Murray, 2008). This dataset is frequently used to evaluate generative models of images, so labels are not provided.
To use this dataset:
import tensorflow_datasets as tfds
ds = tfds.load('binarized_mnist', split='train')
for ex in ds.take(4):
  print(ex)
See the guide for more information on tensorflow_datasets.
Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/binarized_mnist-1.0.0.png
Attribution-ShareAlike 3.0 (CC BY-SA 3.0): https://creativecommons.org/licenses/by-sa/3.0/
License information was derived automatically
This dataset is derived from the MNIST dataset (http://yann.lecun.com/exdb/mnist/).
The dataset is described in the research paper https://openaccess.thecvf.com/content/CVPR2023W/XAI4CV/html/Bordt_The_Manifold_Hypothesis_for_Gradient-Based_Explanations_CVPRW_2023_paper.html
The usage of the dataset is described in this example notebook https://github.com/tml-tuebingen/explanations-manifold/blob/main/examples/mnist32.ipynb
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
## Overview
MNIST PROJECT is a dataset for object detection tasks; it contains object detection annotations for 2,550 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
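A common way to download a Roboflow-hosted dataset programmatically is via the roboflow Python package. The snippet below is only a sketch: the API key, workspace and project slugs, version number, and export format are placeholders that must be replaced with the values shown on the dataset's Roboflow page.
from roboflow import Roboflow

rf = Roboflow(api_key='YOUR_API_KEY')                                # placeholder key
project = rf.workspace('your-workspace').project('mnist-project')   # placeholder slugs
dataset = project.version(1).download('coco')                       # placeholder version and export format
print(dataset.location)                                             # local folder with images and annotations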
## License
This dataset is available under a Public Domain license.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset used in PDO-eConvs: Partial Differential Operator Based Equivariant Convolutions
EMNIST (extended MNIST) has four times more data than MNIST. It is a set of handwritten digits in a 28 x 28 pixel format.
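EMNIST is also packaged in tensorflow_datasets under several configurations. Assuming the 'emnist/balanced' config (other configs such as 'emnist/byclass', 'emnist/letters', and 'emnist/digits' also exist), it can be loaded in the same way as the other datasets on this page:
import tensorflow_datasets as tfds
ds = tfds.load('emnist/balanced', split='train')
for ex in ds.take(4):
  print(ex)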
MedMNIST v2 is a large-scale MNIST-like collection of standardized biomedical images, comprising 12 datasets of 2D images and 6 datasets of 3D images. All images are pre-processed into 28 x 28 (2D) or 28 x 28 x 28 (3D) with the corresponding classification labels, so that no background knowledge is required of users. Covering the primary data modalities in biomedical imaging, MedMNIST v2 is designed for classification on lightweight 2D and 3D images at various data scales (from 100 to 100,000 samples) and across diverse tasks (binary/multi-class classification, ordinal regression, and multi-label classification). The resulting collection, consisting of 708,069 2D images and 10,214 3D images in total, can support numerous research and educational purposes in biomedical image analysis, computer vision, and machine learning.
Description and image from: MedMNIST v2: A Large-Scale Lightweight Benchmark for 2D and 3D Biomedical Image Classification
Each subset keeps the same license as that of the source dataset. Please also cite the corresponding paper of source data if you use any subset of MedMNIST.
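The 2D and 3D subsets can be downloaded through the medmnist Python package. The snippet below is a sketch based on the package's documented interface; PathMNIST is used as an example, and the other subsets follow the same pattern:
from medmnist import PathMNIST

train = PathMNIST(split='train', download=True)   # downloads the .npz file on first use
print(train)                                      # dataset summary: number of samples, task, label names
images, labels = train.imgs, train.labels         # NumPy arrays of images and labels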
HASY is a dataset of single symbols similar to MNIST. It contains 168,233 instances of 369 classes. HASY contains two challenges: A classification challenge with 10 pre-defined folds for 10-fold cross-validation and a verification challenge.