MNIST is a subset of a larger set available from NIST (it is copied from http://yann.lecun.com/exdb/mnist/).
The MNIST database of handwritten digits has a training set of 60,000 examples and a test set of 10,000 examples. Four files are available: the training set images and labels, and the test set images and labels.
Many methods have been tested with this training set and test set (see http://yann.lecun.com/exdb/mnist/ for more details)
License: http://www.gnu.org/licenses/lgpl-3.0.html
The MNIST-100 dataset is a variation of the original MNIST dataset, covering 100 handwritten two-digit numbers (00 to 99) built from MNIST digit images. Unlike the traditional MNIST dataset, which contains 60,000 training images of the single digits 0 to 9, MNIST-100 has 100 classes.
Dataset Overview:
- Dataset Name: MNIST-100
- Total Number of Images: 60,000 train / 10,000 test
- Classes: 100 (numbers from 00 to 99)
- Image Size: 28x56 pixels (grayscale)
Data Collection: The MNIST-100 dataset was created by combining digit images drawn from the original MNIST dataset into two-digit numbers (the 28x56 image size corresponds to two 28x28 digits side by side). The source digit images cover a diverse range of handwriting styles for each digit.
Each image in the dataset is labeled with its corresponding number, ranging from 00 to 99, making it suitable for classification tasks. Researchers and practitioners can use this dataset to train and evaluate machine learning algorithms and neural networks for digit recognition and classification.
Please note that the MNIST-100 dataset is not intended to replace the original MNIST dataset but serves as a complementary resource for specific applications requiring a more focused variant of the MNIST data.
Overall, the MNIST-100 dataset offers a compact and representative collection of handwritten two-digit numbers, providing a convenient tool for experimentation and learning in computer vision and pattern recognition.
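As a rough illustration of the task only (the MNIST-100 file format is not described above, so placeholder arrays stand in for the real data), a flatten-and-classify baseline for 28x56 images with 100 classes could look like this:

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Placeholder data with the stated shape: 28x56 grayscale images, labels 00-99.
x_train = rng.integers(0, 256, size=(1000, 28, 56), dtype=np.uint8)
y_train = rng.integers(0, 100, size=1000)
x_test = rng.integers(0, 256, size=(200, 28, 56), dtype=np.uint8)
y_test = rng.integers(0, 100, size=200)

# Flatten each image to a 1568-dimensional vector and scale to [0, 1].
x_train_flat = x_train.reshape(len(x_train), -1) / 255.0
x_test_flat = x_test.reshape(len(x_test), -1) / 255.0

clf = LogisticRegression(max_iter=200)   # simple linear baseline
clf.fit(x_train_flat, y_train)
print("test accuracy:", clf.score(x_test_flat, y_test))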
Label Distribution for training set:
| Label | Occurrences | Label | Occurrences | Label | Occurrences |
|---|---|---|---|---|---|
| 0 | 561 | 34 | 629 | 68 | 606 |
| 1 | 687 | 35 | 540 | 69 | 582 |
| 2 | 582 | 36 | 588 | 70 | 566 |
| 3 | 633 | 37 | 619 | 71 | 659 |
| 4 | 588 | 38 | 584 | 72 | 572 |
| 5 | 544 | 39 | 609 | 73 | 682 |
| 6 | 582 | 40 | 570 | 74 | 627 |
| 7 | 615 | 41 | 679 | 75 | 598 |
| 8 | 584 | 42 | 544 | 76 | 605 |
| 9 | 567 | 43 | 567 | 77 | 602 |
| 10 | 641 | 44 | 574 | 78 | 595 |
| 11 | 780 | 45 | 555 | 79 | 586 |
| 12 | 720 | 46 | 550 | 80 | 569 |
| 13 | 699 | 47 | 614 | 81 | 628 |
| 14 | 630 | 48 | 614 | 82 | 578 |
| 15 | 627 | 49 | 595 | 83 | 622 |
| 16 | 684 | 50 | 505 | 84 | 569 |
| 17 | 713 | 51 | 583 | 85 | 540 |
| 18 | 743 | 52 | 512 | 86 | 557 |
| 19 | 706 | 53 | 555 | 87 | 628 |
| 20 | 527 | 54 | 504 | 88 | 562 |
| 21 | 710 | 55 | 488 | 89 | 625 |
| 22 | 586 | 56 | 531 | 90 | 600 |
| 23 | 584 | 57 | 556 | 91 | 700 |
| 24 | 568 | 58 | 497 | 92 | 622 |
| 25 | 530 | 59 | 520 | 93 | 622 |
| 26 | 612 | 60 | 556 | 94 | 591 |
| 27 | 627 | 61 | 682 | 95 | 557 |
| 28 | 618 | 62 | 594 | 96 | 580 |
| 29 | 619 | 63 | 539 | 97 | 640 |
| 30 | 622 | 64 | 610 | 98 | 577 |
| 31 | 684 | 65 | 514 | 99 | 563 |
| 32 | 606 | 66 | 587 |  |  |
| 33 | 592 | 67 | 655 |  |  |
Label Distribution for test set:
| Label | Occurrences | Label | Occurrences | Label | Occurrences |
|---|---|---|---|---|---|
| 00 | 96 | 34 | 100 | 68 | 90 |
| 01 | 108 | 35 | 91 | 69 | 92 |
| 02 | 91 | 36 | 107 | 70 | 102 |
| 03 | 96 | 37 | 112 | 71 | 116 |
| 04 | 75 | 38 | 97 | 72 | 101 |
| 05 | 85 | 39 | 96 | 73 | 106 |
| 06 | 88 | 40 | 103 | 74 | 98 |
| 07 | 96 | 41 | 123 | 75 ... |
License: Attribution-ShareAlike 3.0 (CC BY-SA 3.0), https://creativecommons.org/licenses/by-sa/3.0/
License information was derived automatically
The MNIST dataset is a dataset of handwritten digits. It is a popular dataset for machine learning and artificial intelligence research. The dataset consists of 60,000 training images and 10,000 test images. Each image is a 28x28 pixel grayscale image of a handwritten digit. The digits are labeled from 0 to 9.
The MNIST database of handwritten digits, available from this page, has a training set of 60,000 examples, and a test set of 10,000 examples. It is a subset of a larger set available from NIST. The digits have been size-normalized and centered in a fixed-size image. It is a good database for people who want to try learning techniques and pattern recognition methods on real-world data while spending minimal effort on preprocessing and formatting.
Four files are available on this site:
- train-images-idx3-ubyte.gz: training set images (9912422 bytes)
- train-labels-idx1-ubyte.gz: training set labels (28881 bytes)
- t10k-images-idx3-ubyte.gz: test set images (1648877 bytes)
- t10k-labels-idx1-ubyte.gz: test set labels (4542 bytes)
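These are gzip-compressed files in the simple IDX format described on the MNIST page: a big-endian integer header followed by raw unsigned bytes. A minimal reading sketch in Python, assuming the four files have been downloaded into the working directory:

import gzip
import struct
import numpy as np

def read_idx_images(path):
    # Image files: magic number, image count, rows, cols (big-endian uint32), then pixel bytes.
    with gzip.open(path, "rb") as f:
        _magic, num, rows, cols = struct.unpack(">IIII", f.read(16))
        return np.frombuffer(f.read(), dtype=np.uint8).reshape(num, rows, cols)

def read_idx_labels(path):
    # Label files: magic number and label count (big-endian uint32), then one byte per label.
    with gzip.open(path, "rb") as f:
        _magic, num = struct.unpack(">II", f.read(8))
        return np.frombuffer(f.read(), dtype=np.uint8)

x_train = read_idx_images("train-images-idx3-ubyte.gz")  # shape (60000, 28, 28)
y_train = read_idx_labels("train-labels-idx1-ubyte.gz")  # shape (60000,)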
License: CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
MNIST is an image dataset of handwritten digits, organised from data collected by the National Institute of Standards and Technology (NIST) of the United States. The handwriting samples come from about 250 writers, half of them high school students and half staff of the US Census Bureau. The dataset was collected so that handwritten digits could be recognised by algorithms. The training set contains 60,000 images and labels, while the test set contains 10,000 images and labels. In the test set, the first 5,000 examples are taken from the original NIST training data and the last 5,000 from the original NIST test data; the first 5,000 are cleaner and more regular than the last 5,000, because they were written by US Census Bureau employees, whereas the last 5,000 were written by high school students.
The goal of introducing the Rescaled Fashion-MNIST dataset is to provide a dataset that contains scale variations (up to a factor of 4), to evaluate the ability of networks to generalise to scales not present in the training data.
The Rescaled Fashion-MNIST dataset was introduced in the paper:
[1] A. Perzanowski and T. Lindeberg (2025) "Scale generalisation properties of extended scale-covariant and scale-invariant Gaussian derivative networks on image datasets with spatial scaling variations”, Journal of Mathematical Imaging and Vision, 67(29), https://doi.org/10.1007/s10851-025-01245-x.
with a pre-print available at arXiv:
[2] Perzanowski and Lindeberg (2024) "Scale generalisation properties of extended scale-covariant and scale-invariant Gaussian derivative networks on image datasets with spatial scaling variations”, arXiv preprint arXiv:2409.11140.
Importantly, the Rescaled Fashion-MNIST dataset is more challenging than the MNIST Large Scale dataset, introduced in:
[3] Y. Jansson and T. Lindeberg (2022) "Scale-invariant scale-channel networks: Deep networks that generalise to previously unseen scales", Journal of Mathematical Imaging and Vision, 64(5): 506-536, https://doi.org/10.1007/s10851-022-01082-2.
The Rescaled Fashion-MNIST dataset is provided on the condition that you provide proper citation for the original Fashion-MNIST dataset:
[4] Xiao, H., Rasul, K., and Vollgraf, R. (2017) “Fashion-MNIST: A novel image dataset for benchmarking machine learning algorithms”, arXiv preprint arXiv:1708.07747
and also for this new rescaled version, using the reference [1] above.
The data set is made available on request. If you would be interested in trying out this data set, please make a request in the system below, and we will grant you access as soon as possible.
The Rescaled Fashion-MNIST dataset is generated by rescaling 28×28 gray-scale images of clothes from the original Fashion-MNIST dataset [4]. The scale variations are up to a factor of 4, and the images are embedded within black images of size 72x72, with the object in the frame always centred. The imresize() function in Matlab was used for the rescaling, with default anti-aliasing turned on, and bicubic interpolation overshoot removed by clipping to the [0, 255] range. The details of how the dataset was created can be found in [1].
There are 10 different classes in the dataset: “T-shirt/top”, “trouser”, “pullover”, “dress”, “coat”, “sandal”, “shirt”, “sneaker”, “bag” and “ankle boot”. In the dataset, these are represented by integer labels in the range [0, 9].
The dataset is split into 50 000 training samples, 10 000 validation samples and 10 000 testing samples. The training dataset is generated using the initial 50 000 samples from the original Fashion-MNIST training set. The validation dataset, on the other hand, is formed from the final 10 000 images of that same training set. For testing, all test datasets are built from the 10 000 images contained in the original Fashion-MNIST test set.
The training dataset file (~2.9 GB) for scale 1, which also contains the corresponding validation and test data for the same scale, is:
fashionmnist_with_scale_variations_tr50000_vl10000_te10000_outsize72-72_scte1p000_scte1p000.h5
Additionally, for the Rescaled Fashion-MNIST dataset, there are 9 datasets (~415 MB each) for testing scale generalisation at scales not present in the training set. Each of these datasets is rescaled using a different image scaling factor, 2^(k/4), with k being an integer in the range [-4, 4]:
fashionmnist_with_scale_variations_te10000_outsize72-72_scte0p500.h5
fashionmnist_with_scale_variations_te10000_outsize72-72_scte0p595.h5
fashionmnist_with_scale_variations_te10000_outsize72-72_scte0p707.h5
fashionmnist_with_scale_variations_te10000_outsize72-72_scte0p841.h5
fashionmnist_with_scale_variations_te10000_outsize72-72_scte1p000.h5
fashionmnist_with_scale_variations_te10000_outsize72-72_scte1p189.h5
fashionmnist_with_scale_variations_te10000_outsize72-72_scte1p414.h5
fashionmnist_with_scale_variations_te10000_outsize72-72_scte1p682.h5
fashionmnist_with_scale_variations_te10000_outsize72-72_scte2p000.h5
These dataset files were used for the experiments presented in Figures 6, 7, 14, 16, 19 and 23 in [1].
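As a quick check of the naming convention, the nine scale factors 2^(k/4) for k = -4, ..., 4 reproduce the values embedded in the file names (scte0p500 through scte2p000):

# Scale factors 2**(k/4) for integer k in [-4, 4], rounded to three decimals.
factors = [round(2 ** (k / 4), 3) for k in range(-4, 5)]
print(factors)  # [0.5, 0.595, 0.707, 0.841, 1.0, 1.189, 1.414, 1.682, 2.0]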
The datasets are saved in HDF5 format, with the partitions in the respective h5 files named as
('/x_train', '/x_val', '/x_test', '/y_train', '/y_test', '/y_val'); which ones exist depends on which data split is used.
The training dataset can be loaded in Python as:
import h5py
import numpy as np

with h5py.File("fashionmnist_with_scale_variations_tr50000_vl10000_te10000_outsize72-72_scte1p000_scte1p000.h5", "r") as f:
    x_train = np.array(f["/x_train"], dtype=np.float32)
    x_val = np.array(f["/x_val"], dtype=np.float32)
    x_test = np.array(f["/x_test"], dtype=np.float32)
    y_train = np.array(f["/y_train"], dtype=np.int32)
    y_val = np.array(f["/y_val"], dtype=np.int32)
    y_test = np.array(f["/y_test"], dtype=np.int32)
We also need to permute the data, since Pytorch uses the format [num_samples, channels, width, height], while the data is saved as [num_samples, width, height, channels]:
x_train = np.transpose(x_train, (0, 3, 1, 2))
x_val = np.transpose(x_val, (0, 3, 1, 2))
x_test = np.transpose(x_test, (0, 3, 1, 2))
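As a usage sketch (not from [1]), the permuted arrays can then be wrapped in a standard PyTorch DataLoader; since the stored intensities are in [0, 255], they are scaled to [0, 1] here:

import torch
from torch.utils.data import TensorDataset, DataLoader

# Scale intensities from [0, 255] to [0, 1] and wrap the arrays in tensors.
train_ds = TensorDataset(torch.from_numpy(x_train / 255.0).float(),
                         torch.from_numpy(y_train).long())
train_loader = DataLoader(train_ds, batch_size=64, shuffle=True)

for images, labels in train_loader:
    print(images.shape, labels.shape)  # e.g. torch.Size([64, 1, 72, 72]) torch.Size([64])
    break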
The test datasets can be loaded in Python as:
# Any of the nine test-set files listed above can be substituted here.
with h5py.File("fashionmnist_with_scale_variations_te10000_outsize72-72_scte0p500.h5", "r") as f:
    x_test = np.array(f["/x_test"], dtype=np.float32)
    y_test = np.array(f["/y_test"], dtype=np.int32)
The test datasets can be loaded in Matlab as:
x_test = h5read('fashionmnist_with_scale_variations_te10000_outsize72-72_scte0p500.h5', '/x_test');
y_test = h5read('fashionmnist_with_scale_variations_te10000_outsize72-72_scte0p500.h5', '/y_test');
The images are stored as [num_samples, x_dim, y_dim, channels] in HDF5 files. The pixel intensity values are not normalised, and are in a [0, 255] range.
There is also a closely related Fashion-MNIST with translations dataset, which in addition to scaling variations also comprises spatial translations of the objects.
License: https://github.com/zalandoresearch/fashion-mnist/blob/master/LICENSE
A dataset of 70,000 fashion images with labels for 10 classes. The dataset was created by researchers at Zalando Research and is used for research in machine learning and computer vision tasks such as image classification.
License: MIT License, https://opensource.org/licenses/MIT
License information was derived automatically
This dataset contains images of Zalando's article categories, intended for fashion image classification. It serves as a direct drop-in replacement for the original MNIST dataset, often used as a benchmark for machine learning algorithms. Fashion-MNIST is slightly more challenging than regular MNIST.
Dataset Structure:
The dataset is organized into the following files and folders:
train/: This folder contains the training set images. It holds 60,000 grayscale images, each with dimensions 28x28 pixels. The images are in PNG format. The filenames within this folder are not explicitly labeled with the class, so you will need to refer to the train.csv file for the corresponding labels.
test/: This folder contains the testing set images. It holds 10,000 grayscale images, each with dimensions 28x28 pixels and in PNG format. Similar to the training set, the filenames here are not directly labeled, and the test.csv file provides the corresponding labels.
train.csv: This CSV (Comma Separated Values) file contains the labels for the images in the train/ folder. Each row in this file corresponds to an image and contains the following columns:
- pixel1, pixel2, ..., pixel784: These columns represent the flattened pixel values of the 28x28 grayscale images. The pixel values are integers ranging from 0 to 255.
- label: This column contains the corresponding class label (an integer from 0 to 9) for the image. Refer to the class mapping provided below to understand the meaning of these numerical labels.

test.csv: This CSV file contains the labels for the images in the test/ folder, following the same format as train.csv, with pixel1 to pixel784 columns and a label column.
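A minimal loading sketch, assuming the column layout described above (pixel1...pixel784 plus label); the file paths are placeholders and may differ from where you extract the dataset:

import numpy as np
import pandas as pd

train_df = pd.read_csv("train.csv")                 # placeholder path
labels = train_df["label"].to_numpy()
pixel_cols = [c for c in train_df.columns if c != "label"]
images = train_df[pixel_cols].to_numpy(dtype=np.uint8).reshape(-1, 28, 28)
print(images.shape, labels.shape)                   # e.g. (60000, 28, 28) (60000,)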
Content of the Data:
Each image in the Fashion-MNIST dataset belongs to one of the following 10 classes:
| Label | Description |
|---|---|
| 0 | T-shirt/top |
| 1 | Trouser |
| 2 | Pullover |
| 3 | Dress |
| 4 | Coat |
| 5 | Sandal |
| 6 | Shirt |
| 7 | Sneaker |
| 8 | Bag |
| 9 | Ankle boot |
The images are grayscale, meaning each pixel has a single intensity value.
How to Use This Dataset:
- Download the train/ and test/ folders and the train.csv and test.csv files.
- The train/ and test/ folders contain the visual data. You can load these images using libraries that handle image formats (like PIL or OpenCV).
- The train.csv and test.csv files provide the ground-truth labels for the corresponding images. You can read these CSV files using libraries like Pandas. The pixel values in the CSV can be reshaped into a 28x28 matrix to represent the image, and the label column provides the class of the fashion item.
- Train your models on the train/ images and train.csv labels.
- Evaluate your models on the test/ images and test.csv labels.

Citation:
When using the Fashion-MNIST dataset, please cite the original paper:
Xiao, Han, Kashif Rasul, and Roland Vollgraf. "Fashion-mnist: a novel image dataset for benchmarking machine learning algorithms." arXiv preprint arXiv:1708.07747 (2017).
Data Contribution:
Thank you for providing this well-structured Fashion-MNIST dataset with separate image folders and CSV label files. This organization makes it convenient for users to work with both the raw image data and the corresponding labels for training and evaluation of their fashion classification models.
If you find this dataset structure clear, well-organized, and useful for your projects, please consider giving it an upvote after downloading. Your feedback and appreciation are valuable!
License: Attribution-ShareAlike 4.0 (CC BY-SA 4.0), https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Binarized version of the MNIST handwritten digits dataset
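The binarization procedure itself is not described here; as a rough illustration only, a fixed-threshold and a stochastic (Bernoulli-sampling) binarization of MNIST-style images might look like this, with placeholder data standing in for the real images:

import numpy as np

rng = np.random.default_rng(0)
x = rng.integers(0, 256, size=(16, 28, 28), dtype=np.uint8)  # placeholder MNIST-style images

p = x / 255.0                                                # pixel intensities as probabilities
x_threshold = (p > 0.5).astype(np.uint8)                     # fixed-threshold binarization
x_stochastic = (rng.random(p.shape) < p).astype(np.uint8)    # sample each pixel as Bernoulli(p)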
License: MIT License, https://opensource.org/licenses/MIT
License information was derived automatically
This dataset provides the classic MNIST handwritten digits dataset, a foundational resource for image classification in machine learning. It contains a training set of 60,000 examples and a test set of 10,000 examples of grayscale images of handwritten digits (0 through 9).
Dataset Structure:
The uploaded data is organized within a main folder named mnist_png, which contains the following subfolders:
train: This folder contains the training set images. Upon navigating into the train folder, you will find 10 subfolders, named 0 through 9. Each of these subfolders corresponds to a digit class (e.g., the folder named 0 contains images of the digit zero, the folder named 1 contains images of the digit one, and so on). The images within these subfolders are grayscale handwritten digit images in a common image format (e.g., PNG).
test: This folder contains the test set images. Similar to the train folder, upon navigating into the test folder, you will find 10 subfolders, named 0 through 9. Each of these subfolders contains the corresponding test images for that digit class.
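Since the subfolder names double as class labels, a directory-based loader can read this layout directly. A minimal sketch using torchvision (one option among many; the paths assume the mnist_png layout described above):

from torch.utils.data import DataLoader
from torchvision import datasets, transforms

transform = transforms.Compose([
    transforms.Grayscale(num_output_channels=1),  # force a single channel in case PNGs load as RGB
    transforms.ToTensor(),                        # scales pixel values to [0, 1]
])

train_ds = datasets.ImageFolder("mnist_png/train", transform=transform)
test_ds = datasets.ImageFolder("mnist_png/test", transform=transform)
train_loader = DataLoader(train_ds, batch_size=64, shuffle=True)

print(train_ds.classes)  # ['0', '1', ..., '9'] -- subfolder names become the class labels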
Content of the Data:
Each image in the MNIST dataset is a 28x28 pixel grayscale image of a handwritten digit (0-9). The pixel values typically range from 0 (black) to 255 (white).
How to Use This Dataset:
- Download the MNIST folder (or the archive containing it) and extract its contents.
- Navigate into the mnist_png folder.
- The train and test subfolders contain the image data, organized by digit class. You can directly use this folder structure with image data loaders that support directory-based organization; the name of the subfolder corresponds to the digit label.
- The train folder provides the images you can use to train your machine learning models.
- The test folder provides a separate set of images that you can use to evaluate the performance of your trained models on unseen data.

Citation:
The MNIST dataset is a well-established resource. While there isn't a single definitive paper for this particular PNG packaging, the dataset itself is attributed to Yann LeCun, Corinna Cortes, and Christopher J. C. Burges (http://yann.lecun.com/exdb/mnist/) and is a standard in the field. You can also cite it in the context of the specific papers or implementations you are referencing that utilize it.
Data Contribution:
Thank you for downloading this image-based organization of the MNIST dataset. By structuring the images into class-specific folders within the train and test directories, I aim to provide a user-friendly format for those working on handwritten digit recognition tasks. This structure aligns well with many image data loading utilities and workflows.
If you find this folder structure clear, well-organized, and useful for your projects, please consider giving it an upvote after downloading. Your feedback and appreciation are valuable and encourage further contributions to the Kaggle community. Thank you!
License: MIT License, https://opensource.org/licenses/MIT
License information was derived automatically
The dataset consists of MNIST digits set either on noisy backgrounds, where the noise is drawn from a Gaussian random variable, or on more naturalistic backgrounds taken from the CIFAR-10 dataset. A parameter that makes the task more difficult is digit transparency; as transparency increases, the background interferes with the digit and the identity of the digit becomes more ambiguous. The goal is to perform image classification so that neural networks correctly identify the MNIST digit despite the different backgrounds.
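As an illustration of the transparency idea (not the dataset's actual generation code), a digit can be alpha-blended onto a Gaussian-noise background roughly as follows; higher alpha lets the background interfere more with the digit:

import numpy as np

def composite_on_noise(digit, alpha, rng):
    # digit: 28x28 uint8 image in [0, 255]; alpha in [0, 1] controls digit transparency.
    background = rng.normal(loc=128.0, scale=40.0, size=digit.shape)       # Gaussian noise background
    blended = alpha * background + (1.0 - alpha) * digit.astype(np.float64)
    return np.clip(blended, 0, 255).astype(np.uint8)

rng = np.random.default_rng(0)
digit = rng.integers(0, 256, size=(28, 28), dtype=np.uint8)  # placeholder digit image
noisy = composite_on_noise(digit, alpha=0.5, rng=rng)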
Handwritten image data is easy to find in languages such as English and Japanese, but not for many Indian languages including Hindi. While trying to create an MNIST-like personal project, I stumbled upon a Hindi handwritten characters dataset by Shailesh Acharya and Prashnna Kumar Gyawali, which is uploaded to the UCI Machine Learning Repository.
This dataset, however, only has the digits from 0 to 9; all other characters have been removed.
- Data Type: Grayscale image
- Image Format: PNG
- Resolution: 32 by 32 pixels
- The actual character is centered within 28 by 28 pixels; a padding of 2 pixels is added on all four sides.
There are ~1,700 images per class in the train set and ~300 images per class in the test set.
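The 32x32 layout (a 28x28 glyph with a 2-pixel border on every side) can be reproduced with a simple pad, as in this sketch:

import numpy as np

char28 = np.zeros((28, 28), dtype=np.uint8)  # placeholder 28x28 character image
char32 = np.pad(char28, pad_width=2, mode="constant", constant_values=0)
print(char32.shape)                          # (32, 32)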
The Dataset is ©️ Original Authors.
Original Authors: - Shailesh Acharya - Prashnna Kumar Gyawali
Citation: S. Acharya, A.K. Pant and P.K. Gyawali “**Deep Learning Based Large Scale Handwritten Devanagari Character Recognition**”, In Proceedings of the 9th International Conference on Software, Knowledge, Information Management and Applications (SKIMA), pp. 121-126, 2015.
The full Dataset is available here: https://archive.ics.uci.edu/ml/datasets/Devanagari+Handwritten+Character+Dataset
Moving variant of MNIST database of handwritten digits. This is the data used by the authors for reporting model performance. See tfds.video.moving_mnist.image_as_moving_sequence for generating training/validation data from the MNIST dataset.
To use this dataset:
import tensorflow_datasets as tfds
ds = tfds.load('moving_mnist', split='train')
for ex in ds.take(4):
print(ex)
See the guide for more information on tensorflow_datasets.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
MNIST dataset originally hosted on https://deeplearning.net, re-hosted here because deeplearning.net is currently inaccessible.
License: https://creativecommons.org/publicdomain/zero/1.0/
[MNIST](https://en.wikipedia.org/wiki/MNIST_database#:~:text=The%20MNIST%20database%20(Modified%20National,training%20various%20image%20processing%20systems.) data in PNG format, derived directly from MNIST in CSV.
The data contains 60,000 labelled train samples and 10,000 labelled test samples. Each sample is a 28x28 grayscale PNG image, organized into one subfolder per digit label as follows:
test/
0/
test_image_3.png
test_image_10.png
test_image_13.png
...
1/
test_image_2.png
test_image_5.png
test_image_14.png
...
...
9/
train/
0/
train_image_1.png
train_image_21.png
train_image_34.png
...
1/
...
9/
The PNG files can be regenerated from the CSV data along these lines (directory creation added so the script runs end-to-end):

import os

import pandas as pd
from PIL import Image

mnist_train = pd.read_csv("mnist-csv/mnist_train.csv")
mnist_test = pd.read_csv("mnist-csv/mnist_test.csv")

for i in range(10):
    # Make sure the per-label output folders exist.
    os.makedirs(f"mnist-png/train/{i}", exist_ok=True)
    os.makedirs(f"mnist-png/test/{i}", exist_ok=True)

    # Convert the training data to png
    train_i = mnist_train.loc[mnist_train.label == i]
    for index, row in train_i.iterrows():
        X = row.iloc[1:].to_numpy().reshape(28, 28)
        filepath = f"mnist-png/train/{i}/train_image_{index}.png"
        img = Image.fromarray(X.astype("uint8"), mode="L")
        img.save(filepath)

    # Convert the test data to png
    test_i = mnist_test.loc[mnist_test.label == i]
    for index, row in test_i.iterrows():
        X = row.iloc[1:].to_numpy().reshape(28, 28)
        filepath = f"mnist-png/test/{i}/test_image_{index}.png"
        img = Image.fromarray(X.astype("uint8"), mode="L")
        img.save(filepath)
License: https://creativecommons.org/publicdomain/zero/1.0/
The MNIST dataset is a set of 70,000 small images of digits handwritten by high school students and employees of the US Census Bureau. Each image is labeled with the digit it represents. This set has been studied so much that it is often called the "Hello World" of Machine Learning.
Test classification algorithms on this dataset and find out which one predicts best.
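One possible starting point (not part of the dataset description itself): fetch MNIST through scikit-learn and compare a couple of classifiers on a held-out split.

from sklearn.datasets import fetch_openml
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# 70,000 images, each flattened to 784 = 28x28 pixel values.
X, y = fetch_openml("mnist_784", version=1, return_X_y=True, as_frame=False)
X_train, X_test, y_train, y_test = train_test_split(
    X / 255.0, y, test_size=10000, random_state=0)

for name, clf in [("logistic regression", LogisticRegression(max_iter=200)),
                  ("3-nearest neighbours", KNeighborsClassifier(n_neighbors=3))]:
    clf.fit(X_train, y_train)
    print(name, "accuracy:", clf.score(X_test, y_test))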
Kuzushiji-MNIST is a drop-in replacement for the MNIST dataset (28x28 grayscale, 70,000 images), provided in the original MNIST format as well as a NumPy format. Since MNIST restricts us to 10 classes, we chose one character to represent each of the 10 rows of Hiragana when creating Kuzushiji-MNIST.
To use this dataset:
import tensorflow_datasets as tfds
ds = tfds.load('kmnist', split='train')
for ex in ds.take(4):
print(ex)
See the guide for more information on tensorflow_datasets.
Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/kmnist-3.0.1.png
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A major characteristic of spiking neural networks (SNNs) over conventional artificial neural networks (ANNs) is their ability to spike, enabling them to use spike timing for coding and efficient computing. In this paper, we assess if neuromorphic datasets recorded from static images are able to evaluate the ability of SNNs to use spike timings in their calculations. We have analyzed N-MNIST, N-Caltech101 and DvsGesture along these lines, but focus our study on N-MNIST. First we evaluate if additional information is encoded in the time domain in a neuromorphic dataset. We show that an ANN trained with backpropagation on frame-based versions of N-MNIST and N-Caltech101 images achieve 99.23 and 78.01% accuracy. These are comparable to the state of the art—showing that an algorithm that purely works on spatial data can classify these datasets. Second we compare N-MNIST and DvsGesture on two STDP algorithms, RD-STDP, that can classify only spatial data, and STDP-tempotron that classifies spatiotemporal data. We demonstrate that RD-STDP performs very well on N-MNIST, while STDP-tempotron performs better on DvsGesture. Since DvsGesture has a temporal dimension, it requires STDP-tempotron, while N-MNIST can be adequately classified by an algorithm that works on spatial data alone. This shows that precise spike timings are not important in N-MNIST. N-MNIST does not, therefore, highlight the ability of SNNs to classify temporal data. The conclusions of this paper open the question—what dataset can evaluate SNN ability to classify temporal data?
License: Attribution-ShareAlike 4.0 (CC BY-SA 4.0), https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
This dataset contains a portion of MNISQ: a dataset of quantum circuits that encode data from MNIST, Fashion-MNIST, and Kuzushiji-MNIST.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Abstract
In the last years, neural networks have evolved from laboratory environments to the state of the art for many real-world problems. Our hypothesis is that neural network models (i.e., their weights and biases) evolve on unique, smooth trajectories in weight space during training. It follows that a population of such neural network models (referred to as a "model zoo") would form topological structures in weight space. We think that the geometry, curvature and smoothness of these structures contain information about the state of training and can reveal latent properties of individual models. With such zoos, one could investigate novel approaches for (i) model analysis, (ii) discovering unknown learning dynamics, (iii) learning rich representations of such populations, or (iv) exploiting the model zoos for generative modelling of neural network weights and biases. Unfortunately, the lack of standardized model zoos and available benchmarks significantly increases the friction for further research on populations of neural networks. With this work, we publish a novel dataset of model zoos containing systematically generated and diverse populations of neural network models for further research. In total, the proposed model zoo dataset is based on six image datasets, consists of 24 model zoos generated with varying hyperparameter combinations, and includes 47,360 unique neural network models resulting in over 2,415,360 collected model states. In addition to the model zoo data, we provide an in-depth analysis of the zoos and provide benchmarks for multiple downstream tasks as mentioned before.
Dataset
This dataset is part of a larger collection of model zoos and contains the zoos trained on the labelled samples from Fashion-MNIST. All zoos with extensive information and code can be found at www.modelzoos.cc.
This repository contains two types of files: the raw model zoos as collections of models (file names beginning with "fmnist_"), as well as preprocessed model zoos wrapped in a custom pytorch dataset class (filenames beginning with "dataset"). Zoos are trained in three configurations varying the seed only (seed), varying hyperparameters with fixed seeds (hyp_fix) or varying hyperparameters with random seeds (hyp_rand). The index_dict.json files contain information on how to read the vectorized models.
For more information on the zoos and code to access and use the zoos, please see www.modelzoos.cc.