69 datasets found

MNIST Dataset
kaggle.com
opendatalab.com
+4 more
zip
Updated Jan 8, 2019
Cite
Hojjat Khodabakhsh (2019). MNIST Dataset [Dataset]. https://www.kaggle.com/datasets/hojjatk/mnist-dataset
Explore at:
Available download formats: zip (23112702 bytes)
Dataset updated
Jan 8, 2019
Authors
Hojjat Khodabakhsh
Description
Context

MNIST is a subset of a larger set available from NIST (it's copied from http://yann.lecun.com/exdb/mnist/)

Content

The MNIST database of handwritten digits has a training set of 60,000 examples and a test set of 10,000 examples. Four files are available:

train-images-idx3-ubyte.gz: training set images (9912422 bytes)

train-labels-idx1-ubyte.gz: training set labels (28881 bytes)

t10k-images-idx3-ubyte.gz: test set images (1648877 bytes)

t10k-labels-idx1-ubyte.gz: test set labels (4542 bytes)

How to read

See sample MNIST reader
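
The four files above use the IDX layout documented on the original MNIST page: two zero bytes, a type code (0x08 for unsigned bytes), a dimension count, one big-endian 32-bit size per dimension, then the raw bytes. The following is a minimal sketch of a reader under that assumption, not the sample reader the listing refers to; the function names read_idx and image_at are hypothetical.

```python
import gzip
import struct


def read_idx(path):
    """Read a gzip-compressed IDX file and return (dims, data).

    dims is a tuple of dimension sizes; data is a flat bytes object.
    """
    with gzip.open(path, "rb") as f:
        raw = f.read()
    # Header: two zero bytes, type code (0x08 = unsigned byte), number of dimensions.
    zero, type_code, ndim = struct.unpack(">HBB", raw[:4])
    if zero != 0 or type_code != 0x08:
        raise ValueError("not an unsigned-byte IDX file")
    header_end = 4 + 4 * ndim
    dims = struct.unpack(f">{ndim}I", raw[4:header_end])
    data = raw[header_end:]
    expected = 1
    for size in dims:
        expected *= size
    if len(data) != expected:
        raise ValueError("payload size does not match the header")
    return dims, data


def image_at(dims, data, index):
    """Return image number `index` from an images file as a list of pixel rows."""
    _count, rows, cols = dims
    start = index * rows * cols
    return [list(data[start + r * cols:start + (r + 1) * cols]) for r in range(rows)]
```

For the labels files the same reader returns a one-dimensional dims tuple, and data[i] is the label of example i.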

Acknowledgements

Yann LeCun, Courant Institute, NYU

Corinna Cortes, Google Labs, New York

Christopher J.C. Burges, Microsoft Research, Redmond

Inspiration

Many methods have been tested with this training set and test set (see http://yann.lecun.com/exdb/mnist/ for more details)

MNIST-100

kaggle.com

zip

Updated Jul 25, 2023

+ more versions

Cite

Marcin Wierzbiński (2023). MNIST-100 [Dataset]. https://www.kaggle.com/datasets/martininf1n1ty/mnist100

Explore at:

Available download formats: zip (23452456 bytes)

Dataset updated

Jul 25, 2023

Authors

Marcin Wierzbiński

License

http://www.gnu.org/licenses/lgpl-3.0.html

Description

The MNIST-100 dataset is a variation of the original MNIST dataset, consisting of handwritten two-digit numbers extracted from the MNIST dataset. Unlike the traditional MNIST dataset, which contains 60,000 training images of the digits 0 to 9, MNIST-100 covers 100 classes, the numbers 00 to 99.

Dataset Overview:
- Dataset Name: MNIST-100
- Total Number of Images: train: 60000 test: 1000
- Classes: 100 (Numbers from 00 to 99)
- Image Size: 28x56 pixels (grayscale)

Data Collection: The MNIST-100 dataset was created by randomly selecting 10 unique digits from the original MNIST dataset. For each selected digit, 10 representative images were extracted, resulting in a total of 100 images. These images were carefully chosen to represent a diverse range of handwriting styles for each digit.

Each image in the dataset is labeled with its corresponding numbers, ranging from 00 to 99, making it suitable for classification tasks. Researchers and practitioners can use this dataset to train and evaluate machine learning algorithms and neural networks for digit recognition and classification.

Please note that the Modified MNIST-100 dataset is not intended to replace the original MNIST dataset but serves as a complementary resource for specific applications requiring a smaller and more focused subset of the MNIST data.

Overall, the MNIST-100 dataset offers a compact and representative collection of 100 handwritten numbers, providing a convenient tool for experimentation and learning in computer vision and pattern recognition.
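
The 28x56 image size and the 00 to 99 label range are consistent with two 28x28 MNIST digits placed side by side, although the listing does not say how the images were actually built. A minimal sketch under that assumption; compose_pair, pair_label, and format_label are hypothetical helper names.

```python
def compose_pair(left, right):
    """Join two 28x28 images (lists of pixel rows) into one 28x56 image."""
    if len(left) != 28 or len(right) != 28:
        raise ValueError("both images must have 28 rows")
    return [list(l_row) + list(r_row) for l_row, r_row in zip(left, right)]


def pair_label(left_digit, right_digit):
    """Class index in 0..99 for a (tens, ones) digit pair (assumed convention)."""
    return 10 * left_digit + right_digit


def format_label(label):
    """Render a class index in the two-character form used above (00 to 99)."""
    return f"{label:02d}"
```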

Label Distribution for training set:

Label	Occurrences	Label	Occurrences	Label	Occurrences
0	561	34	629	68	606
1	687	35	540	69	582
2	582	36	588	70	566
3	633	37	619	71	659
4	588	38	584	72	572
5	544	39	609	73	682
6	582	40	570	74	627
7	615	41	679	75	598
8	584	42	544	76	605
9	567	43	567	77	602
10	641	44	574	78	595
11	780	45	555	79	586
12	720	46	550	80	569
13	699	47	614	81	628
14	630	48	614	82	578
15	627	49	595	83	622
16	684	50	505	84	569
17	713	51	583	85	540
18	743	52	512	86	557
19	706	53	555	87	628
20	527	54	504	88	562
21	710	55	488	89	625
22	586	56	531	90	600
23	584	57	556	91	700
24	568	58	497	92	622
25	530	59	520	93	622
26	612	60	556	94	591
27	627	61	682	95	557
28	618	62	594	96	580
29	619	63	539	97	640
30	622	64	610	98	577
31	684	65	514	99	563
32	606	66	587
33	592	67	655

Test data:

Label	Occurrences	Label	Occurrences	Label	Occurrences
00	96	34	100	68	90
01	108	35	91	69	92
02	91	36	107	70	102
03	96	37	112	71	116
04	75	38	97	72	101
05	85	39	96	73	106
06	88	40	103	74	98
07	96	41	123	75 ...

Data from: Fashion-MNIST
datasets.activeloop.ai
tensorflow.org
+3 more
deeplake
Updated Feb 8, 2022
Cite
Han Xiao, Kashif Rasul, Roland Vollgraf (2022). Fashion-MNIST [Dataset]. https://datasets.activeloop.ai/docs/ml/datasets/fashion-mnist-dataset/
Explore at:
Available download formats: deeplake
Dataset updated
Feb 8, 2022
Authors
Han Xiao, Kashif Rasul, Roland Vollgraf
License
https://github.com/zalandoresearch/fashion-mnist/blob/master/LICENSE
Description
A dataset of 70,000 fashion images with labels for 10 classes. The dataset was created by researchers at Zalando Research and is used for research in machine learning and computer vision tasks such as image classification.
Fashion MNIST Image Dataset
kaggle.com
Updated May 15, 2025
+ more versions
Cite
Ghanshyam Saini (2025). Fashion MNIST Image Dataset [Dataset]. https://www.kaggle.com/datasets/ghnshymsaini/fashion-mnist-image-dataset
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
May 15, 2025
Dataset provided by
Kaggle: http://kaggle.com/
Authors
Ghanshyam Saini
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
Fashion-MNIST Dataset (Image Files and CSV Labels)

This dataset contains images of Zalando's article categories, intended for fashion image classification. It serves as a direct drop-in replacement for the original MNIST dataset, often used as a benchmark for machine learning algorithms. Fashion-MNIST is slightly more challenging than regular MNIST.

Dataset Structure:

The dataset is organized into the following files and folders:

train/: This folder contains the training set images. It holds 60,000 grayscale images, each with dimensions 28x28 pixels. The images are in PNG format. The filenames within this folder are not explicitly labeled with the class, so you will need to refer to the train.csv file for the corresponding labels.

test/: This folder contains the testing set images. It holds 10,000 grayscale images, each with dimensions 28x28 pixels and in PNG format. Similar to the training set, the filenames here are not directly labeled, and the test.csv file provides the corresponding labels.

train.csv: This CSV (Comma Separated Values) file contains the labels for the images in the train/ folder. Each row in this file corresponds to an image. It contains the following columns:

pixel1, pixel2, ..., pixel784: These columns represent the flattened pixel values of the 28x28 grayscale images. The pixel values are integers ranging from 0 to 255.

label: This column contains the corresponding class label (an integer from 0 to 9) for the image. You will need to refer to the class mapping (provided below) to understand the meaning of these numerical labels.

test.csv: This CSV file contains the labels for the images in the test/ folder, following the same format as train.csv with pixel1 to pixel784 columns and a label column.

Content of the Data:

Each image in the Fashion-MNIST dataset belongs to one of the following 10 classes:

Label Description
0 T-shirt/top
1 Trouser
2 Pullover
3 Dress
4 Coat
5 Sandal
6 Shirt
7 Sneaker
8 Bag
9 Ankle boot

The images are grayscale, meaning each pixel has a single intensity value.

How to Use This Dataset:

Download the entire dataset, including the train/ and test/ folders and the train.csv and test.csv files.

The image files in the train/ and test/ folders contain the visual data. You can load these images using libraries that handle image formats (like PIL, OpenCV).

The train.csv and test.csv files provide the ground truth labels for the corresponding images. You can read these CSV files using libraries like Pandas. The pixel values in the CSV can be reshaped into a 28x28 matrix to represent the image. The label column provides the class of the fashion item.

You can train your image classification models using the train/ images and train.csv labels.

Evaluate the performance of your trained models using the test/ images and test.csv labels.
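
The CSV layout described above can be read with Python's standard csv module. This is a minimal sketch, assuming the column names label and pixel1 to pixel784 as described and a row-major pixel order (the order is an assumption); load_samples is a hypothetical helper name.

```python
import csv

# Class names from the label table above.
CLASS_NAMES = [
    "T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot",
]


def load_samples(csv_file):
    """Parse an open CSV file into a list of (image, label) pairs.

    Each image is a list of 28 rows with 28 integer pixel values (0 to 255).
    """
    samples = []
    for row in csv.DictReader(csv_file):
        label = int(row["label"])
        flat = [int(row[f"pixel{i}"]) for i in range(1, 785)]
        # Reshape the 784 flattened values into 28 rows of 28 pixels.
        image = [flat[r * 28:(r + 1) * 28] for r in range(28)]
        samples.append((image, label))
    return samples
```

A typical call would be load_samples(open("train.csv", newline="")), after which CLASS_NAMES[label] gives the readable class name.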

Citation:

When using the Fashion-MNIST dataset, please cite the original paper:

Xiao, Han, Kashif Rasul, and Roland Vollgraf. "Fashion-mnist: a novel image dataset for benchmarking machine learning algorithms." arXiv preprint arXiv:1708.07747 (2017).

Data Contribution:

Thank you for providing this well-structured Fashion-MNIST dataset with separate image folders and CSV label files. This organization makes it convenient for users to work with both the raw image data and the corresponding labels for training and evaluation of their fashion classification models.

If you find this dataset structure clear, well-organized, and useful for your projects, please consider giving it an upvote after downloading. Your feedback and appreciation are valuable!
mnist.pkl.gz
figshare.com
application/gzip
Updated May 31, 2023
Cite
Yann LeCun (2023). mnist.pkl.gz [Dataset]. http://doi.org/10.6084/m9.figshare.13303457.v1
Explore at:
Available download formats: application/gzip
Unique identifier
https://doi.org/10.6084/m9.figshare.13303457.v1
Dataset updated
May 31, 2023
Dataset provided by
Figshare: http://figshare.com/
Authors
Yann LeCun
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
MNIST dataset originally hosted on https://deeplearning.net, re-hosted here because deeplearning.net is currently inaccessible.
Hindi/Devanagari MNIST Data
kaggle.com
zip
Updated Mar 18, 2025
Cite
Anurag Shenoy (2025). Hindi/Devanagari MNIST Data [Dataset]. https://www.kaggle.com/datasets/anurags397/hindi-mnist-data
Explore at:
Available download formats: zip (18064821 bytes)
Dataset updated
Mar 18, 2025
Authors
Anurag Shenoy
Description
Context

Handwritten image data is easy to find in languages such as English and Japanese, but not for many Indian languages including Hindi. While trying to create an MNIST-like personal project, I stumbled upon a Hindi handwritten characters dataset by Shailesh Acharya and Prashnna Kumar Gyawali, which is uploaded to the UCI Machine Learning Repository.

This dataset, however, only has the digits from 0 to 9; all other characters have been removed.

Content

Data Type: Grayscale image
Image Format: PNG
Resolution: 32 by 32 pixels. The actual character is centered within 28 by 28 pixels, and a padding of 2 pixels is added on all four sides of the character.

There are about 1,700 images per class in the Train set and about 300 images per class in the Test set.
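
Because the character occupies the central 28 by 28 region, the padding can be stripped to obtain MNIST-sized images. A minimal sketch, assuming the image has already been decoded into a list of 32 pixel rows; crop_center is a hypothetical helper name.

```python
def crop_center(image, pad=2):
    """Remove `pad` pixels from every side of an image given as a list of rows."""
    height = len(image)
    width = len(image[0])
    return [row[pad:width - pad] for row in image[pad:height - pad]]
```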

Acknowledgements

The Dataset is ©️ Original Authors.

Original Authors:
- Shailesh Acharya
- Prashnna Kumar Gyawali

Citation: S. Acharya, A.K. Pant and P.K. Gyawali, "Deep Learning Based Large Scale Handwritten Devanagari Character Recognition", In Proceedings of the 9th International Conference on Software, Knowledge, Information Management and Applications (SKIMA), pp. 121-126, 2015.

The full Dataset is available here: https://archive.ics.uci.edu/ml/datasets/Devanagari+Handwritten+Character+Dataset
Extended MNIST (EMNIST) dataset
researchdata.edu.au
Updated May 16, 2023
+ more versions
Cite
van Schaik Andre; Tapson Jonathan; Afshar Saeed; Cohen Gregory (2023). Extended MNIST (EMNIST) dataset [Dataset]. http://doi.org/10.26183/ZN7S-GH79
Explore at:
Unique identifier
https://doi.org/10.26183/ZN7S-GH79
Dataset updated
May 16, 2023
Dataset provided by
Western Sydney University
Authors
van Schaik Andre; Tapson Jonathan; Afshar Saeed; Cohen Gregory
License
Attribution-NoDerivs 4.0 (CC BY-ND 4.0): https://creativecommons.org/licenses/by-nd/4.0/
License information was derived automatically
Description
The EMNIST dataset is a set of handwritten character digits derived from the NIST Special Database 19 (https://www.nist.gov/srd/nist-special-database-19) and converted to a 28x28 pixel image format and dataset structure that directly matches the MNIST dataset (http://yann.lecun.com/exdb/mnist/). Further information on the dataset contents and conversion process can be found in the paper available at https://arxiv.org/abs/1702.05373v2
The MNIST dataset has become a standard benchmark for learning, classification and computer vision systems. Contributing to its widespread adoption are the understandable and intuitive nature of the task, its relatively small size and storage requirements and the accessibility and ease-of-use of the database itself. The MNIST database was derived from a larger dataset known as the NIST Special Database 19 which contains digits, uppercase and lowercase handwritten letters. This paper introduces a variant of the full NIST dataset, which we have called Extended MNIST (EMNIST), which follows the same conversion paradigm used to create the MNIST dataset. The result is a set of datasets that constitute more challenging classification tasks involving letters and digits, and that share the same image structure and parameters as the original MNIST task, allowing for direct compatibility with all existing classifiers and systems. Benchmark results are presented along with a validation of the conversion process through the comparison of the classification results on converted NIST digits and the MNIST digits.
The database is made available in original MNIST format and Matlab format.
not-MNIST
datasets.activeloop.ai
opendatalab.com
+2 more
deeplake
Updated Mar 11, 2022
+ more versions
Cite
Yaroslav Bulatov (2022). not-MNIST [Dataset]. https://datasets.activeloop.ai/docs/ml/datasets/not-mnist-dataset/
Explore at:
Available download formats: deeplake
Dataset updated
Mar 11, 2022
Authors
Yaroslav Bulatov
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
The not-MNIST dataset is a dataset of letter glyphs (A to J) rendered in a variety of fonts and styles rather than handwritten digits, which makes it more challenging than the MNIST dataset. It can be used for machine learning and artificial intelligence research. According to this listing, the dataset consists of 100,000 images, divided into a training set of 60,000 images and a test set of 40,000 images. The images are 28x28 pixels in size and are grayscale. The dataset is available under the Creative Commons Zero Public Domain Dedication license.
MNIST-100
gts.ai
json
Updated Apr 28, 2024
Cite
Globose Technology Solutions Pvt Ltd (2024). MNIST-100 [Dataset]. https://gts.ai/dataset-download/mnist-100/
Explore at:
Available download formats: json
Dataset updated
Apr 28, 2024
Dataset authored and provided by
Globose Technology Solutions Pvt Ltd
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
The MNIST-100 dataset is a curated subset of the original MNIST dataset, designed to support computer vision and machine learning research focused on digit recognition and classification. It provides clean, well-labeled samples for rapid experimentation and model benchmarking.
moving_mnist
tensorflow.org
opendatalab.com
Updated Nov 23, 2022
Cite
(2022). moving_mnist [Dataset]. https://www.tensorflow.org/datasets/catalog/moving_mnist
Explore at:
Dataset updated
Nov 23, 2022
Description
Moving variant of MNIST database of handwritten digits. This is the data used by the authors for reporting model performance. See tfds.video.moving_mnist.image_as_moving_sequence for generating training/validation data from the MNIST dataset.

To use this dataset:

    import tensorflow_datasets as tfds

    # moving_mnist ships evaluation data only, so load the 'test' split.
    ds = tfds.load('moving_mnist', split='test')
    for ex in ds.take(4):
        print(ex)

See the guide for more information on tensorflow_datasets.
MNIST Dataset
kaggle.com
zip
Updated Feb 6, 2024
Cite
Marvin Luckianto (2024). MNIST Dataset [Dataset]. https://www.kaggle.com/datasets/marvinluckianto/mnist-dataset
Explore at:
Available download formats: zip (11494011 bytes)
Dataset updated
Feb 6, 2024
Authors
Marvin Luckianto
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
The MNIST database (Modified National Institute of Standards and Technology database) is a large collection of handwritten digits. It is a subset of the larger NIST Special Database 3 (digits written by employees of the United States Census Bureau) and Special Database 1 (digits written by high school students), which contain monochrome images of handwritten digits. The digits have been size-normalized and centered in a fixed-size image. The original black and white (bilevel) images from NIST were size-normalized to fit in a 20x20 pixel box while preserving their aspect ratio. The resulting images contain grey levels as a result of the anti-aliasing technique used by the normalization algorithm. The images were centered in a 28x28 image by computing the center of mass of the pixels and translating the image so as to position this point at the center of the 28x28 field.
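
The centering step described above can be illustrated as follows. This is a minimal sketch, not the original NIST/MNIST preprocessing code: it assumes the glyph is a list of pixel rows (for example the 20x20 box) and that the translation is rounded to whole pixels; center_of_mass and center_in_field are hypothetical helper names.

```python
def center_of_mass(image):
    """Return the intensity-weighted (row, column) center of an image."""
    total = sum(sum(row) for row in image)
    if total == 0:
        raise ValueError("image has no non-zero pixels")
    cy = sum(r * v for r, row in enumerate(image) for v in row) / total
    cx = sum(c * v for row in image for c, v in enumerate(row)) / total
    return cy, cx


def center_in_field(glyph, size=28):
    """Copy glyph into a size x size field with its center of mass at the field center."""
    cy, cx = center_of_mass(glyph)
    center = (size - 1) / 2  # 13.5 for a 28x28 field
    # Whole-pixel shift (an assumption of this sketch).
    dy = round(center - cy)
    dx = round(center - cx)
    field = [[0] * size for _ in range(size)]
    for r, row in enumerate(glyph):
        for c, value in enumerate(row):
            rr, cc = r + dy, c + dx
            if 0 <= rr < size and 0 <= cc < size:
                field[rr][cc] = value
    return field
```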

License: Yann LeCun and Corinna Cortes hold the copyright of MNIST dataset, which is a derivative work from original NIST datasets. MNIST dataset is made available under the terms of the Creative Commons Attribution-Share Alike 3.0 license.
Moving MNIST
kaggle.com
zip
Updated Jun 17, 2024
Cite
Huy Phan (2024). Moving MNIST [Dataset]. https://www.kaggle.com/datasets/hughiephan/moving-mnist
Explore at:
Available download formats: zip (22299997 bytes)
Dataset updated
Jun 17, 2024
Authors
Huy Phan
Description
Dataset

This dataset was created by Huy Phan

NMNIST
gts.ai
data.mendeley.com
json
Updated Mar 28, 2024
Cite
GTS (2024). NMNIST [Dataset]. https://gts.ai/dataset-download/nmnist/
Explore at:
Available download formats: json
Dataset updated
Mar 28, 2024
Dataset provided by
GLOBOSE TECHNOLOGY SOLUTIONS PRIVATE LIMITED
Authors
GTS
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
Recording of the MNIST dataset displayed on a screen as viewed by a dynamic vision sensor moving through a fixed trajectory on a pan-tilt unit. Details are in the listed paper.
Mnist Dataset
universe.roboflow.com
zip
Updated Nov 28, 2022
+ more versions
Cite
hobby (2022). Mnist Dataset [Dataset]. https://universe.roboflow.com/hobby-mmwmp/mnist-4kzkx/dataset/1
Explore at:
Available download formats: zip
Dataset updated
Nov 28, 2022
Dataset authored and provided by
hobby
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Variables measured
Numbers
Description
Mnist

Overview

Mnist is a dataset for classification tasks - it contains Numbers annotations for 400 images.

Getting Started

You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.

License

This dataset is available under the MIT license (https://creativecommons.org/licenses/MIT).
kmnist
tensorflow.org
datasets.activeloop.ai
Updated Jun 1, 2024
Cite
(2024). kmnist [Dataset]. https://www.tensorflow.org/datasets/catalog/kmnist
Explore at:
Dataset updated
Jun 1, 2024
Description
Kuzushiji-MNIST is a drop-in replacement for the MNIST dataset (28x28 grayscale, 70,000 images), provided in the original MNIST format as well as a NumPy format. Since MNIST restricts us to 10 classes, we chose one character to represent each of the 10 rows of Hiragana when creating Kuzushiji-MNIST.

To use this dataset:

    import tensorflow_datasets as tfds

    ds = tfds.load('kmnist', split='train')
    for ex in ds.take(4):
        print(ex)

See the guide for more information on tensorflow_datasets.

MNIST-fashion-png
kaggle.com
zip
Updated Feb 19, 2022
Cite
PedroStu (2022). MNIST-fashion-png [Dataset]. https://www.kaggle.com/datasets/prashantdandriyal/mnistfashionpng
Explore at:
Available download formats: zip (52473305 bytes)
Dataset updated
Feb 19, 2022
Authors
PedroStu
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Dataset

This dataset was created by PedroStu

Released under CC0: Public Domain

Mnist Dataset
universe.roboflow.com
zip
Updated Feb 20, 2023
+ more versions
Cite
Object Detection (2023). Mnist Dataset [Dataset]. https://universe.roboflow.com/object-detection-uscpv/mnist-a64ay/model/1
Explore at:
Available download formats: zip
Dataset updated
Feb 20, 2023
Dataset provided by
Object detection
Authors
Object Detection
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Variables measured
Mnist Bounding Boxes
Description
Mnist

Overview

Mnist is a dataset for object detection tasks - it contains Mnist annotations for 1,800 images.

Getting Started

You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.

License

This dataset is available under the Public Domain license (https://creativecommons.org/licenses/Public Domain).
Corrupted MNIST
kaggle.com
zip
Updated Nov 24, 2023
Cite
Shreyasi Mandal (2023). Corrupted MNIST [Dataset]. https://www.kaggle.com/datasets/shreyasi2002/corrupted-mnist/code
Explore at:
Available download formats: zip (55618716 bytes)
Dataset updated
Nov 24, 2023
Authors
Shreyasi Mandal
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
This dataset consists of 60,000 images with dimensions 32x32. The images are the same as the MNIST database of handwritten digits - http://yann.lecun.com/exdb/mnist/

CHALLENGE

1. The notebook provided gets a very low test accuracy (45%) on this data, while the training accuracy was 99%. Can you get a higher accuracy?
2. Train models on the original MNIST dataset and test them on this dataset.

Notebook to get started - https://www.kaggle.com/code/shreyasi2002/testing-vgg16-on-corrupted-mnist/notebook

So, how are the images corrupted?
The MNIST images are perturbed using Projected Gradient Descent Attack (https://www.kaggle.com/code/shreyasi2002/pgd-attack-on-mnist-and-fashion-mnist)
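
To show the general shape of a Projected Gradient Descent attack, here is a minimal sketch on a logistic-regression classifier, chosen only because its input gradient has a closed form. It is not the linked notebook's code; the model, the L-infinity budget eps, the step size alpha, and the step count are all illustrative assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def loss_gradient(x, y, w, b):
    """Gradient of the logistic loss with respect to the input x."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return [(p - y) * wi for wi in w]


def sign(v):
    return (v > 0) - (v < 0)


def pgd_attack(x, y, w, b, eps=0.1, alpha=0.02, steps=10):
    """Return a perturbed copy of x that stays within eps of x per pixel."""
    x_adv = list(x)
    for _ in range(steps):
        grad = loss_gradient(x_adv, y, w, b)
        # Ascent step: move each pixel in the direction that increases the loss.
        x_adv = [xa + alpha * sign(g) for xa, g in zip(x_adv, grad)]
        # Projection: clip back into the eps-ball around the original input...
        x_adv = [min(max(xa, xo - eps), xo + eps) for xa, xo in zip(x_adv, x)]
        # ...and into the valid pixel range [0, 1].
        x_adv = [min(max(xa, 0.0), 1.0) for xa in x_adv]
    return x_adv
```

The projection after every step is what distinguishes PGD from a single-step sign-gradient attack: the perturbation can take many small steps but never leaves the allowed budget.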
notMNIST
huggingface.co
Updated Dec 21, 2023
Cite
Anubhav Maity (2023). notMNIST [Dataset]. https://huggingface.co/datasets/anubhavmaity/notMNIST
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Dec 21, 2023
Authors
Anubhav Maity
Description
Dataset Card for "notMNIST"

Overview

The notMNIST dataset is a collection of images of letters from A to J in various fonts. It is designed as a more challenging alternative to the traditional MNIST dataset, which consists of handwritten digits. The notMNIST dataset is commonly used in machine learning and computer vision tasks for character recognition.

Dataset Information

Number of Classes: 10 (A to J)
Number of Samples: 187,24
Image Size: 28 x 28 pixels…

See the full description on the dataset page: https://huggingface.co/datasets/anubhavmaity/notMNIST.
Federated EMNIST Dataset
figshare.com
xz
Updated Jul 16, 2024
Cite
Saroj Mali (2024). Federated EMNIST Dataset [Dataset]. http://doi.org/10.6084/m9.figshare.26308777.v1
Explore at:
Available download formats: xz
Unique identifier
https://doi.org/10.6084/m9.figshare.26308777.v1
Dataset updated
Jul 16, 2024
Dataset provided by
Figshare: http://figshare.com/
Authors
Saroj Mali
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset is derived from the Leaf repository (https://github.com/TalwalkarLab/leaf) pre-processing of the Extended MNIST dataset, grouping examples by writer. Details about Leaf were published in "LEAF: A Benchmark for Federated Settings" https://arxiv.org/abs/1812.01097

Note: This dataset does not include some additional preprocessing that MNIST includes, such as size-normalization and centering. In the Federated EMNIST data, the value of 1.0 corresponds to the background, and 0.0 corresponds to the color of the digits themselves; this is the inverse of some MNIST representations, e.g. in tensorflow_datasets, where 0 corresponds to the background color, and 255 represents the color of the digit.

Data set sizes:

only_digits=True: 3,383 users, 10 label classes
train: 341,873 examples
test: 40,832 examples

only_digits=False: 3,400 users, 62 label classes
train: 671,585 examples
test: 77,483 examples

Rather than holding out specific users, each user's examples are split across train and test so that all users have at least one example in train and one example in test. Writers that had less than 2 examples are excluded from the data set.

The tf.data.Datasets returned by tff.simulation.datasets.ClientData.create_tf_dataset_for_client will yield collections.OrderedDict objects at each iteration, with the following keys and values, in lexicographic order by key:

'label': a tf.Tensor with dtype=tf.int32 and shape [1], the class label of the corresponding pixels. Labels [0-9] correspond to the digits classes, labels [10-35] correspond to the uppercase classes (e.g., label 11 is 'B'), and labels [36-61] correspond to the lowercase classes (e.g., label 37 is 'b').

'pixels': a tf.Tensor with dtype=tf.float32 and shape [28, 28], containing the pixels of the handwritten digit, with values in the range [0.0, 1.0].

Args

only_digits: (Optional) whether to only include examples that are from the digits [0-9] classes. If False, includes lower and upper case characters, for a total of 62 class labels.
cache_dir: (Optional) directory to cache the downloaded file. If None, caches in Keras' default cache directory.

Returns

Tuple of (train, test) where the tuple elements are tff.simulation.datasets.ClientData objects.
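
The label ranges and the inverted pixel convention described above can be handled with two small helpers. This is a minimal sketch in plain Python that does not use the tff API; label_to_char and to_mnist_convention are hypothetical names, and the rounding to integer grey levels is an assumption.

```python
import string


def label_to_char(label):
    """Map a class label in 0..61 to its character (digits, uppercase, lowercase)."""
    if 0 <= label <= 9:
        return string.digits[label]
    if 10 <= label <= 35:
        return string.ascii_uppercase[label - 10]
    if 36 <= label <= 61:
        return string.ascii_lowercase[label - 36]
    raise ValueError("label must be in the range 0..61")


def to_mnist_convention(pixels):
    """Convert [0.0, 1.0] pixels with a 1.0 background to 0..255 with a 0 background."""
    return [[round((1.0 - p) * 255) for p in row] for row in pixels]
```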
