32 datasets found

mnist
tensorflow.org
universe.roboflow.com
+3 more
Updated Jun 1, 2024
Cite
(2024). mnist [Dataset]. https://www.tensorflow.org/datasets/catalog/mnist
Dataset updated
Jun 1, 2024
Description
The MNIST database of handwritten digits.

To use this dataset:

import tensorflow_datasets as tfds

ds = tfds.load('mnist', split='train')
for ex in ds.take(4):
    print(ex)

See the guide for more information on tensorflow_datasets.

Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/mnist-3.0.1.png
MNIST Dataset
paperswithcode.com
Cite
Y. LeCun; L. Bottou; Y. Bengio; P. Haffner, MNIST Dataset [Dataset]. https://paperswithcode.com/dataset/mnist
Authors
Y. LeCun; L. Bottou; Y. Bengio; P. Haffner
Description
The MNIST database (Modified National Institute of Standards and Technology database) is a large collection of handwritten digits. It has a training set of 60,000 examples and a test set of 10,000 examples. It is a subset of the larger NIST Special Database 3 (digits written by employees of the United States Census Bureau) and Special Database 1 (digits written by high school students), which contain monochrome images of handwritten digits. The digits have been size-normalized and centered in a fixed-size image. The original black and white (bilevel) images from NIST were size-normalized to fit in a 20x20 pixel box while preserving their aspect ratio. The resulting images contain grey levels as a result of the anti-aliasing technique used by the normalization algorithm. The images were then centered in a 28x28 image by computing the center of mass of the pixels and translating the image so as to position this point at the center of the 28x28 field.
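The centering step described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the original preprocessing code: the function name `center_by_mass`, the rounding of the offset to whole pixels, and the clamping that keeps the patch inside the field are all choices made here.

```python
def center_by_mass(box, size=28):
    """Place a small grayscale patch (list of rows) in a size x size field
    so that its intensity center of mass lands near the field center."""
    h, w = len(box), len(box[0])
    total = sum(sum(row) for row in box)
    if total == 0:
        raise ValueError("patch has no ink, center of mass is undefined")
    # Intensity-weighted mean row and column of the patch.
    cy = sum(r * v for r, row in enumerate(box) for v in row) / total
    cx = sum(c * v for row in box for c, v in enumerate(row)) / total
    # Offset that moves the center of mass to the field center.
    # Assumption: whole-pixel shifts, clamped so the patch stays in bounds.
    top = max(0, min(size - h, round((size - 1) / 2 - cy)))
    left = max(0, min(size - w, round((size - 1) / 2 - cx)))
    field = [[0] * size for _ in range(size)]
    for r in range(h):
        for c in range(w):
            field[top + r][left + c] = box[r][c]
    return field


# A uniform 20x20 patch has its center of mass in the middle,
# so it ends up with a 4-pixel margin on every side.
uniform_field = center_by_mass([[1] * 20 for _ in range(20)])
```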
MNIST
datasets.activeloop.ai
deeplake
+ more versions
Cite
Yann LeCun, MNIST [Dataset]. https://datasets.activeloop.ai/docs/ml/datasets/mnist/
Available download formats: deeplake
Authors
Yann LeCun
License
Attribution-ShareAlike 3.0 (CC BY-SA 3.0): https://creativecommons.org/licenses/by-sa/3.0/
License information was derived automatically
Time period covered
Jan 1, 1998 - Dec 31, 2000
Area covered
Earth
Dataset funded by
AT&T Bell Labs
Description
The MNIST dataset is a dataset of handwritten digits. It is a popular dataset for machine learning and artificial intelligence research. The dataset consists of 60,000 training images and 10,000 test images. Each image is a 28x28 pixel grayscale image of a handwritten digit. The digits are labeled from 0 to 9.
Mnist 42000 Images Dataset
universe.roboflow.com
zip
Updated Apr 25, 2023
Cite
Roboflow (2023). Mnist 42000 Images Dataset [Dataset]. https://universe.roboflow.com/roboflow-jvuqo/mnist-42000-images-u0qdg
Available download formats: zip
Dataset updated
Apr 25, 2023
Dataset provided by
Roboflow, Inc.
Authors
Roboflow
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Variables measured
Numbers
Description
The MNIST database (Modified National Institute of Standards and Technology database) is a large database of handwritten digits that is commonly used for training various image processing systems. The database is also widely used for training and testing in the field of machine learning. It was created by "re-mixing" the samples from NIST's original datasets. The creators felt that since NIST's training dataset was taken from American Census Bureau employees, while the testing dataset was taken from American high school students, it was not well-suited for machine learning experiments. Furthermore, the black and white images from NIST were normalized to fit into a 28x28 pixel bounding box and anti-aliased, which introduced grayscale levels.

Yann LeCun, Courant Institute, NYU
Corinna Cortes, Google Labs, New York
Christopher J.C. Burges, Microsoft Research, Redmond
mnist100
huggingface.co
Updated Aug 16, 2023
Cite
Marcin Wierzbiński (2023). mnist100 [Dataset]. https://huggingface.co/datasets/marcin119a/mnist100
Dataset updated
Aug 16, 2023
Authors
Marcin Wierzbiński
License
https://choosealicense.com/licenses/gpl/
Description
The MNIST-100 dataset is a variation of the original MNIST dataset, consisting of 100 handwritten numbers extracted from the MNIST dataset. Unlike the traditional MNIST dataset, which contains 60,000 training images of digits from 0 to 9, the MNIST-100 dataset focuses on 100 numbers.

Dataset Overview:

Dataset Name: MNIST-100
Total Number of Images: train: 60000, test: 1000
Classes: 100 (Numbers from 00 to 99)
Image Size: 28x56 pixels (grayscale)

Data Collection: The MNIST-100 dataset… See the full description on the dataset page: https://huggingface.co/datasets/marcin119a/mnist100.
MNIST Dataset
scidb.cn
Updated Feb 16, 2023
Cite
Xuyu Zhang; Jingjing Gao; Yu Gan; Chunyuan Song; Dawei Zhang; Songlin Zhuang; Shensheng Han; Puxiang Lai; Honglin Liu (2023). MNIST Dataset [Dataset]. http://doi.org/10.57760/sciencedb.07421
Unique identifier
https://doi.org/10.57760/sciencedb.07421
Dataset updated
Feb 16, 2023
Dataset provided by
Science Data Bank
Authors
Xuyu Zhang; Jingjing Gao; Yu Gan; Chunyuan Song; Dawei Zhang; Songlin Zhuang; Shensheng Han; Puxiang Lai; Honglin Liu
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
MNIST is an image dataset of handwritten digits, built from data collected by the National Institute of Standards and Technology (NIST) of the United States. Handwriting samples were collected from a total of 250 writers, 50% of whom were high school students and 50% staff of the Census Bureau. The purpose of the dataset is to enable the recognition of handwritten digits by algorithms. The training set contains 60,000 images and labels, while the test set contains 10,000 images and labels. The first 5,000 test examples come from the original NIST training data and the last 5,000 from the original NIST test data. The first 5,000 are more regular than the last 5,000, because the first 5,000 come from employees of the US Census Bureau and the last 5,000 come from students.
MNIST-Federated-Learning
zenodo.org
csv, zip
Updated Jul 3, 2023
Cite
Ferraguig Lynda; Benoit Alexandre; Bettinelli Mickael; Lin-Kwong-Chon Christophe (2023). MNIST-Federated-Learning [Dataset]. http://doi.org/10.5281/zenodo.8104408
Available download formats: csv, zip
Unique identifier
https://doi.org/10.5281/zenodo.8104408
Dataset updated
Jul 3, 2023
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Ferraguig Lynda; Benoit Alexandre; Bettinelli Mickael; Lin-Kwong-Chon Christophe
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Below are descriptions of the three configurations used to partition the MNIST training set among 10 clients. The MNIST test data is provided alongside them as a single file.

Balanced Distribution: In the first configuration, the MNIST dataset is partitioned among 10 clients in a balanced manner. This means that the data samples from each class are evenly distributed among the clients. Each client receives a roughly equal number of images from each digit class, ensuring that the distribution of samples across clients is proportional and representative of the overall dataset. [ Config 1]

Heterogeneous Distribution (One Class per Client): In the second configuration, the MNIST dataset is partitioned in a heterogeneous manner, where each client is assigned a single digit class exclusively. This means that one client will only receive images of the digit '0', another client will receive images of the digit '1', and so on. In this setup, each client becomes an expert in classifying a specific digit, allowing for specialized training and evaluation. [ Config 2]

Mixed Distribution: In the third configuration, the MNIST dataset is partitioned using a mixed distribution approach. This means that the data samples from all digit classes are distributed among the 10 clients, but the distribution is not necessarily balanced. The number of samples assigned to each client may vary for different digit classes, resulting in an uneven distribution across the clients. This configuration aims to capture both the overall diversity of the dataset and the varying difficulty levels of classifying different digits. [ Config 3 ]
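As a rough illustration of the first two configurations, the sketch below splits a list of labels among 10 clients. It is not the authors' partitioning script: the function names, the round-robin dealing, and the seed are assumptions made for this example, and it works on sample indices rather than on the CSV files themselves.

```python
import random
from collections import Counter, defaultdict


def balanced_partition(labels, n_clients=10, seed=0):
    """Config-1-style split: deal each class's samples round-robin,
    so every client gets a roughly equal share of every digit."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    clients = [[] for _ in range(n_clients)]
    for y in sorted(by_class):
        idxs = by_class[y]
        rng.shuffle(idxs)
        for k, idx in enumerate(idxs):
            clients[k % n_clients].append(idx)
    return clients


def one_class_per_client(labels, n_classes=10):
    """Config-2-style split: client k receives only samples of digit k."""
    clients = [[] for _ in range(n_classes)]
    for idx, y in enumerate(labels):
        clients[y].append(idx)
    return clients


# Toy label list: 20 samples of each digit 0..9.
toy_labels = [d for d in range(10) for _ in range(20)]
balanced = balanced_partition(toy_labels)
exclusive = one_class_per_client(toy_labels)
per_client_counts = [Counter(toy_labels[i] for i in c) for c in balanced]
```

A mixed split in the spirit of the third configuration could reuse `balanced_partition` with unequal per-class shares, but the source does not give the proportions, so none are assumed here.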

Mnist-dataset/
├── config1/
│ ├── client-1/
│ │ └── data.csv
│ ├── client-2/
│ │ └── data.csv
│ ├── client-3/
│ │ └── data.csv
│ └── ...
├── config2/
│ ├── client-1/
│ │ └── data.csv
│ ├── client-2/
│ │ └── data.csv
│ ├── client-3/
│ │ └── data.csv
│ └── ...
├── config3/
│ ├── client-1/
│ │ └── data.csv
│ ├── client-2/
│ │ └── data.csv
│ ├── client-3/
│ │ └── data.csv
│ └── ...
└── mnist_test.csv

***

License: Yann LeCun and Corinna Cortes hold the copyright of MNIST dataset, which is a derivative work from original NIST datasets. MNIST dataset is made available under the terms of the Creative Commons Attribution-Share Alike 3.0 license.

***
MNIST_(hand-written-numbers)
kaggle.com
Updated Sep 4, 2024
Cite
ItzLoghotXD (2024). MNIST_(hand-written-numbers) [Dataset]. https://www.kaggle.com/datasets/itzloghotxd/mnist-hand-written-numbers/discussion
Dataset updated
Sep 4, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
ItzLoghotXD
Description
Dataset

This dataset was created by ItzLoghotXD

Contents
Wildlife MNIST
zenodo.org
data.niaid.nih.gov
bin, png
Updated Jul 12, 2024
Cite
Vít Škvára (2024). Wildlife MNIST [Dataset]. http://doi.org/10.5281/zenodo.7602025
Available download formats: png, bin
Unique identifier
https://doi.org/10.5281/zenodo.7602025
Dataset updated
Jul 12, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Vít Škvára
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The Wildlife MNIST dataset contains MNIST digits with colored backgrounds and foregrounds with annotations, suitable for benchmarking disentangling or factor identification. Originally used for the project https://github.com/vitskvara/sgad. There are two versions - non-mixed and mixed. In the non-mixed version (data.npy and label.npy), the background and foreground textures are the same for all digits of a single MNIST class, therefore only a single label describes each sample. In the mixed version (data_test.npy and labels_test.npy), each sample image has a random digit, background and foreground (out of 10 classes for each factor of variation). Then, the label is a tuple of three numbers, describing the individual (digit,background,foreground) labels. Note that the data is scaled to the interval [-1,1], so rescaling them by computing "x*0.5 + 0.5" is necessary for some applications that require them to be in the interval [0,1]. Example images from both versions of the dataset are included. Note that the dataset was originally used in "Sauer, Axel, and Andreas Geiger. Counterfactual generative networks. 2021."
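The rescaling mentioned above is a single affine map from [-1, 1] to [0, 1]. A minimal sketch on a plain list of values follows; the helper name `to_unit_interval` is made up for this example, and the same expression applies elementwise to an array loaded from the .npy files.

```python
def to_unit_interval(values):
    """Map values from [-1, 1] to [0, 1] using x * 0.5 + 0.5."""
    return [v * 0.5 + 0.5 for v in values]


rescaled = to_unit_interval([-1.0, 0.0, 1.0])
```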
Chinese MNIST in CSV - Digit Recognizer
kaggle.com
Updated Jun 8, 2021
Cite
fedesoriano (2021). Chinese MNIST in CSV - Digit Recognizer [Dataset]. https://www.kaggle.com/fedesoriano/chinese-mnist-digit-recognizer/discussion
Dataset updated
Jun 8, 2021
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
fedesoriano
Description
Context

Image: https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F769452%2Ff6e2d0f05093e42a67119bde723b24d5%2Fdata-original.png?generation=1600931282565624&alt=media

The Chinese MNIST dataset uses data collected in the frame of a project at Newcastle University.

Project Description

One hundred Chinese nationals took part in data collection. Each participant wrote with a standard black ink pen all 15 numbers in a table with 15 designated regions drawn on a white A4 paper. This process was repeated 10 times with each participant. Each sheet was scanned with a resolution of 300x300 pixels. It resulted in a dataset of 15000 images, each representing one character from a set of 15 characters (grouped in samples, grouped in suites, with 10 samples/volunteer and 100 volunteers).

Further Data Processing

The raw images were originally downloaded from the original project page. This dataset is the CSV version of the original dataset uploaded to Kaggle by Gabriel Preda. The original Chinese MNIST dataset uploaded by him can be found at the following LINK. The only difference is that this dataset contains all the images and labels in a single file.

Content

The dataset contains the following:

a single CSV file: chineseMNIST.csv

This file contains 15000 observations and 4098 columns. Columns 1 to 4096 hold the pixels of the image (64x64). The last two columns denote the value label and the original Chinese character. The following image shows the unique labels and characters: https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F769452%2F61c54df3540346d4b56cd611ba41143d%2Fchanracter_mapping.png?generation=1596618751340901&alt=media
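The column layout above suggests how one row can be split back into an image and its two labels. The sketch below assumes the 4096 pixel columns are stored in row-major order and that a row arrives as a list of strings (for example from `csv.reader`); the function name and both assumptions are illustrative and are not taken from the dataset page.

```python
def split_chinese_mnist_row(row, side=64):
    """Split one CSV row into (image, value_label, character).

    Assumes side*side pixel columns in row-major order, followed by
    the value label and the Chinese character.
    """
    n_pixels = side * side
    if len(row) != n_pixels + 2:
        raise ValueError(f"expected {n_pixels + 2} columns, got {len(row)}")
    pixels = [int(v) for v in row[:n_pixels]]
    image = [pixels[r * side:(r + 1) * side] for r in range(side)]
    return image, row[n_pixels], row[n_pixels + 1]


# Synthetic row, only to show the shapes involved.
fake_row = [str(i % 256) for i in range(4096)] + ["9", "九"]
image, value_label, character = split_chinese_mnist_row(fake_row)
```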

Acknowledgements

The original dataset from Kaggle was uploaded by Gabriel Preda. See the original Chinese MNIST dataset. The following authors collected the data: Dr. K Nazarpour and Dr. M Chen from Newcastle University.

Nazarpour, K; Chen, M (2017): Handwritten Chinese Numbers. Newcastle University. Dataset. https://doi.org/10.17634/137930-3

Inspiration

You can use this data the same way you used MNIST, KMNIST, or Fashion MNIST: refine your image classification skills, and use GPUs and TPUs to implement CNN architectures for models that perform such multiclass classification.
MNIST dataset for Outliers Detection - [ MNIST4OD ]
figshare.com
application/gzip
Updated May 17, 2024
Cite
Giovanni Stilo; Bardh Prenkaj (2024). MNIST dataset for Outliers Detection - [ MNIST4OD ] [Dataset]. http://doi.org/10.6084/m9.figshare.9954986.v2
Available download formats: application/gzip
Unique identifier
https://doi.org/10.6084/m9.figshare.9954986.v2
Dataset updated
May 17, 2024
Dataset provided by
Figshare (http://figshare.com/)
Authors
Giovanni Stilo; Bardh Prenkaj
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Here we present MNIST4OD, a large dataset (in both number of dimensions and number of instances) suitable for the outlier detection task. The dataset is based on the famous MNIST dataset (http://yann.lecun.com/exdb/mnist/).

We build MNIST4OD in the following way: to distinguish between outliers and inliers, we choose the images belonging to one digit as inliers (e.g. digit 1) and sample the outliers with uniform probability from the remaining images, such that their number equals 10% of the number of inliers. We repeat this dataset generation process for all digits. For implementation simplicity we then flatten the images (28 x 28) into vectors.

Each file MNIST_x.csv.gz contains the corresponding dataset where the inlier class is equal to x. The data contains one instance (vector) per line, where the last column represents the outlier label (yes/no) of the data point. The data also contains a column that indicates the original image class (0-9).

Statistics of each dataset:

Name | Instances | Dimensions | Number of Outliers in %
MNIST_0 | 7594 | 784 | 10
MNIST_1 | 8665 | 784 | 10
MNIST_2 | 7689 | 784 | 10
MNIST_3 | 7856 | 784 | 10
MNIST_4 | 7507 | 784 | 10
MNIST_5 | 6945 | 784 | 10
MNIST_6 | 7564 | 784 | 10
MNIST_7 | 8023 | 784 | 10
MNIST_8 | 7508 | 784 | 10
MNIST_9 | 7654 | 784 | 10
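The generation procedure described for MNIST4OD can be sketched on a list of labels. This is an illustration under assumptions, not the authors' code: the function name, the fixed seed, and the use of `round` for the 10% count are choices made here, and rows are represented as (index, original class, is_outlier) tuples rather than flattened pixel vectors.

```python
import random


def build_outlier_set(labels, inlier_digit, outlier_ratio=0.10, seed=0):
    """Return rows (index, original_class, is_outlier) for one inlier digit."""
    rng = random.Random(seed)
    inliers = [i for i, y in enumerate(labels) if y == inlier_digit]
    others = [i for i, y in enumerate(labels) if y != inlier_digit]
    # Outliers are drawn uniformly, without replacement, from all other digits.
    n_outliers = round(outlier_ratio * len(inliers))
    outliers = rng.sample(others, n_outliers)
    rows = [(i, labels[i], False) for i in inliers]
    rows += [(i, labels[i], True) for i in outliers]
    return rows


# Toy example: 100 images of digit 1 plus 100 images of other digits.
toy_labels = [0] * 50 + [1] * 100 + [2] * 50
rows = build_outlier_set(toy_labels, inlier_digit=1)
n_flagged = sum(1 for _, _, is_outlier in rows if is_outlier)
```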
mnist-60000-hand-written-number-images
kaggle.com
zip
Updated Mar 18, 2021
Cite
Syarif Hidayatullah (2021). mnist-60000-hand-written-number-images [Dataset]. https://www.kaggle.com/syarifdjumar/mnist60000handwrittennumberimages
Available download formats: zip (13674630 bytes)
Dataset updated
Mar 18, 2021
Authors
Syarif Hidayatullah
Description
Dataset

This dataset was created by Syarif Hidayatullah

Contents
train_model_tensorflow_mnist
kaggle.com
Updated Mar 15, 2021
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Murad Al Dahmashi (2021). train_model_tensorflow_mnist [Dataset]. https://www.kaggle.com/muradaldahmashi/train-model-tensorflow-mnist/code
Dataset updated
Mar 15, 2021
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Murad Al Dahmashi
Description
Dataset

This dataset was created by Murad Al Dahmashi

Contents
notMNIST
huggingface.co
Updated Dec 21, 2023
Cite
Anubhav Maity (2023). notMNIST [Dataset]. https://huggingface.co/datasets/anubhavmaity/notMNIST
Dataset updated
Dec 21, 2023
Authors
Anubhav Maity
Description
Dataset Card for "notMNIST"

Overview

The notMNIST dataset is a collection of images of letters from A to J in various fonts. It is designed as a more challenging alternative to the traditional MNIST dataset, which consists of handwritten digits. The notMNIST dataset is commonly used in machine learning and computer vision tasks for character recognition.

Dataset Information

Number of Classes: 10 (A to J)
Number of Samples: 187,24
Image Size: 28 x 28 pixels… See the full description on the dataset page: https://huggingface.co/datasets/anubhavmaity/notMNIST.

MNIST IDX Dataset- Fasion

kaggle.com

Updated May 21, 2025


Cite

ShreyaSuresh (2025). MNIST IDX Dataset- Fasion [Dataset]. https://www.kaggle.com/datasets/shreyasuresh0407/mnist-idx-dataset-fasion


Dataset updated

May 21, 2025

Dataset provided by

Kaggle (http://kaggle.com/)

Authors

ShreyaSuresh

Description

📦 About the Dataset

This project uses a classic machine learning dataset of handwritten digits — the MNIST dataset — stored in IDX format.

🧠 Each image is a 28x28 pixel grayscale picture of a handwritten number from 0 to 9. Your task is to teach a simple neural network (your "brain") to recognize these digits.

🔍 What’s Inside?

File Name	Description
`train-images-idx3-ubyte`	🖼️ 60,000 training images (28x28 pixels each)
`train-labels-idx1-ubyte`	🔢 Labels (0–9) for each training image
`t10k-images-idx3-ubyte`	🖼️ 10,000 test images
`t10k-labels-idx1-ubyte`	🔢 Labels (0–9) for test images

All files are in the IDX binary format, which is compact and fast for loading, but needs to be parsed using a small Python function (see below 👇).
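The parsing function referred to above is not included in this listing. As a stand-in, here is a minimal sketch of an IDX reader for unsigned-byte data, the type used by the four MNIST files. It follows the commonly documented IDX layout (two zero bytes, a type code, the number of dimensions, then big-endian 32-bit sizes); the function name `parse_idx` and the flat-list return value are choices made for this example.

```python
import struct


def parse_idx(raw: bytes):
    """Parse IDX bytes holding unsigned-byte data.

    Returns (dims, values): the dimension sizes as a tuple and the
    payload as a flat list of ints in 0..255.
    """
    zero, type_code, ndim = struct.unpack(">HBB", raw[:4])
    if zero != 0 or type_code != 0x08:
        raise ValueError("not an unsigned-byte IDX file")
    header_end = 4 + 4 * ndim
    dims = struct.unpack(">" + "I" * ndim, raw[4:header_end])
    payload = raw[header_end:]
    expected = 1
    for d in dims:
        expected *= d
    if len(payload) != expected:
        raise ValueError("payload size does not match header")
    return dims, list(payload)


# Tiny synthetic files standing in for the real ones.
fake_images = bytes([0, 0, 8, 3]) + struct.pack(">III", 2, 2, 2) + bytes(range(8))
fake_labels = bytes([0, 0, 8, 1]) + struct.pack(">I", 3) + bytes([5, 0, 4])
image_dims, image_values = parse_idx(fake_images)
label_dims, label_values = parse_idx(fake_labels)
```

For a real file, the bytes would come from something like `open("train-images-idx3-ubyte", "rb").read()`, and the dimensions would be expected to read (60000, 28, 28) for the training images.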

✨ Why This Dataset Is Awesome

🎯 It's the “Hello World” of machine learning — perfect for beginners
📊 Ideal for testing image classification algorithms
🧠 Helps you learn how neural networks "see" numbers
💥 Small enough to train quickly, powerful enough to learn real skills

🧩 Sample Image

(Add this cell below in your notebook to visualize a few images)

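import matplotlib.pyplot as plt

# Assumes train_images and train_labels were loaded earlier in the notebook,
# with train_images shaped (N, 1, 28, 28) and train_labels holding N labels.

# Show the first 10 images
fig, axes = plt.subplots(1, 10, figsize=(15, 2))
for i in range(10):
    axes[i].imshow(train_images[i][0], cmap="gray")  # channel 0 of image i
    axes[i].set_title(f"Label: {train_labels[i].item()}")
    axes[i].axis("off")
plt.show()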

MNIST in CSV
kaggle.com
zip
Updated May 19, 2018
Cite
Dariel Dato-on (2018). MNIST in CSV [Dataset]. https://www.kaggle.com/oddrationale/mnist-in-csv
Available download formats: zip (15948628 bytes)
Dataset updated
May 19, 2018
Authors
Dariel Dato-on
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
The MNIST dataset provided in an easy-to-use CSV format

The original dataset is in a format that is difficult for beginners to use. This dataset uses the work of Joseph Redmon to provide the MNIST dataset in a CSV format.

The dataset consists of two files:

mnist_train.csv

mnist_test.csv

The mnist_train.csv file contains the 60,000 training examples and labels. The mnist_test.csv contains 10,000 test examples and labels. Each row consists of 785 values: the first value is the label (a number from 0 to 9) and the remaining 784 values are the pixel values (a number from 0 to 255).
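The row layout described above (one label followed by 784 pixel values) can be unpacked as in the sketch below. It assumes a data row is passed in as a comma-separated string and that the pixels are stored in row-major order; whether the files start with a header row is not stated here, so any header handling is left to the caller. The function name is made up for this example.

```python
def parse_mnist_csv_line(line):
    """Split one data row into (label, 28x28 list of pixel rows)."""
    values = [int(v) for v in line.strip().split(",")]
    if len(values) != 785:
        raise ValueError(f"expected 785 values, got {len(values)}")
    label, pixels = values[0], values[1:]
    if not 0 <= label <= 9:
        raise ValueError("label must be a digit from 0 to 9")
    if any(not 0 <= p <= 255 for p in pixels):
        raise ValueError("pixel values must be in 0..255")
    rows = [pixels[r * 28:(r + 1) * 28] for r in range(28)]
    return label, rows


# Synthetic line, only to show the shapes involved.
fake_line = "7," + ",".join(str(i % 256) for i in range(784)) + "\n"
label, rows = parse_mnist_csv_line(fake_line)
```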
Safran-MNIST-DLS
zenodo.org
application/gzip, csv
Updated Dec 5, 2025
+ more versions
Cite
Sofia Marino; Jennifer Vandoni; Ichraq Lemghari; Basile Musquer; Thierry Arsaut (2025). Safran-MNIST-DLS [Dataset]. http://doi.org/10.5281/zenodo.13321202
Available download formats: csv, application/gzip
Unique identifier
https://doi.org/10.5281/zenodo.13321202
Dataset updated
Dec 5, 2025
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Sofia Marino; Jennifer Vandoni; Ichraq Lemghari; Basile Musquer; Thierry Arsaut
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Time period covered
Apr 30, 2024
Description
This dataset contains images of serial numbers extracted from diverse avionic parts manufactured by SAFRAN, the international high-technology group and world leader operating in the aviation (propulsion, equipment and interiors), defense and space markets. This dataset resembles the well-known MNIST dataset, but with a focus on industrial contexts, encompassing variations in lighting conditions, orientations, writing styles and surface textures.

The dataset contains 32 classes depicting numbers, alphabetic characters, and symbols, namely: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F, G, H, J, K, L, M, N, P, R, S, T, U, W, Y, /, .]

April 30th, 2024 : Training dataset containing 9314 images without labels is released.

December 5th, 2024 : Testing and validation datasets released, ground-truth labels for training, validation and testing released.

This dataset has been proposed in the context of the ICPR24 DAGECC Competition (https://dagecc-challenge.github.io/icpr2024/).
MNIST FASHION
kaggle.com
zip
Updated Sep 28, 2017
+ more versions
Cite
bahadir60 (2017). MNIST FASHION [Dataset]. https://www.kaggle.com/bahadir60/mnistfashion
Available download formats: zip (23155203 bytes)
Dataset updated
Sep 28, 2017
Authors
bahadir60
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Context

Fashion-MNIST is a dataset of Zalando's article images—consisting of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28x28 grayscale image, associated with a label from 10 classes. Zalando intends Fashion-MNIST to serve as a direct drop-in replacement for the original MNIST dataset for benchmarking machine learning algorithms. It shares the same image size and structure of training and testing splits.

The original MNIST dataset contains a lot of handwritten digits. Members of the AI/ML/Data Science community love this dataset and use it as a benchmark to validate their algorithms. In fact, MNIST is often the first dataset researchers try. "If it doesn't work on MNIST, it won't work at all", they said. "Well, if it does work on MNIST, it may still fail on others."

Zalando seeks to replace the original MNIST dataset

Content

Each image is 28 pixels in height and 28 pixels in width, for a total of 784 pixels. Each pixel has a single pixel-value associated with it, indicating the lightness or darkness of that pixel, with higher numbers meaning darker. This pixel-value is an integer between 0 and 255. The training and test data sets have 785 columns. The first column consists of the class labels (listed under Labels) and represents the article of clothing. The rest of the columns contain the pixel-values of the associated image.

To locate a pixel on the image, suppose that we have decomposed x as x = i * 28 + j, where i and j are integers between 0 and 27. The pixel is located on row i and column j of a 28 x 28 matrix. For example, pixel31 indicates the pixel that is in the fourth column from the left, and the second row from the top, as in the ascii-diagram below.
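The index arithmetic above is a plain quotient and remainder. A small sketch, with helper names invented for this example:

```python
def pixel_position(x, width=28):
    """Return (row i, column j) for pixel index x, where x = i * width + j."""
    return divmod(x, width)


def pixel_index(i, j, width=28):
    """Inverse mapping: flat pixel index for row i, column j."""
    return i * width + j


row, col = pixel_position(31)
```

Here `row` is 1 and `col` is 3 (both zero-based), which matches the text: second row from the top, fourth column from the left.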

Labels

Each training and test example is assigned to one of the following labels:

0 T-shirt/top
1 Trouser
2 Pullover
3 Dress
4 Coat
5 Sandal
6 Shirt
7 Sneaker
8 Bag
9 Ankle boot

TL;DR

Each row is a separate image.
Column 1 is the class label.
Remaining columns are pixel numbers (784 total).
Each value is the darkness of the pixel (0 to 255).

Acknowledgements

Original dataset was downloaded from https://github.com/zalandoresearch/fashion-mnist
Dataset was converted to CSV with this script: https://pjreddie.com/projects/mnist-in-csv/

License

The MIT License (MIT) Copyright © [2017] Zalando SE, https://tech.zalando.com

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
svhn
huggingface.co
Updated Jul 23, 2025
+ more versions
Cite
Genius Society (2025). svhn [Dataset]. https://huggingface.co/datasets/Genius-Society/svhn
Dataset updated
Jul 23, 2025
Dataset authored and provided by
Genius Society
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
Dataset card for SVHN

The Street View House Numbers (SVHN) dataset is a real-world image dataset developed and designed for machine learning and object recognition algorithms, and is characterized by low data preprocessing and formatting requirements. Similar to MNIST, SVHN contains images of small cropped numbers, but in terms of labeled data, SVHN is an order of magnitude larger than MNIST, comprising over 600,000 digital images. Unlike MNIST, SVHN deals with a much more… See the full description on the dataset page: https://huggingface.co/datasets/Genius-Society/svhn.
Number Ops Dataset
universe.roboflow.com
zip
Updated May 7, 2023
Cite
Roboflow 100 (2023). Number Ops Dataset [Dataset]. https://universe.roboflow.com/roboflow-100/number-ops/model/1
Available download formats: zip
Dataset updated
May 7, 2023
Dataset provided by
Roboflow, Inc.
Authors
Roboflow 100
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Variables measured
Number Ops Bounding Boxes
Description
This dataset was originally created by Pavel Kulikov, Djopa Volosata, Daria Podryadova. To see the current project, which may have been updated since this version, please go here: https://universe.roboflow.com/mnist-bvalq/mnist-icrul.

This dataset is part of RF100, an Intel-sponsored initiative to create a new object detection benchmark for model generalizability.

Access the RF100 Github repo: https://github.com/roboflow-ai/roboflow-100-benchmark
