32 datasets found
  1. mnist

    • tensorflow.org
    • universe.roboflow.com
    • +3 more
    Updated Jun 1, 2024
    Cite
    (2024). mnist [Dataset]. https://www.tensorflow.org/datasets/catalog/mnist
    Dataset updated
    Jun 1, 2024
    Description

    The MNIST database of handwritten digits.

    To use this dataset:

    import tensorflow_datasets as tfds

    # Load the training split and print the first four examples.
    ds = tfds.load('mnist', split='train')
    for ex in ds.take(4):
        print(ex)

    See the guide for more information on tensorflow_datasets.

    Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/mnist-3.0.1.png

  2. MNIST Dataset

    • paperswithcode.com
    Cite
    Y. LeCun; L. Bottou; Y. Bengio; P. Haffner, MNIST Dataset [Dataset]. https://paperswithcode.com/dataset/mnist
    Authors
    Y. LeCun; L. Bottou; Y. Bengio; P. Haffner
    Description

    The MNIST database (Modified National Institute of Standards and Technology database) is a large collection of handwritten digits. It has a training set of 60,000 examples and a test set of 10,000 examples. It is drawn from two larger collections, NIST Special Database 3 (digits written by employees of the United States Census Bureau) and Special Database 1 (digits written by high school students), which contain monochrome images of handwritten digits. The digits have been size-normalized and centered in a fixed-size image. The original black-and-white (bilevel) images from NIST were size-normalized to fit in a 20x20 pixel box while preserving their aspect ratio. The resulting images contain grey levels as a result of the anti-aliasing technique used by the normalization algorithm. The images were then centered in a 28x28 image by computing the center of mass of the pixels and translating the image so as to position this point at the center of the 28x28 field.
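
The centering step described above can be sketched in a few lines of NumPy. This is only an illustration, not the original preprocessing code: the function name center_by_mass, the rounding to whole pixels, and the clipping at the canvas edges are assumptions made for the example.

```python
import numpy as np


def center_by_mass(digit: np.ndarray, size: int = 28) -> np.ndarray:
    """Paste `digit` into a size x size canvas so that its intensity
    center of mass lands (to the nearest pixel) on the canvas center."""
    total = digit.sum()
    if total == 0:
        raise ValueError("an all-zero image has no center of mass")
    h, w = digit.shape
    rows, cols = np.indices((h, w))
    cy = (rows * digit).sum() / total  # center-of-mass row
    cx = (cols * digit).sum() / total  # center-of-mass column
    center = (size - 1) / 2            # 13.5 for a 28x28 field
    top = int(round(center - cy))      # canvas row of the digit's row 0
    left = int(round(center - cx))     # canvas column of the digit's column 0
    canvas = np.zeros((size, size), dtype=digit.dtype)
    # Clip the paste region in case the shift pushes the box past an edge.
    r0, r1 = max(top, 0), min(top + h, size)
    c0, c1 = max(left, 0), min(left + w, size)
    canvas[r0:r1, c0:c1] = digit[r0 - top:r1 - top, c0 - left:c1 - left]
    return canvas


# Toy 20x20 "digit": a 4x4 blob near the top-left corner of the box.
box = np.zeros((20, 20), dtype=np.uint8)
box[2:6, 3:7] = 255
field = center_by_mass(box)
```

Because the shift is rounded to whole pixels, the center of mass generally ends up within half a pixel of the field center rather than exactly on it.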

  3. MNIST

    • datasets.activeloop.ai
    deeplake
    + more versions
    Cite
    Yann LeCun, MNIST [Dataset]. https://datasets.activeloop.ai/docs/ml/datasets/mnist/
    Available download formats: deeplake
    Authors
    Yann LeCun
    License

    Attribution-ShareAlike 3.0 (CC BY-SA 3.0) https://creativecommons.org/licenses/by-sa/3.0/
    License information was derived automatically

    Time period covered
    Jan 1, 1998 - Dec 31, 2000
    Area covered
    Earth
    Dataset funded by
    AT&T Bell Labs
    Description

    The MNIST dataset is a dataset of handwritten digits. It is a popular dataset for machine learning and artificial intelligence research. The dataset consists of 60,000 training images and 10,000 test images. Each image is a 28x28 pixel grayscale image of a handwritten digit. The digits are labeled from 0 to 9.

  4. Mnist 42000 Images Dataset

    • universe.roboflow.com
    zip
    Updated Apr 25, 2023
    Cite
    Roboflow (2023). Mnist 42000 Images Dataset [Dataset]. https://universe.roboflow.com/roboflow-jvuqo/mnist-42000-images-u0qdg
    Available download formats: zip
    Dataset updated
    Apr 25, 2023
    Dataset provided by
    Roboflow, Inc.
    Authors
    Roboflow
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Numbers
    Description

    The MNIST database (Modified National Institute of Standards and Technology database) is a large database of handwritten digits that is commonly used for training various image processing systems. The database is also widely used for training and testing in the field of machine learning. It was created by "re-mixing" the samples from NIST's original datasets. The creators felt that since NIST's training dataset was taken from American Census Bureau employees, while the testing dataset was taken from American high school students, it was not well-suited for machine learning experiments. Furthermore, the black and white images from NIST were normalized to fit into a 28x28 pixel bounding box and anti-aliased, which introduced grayscale levels.

    • Yann LeCun, Courant Institute, NYU
    • Corinna Cortes, Google Labs, New York
    • Christopher J.C. Burges, Microsoft Research, Redmond

  5. mnist100

    • huggingface.co
    Updated Aug 16, 2023
    Cite
    Marcin Wierzbiński (2023). mnist100 [Dataset]. https://huggingface.co/datasets/marcin119a/mnist100
    Dataset updated
    Aug 16, 2023
    Authors
    Marcin Wierzbiński
    License

    https://choosealicense.com/licenses/gpl/

    Description

    The MNIST-100 dataset is a variation of the original MNIST dataset, consisting of 100 handwritten numbers extracted from the MNIST dataset. Unlike the traditional MNIST dataset, which contains 60,000 training images of digits from 0 to 9, the MNIST-100 dataset focuses on 100 numbers.

    Dataset Overview:

    • Dataset Name: MNIST-100
    • Total Number of Images: train: 60000, test: 1000
    • Classes: 100 (Numbers from 00 to 99)
    • Image Size: 28x56 pixels (grayscale)

    Data Collection: The MNIST-100 dataset… See the full description on the dataset page: https://huggingface.co/datasets/marcin119a/mnist100.

  6. MNIST Dataset

    • scidb.cn
    Updated Feb 16, 2023
    Cite
    Xuyu Zhang; Jingjing Gao; Yu Gan; Chunyuan Song; Dawei Zhang; Songlin Zhuang; Shensheng Han; Puxiang Lai; Honglin Liu (2023). MNIST Dataset [Dataset]. http://doi.org/10.57760/sciencedb.07421
    Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Feb 16, 2023
    Dataset provided by
    Science Data Bank
    Authors
    Xuyu Zhang; Jingjing Gao; Yu Gan; Chunyuan Song; Dawei Zhang; Songlin Zhuang; Shensheng Han; Puxiang Lai; Honglin Liu
    License

    CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    MNIST is an image dataset of handwritten digits derived from data collected by the National Institute of Standards and Technology (NIST) of the United States. Handwritten digits were collected from 250 writers, 50% of whom were high school students and 50% staff of the Census Bureau. The purpose of the dataset is to enable algorithmic recognition of handwritten digits. The training set contains 60,000 images and labels, while the test set contains 10,000 images and labels. Of the test examples, the first 5,000 are taken from the original NIST training set and the last 5,000 from the original NIST test set. The first 5,000 are more regular than the last 5,000, because they come from employees of the US Census Bureau, whereas the last 5,000 come from high school students.

  7. MNIST-Federated-Learning

    • zenodo.org
    csv, zip
    Updated Jul 3, 2023
    Cite
    Ferraguig Lynda; Benoit Alexandre; Bettinelli Mickael; Lin-Kwong-Chon Christophe (2023). MNIST-Federated-Learning [Dataset]. http://doi.org/10.5281/zenodo.8104408
    Available download formats: csv, zip
    Dataset updated
    Jul 3, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Ferraguig Lynda; Benoit Alexandre; Bettinelli Mickael; Lin-Kwong-Chon Christophe
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Below are the three configurations used to partition the MNIST training dataset among 10 clients, followed by the directory layout of the data:

    1. Balanced Distribution: In the first configuration, the MNIST dataset is partitioned among 10 clients in a balanced manner. This means that the data samples from each class are evenly distributed among the clients. Each client receives a roughly equal number of images from each digit class, ensuring that the distribution of samples across clients is proportional and representative of the overall dataset. [Config 1]
    2. Heterogeneous Distribution (One Class per Client): In the second configuration, the MNIST dataset is partitioned in a heterogeneous manner, where each client is assigned a single digit class exclusively. This means that one client will only receive images of the digit '0', another client will receive images of the digit '1', and so on. In this setup, each client becomes an expert in classifying a specific digit, allowing for specialized training and evaluation. [Config 2]
    3. Mixed Distribution: In the third configuration, the MNIST dataset is partitioned using a mixed distribution approach. This means that the data samples from all digit classes are distributed among the 10 clients, but the distribution is not necessarily balanced. The number of samples assigned to each client may vary for different digit classes, resulting in an uneven distribution across the clients. This configuration aims to capture both the overall diversity of the dataset and the varying difficulty levels of classifying different digits. [Config 3]
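
The three partitioning schemes can be sketched as index splits over a label array. This is a hedged illustration, not the authors' procedure: the function names are hypothetical, and the Dirichlet draw used for the mixed case is an assumed way of producing uneven per-class shares, since the description does not say how the uneven split was generated.

```python
import numpy as np


def balanced(labels, n_clients=10, seed=0):
    """Config 1 sketch: split every class evenly across the clients."""
    rng = np.random.default_rng(seed)
    clients = [[] for _ in range(n_clients)]
    for cls in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == cls))
        for k, part in enumerate(np.array_split(idx, n_clients)):
            clients[k].extend(part.tolist())
    return clients


def one_class_per_client(labels, n_clients=10):
    """Config 2 sketch: client k receives only the samples of digit k."""
    return [np.flatnonzero(labels == k).tolist() for k in range(n_clients)]


def mixed(labels, n_clients=10, alpha=0.5, seed=0):
    """Config 3 sketch: uneven per-class shares drawn from a Dirichlet
    distribution (an assumed mechanism, not stated in the description)."""
    rng = np.random.default_rng(seed)
    clients = [[] for _ in range(n_clients)]
    for cls in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == cls))
        shares = rng.dirichlet(alpha * np.ones(n_clients))
        # Turn cumulative shares into cut points inside this class's indices.
        cuts = (np.cumsum(shares)[:-1] * len(idx)).astype(int)
        for k, part in enumerate(np.split(idx, cuts)):
            clients[k].extend(part.tolist())
    return clients


# Toy labels standing in for the MNIST training labels: 100 samples per digit.
labels = np.repeat(np.arange(10), 100)
cfg1 = balanced(labels)
cfg2 = one_class_per_client(labels)
cfg3 = mixed(labels)
```

Each function returns one list of sample indices per client, so the same index lists could be used to slice both images and labels.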

    Mnist-dataset/
    ├── config1/
    │ ├── client-1/
    │ │ └── data.csv
    │ ├── client-2/
    │ │ └── data.csv
    │ ├── client-3/
    │ │ └── data.csv
    │ └── ...
    ├── config2/
    │ ├── client-1/
    │ │ └── data.csv
    │ ├── client-2/
    │ │ └── data.csv
    │ ├── client-3/
    │ │ └── data.csv
    │ └── ...
    ├── config3/
    │ ├── client-1/
    │ │ └── data.csv
    │ ├── client-2/
    │ │ └── data.csv
    │ ├── client-3/
    │ │ └── data.csv
    │ └── ...
    └── mnist_test.csv

    ***

    License: Yann LeCun and Corinna Cortes hold the copyright of MNIST dataset, which is a derivative work from original NIST datasets. MNIST dataset is made available under the terms of the Creative Commons Attribution-Share Alike 3.0 license.

    ***

  8. MNIST_(hand-written-numbers)

    • kaggle.com
    Updated Sep 4, 2024
    Cite
    ItzLoghotXD (2024). MNIST_(hand-written-numbers) [Dataset]. https://www.kaggle.com/datasets/itzloghotxd/mnist-hand-written-numbers/discussion
    Croissant
    Dataset updated
    Sep 4, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    ItzLoghotXD
    Description

    Dataset

    This dataset was created by ItzLoghotXD

    Contents

  9. Wildlife MNIST

    • zenodo.org
    • data.niaid.nih.gov
    bin, png
    Updated Jul 12, 2024
    Cite
    Vít Škvára (2024). Wildlife MNIST [Dataset]. http://doi.org/10.5281/zenodo.7602025
    Available download formats: png, bin
    Dataset updated
    Jul 12, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Vít Škvára
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The Wildlife MNIST dataset contains MNIST digits with colored backgrounds and foregrounds, together with annotations, and is suitable for benchmarking disentangling or factor identification. It was originally used for the project https://github.com/vitskvara/sgad. There are two versions: non-mixed and mixed. In the non-mixed version (data.npy and label.npy), the background and foreground textures are the same for all digits of a single MNIST class, so a single label describes each sample. In the mixed version (data_test.npy and labels_test.npy), each sample image has a random digit, background and foreground (out of 10 classes for each factor of variation), and the label is a tuple of three numbers describing the individual (digit, background, foreground) labels. Note that the data are scaled to the interval [-1,1], so applications that require values in the interval [0,1] need to rescale them by computing "x*0.5 + 0.5". Example images from both versions of the dataset are included. Note that the dataset was originally used in "Sauer, Axel, and Andreas Geiger. Counterfactual generative networks. 2021."
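
As a small illustration of the rescaling and of the three-part label, the sketch below uses synthetic values instead of the real files. The helper name to_unit_interval and the example label values are made up for the example, and the array shapes of the actual .npy files are not stated in the description.

```python
import numpy as np


def to_unit_interval(x: np.ndarray) -> np.ndarray:
    """Map values from [-1, 1] to [0, 1] using x*0.5 + 0.5."""
    return x * 0.5 + 0.5


# Synthetic stand-ins; real data would come from np.load("data_test.npy") and
# np.load("labels_test.npy"), whose exact array shapes are not given above.
data = np.array([-1.0, 0.0, 1.0])
mixed_label = (7, 2, 5)  # hypothetical (digit, background, foreground) label
digit, background, foreground = mixed_label
rescaled = to_unit_interval(data)
```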

  10. Chinese MNIST in CSV - Digit Recognizer

    • kaggle.com
    Updated Jun 8, 2021
    Cite
    fedesoriano (2021). Chinese MNIST in CSV - Digit Recognizer [Dataset]. https://www.kaggle.com/fedesoriano/chinese-mnist-digit-recognizer/discussion
    Croissant
    Dataset updated
    Jun 8, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    fedesoriano
    Description

    Context

    https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F769452%2Ff6e2d0f05093e42a67119bde723b24d5%2Fdata-original.png?generation=1600931282565624&alt=media

    The Chinese MNIST dataset uses data collected in the frame of a project at Newcastle University.

    Project Description

    One hundred Chinese nationals took part in data collection. Each participant wrote with a standard black ink pen all 15 numbers in a table with 15 designated regions drawn on a white A4 paper. This process was repeated 10 times with each participant. Each sheet was scanned with a resolution of 300x300 pixels. It resulted in a dataset of 15000 images, each representing one character from a set of 15 characters (grouped in samples, grouped in suites, with 10 samples/volunteer and 100 volunteers).

    Further Data Processing

    The raw images were originally downloaded from the original project page. This dataset is the CSV version of the original dataset uploaded to Kaggle by Gabriel Preda. The original Chinese MNIST dataset uploaded by him can be found at the following LINK. The only difference is that this dataset contains all the images and labels in a single file.

    Content

    The dataset contains the following:

    • a single CSV file: chineseMNIST.csv

    This file contains 15000 observations and 4098 columns. Columns 1 to 4096 represent the pixels of the image (64x64). The last two columns denote the value label and the original Chinese character. The following image shows the unique labels and characters: https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F769452%2F61c54df3540346d4b56cd611ba41143d%2Fchanracter_mapping.png?generation=1596618751340901&alt=media
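
To make the column layout concrete, here is a minimal sketch that splits one row into its image and labels. The function name split_row is hypothetical, the row is synthetic, and the row-major pixel order is an assumption that the description does not confirm.

```python
import numpy as np


def split_row(row):
    """Split one 4098-value row into a 64x64 image, its value label,
    and the Chinese character."""
    if len(row) != 4098:
        raise ValueError("expected 4096 pixel columns plus 2 label columns")
    # Assumes pixels are stored row by row (row-major order).
    image = np.array([int(v) for v in row[:4096]]).reshape(64, 64)
    return image, int(row[4096]), row[4097]


# Synthetic row: a blank image labelled with the value 9 and the character 九.
row = ["0"] * 4096 + ["9", "九"]
image, value, character = split_row(row)
```

When reading the real file, a header row (if the CSV has one) would need to be skipped before calling the function.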

    Acknowledgements

    The original dataset from Kaggle was uploaded by Gabriel Preda. See the original Chinese MNIST dataset. The following authors collected the data: Dr. K Nazarpour and Dr. M Chen from Newcastle University.

    Nazarpour, K; Chen, M (2017): Handwritten Chinese Numbers. Newcastle University. Dataset. https://doi.org/10.17634/137930-3

    Inspiration

    You can use this data the same way you used MNIST, KMNIST, or Fashion MNIST: refine your image classification skills, use GPU & TPU to implement CNN architectures for models to perform such multiclass classifications.

  11. MNIST dataset for Outliers Detection - [ MNIST4OD ]

    • figshare.com
    application/gzip
    Updated May 17, 2024
    Cite
    Giovanni Stilo; Bardh Prenkaj (2024). MNIST dataset for Outliers Detection - [ MNIST4OD ] [Dataset]. http://doi.org/10.6084/m9.figshare.9954986.v2
    Available download formats: application/gzip
    Dataset updated
    May 17, 2024
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Giovanni Stilo; Bardh Prenkaj
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Here we present MNIST4OD, a dataset of large size (in both number of dimensions and number of instances) suitable for the outlier detection task. The dataset is based on the famous MNIST dataset (http://yann.lecun.com/exdb/mnist/).

    We build MNIST4OD in the following way. To distinguish between outliers and inliers, we choose the images belonging to one digit as inliers (e.g. digit 1) and sample the outliers with uniform probability from the remaining images, so that their number equals 10% of the number of inliers. We repeat this dataset generation process for all digits. For implementation simplicity we then flatten the images (28 x 28) into vectors.

    Each file MNIST_x.csv.gz contains the corresponding dataset where the inlier class is equal to x. The data contain one instance (vector) per line, where the last column represents the outlier label (yes/no) of the data point. The data also contain a column that indicates the original image class (0-9).

    Statistics of each dataset:

    Name    | Instances | Dimensions | Number of Outliers in %
    MNIST_0 | 7594      | 784        | 10
    MNIST_1 | 8665      | 784        | 10
    MNIST_2 | 7689      | 784        | 10
    MNIST_3 | 7856      | 784        | 10
    MNIST_4 | 7507      | 784        | 10
    MNIST_5 | 6945      | 784        | 10
    MNIST_6 | 7564      | 784        | 10
    MNIST_7 | 8023      | 784        | 10
    MNIST_8 | 7508      | 784        | 10
    MNIST_9 | 7654      | 784        | 10
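
The construction described above can be sketched as follows. This is an illustrative reading, not the authors' script: the function name make_od_dataset is hypothetical, and sampling without replacement and rounding the outlier count to the nearest integer are assumptions. Random arrays stand in for the MNIST images.

```python
import numpy as np


def make_od_dataset(images, labels, inlier_digit, ratio=0.10, seed=0):
    """Keep every image of `inlier_digit` as an inlier and add uniformly
    sampled images of other digits as outliers (ratio * number of inliers)."""
    rng = np.random.default_rng(seed)
    inliers = np.flatnonzero(labels == inlier_digit)
    others = np.flatnonzero(labels != inlier_digit)
    n_out = int(round(ratio * len(inliers)))  # rounding rule is an assumption
    outliers = rng.choice(others, size=n_out, replace=False)
    idx = np.concatenate([inliers, outliers])
    vectors = images[idx].reshape(len(idx), -1)  # flatten 28x28 into 784
    is_outlier = np.arange(len(idx)) >= len(inliers)
    return vectors, labels[idx], is_outlier


# Random stand-in for MNIST: 100 images per digit.
rng = np.random.default_rng(1)
images = rng.integers(0, 256, size=(1000, 28, 28), dtype=np.uint8)
labels = np.repeat(np.arange(10), 100)
X, original_class, is_outlier = make_od_dataset(images, labels, inlier_digit=1)
```

The three returned arrays correspond to the flattened vectors, the original image class column, and the outlier label column mentioned in the description.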

  12. mnist-60000-hand-written-number-images

    • kaggle.com
    zip
    Updated Mar 18, 2021
    Cite
    Syarif Hidayatullah (2021). mnist-60000-hand-written-number-images [Dataset]. https://www.kaggle.com/syarifdjumar/mnist60000handwrittennumberimages
    Available download formats: zip (13674630 bytes)
    Dataset updated
    Mar 18, 2021
    Authors
    Syarif Hidayatullah
    Description

    Dataset

    This dataset was created by Syarif Hidayatullah

    Contents

  13. train_model_tensorflow_mnist

    • kaggle.com
    Updated Mar 15, 2021
    Cite
    Murad Al Dahmashi (2021). train_model_tensorflow_mnist [Dataset]. https://www.kaggle.com/muradaldahmashi/train-model-tensorflow-mnist/code
    Croissant
    Dataset updated
    Mar 15, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Murad Al Dahmashi
    Description

    Dataset

    This dataset was created by Murad Al Dahmashi

    Contents

  14. notMNIST

    • huggingface.co
    Updated Dec 21, 2023
    Cite
    Anubhav Maity (2023). notMNIST [Dataset]. https://huggingface.co/datasets/anubhavmaity/notMNIST
    Croissant
    Dataset updated
    Dec 21, 2023
    Authors
    Anubhav Maity
    Description

    Dataset Card for "notMNIST"

    Overview

    The notMNIST dataset is a collection of images of letters from A to J in various fonts. It is designed as a more challenging alternative to the traditional MNIST dataset, which consists of handwritten digits. The notMNIST dataset is commonly used in machine learning and computer vision tasks for character recognition.

    Dataset Information

    • Number of Classes: 10 (A to J)
    • Number of Samples: 18,724
    • Image Size: 28 x 28 pixels…

    See the full description on the dataset page: https://huggingface.co/datasets/anubhavmaity/notMNIST.

  15. MNIST IDX Dataset- Fasion

    • kaggle.com
    Updated May 21, 2025
    Cite
    ShreyaSuresh (2025). MNIST IDX Dataset- Fasion [Dataset]. https://www.kaggle.com/datasets/shreyasuresh0407/mnist-idx-dataset-fasion
    Croissant
    Dataset updated
    May 21, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    ShreyaSuresh
    Description

    📦 About the Dataset

    This project uses a classic machine learning dataset of handwritten digits — the MNIST dataset — stored in IDX format.

    🧠 Each image is a 28x28 pixel grayscale picture of a handwritten number from 0 to 9. Your task is to teach a simple neural network (your "brain") to recognize these digits.

    🔍 What’s Inside?

    File Name               | Description
    train-images-idx3-ubyte | 🖼️ 60,000 training images (28x28 pixels each)
    train-labels-idx1-ubyte | 🔢 Labels (0–9) for each training image
    t10k-images-idx3-ubyte  | 🖼️ 10,000 test images
    t10k-labels-idx1-ubyte  | 🔢 Labels (0–9) for test images

    All files are in the IDX binary format, which is compact and fast for loading, but needs to be parsed using a small Python function (see below 👇).
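
The dataset's own parsing function is not included in this listing, so the following is only a sketch of what such a parser could look like. It assumes the commonly documented IDX layout: two zero bytes, a one-byte type code (0x08 for unsigned bytes), a one-byte dimension count, then one big-endian 32-bit size per dimension, followed by the raw data. The name parse_idx is hypothetical.

```python
import struct

import numpy as np


def parse_idx(buf: bytes) -> np.ndarray:
    """Decode an IDX buffer that stores unsigned bytes (type code 0x08)."""
    zeros, type_code, ndim = struct.unpack_from(">HBB", buf, 0)
    if zeros != 0 or type_code != 0x08:
        raise ValueError("not an unsigned-byte IDX buffer")
    # One big-endian uint32 per dimension follows the 4-byte magic number.
    dims = struct.unpack_from(f">{ndim}I", buf, 4)
    data = np.frombuffer(buf, dtype=np.uint8, offset=4 + 4 * ndim)
    return data.reshape(dims)


# Tiny hand-built buffer: a 2x3 array holding the bytes 0..5.
demo = struct.pack(">HBBII", 0, 0x08, 2, 2, 3) + bytes(range(6))
arr = parse_idx(demo)
```

For a real file, the buffer would be the bytes read from disk, for example the contents of train-images-idx3-ubyte opened in binary mode.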

    ✨ Why This Dataset Is Awesome

    • 🎯 It's the “Hello World” of machine learning, perfect for beginners
    • 📊 Ideal for testing image classification algorithms
    • 🧠 Helps you learn how neural networks "see" numbers
    • 💥 Small enough to train quickly, powerful enough to learn real skills

    🧩 Sample Image

    (Add this cell below in your notebook to visualize a few images)

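    import matplotlib.pyplot as plt

    # Assumes train_images and train_labels were loaded earlier in the notebook,
    # with train_images indexed as [sample][channel], e.g. shape (N, 1, 28, 28).
    # Show the first 10 images
    fig, axes = plt.subplots(1, 10, figsize=(15, 2))
    for i in range(10):
        axes[i].imshow(train_images[i][0], cmap="gray")
        axes[i].set_title(f"Label: {train_labels[i].item()}")
        axes[i].axis("off")
    plt.show()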
    
  16. MNIST in CSV

    • kaggle.com
    zip
    Updated May 19, 2018
    Cite
    Dariel Dato-on (2018). MNIST in CSV [Dataset]. https://www.kaggle.com/oddrationale/mnist-in-csv
    Available download formats: zip (15948628 bytes)
    Dataset updated
    May 19, 2018
    Authors
    Dariel Dato-on
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    The MNIST dataset provided in an easy-to-use CSV format

    The original dataset is in a format that is difficult for beginners to use. This dataset uses the work of Joseph Redmon to provide the MNIST dataset in a CSV format.

    The dataset consists of two files:

    1. mnist_train.csv
    2. mnist_test.csv

    The mnist_train.csv file contains the 60,000 training examples and labels. The mnist_test.csv contains 10,000 test examples and labels. Each row consists of 785 values: the first value is the label (a number from 0 to 9) and the remaining 784 values are the pixel values (a number from 0 to 255).
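
A row in this layout can be turned back into an image with a few lines of Python. The sketch below uses one synthetic row rather than the real file; the function name parse_mnist_row is hypothetical, and the row-major pixel order is an assumption.

```python
import csv
import io

import numpy as np


def parse_mnist_row(row):
    """Turn one 785-value CSV row into (label, 28x28 uint8 image)."""
    if len(row) != 785:
        raise ValueError("expected 1 label followed by 784 pixel values")
    label = int(row[0])
    pixels = np.array([int(v) for v in row[1:]], dtype=np.uint8)
    # Assumes pixels are stored row by row (row-major order).
    return label, pixels.reshape(28, 28)


# One synthetic row in place of a line from mnist_train.csv: label 7, with the
# first pixel set to 255 and the remaining 783 pixels set to 0.
text = ",".join(["7", "255"] + ["0"] * 783) + "\n"
for row in csv.reader(io.StringIO(text)):
    label, image = parse_mnist_row(row)
```

If the real CSV starts with a header row, it would need to be skipped before parsing.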

  17. Safran-MNIST-DLS

    • zenodo.org
    application/gzip, csv
    Updated Dec 5, 2025
    + more versions
    Cite
    Sofia Marino; Jennifer Vandoni; Ichraq Lemghari; Basile Musquer; Thierry Arsaut (2025). Safran-MNIST-DLS [Dataset]. http://doi.org/10.5281/zenodo.13321202
    Available download formats: csv, application/gzip
    Dataset updated
    Dec 5, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Sofia Marino; Jennifer Vandoni; Ichraq Lemghari; Basile Musquer; Thierry Arsaut
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Time period covered
    Apr 30, 2024
    Description

    This dataset contains images of serial numbers extracted from diverse avionic parts manufactured by SAFRAN, the international high-technology group and world leader operating in the aviation (propulsion, equipment and interiors), defense and space markets. This dataset resembles the well-known MNIST dataset, but with a focus on industrial contexts, encompassing variations in lighting conditions, orientations, writing styles and surface textures.

    The dataset contains 32 classes depicting numbers, alphabetic characters, and symbols, namely: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F, G, H, J, K, L, M, N, P, R, S, T, U, W, Y, /, .]

    April 30th, 2024: Training dataset containing 9314 images without labels is released.

    December 5th, 2024: Testing and validation datasets released, ground-truth labels for training, validation and testing released.

    This dataset has been proposed in the context of the ICPR24 DAGECC Competition (https://dagecc-challenge.github.io/icpr2024/).

  18. MNIST FASHION

    • kaggle.com
    zip
    Updated Sep 28, 2017
    + more versions
    Cite
    bahadir60 (2017). MNIST FASHION [Dataset]. https://www.kaggle.com/bahadir60/mnistfashion
    Available download formats: zip (23155203 bytes)
    Dataset updated
    Sep 28, 2017
    Authors
    bahadir60
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    Fashion-MNIST is a dataset of Zalando's article images—consisting of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28x28 grayscale image, associated with a label from 10 classes. Zalando intends Fashion-MNIST to serve as a direct drop-in replacement for the original MNIST dataset for benchmarking machine learning algorithms. It shares the same image size and structure of training and testing splits.

    The original MNIST dataset contains a lot of handwritten digits. Members of the AI/ML/Data Science community love this dataset and use it as a benchmark to validate their algorithms. In fact, MNIST is often the first dataset researchers try. "If it doesn't work on MNIST, it won't work at all", they said. "Well, if it does work on MNIST, it may still fail on others."

    Zalando seeks to replace the original MNIST dataset

    Content

    Each image is 28 pixels in height and 28 pixels in width, for a total of 784 pixels. Each pixel has a single pixel-value associated with it, indicating the lightness or darkness of that pixel, with higher numbers meaning darker. This pixel-value is an integer between 0 and 255. The training and test data sets have 785 columns. The first column consists of the class labels (see above), and represents the article of clothing. The rest of the columns contain the pixel-values of the associated image.

    To locate a pixel on the image, suppose that we have decomposed x as x = i * 28 + j, where i and j are integers between 0 and 27. The pixel is located on row i and column j of a 28 x 28 matrix. For example, pixel31 indicates the pixel that is in the fourth column from the left, and the second row from the top, as in the ascii-diagram below.
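
The decomposition x = i * 28 + j is what Python's built-in divmod computes, so the pixel lookup can be sketched as below. The helper name pixel_position is made up for the example.

```python
def pixel_position(x: int, width: int = 28) -> tuple[int, int]:
    """Return the zero-based (row, column) of pixel x, where x = i * width + j."""
    if not 0 <= x < width * width:
        raise ValueError("pixel index out of range")
    return divmod(x, width)


# pixel31: zero-based row 1 and column 3, i.e. second row, fourth column.
row, col = pixel_position(31)
```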

    Labels

    Each training and test example is assigned to one of the following labels:

    0 T-shirt/top
    1 Trouser
    2 Pullover
    3 Dress
    4 Coat
    5 Sandal
    6 Shirt
    7 Sneaker
    8 Bag
    9 Ankle boot

    TL;DR

    • Each row is a separate image
    • Column 1 is the class label.
    • Remaining columns are pixel numbers (784 total).
    • Each value is the darkness of the pixel (0 to 255)

    Acknowledgements

    Original dataset was downloaded from https://github.com/zalandoresearch/fashion-mnist

    Dataset was converted to CSV with this script: https://pjreddie.com/projects/mnist-in-csv/

    License

    The MIT License (MIT) Copyright © [2017] Zalando SE, https://tech.zalando.com

    Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

    The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

    THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

  19. svhn

    • huggingface.co
    Updated Jul 23, 2025
    + more versions
    Cite
    Genius Society (2025). svhn [Dataset]. https://huggingface.co/datasets/Genius-Society/svhn
    Croissant
    Dataset updated
    Jul 23, 2025
    Dataset authored and provided by
    Genius Society
    License

    MIT License https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset card for SVHN

    The Street View House Numbers (SVHN) dataset is a real-world image dataset developed and designed for machine learning and object recognition algorithms, and is characterized by low data preprocessing and formatting requirements. Similar to MNIST, SVHN contains images of small cropped numbers, but in terms of labeled data, SVHN is an order of magnitude larger than MNIST, comprising over 600,000 digit images. Unlike MNIST, SVHN deals with a much more… See the full description on the dataset page: https://huggingface.co/datasets/Genius-Society/svhn.

  20. Number Ops Dataset

    • universe.roboflow.com
    zip
    Updated May 7, 2023
    Cite
    Roboflow 100 (2023). Number Ops Dataset [Dataset]. https://universe.roboflow.com/roboflow-100/number-ops/model/1
    Available download formats: zip
    Dataset updated
    May 7, 2023
    Dataset provided by
    Roboflow, Inc.
    Authors
    Roboflow 100
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Number Ops Bounding Boxes
    Description

    This dataset was originally created by Pavel Kulikov, Djopa Volosata, Daria Podryadova. To see the current project, which may have been updated since this version, please go here: https://universe.roboflow.com/mnist-bvalq/mnist-icrul.

    This dataset is part of RF100, an Intel-sponsored initiative to create a new object detection benchmark for model generalizability.

    Access the RF100 Github repo: https://github.com/roboflow-ai/roboflow-100-benchmark

mnist (https://www.tensorflow.org/datasets/catalog/mnist): 77 scholarly articles cite this dataset.