19 datasets found
  1. EMNIST Dataset

    • paperswithcode.com
    Cite
    Gregory Cohen; Saeed Afshar; Jonathan Tapson; André van Schaik, EMNIST Dataset [Dataset]. https://paperswithcode.com/dataset/emnist
    Authors
    Gregory Cohen; Saeed Afshar; Jonathan Tapson; André van Schaik
    Description

    EMNIST (Extended MNIST) has 4 times more data than MNIST. It is a set of handwritten digits in a 28 x 28 pixel format.

  2. emnist-letters

    • huggingface.co
    Updated Aug 5, 2024
    Cite
    Royc30ne (2024). emnist-letters [Dataset]. https://huggingface.co/datasets/Royc30ne/emnist-letters
    Dataset updated
    Aug 5, 2024
    Authors
    Royc30ne
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    EMNIST Letters Dataset

    Authors

    Gregory Cohen, Saeed Afshar, Jonathan Tapson, Andre van Schaik

    The MARCS Institute for Brain, Behaviour and Development, Western Sydney University, Penrith, Australia 2751. Email: g.cohen@westernsydney.edu.au

    What is it?

    The EMNIST dataset is a set of handwritten character digits derived from the NIST Special Database 19 and converted to a 28x28 pixel image format and dataset structure that… See the full description on the dataset page: https://huggingface.co/datasets/Royc30ne/emnist-letters.

  3. Extended MNIST (EMNIST) dataset

    • researchdata.edu.au
    Updated May 16, 2023
    Cite
    van Schaik Andre; Tapson Jonathan; Afshar Saeed; Cohen Gregory (2023). Extended MNIST (EMNIST) dataset [Dataset]. http://doi.org/10.26183/ZN7S-GH79
    Dataset updated
    May 16, 2023
    Dataset provided by
    Western Sydney University
    Authors
    van Schaik Andre; Tapson Jonathan; Afshar Saeed; Cohen Gregory
    License

    Attribution-NoDerivs 4.0 (CC BY-ND 4.0) (https://creativecommons.org/licenses/by-nd/4.0/)
    License information was derived automatically

    Description

    The EMNIST dataset is a set of handwritten character digits derived from the NIST Special Database 19 (https://www.nist.gov/srd/nist-special-database-19) and converted to a 28x28 pixel image format and dataset structure that directly matches the MNIST dataset (http://yann.lecun.com/exdb/mnist/). Further information on the dataset contents and conversion process can be found in the paper available at https://arxiv.org/abs/1702.05373v2

    The MNIST dataset has become a standard benchmark for learning, classification and computer vision systems. Contributing to its widespread adoption are the understandable and intuitive nature of the task, its relatively small size and storage requirements and the accessibility and ease-of-use of the database itself. The MNIST database was derived from a larger dataset known as the NIST Special Database 19 which contains digits, uppercase and lowercase handwritten letters. This paper introduces a variant of the full NIST dataset, which we have called Extended MNIST (EMNIST), which follows the same conversion paradigm used to create the MNIST dataset. The result is a set of datasets that constitute more challenging classification tasks involving letters and digits, and that share the same image structure and parameters as the original MNIST task, allowing for direct compatibility with all existing classifiers and systems. Benchmark results are presented along with a validation of the conversion process through the comparison of the classification results on converted NIST digits and the MNIST digits.

    The database is made available in original MNIST format and Matlab format.
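As a rough illustration of what "original MNIST format" means in practice, the sketch below parses an IDX buffer of unsigned bytes, assuming the layout commonly documented for MNIST files: two zero bytes, a data-type code (0x08 for unsigned byte), a dimension count, then one big-endian 32-bit size per dimension followed by the raw values. The function name `parse_idx_uint8` and the synthetic sample are illustrative assumptions, not part of the EMNIST distribution.

```python
import struct


def parse_idx_uint8(data: bytes):
    """Parse an IDX buffer holding unsigned bytes; return (dims, values).

    Hypothetical helper: it assumes the standard MNIST IDX layout and
    handles only the unsigned-byte data type (code 0x08).
    """
    if len(data) < 4:
        raise ValueError("buffer too short for an IDX header")
    zero_hi, zero_lo, dtype_code, ndim = struct.unpack(">BBBB", data[:4])
    if zero_hi != 0 or zero_lo != 0:
        raise ValueError("first two bytes of the magic number must be zero")
    if dtype_code != 0x08:
        raise ValueError("this sketch handles only unsigned-byte payloads")

    # One big-endian unsigned 32-bit integer per dimension.
    header_end = 4 + 4 * ndim
    dims = struct.unpack(">" + "I" * ndim, data[4:header_end])

    # Total number of values is the product of all dimension sizes.
    count = 1
    for size in dims:
        count *= size

    payload = data[header_end:header_end + count]
    if len(payload) != count:
        raise ValueError("payload is shorter than the header promises")
    return dims, list(payload)


# Synthetic buffer standing in for a real file: two 2x2 "images".
sample = struct.pack(">BBBBIII", 0, 0, 0x08, 3, 2, 2, 2) + bytes(range(8))
dims, values = parse_idx_uint8(sample)
```

For a real download, the file bytes would be read first (and decompressed if the file is gzipped) before being passed to the parser.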

  4. Federated EMNIST Dataset

    • figshare.com
    Updated Jul 16, 2024
    Cite
    Saroj Mali (2024). Federated EMNIST Dataset [Dataset]. http://doi.org/10.6084/m9.figshare.26308777.v1
    Available download formats: xz
    Dataset updated
    Jul 16, 2024
    Dataset provided by
    figshare
    Authors
    Saroj Mali
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    This dataset is derived from the Leaf repository (https://github.com/TalwalkarLab/leaf) pre-processing of the Extended MNIST dataset, grouping examples by writer. Details about Leaf were published in "LEAF: A Benchmark for Federated Settings" (https://arxiv.org/abs/1812.01097).

    Note: This dataset does not include some additional preprocessing that MNIST includes, such as size-normalization and centering. In the Federated EMNIST data, the value of 1.0 corresponds to the background, and 0.0 corresponds to the color of the digits themselves; this is the inverse of some MNIST representations, e.g. in tensorflow_datasets, where 0 corresponds to the background color, and 255 represents the color of the digit.

    Data set sizes:

    only_digits=True: 3,383 users, 10 label classes
      train: 341,873 examples
      test: 40,832 examples
    only_digits=False: 3,400 users, 62 label classes
      train: 671,585 examples
      test: 77,483 examples

    Rather than holding out specific users, each user's examples are split across train and test so that all users have at least one example in train and one example in test. Writers that had less than 2 examples are excluded from the data set.

    The tf.data.Datasets returned by tff.simulation.datasets.ClientData.create_tf_dataset_for_client will yield collections.OrderedDict objects at each iteration, with the following keys and values, in lexicographic order by key:

    'label': a tf.Tensor with dtype=tf.int32 and shape [1], the class label of the corresponding pixels. Labels [0-9] correspond to the digits classes, labels [10-35] correspond to the uppercase classes (e.g., label 11 is 'B'), and labels [36-61] correspond to the lowercase classes (e.g., label 37 is 'b').
    'pixels': a tf.Tensor with dtype=tf.float32 and shape [28, 28], containing the pixels of the handwritten digit, with values in the range [0.0, 1.0].

    Args
    only_digits: (Optional) whether to only include examples that are from the digits [0-9] classes. If False, includes lower and upper case characters, for a total of 62 class labels.
    cache_dir: (Optional) directory to cache the downloaded file. If None, caches in Keras' default cache directory.

    Returns
    Tuple of (train, test) where the tuple elements are tff.simulation.datasets.ClientData objects.
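The label ranges and the inverted pixel convention described above can be sketched in plain Python. The helper names `label_to_char` and `to_mnist_style` are hypothetical and not part of TensorFlow Federated; the sketch only assumes the mapping stated in the description (0-9 digits, 10-35 uppercase, 36-61 lowercase) and the stated pixel convention (1.0 background, 0.0 ink).

```python
def label_to_char(label: int) -> str:
    """Map a Federated EMNIST class label (0-61) to its character."""
    if 0 <= label <= 9:
        return chr(ord("0") + label)          # digit classes
    if 10 <= label <= 35:
        return chr(ord("A") + label - 10)     # uppercase classes
    if 36 <= label <= 61:
        return chr(ord("a") + label - 36)     # lowercase classes
    raise ValueError(f"label out of range: {label}")


def to_mnist_style(pixels):
    """Convert rows of floats (1.0 = background, 0.0 = ink) into
    0-255 integers where 0 is background and 255 is ink."""
    return [[int(round((1.0 - value) * 255)) for value in row] for row in pixels]


# The two examples given in the description.
upper_b = label_to_char(11)
lower_b = label_to_char(37)

# A tiny stand-in for a 28x28 image: one background pixel, one ink pixel.
converted = to_mnist_style([[1.0, 0.0]])
```

This conversion changes only the value range and polarity; it does not add the size-normalization and centering that the note says are missing from this data.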

  5. EMNIST-DA: A dataset for studying measurement shift

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jun 2, 2022
    Cite
    Schoelkopf, Bernhard (2022). EMNIST-DA: A dataset for studying measurement shift [Dataset]. https://data.niaid.nih.gov/resources?id=ZENODO_6602350
    Dataset updated
    Jun 2, 2022
    Dataset provided by
    Schoelkopf, Bernhard
    Eastwood, Cian
    Williams, Christopher K. I.
    Mason, Ian
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    emnist_large.tar.gz contains the EMNIST-DA dataset consisting of 13 shifted versions of the 47-class extended-MNIST (EMNIST) dataset. As many methods achieve very good performance on MNIST datasets, this dataset was created to be more challenging, with 47 classes and some difficult (measurement) shifts.

    Source code to generate the dataset is available at https://github.com/cianeastwood/bufr/blob/main/data/emnist.py.

  6. EMNIST - JPEG

    • kaggle.com
    Updated Jan 28, 2019
    Cite
    Tomas Ramos (2019). EMNIST - JPEG [Dataset]. https://www.kaggle.com/tomasramos21/emnist-jpeg/discussion
    Dataset updated
    Jan 28, 2019
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Tomas Ramos
    Description

    Context

    Famous extended MNIST dataset. The original dataset is available at https://www.nist.gov/itl/iad/image-group/emnist-dataset. This dataset was "re-created" to provide the images in JPEG format for individuals who might want to manipulate the images without having to convert them from their original IDX3 format. Original Authors: Gregory Cohen, Saeed Afshar, Jonathan Tapson, and Andre van Schaik

    Content

    Contains 28x28 grayscale images of handwritten characters, as well as a CSV file providing each image's path and respective label. The characters are grouped in directories with the same name as their label.

  7. Emnist Letters Dataset

    • kaggle.com
    Updated Sep 25, 2021
    Cite
    Mohammed Sadiq (2021). Emnist Letters Dataset [Dataset]. https://www.kaggle.com/datasets/mzaink14/emnist-letters-dataset/code
    Dataset updated
    Sep 25, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Mohammed Sadiq
    Description

    Dataset

    This dataset was created by Mohammed Sadiq

    Contents

  8. EMNIST

    • kaggle.com
    Updated Jun 7, 2024
    Cite
    Kevin Min Seong Park (2024). EMNIST [Dataset]. https://www.kaggle.com/datasets/kevinminseongpark/emnist/code
    Dataset updated
    Jun 7, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Kevin Min Seong Park
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Dataset

    This dataset was created by Kevin Min Seong Park

    Released under MIT

    Contents

  9. EMNIST-Letters

    • kaggle.com
    Updated Feb 17, 2019
    Cite
    syed khadir ahmed (2019). EMNIST-Letters [Dataset]. https://www.kaggle.com/datasets/skhadirahmed/emnistletters
    Available download formats: zip (35318847 bytes)
    Dataset updated
    Feb 17, 2019
    Authors
    syed khadir ahmed
    License

    CC0: Public Domain (https://creativecommons.org/publicdomain/zero/1.0/)

    Description

    Dataset

    This dataset was created by syed khadir ahmed

    Released under CC0: Public Domain

    Contents

  10. emnist_mnist

    • huggingface.co
    Updated May 3, 2024
    Cite
    Anke Tang (2024). emnist_mnist [Dataset]. https://huggingface.co/datasets/tanganke/emnist_mnist
    Dataset updated
    May 3, 2024
    Authors
    Anke Tang
    Description

    Dataset Card for "emnist-mnist"

      Dataset Information
    

    The emnist-mnist dataset is a set of images of handwritten digits. The dataset is split into a training set and a test set.

      Data Fields
    

    image: The image of the handwritten digit. The data type of this field is image.
    label: The label of the handwritten digit. The data type of this field is class_label, and it can take on the values '0' to '9'.

      Data Splits
    

    train: The training set consists of 60000… See the full description on the dataset page: https://huggingface.co/datasets/tanganke/emnist_mnist.

  11. EMNIST-all

    • kaggle.com
    Updated Jul 22, 2021
    Cite
    Om Duggineni (2021). EMNIST-all [Dataset]. https://www.kaggle.com/datasets/omduggineni/emnistall/data
    Dataset updated
    Jul 22, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Om Duggineni
    Description

    Dataset

    This dataset was created by Om Duggineni

    Contents

  12. emnist_letters

    • huggingface.co
    Updated Aug 27, 2024
    Cite
    Anke Tang (2024). emnist_letters [Dataset]. https://huggingface.co/datasets/tanganke/emnist_letters
    Dataset updated
    Aug 27, 2024
    Authors
    Anke Tang
    Description

    Dataset Card for "emnist-letters"

      Dataset Information
    

    The emnist-letters dataset is a set of images of handwritten letters. The dataset is split into a training set and a test set.

      Data Fields
    

    image: The image of the handwritten letter. The data type of this field is image.
    label: The label of the handwritten letter. The data type of this field is class_label, and it can take on the values 'A' to 'Z'.

      Data Splits
    

    train: The training set consists… See the full description on the dataset page: https://huggingface.co/datasets/tanganke/emnist_letters.

  13. EMNIST Balanced [a-z A-z 0-9]

    • kaggle.com
    Updated Oct 26, 2020
    Cite
    Ankur Goswami (2020). EMNIST Balanced [a-z A-z 0-9] [Dataset]. https://www.kaggle.com/ankur1401/emnist-balanced-az-az-09/tasks
    Dataset updated
    Oct 26, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Ankur Goswami
    Description

    Dataset

    This dataset was created by Ankur Goswami

    Contents

  14. Model Letters EMNIST Classification

    • kaggle.com
    Updated Sep 25, 2023
    Cite
    Олексій Чорний (2023). Model Letters EMNIST Classification [Dataset]. https://www.kaggle.com/oleksiichornyi/model-letters-emnist-classification/discussion
    Dataset updated
    Sep 25, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Олексій Чорний
    Description

    Dataset

    This dataset was created by Олексій Чорний

    Contents

  15. Outputs of the paper "Blockchain-enabled Server-less Federated Learning"

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jul 30, 2022
    Cite
    Paolo Dini (2022). Outputs of the paper "Blockchain-enabled Server-less Federated Learning" [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6913380
    Dataset updated
    Jul 30, 2022
    Dataset provided by
    Francesc Wilhelmi
    Lorenza Giupponi
    Paolo Dini
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    This repository contains the different outputs generated for the paper "Blockchain-enabled Server-less Federated Learning", submitted to Computer Networks (COMNET) journal. Two types of outputs are provided:

    Blockchain queue simulator (output_queue_simulator): results of the simulations done in the batch-service queue simulator (https://github.com/fwilhelmi/batch_service_queue_simulator) to characterize the queue latency of blockchain applications.

    Tensorflow (output_tensorflow): results of the simulations done in Tensorflow Federated (TFF), resulting from the application of different models to the federated EMNIST dataset.

    Each folder also includes the scripts used to execute the corresponding simulations. For more details, see the repository in https://github.com/fwilhelmi/blockchain_enabled_federated_learning

  16. EMNIST StratifiedKFold_models

    • kaggle.com
    Updated Oct 22, 2023
    Cite
    Олексій Чорний (2023). EMNIST StratifiedKFold_models [Dataset]. https://www.kaggle.com/datasets/oleksiichornyi/emnist-stratifiedkfold-models/suggestions
    Dataset updated
    Oct 22, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Олексій Чорний
    Description

    Dataset

    This dataset was created by Олексій Чорний

    Contents

  17. EMNIST_preprocessed_EXPANDED

    • kaggle.com
    Updated May 30, 2024
    Cite
    Dmytro Potapov (2024). EMNIST_preprocessed_EXPANDED [Dataset]. https://www.kaggle.com/datasets/dmytropotapov/emnist-preprocessed-expanded
    Dataset updated
    May 30, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Dmytro Potapov
    Description

    Dataset

    This dataset was created by Dmytro Potapov

    Contents

  18. BALANCED_EMNIST_MATHS_SYMBOL_DATASET

    • kaggle.com
    Updated Mar 13, 2024
    Cite
    Yuvraj Joshi 1110 (2024). BALANCED_EMNIST_MATHS_SYMBOL_DATASET [Dataset]. https://www.kaggle.com/datasets/yuvrajjoshi1110/balanced-emnist-maths-symbol-dataset
    Dataset updated
    Mar 13, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Yuvraj Joshi 1110
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Dataset

    This dataset was created by Yuvraj Joshi 1110

    Released under MIT

    Contents

  19. Telerik RadCaptcha Segmented Characters 40x50 size

    • kaggle.com
    Updated Jan 3, 2025
    Cite
    TBO Gamer 22 (2025). Telerik RadCaptcha Segmented Characters 40x50 size [Dataset]. https://www.kaggle.com/datasets/tbogamer22/telerik-radcaptcha-segmented-characters-40x50-size
    Dataset updated
    Jan 3, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    TBO Gamer 22
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset contains 101354 labeled, segmented character images from Telerik RadCaptcha, each with a resolution of 40x50 pixels. The characters were segmented from 3000 manually labeled Telerik RadCaptcha images featuring 5-character alphanumeric strings, then resized to a fixed resolution of 40x50 pixels. To create a diverse dataset, image augmentation (various rotations and translations) was applied to the resized character images to increase generalization capability. This dataset is suited to CAPTCHA text recognition and optical character recognition (OCR). It is designed to support various deep learning and machine learning tasks, including text extraction and model robustness evaluation, and can be used for developing and fine-tuning CAPTCHA recognition systems. The dataset is very similar to the MNIST and EMNIST datasets.
