19 datasets found

EMNIST Dataset
paperswithcode.com
Cite
Gregory Cohen; Saeed Afshar; Jonathan Tapson; André van Schaik, EMNIST Dataset [Dataset]. https://paperswithcode.com/dataset/emnist
Authors
Gregory Cohen; Saeed Afshar; Jonathan Tapson; André van Schaik
Description
EMNIST (extended MNIST) has 4 times more data than MNIST. It is a set of handwritten digits in a 28 x 28 pixel format.
emnist-letters
huggingface.co
Updated Aug 5, 2024
Cite
Royc30ne (2024). emnist-letters [Dataset]. https://huggingface.co/datasets/Royc30ne/emnist-letters
Dataset updated
Aug 5, 2024
Authors
Royc30ne
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
EMNIST Letters Dataset

Authors

Gregory Cohen, Saeed Afshar, Jonathan Tapson, Andre van Schaik

The MARCS Institute for Brain, Behaviour and Development, Western Sydney University, Penrith, Australia 2751. Email: g.cohen@westernsydney.edu.au

What is it?

The EMNIST dataset is a set of handwritten character digits derived from the NIST Special Database 19 and converted to a 28x28 pixel image format and dataset structure that… See the full description on the dataset page: https://huggingface.co/datasets/Royc30ne/emnist-letters.
Extended MNIST (EMNIST) dataset
researchdata.edu.au
Updated May 16, 2023
Cite
van Schaik Andre; Tapson Jonathan; Afshar Saeed; Cohen Gregory (2023). Extended MNIST (EMNIST) dataset [Dataset]. http://doi.org/10.26183/ZN7S-GH79
Unique identifier
https://doi.org/10.26183/ZN7S-GH79
Dataset updated
May 16, 2023
Dataset provided by
Western Sydney University
Authors
van Schaik Andre; Tapson Jonathan; Afshar Saeed; Cohen Gregory
License
Attribution-NoDerivs 4.0 (CC BY-ND 4.0): https://creativecommons.org/licenses/by-nd/4.0/
License information was derived automatically
Description
The EMNIST dataset is a set of handwritten character digits derived from the NIST Special Database 19 (https://www.nist.gov/srd/nist-special-database-19) and converted to a 28x28 pixel image format and dataset structure that directly matches the MNIST dataset (http://yann.lecun.com/exdb/mnist/). Further information on the dataset contents and conversion process can be found in the paper available at https://arxiv.org/abs/1702.05373v2
The MNIST dataset has become a standard benchmark for learning, classification and computer vision systems. Contributing to its widespread adoption are the understandable and intuitive nature of the task, its relatively small size and storage requirements, and the accessibility and ease-of-use of the database itself. The MNIST database was derived from a larger dataset known as the NIST Special Database 19, which contains digits and uppercase and lowercase handwritten letters. This paper introduces a variant of the full NIST dataset, which we have called Extended MNIST (EMNIST), which follows the same conversion paradigm used to create the MNIST dataset. The result is a set of datasets that constitute more challenging classification tasks involving letters and digits, and that share the same image structure and parameters as the original MNIST task, allowing for direct compatibility with all existing classifiers and systems. Benchmark results are presented along with a validation of the conversion process through the comparison of the classification results on converted NIST digits and the MNIST digits.
The database is made available in original MNIST format and Matlab format.
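The "original MNIST format" mentioned here is generally understood to be the IDX binary layout used by the MNIST files: a 4-byte header (two zero bytes, a data-type code, and the number of dimensions), followed by one big-endian 32-bit size per dimension, then the raw values. The following is a minimal sketch of a reader under that assumption. The function name `parse_idx` and the tiny `SAMPLE` buffer are illustrative only and are not part of the dataset or any official tooling.

```python
import struct


def parse_idx(buf: bytes):
    """Parse an IDX buffer of unsigned bytes and return (shape, data).

    Assumes the MNIST-style IDX layout: bytes 0-1 are zero, byte 2 is the
    data-type code (0x08 = unsigned byte), byte 3 is the number of dimensions.
    """
    zero, dtype, ndim = struct.unpack_from(">HBB", buf, 0)
    if zero != 0 or dtype != 0x08:
        raise ValueError("not an unsigned-byte IDX buffer")

    # One big-endian uint32 per dimension, e.g. (n_images, rows, cols).
    shape = struct.unpack_from(f">{ndim}I", buf, 4)

    count = 1
    for dim in shape:
        count *= dim

    offset = 4 + 4 * ndim
    data = buf[offset:offset + count]
    if len(data) != count:
        raise ValueError("IDX buffer is truncated")
    return shape, data


# Hypothetical in-memory example: one 2x2 "image" instead of a real file.
SAMPLE = struct.pack(">HBB3I", 0, 0x08, 3, 1, 2, 2) + bytes([0, 255, 255, 0])
shape, data = parse_idx(SAMPLE)
```

For a real file, the same function could be applied to the decompressed bytes of an image or label file; label files would simply have a single dimension.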
Federated EMNIST Dataset
figshare.com
xz
Updated Jul 16, 2024
Cite
Saroj Mali (2024). Federated EMNIST Dataset [Dataset]. http://doi.org/10.6084/m9.figshare.26308777.v1
Available download formats: xz
Unique identifier
https://doi.org/10.6084/m9.figshare.26308777.v1
Dataset updated
Jul 16, 2024
Dataset provided by
figshare
Authors
Saroj Mali
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset is derived from the Leaf repository (https://github.com/TalwalkarLab/leaf) pre-processing of the Extended MNIST dataset, grouping examples by writer. Details about Leaf were published in "LEAF: A Benchmark for Federated Settings" (https://arxiv.org/abs/1812.01097).

Note: This dataset does not include some additional preprocessing that MNIST includes, such as size-normalization and centering. In the Federated EMNIST data, the value of 1.0 corresponds to the background, and 0.0 corresponds to the color of the digits themselves; this is the inverse of some MNIST representations, e.g. in tensorflow_datasets, where 0 corresponds to the background color, and 255 represents the color of the digit.

Data set sizes:

only_digits=True: 3,383 users, 10 label classes
train: 341,873 examples
test: 40,832 examples

only_digits=False: 3,400 users, 62 label classes
train: 671,585 examples
test: 77,483 examples

Rather than holding out specific users, each user's examples are split across train and test so that all users have at least one example in train and one example in test. Writers that had less than 2 examples are excluded from the data set.

The tf.data.Datasets returned by tff.simulation.datasets.ClientData.create_tf_dataset_for_client will yield collections.OrderedDict objects at each iteration, with the following keys and values, in lexicographic order by key:

'label': a tf.Tensor with dtype=tf.int32 and shape [1], the class label of the corresponding pixels. Labels [0-9] correspond to the digits classes, labels [10-35] correspond to the uppercase classes (e.g., label 11 is 'B'), and labels [36-61] correspond to the lowercase classes (e.g., label 37 is 'b').
'pixels': a tf.Tensor with dtype=tf.float32 and shape [28, 28], containing the pixels of the handwritten digit, with values in the range [0.0, 1.0].

Args
only_digits: (Optional) whether to only include examples that are from the digits [0-9] classes. If False, includes lower and upper case characters, for a total of 62 class labels.
cache_dir: (Optional) directory to cache the downloaded file. If None, caches in Keras' default cache directory.

Returns
Tuple of (train, test) where the tuple elements are tff.simulation.datasets.ClientData objects.
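The label ranges and the inverted pixel convention described in this entry can be illustrated with a short sketch. It uses only the Python standard library and plain nested lists rather than tf.Tensor objects, and the helper names `label_to_char` and `to_mnist_style` are hypothetical, introduced here for illustration. It assumes the 62-class ordering stated above (digits, then uppercase, then lowercase).

```python
import string

# Assumed class order from the description: 0-9, then A-Z (10-35), then a-z (36-61).
CLASSES = string.digits + string.ascii_uppercase + string.ascii_lowercase


def label_to_char(label: int) -> str:
    """Map an integer class label in [0, 61] to its character."""
    if not 0 <= label < len(CLASSES):
        raise ValueError(f"label out of range: {label}")
    return CLASSES[label]


def to_mnist_style(pixels):
    """Invert Federated EMNIST pixels to an MNIST-like convention.

    Input: rows of floats in [0.0, 1.0] where 1.0 is background.
    Output: rows of ints in [0, 255] where 0 is background.
    """
    return [[round((1.0 - value) * 255) for value in row] for row in pixels]


# The two examples given in the description.
upper_b = label_to_char(11)
lower_b = label_to_char(37)
```

Whether inversion is needed depends on the model being compared against; a model trained on MNIST-style inputs would likely expect the inverted form.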
EMNIST-DA: A dataset for studying measurement shift
data.niaid.nih.gov
zenodo.org
Updated Jun 2, 2022
Cite
Schoelkopf, Bernhard (2022). EMNIST-DA: A dataset for studying measurement shift [Dataset]. https://data.niaid.nih.gov/resources?id=ZENODO_6602350
Dataset updated
Jun 2, 2022
Dataset provided by
Schoelkopf, Bernhard
Eastwood, Cian
Williams, Christopher K. I.
Mason, Ian
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
emnist_large.tar.gz contains the EMNIST-DA dataset consisting of 13 shifted versions of the 47-class extended-MNIST (EMNIST) dataset. As many methods achieve very good performance on MNIST datasets, this dataset was created to be more challenging, with 47 classes and some difficult (measurement) shifts.

Source code to generate the dataset is available at https://github.com/cianeastwood/bufr/blob/main/data/emnist.py.
EMNIST - JPEG
kaggle.com
Updated Jan 28, 2019
Cite
Tomas Ramos (2019). EMNIST - JPEG [Dataset]. https://www.kaggle.com/tomasramos21/emnist-jpeg/discussion
Dataset updated
Jan 28, 2019
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Tomas Ramos
Description
Context

Famous extended MNIST dataset. The original dataset is available at https://www.nist.gov/itl/iad/image-group/emnist-dataset. This dataset was "re-created" to provide the images in JPEG format for individuals who might want to manipulate the images without having to convert them from their original IDX3 format. Original Authors: Gregory Cohen, Saeed Afshar, Jonathan Tapson, and Andre van Schaik

Content

Contains 28x28 grayscale images of handwritten characters, as well as a CSV file providing each image's path and respective label. The characters are grouped in directories with the same name as their label.
Emnist Letters Dataset
kaggle.com
Updated Sep 25, 2021
Cite
Mohammed Sadiq (2021). Emnist Letters Dataset [Dataset]. https://www.kaggle.com/datasets/mzaink14/emnist-letters-dataset/code
Dataset updated
Sep 25, 2021
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Mohammed Sadiq
Description
Dataset

This dataset was created by Mohammed Sadiq

EMNIST
kaggle.com
Updated Jun 7, 2024
Cite
Kevin Min Seong Park (2024). EMNIST [Dataset]. https://www.kaggle.com/datasets/kevinminseongpark/emnist/code
Dataset updated
Jun 7, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Kevin Min Seong Park
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
Dataset

This dataset was created by Kevin Min Seong Park

Released under MIT

EMNIST-Letters
kaggle.com
zip
Updated Feb 17, 2019
Cite
syed khadir ahmed (2019). EMNIST-Letters [Dataset]. https://www.kaggle.com/datasets/skhadirahmed/emnistletters
Available download formats: zip (35318847 bytes)
Dataset updated
Feb 17, 2019
Authors
syed khadir ahmed
License
CC0: Public Domain (https://creativecommons.org/publicdomain/zero/1.0/)
Description
Dataset

This dataset was created by syed khadir ahmed

Released under CC0: Public Domain

emnist_mnist
huggingface.co
Updated May 3, 2024
Cite
Anke Tang (2024). emnist_mnist [Dataset]. https://huggingface.co/datasets/tanganke/emnist_mnist
Dataset updated
May 3, 2024
Authors
Anke Tang
Description
Dataset Card for "emnist-mnist"

Dataset Information

The emnist-mnist dataset is a set of images of handwritten digits. The dataset is split into a training set and a test set.

Data Fields

image: The image of the handwritten digit. The data type of this field is image.
label: The label of the handwritten digit. The data type of this field is class_label, and it can take on the values '0' to '9'.

Data Splits

train: The training set consists of 60000… See the full description on the dataset page: https://huggingface.co/datasets/tanganke/emnist_mnist.
EMNIST-all
kaggle.com
Updated Jul 22, 2021
Cite
Om Duggineni (2021). EMNIST-all [Dataset]. https://www.kaggle.com/datasets/omduggineni/emnistall/data
Dataset updated
Jul 22, 2021
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Om Duggineni
Description
Dataset

This dataset was created by Om Duggineni

emnist_letters
huggingface.co
Updated Aug 27, 2024
Cite
Anke Tang (2024). emnist_letters [Dataset]. https://huggingface.co/datasets/tanganke/emnist_letters
Dataset updated
Aug 27, 2024
Authors
Anke Tang
Description
Dataset Card for "emnist-letters"

Dataset Information

The emnist-letters dataset is a set of images of handwritten letters. The dataset is split into a training set and a test set.

Data Fields

image: The image of the handwritten letter. The data type of this field is image.
label: The label of the handwritten letter. The data type of this field is class_label, and it can take on the values 'A' to 'Z'.

Data Splits

train: The training set consists… See the full description on the dataset page: https://huggingface.co/datasets/tanganke/emnist_letters.
EMNIST Balanced [a-z A-z 0-9]
kaggle.com
Updated Oct 26, 2020
Cite
Ankur Goswami (2020). EMNIST Balanced [a-z A-z 0-9] [Dataset]. https://www.kaggle.com/ankur1401/emnist-balanced-az-az-09/tasks
Dataset updated
Oct 26, 2020
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Ankur Goswami
Description
Dataset

This dataset was created by Ankur Goswami

Model Letters EMNIST Classification
kaggle.com
Updated Sep 25, 2023
Cite
Олексій Чорний (2023). Model Letters EMNIST Classification [Dataset]. https://www.kaggle.com/oleksiichornyi/model-letters-emnist-classification/discussion
Dataset updated
Sep 25, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Олексій Чорний
Description
Dataset

This dataset was created by Олексій Чорний

Outputs of the paper "Blockchain-enabled Server-less Federated Learning"
data.niaid.nih.gov
zenodo.org
Updated Jul 30, 2022
Cite
Paolo Dini (2022). Outputs of the paper "Blockchain-enabled Server-less Federated Learning" [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6913380
Dataset updated
Jul 30, 2022
Dataset provided by
Francesc Wilhelmi
Lorenza Giupponi
Paolo Dini
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This repository contains the different outputs generated for the paper "Blockchain-enabled Server-less Federated Learning", submitted to Computer Networks (COMNET) journal. Two types of outputs are provided:

Blockchain queue simulator (output_queue_simulator): results of the simulations done in the batch-service queue simulator (https://github.com/fwilhelmi/batch_service_queue_simulator) to characterize the queue latency of blockchain applications.

TensorFlow (output_tensorflow): results of the simulations done in TensorFlow Federated (TFF), resulting from the application of different models to the federated EMNIST dataset.

Each folder also includes the scripts used to execute the corresponding simulations. For more details, see the repository at https://github.com/fwilhelmi/blockchain_enabled_federated_learning
EMNIST StratifiedKFold_models
kaggle.com
Updated Oct 22, 2023
Cite
Олексій Чорний (2023). EMNIST StratifiedKFold_models [Dataset]. https://www.kaggle.com/datasets/oleksiichornyi/emnist-stratifiedkfold-models/suggestions
Dataset updated
Oct 22, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Олексій Чорний
Description
Dataset

This dataset was created by Олексій Чорний

EMNIST_preprocessed_EXPANDED
kaggle.com
Updated May 30, 2024
Cite
Dmytro Potapov (2024). EMNIST_preprocessed_EXPANDED [Dataset]. https://www.kaggle.com/datasets/dmytropotapov/emnist-preprocessed-expanded
Dataset updated
May 30, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Dmytro Potapov
Description
Dataset

This dataset was created by Dmytro Potapov

BALANCED_EMNIST_MATHS_SYMBOL_DATASET
kaggle.com
Updated Mar 13, 2024
Cite
Yuvraj Joshi 1110 (2024). BALANCED_EMNIST_MATHS_SYMBOL_DATASET [Dataset]. https://www.kaggle.com/datasets/yuvrajjoshi1110/balanced-emnist-maths-symbol-dataset
Dataset updated
Mar 13, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Yuvraj Joshi 1110
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
Dataset

This dataset was created by Yuvraj Joshi 1110

Released under MIT

Telerik RadCaptcha Segmented Characters 40x50 size
kaggle.com
Updated Jan 3, 2025
Cite
TBO Gamer 22 (2025). Telerik RadCaptcha Segmented Characters 40x50 size [Dataset]. https://www.kaggle.com/datasets/tbogamer22/telerik-radcaptcha-segmented-characters-40x50-size
Dataset updated
Jan 3, 2025
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
TBO Gamer 22
License
CC0: Public Domain (https://creativecommons.org/publicdomain/zero/1.0/)
Description
This dataset contains 101354 labeled, segmented Telerik RadCaptcha character images, each with a resolution of 40x50 pixels. The characters were segmented from 3000 manually labeled Telerik RadCaptcha images featuring 5-character alphanumeric strings, then resized to a fixed resolution of 40x50 pixels. To create a diverse dataset, image augmentation (various rotations and translations) was applied to the resized character images to increase generalization capability. The dataset is suited to CAPTCHA text recognition and optical character recognition (OCR). It is designed to support various deep learning and machine learning tasks, including text extraction and model robustness evaluation, and to help develop and fine-tune CAPTCHA recognition systems. The dataset is very similar to the MNIST and EMNIST datasets.