51 datasets found

mnist
tensorflow.org
universe.roboflow.com
+3more
Updated Jun 1, 2024
Cite
(2024). mnist [Dataset]. https://www.tensorflow.org/datasets/catalog/mnist
Explore at:
Dataset updated
Jun 1, 2024
Description
The MNIST database of handwritten digits.

To use this dataset:

import tensorflow_datasets as tfds

ds = tfds.load('mnist', split='train')
for ex in ds.take(4):
    print(ex)

See the guide for more information on tensorflow_datasets.

Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/mnist-3.0.1.png
MNIST Dataset
paperswithcode.com
Cite
Y. LeCun; L. Bottou; Y. Bengio; P. Haffner, MNIST Dataset [Dataset]. https://paperswithcode.com/dataset/mnist
Explore at:
Authors
Y. LeCun; L. Bottou; Y. Bengio; P. Haffner
Description
The MNIST database (Modified National Institute of Standards and Technology database) is a large collection of handwritten digits. It has a training set of 60,000 examples and a test set of 10,000 examples. It is a subset of the larger NIST Special Database 3 (digits written by employees of the United States Census Bureau) and Special Database 1 (digits written by high school students), which contain monochrome images of handwritten digits. The digits have been size-normalized and centered in a fixed-size image. The original black and white (bilevel) images from NIST were size-normalized to fit in a 20x20 pixel box while preserving their aspect ratio. The resulting images contain grey levels as a result of the anti-aliasing technique used by the normalization algorithm. The images were centered in a 28x28 image by computing the center of mass of the pixels and translating the image so as to position this point at the center of the 28x28 field.
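The centering step described above can be sketched with NumPy. This is a minimal illustration, not the original preprocessing code: the function name `center_by_mass`, the integer-pixel translation, and the placeholder input are all assumptions.

```python
import numpy as np


def center_by_mass(digit: np.ndarray, field: int = 28) -> np.ndarray:
    """Place `digit` in a field x field canvas so that its intensity
    center of mass lands as close as possible to the canvas center."""
    h, w = digit.shape
    total = digit.sum()
    if total == 0:
        raise ValueError("empty image has no center of mass")

    # Intensity-weighted mean row and column of the small image.
    rows, cols = np.indices(digit.shape)
    cy = (rows * digit).sum() / total
    cx = (cols * digit).sum() / total

    # Integer offset that moves the center of mass to the canvas center,
    # clipped so the digit stays fully inside the canvas.
    center = (field - 1) / 2.0
    top = int(np.clip(round(center - cy), 0, field - h))
    left = int(np.clip(round(center - cx), 0, field - w))

    canvas = np.zeros((field, field), dtype=digit.dtype)
    canvas[top:top + h, left:left + w] = digit
    return canvas


# Hypothetical 20x20 "digit": a bright blob that is off-center in its box.
digit = np.zeros((20, 20), dtype=np.float32)
digit[6:12, 8:14] = 1.0
centered = center_by_mass(digit)
```

The translation here is restricted to whole pixels, so the center of mass ends up within half a pixel of the field center.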
MNIST
datasets.activeloop.ai
deeplake
+ more versions
Cite
Yann LeCun, MNIST [Dataset]. https://datasets.activeloop.ai/docs/ml/datasets/mnist/
Explore at:
Available download formats: deeplake
Authors
Yann LeCun
License
Attribution-ShareAlike 3.0 (CC BY-SA 3.0), https://creativecommons.org/licenses/by-sa/3.0/
License information was derived automatically
Time period covered
Jan 1, 1998 - Dec 31, 2000
Area covered
Earth
Dataset funded by
AT&T Bell Labs
Description
The MNIST dataset is a dataset of handwritten digits. It is a popular dataset for machine learning and artificial intelligence research. The dataset consists of 60,000 training images and 10,000 test images. Each image is a 28x28 pixel grayscale image of a handwritten digit. The digits are labeled from 0 to 9.
Mnist 42000 Images Dataset
universe.roboflow.com
zip
Updated Apr 25, 2023
Cite
Roboflow (2023). Mnist 42000 Images Dataset [Dataset]. https://universe.roboflow.com/roboflow-jvuqo/mnist-42000-images-u0qdg
Explore at:
Available download formats: zip
Dataset updated
Apr 25, 2023
Dataset authored and provided by
Roboflow, https://roboflow.com/
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Variables measured
Numbers
Description
The MNIST database (Modified National Institute of Standards and Technology database) is a large database of handwritten digits that is commonly used for training various image processing systems. The database is also widely used for training and testing in the field of machine learning. It was created by "re-mixing" the samples from NIST's original datasets. The creators felt that since NIST's training dataset was taken from American Census Bureau employees, while the testing dataset was taken from American high school students, it was not well-suited for machine learning experiments. Furthermore, the black and white images from NIST were normalized to fit into a 28x28 pixel bounding box and anti-aliased, which introduced grayscale levels.

Yann LeCun, Courant Institute, NYU; Corinna Cortes, Google Labs, New York; Christopher J.C. Burges, Microsoft Research, Redmond
fashion_mnist
tensorflow.org
opendatalab.com
+3more
Updated Jun 1, 2024
Cite
(2024). fashion_mnist [Dataset]. https://www.tensorflow.org/datasets/catalog/fashion_mnist
Explore at:
Dataset updated
Jun 1, 2024
Description
Fashion-MNIST is a dataset of Zalando's article images consisting of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28x28 grayscale image, associated with a label from 10 classes.

To use this dataset:

import tensorflow_datasets as tfds

ds = tfds.load('fashion_mnist', split='train')
for ex in ds.take(4):
    print(ex)

See the guide for more information on tensorflow_datasets.

Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/fashion_mnist-3.0.1.png
MNIST
huggingface.co
Updated Mar 2, 2023
Cite
Graph Datasets (2023). MNIST [Dataset]. https://huggingface.co/datasets/graphs-datasets/MNIST
Explore at:
Available format: Croissant, a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
Dataset updated
Mar 2, 2023
Dataset authored and provided by
Graph Datasets
License
MIT License, https://opensource.org/licenses/MIT
License information was derived automatically
Description
Dataset Card for MNIST

Dataset Summary

The MNIST dataset consists of 55000 images in 10 classes, represented as graphs. It comes from a computer vision dataset.

Supported Tasks and Leaderboards

MNIST should be used for multiclass graph classification.

External Use PyGeometric

To load in PyGeometric, do the following:

from datasets import load_dataset
from torch_geometric.data import Data
from torch_geometric.loader import DataLoader
…

See the full description on the dataset page: https://huggingface.co/datasets/graphs-datasets/MNIST.
Extended MNIST (EMNIST) dataset
researchdata.edu.au
Updated May 16, 2023
+ more versions
Cite
van Schaik Andre; Tapson Jonathan; Afshar Saeed; Cohen Gregory (2023). Extended MNIST (EMNIST) dataset [Dataset]. http://doi.org/10.26183/ZN7S-GH79
Explore at:
Unique identifier
https://doi.org/10.26183/ZN7S-GH79
Dataset updated
May 16, 2023
Dataset provided by
Western Sydney University
Authors
van Schaik Andre; Tapson Jonathan; Afshar Saeed; Cohen Gregory
License
Attribution-NoDerivs 4.0 (CC BY-ND 4.0), https://creativecommons.org/licenses/by-nd/4.0/
License information was derived automatically
Description
The EMNIST dataset is a set of handwritten character digits derived from the NIST Special Database 19 (https://www.nist.gov/srd/nist-special-database-19) and converted to a 28x28 pixel image format and dataset structure that directly matches the MNIST dataset (http://yann.lecun.com/exdb/mnist/). Further information on the dataset contents and conversion process can be found in the paper available at https://arxiv.org/abs/1702.05373v2
The MNIST dataset has become a standard benchmark for learning, classification and computer vision systems. Contributing to its widespread adoption are the understandable and intuitive nature of the task, its relatively small size and storage requirements, and the accessibility and ease of use of the database itself. The MNIST database was derived from a larger dataset known as the NIST Special Database 19, which contains digits and uppercase and lowercase handwritten letters. This paper introduces a variant of the full NIST dataset, which we have called Extended MNIST (EMNIST), which follows the same conversion paradigm used to create the MNIST dataset. The result is a set of datasets that constitute more challenging classification tasks involving letters and digits, and that share the same image structure and parameters as the original MNIST task, allowing for direct compatibility with all existing classifiers and systems. Benchmark results are presented along with a validation of the conversion process through the comparison of the classification results on converted NIST digits and the MNIST digits.
The database is made available in original MNIST format and Matlab format.
Moving MNIST Dataset
paperswithcode.com
tensorflow.org
+2more
Updated Feb 7, 2021
Cite
Nitish Srivastava; Elman Mansimov; Ruslan Salakhutdinov (2021). Moving MNIST Dataset [Dataset]. https://paperswithcode.com/dataset/moving-mnist
Explore at:
Dataset updated
Feb 7, 2021
Authors
Nitish Srivastava; Elman Mansimov; Ruslan Salakhutdinov
Description
The Moving MNIST dataset contains 10,000 video sequences, each consisting of 20 frames. In each video sequence, two digits move independently around the frame, which has a spatial resolution of 64×64 pixels. The digits frequently intersect with each other and bounce off the edges of the frame.
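The motion described above can be sketched as follows. This is an illustrative generator, not the code used to build Moving MNIST: the function name `bounce_sequence`, the speed range, the reflection rule, and the pixelwise-maximum merge for overlapping digits are assumptions.

```python
import numpy as np


def bounce_sequence(digits, n_frames=20, size=64, seed=0):
    """Render `digits` (a list of 2-D arrays) moving independently in a
    size x size frame and reflecting off its edges."""
    rng = np.random.default_rng(seed)
    frames = np.zeros((n_frames, size, size), dtype=np.float32)

    for digit in digits:
        h, w = digit.shape
        limit = np.array([size - h, size - w], dtype=float)  # max top-left corner
        pos = rng.uniform(0, limit)        # random start position
        vel = rng.uniform(-4, 4, size=2)   # assumed speed range (pixels/frame)

        for t in range(n_frames):
            y, x = np.round(pos).astype(int)
            # Overlapping digits are merged with a pixelwise maximum.
            region = frames[t, y:y + h, x:x + w]
            np.maximum(region, digit, out=region)

            # Advance, then reflect position and velocity at the borders.
            pos += vel
            for axis in range(2):
                if pos[axis] < 0:
                    pos[axis] = -pos[axis]
                    vel[axis] = -vel[axis]
                elif pos[axis] > limit[axis]:
                    pos[axis] = 2 * limit[axis] - pos[axis]
                    vel[axis] = -vel[axis]
    return frames


# Two placeholder 28x28 "digits" (solid squares) stand in for MNIST images.
digits = [np.ones((28, 28), dtype=np.float32), np.ones((28, 28), dtype=np.float32)]
video = bounce_sequence(digits)
```

Replacing the placeholder squares with real 28x28 MNIST digits gives sequences of the shape described above: 20 frames of 64x64 pixels.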
3D MNIST
kaggle.com
Updated Oct 18, 2019
Cite
David de la Iglesia Castro (2019). 3D MNIST [Dataset]. https://www.kaggle.com/daavoo/3d-mnist/Kernels
Explore at:
Available format: Croissant, a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
Dataset updated
Oct 18, 2019
Dataset provided by
Kaggle, http://kaggle.com/
Authors
David de la Iglesia Castro
Description
Context

The aim of this dataset is to provide a simple way to get started with 3D computer vision problems such as 3D shape recognition.

Accurate 3D point clouds can (easily and cheaply) be acquired nowadays from different sources:

RGB-D devices: Google Tango, Microsoft Kinect, etc.

Lidar.

3D reconstruction from multiple images.

However, there is a lack of large 3D datasets (you can find a good one here based on triangular meshes); it's especially hard to find datasets based on point clouds (which are the raw output of every 3D sensing device).

This dataset contains 3D point clouds generated from the original images of the MNIST dataset, to give a familiar introduction to 3D to people used to working with 2D datasets (images).

In the 3D_from_2D notebook you can find the code used to generate the dataset.

You can use the code in the notebook to generate a bigger 3D dataset from the original.

Content

full_dataset_vectors.h5

The entire dataset stored as 4096-D vectors obtained from the voxelization (x:16, y:16, z:16) of all the 3D point clouds.

In addition to the original point clouds, it contains randomly rotated copies with noise.

The full dataset is split into arrays:

X_train (10000, 4096)

y_train (10000)

X_test (2000, 4096)

y_test (2000)

Example python code reading the full dataset:

import h5py

# The X_*/y_* arrays are stored in full_dataset_vectors.h5
with h5py.File("../input/full_dataset_vectors.h5", "r") as hf:
    X_train = hf["X_train"][:]
    y_train = hf["y_train"][:]
    X_test = hf["X_test"][:]
    y_test = hf["y_test"][:]
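The 4096-D vectors mentioned above correspond to a flattened 16x16x16 grid (16 * 16 * 16 = 4096). A minimal sketch of such a voxelization is shown here; it is not the dataset's own voxelgrid.py, and the function name `voxelize`, the per-voxel point counting, and the normalization to [0, 1] are assumptions.

```python
import numpy as np


def voxelize(points: np.ndarray, n: int = 16) -> np.ndarray:
    """Turn an (N, 3) point cloud into a flattened n*n*n occupancy vector."""
    mins = points.min(axis=0)
    spans = points.max(axis=0) - mins
    spans[spans == 0] = 1.0  # avoid division by zero on flat clouds

    # Map each coordinate to a voxel index in [0, n - 1].
    idx = np.floor((points - mins) / spans * n).astype(int)
    idx = np.clip(idx, 0, n - 1)

    # Count points per voxel, then normalize to [0, 1] (an assumed encoding).
    grid = np.zeros((n, n, n), dtype=np.float32)
    np.add.at(grid, (idx[:, 0], idx[:, 1], idx[:, 2]), 1.0)
    grid /= grid.max()
    return grid.ravel()


# Random points stand in for one digit's point cloud.
rng = np.random.default_rng(0)
cloud = rng.uniform(-0.5, 0.5, size=(2000, 3))
vector = voxelize(cloud)
```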

train_point_clouds.h5 & test_point_clouds.h5

5000 (train), and 1000 (test) 3D point clouds stored in HDF5 file format. The point clouds have zero mean and a maximum dimension range of 1.

Each file is divided into HDF5 groups

Each group is named as its corresponding array index in the original mnist dataset and it contains:

"points" dataset: x, y, z coordinates of each 3D point in the point cloud.

"normals" dataset: nx, ny, nz components of the unit normal associated with each point.

"img" dataset: the original mnist image.

"label" attribute: the original mnist label.

Example python code reading 2 digits and storing some of the group content in tuples:

import h5py

with h5py.File("../input/train_point_clouds.h5", "r") as hf:
    a = hf["0"]
    b = hf["1"]
    digit_a = (a["img"][:], a["points"][:], a.attrs["label"])
    digit_b = (b["img"][:], b["points"][:], b.attrs["label"])
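The point clouds are described as having zero mean and a maximum dimension range of 1. One way to bring a raw cloud into that form is sketched below. Reading "maximum dimension range" as the largest per-axis extent is an assumption, as is the function name `normalize_cloud`.

```python
import numpy as np


def normalize_cloud(points: np.ndarray) -> np.ndarray:
    """Center an (N, 3) point cloud at the origin and scale it uniformly so
    that its largest per-axis extent equals 1."""
    centered = points - points.mean(axis=0)
    extent = centered.max(axis=0) - centered.min(axis=0)
    return centered / extent.max()


# Random points stand in for the "points" dataset of one HDF5 group.
rng = np.random.default_rng(0)
raw = rng.normal(loc=5.0, scale=[3.0, 1.0, 0.5], size=(500, 3))
cloud = normalize_cloud(raw)
```

Uniform scaling keeps the shape's proportions intact, which per-axis scaling would not.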

voxelgrid.py

Simple Python class that generates a grid of voxels from the 3D point cloud. Check kernel for use.

plot3D.py

Module with functions to plot point clouds and voxel grids inside a Jupyter notebook. You have to run this locally because Kaggle notebooks do not support rendering iframes. See github issue here

Functions included:

array_to_color: converts a 1D array to RGB values, for use as the color kwarg in plot_points()

plot_points(xyz, colors=None, size=0.1, axis=False)

plot_voxelgrid(v_grid, cmap="Oranges", axis=False)

Acknowledgements

Website of the original MNIST dataset

Website of the 3D MNIST dataset

Have fun!
MPI-MNIST Dataset
zenodo.org
application/gzip, pdf
Updated Jan 14, 2025
Cite
Meira Iske; Hannes Albers; Tobias Kluth; Tobias Knopp (2025). MPI-MNIST Dataset [Dataset]. http://doi.org/10.5281/zenodo.12799417
Explore at:
Available download formats: application/gzip, pdf
Unique identifier
https://doi.org/10.5281/zenodo.12799417
Dataset updated
Jan 14, 2025
Dataset provided by
Zenodo, http://zenodo.org/
Authors
Meira Iske; Hannes Albers; Tobias Kluth; Tobias Knopp
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
A dataset for magnetic particle imaging based on the MNIST dataset.

This dataset contains simulated MPI measurements along with ground truth phantoms selected from the MNIST database of handwritten digits (https://yann.lecun.com/exdb/mnist/). A state-of-the-art model-based system matrix is used to simulate the MPI measurements of the MNIST phantoms. These measurements are equipped with noise perturbations captured by the preclinical MPI system (Bruker, Ettlingen, Germany). The dataset can be utilized in its provided form, while additional data is included to offer flexibility for creating customized versions.

MPI-MNIST features four different system matrices, each available in three spatial resolutions. The provided data is generated using a specified system matrix at highest spatial resolution. Reconstruction operations can be performed by using any of the provided system matrices at a lower resolution. This setup allows for simulating reconstructions from either an exact or an inexact forward operator. To cover further operator deviation setups, we provide additional noise data for the application of pixelwise noise to the reconstruction system matrix.

For supporting the development of learning-based methods, a large amount of further noise samples, captured by the Bruker scanner, is provided.

For a detailed description of the dataset, see arxiv.org/abs/2501.05583.

The Python-based GitHub repository available at https://github.com/meiraiske/MPI-MNIST can be used for downloading the data from this website and preparing it for project use, which includes an integration with PyTorch or PyTorch Lightning modules.

File Structure

All data, except for the phantoms, is provided in the MDF file format. This format is specifically tailored to store MPI data and contains metadata corresponding to the experimental setup. The ground truth phantoms are provided as HDF5 files since they do not require any metadata.

SM: Contains twelve system matrices named SM_{physical model}_{resolution}.mdf. It covers four physical models given in three resolutions ('coarse', 'int' and 'fine'). The highest resolution ('fine') is used for data generation.

large_noise: Contains large_NoiseMeas.mdf with 390060 noise measurements. Each noise measurement has been averaged over ten empty scanner measurements. This can be used e.g. for learning-based methods.

For dataset in ['train', 'test']:

{dataset}_noise: Contains four noise matrices, where each noise measurement has been averaged over ten empty scanner measurements:
1. NoiseMeas_phantom_{dataset}.mdf : Additive measurement noise for simulated measurements.
2. NoiseMeas_phantom_bg_{dataset}.mdf : Unused noise reserved for background correction of 1.
3. NoiseMeas_SM_{dataset}.mdf : System Matrix noise, that can be applied to each pixel of the reconstruction system matrix.
4. NoiseMeas_SM_bg_{dataset}.mdf : Unused noise reserved for background correction of 3.

{dataset}_gt: Contains {dataset}_gt.hdf5 with flattened and preprocessed ground truth MNIST phantoms given in coarse resolution (15x17=255 pixels) with pixel values in [0, 10].

{dataset}_obs: Contains {dataset}_obs.mdf with noise-free simulated measurements (observations) of {dataset}_gt.hdf5 using the system matrix stored in SM_fluid_opt_fine.mdf.

{dataset}_obsnoisy: Contains {dataset}_obsnoisy.mdf with noisy simulated measurements, resulting from {dataset}_obs.mdf and {dataset}_phantom_noise.mdf.

In line with MNIST, each MDF/HDF5 file in {dataset}_gt, {dataset}_obs, {dataset}_obsnoisy for dataset in ['train', 'test'] contains 60000 samples for 'train' and 10000 samples for 'test'. The data can be manually reproduced in the intermediate resolution (45x51=2295 pixels) from the files in this dataset by using the system matrices in intermediate ('int') resolution for reconstruction and upsampling the ground truth phantoms by 3 pixels per dimension. This case is also implemented in the GitHub repository.
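The upsampling of a coarse phantom (15x17 = 255 pixels) to the intermediate resolution (45x51 = 2295 pixels) can be sketched as below. This is not taken from the MPI-MNIST repository: nearest-neighbour repetition, the row-major layout of the flattened phantom, and the function name `upsample_phantom` are assumptions.

```python
import numpy as np

COARSE_SHAPE = (15, 17)   # 255 pixels, as stated above
FACTOR = 3                # 3 pixels per dimension -> 45 x 51 = 2295 pixels


def upsample_phantom(flat: np.ndarray, factor: int = FACTOR) -> np.ndarray:
    """Reshape a flattened coarse phantom and repeat each pixel `factor`
    times along both axes (nearest-neighbour upsampling)."""
    coarse = flat.reshape(COARSE_SHAPE)   # row-major layout is an assumption
    fine = np.repeat(np.repeat(coarse, factor, axis=0), factor, axis=1)
    return fine.ravel()


# Placeholder phantom with values in [0, 10], matching the stated range.
rng = np.random.default_rng(0)
phantom = rng.uniform(0, 10, size=255)
upsampled = upsample_phantom(phantom)
```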

The PDF file MPI-MNIST_Metadata.pdf contains a list of meta information for each of the MDF files of this dataset.
mnist100
huggingface.co
Updated Aug 16, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Marcin Wierzbiński (2023). mnist100 [Dataset]. https://huggingface.co/datasets/marcin119a/mnist100
Explore at:
Dataset updated
Aug 16, 2023
Authors
Marcin Wierzbiński
License
https://choosealicense.com/licenses/gpl/
Description
The MNIST-100 dataset is a variation of the original MNIST dataset, consisting of 100 handwritten numbers extracted from the MNIST dataset. Unlike the traditional MNIST dataset, which contains 60,000 training images of digits from 0 to 9, the MNIST-100 dataset focuses on 100 numbers.

Dataset Overview:

Dataset Name: MNIST-100
Total Number of Images: train: 60000, test: 1000
Classes: 100 (Numbers from 00 to 99)
Image Size: 28x56 pixels (grayscale)

Data Collection: The MNIST-100 dataset… See the full description on the dataset page: https://huggingface.co/datasets/marcin119a/mnist100.
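The stated image size (28x56) and class range (00 to 99) are consistent with two 28x28 digits placed side by side. The sketch below shows that construction. The data collection description is truncated above, so this is an assumption about how such images could be formed, not the author's documented procedure, and `make_pair` is a hypothetical helper.

```python
import numpy as np


def make_pair(left_img, left_label, right_img, right_label):
    """Join two 28x28 digit images side by side into one 28x56 image and
    combine their labels into a two-digit class in 0..99."""
    image = np.hstack([left_img, right_img])
    label = 10 * left_label + right_label
    return image, label


# Placeholder arrays stand in for two MNIST digits labelled 4 and 2.
four = np.full((28, 28), 4, dtype=np.uint8)
two = np.full((28, 28), 2, dtype=np.uint8)
image, label = make_pair(four, 4, two, 2)
```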
Hindi Characters MNIST
kaggle.com
Updated Sep 10, 2022
Cite
Bikram Saha (2022). Hindi Characters MNIST [Dataset]. https://www.kaggle.com/datasets/imbikramsaha/hindicharacters
Explore at:
Available format: Croissant, a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
Dataset updated
Sep 10, 2022
Dataset provided by
Kaggle, http://kaggle.com/
Authors
Bikram Saha
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
This is an original MNIST-style dataset of Hindi characters.

This dataset contains a total of 10.8k images in 36 categories; use 20% of the images for testing.

Categories Name: adna, ba, bha, cha, chha, chhya, da, daa, dha, dhaa, ga, gha, gya, ha, ja, jha, ka, kha, kna, la, ma, motosaw, na, pa, patalosaw, petchiryakha, pha, ra, taamatar, tabala, tha, thaa, tra, waw, yaw, yna
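The suggested 20% test split can be made per category so that every character stays represented in both sets. A minimal sketch follows, assuming integer category labels. The balanced 300-images-per-category label array and the helper name `stratified_split` are hypothetical.

```python
import numpy as np


def stratified_split(labels: np.ndarray, test_fraction: float = 0.2, seed: int = 0):
    """Return (train_idx, test_idx) so that each category contributes
    about `test_fraction` of its samples to the test set."""
    rng = np.random.default_rng(seed)
    train_idx, test_idx = [], []
    for category in np.unique(labels):
        members = rng.permutation(np.flatnonzero(labels == category))
        n_test = int(round(test_fraction * len(members)))
        test_idx.extend(members[:n_test])
        train_idx.extend(members[n_test:])
    return np.array(train_idx), np.array(test_idx)


# Hypothetical label array: 36 categories with 300 images each (10,800 total).
labels = np.repeat(np.arange(36), 300)
train_idx, test_idx = stratified_split(labels)
```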
emnist-mnist
huggingface.co
Updated Aug 5, 2024
+ more versions
Cite
Royc30ne (2024). emnist-mnist [Dataset]. https://huggingface.co/datasets/Royc30ne/emnist-mnist
Explore at:
Dataset updated
Aug 5, 2024
Authors
Royc30ne
License
MIT License, https://opensource.org/licenses/MIT
License information was derived automatically
Description
EMNIST MNIST Dataset

Authors

Gregory Cohen, Saeed Afshar, Jonathan Tapson, Andre van Schaik

The MARCS Institute for Brain, Behaviour and Development, Western Sydney University, Penrith, Australia 2751. Email: g.cohen@westernsydney.edu.au

What is it?

The EMNIST dataset is a set of handwritten character digits derived from the NIST Special Database 19 (NIST Special Database 19) and converted to a 28x28 pixel image format and dataset structure that directly… See the full description on the dataset page: https://huggingface.co/datasets/Royc30ne/emnist-mnist.
MNISTDigit
data.mendeley.com
Updated Apr 3, 2023
Cite
Arpit Rai (2023). MNISTDigit [Dataset]. http://doi.org/10.17632/k864xjhrw6.1
Explore at:
Unique identifier
https://doi.org/10.17632/k864xjhrw6.1
Dataset updated
Apr 3, 2023
Authors
Arpit Rai
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
A dataset containing images of handwritten English numerals from 0-9 obtained from the National Institute of Standards and Technology. It consists of greyscale images of handwritten digits: 60000 images of size 28*28 for training and 10000 images as test examples.
notMNIST
huggingface.co
Updated Dec 21, 2023
Cite
Anubhav Maity (2023). notMNIST [Dataset]. https://huggingface.co/datasets/anubhavmaity/notMNIST
Explore at:
Available format: Croissant, a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
Dataset updated
Dec 21, 2023
Authors
Anubhav Maity
Description
Dataset Card for "notMNIST"

Overview

The notMNIST dataset is a collection of images of letters from A to J in various fonts. It is designed as a more challenging alternative to the traditional MNIST dataset, which consists of handwritten digits. The notMNIST dataset is commonly used in machine learning and computer vision tasks for character recognition.

Dataset Information

Number of Classes: 10 (A to J)
Number of Samples: 187,24
Image Size: 28 x 28 pixels…

See the full description on the dataset page: https://huggingface.co/datasets/anubhavmaity/notMNIST.
Stroke Based MNIST Data
zenodo.org
zip
Updated Jan 24, 2020
Cite
John L. Singleton (2020). Stroke Based MNIST Data [Dataset]. http://doi.org/10.5281/zenodo.201035
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.201035
Dataset updated
Jan 24, 2020
Dataset provided by
Zenodo, http://zenodo.org/
Authors
John L. Singleton
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The following dataset contains the MNIST dataset in stroke/point form. The data in this repository was based on the data obtained from the following project: https://github.com/edwin-de-jong/mnist-digits-stroke-sequence-data
MNIST
huggingface.co
Updated Aug 24, 2024
Cite
MNIST [Dataset]. https://huggingface.co/datasets/p2pfl/MNIST
Explore at:
Available format: Croissant, a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
Dataset updated
Aug 24, 2024
Authors
P2PFL
License
MIT License, https://opensource.org/licenses/MIT
License information was derived automatically
Description
🖼️ MNIST (Extracted from PyTorch Vision)

MNIST is a classic dataset of handwritten digits, widely used for image classification tasks in machine learning.

ℹ️ Dataset Details 📖 Dataset Description

The MNIST database of handwritten digits is a commonly used benchmark dataset in machine learning. It consists of 70,000 grayscale images of handwritten digits (0-9), each with a size of 28x28 pixels. The dataset is split into 60,000 training images and 10,000… See the full description on the dataset page: https://huggingface.co/datasets/p2pfl/MNIST.
Downscaled MNIST data for quantum computing
pennylane.ai
Updated Mar 16, 2024
Cite
Joseph Bowles; Shahnawaz Ahmed; Maria Schuld (2024). Downscaled MNIST data for quantum computing [Dataset]. https://pennylane.ai/datasets/downscaled-mnist
Explore at:
Dataset updated
Mar 16, 2024
Authors
Joseph Bowles; Shahnawaz Ahmed; Maria Schuld
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0), https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Measurement technique
Simulation
Dataset funded by
Xanadu Quantum Technologies
Description
This dataset contains a simplified version of the famous MNIST handwritten digits dataset. This version involves distinguishing between digits 3 and 5 rather than the full range 0-9.
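A 3-versus-5 subset of this kind can be derived from full MNIST arrays by filtering labels and shrinking the images. The sketch below is illustrative only: average pooling, the 7x7 target size, the -1/+1 target encoding, and the function name `three_vs_five` are assumptions, not the procedure used for the published dataset.

```python
import numpy as np


def three_vs_five(images: np.ndarray, labels: np.ndarray, block: int = 4):
    """Keep only digits 3 and 5 and shrink each 28x28 image by averaging
    non-overlapping block x block patches."""
    keep = np.isin(labels, (3, 5))
    x, y = images[keep].astype(np.float32), labels[keep]

    # Average pooling: 28x28 -> (28 // block) x (28 // block).
    n, side = len(x), 28 // block
    x = x.reshape(n, side, block, side, block).mean(axis=(2, 4))

    # Binary targets: -1 for digit 3, +1 for digit 5 (an assumed convention).
    return x, np.where(y == 3, -1, 1)


# Random arrays stand in for real MNIST images and labels.
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(100, 28, 28), dtype=np.uint8)
labels = rng.integers(0, 10, size=100)
small, targets = three_vs_five(images, labels)
```

The `block` argument must divide 28 evenly for the reshape to succeed.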
EMNIST Dataset
paperswithcode.com
Cite
Gregory Cohen; Saeed Afshar; Jonathan Tapson; André van Schaik, EMNIST Dataset [Dataset]. https://paperswithcode.com/dataset/emnist
Explore at:
Authors
Gregory Cohen; Saeed Afshar; Jonathan Tapson; André van Schaik
Description
EMNIST (extended MNIST) has 4 times more data than MNIST. It is a set of handwritten digits with a 28 x 28 format.
MNIST dataset for Outliers Detection - [ MNIST4OD ]
figshare.com
application/gzip
Updated May 17, 2024
Cite
Giovanni Stilo; Bardh Prenkaj (2024). MNIST dataset for Outliers Detection - [ MNIST4OD ] [Dataset]. http://doi.org/10.6084/m9.figshare.9954986.v2
Explore at:
Available download formats: application/gzip
Unique identifier
https://doi.org/10.6084/m9.figshare.9954986.v2
Dataset updated
May 17, 2024
Dataset provided by
Figshare, http://figshare.com/
Authors
Giovanni Stilo; Bardh Prenkaj
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Here we present MNIST4OD, a dataset of large size (in both number of dimensions and number of instances) suitable for the outlier detection task. The dataset is based on the famous MNIST dataset (http://yann.lecun.com/exdb/mnist/).

We build MNIST4OD in the following way: to distinguish between outliers and inliers, we choose the images belonging to one digit as inliers (e.g. digit 1) and sample with uniform probability from the remaining images as outliers, such that their number is equal to 10% of that of the inliers. We repeat this dataset generation process for all digits. For implementation simplicity we then flatten the images (28 x 28) into vectors.

Each file MNIST_x.csv.gz contains the corresponding dataset where the inlier class is equal to x. The data contains one instance (vector) in each line, where the last column represents the outlier label (yes/no) of the data point. The data also contains a column which indicates the original image class (0-9).

Statistics of each dataset:

Name | Instances | Dimensions | Number of Outliers in %
MNIST_0 | 7594 | 784 | 10
MNIST_1 | 8665 | 784 | 10
MNIST_2 | 7689 | 784 | 10
MNIST_3 | 7856 | 784 | 10
MNIST_4 | 7507 | 784 | 10
MNIST_5 | 6945 | 784 | 10
MNIST_6 | 7564 | 784 | 10
MNIST_7 | 8023 | 784 | 10
MNIST_8 | 7508 | 784 | 10
MNIST_9 | 7654 | 784 | 10
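The construction described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' generation script: the function name `build_outlier_set`, sampling without replacement, and rounding the outlier count up are assumptions, and random arrays stand in for the MNIST images.

```python
import numpy as np


def build_outlier_set(images, labels, inlier_digit, outlier_ratio=0.10, seed=0):
    """Build one outlier-detection dataset: every image of `inlier_digit` is
    an inlier; outliers are drawn uniformly from all other images."""
    rng = np.random.default_rng(seed)
    inlier_idx = np.flatnonzero(labels == inlier_digit)
    other_idx = np.flatnonzero(labels != inlier_digit)

    # Outlier count is 10% of the inlier count; rounding up is an assumption.
    n_outliers = int(np.ceil(outlier_ratio * len(inlier_idx)))
    outlier_idx = rng.choice(other_idx, size=n_outliers, replace=False)

    idx = np.concatenate([inlier_idx, outlier_idx])
    features = images[idx].reshape(len(idx), -1)   # 28x28 -> 784
    original_class = labels[idx]
    is_outlier = np.concatenate([np.zeros(len(inlier_idx), dtype=bool),
                                 np.ones(n_outliers, dtype=bool)])
    return features, original_class, is_outlier


# Placeholder data: 100 random "images" per digit instead of real MNIST.
rng = np.random.default_rng(1)
images = rng.integers(0, 256, size=(1000, 28, 28), dtype=np.uint8)
labels = np.repeat(np.arange(10), 100)
features, original_class, is_outlier = build_outlier_set(images, labels, inlier_digit=1)
```

Calling the function once per digit from 0 to 9 would yield ten datasets, mirroring the ten MNIST_x files listed above.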

75 scholarly articles cite the TensorFlow mnist dataset (per Google Scholar).