16 datasets found

mnist
tensorflow.org
universe.roboflow.com
+4 more
Updated Jun 1, 2024
Cite
(2024). mnist [Dataset]. https://www.tensorflow.org/datasets/catalog/mnist
Description
The MNIST database of handwritten digits.

To use this dataset:

import tensorflow_datasets as tfds

ds = tfds.load('mnist', split='train')
for ex in ds.take(4):
    print(ex)

See the guide for more information on tensorflow_datasets.

Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/mnist-3.0.1.png
fashion_mnist
tensorflow.org
opendatalab.com
+3 more
Updated Jun 1, 2024
Cite
(2024). fashion_mnist [Dataset]. https://www.tensorflow.org/datasets/catalog/fashion_mnist
Description
Fashion-MNIST is a dataset of Zalando's article images consisting of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28x28 grayscale image, associated with a label from 10 classes.

To use this dataset:

import tensorflow_datasets as tfds

ds = tfds.load('fashion_mnist', split='train')
for ex in ds.take(4):
    print(ex)

See the guide for more information on tensorflow_datasets.

Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/fashion_mnist-3.0.1.png
moving_mnist
tensorflow.org
opendatalab.com
Updated Nov 23, 2022
Cite
(2022). moving_mnist [Dataset]. https://www.tensorflow.org/datasets/catalog/moving_mnist
Description
Moving variant of MNIST database of handwritten digits. This is the data used by the authors for reporting model performance. See tfds.video.moving_mnist.image_as_moving_sequence for generating training/validation data from the MNIST dataset.

To use this dataset:

import tensorflow_datasets as tfds

ds = tfds.load('moving_mnist', split='train')
for ex in ds.take(4):
    print(ex)

See the guide for more information on tensorflow_datasets.
Cifar10_resnet50_embeddings
kaggle.com
zip
Updated Nov 9, 2023
Cite
3Jlou 4eJluk (2023). Cifar10_resnet50_embeddings [Dataset]. https://www.kaggle.com/zjlou4ejluk/cifar10-resnet50-embeddings
Available download formats: zip (246408215 bytes)
Authors
3Jlou 4eJluk
Description
This dataset was created by 3Jlou 4eJluk.
kmnist
tensorflow.org
datasets.activeloop.ai
Updated Jun 1, 2024
Cite
(2024). kmnist [Dataset]. https://www.tensorflow.org/datasets/catalog/kmnist
Description
Kuzushiji-MNIST is a drop-in replacement for the MNIST dataset (28x28 grayscale, 70,000 images), provided in the original MNIST format as well as a NumPy format. Since MNIST restricts us to 10 classes, we chose one character to represent each of the 10 rows of Hiragana when creating Kuzushiji-MNIST.

To use this dataset:

import tensorflow_datasets as tfds

ds = tfds.load('kmnist', split='train')
for ex in ds.take(4):
    print(ex)

See the guide for more information on tensorflow_datasets.

Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/kmnist-3.0.1.png
train_model_tensorflow_mnist
kaggle.com
zip
Updated Mar 15, 2021
Cite
Murad Al Dahmashi (2021). train_model_tensorflow_mnist [Dataset]. https://www.kaggle.com/muradaldahmashi/train-model-tensorflow-mnist
Available download formats: zip (1165622 bytes)
Authors
Murad Al Dahmashi
Description
This dataset was created by Murad Al Dahmashi.
MNIST From Tensorflow Tutorial
kaggle.com
zip
Updated Nov 23, 2017
Cite
Arpan Dhatt (2017). MNIST From Tensorflow Tutorial [Dataset]. https://www.kaggle.com/arpandhatt/mnist-from-tensorflow-tutorial
Available download formats: zip (23155203 bytes)
Authors
Arpan Dhatt
Description
This dataset was created by Arpan Dhatt.
mnist for tf
kaggle.com
zip
Updated May 16, 2017
+ more versions
Cite
JerryWang (2017). mnist for tf [Dataset]. https://www.kaggle.com/miningjerry/mnist-for-tf
Available download formats: zip (33667734 bytes)
Authors
JerryWang
License
http://opendatacommons.org/licenses/dbcl/1.0/
Description
Context

There's a story behind every dataset and here's your opportunity to share yours.

Content

What's inside is more than just rows and columns. Make it easy for others to get started by describing how you acquired the data and what time period it represents, too.

Acknowledgements

We wouldn't be here without the help of others. If you owe any attributions or thanks, include them here along with any citations of past research.

Inspiration

Your data will be in front of the world's largest data science community. What questions do you want to see answered?
emnist
tensorflow.org
datasets.activeloop.ai
Updated Jun 1, 2024
Cite
(2024). emnist [Dataset]. https://www.tensorflow.org/datasets/catalog/emnist
Description
The EMNIST dataset is a set of handwritten character digits derived from the NIST Special Database 19 and converted to a 28x28 pixel image format and dataset structure that directly matches the MNIST dataset.

Note: Like the original EMNIST data, images provided here are inverted horizontally and rotated 90 degrees anti-clockwise. You can use tf.transpose within ds.map to convert the images to a human-friendlier format.

To use this dataset:

import tensorflow_datasets as tfds

ds = tfds.load('emnist', split='train')
for ex in ds.take(4):
    print(ex)

See the guide for more information on tensorflow_datasets.

Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/emnist-byclass-3.1.0.png
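The EMNIST note says the stored images are flipped horizontally and rotated 90 degrees anti-clockwise, and that a transpose makes them readable again. The pure-Python sketch below checks that claim on a tiny hypothetical 2x3 grid. The helper names are illustrative and are not part of tensorflow_datasets. In a TFDS pipeline the corresponding step would presumably be a ds.map call that applies tf.transpose with perm=[1, 0, 2] to each image, assuming the images have shape (28, 28, 1).

```python
def flip_horizontal(img):
    """Mirror each row left to right."""
    return [row[::-1] for row in img]


def rotate_90_anticlockwise(img):
    """Rotate a row-major grid by 90 degrees anti-clockwise."""
    width = len(img[0])
    # The rightmost column becomes the top row, and so on leftwards.
    return [[row[c] for row in img] for c in range(width - 1, -1, -1)]


def transpose(img):
    """Swap the row and column axes."""
    return [list(col) for col in zip(*img)]


# Hypothetical upright "image" (2 rows, 3 columns), not real EMNIST data.
upright = [[1, 2, 3],
           [4, 5, 6]]

# Orientation described in the note: flipped, then rotated.
stored = rotate_90_anticlockwise(flip_horizontal(upright))

# A single transpose restores the upright orientation.
restored = transpose(stored)
print(restored == upright)
```

The flip followed by the rotation is the same operation as a transpose, which is why applying one more transpose is enough to undo it.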
MNIST Restructured
kaggle.com
zip
Updated Nov 30, 2024
Cite
Jamal Uddin Tanvin (2024). MNIST Restructured [Dataset]. https://www.kaggle.com/datasets/jamaluddintanvin/mnist-reorganized
Available download formats: zip (29833637 bytes)
Authors
Jamal Uddin Tanvin
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
This dataset is a customized and restructured version of the well-known MNIST handwritten digit dataset by Yann LeCun, Corinna Cortes and Christopher J.C. Burges from THE MNIST DATABASE of handwritten digits. The adjustments are intended to improve usability and make integration into various machine learning workflows easier.

Key Features:

Restructured Image Files: Each digit image is saved as a .png file in separate directories for training and testing.

CSV Metadata: Includes train_labels.csv and test_labels.csv, mapping image filenames to their respective labels.

Improved Accessibility: Simplified folder structure for easier dataset exploration and model training.

Format: Images are grayscale (28x28 pixels), suitable for most deep learning frameworks (TensorFlow, PyTorch, etc.).

Usage:

This dataset is ideal for:
- Developing and testing classification models for handwritten digit recognition.
- Exploring custom preprocessing pipelines for digit datasets.
- Comparing model performance on a restructured MNIST dataset.
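As a minimal sketch of how the label CSV files could be paired with the .png files, the helper below reads one CSV and returns (image path, label) pairs. The column layout is an assumption: the description only says the CSVs map filenames to labels, so this sketch assumes a header row followed by rows of filename and label. The function name and the "train" directory name in the usage comment are hypothetical.

```python
import csv
from pathlib import Path


def load_labels(csv_path, image_dir):
    """Return (image_path, label) pairs listed in a labels CSV.

    Assumes a header row followed by rows of `filename,label`;
    check the real column layout before relying on this.
    """
    pairs = []
    with open(csv_path, newline="") as f:
        reader = csv.reader(f)
        next(reader, None)  # skip the assumed header row
        for row in reader:
            if len(row) < 2:
                continue  # ignore blank or malformed rows
            filename, label = row[0], row[1]
            pairs.append((Path(image_dir) / filename, int(label)))
    return pairs


# Hypothetical usage, assuming the training images sit in a "train" folder:
# train_pairs = load_labels("train_labels.csv", "train")
```

The resulting pairs can be handed to whichever image loader a framework provides, which keeps the file listing separate from the decoding step.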
notMNIST dataset
kaggle.com
zip
Updated Aug 22, 2017
Cite
lubaroli (2017). notMNIST dataset [Dataset]. https://www.kaggle.com/lubaroli/notmnist
Available download formats: zip (8460905 bytes)
Authors
lubaroli
Description
Context

This dataset was created by Yaroslav Bulatov by taking some publicly available fonts and extracting glyphs from them to make a dataset similar to MNIST. There are 10 classes, with letters A-J.

Content

A set of training and test images of letters from A to J in various typefaces. The image size is 28x28 pixels.

Acknowledgements

The dataset can be found on the TensorFlow GitHub page as well as on Yaroslav's blog.

Inspiration

This is a pretty good dataset to train classifiers! According to Yaroslav:

Judging by the examples, one would expect this to be a harder task than MNIST. This seems to be the case -- logistic regression on top of stacked auto-encoder with fine-tuning gets about 89% accuracy whereas same approach gives got 98% on MNIST. Dataset consists of small hand-cleaned part, about 19k instances, and large uncleaned dataset, 500k instances. Two parts have approximately 0.5% and 6.5% label error rate. I got this by looking through glyphs and counting how often my guess of the letter didn't match it's unicode value in the font file.

Enjoy!
MNIST_pytorch_tensorflow
kaggle.com
zip
Updated Apr 17, 2022
Cite
DanielTxz (2022). MNIST_pytorch_tensorflow [Dataset]. https://www.kaggle.com/datasets/danieltxz/mnist-pytorch-tensorflow
Available download formats: zip (13313 bytes)
Authors
DanielTxz
Description
This dataset was created by DanielTxz.
cats_vs_dogs
tensorflow.org
universe.roboflow.com
+1 more
Updated Dec 19, 2023
Cite
(2023). cats_vs_dogs [Dataset]. https://www.tensorflow.org/datasets/catalog/cats_vs_dogs
Description
A large set of images of cats and dogs. There are 1738 corrupted images that are dropped.

To use this dataset:

import tensorflow_datasets as tfds

ds = tfds.load('cats_vs_dogs', split='train')
for ex in ds.take(4):
    print(ex)

See the guide for more information on tensorflow_datasets.

Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/cats_vs_dogs-4.0.1.png
MIEDT dataset
kaggle.com
Updated Jan 12, 2025
Cite
机关鸢鸟 (2025). MIEDT dataset [Dataset]. https://www.kaggle.com/datasets/lidang78/miedt-dataset
Available formats: Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
Dataset provided by
Kaggle: http://kaggle.com/
Authors
机关鸢鸟
Description
Dataset Overview

This dataset is organized around the edge detection task. It provides images and the corresponding edge annotations for related research and applications, and can be used to test edge detection algorithms. To evaluate the performance of edge detection methods comprehensively, we created the Medical Image Edge Detection Test (MIEDT) dataset. The MIEDT contains 100 medical images, randomly selected from three publicly available datasets: Head CT-hemorrhage, Coronary Artery Diseases DataSet, and Skin Cancer MNIST: HAM10000.

Data Set Structure

Original image: This folder stores the original image data. It contains 15 head CT images in PNG format with varying resolutions, 25 coronary heart disease images in JPG format with a resolution of 1024 x 1024, and 60 skin images in JPG format with a resolution of 600 x 450. It covers a variety of medical images with different imaging and contrast, providing diverse input data for edge detection algorithms.

Ground truth: This folder holds the edge annotation images corresponding to the images in the "Original image" folder. They are in PNG format. White pixels mark the edges and black pixels mark the non-edge areas. The annotations outline the object contours and edge features in the original images.

Usage Instructions

Users who process images in Python can read the image data with the cv2 (OpenCV) library. The sample code is as follows:

import cv2

original_image = cv2.imread('Original image/IMG-001.png')  # Read original image
ground_truth_image = cv2.imread('Ground truth/GT-001.png', cv2.IMREAD_GRAYSCALE)  # Read the corresponding Ground Truth image

When training models with deep learning frameworks (such as TensorFlow or PyTorch), configure the dataset path in the framework's dataset loading class, following that framework's data loading mechanism, so that the model can correctly read and process the images and their annotations.
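Since the ground truth marks edges in white and non-edges in black, one simple way to test an edge detector against it is a pixel-wise comparison. The sketch below computes precision, recall and F1 for two same-sized edge maps. It is an assumption, not the dataset authors' stated protocol: the function name, the threshold of 127 and the strict pixel-to-pixel matching (no distance tolerance) are all illustrative choices, and the tiny grids are hypothetical data.

```python
def edge_scores(predicted, ground_truth, threshold=127):
    """Pixel-wise precision, recall and F1 for two same-sized edge maps.

    Pixels brighter than `threshold` count as edges (white = edge).
    """
    tp = fp = fn = 0
    for pred_row, gt_row in zip(predicted, ground_truth):
        for p, g in zip(pred_row, gt_row):
            pred_edge, gt_edge = p > threshold, g > threshold
            if pred_edge and gt_edge:
                tp += 1  # edge found where the annotation has one
            elif pred_edge:
                fp += 1  # edge reported on a non-edge pixel
            elif gt_edge:
                fn += 1  # annotated edge that was missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Tiny hypothetical 2x3 maps; real inputs would be grayscale images
# read with cv2.imread(..., cv2.IMREAD_GRAYSCALE).
ground_truth = [[255, 0, 0],
                [255, 255, 0]]
predicted = [[255, 255, 0],
             [0, 255, 0]]
print(edge_scores(predicted, ground_truth))
```

Strict pixel matching penalizes edges that are off by a single pixel, so a tolerance-based comparison may be more appropriate for thin edge maps.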

Data Sources and References

Data Sources: The original images are collected from the public image datasets Head CT-hemorrhage, Coronary Artery Diseases DataSet, and Skin Cancer MNIST: HAM10000 to ensure the quality and diversity of the images. If you use this dataset in academic research, please cite the following literature.

References:

[1] Noel Codella, Veronica Rotemberg, Philipp Tschandl, M. Emre Celebi, Stephen Dusza, David Gutman, Brian Helba, Aadi Kalloo, Konstantinos Liopyris, Michael Marchetti, Harald Kittler, Allan Halpern: “Skin Lesion Analysis Toward Melanoma Detection 2018: A Challenge Hosted by the International Skin Imaging Collaboration (ISIC)”, 2018; https://arxiv.org/abs/1902.03368

[2] Tschandl, P., Rosendahl, C. & Kittler, H. The HAM10000 dataset, a large collection of multi-source dermatoscopic images of common pigmented skin lesions. Sci. Data 5, 180161 doi:10.1038/sdata.2018.161 (2018).

[3] Classification of Brain Hemorrhage Using Deep Learning from CT Scan Images - https://link.springer.com/chapter/10.1007/978-981-19-7528-8_15
PHCD - Polish Handwritten Characters Database
kaggle.com
zip
Updated Dec 30, 2023
Cite
Wiktor Flis (2023). PHCD - Polish Handwritten Characters Database [Dataset]. https://www.kaggle.com/datasets/westedcrean/phcd-polish-handwritten-characters-database/versions/3
Available download formats: zip (250262763 bytes)
Authors
Wiktor Flis
Description
The process for collecting this dataset was documented in the paper "Development of Extensive Polish Handwritten Characters Database for Text Recognition Research" (https://doi.org/10.12913/22998624/122567) by Mikhail Tokovarov, dr Monika Kaczorowska and dr Marek Miłosz. Link to download the original dataset: https://cs.pollub.pl/phcd/. The source fileset also contains a dataset of raw images of whole sentences written in Polish.

Context

PHCD (Polish Handwritten Characters Database) is a collection of handwritten texts in Polish. It was created by researchers at Lublin University of Technology for the purpose of offline handwritten text recognition. The database contains more than 530 000 images of handwritten characters. Each image is a 32x32 pixel grayscale image representing one of 89 classes (10 digits, 26 lowercase Latin letters, 26 uppercase Latin letters, 9 lowercase Polish letters, 9 uppercase Polish letters and 9 special characters), with around 6 000 examples per class.

How to use

This notebook contains a PyTorch example of how to load the dataset from .npz files and train a CNN model. You can also use the dataset with other frameworks, such as TensorFlow, Keras, etc.

For .npz files, use the numpy.load method.

Contents

The dataset contains the following:

dataset.npz - a file with two compressed numpy arrays:

"signs" - with all the images, sized 32 x 32 (grayscale)

"labels" - with all the labels (0-88) for examples from signs

label_mapping.csv - a csv file with columns label and char, mapping from ids to characters from dataset

images - folder with original 530 000 png images, sized 32 x 32, to use with other loading techniques
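The files listed above can be read with a few lines of Python. This is a minimal sketch rather than the notebook's code: the array names "signs" and "labels" and the CSV columns label and char come from the description, while the helper names, the UTF-8 encoding and the file paths in the usage comment are assumptions.

```python
import csv


def load_label_mapping(csv_path):
    """Map integer label ids to characters using the `label` and `char` columns."""
    # UTF-8 is assumed because the characters include Polish letters.
    with open(csv_path, newline="", encoding="utf-8") as f:
        return {int(row["label"]): row["char"] for row in csv.DictReader(f)}


def load_phcd(npz_path):
    """Return the `signs` image array and the `labels` array from dataset.npz."""
    import numpy as np  # local import: only this helper needs NumPy

    with np.load(npz_path) as data:
        return data["signs"], data["labels"]


# Hypothetical usage, assuming both files are in the working directory:
# images, labels = load_phcd("dataset.npz")
# mapping = load_label_mapping("label_mapping.csv")
# print(images.shape, mapping[int(labels[0])])
```

Keeping the mapping in a plain dictionary makes it easy to turn a predicted class id back into the character it stands for.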

Acknowledgements

I want to express my gratitude to the following people: Dr. Edyta Łukasik for introducing me to this dataset and to authors of this dataset - Mikhail Tokovarov, dr. Monika Kaczorowska and dr. Marek Miłosz from Lublin University of Technology in Poland.

Inspiration

You can use this data the same way you used MNIST, KMNIST or Fashion MNIST: refine your image classification skills, and use GPU and TPU to implement CNN architectures for models that perform such multiclass classification.
Dice Images
kaggle.com
zip
Updated Jan 9, 2022
Cite
Yash Srivastava (2022). Dice Images [Dataset]. https://www.kaggle.com/datasets/yashsrivastava51213/dice-images
Available download formats: zip (1317193 bytes)
Authors
Yash Srivastava
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Context

There is no story behind this dataset, I just felt that I should also have a dataset 😬 .

About the Dataset.

The dataset contains top view of dice digits which can be used as an alternative to the MNIST dataset for digit recognition, a benchmark dataset for classification.

The dataset currently contains only 120 images. Attempts to augment the data have already been made through the TensorFlow data augmentation pipeline, which increased the dataset to about 1600 images (with random rotations and crops, among other operations).

Image Type and Nomenclature

For the small dataset that we have here, the images were made from just two dice. The images of the dice are resized to be similar to that of the MNIST dataset for testing results on the already present models.

The images currently in the dataset are named as follows: {number}_{color of the dice**}_{transform angle}_{transformation direction*}
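The naming scheme above can be unpacked with a small helper. This is a sketch under assumptions: the underscore separator is taken from the pattern itself, but the file extension, the example names and the function name are hypothetical, and the angle is kept as text because its format is not documented.

```python
from pathlib import Path


def parse_dice_filename(filename):
    """Split a name such as '3_red_90_left.png' into its documented fields."""
    parts = Path(filename).stem.split("_")
    if len(parts) not in (3, 4):
        raise ValueError(f"unexpected filename layout: {filename!r}")
    return {
        "number": int(parts[0]),
        "color": parts[1],
        "angle": parts[2],  # kept as text; the angle format is not documented
        # The direction is only present when it is necessary.
        "direction": parts[3] if len(parts) == 4 else None,
    }


# Hypothetical file names, not taken from the dataset:
print(parse_dice_filename("3_red_90_left.png"))
print(parse_dice_filename("6_white_0.png"))
```

Because the number comes first in the name, the same helper can supply the class label when building a training set from the image folder.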

Expectation

My aim is that the dataset should be big enough so as to not cause overfitting. The dataset should also be diverse enough so that the model for which it is used is accurate.

Although augmenting the dataset is one way to increase its size, original images are preferred because they vary in many ways that I might have neglected in my analysis.

*if the direction is necessary, it is mentioned
** Although the images are converted to grayscale, the color of the dice might be a feature that is required for some other analysis.

Acknowledgements

There is no one particularly that comes to mind, because each and every picture in this small dataset was manually edited by me, although I would like to help

Inspiration

The question that I have is whether this dataset can be used for image classification. My take on this problem: GitHub Implementation
mnist: 89 scholarly articles cite this dataset.
