16 datasets found
  1. mnist

    • tensorflow.org
    • universe.roboflow.com
    • +4 more
    Updated Jun 1, 2024
    Cite
    (2024). mnist [Dataset]. https://www.tensorflow.org/datasets/catalog/mnist
    Description

    The MNIST database of handwritten digits.

    To use this dataset:

    import tensorflow_datasets as tfds

    # Load the training split and print the first four examples.
    ds = tfds.load('mnist', split='train')
    for ex in ds.take(4):
        print(ex)

    See the guide for more information on tensorflow_datasets.

    Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/mnist-3.0.1.png

  2. fashion_mnist

    • tensorflow.org
    • opendatalab.com
    • +3 more
    Updated Jun 1, 2024
    Cite
    (2024). fashion_mnist [Dataset]. https://www.tensorflow.org/datasets/catalog/fashion_mnist
    Description

    Fashion-MNIST is a dataset of Zalando's article images consisting of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28x28 grayscale image, associated with a label from 10 classes.

    To use this dataset:

    import tensorflow_datasets as tfds

    # Load the training split and print the first four examples.
    ds = tfds.load('fashion_mnist', split='train')
    for ex in ds.take(4):
        print(ex)

    See the guide for more information on tensorflow_datasets.

    Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/fashion_mnist-3.0.1.png

  3. moving_mnist

    • tensorflow.org
    • opendatalab.com
    Updated Nov 23, 2022
    Cite
    (2022). moving_mnist [Dataset]. https://www.tensorflow.org/datasets/catalog/moving_mnist
    Description

    Moving variant of MNIST database of handwritten digits. This is the data used by the authors for reporting model performance. See tfds.video.moving_mnist.image_as_moving_sequence for generating training/validation data from the MNIST dataset.

    To use this dataset:

    import tensorflow_datasets as tfds

    # Load the training split and print the first four examples.
    ds = tfds.load('moving_mnist', split='train')
    for ex in ds.take(4):
        print(ex)

    See the guide for more information on tensorflow_datasets.

  4. Cifar10_resnet50_embeddings

    • kaggle.com
    zip
    Updated Nov 9, 2023
    Cite
    3Jlou 4eJluk (2023). Cifar10_resnet50_embeddings [Dataset]. https://www.kaggle.com/zjlou4ejluk/cifar10-resnet50-embeddings
    Available download formats: zip (246408215 bytes)
    Authors
    3Jlou 4eJluk
    Description

    Dataset

    This dataset was created by 3Jlou 4eJluk

    Contents

  5. kmnist

    • tensorflow.org
    • datasets.activeloop.ai
    Updated Jun 1, 2024
    Cite
    (2024). kmnist [Dataset]. https://www.tensorflow.org/datasets/catalog/kmnist
    Description

    Kuzushiji-MNIST is a drop-in replacement for the MNIST dataset (28x28 grayscale, 70,000 images), provided in the original MNIST format as well as a NumPy format. Since MNIST restricts us to 10 classes, we chose one character to represent each of the 10 rows of Hiragana when creating Kuzushiji-MNIST.

    To use this dataset:

    import tensorflow_datasets as tfds

    # Load the training split and print the first four examples.
    ds = tfds.load('kmnist', split='train')
    for ex in ds.take(4):
        print(ex)

    See the guide for more information on tensorflow_datasets.

    Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/kmnist-3.0.1.png

  6. train_model_tensorflow_mnist

    • kaggle.com
    zip
    Updated Mar 15, 2021
    Cite
    Murad Al Dahmashi (2021). train_model_tensorflow_mnist [Dataset]. https://www.kaggle.com/muradaldahmashi/train-model-tensorflow-mnist
    Available download formats: zip (1165622 bytes)
    Authors
    Murad Al Dahmashi
    Description

    Dataset

    This dataset was created by Murad Al Dahmashi

    Contents

  7. MNIST From Tensorflow Tutorial

    • kaggle.com
    zip
    Updated Nov 23, 2017
    Cite
    Arpan Dhatt (2017). MNIST From Tensorflow Tutorial [Dataset]. https://www.kaggle.com/arpandhatt/mnist-from-tensorflow-tutorial
    Available download formats: zip (23155203 bytes)
    Authors
    Arpan Dhatt
    Description

    Dataset

    This dataset was created by Arpan Dhatt

    Contents

  8. mnist for tf

    • kaggle.com
    zip
    Updated May 16, 2017
    + more versions
    Cite
    JerryWang (2017). mnist for tf [Dataset]. https://www.kaggle.com/miningjerry/mnist-for-tf
    Available download formats: zip (33667734 bytes)
    Authors
    JerryWang
    License

    http://opendatacommons.org/licenses/dbcl/1.0/


  9. emnist

    • tensorflow.org
    • datasets.activeloop.ai
    Updated Jun 1, 2024
    Cite
    (2024). emnist [Dataset]. https://www.tensorflow.org/datasets/catalog/emnist
    Description

    The EMNIST dataset is a set of handwritten character digits derived from the NIST Special Database 19 and converted to a 28x28 pixel image format and dataset structure that directly matches the MNIST dataset.

    Note: Like the original EMNIST data, the images provided here are flipped horizontally and rotated 90° anti-clockwise. You can use tf.transpose within ds.map to convert the images to a more human-friendly orientation.
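The note above can be checked on a toy example. The following is a minimal sketch in plain Python (nested lists, no TensorFlow), assuming the flip is applied first and the rotation second; under that assumption the stored image is exactly the transpose of the upright one, which is why a transpose restores it. The 3x3 "image" and the helper names are illustrative only.

```python
# Sketch: why a transpose restores EMNIST-style images (assumes flip first, then rotation).

def flip_horizontal(img):
    """Mirror each row left-to-right."""
    return [row[::-1] for row in img]

def rotate_90_anticlockwise(img):
    """Rotate 90 degrees anti-clockwise: the last column becomes the first row."""
    return [list(col) for col in zip(*img)][::-1]

def transpose(img):
    """Swap rows and columns of a 2-D image."""
    return [list(col) for col in zip(*img)]

# Hypothetical 3x3 "image" standing in for a 28x28 character.
upright = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]

stored = rotate_90_anticlockwise(flip_horizontal(upright))  # orientation as described in the note
restored = transpose(stored)                                # transposing again undoes it
print(restored == upright)  # True
```

When applying this to real image tensors with a trailing channel axis, the transpose would need to swap only the two spatial axes, for example with tf.transpose(image, perm=[1, 0, 2]) for a (height, width, channels) tensor.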

    To use this dataset:

    import tensorflow_datasets as tfds

    # Load the training split and print the first four examples.
    ds = tfds.load('emnist', split='train')
    for ex in ds.take(4):
        print(ex)

    See the guide for more information on tensorflow_datasets.

    Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/emnist-byclass-3.1.0.png

  10. MNIST Restructured

    • kaggle.com
    zip
    Updated Nov 30, 2024
    Cite
    Jamal Uddin Tanvin (2024). MNIST Restructured [Dataset]. https://www.kaggle.com/datasets/jamaluddintanvin/mnist-reorganized
    Available download formats: zip (29833637 bytes)
    Authors
    Jamal Uddin Tanvin
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    This dataset is a customized and restructured version of the well-known MNIST handwritten digit dataset by Yann LeCun, Corinna Cortes and Christopher J.C. Burges (THE MNIST DATABASE of handwritten digits). The adjustments are intended to improve usability and ease integration into various machine learning workflows.

    Key Features:

    Restructured Image Files: Each digit image is saved as a .png file in separate directories for training and testing.

    CSV Metadata: Includes train_labels.csv and test_labels.csv, mapping image filenames to their respective labels.

    Improved Accessibility: Simplified folder structure for easier dataset exploration and model training.

    Format: Images are grayscale (28x28 pixels), suitable for most deep learning frameworks (TensorFlow, PyTorch, etc.).
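As a rough illustration of the CSV metadata described above, the sketch below reads a filename-to-label mapping with the standard csv module. The column names `filename` and `label` and the sample rows are assumptions made for this example; check the actual header of train_labels.csv before relying on them.

```python
import csv
import io

# Hypothetical excerpt of train_labels.csv; the real header names and rows may differ.
SAMPLE_CSV = """filename,label
00001.png,5
00002.png,0
00003.png,4
"""

def read_labels(csv_file):
    """Return a dict mapping image filename -> integer label."""
    reader = csv.DictReader(csv_file)
    return {row["filename"]: int(row["label"]) for row in reader}

labels = read_labels(io.StringIO(SAMPLE_CSV))
print(labels["00002.png"])  # 0
```

For the real file, the same function could be called on an open file handle, for example `open("train_labels.csv", newline="")`.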

    Usage:

    This dataset is ideal for:

    • Developing and testing classification models for handwritten digit recognition.
    • Exploring custom preprocessing pipelines for digit datasets.
    • Comparing model performance on a restructured MNIST dataset.

  11. notMNIST dataset

    • kaggle.com
    zip
    Updated Aug 22, 2017
    Cite
    lubaroli (2017). notMNIST dataset [Dataset]. https://www.kaggle.com/lubaroli/notmnist
    Available download formats: zip (8460905 bytes)
    Authors
    lubaroli
    Description

    Context

    This dataset was created by Yaroslav Bulatov by taking some publicly available fonts and extracting glyphs from them to make a dataset similar to MNIST. There are 10 classes, with letters A-J.

    Content

    A set of training and test images of the letters A to J in various typefaces. The image size is 28x28 pixels.

    Acknowledgements

    The dataset can be found on the TensorFlow GitHub page as well as on Yaroslav's blog.

    Inspiration

    This is a pretty good dataset to train classifiers! According to Yaroslav:

    Judging by the examples, one would expect this to be a harder task than MNIST. This seems to be the case -- logistic regression on top of stacked auto-encoder with fine-tuning gets about 89% accuracy whereas same approach gives 98% on MNIST. Dataset consists of small hand-cleaned part, about 19k instances, and large uncleaned dataset, 500k instances. Two parts have approximately 0.5% and 6.5% label error rate. I got this by looking through glyphs and counting how often my guess of the letter didn't match its unicode value in the font file.

    Enjoy!

  12. MNIST_pytorch_tensorflow

    • kaggle.com
    zip
    Updated Apr 17, 2022
    Cite
    DanielTxz (2022). MNIST_pytorch_tensorflow [Dataset]. https://www.kaggle.com/datasets/danieltxz/mnist-pytorch-tensorflow
    Available download formats: zip (13313 bytes)
    Authors
    DanielTxz
    Description

    Dataset

    This dataset was created by DanielTxz

    Contents

  13. cats_vs_dogs

    • tensorflow.org
    • universe.roboflow.com
    • +1 more
    Updated Dec 19, 2023
    Cite
    (2023). cats_vs_dogs [Dataset]. https://www.tensorflow.org/datasets/catalog/cats_vs_dogs
    Description

    A large set of images of cats and dogs. There are 1738 corrupted images that are dropped.

    To use this dataset:

    import tensorflow_datasets as tfds

    # Load the training split and print the first four examples.
    ds = tfds.load('cats_vs_dogs', split='train')
    for ex in ds.take(4):
        print(ex)

    See the guide for more information on tensorflow_datasets.

    Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/cats_vs_dogs-4.0.1.png

  14. MIEDT dataset

    • kaggle.com
    Updated Jan 12, 2025
    Cite
    机关鸢鸟 (2025). MIEDT dataset [Dataset]. https://www.kaggle.com/datasets/lidang78/miedt-dataset
    Available formats: Croissant, a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    机关鸢鸟
    Description
    1. Dataset Overview

    This dataset is organized around the edge detection task. It provides image resources and corresponding edge detection annotations for related research and applications, and can be used to test edge detection algorithms. To evaluate the performance of edge detection methods comprehensively, we created the Medical Image Edge Detection Test (MIEDT) dataset. The MIEDT contains 100 medical images, randomly selected from three publicly available datasets: Head CT-hemorrhage, Coronary Artery Diseases DataSet, and Skin Cancer MNIST: HAM10000.

    2. Data Set Structure

    Original image: This folder stores the original image data. It contains 15 head CT images in PNG format with varying resolutions, 25 coronary heart disease images in JPG format with a resolution of 1024 x 1024, and 60 skin images in JPG format with a resolution of 600 x 450. It covers a variety of medical images with different imaging and contrast, providing diverse input data for edge detection algorithms.

    Ground truth: This folder contains the edge detection annotation images corresponding to the images in the "Originals" folder. They are in PNG format. White pixels represent edges and black pixels represent non-edge areas. The annotations accurately outline the object contours and edge features in the original images.

    3. Usage Instructions

    Users who process images in Python can read the image data with the cv2 (OpenCV) library. The sample code is as follows:

    import cv2

    # Read the original image
    original_image = cv2.imread('Original image/IMG-001.png')
    # Read the corresponding ground truth image
    ground_truth_image = cv2.imread('Ground truth/GT-001.png', cv2.IMREAD_GRAYSCALE)

    When training models with deep learning frameworks (such as TensorFlow or PyTorch), configure the dataset path in the framework's dataset loading class so that the model can correctly read and process the images and their annotations.
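To pair each original image with its annotation when building a data loader, a small path helper may be useful. The sketch below assumes that every file follows the naming seen in the sample code (IMG-NNN in "Original image", GT-NNN.png in "Ground truth"); that pattern is inferred from the single example above and is not confirmed for the whole dataset. The function name is hypothetical.

```python
from pathlib import PurePosixPath

# Folder name taken from the sample code above.
GROUND_TRUTH_DIR = PurePosixPath("Ground truth")

def ground_truth_path(original_path):
    """Map 'Original image/IMG-001.<ext>' to 'Ground truth/GT-001.png' (assumed naming)."""
    stem = PurePosixPath(original_path).stem  # e.g. 'IMG-001'
    prefix, _, number = stem.partition("-")
    if prefix != "IMG" or not number:
        raise ValueError(f"unexpected file name: {original_path}")
    # Annotations are described as PNG, whatever the format of the original image.
    return GROUND_TRUTH_DIR / f"GT-{number}.png"

print(ground_truth_path("Original image/IMG-001.png"))  # Ground truth/GT-001.png
```

The resulting pair of paths can then be passed to cv2.imread as in the sample code.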

    4. Data Sources and References

    Data Sources: The original images are collected from the public image datasets Head CT-hemorrhage, Coronary Artery Diseases DataSet, and Skin Cancer MNIST: HAM10000 to ensure the quality and diversity of the images. If you use this dataset in academic research, please cite the following literature.

    References: [1] Noel Codella, Veronica Rotemberg, Philipp Tschandl, M. Emre Celebi, Stephen Dusza, David Gutman, Brian Helba, Aadi Kalloo, Konstantinos Liopyris, Michael Marchetti, Harald Kittler, Allan Halpern: “Skin Lesion Analysis Toward Melanoma Detection 2018: A Challenge Hosted by the International Skin Imaging Collaboration (ISIC)”, 2018; https://arxiv.org/abs/1902.03368

    [2] Tschandl, P., Rosendahl, C. & Kittler, H. The HAM10000 dataset, a large collection of multi-source dermatoscopic images of common pigmented skin lesions. Sci. Data 5, 180161 doi:10.1038/sdata.2018.161 (2018).

    [3] Classification of Brain Hemorrhage Using Deep Learning from CT Scan Images - https://link.springer.com/chapter/10.1007/978-981-19-7528-8_15

  15. PHCD - Polish Handwritten Characters Database

    • kaggle.com
    zip
    Updated Dec 30, 2023
    Cite
    Wiktor Flis (2023). PHCD - Polish Handwritten Characters Database [Dataset]. https://www.kaggle.com/datasets/westedcrean/phcd-polish-handwritten-characters-database/versions/3
    Available download formats: zip (250262763 bytes)
    Authors
    Wiktor Flis
    Description

    The process for collecting this dataset was documented in the paper "Development of Extensive Polish Handwritten Characters Database for Text Recognition Research" (https://doi.org/10.12913/22998624/122567) by Mikhail Tokovarov, dr Monika Kaczorowska and dr Marek Miłosz. Link to download the original dataset: https://cs.pollub.pl/phcd/. The source fileset also contains a dataset of raw images of whole sentences written in Polish.

    Context

    PHCD (Polish Handwritten Characters Database) is a collection of handwritten texts in Polish. It was created by researchers at Lublin University of Technology for the purpose of offline handwritten text recognition. The database contains more than 530 000 images of handwritten characters. Each image is a 32x32 pixel grayscale image representing one of 89 classes (10 digits, 26 lowercase Latin letters, 26 uppercase Latin letters, 9 lowercase Polish letters, 9 uppercase Polish letters and 9 special characters), with around 6 000 examples per class.
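As a quick sanity check on the figures above, the class groups can be summed and multiplied by the approximate per-class count. This is only arithmetic on the numbers quoted in the description; the real per-class counts vary around 6 000.

```python
# Class groups as listed in the description.
class_groups = {
    "digits": 10,
    "lowercase Latin letters": 26,
    "uppercase Latin letters": 26,
    "lowercase Polish letters": 9,
    "uppercase Polish letters": 9,
    "special characters": 9,
}

num_classes = sum(class_groups.values())
approx_images = num_classes * 6_000  # "around 6 000 examples per class"
print(num_classes, approx_images)  # 89 534000
```

The estimate of roughly 534 000 images is consistent with the stated total of more than 530 000.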

    How to use

    This notebook contains a PyTorch example of how to load the dataset from .npz files and train a CNN model. You can also use the dataset with other frameworks, such as TensorFlow, Keras, etc.

    For .npz files, use the numpy.load method.

    Contents

    The dataset contains the following:

    • dataset.npz - a file with two compressed numpy arrays:
      • "signs" - with all the images, sized 32 x 32 (grayscale)
      • "labels" - with all the labels (0-88) for examples from signs
    • label_mapping.csv - a csv file with columns label and char, mapping from ids to characters from dataset
    • images - folder with original 530 000 png images, sized 32 x 32, to use with other loading techniques
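The loading step for dataset.npz can be sketched as follows. To keep the example runnable without the download, it first writes a tiny stand-in archive with the two array names listed above ("signs" and "labels"); the array shapes and dtypes used for the stand-in are assumptions, and the real file may differ (for instance, it may carry a channel axis).

```python
import os
import tempfile

import numpy as np

# Build a tiny stand-in for dataset.npz (hypothetical shapes and dtypes).
rng = np.random.default_rng(0)
fake_signs = rng.integers(0, 256, size=(5, 32, 32), dtype=np.uint8)
fake_labels = np.array([0, 12, 37, 88, 5], dtype=np.int64)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "dataset.npz")
    np.savez_compressed(path, signs=fake_signs, labels=fake_labels)

    # Loading step: the same calls should apply to the real dataset.npz.
    with np.load(path) as data:
        signs = data["signs"]    # images, here 32 x 32 grayscale
        labels = data["labels"]  # integer class ids in the range 0-88

print(signs.shape, labels.shape)
```

The ids in `labels` can then be translated to characters with label_mapping.csv, which has the columns label and char.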

    Acknowledgements

    I want to express my gratitude to the following people: Dr. Edyta Łukasik for introducing me to this dataset and to authors of this dataset - Mikhail Tokovarov, dr. Monika Kaczorowska and dr. Marek Miłosz from Lublin University of Technology in Poland.

    Inspiration

    You can use this data the same way you used MNIST, KMNIST or Fashion MNIST: refine your image classification skills, and use GPU & TPU to implement CNN architectures for models that perform such multiclass classification.

  16. Dice Images

    • kaggle.com
    zip
    Updated Jan 9, 2022
    Cite
    Yash Srivastava (2022). Dice Images [Dataset]. https://www.kaggle.com/datasets/yashsrivastava51213/dice-images
    Available download formats: zip (1317193 bytes)
    Authors
    Yash Srivastava
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    There is no story behind this dataset, I just felt that I should also have a dataset 😬 .

    About the Dataset.

    The dataset contains top views of dice digits, which can be used as an alternative to the MNIST dataset for digit recognition, a benchmark dataset for classification.

    Currently there are only 120 images. The data has already been augmented through the TensorFlow data augmentation pipeline, which increased the dataset to about 1600 images (with random rotations and crops, among other operations).

    Image Type and Nomenclature

    For the small dataset that we have here, the images were made from just two dice. The images of the dice are resized to be similar to that of the MNIST dataset for testing results on the already present models.

    The images currently in the dataset are named as follows: {number}_{color of the dice**}_{transform angle}_{transformation direction*}
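The naming scheme above lends itself to a small parser. The sketch below splits a file name into its four fields and treats the direction as optional, since the description says it is given only when necessary. The example file names, the .png extension, and the helper names are hypothetical; the angle is kept as a string because its exact format is not documented.

```python
from pathlib import PurePosixPath
from typing import NamedTuple, Optional

class DiceImageName(NamedTuple):
    number: int               # digit shown on the die
    color: str                # color of the die
    angle: str                # transform angle, kept as text (format not documented)
    direction: Optional[str]  # transformation direction, if present

def parse_dice_name(filename):
    """Parse '{number}_{color}_{angle}[_{direction}]' from a file name."""
    parts = PurePosixPath(filename).stem.split("_")
    if len(parts) not in (3, 4):
        raise ValueError(f"unexpected file name: {filename}")
    number, color, angle = parts[:3]
    direction = parts[3] if len(parts) == 4 else None
    return DiceImageName(int(number), color, angle, direction)

# Hypothetical file names following the scheme.
print(parse_dice_name("3_red_45_left.png"))
print(parse_dice_name("6_white_90.png"))
```

Parsing the fields this way makes it easy to group images by digit or to filter out particular transformations.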

    Expectation

    My aim is that the dataset should be big enough so as to not cause overfitting. The dataset should also be diverse enough so that the model for which it is used is accurate.

    Although augmentation is a way to increase the dataset size, original images are preferred because they vary in many ways that I might have neglected in my analysis.

    *if the direction is necessary, it is mentioned
    ** Although the images are converted to grayscale, the color of the dice might be a feature that is required for some other analysis.

    Acknowledgements

    There is no one particularly that comes to mind, because each and every picture in this small dataset was manually edited by me, although I would like to help

    Inspiration

    The question that I have is whether this dataset can be used for Image Classification ? My take on this problem : GitHub Implementation

Cite
(2024). mnist [Dataset]. https://www.tensorflow.org/datasets/catalog/mnist

mnist

Explore at:
89 scholarly articles cite this dataset (View in Google Scholar)