23 datasets found
  1. imagenet2012_multilabel

    • tensorflow.org
    Updated Dec 10, 2022
    Cite
    (2022). imagenet2012_multilabel [Dataset]. https://www.tensorflow.org/datasets/catalog/imagenet2012_multilabel
    Description

    This dataset contains ILSVRC-2012 (ImageNet) validation images annotated with multi-class labels from "Evaluating Machine Accuracy on ImageNet", ICML, 2020. The multi-class labels were reviewed by a panel of experts extensively trained in the intricacies of fine-grained class distinctions in the ImageNet class hierarchy (see paper for more details). Compared to the original labels, these expert-reviewed multi-class labels enable a more semantically coherent evaluation of accuracy.

    Version 3.0.0 of this dataset contains more corrected labels from "When does dough become a bagel? Analyzing the remaining mistakes on ImageNet", as well as the ImageNet-Major (ImageNet-M) 68-example split under 'imagenet-m'.

    Only 20,000 of the 50,000 ImageNet validation images have multi-label annotations. The set of multi-labels was first generated by a testbed of 67 trained ImageNet models, and then each individual model prediction was manually annotated by the experts as either correct (the label is correct for the image), wrong (the label is incorrect for the image), or unclear (no consensus was reached among the experts).

    Additionally, during annotation, the expert panel identified a set of problematic images. An image was problematic if it met any of the below criteria:

    • The original ImageNet label (top-1 label) was incorrect or unclear
    • Image was a drawing, painting, sketch, cartoon, or computer-rendered
    • Image was excessively edited
    • Image had inappropriate content

    The problematic images are included in this dataset but should be ignored when computing multi-label accuracy. Additionally, since the initial set of 20,000 annotations is class-balanced, but the set of problematic images is not, we recommend computing the per-class accuracies and then averaging them. We also recommend counting a prediction as correct if it is marked as correct or unclear (i.e., being lenient with the unclear labels).

    One possible way of doing this is with the following NumPy code:

    import tensorflow_datasets as tfds
    
    ds = tfds.load('imagenet2012_multilabel', split='validation')
    
    # We assume that predictions is a dictionary from file_name to a class index between 0 and 999
    
    num_correct_per_class = {}
    num_images_per_class = {}
    
    for example in ds:
      # We ignore all problematic images
      if example['is_problematic'].numpy():
        continue
    
      # The label of the image in ImageNet
      cur_class = example['original_label'].numpy()
    
      # If we haven't processed this class yet, set the counters to 0
      if cur_class not in num_correct_per_class:
        num_correct_per_class[cur_class] = 0
        assert cur_class not in num_images_per_class
        num_images_per_class[cur_class] = 0
    
      num_images_per_class[cur_class] += 1
    
      # Get the predictions for this image
      cur_pred = predictions[example['file_name'].numpy()]
    
      # We count a prediction as correct if it is marked as correct or unclear
      # (i.e., we are lenient with the unclear labels)
      if cur_pred in example['correct_multi_labels'].numpy() or cur_pred in example['unclear_multi_labels'].numpy():
        num_correct_per_class[cur_class] += 1
    
    # Check that we have collected accuracy data for each of the 1,000 classes
    num_classes = 1000
    assert len(num_correct_per_class) == num_classes
    assert len(num_images_per_class) == num_classes
    
    # Compute the per-class accuracies and then average them
    final_avg = 0
    for cid in range(num_classes):
      assert cid in num_correct_per_class
      assert cid in num_images_per_class
      final_avg += num_correct_per_class[cid] / num_images_per_class[cid]
    final_avg /= num_classes
    
    

    To use this dataset:

    import tensorflow_datasets as tfds
    
    ds = tfds.load('imagenet2012_multilabel', split='train')
    for ex in ds.take(4):
     print(ex)
    

    See the guide for more information on tensorflow_datasets.

    Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/imagenet2012_multilabel-3.0.0.png

  2. imagenet2012_subset

    • tensorflow.org
    Updated Oct 21, 2024
    Cite
    (2024). imagenet2012_subset [Dataset]. https://www.tensorflow.org/datasets/catalog/imagenet2012_subset
    Description

    ILSVRC 2012, commonly known as 'ImageNet', is an image dataset organized according to the WordNet hierarchy. Each meaningful concept in WordNet, possibly described by multiple words or word phrases, is called a "synonym set" or "synset". There are more than 100,000 synsets in WordNet; the majority of them (80,000+) are nouns. In ImageNet, we aim to provide on average 1,000 images to illustrate each synset. Images of each concept are quality-controlled and human-annotated. Upon completion, we hope ImageNet will offer tens of millions of cleanly sorted images for most of the concepts in the WordNet hierarchy.

    The test split contains 100K images but no labels because no labels have been publicly released. We provide support for the test split from 2012 with the minor patch released on October 10, 2019. In order to manually download this data, a user must perform the following operations:

    1. Download the 2012 test split available here.
    2. Download the October 10, 2019 patch. There is a Google Drive link to the patch provided on the same page.
    3. Combine the two tar-balls, manually overwriting any images in the original archive with images from the patch. According to the instructions on image-net.org, this procedure overwrites just a few images.

    The resulting tar-ball may then be processed by TFDS.

    To assess the accuracy of a model on the ImageNet test split, one must run inference on all images in the split, export those results to a text file, and upload that file to the ImageNet evaluation server. The maintainers of the ImageNet evaluation server permit a single user to submit up to 2 submissions per week in order to prevent overfitting.

    To evaluate the accuracy on the test split, one must first create an account at image-net.org. This account must be approved by the site administrator. After the account is created, one can submit the results to the test server at https://image-net.org/challenges/LSVRC/eval_server.php. The submission consists of several ASCII text files corresponding to multiple tasks. The task of interest is "Classification submission (top-5 cls error)". A sample of an exported text file looks like the following:

    771 778 794 387 650
    363 691 764 923 427
    737 369 430 531 124
    755 930 755 59 168
    

    The export format is described in full in "readme.txt" within the 2013 development kit available here: https://image-net.org/data/ILSVRC/2013/ILSVRC2013_devkit.tgz. Please see the section entitled "3.3 CLS-LOC submission format". Briefly, the format of the text file is 100,000 lines corresponding to each image in the test split. Each line of integers corresponds to the rank-ordered, top-5 predictions for that test image. The integers are 1-indexed, corresponding to line numbers in the corresponding labels file. See labels.txt.
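
    One way to produce such a file is sketched below; top5_predictions is a hypothetical list of 100,000 five-element lists of 0-indexed class IDs, ordered to match the sort order of the test images:

    # Minimal sketch: write top-5 predictions, one line per test image,
    # as space-separated 1-indexed label integers.
    with open('classification_submission.txt', 'w') as f:
      for preds in top5_predictions:  # hypothetical predictions, 0-indexed
        f.write(' '.join(str(p + 1) for p in preds) + '\n')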

    To use this dataset:

    import tensorflow_datasets as tfds
    
    ds = tfds.load('imagenet2012_subset', split='train')
    for ex in ds.take(4):
     print(ex)
    

    See the guide for more information on tensorflow_datasets.

    Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/imagenet2012_subset-1pct-5.0.0.png

  3. food101

    • tensorflow.org
    • opendatalab.com
    • +2 more
    Updated Nov 23, 2022
    Cite
    (2022). food101 [Dataset]. https://www.tensorflow.org/datasets/catalog/food101
    Description

    This dataset consists of 101 food categories, with 101,000 images in total. For each class, 250 manually reviewed test images are provided, as well as 750 training images. On purpose, the training images were not cleaned, and thus still contain some amount of noise, mostly in the form of intense colors and sometimes wrong labels. All images were rescaled to have a maximum side length of 512 pixels.

    To use this dataset:

    import tensorflow_datasets as tfds
    
    ds = tfds.load('food101', split='train')
    for ex in ds.take(4):
     print(ex)
    

    See the guide for more information on tensorflow_datasets.

    Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/food101-2.0.0.png

  4. notMNIST

    • kaggle.com
    • opendatalab.com
    • +3 more
    Updated Feb 14, 2018
    Cite
    jwjohnson314 (2018). notMNIST [Dataset]. https://www.kaggle.com/datasets/jwjohnson314/notmnist/code
    Available formats: Croissant, a format for machine-learning datasets (see mlcommons.org/croissant)
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    jwjohnson314
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    The MNIST dataset is one of the best-known image classification problems out there, and a veritable classic of the field of machine learning. This dataset is a more challenging version of the same root problem: classifying letters from images. This is a multiclass classification dataset of glyphs of English letters A - J.

    This dataset is used extensively in the Udacity Deep Learning course, and is available in the TensorFlow GitHub repo (under Examples). I'm not aware of any license governing the use of this data, so I'm posting it here so that the community can use it with Kaggle kernels.

    Content

    notMNIST_large.zip is a large but dirty version of the dataset with 529,119 images, and notMNIST_small.zip is a small hand-cleaned version of the dataset, with 18,726 images. The dataset was assembled by Yaroslav Bulatov, and can be obtained on his blog. According to this blog entry, there is about a 6.5% label error rate on the large uncleaned dataset, and a 0.5% label error rate on the small hand-cleaned dataset.

    The two files each contain 28x28 grayscale images of letters A - J, organized into directories by letter. notMNIST_large.zip contains 529,119 images and notMNIST_small.zip contains 18,726 images.
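
    Because the archives are organized into one directory per letter, a standard directory loader works; a minimal sketch, assuming notMNIST_small.zip has been extracted to ./notMNIST_small:

    import tensorflow as tf

    # Each subdirectory name (A-J) becomes a class label
    ds = tf.keras.utils.image_dataset_from_directory(
        'notMNIST_small',
        color_mode='grayscale',
        image_size=(28, 28),
    )
    print(ds.class_names)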

    Acknowledgements

    Thanks to Yaroslav Bulatov for putting together the dataset.

  5. imagenet_a

    • tensorflow.org
    Updated Jun 1, 2024
    Cite
    (2024). imagenet_a [Dataset]. https://www.tensorflow.org/datasets/catalog/imagenet_a
    Description

    ImageNet-A is a set of images labelled with ImageNet labels that were obtained by collecting new data and keeping only those images that ResNet-50 models fail to correctly classify. For more details please refer to the paper.

    The label space is the same as that of ImageNet2012. Each example is represented as a dictionary with the following keys:

    • 'image': The image, a (H, W, 3)-tensor.
    • 'label': An integer in the range [0, 1000).
    • 'file_name': A unique string identifying the example within the dataset.
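
    As an example, top-1 accuracy on ImageNet-A could be computed along these lines (a sketch; model is a hypothetical classifier returning a class index in [0, 1000), and the split name 'test' is an assumption):

    import tensorflow_datasets as tfds

    ds = tfds.load('imagenet_a', split='test')
    correct = total = 0
    for ex in ds:
      pred = model(ex['image'])  # hypothetical classifier
      correct += int(pred == ex['label'].numpy())
      total += 1
    print('top-1 accuracy:', correct / total)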

    To use this dataset:

    import tensorflow_datasets as tfds
    
    ds = tfds.load('imagenet_a', split='train')
    for ex in ds.take(4):
     print(ex)
    

    See the guide for more information on tensorflow_datasets.

    Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/imagenet_a-0.1.0.png

  6. Dataset for "Enhancing Cloud Detection in Sentinel-2 Imagery: A Spatial-Temporal Approach and Dataset"

    • data.niaid.nih.gov
    • zenodo.org
    Updated Feb 4, 2024
    Cite
    Yin Ranyu (2024). Dataset for "Enhancing Cloud Detection in Sentinel-2 Imagery: A Spatial-Temporal Approach and Dataset" [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8419699
    Dataset provided by
    He Guojin
    Gong Chengjuan
    Long Tengfei
    Jiao Weili
    Yin Ranyu
    Wang Guizhou
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/

    Description

    This dataset is built for time-series Sentinel-2 cloud detection and is stored in TensorFlow TFRecord format (refer to https://www.tensorflow.org/tutorials/load_data/tfrecord).

    Each file is compressed in 7z format and can be decompressed using Bandizip or 7-Zip.

    Dataset Structure:

    Each filename can be split into three parts using underscores: the first part indicates whether the file is designated for training or validation ('train' or 'val'); the second part indicates the Sentinel-2 tile name; and the last part indicates the number of samples in the file.

    Each sample includes:

    Sample ID;

    Array of time-series 4-band image patches in 10 m resolution, shaped as (n_timestamps, 4, 42, 42);

    Label list indicating cloud cover status for the center 6×6 pixels at each timestamp;

    Ordinal list for each timestamp;

    Sample weight list (reserved);

    Here is a demonstration function for parsing the TFRecord file:

    import tensorflow as tf

    Initialize a TensorFlow Dataset from a file name:

    def parseRecordDirect(fname):
      sep = '/'
      parts = tf.strings.split(fname, sep)
      tn = tf.strings.split(parts[-1], sep='_')[-2]
      nn = tf.strings.to_number(tf.strings.split(parts[-1], sep='_')[-1], tf.dtypes.int64)
      t = tf.data.Dataset.from_tensors(tn).repeat().take(nn)
      t1 = tf.data.TFRecordDataset(fname)
      ds = tf.data.Dataset.zip((t, t1))
      return ds

    keys_to_features_direct = {
      'localid': tf.io.FixedLenFeature([], tf.int64, -1),
      'image_raw_ldseries': tf.io.FixedLenFeature((), tf.string, ''),
      'labels': tf.io.FixedLenFeature((), tf.string, ''),
      'dates': tf.io.FixedLenFeature((), tf.string, ''),
      'weights': tf.io.FixedLenFeature((), tf.string, '')
    }

    The Decoder (Optional)

    class SeriesClassificationDirectDecorder(decoder.Decoder):
      """A tf.Example decoder for tfds classification datasets."""

      def __init__(self) -> None:
        super().__init__()

      def decode(self, tid, ds):
        parsed = tf.io.parse_single_example(ds, keys_to_features_direct)
        encoded = parsed['image_raw_ldseries']
        labels_encoded = parsed['labels']
        decoded = tf.io.decode_raw(encoded, tf.uint16)
        label = tf.io.decode_raw(labels_encoded, tf.int8)
        dates = tf.io.decode_raw(parsed['dates'], tf.int64)
        weight = tf.io.decode_raw(parsed['weights'], tf.float32)
        decoded = tf.reshape(decoded, [-1, 4, 42, 42])
        sample_dict = {
          'tid': tid,                    # tile ID
          'dates': dates,                # date list
          'localid': parsed['localid'],  # sample ID
          'imgs': decoded,               # image array
          'labels': label,               # label list
          'weights': weight
        }
        return sample_dict

    A simple parsing function:

    def preprocessDirect(tid, record):
      parsed = tf.io.parse_single_example(record, keys_to_features_direct)
      encoded = parsed['image_raw_ldseries']
      labels_encoded = parsed['labels']
      decoded = tf.io.decode_raw(encoded, tf.uint16)
      label = tf.io.decode_raw(labels_encoded, tf.int8)
      dates = tf.io.decode_raw(parsed['dates'], tf.int64)
      weight = tf.io.decode_raw(parsed['weights'], tf.float32)
      decoded = tf.reshape(decoded, [-1, 4, 42, 42])
      return tid, dates, parsed['localid'], decoded, label, weight

    t1 = parseRecordDirect('filename here')
    dataset = t1.map(preprocessDirect, num_parallel_calls=tf.data.experimental.AUTOTUNE)


    Class Definition:

    0: clear

    1: opaque cloud

    2: thin cloud

    3: haze

    4: cloud shadow

    5: snow

    Dataset Construction:

    First, we randomly generate 500 points for each tile, and all these points are aligned to the pixel grid centers of the subdatasets in 60 m resolution (e.g. B10) for consistency when comparing with other products. This is because other cloud detection methods may use the cirrus band, which is in 60 m resolution, as a feature.

    Then, the time-series image patches of two shapes are cropped with each point as the center. The patches of shape 42×42 are cropped from the bands in 10 m resolution (B2, B3, B4, B8) and are used to construct this dataset. The patches of shape 348×348 are cropped from the True Colour Image (TCI; for details see the Sentinel-2 user guide) file and are used for interpreting class labels.

    Samples with a large number of timestamps can be time-consuming in the I/O stage, so the time-series patches are divided into groups, with the number of timestamps not exceeding 100 in each group.

  7. Boat types examples

    • kaggle.com
    Updated Nov 18, 2018
    Cite
    Clorichel (2018). Boat types examples [Dataset]. https://www.kaggle.com/clorichel/boat-types-examples/metadata
    Available formats: Croissant, a format for machine-learning datasets (see mlcommons.org/croissant)
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Clorichel
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    This dataset is used in this blog post https://clorichel.com/blog/2018/11/10/machine-learning-and-object-detection/ where you'll train an image recognition model with TensorFlow to find just about anything in pictures and videos.

    Content

    You'll find a few pictures of different types of boats, of various sizes, neither classified nor contained in the original dataset used for retraining in this Notebook Kernel.

    Acknowledgements

    Thanks to Unsplash and its community for the beautiful free images and pictures.

  8. Cifar 100 Dataset

    • universe.roboflow.com
    • opendatalab.com
    • +3 more
    Updated Aug 11, 2022
    Cite
    Popular Benchmarks (2022). Cifar 100 Dataset [Dataset]. https://universe.roboflow.com/popular-benchmarks/cifar100
    Available download formats: zip
    Dataset authored and provided by
    Popular Benchmarks
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/

    Variables measured
    Animals, People, Common Objects
    Description

    CIFAR-100

    The CIFAR-10 and CIFAR-100 datasets are labeled subsets of the 80 million tiny images dataset, collected by Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton.

    • More info on CIFAR-100: https://www.cs.toronto.edu/~kriz/cifar.html
    • TensorFlow listing of the dataset: https://www.tensorflow.org/datasets/catalog/cifar100
    • GitHub repo for converting CIFAR-100 tarball files to png format: https://github.com/knjcode/cifar2png

    All images were sized 32x32 in the original dataset

    The CIFAR-10 dataset consists of 60,000 32x32 colour images in 10 classes, with 6,000 images per class. There are 50,000 training images and 10,000 test images [in the original dataset].

    This dataset is just like CIFAR-10, except it has 100 classes containing 600 images each. There are 500 training images and 100 testing images per class. The 100 classes in CIFAR-100 are grouped into 20 superclasses. Each image comes with a "fine" label (the class to which it belongs) and a "coarse" label (the superclass to which it belongs). However, this project does not contain the superclasses.

    • Superclasses version: https://universe.roboflow.com/popular-benchmarks/cifar100-with-superclasses/

    More background on the dataset (CIFAR-100 classes and superclasses): https://i.imgur.com/5w8A0Vm.png
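
    If both label levels are needed, the TFDS copy of CIFAR-100 exposes them per example; a minimal sketch, assuming the TFDS feature names 'label' (fine) and 'coarse_label':

    import tensorflow_datasets as tfds

    ds = tfds.load('cifar100', split='train')
    for ex in ds.take(1):
      print(ex['label'].numpy(), ex['coarse_label'].numpy())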

    Version 1 (original-images_Original-CIFAR100-Splits):

    • Original images, with the original splits for CIFAR-100: train (83.33% of images - 50,000 images) set and test (16.67% of images - 10,000 images) set only.
    • This version was not trained

    Version 2 (original-images_trainSetSplitBy80_20):

    • Original, raw images, with the train set split to provide 80% of its images to the training set (approximately 40,000 images) and 20% of its images to the validation set (approximately 10,000 images)
    • Trained from Roboflow Classification Model's ImageNet training checkpoint
    • Read more about train/valid/test split rebalancing: https://blog.roboflow.com/train-test-split/ (diagram: https://i.imgur.com/kSPeKGn.png)

    Citation:

    @TECHREPORT{Krizhevsky09learningmultiple,
      author = {Alex Krizhevsky},
      title = {Learning multiple layers of features from tiny images},
      institution = {},
      year = {2009}
    }
    
  9. Data from: Fashion-MNIST

    • datasets.activeloop.ai
    • tensorflow.org
    • +3 more
    Updated Feb 8, 2022
    Cite
    Han Xiao, Kashif Rasul, Roland Vollgraf (2022). Fashion-MNIST [Dataset]. https://datasets.activeloop.ai/docs/ml/datasets/fashion-mnist-dataset/
    Available download formats: deeplake
    Authors
    Han Xiao, Kashif Rasul, Roland Vollgraf
    License

    https://github.com/zalandoresearch/fashion-mnist/blob/master/LICENSE

    Description

    A dataset of 70,000 fashion images with labels for 10 classes. The dataset was created by researchers at Zalando Research and is used for research in machine learning and computer vision tasks such as image classification.
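
    A quick way to load it is the copy bundled with Keras, which mirrors the Zalando Research data; a minimal sketch:

    import tensorflow as tf

    # Returns 60,000 training and 10,000 test 28x28 grayscale images
    (x_train, y_train), (x_test, y_test) = tf.keras.datasets.fashion_mnist.load_data()
    print(x_train.shape, y_train.shape)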

  10. PHCD - Polish Handwritten Characters Database

    • kaggle.com
    Updated Dec 30, 2023
    Cite
    Wiktor Flis (2023). PHCD - Polish Handwritten Characters Database [Dataset]. https://www.kaggle.com/datasets/westedcrean/phcd-polish-handwritten-characters-database
    Available formats: Croissant, a format for machine-learning datasets (see mlcommons.org/croissant)
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Wiktor Flis
    Description


    The process for collecting this dataset was documented in the paper "Development of Extensive Polish Handwritten Characters Database for Text Recognition Research" (https://doi.org/10.12913/22998624/122567) by Mikhail Tokovarov, dr Monika Kaczorowska, and dr Marek Miłosz. Link to download the original dataset: https://cs.pollub.pl/phcd/. The source fileset also contains a dataset of raw images of whole sentences written in Polish.

    Context

    PHCD (Polish Handwritten Characters Database) is a collection of handwritten texts in Polish. It was created by researchers at Lublin University of Technology for the purpose of offline handwritten text recognition. The database contains more than 530,000 images of handwritten characters. Each image is a 32x32 pixel grayscale image representing one of 89 classes (10 digits, 26 lowercase Latin letters, 26 uppercase Latin letters, 9 lowercase Polish letters, 9 uppercase Polish letters, and 9 special characters), with around 6,000 examples per class.

    How to use

    This notebook contains a PyTorch example of how to load the dataset from .npz files and train a CNN model. You can also use the dataset with other frameworks, such as TensorFlow, Keras, etc.

    For the .npz files, use the numpy.load method; a sketch follows the contents list below.

    Contents

    The dataset contains the following:

    • dataset.npz - a file with two compressed numpy arrays:
      • "signs" - with all the images, sized 32 x 32 (grayscale)
      • "labels" - with all the labels (0-88) for examples from signs
    • label_mapping.csv - a csv file with columns label and char, mapping from ids to characters from dataset
    • images - folder with the original 530,000 png images, sized 32 x 32, to use with other loading techniques
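
    A minimal loading sketch, assuming dataset.npz is in the working directory and uses the array names listed above:

    import numpy as np

    data = np.load('dataset.npz')
    signs, labels = data['signs'], data['labels']
    print(signs.shape, labels.shape)  # images and their 0-88 class labels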

    Acknowledgements

    I want to express my gratitude to the following people: Dr. Edyta Łukasik for introducing me to this dataset, and the authors of this dataset - Mikhail Tokovarov, dr. Monika Kaczorowska, and dr. Marek Miłosz from Lublin University of Technology in Poland.

    Inspiration

    You can use this data the same way you used MNIST, KMNIST, or Fashion-MNIST: refine your image classification skills, and use GPUs & TPUs to implement CNN architectures for such multiclass classification.

  11. celeb_a

    • tensorflow.org
    • datasetninja.com
    • +3 more
    Updated Jun 1, 2024
    Cite
    (2024). celeb_a [Dataset]. https://www.tensorflow.org/datasets/catalog/celeb_a
    Description

    CelebFaces Attributes Dataset (CelebA) is a large-scale face attributes dataset with more than 200K celebrity images, each with 40 attribute annotations. The images in this dataset cover large pose variations and background clutter. CelebA has large diversities, large quantities, and rich annotations, including:

    • 10,177 identities,
    • 202,599 face images, and
    • 5 landmark locations and 40 binary attribute annotations per image.

    The dataset can be employed as the training and test sets for the following computer vision tasks: face attribute recognition, face detection, and landmark (or facial part) localization.

    Note: CelebA dataset may contain potential bias. The fairness indicators example goes into detail about several considerations to keep in mind while using the CelebA dataset.
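
    The per-image annotations can be read straight from the example dict; a minimal sketch, assuming the TFDS features 'attributes' (a dict of 40 booleans) and 'landmarks':

    import tensorflow_datasets as tfds

    ds = tfds.load('celeb_a', split='train')
    for ex in ds.take(1):
      print(ex['attributes']['Smiling'].numpy())  # one of the 40 binary attributes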

    To use this dataset:

    import tensorflow_datasets as tfds
    
    ds = tfds.load('celeb_a', split='train')
    for ex in ds.take(4):
     print(ex)
    

    See the guide for more information on tensorflow_datasets.

    Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/celeb_a-2.1.0.png

  12. Apple Leaf Disease Detection Using Vision Transformer

    • zenodo.org
    Updated Jun 20, 2025
    Cite
    Amreen Batool (2025). Apple Leaf Disease Detection Using Vision Transformer [Dataset]. http://doi.org/10.5281/zenodo.15702007
    Available download formats: text/x-python
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Amreen Batool
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/

    Description

    This repository contains a Python script for classifying apple leaf diseases using a Vision Transformer (ViT) model. The dataset used is the Plant Village dataset, which contains images of apple leaves with four classes: Healthy, Apple Scab, Black Rot, and Cedar Apple Rust. The script includes data preprocessing, model training, and evaluation steps.

    Introduction

    The goal of this project is to classify apple leaf diseases using a Vision Transformer (ViT) model. The dataset is divided into four classes: Healthy, Apple Scab, Black Rot, and Cedar Apple Rust. The script includes data preprocessing, model training, and evaluation steps.

    Code Explanation

    1. Importing Libraries

    • The script starts by importing necessary libraries such as matplotlib, seaborn, numpy, pandas, tensorflow, and sklearn. These libraries are used for data visualization, data manipulation, and building/training the deep learning model.

    2. Visualizing the Dataset

    • The walk_through_dir function is used to explore the dataset directory structure and count the number of images in each class.
    • The dataset is divided into Train, Val, and Test directories, each containing subdirectories for the four classes.

    3. Data Augmentation

    • The script uses ImageDataGenerator from Keras to apply data augmentation techniques such as rotation, horizontal flipping, and rescaling to the training data. This helps in improving the model's generalization ability (a sketch follows this list).
    • Separate generators are created for training, validation, and test datasets.
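
    A minimal sketch of such an augmentation setup; the exact parameter values and directory names ('Train', 'Val') are assumptions, not taken from the script:

    import tensorflow as tf

    # Augment only the training data; validation data is just rescaled
    train_gen = tf.keras.preprocessing.image.ImageDataGenerator(
        rescale=1.0 / 255, rotation_range=20, horizontal_flip=True)
    val_gen = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1.0 / 255)

    train_ds = train_gen.flow_from_directory('Train', target_size=(224, 224), class_mode='categorical')
    val_ds = val_gen.flow_from_directory('Val', target_size=(224, 224), class_mode='categorical')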

    4. Patch Visualization

    • The script defines a Patches layer that extracts patches from the images. This is a crucial step in Vision Transformers, where images are divided into smaller patches that are then processed by the transformer (a minimal sketch follows this list).
    • The script visualizes these patches for different patch sizes (32x32, 16x16, 8x8) to understand how the image is divided.
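
    A common way to implement such a layer, sketched here as an assumption of what the script's Patches layer looks like:

    import tensorflow as tf

    class Patches(tf.keras.layers.Layer):
      def __init__(self, patch_size):
        super().__init__()
        self.patch_size = patch_size

      def call(self, images):
        batch_size = tf.shape(images)[0]
        patches = tf.image.extract_patches(
            images=images,
            sizes=[1, self.patch_size, self.patch_size, 1],
            strides=[1, self.patch_size, self.patch_size, 1],
            rates=[1, 1, 1, 1],
            padding='VALID')
        # Flatten the spatial grid of patches into a sequence of patch vectors
        return tf.reshape(patches, [batch_size, -1, patches.shape[-1]])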

    5. Model Training

    • The script defines a Vision Transformer (ViT) model using TensorFlow and Keras. The model is compiled with the Adam optimizer and categorical cross-entropy loss.
    • The model is trained for a specified number of epochs, and the training history is stored for later analysis.

    6. Model Evaluation

    • After training, the model is evaluated on the test dataset. The script generates a confusion matrix and a classification report to assess the model's performance.
    • The confusion matrix is visualized using seaborn to provide a clear understanding of the model's predictions.

    7. Visualizing Misclassified Images

    • The script includes functionality to visualize misclassified images, which helps in understanding where the model is making errors.

    8. Fine-Tuning and Learning Rate Adjustment

    • The script demonstrates how to fine-tune the model by adjusting the learning rate and re-training the model.

    Steps for Implementation

    1. Dataset Preparation

      • Ensure that the dataset is organized into Train, Val, and Test directories, with each directory containing subdirectories for each class (Healthy, Apple Scab, Black Rot, Cedar Apple Rust).
    2. Install Required Libraries

      • Install the necessary Python libraries using pip:
        pip install tensorflow matplotlib seaborn numpy pandas scikit-learn
    3. Run the Script

      • Execute the script in a Python environment. The script will automatically:
        • Load and preprocess the dataset.
        • Apply data augmentation.
        • Train the Vision Transformer model.
        • Evaluate the model and generate performance metrics.
    4. Analyze Results

      • Review the confusion matrix and classification report to understand the model's performance.
      • Visualize misclassified images to identify potential areas for improvement.
    5. Fine-Tuning

      • Experiment with different patch sizes, learning rates, and data augmentation techniques to improve the model's accuracy.
  13. iNaturalist 2017

    • opendatalab.com
    • tensorflow.org
    Updated Mar 22, 2023
    Cite
    California Institute of Technology (2023). iNaturalist 2017 [Dataset]. https://opendatalab.com/OpenDataLab/iNaturalist
    Available download formats: zip
    Dataset provided by
    California Institute of Technology
    License

    https://www.inaturalist.org/pages/terms

    Description

    The iNaturalist 2017 dataset (iNat) contains 675,170 training and validation images from 5,089 natural fine-grained categories. Those categories belong to 13 super-categories including Plantae (Plant), Insecta (Insect), Aves (Bird), Mammalia (Mammal), and so on. The iNat dataset is highly imbalanced, with dramatically different numbers of images per category. For example, the largest super-category, "Plantae (Plant)", has 196,613 images from 2,101 categories, whereas the smallest super-category, "Protozoa", has only 381 images from 4 categories.

  14. mnist

    • tensorflow.org
    • universe.roboflow.com
    • +3 more
    Updated Jun 1, 2024
    Cite
    (2024). mnist [Dataset]. https://www.tensorflow.org/datasets/catalog/mnist
    Description

    The MNIST database of handwritten digits.

    To use this dataset:

    import tensorflow_datasets as tfds
    
    ds = tfds.load('mnist', split='train')
    for ex in ds.take(4):
     print(ex)
    

    See the guide for more information on tensorflow_datasets.

    Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/mnist-3.0.1.png

  15. resisc45

    • tensorflow.org
    Updated Dec 21, 2022
    Cite
    (2022). resisc45 [Dataset]. https://www.tensorflow.org/datasets/catalog/resisc45
    Description

    The RESISC45 dataset is a publicly available benchmark for Remote Sensing Image Scene Classification (RESISC), created by Northwestern Polytechnical University (NWPU). It contains 31,500 images covering 45 scene classes, with 700 images in each class.

    To use this dataset:

    import tensorflow_datasets as tfds
    
    ds = tfds.load('resisc45', split='train')
    for ex in ds.take(4):
     print(ex)
    

    See the guide for more information on tensorflow_datasets.

    Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/resisc45-3.0.0.png

  16. bccd

    • tensorflow.org
    • datasetninja.com
    Updated Dec 6, 2022
    Cite
    (2022). bccd [Dataset]. https://www.tensorflow.org/datasets/catalog/bccd
    Description

    BCCD Dataset is a small-scale dataset for blood cells detection.

    Thanks to cosmicad and akshaylamba for the original data and annotations. The original dataset has been re-organized into VOC format. The BCCD dataset is under the MIT license.

    Data preparation is an important step in machine learning. In this project, the Faster R-CNN algorithm from keras-frcnn is used for object detection. From this dataset, nicolaschen1 developed two Python scripts to prepare the data (CSV file and images) for recognition of abnormalities in blood cells in medical images.

    • export.py: creates the file "test.csv" with all the data needed: filename, class_name, x1, y1, x2, y2.
    • plot.py: plots the boxes for each image and saves them in a new directory.
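
    A sketch of the CSV export step; annotations is a hypothetical list of (filename, class_name, x1, y1, x2, y2) tuples parsed from the VOC annotation files:

    import csv

    with open('test.csv', 'w', newline='') as f:
      writer = csv.writer(f)
      for filename, class_name, x1, y1, x2, y2 in annotations:  # hypothetical
        writer.writerow([filename, class_name, x1, y1, x2, y2])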

    Image type: JPEG. Width x height: 640 x 480.

    To use this dataset:

    import tensorflow_datasets as tfds
    
    ds = tfds.load('bccd', split='train')
    for ex in ds.take(4):
     print(ex)
    

    See the guide for more information on tensorflow_datasets.

    Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/bccd-1.0.0.png

  17. stl10

    • tensorflow.org
    Updated Jan 13, 2023
    Cite
    (2023). stl10 [Dataset]. https://www.tensorflow.org/datasets/catalog/stl10
    Description

    The STL-10 dataset is an image recognition dataset for developing unsupervised feature learning, deep learning, and self-taught learning algorithms. It is inspired by the CIFAR-10 dataset but with some modifications. In particular, each class has fewer labeled training examples than in CIFAR-10, but a very large set of unlabeled examples is provided to learn image models prior to supervised training. The primary challenge is to make use of the unlabeled data (which comes from a similar but different distribution from the labeled data) to build a useful prior. All images were acquired from labeled examples on ImageNet.

    To use this dataset:

    import tensorflow_datasets as tfds
    
    ds = tfds.load('stl10', split='train')
    for ex in ds.take(4):
     print(ex)
    

    See the guide for more information on tensorflow_datasets.

    Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/stl10-1.0.0.png

  18. voc

    • tensorflow.org
    Updated Jun 3, 2025
    Cite
    (2025). voc [Dataset]. https://www.tensorflow.org/datasets/catalog/voc
    Description

    This dataset contains the data from the PASCAL Visual Object Classes Challenge, corresponding to the Classification and Detection competitions.

    In the Classification competition, the goal is to predict the set of labels contained in the image, while in the Detection competition the goal is to predict the bounding box and label of each individual object. WARNING: As per the official dataset, the test set of VOC2012 does not contain annotations.

    To use this dataset:

    import tensorflow_datasets as tfds
    
    ds = tfds.load('voc', split='train')
    for ex in ds.take(4):
     print(ex)
    

    See the guide for more information on tensorflow_datasets.

    Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/voc-2007-5.0.0.png

  19. imagenet_r

    • tensorflow.org
    Updated Jun 1, 2024
    Cite
    (2024). imagenet_r [Dataset]. https://www.tensorflow.org/datasets/catalog/imagenet_r
    Description

    ImageNet-R is a set of images labelled with ImageNet labels that were obtained by collecting art, cartoons, deviantart, graffiti, embroidery, graphics, origami, paintings, patterns, plastic objects, plush objects, sculptures, sketches, tattoos, toys, and video game renditions of ImageNet classes. ImageNet-R has renditions of 200 ImageNet classes, resulting in 30,000 images. For more details please refer to the paper.

    The label space is the same as that of ImageNet2012. Each example is represented as a dictionary with the following keys:

    • 'image': The image, a (H, W, 3)-tensor.
    • 'label': An integer in the range [0, 1000).
    • 'file_name': A unique string identifying the example within the dataset.

    To use this dataset:

    import tensorflow_datasets as tfds
    
    ds = tfds.load('imagenet_r', split='train')
    for ex in ds.take(4):
     print(ex)
    

    See the guide for more information on tensorflow_datasets.

    Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/imagenet_r-0.2.0.png

  20. diabetic_retinopathy_detection

    • tensorflow.org
    Updated Feb 20, 2020
    Cite
    (2020). diabetic_retinopathy_detection [Dataset]. https://www.tensorflow.org/datasets/catalog/diabetic_retinopathy_detection
    Description

    A large set of high-resolution retina images taken under a variety of imaging conditions.

    To use this dataset:

    import tensorflow_datasets as tfds
    
    ds = tfds.load('diabetic_retinopathy_detection', split='train')
    for ex in ds.take(4):
     print(ex)
    

    See the guide for more information on tensorflow_datasets.

    Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/diabetic_retinopathy_detection-original-3.0.0.png
