16 datasets found
  1. cifar10

    • tensorflow.org
    • opendatalab.com
    • +3more
    Updated Jun 1, 2024
    Cite
    (2024). cifar10 [Dataset]. https://www.tensorflow.org/datasets/catalog/cifar10
    Explore at:
    Dataset updated
    Jun 1, 2024
    Description

    The CIFAR-10 dataset consists of 60000 32x32 colour images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images.

    To use this dataset:

    import tensorflow_datasets as tfds
    
    ds = tfds.load('cifar10', split='train')
    for ex in ds.take(4):
        print(ex)
    

    See the guide for more information on tensorflow_datasets.

    Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/cifar10-3.0.2.png

  2. features-vgg19-cifar10test-85

    • kaggle.com
    zip
    Updated Apr 16, 2021
    Cite
    Eka Antonius Kurniawan (2021). features-vgg19-cifar10test-85 [Dataset]. https://www.kaggle.com/datasets/ekaakurniawan/featuresvgg19cifar10test85
    Explore at:
    Available download formats: zip (11468534921 bytes)
    Dataset updated
    Apr 16, 2021
    Authors
    Eka Antonius Kurniawan
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    Context

    CIFAR-10 features collected from a VGG19 model at 85% test accuracy. The dataset contains the features from all 19 layers for the 10,000 test images. Each layer has its own directory. Inside each layer directory, the features are stored per batch in files numbered from 1 to 5 (in the current case), with a batch size of 2,000 images.

    The image dataset is from version 1. Refer to that dataset to get the image, the label and the filename. Concatenate the batch files in ascending order to get an index-to-index correspondence between the images and the features.

    Extracted from notebook version 10.

    Content

    16 feature layer directories (conv1_1, conv1_2, ..., conv5_4) and 3 classifier layer directories (lin1, lin2, and lin3). Each layer directory contains files 1 to 5, which are NumPy arrays serialized with Python pickle. Each file holds a batch of 2,000 images and has shape [2000, dim1, dim2, dim3] for feature layers or [2000, dim1] for classifier layers.

    Example for feature layers:

    conv1_1 (2000, 64, 32, 32)
    conv1_2 (2000, 64, 32, 32)
    conv2_1 (2000, 128, 16, 16)
    conv2_2 (2000, 128, 16, 16)
    conv3_1 (2000, 256, 8, 8)
    conv3_2 (2000, 256, 8, 8)
    conv3_3 (2000, 256, 8, 8)
    conv3_4 (2000, 256, 8, 8)
    conv4_1 (2000, 512, 4, 4)
    conv4_2 (2000, 512, 4, 4)
    conv4_3 (2000, 512, 4, 4)
    conv4_4 (2000, 512, 4, 4)
    conv5_1 (2000, 512, 2, 2)
    conv5_2 (2000, 512, 2, 2)
    conv5_3 (2000, 512, 2, 2)
    conv5_4 (2000, 512, 2, 2)

    Example for classifier layers:

    lin1 (2000, 4096)
    lin2 (2000, 4096)
    lin3 (2000, 10)
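The index-to-index relation described for the VGG19 feature files can be sketched as simple arithmetic. This is a minimal illustration, assuming the batch files are numbered 1 to 5 and each holds 2,000 rows in the same order as the test images; the function name `locate_feature` is hypothetical and not part of the dataset.

```python
# Assumption: 5 batch files numbered from 1, each with 2,000 rows,
# stored in the same order as the 10,000 test images.
BATCH_SIZE = 2000
NUM_BATCHES = 5


def locate_feature(image_index):
    """Map a test-image index (0..9999) to (batch file number, row in that file)."""
    if not 0 <= image_index < BATCH_SIZE * NUM_BATCHES:
        raise IndexError(f"image index {image_index} out of range")
    batch_number, row = divmod(image_index, BATCH_SIZE)
    return batch_number + 1, row  # batch files are numbered from 1


print(locate_feature(0))     # (1, 0)
print(locate_feature(2000))  # (2, 0)
print(locate_feature(9999))  # (5, 1999)
```

Concatenating the five files in ascending order gives the same result: row `image_index` of the concatenated array is the feature of image `image_index`.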

  3. Rescaled CIFAR-10 dataset

    • explore.openaire.eu
    • zenodo.org
    Updated Apr 10, 2025
    Cite
    Andrzej Perzanowski; Tony Lindeberg (2025). Rescaled CIFAR-10 dataset [Dataset]. http://doi.org/10.5281/zenodo.15188748
    Explore at:
    Dataset updated
    Apr 10, 2025
    Authors
    Andrzej Perzanowski; Tony Lindeberg
    Description

    Motivation

    The goal of introducing the Rescaled CIFAR-10 dataset is to provide a dataset that contains scale variations (up to a factor of 4), to evaluate the ability of networks to generalise to scales not present in the training data.

    The Rescaled CIFAR-10 dataset was introduced in the paper:

    [1] A. Perzanowski and T. Lindeberg (2025) "Scale generalisation properties of extended scale-covariant and scale-invariant Gaussian derivative networks on image datasets with spatial scaling variations", Journal of Mathematical Imaging and Vision, to appear.

    with a pre-print available at arXiv:

    [2] Perzanowski and Lindeberg (2024) "Scale generalisation properties of extended scale-covariant and scale-invariant Gaussian derivative networks on image datasets with spatial scaling variations", arXiv preprint arXiv:2409.11140.

    Importantly, the Rescaled CIFAR-10 dataset contains substantially more natural textures and patterns than the MNIST Large Scale dataset, introduced in:

    [3] Y. Jansson and T. Lindeberg (2022) "Scale-invariant scale-channel networks: Deep networks that generalise to previously unseen scales", Journal of Mathematical Imaging and Vision, 64(5): 506-536, https://doi.org/10.1007/s10851-022-01082-2

    and is therefore significantly more challenging.

    Access and rights

    The Rescaled CIFAR-10 dataset is provided on the condition that you provide proper citation for the original CIFAR-10 dataset:

    [4] Krizhevsky, A. and Hinton, G. (2009). Learning multiple layers of features from tiny images. Tech. rep., University of Toronto.

    and also for this new rescaled version, using the reference [1] above.

    The data set is made available on request. If you would be interested in trying out this data set, please make a request in the system below, and we will grant you access as soon as possible.

    The dataset

    The Rescaled CIFAR-10 dataset is generated by rescaling 32×32 RGB images of animals and vehicles from the original CIFAR-10 dataset [4]. The scale variations are up to a factor of 4. In order to have all test images have the same resolution, mirror extension is used to extend the images to size 64x64. The imresize() function in Matlab was used for the rescaling, with default anti-aliasing turned on, and bicubic interpolation overshoot removed by clipping to the [0, 255] range. The details of how the dataset was created can be found in [1].

    There are 10 distinct classes in the dataset: "airplane", "automobile", "bird", "cat", "deer", "dog", "frog", "horse", "ship" and "truck". In the dataset, these are represented by integer labels in the range [0, 9].

    The dataset is split into 40 000 training samples, 10 000 validation samples and 10 000 testing samples. The training dataset is generated using the initial 40 000 samples from the original CIFAR-10 training set. The validation dataset, on the other hand, is formed from the final 10 000 image batch of that same training set. For testing, all test datasets are built from the 10 000 images contained in the original CIFAR-10 test set.

    The h5 files containing the dataset

    The training dataset file (~5.9 GB) for scale 1, which also contains the corresponding validation and test data for the same scale, is:

    cifar10_with_scale_variations_tr40000_vl10000_te10000_outsize64-64_scte1p000_scte1p000.h5

    Additionally, for the Rescaled CIFAR-10 dataset, there are 9 datasets (~1 GB each) for testing scale generalisation at scales not present in the training set. Each of these datasets is rescaled using a different image scaling factor, 2^(k/4), with k being integers in the range [-4, 4]:

    cifar10_with_scale_variations_te10000_outsize64-64_scte0p500.h5
    cifar10_with_scale_variations_te10000_outsize64-64_scte0p595.h5
    cifar10_with_scale_variations_te10000_outsize64-64_scte0p707.h5
    cifar10_with_scale_variations_te10000_outsize64-64_scte0p841.h5
    cifar10_with_scale_variations_te10000_outsize64-64_scte1p000.h5
    cifar10_with_scale_variations_te10000_outsize64-64_scte1p189.h5
    cifar10_with_scale_variations_te10000_outsize64-64_scte1p414.h5
    cifar10_with_scale_variations_te10000_outsize64-64_scte1p682.h5
    cifar10_with_scale_variations_te10000_outsize64-64_scte2p000.h5

    These dataset files were used for the experiments presented in Figures 9, 10, 15, 16, 20 and 24 in [1].

    Instructions for loading the data set

    The datasets are saved in HDF5 format, with the partitions in the respective h5 files named as ('/x_train', '/x_val', '/x_test', '/y_train', '/y_test', '/y_val'); which ones exist depends on which data split is used.

    The training dataset can be loaded in Python as:

    with h5py.File(``, 'r') as f:
        x_train = np.array(f["/x_train"], dtype=np.float32)
        x_val = np.array(f["/x_val"], dtype=np.float32)
        x_test = np.array(f["/x_test"], dtype=np.float32)
        y_train = np.array(f["/y_train"], dtype=np.int32)
        y_val = np.array(f["/y_val"], dtype=np.int32)
        y_test = np.array(f["/y_test"], dtype=np.int32)

    We also need to permute the data, since PyTorch uses the format [num_samples, channels, width...
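The nine test-file names for the Rescaled CIFAR-10 dataset can be reproduced from the scale factors. The sketch below reads the scaling factor as 2^(k/4) for k in [-4, 4], which matches the listed suffixes 0p500 to 2p000. The formatting rule (three decimals, with "p" in place of the decimal point) is inferred from the listed names and is an assumption, as is the helper name `scale_tag`.

```python
def scale_tag(scale):
    """Format a scale factor like the listed file suffixes, e.g. 0.5 -> '0p500'."""
    return f"{scale:.3f}".replace(".", "p")


# Scale factors 2^(k/4) for k = -4, ..., 4
scales = [2 ** (k / 4) for k in range(-4, 5)]

test_files = [
    f"cifar10_with_scale_variations_te10000_outsize64-64_scte{scale_tag(s)}.h5"
    for s in scales
]

for name in test_files:
    print(name)
```

Adjacent scales differ by a constant factor of 2^(1/4), so the nine files cover the range from 0.5 to 2 in equal steps on a logarithmic scale.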

  4. CIFAR-10 Python in CSV

    • kaggle.com
    Updated Jun 22, 2021
    Cite
    fedesoriano (2021). CIFAR-10 Python in CSV [Dataset]. https://www.kaggle.com/fedesoriano/cifar10-python-in-csv
    Explore at:
    Croissant: Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jun 22, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    fedesoriano
    Description

    Context

    The CIFAR-10 dataset consists of 60000 32x32 colour images in 10 classes, with 6000 images per class. The classes are completely mutually exclusive. There are 50000 training images and 10000 test images.

    The batches.meta file contains the label names of each class.

    The dataset was originally divided into 5 training batches with 10000 images per batch. The original dataset can be found here: https://www.cs.toronto.edu/~kriz/cifar.html. This dataset contains all the training data and test data in the same CSV file, so it is easier to load.

    Content

    Here is the list of the 10 classes in the CIFAR-10:

    Classes:

    0: airplane
    1: automobile
    2: bird
    3: cat
    4: deer
    5: dog
    6: frog
    7: horse
    8: ship
    9: truck

    Acknowledgements

    • Learning Multiple Layers of Features from Tiny Images, Alex Krizhevsky, 2009.

    How to load the batches.meta file (Python)

    The function used to open the file:

    import pickle

    def unpickle(file):
        with open(file, 'rb') as fo:
            data = pickle.load(fo, encoding='bytes')
        return data

    Example of how to read the file:

    metadata_path = './cifar-10-python/batches.meta' # change this path
    metadata = unpickle(metadata_path)
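Because the file is loaded with `encoding='bytes'`, the dictionary keys and the label names come back as byte strings. The sketch below shows one way to turn them into the integer-to-name mapping listed above. It writes a small stand-in for `batches.meta` to a temporary directory so that it runs without the real download; the stand-in contents are an assumption modelled on the `label_names` key of the original CIFAR-10 Python files.

```python
import os
import pickle
import tempfile


def unpickle(file):
    """Load a CIFAR pickle file; keys and strings come back as bytes."""
    with open(file, "rb") as fo:
        return pickle.load(fo, encoding="bytes")


# Stand-in for batches.meta (hypothetical contents, for illustration only).
class_names = ["airplane", "automobile", "bird", "cat", "deer",
               "dog", "frog", "horse", "ship", "truck"]
fake_meta = {b"label_names": [name.encode("utf-8") for name in class_names]}

with tempfile.TemporaryDirectory() as tmp:
    metadata_path = os.path.join(tmp, "batches.meta")
    with open(metadata_path, "wb") as fo:
        pickle.dump(fake_meta, fo)
    metadata = unpickle(metadata_path)

# Decode the byte strings and build the integer-label -> class-name mapping.
label_to_name = {
    index: name.decode("utf-8")
    for index, name in enumerate(metadata[b"label_names"])
}
print(label_to_name[3])  # cat
```

With the real file, only `metadata_path` needs to change.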

  5. cifar10

    • huggingface.co
    Updated Jul 31, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Élie Goudout (2025). cifar10 [Dataset]. https://huggingface.co/datasets/ego-thales/cifar10
    Explore at:
    Dataset updated
    Jul 31, 2025
    Authors
    Élie Goudout
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Dataset Specifications

    Contains the entire CIFAR10 dataset, downloaded via PyTorch, then split and saved as .png files representing 32x32 images. There are three splits, perfectly balanced class-wise:

    • train: 49,000 out of the original 50,000 samples from the training set of CIFAR10;
    • calibration: 1,000 left-out samples from the training set;
    • test: 10,000 samples, the entire original test set.

    Every sample has a unique filename XXX.png where XXX goes from 0 to 59,999.

  6. Cifar 100 Dataset

    • universe.roboflow.com
    • opendatalab.com
    • +3more
    zip
    Updated Aug 11, 2022
    + more versions
    Cite
    Popular Benchmarks (2022). Cifar 100 Dataset [Dataset]. https://universe.roboflow.com/popular-benchmarks/cifar100
    Explore at:
    Available download formats: zip
    Dataset updated
    Aug 11, 2022
    Dataset authored and provided by
    Popular Benchmarks
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Variables measured
    Animals, People, CommonObjects
    Description

    CIFAR-100

    The CIFAR-10 and CIFAR-100 datasets are labeled subsets of the 80 million tiny images dataset. They were collected by Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton.

    • More info on CIFAR-100: https://www.cs.toronto.edu/~kriz/cifar.html
    • TensorFlow listing of the dataset: https://www.tensorflow.org/datasets/catalog/cifar100
    • GitHub repo for converting CIFAR-100 tarball files to png format: https://github.com/knjcode/cifar2png

    All images were sized 32x32 in the original dataset

    The CIFAR-10 dataset consists of 60,000 32x32 colour images in 10 classes, with 6,000 images per class. There are 50,000 training images and 10,000 test images [in the original dataset].

    This dataset is just like the CIFAR-10, except it has 100 classes containing 600 images each. There are 500 training images and 100 testing images per class. The 100 classes in the CIFAR-100 are grouped into 20 superclasses. Each image comes with a "fine" label (the class to which it belongs) and a "coarse" label (the superclass to which it belongs). However, this project does not contain the superclasses.

    • Superclasses version: https://universe.roboflow.com/popular-benchmarks/cifar100-with-superclasses/

    More background on the dataset: CIFAR-100 Dataset Classes and Superclasses (https://i.imgur.com/5w8A0Vm.png)

    Version 1 (original-images_Original-CIFAR100-Splits):

    • Original images, with the original splits for CIFAR-100: train (83.33% of images - 50,000 images) set and test (16.67% of images - 10,000 images) set only.
    • This version was not trained

    Version 2 (original-images_trainSetSplitBy80_20):

    • Original, raw images, with the train set split to provide 80% of its images to the training set (approximately 40,000 images) and 20% of its images to the validation set (approximately 10,000 images)
    • Trained from Roboflow Classification Model's ImageNet training checkpoint
    • Train/Valid/Test Split Rebalancing: https://blog.roboflow.com/train-test-split/ (image: https://i.imgur.com/kSPeKGn.png)
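The 80/20 division of the 50,000 training images described for Version 2 can be illustrated with a reproducible shuffle. This is only a sketch of a plain random split over image indices. The listing does not say how the actual split was made, so the function name, the seed and the lack of per-class stratification are all assumptions.

```python
import random


def split_train_valid(indices, valid_fraction=0.2, seed=0):
    """Shuffle indices reproducibly and split them into (train, valid) lists."""
    shuffled = list(indices)
    random.Random(seed).shuffle(shuffled)
    n_valid = int(round(len(shuffled) * valid_fraction))
    return shuffled[n_valid:], shuffled[:n_valid]


train_idx, valid_idx = split_train_valid(range(50_000))
print(len(train_idx), len(valid_idx))  # 40000 10000
```

A split like this is not balanced per class, which may be why the counts in the listing are given as approximate.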

    Citation:

    @TECHREPORT{Krizhevsky09learningmultiple,
      author = {Alex Krizhevsky},
      title = {Learning multiple layers of features from tiny images},
      institution = {},
      year = {2009}
    }
    
  7. cifar10_corrupted

    • tensorflow.org
    Updated Jun 1, 2024
    Cite
    (2024). cifar10_corrupted [Dataset]. https://www.tensorflow.org/datasets/catalog/cifar10_corrupted
    Explore at:
    Dataset updated
    Jun 1, 2024
    Description

    Cifar10Corrupted is a dataset generated by adding 15 common corruptions + 4 extra corruptions to the test images in the Cifar10 dataset. This dataset wraps the corrupted Cifar10 test images uploaded by the original authors.

    To use this dataset:

    import tensorflow_datasets as tfds
    
    ds = tfds.load('cifar10_corrupted', split='test')  # the corrupted images are built from the CIFAR-10 test set
    for ex in ds.take(4):
        print(ex)
    

    See the guide for more information on tensorflow_datasets.

    Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/cifar10_corrupted-brightness_1-1.0.0.png

  8. Data from "Benchmark Generation Framework with Customizable Distortions for...

    • data.niaid.nih.gov
    • explore.openaire.eu
    Updated Jun 14, 2023
    Cite
    Guillen, Antonio (2023). Data from "Benchmark Generation Framework with Customizable Distortions for Image Classifier Robustness" [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8034832
    Explore at:
    Dataset updated
    Jun 14, 2023
    Dataset provided by
    Naug, Avisek
    Ramesh Babu, Ashwin
    Ghorbanpour, Sahand
    Sarkar, Soumyendu
    Mousavi, Sajad
    Carmichael, Zachariah
    Guillen, Antonio
    Luna Gutierrez, Ricardo
    Gundecha, Vineet
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    This repository contains the data from the paper, "Benchmark Generation Framework with Customizable Distortions for Image Classifier Robustness."

    Relevant URLs:

    https://hewlettpackard.github.io/trust-ml/

    https://github.com/HewlettPackard/trust-ml/

    Abstract:

    We present a novel framework for generating adversarial benchmarks to evaluate the robustness of image classification models. The RLAB framework allows users to customize the types of distortions to be optimally applied to images, which helps address the specific distortions relevant to their deployment. The benchmark can generate datasets at various distortion levels to assess the robustness of different image classifiers. Our results show that the adversarial samples generated by our framework with any of the image classification models, like ResNet-50, Inception-V3, and VGG-16, are effective and transferable to other models causing them to fail. These failures happen even when these models are adversarially retrained using state-of-the-art techniques, demonstrating the generalizability of our adversarial samples. Our framework also allows the creation of adversarial samples for non-ground truth classes at different levels of intensity, enabling tunable benchmarks for the evaluation of false positives. We achieve competitive performance in terms of net $L_2$ distortion compared to state-of-the-art benchmark techniques on CIFAR-10 and ImageNet; however, we demonstrate our framework achieves such results with simple distortions like Gaussian noise without introducing unnatural artifacts or color bleeds. This is made possible by a model-based reinforcement learning (RL) agent and a technique that reduces a deep tree search of the image for model sensitivity to perturbations, to a one-level analysis and action. The flexibility of choosing distortions and setting classification probability thresholds for multiple classes makes our framework suitable for algorithmic audits.

  9. cinic10

    • huggingface.co
    Updated Aug 10, 2024
    Cite
    Flower Labs (2024). cinic10 [Dataset]. https://huggingface.co/datasets/flwrlabs/cinic10
    Explore at:
    Croissant: Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Aug 10, 2024
    Dataset provided by
    Flower Labs GmbH
    Authors
    Flower Labs
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Dataset Card for CINIC-10

    CINIC-10 has a total of 270,000 images equally split amongst three subsets: train, validate, and test. This means that CINIC-10 has 4.5 times as many samples as CIFAR-10.

      Dataset Details
    

    In each subset (90,000 images), there are ten classes (identical to CIFAR-10 classes). There are 9000 images per class per subset. Using the suggested data split (an equal three-way split), CINIC-10 has 1.8 times as many training samples as in CIFAR-10.… See the full description on the dataset page: https://huggingface.co/datasets/flwrlabs/cinic10.
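The two ratios quoted for CINIC-10 follow directly from the image counts. The short check below uses the CINIC-10 figures from this description and the CIFAR-10 figures of 60,000 images in total and 50,000 for training; the variable names are illustrative.

```python
CIFAR10_TOTAL, CIFAR10_TRAIN = 60_000, 50_000
CINIC10_TOTAL = 270_000
CINIC10_SUBSETS = 3   # train, validate, test
NUM_CLASSES = 10

cinic10_train = CINIC10_TOTAL // CINIC10_SUBSETS      # images per subset
total_ratio = CINIC10_TOTAL / CIFAR10_TOTAL           # whole dataset vs CIFAR-10
train_ratio = cinic10_train / CIFAR10_TRAIN           # training subset vs CIFAR-10 training set
per_class_per_subset = cinic10_train // NUM_CLASSES

print(cinic10_train, total_ratio, train_ratio, per_class_per_subset)
```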

  10. Stanford STL-10 Image Dataset

    • academictorrents.com
    bittorrent
    Updated Nov 26, 2015
    + more versions
    Cite
    Adam Coates and Honglak Lee and Andrew Y. Ng (2015). Stanford STL-10 Image Dataset [Dataset]. https://academictorrents.com/details/a799a2845ac29a66c07cf74e2a2838b6c5698a6a
    Explore at:
    Available download formats: bittorrent (2640397119)
    Dataset updated
    Nov 26, 2015
    Dataset authored and provided by
    Adam Coates and Honglak Lee and Andrew Y. Ng
    License

    https://academictorrents.com/nolicensespecified

    Description

    The STL-10 dataset is an image recognition dataset for developing unsupervised feature learning, deep learning, and self-taught learning algorithms. It is inspired by the CIFAR-10 dataset but with some modifications. In particular, each class has fewer labeled training examples than in CIFAR-10, but a very large set of unlabeled examples is provided to learn image models prior to supervised training. The primary challenge is to make use of the unlabeled data (which comes from a similar but different distribution from the labeled data) to build a useful prior. We also expect that the higher resolution of this dataset (96x96) will make it a challenging benchmark for developing more scalable unsupervised learning methods.

    Overview

    • 10 classes: airplane, bird, car, cat, deer, dog, horse, monkey, ship, truck.
    • Images are 96x96 pixels, color.
    • 500 training images (10 pre-defined folds), 800 test images per class.
    • 100000 unlabeled images for uns

  11. Dollar street 10 - 64x64x3

    • zenodo.org
    • data.niaid.nih.gov
    bin
    Updated May 6, 2025
    Cite
    Sven van der burg; Sven van der burg (2025). Dollar street 10 - 64x64x3 [Dataset]. http://doi.org/10.5281/zenodo.10970014
    Explore at:
    Available download formats: bin
    Dataset updated
    May 6, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Sven van der burg; Sven van der burg
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    The MLCommons Dollar Street Dataset is a collection of images of everyday household items from homes around the world that visually captures socioeconomic diversity of traditionally underrepresented populations. It consists of public domain data, licensed for academic, commercial and non-commercial usage, under CC-BY and CC-BY-SA 4.0. The dataset was developed because similar datasets lack socioeconomic metadata and are not representative of global diversity.

    This is a subset of the original dataset that can be used for multiclass classification with 10 categories. It is designed to be used in teaching, similar to the widely used, but unlicensed CIFAR-10 dataset.

    These are the preprocessing steps that were performed:

    1. Only take examples with one imagenet_synonym label
    2. Use only examples with the 10 most frequently occurring labels
    3. Downscale images to 64 x 64 pixels
    4. Split data into train and test
    5. Store as numpy array

    This is the label mapping:

    Category          label
    day bed           0
    dishrag           1
    plate             2
    running shoe      3
    soap dispenser    4
    street sign       5
    table lamp        6
    tile roof         7
    toilet seat       8
    washing machine   9

    Check out this notebook to see how the subset was created: https://github.com/carpentries-lab/deep-learning-intro/blob/main/instructors/prepare-dollar-street-data.ipynb

    The original dataset was downloaded from https://www.kaggle.com/datasets/mlcommons/the-dollar-street-dataset. See https://mlcommons.org/datasets/dollar-street/ for more information.
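For working with the Dollar Street labels in code, the mapping given above can be written as a dictionary. This is a minimal sketch; the names `LABEL_TO_CATEGORY`, `CATEGORY_TO_LABEL` and `decode_labels` are hypothetical and not part of the dataset.

```python
# Label mapping copied from the dataset description.
LABEL_TO_CATEGORY = {
    0: "day bed",
    1: "dishrag",
    2: "plate",
    3: "running shoe",
    4: "soap dispenser",
    5: "street sign",
    6: "table lamp",
    7: "tile roof",
    8: "toilet seat",
    9: "washing machine",
}

# Reverse lookup, from category name to integer label.
CATEGORY_TO_LABEL = {name: label for label, name in LABEL_TO_CATEGORY.items()}


def decode_labels(labels):
    """Turn a sequence of integer labels into category names."""
    return [LABEL_TO_CATEGORY[int(label)] for label in labels]


print(decode_labels([3, 9, 0]))  # ['running shoe', 'washing machine', 'day bed']
```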

  12. CIFAR-100 Python

    • kaggle.com
    zip
    Updated Dec 26, 2020
    Cite
    fedesoriano (2020). CIFAR-100 Python [Dataset]. https://www.kaggle.com/fedesoriano/cifar100
    Explore at:
    Available download formats: zip (168517809 bytes)
    Dataset updated
    Dec 26, 2020
    Authors
    fedesoriano
    Description

    Similar Datasets:

    CIFAR-10 Python (in CSV): LINK

    Context

    The CIFAR-100 dataset consists of 60000 32x32 colour images in 100 classes, with 600 images per class. The 100 classes in the CIFAR-100 are grouped into 20 superclasses. Each image comes with a "fine" label (the class to which it belongs) and a "coarse" label (the superclass to which it belongs). There are 50000 training images and 10000 test images. The meta file contains the label names of each class and superclass.

    Content

    Here is the list of the 100 classes in the CIFAR-100:

    Classes:
    1-5) beaver, dolphin, otter, seal, whale
    6-10) aquarium fish, flatfish, ray, shark, trout
    11-15) orchids, poppies, roses, sunflowers, tulips
    16-20) bottles, bowls, cans, cups, plates
    21-25) apples, mushrooms, oranges, pears, sweet peppers
    26-30) clock, computer keyboard, lamp, telephone, television
    31-35) bed, chair, couch, table, wardrobe
    36-40) bee, beetle, butterfly, caterpillar, cockroach
    41-45) bear, leopard, lion, tiger, wolf
    46-50) bridge, castle, house, road, skyscraper
    51-55) cloud, forest, mountain, plain, sea
    56-60) camel, cattle, chimpanzee, elephant, kangaroo
    61-65) fox, porcupine, possum, raccoon, skunk
    66-70) crab, lobster, snail, spider, worm
    71-75) baby, boy, girl, man, woman
    76-80) crocodile, dinosaur, lizard, snake, turtle
    81-85) hamster, mouse, rabbit, shrew, squirrel
    86-90) maple, oak, palm, pine, willow
    91-95) bicycle, bus, motorcycle, pickup truck, train
    96-100) lawn-mower, rocket, streetcar, tank, tractor

    and the list of the 20 superclasses:
    1) aquatic mammals (classes 1-5)
    2) fish (classes 6-10)
    3) flowers (classes 11-15)
    4) food containers (classes 16-20)
    5) fruit and vegetables (classes 21-25)
    6) household electrical devices (classes 26-30)
    7) household furniture (classes 31-35)
    8) insects (classes 36-40)
    9) large carnivores (classes 41-45)
    10) large man-made outdoor things (classes 46-50)
    11) large natural outdoor scenes (classes 51-55)
    12) large omnivores and herbivores (classes 56-60)
    13) medium-sized mammals (classes 61-65)
    14) non-insect invertebrates (classes 66-70)
    15) people (classes 71-75)
    16) reptiles (classes 76-80)
    17) small mammals (classes 81-85)
    18) trees (classes 86-90)
    19) vehicles 1 (classes 91-95)
    20) vehicles 2 (classes 96-100)

    Acknowledgements

    • Learning Multiple Layers of Features from Tiny Images, Alex Krizhevsky, 2009.

    How to load the data (Python)

    The function used to open each file:

    import pickle

    def unpickle(file):
        with open(file, 'rb') as fo:
            data = pickle.load(fo, encoding='bytes')
        return data

    Example of how to read the metadata and the superclasses:

    metadata_path = './cifar-100-python/meta' # change this path
    metadata = unpickle(metadata_path)
    superclass_dict = dict(list(enumerate(metadata[b'coarse_label_names'])))

    How to load the training and test sets (using superclasses):

    ```
    import numpy as np

    data_pre_path = './cifar-100-python/' # change this path

    # File paths
    data_train_path = data_pre_path + 'train'
    data_test_path = data_pre_path + 'test'

    # Read dictionary
    data_train_dict = unpickle(data_train_path)
    data_test_dict = unpickle(data_test_path)

    # Get data (change the coarse_labels if you want to use the 100 classes)
    data_train = data_train_dict[b'data']
    label_train = np.array(data_train_dict[b'coarse_labels'])
    data_test = data_test_dict[b'data']
    label_test = np.array(data_test_dict[b'coarse_labels'])
    ```
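Each row of the loaded `data` array is one flattened image. The sketch below assumes the layout documented for the original CIFAR Python files: 3072 values per row, with the first 1024 holding the red channel, the next 1024 the green channel and the last 1024 the blue channel, each stored row by row. The helper name `pixel_at` and the synthetic row are illustrative only.

```python
def pixel_at(flat_row, row, col, size=32):
    """Return (r, g, b) at (row, col) from one flat 3 * size * size image row."""
    plane = size * size
    if len(flat_row) != 3 * plane:
        raise ValueError(f"expected {3 * plane} values, got {len(flat_row)}")
    offset = row * size + col  # position inside one colour plane
    return flat_row[offset], flat_row[plane + offset], flat_row[2 * plane + offset]


# Synthetic row whose values equal their own position, to make the layout visible.
fake_row = list(range(3072))
print(pixel_at(fake_row, 0, 0))  # (0, 1024, 2048)
print(pixel_at(fake_row, 5, 7))  # (167, 1191, 2215)
```

The same indexing works on a single row of the loaded array, for example `pixel_at(data_train[0], 5, 7)`.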

  13. aclcifar10

    • huggingface.co
    Updated May 16, 2023
    Cite
    NTU Computational Learning Lab (2023). aclcifar10 [Dataset]. https://huggingface.co/datasets/ntucllab/aclcifar10
    Explore at:
    Dataset updated
    May 16, 2023
    Dataset authored and provided by
    NTU Computational Learning Lab
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Dataset Card for ACLCIFAR10

    This Complementary labeled CIFAR10 dataset contains auto-labeled complementary labels for all 50000 images in the training split of CIFAR10. For more details, please visit our github or paper.

      Dataset Structure
    
    
    
    
    
      Data Instances
    

    A sample from the training set is provided below: { 'images':

  14. Data_Sheet_1_Building One-Shot Semi-Supervised (BOSS) Learning Up to Fully...

    • frontiersin.figshare.com
    pdf
    Updated May 30, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Leslie N. Smith; Adam Conovaloff (2023). Data_Sheet_1_Building One-Shot Semi-Supervised (BOSS) Learning Up to Fully Supervised Performance.pdf [Dataset]. http://doi.org/10.3389/frai.2022.880729.s001
    Explore at:
    Available download formats: pdf
    Dataset updated
    May 30, 2023
    Dataset provided by
    Frontiers
    Authors
    Leslie N. Smith; Adam Conovaloff
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Reaching the performance of fully supervised learning with unlabeled data and only one labeled sample per class might be ideal for deep learning applications. We demonstrate for the first time the potential for building one-shot semi-supervised (BOSS) learning on CIFAR-10 and SVHN up to test accuracies that are comparable to fully supervised learning. Our method combines class prototype refining, class balancing, and self-training. A good prototype choice is essential and we propose a technique for obtaining iconic examples. In addition, we demonstrate that class balancing methods substantially improve accuracy results in semi-supervised learning to levels that allow self-training to reach the level of fully supervised learning performance. Our experiments demonstrate the value of computing and analyzing test accuracies for every class, rather than only a total test accuracy. We show that our BOSS methodology can obtain total test accuracies with CIFAR-10 images and only one labeled sample per class of up to 95% (compared to 94.5% for fully supervised). Similarly, the SVHN images obtain test accuracies of 97.8%, compared to 98.27% for fully supervised. Rigorous empirical evaluations provide evidence that labeling large datasets is not necessary for training deep neural networks. Our code is available at https://github.com/lnsmith54/BOSS to facilitate replication.

  15. clcifar10

    • huggingface.co
    Updated Jul 4, 2025
    Cite
    NTU Computational Learning Lab (2025). clcifar10 [Dataset]. https://huggingface.co/datasets/ntucllab/clcifar10
    Explore at:
    Dataset updated
    Jul 4, 2025
    Dataset authored and provided by
    NTU Computational Learning Lab
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Dataset Card for CLCIFAR10

    This Complementary labeled CIFAR10 dataset contains 3 human-annotated complementary labels for all 50000 images in the training split of CIFAR10. The workers are from Amazon Mechanical Turk. We randomly sampled 4 different labels for 3 different annotators, so each image would have 3 (probably repeated) complementary labels. For more details, please visit our github or paper.

      Dataset Structure
    
    
    
    
    
    
    
      Data Instances
    

    A sample from the… See the full description on the dataset page: https://huggingface.co/datasets/ntucllab/clcifar10.

  16. CEs dataset - Dataset - LDM

    • service.tib.eu
    Updated Dec 16, 2024
    Cite
    (2024). CEs dataset - Dataset - LDM [Dataset]. https://service.tib.eu/ldmservice/dataset/ces-dataset
    Explore at:
    Dataset updated
    Dec 16, 2024
    Description

    The dataset used in the paper is a counterfactual examples (CEs) dataset, which is generated using a diffusion model. The dataset consists of images from the CIFAR10 and SVHN datasets, with each image paired with its corresponding CE.
