The CIFAR-10 dataset consists of 60000 32x32 colour images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images.
To use this dataset:
import tensorflow_datasets as tfds
ds = tfds.load('cifar10', split='train')
for ex in ds.take(4):
    print(ex)
See the guide for more information on tensorflow_datasets.
Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/cifar10-3.0.2.png
Open Data Commons Database Contents License (DbCL) v1.0: http://opendatacommons.org/licenses/dbcl/1.0/
CIFAR-10 features extracted from a VGG19 model with 85% test accuracy. The dataset consists of features from all 19 layers for the 10,000 test images. Each layer has its own directory; inside each layer directory, the features are stored per batch, with filenames running from 1 to (in the current case) 5 and a batch size of 2,000 images.
The image dataset is from version 1; refer to that dataset to get the image, the label and the filename. Concatenate the batch files in ascending order to get an index-to-index correspondence between the images and the features. Extracted from notebook version 10.
There are 16 feature layer directories (conv1_1, conv1_2, ..., conv5_4) and 3 classifier layer directories (lin1, lin2, and lin3). Each layer directory consists of saved files 1 to 5 in serialized NumPy using Python pickle. Each file of batch size 2,000 has shape [2000, dim1, dim2, dim3] for feature layers or [2000, dim1] for classifier layers.
Example for feature layer:
conv1_1 (2000, 64, 32, 32)
conv1_2 (2000, 64, 32, 32)
conv2_1 (2000, 128, 16, 16)
conv2_2 (2000, 128, 16, 16)
conv3_1 (2000, 256, 8, 8)
conv3_2 (2000, 256, 8, 8)
conv3_3 (2000, 256, 8, 8)
conv3_4 (2000, 256, 8, 8)
conv4_1 (2000, 512, 4, 4)
conv4_2 (2000, 512, 4, 4)
conv4_3 (2000, 512, 4, 4)
conv4_4 (2000, 512, 4, 4)
conv5_1 (2000, 512, 2, 2)
conv5_2 (2000, 512, 2, 2)
conv5_3 (2000, 512, 2, 2)
conv5_4 (2000, 512, 2, 2)
Example for classifier layer:
lin1 (2000, 4096)
lin2 (2000, 4096)
lin3 (2000, 10)
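The per-layer batch files can be loaded and concatenated with a short script. A minimal sketch follows, assuming the layer directories sit under a local features/ folder and that the batch files are simply named 1 to 5 as described above (both are assumptions; adjust the paths to the actual download):
```
import pickle
import numpy as np

def load_layer_features(layer_dir, num_batches=5):
    # Concatenate batch files 1..num_batches in ascending order so that
    # feature index i corresponds to image index i in the CIFAR-10 test set.
    batches = []
    for i in range(1, num_batches + 1):
        with open(f"{layer_dir}/{i}", "rb") as f:  # hypothetical path layout
            batches.append(pickle.load(f))
    return np.concatenate(batches, axis=0)

# Example (path is an assumption):
# conv1_1 = load_layer_features("features/conv1_1")  # expected shape (10000, 64, 32, 32)
```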
Motivation

The goal of introducing the Rescaled CIFAR-10 dataset is to provide a dataset that contains scale variations (up to a factor of 4), to evaluate the ability of networks to generalise to scales not present in the training data.

The Rescaled CIFAR-10 dataset was introduced in the paper:

[1] A. Perzanowski and T. Lindeberg (2025) "Scale generalisation properties of extended scale-covariant and scale-invariant Gaussian derivative networks on image datasets with spatial scaling variations", Journal of Mathematical Imaging and Vision, to appear.

with a pre-print available at arXiv:

[2] Perzanowski and Lindeberg (2024) "Scale generalisation properties of extended scale-covariant and scale-invariant Gaussian derivative networks on image datasets with spatial scaling variations", arXiv preprint arXiv:2409.11140.

Importantly, the Rescaled CIFAR-10 dataset contains substantially more natural textures and patterns than the MNIST Large Scale dataset, introduced in:

[3] Y. Jansson and T. Lindeberg (2022) "Scale-invariant scale-channel networks: Deep networks that generalise to previously unseen scales", Journal of Mathematical Imaging and Vision, 64(5): 506-536, https://doi.org/10.1007/s10851-022-01082-2

and is therefore significantly more challenging.

Access and rights

The Rescaled CIFAR-10 dataset is provided on the condition that you provide proper citation for the original CIFAR-10 dataset:

[4] Krizhevsky, A. and Hinton, G. (2009). Learning multiple layers of features from tiny images. Tech. rep., University of Toronto.

and also for this new rescaled version, using reference [1] above.

The dataset is made available on request. If you would be interested in trying out this dataset, please make a request in the system below, and we will grant you access as soon as possible.

The dataset

The Rescaled CIFAR-10 dataset is generated by rescaling 32×32 RGB images of animals and vehicles from the original CIFAR-10 dataset [4]. The scale variations are up to a factor of 4. In order to have all test images at the same resolution, mirror extension is used to extend the images to size 64×64. The imresize() function in Matlab was used for the rescaling, with default anti-aliasing turned on, and bicubic interpolation overshoot removed by clipping to the [0, 255] range. The details of how the dataset was created can be found in [1].

There are 10 distinct classes in the dataset: "airplane", "automobile", "bird", "cat", "deer", "dog", "frog", "horse", "ship" and "truck". In the dataset, these are represented by integer labels in the range [0, 9].

The dataset is split into 40 000 training samples, 10 000 validation samples and 10 000 testing samples. The training dataset is generated using the initial 40 000 samples from the original CIFAR-10 training set. The validation dataset, on the other hand, is formed from the final 10 000 image batch of that same training set. For testing, all test datasets are built from the 10 000 images contained in the original CIFAR-10 test set.

The h5 files containing the dataset

The training dataset file (~5.9 GB) for scale 1, which also contains the corresponding validation and test data for the same scale, is:

cifar10_with_scale_variations_tr40000_vl10000_te10000_outsize64-64_scte1p000_scte1p000.h5

Additionally, for the Rescaled CIFAR-10 dataset, there are 9 datasets (~1 GB each) for testing scale generalisation at scales not present in the training set.
Each of these datasets is rescaled using a different image scaling factor 2^(k/4), with k being an integer in the range [-4, 4]:

cifar10_with_scale_variations_te10000_outsize64-64_scte0p500.h5
cifar10_with_scale_variations_te10000_outsize64-64_scte0p595.h5
cifar10_with_scale_variations_te10000_outsize64-64_scte0p707.h5
cifar10_with_scale_variations_te10000_outsize64-64_scte0p841.h5
cifar10_with_scale_variations_te10000_outsize64-64_scte1p000.h5
cifar10_with_scale_variations_te10000_outsize64-64_scte1p189.h5
cifar10_with_scale_variations_te10000_outsize64-64_scte1p414.h5
cifar10_with_scale_variations_te10000_outsize64-64_scte1p682.h5
cifar10_with_scale_variations_te10000_outsize64-64_scte2p000.h5

These dataset files were used for the experiments presented in Figures 9, 10, 15, 16, 20 and 24 in [1].

Instructions for loading the dataset

The datasets are saved in HDF5 format, with the partitions in the respective h5 files named '/x_train', '/x_val', '/x_test', '/y_train', '/y_test' and '/y_val'; which ones exist depends on which data split is used. The training dataset can be loaded in Python as:

import h5py
import numpy as np

with h5py.File('cifar10_with_scale_variations_tr40000_vl10000_te10000_outsize64-64_scte1p000_scte1p000.h5', 'r') as f:
    x_train = np.array(f["/x_train"], dtype=np.float32)
    x_val = np.array(f["/x_val"], dtype=np.float32)
    x_test = np.array(f["/x_test"], dtype=np.float32)
    y_train = np.array(f["/y_train"], dtype=np.int32)
    y_val = np.array(f["/y_val"], dtype=np.int32)
    y_test = np.array(f["/y_test"], dtype=np.int32)

We also need to permute the data, since PyTorch uses the channels-first format [num_samples, channels, ...].
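Continuing from the loading snippet above, a minimal sketch of the permutation step is given below; it assumes the h5 arrays are stored channels-last as [num_samples, height, width, channels], which should be checked against the actual array shapes:
```
import numpy as np
import torch

# Assumed stored layout: (N, H, W, C); PyTorch convolutions expect (N, C, H, W).
x_train_t = torch.from_numpy(np.transpose(x_train, (0, 3, 1, 2))).contiguous()
y_train_t = torch.from_numpy(y_train).long()
print(x_train_t.shape)  # expected: (40000, 3, 64, 64)
```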
The CIFAR-10 dataset consists of 60000 32x32 colour images in 10 classes, with 6000 images per class. The classes are completely mutually exclusive. There are 50000 training images and 10000 test images.
The batches.meta file contains the label names of each class.
The dataset was originally divided into 5 training batches with 10000 images per batch. The original dataset can be found here: https://www.cs.toronto.edu/~kriz/cifar.html. This dataset contains all the training data and test data in the same CSV file, so it is easier to load.
Here is the list of the 10 classes in the CIFAR-10:
Classes:
0: airplane
1: automobile
2: bird
3: cat
4: deer
5: dog
6: frog
7: horse
8: ship
9: truck
The function used to open the file:
def unpickle(file):
    import pickle
    with open(file, 'rb') as fo:
        dict = pickle.load(fo, encoding='bytes')
    return dict
Example of how to read the file:
metadata_path = './cifar-10-python/batches.meta' # change this path
metadata = unpickle(metadata_path)
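Assuming the standard CIFAR-10 batches.meta layout, where the class names are stored under the key b'label_names' (keys are bytes because of encoding='bytes'), the names can then be read as:
```
# Decode the class names stored in batches.meta (key name assumed from the
# standard CIFAR-10 python-version layout).
label_names = [name.decode('utf-8') for name in metadata[b'label_names']]
print(label_names)  # ['airplane', 'automobile', ..., 'truck']
```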
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Dataset Specifications
Contains the entire CIFAR10 dataset, downloaded via PyTorch, then split and saved as .png files representing 32x32 images. There are three splits, perfectly balanced class-wise:
train: 49,000 out of the original 50,000 samples from the training set of CIFAR10; calibration: 1,000 left-out samples from the training set; test: 10,000 samples, the entire original test set.
Every sample has a unique filename XXX.png where XXX goes from 0 to 59,999.
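A minimal sketch for inspecting one of the saved samples is shown below; the folder names ('train', 'calibration', 'test') and the root path are assumptions based on the split description above:
```
from PIL import Image
import numpy as np

# Hypothetical path: adjust to the actual directory layout of the download.
img = Image.open("cifar10_png/train/0.png")
arr = np.array(img)
print(arr.shape, arr.dtype)  # expected: (32, 32, 3) uint8
```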
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The CIFAR-10 and CIFAR-100 datasets contain labeled subsets of the 80 million tiny images dataset. They were collected by Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton.
* More info on CIFAR-100: https://www.cs.toronto.edu/~kriz/cifar.html
* TensorFlow listing of the dataset: https://www.tensorflow.org/datasets/catalog/cifar100
* GitHub repo for converting CIFAR-100 tarball files to png format: https://github.com/knjcode/cifar2png
The CIFAR-10 dataset consists of 60,000 32x32 colour images in 10 classes, with 6,000 images per class. There are 50,000 training images and 10,000 test images [in the original dataset].
This dataset is just like the CIFAR-10, except it has 100 classes containing 600 images each. There are 500 training images and 100 testing images per class. The 100 classes in the CIFAR-100 are grouped into 20 superclasses. Each image comes with a "fine" label (the class to which it belongs) and a "coarse" label (the superclass to which it belongs). However, this project does not contain the superclasses.
* Superclasses version: https://universe.roboflow.com/popular-benchmarks/cifar100-with-superclasses/
More background on the dataset:
CIFAR-100 dataset classes and superclasses: https://i.imgur.com/5w8A0Vm.png
The dataset provides a train set (83.33% of images - 50,000 images) and a test set (16.67% of images - 10,000 images) only. The train set was further split to provide 80% of its images to the training set (approximately 40,000 images) and 20% of its images to the validation set (approximately 10,000 images); a sketch of such a split is given after the citation below.

@TECHREPORT{Krizhevsky09learningmultiple,
author = {Alex Krizhevsky},
title = {Learning multiple layers of features from tiny images},
institution = {},
year = {2009}
}
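The 80/20 train/validation split described above can be reproduced approximately with a few lines of NumPy; this is only an illustration, since the exact indices used by the project are not specified:
```
import numpy as np

# Shuffle the 50,000 original training indices and split them 80/20.
rng = np.random.default_rng(0)
indices = rng.permutation(50_000)
train_idx, val_idx = indices[:40_000], indices[40_000:]
print(len(train_idx), len(val_idx))  # 40000 10000
```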
Cifar10Corrupted is a dataset generated by adding 15 common corruptions + 4 extra corruptions to the test images in the Cifar10 dataset. This dataset wraps the corrupted Cifar10 test images uploaded by the original authors.
To use this dataset:
import tensorflow_datasets as tfds
ds = tfds.load('cifar10_corrupted', split='test')  # CIFAR-10-C provides only a test split
for ex in ds.take(4):
    print(ex)
See the guide for more information on tensorflow_datasets.
Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/cifar10_corrupted-brightness_1-1.0.0.png
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This repository contains the data from the paper, "Benchmark Generation Framework with Customizable Distortions for Image Classifier Robustness."
Relevant URLs:
https://hewlettpackard.github.io/trust-ml/
https://github.com/HewlettPackard/trust-ml/
Abstract:
We present a novel framework for generating adversarial benchmarks to evaluate the robustness of image classification models. The RLAB framework allows users to customize the types of distortions to be optimally applied to images, which helps address the specific distortions relevant to their deployment. The benchmark can generate datasets at various distortion levels to assess the robustness of different image classifiers. Our results show that the adversarial samples generated by our framework with any of the image classification models, like ResNet-50, Inception-V3, and VGG-16, are effective and transferable to other models causing them to fail. These failures happen even when these models are adversarially retrained using state-of-the-art techniques, demonstrating the generalizability of our adversarial samples. Our framework also allows the creation of adversarial samples for non-ground truth classes at different levels of intensity, enabling tunable benchmarks for the evaluation of false positives. We achieve competitive performance in terms of net $L_2$ distortion compared to state-of-the-art benchmark techniques on CIFAR-10 and ImageNet; however, we demonstrate our framework achieves such results with simple distortions like Gaussian noise without introducing unnatural artifacts or color bleeds. This is made possible by a model-based reinforcement learning (RL) agent and a technique that reduces a deep tree search of the image for model sensitivity to perturbations, to a one-level analysis and action. The flexibility of choosing distortions and setting classification probability thresholds for multiple classes makes our framework suitable for algorithmic audits.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dataset Card for CINIC-10
CINIC-10 has a total of 270,000 images equally split amongst three subsets: train, validate, and test. This means that CINIC-10 has 4.5 times as many samples as CIFAR-10.
Dataset Details
In each subset (90,000 images), there are ten classes (identical to the CIFAR-10 classes). There are 9,000 images per class per subset. Using the suggested data split (an equal three-way split), CINIC-10 has 1.8 times as many training samples as CIFAR-10. See the full description on the dataset page: https://huggingface.co/datasets/flwrlabs/cinic10.
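Since the card points to the Hugging Face dataset page, one way (not stated in the card itself) to load CINIC-10 is via the datasets library; the split names below are assumptions based on the three-way split described above:
```
from datasets import load_dataset

# Split names ("train", "validation", "test") are assumed; check the dataset
# page for the exact configuration.
cinic10_train = load_dataset("flwrlabs/cinic10", split="train")
print(cinic10_train[0])  # expected: an image together with its class label
```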
No license specified: https://academictorrents.com/nolicensespecified
The STL-10 dataset is an image recognition dataset for developing unsupervised feature learning, deep learning, and self-taught learning algorithms. It is inspired by the CIFAR-10 dataset but with some modifications. In particular, each class has fewer labeled training examples than in CIFAR-10, but a very large set of unlabeled examples is provided to learn image models prior to supervised training. The primary challenge is to make use of the unlabeled data (which comes from a similar but different distribution from the labeled data) to build a useful prior. We also expect that the higher resolution of this dataset (96x96) will make it a challenging benchmark for developing more scalable unsupervised learning methods. Overview: 10 classes: airplane, bird, car, cat, deer, dog, horse, monkey, ship, truck. Images are 96x96 pixels, color. 500 training images (10 pre-defined folds) and 800 test images per class. 100000 unlabeled images for unsupervised learning.
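The description above does not say how to obtain STL-10; one convenient route (an assumption, not part of the original text) is the torchvision dataset wrapper, which exposes the labeled folds and the unlabeled set:
```
from torchvision import datasets

# Download the unlabeled split (100000 96x96 colour images) to ./data.
unlabeled = datasets.STL10(root="./data", split="unlabeled", download=True)
print(len(unlabeled))  # 100000
```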
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The MLCommons Dollar Street Dataset is a collection of images of everyday household items from homes around the world that visually captures socioeconomic diversity of traditionally underrepresented populations. It consists of public domain data, licensed for academic, commercial and non-commercial usage, under CC-BY and CC-BY-SA 4.0. The dataset was developed because similar datasets lack socioeconomic metadata and are not representative of global diversity.
This is a subset of the original dataset that can be used for multiclass classification with 10 categories. It is designed to be used in teaching, similar to the widely used, but unlicensed CIFAR-10 dataset.
These are the preprocessing steps that were performed:
This is the label mapping:
| Category | Label |
| --- | --- |
| day bed | 0 |
| dishrag | 1 |
| plate | 2 |
| running shoe | 3 |
| soap dispenser | 4 |
| street sign | 5 |
| table lamp | 6 |
| tile roof | 7 |
| toilet seat | 8 |
| washing machine | 9 |
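For convenience, the label mapping above can be transcribed into a plain Python dict, e.g. for decoding model predictions:
```
# Transcribed from the label mapping table above.
label_to_id = {
    "day bed": 0, "dishrag": 1, "plate": 2, "running shoe": 3,
    "soap dispenser": 4, "street sign": 5, "table lamp": 6,
    "tile roof": 7, "toilet seat": 8, "washing machine": 9,
}
id_to_label = {v: k for k, v in label_to_id.items()}
print(id_to_label[3])  # running shoe
```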
Check out this notebook to see how the subset was created: https://github.com/carpentries-lab/deep-learning-intro/blob/main/instructors/prepare-dollar-street-data.ipynb
The original dataset was downloaded from https://www.kaggle.com/datasets/mlcommons/the-dollar-street-dataset. See https://mlcommons.org/datasets/dollar-street/ for more information.
CIFAR-10 Python (in CSV): LINK
The CIFAR-100 dataset consists of 60000 32x32 colour images in 100 classes, with 600 images per class. The 100 classes in the CIFAR-100 are grouped into 20 superclasses. Each image comes with a "fine" label (the class to which it belongs) and a "coarse" label (the superclass to which it belongs). There are 50000 training images and 10000 test images. The meta file contains the label names of each class and superclass.
Here is the list of the 100 classes in the CIFAR-100:
Classes:
1-5) beaver, dolphin, otter, seal, whale
6-10) aquarium fish, flatfish, ray, shark, trout
11-15) orchids, poppies, roses, sunflowers, tulips
16-20) bottles, bowls, cans, cups, plates
21-25) apples, mushrooms, oranges, pears, sweet peppers
26-30) clock, computer keyboard, lamp, telephone, television
31-35) bed, chair, couch, table, wardrobe
36-40) bee, beetle, butterfly, caterpillar, cockroach
41-45) bear, leopard, lion, tiger, wolf
46-50) bridge, castle, house, road, skyscraper
51-55) cloud, forest, mountain, plain, sea
56-60) camel, cattle, chimpanzee, elephant, kangaroo
61-65) fox, porcupine, possum, raccoon, skunk
66-70) crab, lobster, snail, spider, worm
71-75) baby, boy, girl, man, woman
76-80) crocodile, dinosaur, lizard, snake, turtle
81-85) hamster, mouse, rabbit, shrew, squirrel
86-90) maple, oak, palm, pine, willow
91-95) bicycle, bus, motorcycle, pickup truck, train
96-100) lawn-mower, rocket, streetcar, tank, tractor
and the list of the 20 superclasses:
1) aquatic mammals (classes 1-5)
2) fish (classes 6-10)
3) flowers (classes 11-15)
4) food containers (classes 16-20)
5) fruit and vegetables (classes 21-25)
6) household electrical devices (classes 26-30)
7) household furniture (classes 31-35)
8) insects (classes 36-40)
9) large carnivores (classes 41-45)
10) large man-made outdoor things (classes 46-50)
11) large natural outdoor scenes (classes 51-55)
12) large omnivores and herbivores (classes 56-60)
13) medium-sized mammals (classes 61-65)
14) non-insect invertebrates (classes 66-70)
15) people (classes 71-75)
16) reptiles (classes 76-80)
17) small mammals (classes 81-85)
18) trees (classes 86-90)
19) vehicles 1 (classes 91-95)
20) vehicles 2 (classes 96-100)
The function used to open each file:
def unpickle(file):
    import pickle
    with open(file, 'rb') as fo:
        dict = pickle.load(fo, encoding='bytes')
    return dict
Example of how to read the metadata and the superclasses:
metadata_path = './cifar-100-python/meta' # change this path
metadata = unpickle(metadata_path)
superclass_dict = dict(list(enumerate(metadata[b'coarse_label_names'])))
How to load the training and test sets (using superclasses):
```
import numpy as np

data_pre_path = './cifar-100-python/'  # change this path
data_train_path = data_pre_path + 'train'
data_test_path = data_pre_path + 'test'
data_train_dict = unpickle(data_train_path)
data_test_dict = unpickle(data_test_path)
data_train = data_train_dict[b'data']
label_train = np.array(data_train_dict[b'coarse_labels'])
data_test = data_test_dict[b'data']
label_test = np.array(data_test_dict[b'coarse_labels'])
```
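Continuing from the block above: each row of data_train holds 3072 values (the 1024 red values first, then green, then blue), so the rows can be reshaped into displayable images as follows:
```
# Reshape flattened rows (N, 3072) into channels-last images (N, 32, 32, 3).
images_train = data_train.reshape(-1, 3, 32, 32).transpose(0, 2, 3, 1)
print(images_train.shape)  # (50000, 32, 32, 3)
```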
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Dataset Card for ACLCIFAR10
This Complementary labeled CIFAR10 dataset contains auto-labeled complementary labels for all 50000 images in the training split of CIFAR10. For more details, please visit our github or paper.
Dataset Structure
Data Instances
A sample from the training set is provided below: { 'images':
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Reaching the performance of fully supervised learning with unlabeled data and only labeling one sample per class might be ideal for deep learning applications. We demonstrate for the first time the potential for building one-shot semi-supervised (BOSS) learning on CIFAR-10 and SVHN to attain test accuracies that are comparable to fully supervised learning. Our method combines class prototype refining, class balancing, and self-training. A good prototype choice is essential, and we propose a technique for obtaining iconic examples. In addition, we demonstrate that class balancing methods substantially improve accuracy results in semi-supervised learning to levels that allow self-training to reach the level of fully supervised learning performance. Our experiments demonstrate the value of computing and analyzing test accuracies for every class, rather than only a total test accuracy. We show that our BOSS methodology can obtain total test accuracies on CIFAR-10 with only one labeled sample per class of up to 95% (compared to 94.5% for fully supervised). Similarly, SVHN reaches test accuracies of 97.8%, compared to 98.27% for fully supervised. Rigorous empirical evaluations provide evidence that labeling large datasets is not necessary for training deep neural networks. Our code is available at https://github.com/lnsmith54/BOSS to facilitate replication.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Dataset Card for CLCIFAR10
This Complementary labeled CIFAR10 dataset contains 3 human-annotated complementary labels for all 50000 images in the training split of CIFAR10. The workers are from Amazon Mechanical Turk. We randomly sampled 4 different labels for 3 different annotators, so each image would have 3 (probably repeated) complementary labels. For more details, please visit our github or paper.
Dataset Structure
Data Instances
A sample from the… See the full description on the dataset page: https://huggingface.co/datasets/ntucllab/clcifar10.
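Since the card links to the Hugging Face dataset page, a typical way (not stated in the card) to inspect the data is via the datasets library; the split name and field names are assumptions, so print a record to see the real schema:
```
from datasets import load_dataset

# Split name "train" is an assumption; the card only mentions the training split.
clcifar10 = load_dataset("ntucllab/clcifar10", split="train")
print(clcifar10[0])  # inspect the image and its complementary labels
```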
The dataset used in the paper is a counterfactual examples (CEs) dataset, which is generated using a diffusion model. The dataset consists of images from the CIFAR10 and SVHN datasets, with each image paired with its corresponding CE.