3 datasets found

STL-10 Image Recognition Dataset
kaggle.com
Updated Jun 11, 2018
Cite
Jessica Li (2018). STL-10 Image Recognition Dataset [Dataset]. https://www.kaggle.com/datasets/jessicali9530/stl10/discussion
Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Jun 11, 2018
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Jessica Li
Description
Context

STL-10 is an image recognition dataset inspired by the CIFAR-10 dataset, with some modifications. With a corpus of 100,000 unlabeled images and 500 labeled training images per class, it is well suited to developing unsupervised feature learning, deep learning, and self-taught learning algorithms. Unlike CIFAR-10, its images have a higher resolution (96x96 pixels), which makes it a challenging benchmark for developing more scalable unsupervised learning methods.

Content

Data overview:

There are three files: train_images.zip, test_images.zip and unlabeled_images.zip

10 classes: airplane, bird, car, cat, deer, dog, horse, monkey, ship, truck

Images are 96x96 pixels, color

500 training images per class (10 pre-defined folds) and 800 test images per class

100,000 unlabeled images for unsupervised learning. These examples are extracted from a similar but broader distribution of images. For instance, it contains other types of animals (bears, rabbits, etc.) and vehicles (trains, buses, etc.) in addition to the ones in the labeled set

Images were acquired from labeled examples on ImageNet
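
The figures in the overview above imply the following split sizes. This is a minimal arithmetic sketch, not part of the dataset's tooling: the constant and function names are made up for illustration, and it assumes that "color" means 3 channels stored at 1 byte per channel value.

```python
# Figures taken from the data overview above.
NUM_CLASSES = 10
TRAIN_PER_CLASS = 500
TEST_PER_CLASS = 800
NUM_UNLABELED = 100_000

# Assumption: "color" means 3 channels (RGB), 1 byte per channel value.
HEIGHT, WIDTH, CHANNELS = 96, 96, 3


def split_sizes():
    """Return the number of images in each split."""
    return {
        "train": NUM_CLASSES * TRAIN_PER_CLASS,
        "test": NUM_CLASSES * TEST_PER_CLASS,
        "unlabeled": NUM_UNLABELED,
    }


def raw_bytes_per_image():
    """Uncompressed size of one image under the assumptions above."""
    return HEIGHT * WIDTH * CHANNELS


sizes = split_sizes()
print(sizes)                  # 5,000 labeled training images, 8,000 test images
print(raw_bytes_per_image())  # 27,648 bytes per uncompressed image
```

Under these assumptions the labeled training set is 20 times smaller than the unlabeled set, which is why the unlabeled data is the central resource of the benchmark.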

The original data source recommends the following standardized testing protocol for reporting results:

Perform unsupervised training on the unlabeled data

Perform supervised training on the labeled data using 10 (pre-defined) folds of 100 examples from the training data. The indices of the examples to be used for each fold are provided

Report average accuracy on the full test set
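
The supervised part of this protocol can be sketched as a loop over the pre-defined folds. This is a hedged illustration only: `evaluate_folds`, `fit_majority`, and the toy data are hypothetical names and values, the unsupervised step is left out (a real `fit` function would use features learned from the unlabeled images), and the fold indices here are invented rather than read from the dataset's fold file.

```python
from collections import Counter
from statistics import mean


def accuracy(predict, inputs, labels):
    """Fraction of inputs whose predicted label matches the true label."""
    correct = sum(1 for x, y in zip(inputs, labels) if predict(x) == y)
    return correct / len(labels)


def evaluate_folds(fit, train_x, train_y, folds, test_x, test_y):
    """Train once per fold, score each model on the FULL test set,
    and return (mean accuracy, per-fold accuracies)."""
    scores = []
    for indices in folds:
        fold_x = [train_x[i] for i in indices]
        fold_y = [train_y[i] for i in indices]
        predict = fit(fold_x, fold_y)  # fit returns a prediction function
        scores.append(accuracy(predict, test_x, test_y))
    return mean(scores), scores


def fit_majority(xs, ys):
    """Toy stand-in for a real classifier: always predict the most
    common label seen in the fold."""
    majority = Counter(ys).most_common(1)[0][0]
    return lambda x: majority


# Toy data, purely illustrative (not STL-10).
train_x = ["a", "b", "c", "d", "e", "f"]
train_y = [0, 0, 1, 1, 1, 0]
folds = [[0, 1, 2], [2, 3, 4]]  # hypothetical fold indices
test_x = ["w", "x", "y", "z"]
test_y = [0, 1, 1, 1]

avg, per_fold = evaluate_folds(fit_majority, train_x, train_y, folds, test_x, test_y)
print(avg, per_fold)
```

Each fold's model is scored against the same full test set, so the reported number is an average over models trained on different small labeled subsets, not a cross-validation score.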

Acknowledgements

Original data source and banner image: https://cs.stanford.edu/~acoates/stl10/

Please cite the following reference when using this dataset:

Adam Coates, Honglak Lee, Andrew Y. Ng. An Analysis of Single-Layer Networks in Unsupervised Feature Learning. AISTATS, 2011.

Inspiration

Can you train a model to accurately identify what animal or transportation object is in each image?
stl10
tensorflow.org
Updated Jan 13, 2023
Cite
(2023). stl10 [Dataset]. https://www.tensorflow.org/datasets/catalog/stl10
Dataset updated
Jan 13, 2023
Description
The STL-10 dataset is an image recognition dataset for developing unsupervised feature learning, deep learning, and self-taught learning algorithms. It is inspired by the CIFAR-10 dataset but with some modifications. In particular, each class has fewer labeled training examples than in CIFAR-10, but a very large set of unlabeled examples is provided to learn image models prior to supervised training. The primary challenge is to make use of the unlabeled data (which comes from a similar but not identical distribution to the labeled data) to build a useful prior. All images were acquired from labeled examples on ImageNet.

To use this dataset:

    import tensorflow_datasets as tfds

    ds = tfds.load('stl10', split='train')
    for ex in ds.take(4):
        print(ex)

See the guide for more information on tensorflow_datasets.

Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/stl10-1.0.0.png

STL-10

opendatalab.com

zip

Updated Aug 24, 2022

Cite

Stanford University (2022). STL-10 [Dataset]. https://opendatalab.com/OpenDataLab/STL-10

Available download formats: zip (5978439104 bytes)

Dataset updated

Aug 24, 2022

Dataset provided by

University of Michigan
Stanford University

Description

Inspired by the CIFAR-10 dataset, STL-10 is an image recognition dataset for developing unsupervised machine learning, feature learning, and deep learning algorithms. Each class has fewer labeled training examples than in CIFAR-10, and a large set of unlabeled samples is provided to learn image models prior to supervised training. The primary challenge is to make use of the unlabeled data. Given the higher resolution (96x96) of this dataset, it is expected to be a more challenging benchmark for developing scalable unsupervised ML models.

STL-10 Image Recognition Dataset

Train models to recognize different animals and vehicles

7 scholarly articles cite this dataset (View in Google Scholar)
