3 datasets found
  1. STL-10 Image Recognition Dataset

    • kaggle.com
    Updated Jun 11, 2018
    Cite
    Jessica Li (2018). STL-10 Image Recognition Dataset [Dataset]. https://www.kaggle.com/datasets/jessicali9530/stl10/discussion
    Croissant is a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
    Dataset updated
    Jun 11, 2018
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Jessica Li
    Description

    Context

    STL-10 is an image recognition dataset inspired by the CIFAR-10 dataset, with some improvements. With a corpus of 100,000 unlabeled images and 500 labeled training images per class, it is well suited to developing unsupervised feature learning, deep learning, and self-taught learning algorithms. Compared with CIFAR-10, the images have a higher resolution (96x96 pixels), which makes the dataset a challenging benchmark for developing more scalable unsupervised learning methods.

    Content

    Data overview:

    • There are three files: train_images.zip, test_images.zip and unlabeled_images.zip
    • 10 classes: airplane, bird, car, cat, deer, dog, horse, monkey, ship, truck
    • Images are 96x96 pixels, color
    • 500 training images per class (split into 10 pre-defined folds) and 800 test images per class
    • 100,000 unlabeled images for unsupervised learning. These examples are extracted from a similar but broader distribution of images. For instance, the unlabeled set contains other types of animals (bears, rabbits, etc.) and vehicles (trains, buses, etc.) in addition to the ones in the labeled set
    • Images were acquired from labeled examples on ImageNet
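
The figures in the overview above imply the following split sizes and rough uncompressed footprints. This is a back-of-the-envelope sketch, not part of the dataset's tooling; it assumes "color" means three 8-bit channels (RGB) per pixel, and the constant names are made up for illustration.

```python
# Figures taken from the data overview above.
NUM_CLASSES = 10
TRAIN_PER_CLASS = 500
TEST_PER_CLASS = 800
NUM_UNLABELED = 100_000

# Assumption: "color" means 3 channels (RGB) stored as 1 byte each.
HEIGHT, WIDTH, CHANNELS = 96, 96, 3

bytes_per_image = HEIGHT * WIDTH * CHANNELS
split_sizes = {
    "train": NUM_CLASSES * TRAIN_PER_CLASS,
    "test": NUM_CLASSES * TEST_PER_CLASS,
    "unlabeled": NUM_UNLABELED,
}

# Print the image count and approximate raw size of each split.
for name, count in split_sizes.items():
    megabytes = count * bytes_per_image / 1e6
    print(f"{name}: {count} images, about {megabytes:.1f} MB uncompressed")
```

Under that assumption the unlabeled set dominates storage, which is worth knowing before loading it into memory in one piece.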

    The original data source recommends the following standardized testing protocol for reporting results:

    1. Perform unsupervised training on the unlabeled data
    2. Perform supervised training on the labeled data using 10 (pre-defined) folds of 100 examples from the training data. The indices of the examples to be used for each fold are provided
    3. Report average accuracy on the full test set
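
Steps 2 and 3 of the protocol above can be sketched as a loop over the pre-defined folds. This is a minimal illustration, not the original authors' code: `evaluate_over_folds` and the `fit_and_score` callback are hypothetical names, and the callback is assumed to train a classifier on the given training indices (using features learned in step 1) and return accuracy on the full test set.

```python
from statistics import mean, stdev
from typing import Callable, Sequence, Tuple


def evaluate_over_folds(
    fold_indices: Sequence[Sequence[int]],
    fit_and_score: Callable[[Sequence[int]], float],
) -> Tuple[float, float]:
    """Train once per pre-defined fold and summarize test accuracy.

    fold_indices: one sequence of training-example indices per fold.
    fit_and_score: hypothetical callback that trains on those indices
        and returns accuracy on the full test set.
    Returns the mean accuracy and its sample standard deviation.
    """
    accuracies = [fit_and_score(indices) for indices in fold_indices]
    spread = stdev(accuracies) if len(accuracies) > 1 else 0.0
    return mean(accuracies), spread


# Toy usage with made-up folds and a stand-in scorer; a real scorer
# would train a model and evaluate it on the test images.
toy_folds = [list(range(i * 100, (i + 1) * 100)) for i in range(10)]


def toy_scorer(indices: Sequence[int]) -> float:
    return 0.5 + 0.01 * (indices[0] // 100)


avg_accuracy, accuracy_std = evaluate_over_folds(toy_folds, toy_scorer)
print(f"mean accuracy: {avg_accuracy:.3f} (std {accuracy_std:.3f})")
```

Passing the training step in as a callback keeps the fold loop independent of any particular model or framework.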

    Acknowledgements

    Original data source and banner image: https://cs.stanford.edu/~acoates/stl10/

    Please cite the following reference when using this dataset:

    Adam Coates, Honglak Lee, Andrew Y. Ng. An Analysis of Single Layer Networks in Unsupervised Feature Learning. AISTATS, 2011.

    Inspiration

    • Can you train a model to accurately identify what animal or transportation object is in each image?
  2. stl10

    • tensorflow.org
    Updated Jan 13, 2023
    Cite
    (2023). stl10 [Dataset]. https://www.tensorflow.org/datasets/catalog/stl10
    Dataset updated
    Jan 13, 2023
    Description

    The STL-10 dataset is an image recognition dataset for developing unsupervised feature learning, deep learning, and self-taught learning algorithms. It is inspired by the CIFAR-10 dataset but with some modifications. In particular, each class has fewer labeled training examples than in CIFAR-10, but a very large set of unlabeled examples is provided to learn image models prior to supervised training. The primary challenge is to make use of the unlabeled data (which comes from a similar but different distribution from the labeled data) to build a useful prior. All images were acquired from labeled examples on ImageNet.

    To use this dataset:

    import tensorflow_datasets as tfds

    # Load the labeled training split and print the first four examples.
    ds = tfds.load('stl10', split='train')
    for ex in ds.take(4):
        print(ex)
    

    See the guide for more information on tensorflow_datasets.

    Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/stl10-1.0.0.png

  3. STL-10

    • opendatalab.com
    zip
    Updated Aug 24, 2022
    Cite
    Stanford University (2022). STL-10 [Dataset]. https://opendatalab.com/OpenDataLab/STL-10
    Available download formats: zip (5978439104 bytes)
    Dataset updated
    Aug 24, 2022
    Dataset provided by
    University of Michigan
    Stanford University
    Description
    Inspired by the CIFAR-10 dataset, STL-10 is an image recognition dataset for developing unsupervised feature learning and deep learning algorithms. Each class has fewer labeled training examples than in CIFAR-10, and a large set of unlabeled samples is provided for learning image models before supervised training. The primary challenge is to make use of the unlabeled data. With its higher resolution (96x96), the dataset is expected to be a more challenging benchmark for developing scalable unsupervised ML methods.
    

STL-10 Image Recognition Dataset

Train models to recognize different animals and vehicles

7 scholarly articles cite this dataset (Google Scholar).