This dataset was created by Christopher Sham
https://choosealicense.com/licenses/other/
Dataset Description
A mini version of ImageNet-1k with 100 of the 1000 classes present. Unlike some 'mini' variants, this one includes the original images at their original sizes; many such subsets downsample to 84x84 or other smaller resolutions.
Data Splits
Train
50000 samples from ImageNet-1k train split
Validation
10000 samples from ImageNet-1k train split
Test
5000 samples from ImageNet-1k validation split (all 50 samples per class)… See the full description on the dataset page: https://huggingface.co/datasets/timm/mini-imagenet.
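For reference, a minimal loading sketch (not part of the dataset card) using the Hugging Face datasets library; it assumes you have access to the dataset on the Hub (ImageNet derivatives are often gated).

from datasets import load_dataset

ds = load_dataset("timm/mini-imagenet")             # downloads all three splits
print({split: ds[split].num_rows for split in ds})  # expected: 50000 / 10000 / 5000
print(ds["train"].features)                         # inspect the image/label columns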
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
LGV models used as surrogates in the original paper.
These ResNet-50 models were collected along the SGD trajectory with a high learning rate. The zip file contains three random seeds in respective subfolders. Each one contains a subfolder with the original pretrained model from which the model collection started. These pretrained models were trained by Ashukha et al., "Pitfalls of In-Domain Uncertainty Estimation and Ensembling in Deep Learning" (2020).
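A minimal sketch of how one of these checkpoints might be restored, assuming a standard torchvision ResNet-50; the file path below is hypothetical, and the exact archive layout and checkpoint format are defined by the LGV release, not here.

import torch
import torchvision

model = torchvision.models.resnet50()                                # architecture shared by all collected models
state = torch.load("lgv/seed_0/iter_0001.pth", map_location="cpu")   # hypothetical path inside the zip
model.load_state_dict(state)                                         # assumes the file stores a plain state dict
model.eval()                                                         # ready for use as a transfer-attack surrogate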
https://choosealicense.com/licenses/other/
Dataset Summary
This is a copy of the full ImageNet dataset consisting of all of the original 21841 classes. It also contains labels in a separate field for the '12k' subset described at https://github.com/rwightman/imagenet-12k and https://huggingface.co/datasets/timm/imagenet-12k-wds. This dataset is from the original fall11 ImageNet release, which has since been replaced by the winter21 release; winter21 removes close to 3000 synsets containing people, a number of which are of an offensive… See the full description on the dataset page: https://huggingface.co/datasets/timm/imagenet-22k-wds.
This dataset is used in the PyTorch example "Transfer Learning for Computer Vision Tutorial".
https://choosealicense.com/licenses/other/
Dataset Summary
This is a copy of the full Winter21 release of ImageNet in webdataset tar format with JPEG images. This release consists of 19167 classes, 2674 fewer than the original 21841-class Fall11 release of the full ImageNet. The classes were removed because of the concerns described at https://www.image-net.org/update-sep-17-2019.php
Data Splits
The full ImageNet dataset has no defined splits. This release follows that and leaves everything in the train split.… See the full description on the dataset page: https://huggingface.co/datasets/timm/imagenet-w21-wds.
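A minimal streaming sketch, assuming the repository can be read with the Hugging Face datasets webdataset loader and that you have accepted the dataset's terms on the Hub; the column names depend on the tar keys, so they are only inspected here.

from datasets import load_dataset

ds = load_dataset("timm/imagenet-w21-wds", split="train", streaming=True)
sample = next(iter(ds))
print(sample.keys())  # image and label fields as stored in the tar shards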
https://choosealicense.com/licenses/other/
Dataset Summary
This is a filtered copy of the full ImageNet dataset consisting of the top 11821 (of 21841) classes by number of samples. It has been used to pretrain a number of in12k models in timm. The code and metadata for building this dataset from the original full ImageNet can be found at https://github.com/rwightman/imagenet-12k. NOTE: This subset was filtered from the original fall11 ImageNet release, which has since been replaced by the winter21 release; winter21 removes close to 3000… See the full description on the dataset page: https://huggingface.co/datasets/timm/imagenet-12k-wds.
https://creativecommons.org/publicdomain/zero/1.0/
Recent work has shown that convolutional networks can be substantially deeper, more accurate, and efficient to train if they contain shorter connections between layers close to the input and those close to the output. In this paper, we embrace this observation and introduce the Dense Convolutional Network (DenseNet), which connects each layer to every other layer in a feed-forward fashion. Whereas traditional convolutional networks with L layers have L connections - one between each layer and its subsequent layer - our network has L(L+1)/2 direct connections. For each layer, the feature-maps of all preceding layers are used as inputs, and its own feature-maps are used as inputs into all subsequent layers. DenseNets have several compelling advantages: they alleviate the vanishing-gradient problem, strengthen feature propagation, encourage feature reuse, and substantially reduce the number of parameters. We evaluate our proposed architecture on four highly competitive object recognition benchmark tasks (CIFAR-10, CIFAR-100, SVHN, and ImageNet). DenseNets obtain significant improvements over the state-of-the-art on most of them, whilst requiring less memory and computation to achieve high performance. Code and models are available at this https URL.
Authors: Gao Huang, Zhuang Liu, Kilian Q. Weinberger, Laurens van der Maaten
https://arxiv.org/abs/1608.06993
DenseNet: https://imgur.com/wWHWbQt.jpg
DenseNet Architectures: https://imgur.com/oiTdqJL.jpg
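To make the connectivity pattern concrete, here is a toy PyTorch dense block (an illustrative sketch, not the paper's exact implementation): each layer consumes the concatenation of all preceding feature maps and contributes growth_rate new channels.

import torch
import torch.nn as nn

class ToyDenseBlock(nn.Module):
    def __init__(self, in_channels, growth_rate, num_layers):
        super().__init__()
        self.layers = nn.ModuleList()
        channels = in_channels
        for _ in range(num_layers):
            self.layers.append(nn.Sequential(
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True),
                nn.Conv2d(channels, growth_rate, kernel_size=3, padding=1, bias=False),
            ))
            channels += growth_rate            # the next layer sees all previous outputs

    def forward(self, x):
        features = [x]
        for layer in self.layers:
            out = layer(torch.cat(features, dim=1))  # feature reuse across all earlier layers
            features.append(out)
        return torch.cat(features, dim=1)

block = ToyDenseBlock(in_channels=16, growth_rate=12, num_layers=4)
print(block(torch.randn(1, 16, 32, 32)).shape)  # torch.Size([1, 64, 32, 32])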
A pre-trained model has been previously trained on a dataset and contains the weights and biases that represent the features of that dataset. Learned features are often transferable to different data: for example, a model trained on a large dataset of bird images will contain learned features, such as edges or horizontal lines, that would be transferable to your dataset.
Pre-trained models are beneficial for many reasons. Using a pre-trained model saves time: someone else has already spent the time and compute resources to learn many features, and your model will likely benefit from it.
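A hedged example of that transfer-learning idea with recent torchvision (the weights identifier is an assumption about your torchvision version): freeze the ImageNet-pretrained features and retrain only a new classifier head.

import torch.nn as nn
import torchvision

model = torchvision.models.densenet121(weights="IMAGENET1K_V1")   # ImageNet-pretrained weights
for param in model.parameters():
    param.requires_grad = False                                   # keep the learned features frozen
model.classifier = nn.Linear(model.classifier.in_features, 10)    # new head for, e.g., a 10-class task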
Dataset Summary
This is a subset of the full Winter21 release, filtered according to https://github.com/Alibaba-MIIL/ImageNet21K. This instance contains 10450 classes with train and validation splits.
Processing
I performed some processing while sharding this dataset:
Synsets were filtered according to the ImageNet-21K-P scripts
Images were re-encoded as WebP
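A sketch of the re-encoding step using Pillow; the file name is hypothetical, and whether the actual sharding used lossless or lossy WebP (and at what quality) is not stated above.

from PIL import Image

with Image.open("n01440764_10026.JPEG") as im:                  # hypothetical source image
    im.save("n01440764_10026.webp", format="WEBP", quality=90)  # re-encode as WebP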
Additional Information
Dataset Curators
Authors of [1] and [2]:
Olga Russakovsky, Jia Deng, Hao Su… See the full description on the dataset page: https://huggingface.co/datasets/timm/imagenet-w21-p.
https://creativecommons.org/publicdomain/zero/1.0/
In this work we investigate the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting. Our main contribution is a thorough evaluation of networks of increasing depth using an architecture with very small (3x3) convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers. These findings were the basis of our ImageNet Challenge 2014 submission, where our team secured the first and the second places in the localisation and classification tracks respectively. We also show that our representations generalise well to other datasets, where they achieve state-of-the-art results. We have made our two best-performing ConvNet models publicly available to facilitate further research on the use of deep visual representations in computer vision.
Authors: Karen Simonyan, Andrew Zisserman
https://arxiv.org/abs/1409.1556
VGG Architecture: https://imgur.com/uLXrKxe.jpg
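An illustrative sketch of the "representations generalise well" point: use a pretrained VGG-16 as a fixed feature extractor (recent torchvision weights, not the original Caffe release; the weights identifier is an assumption).

import torch
import torchvision

vgg = torchvision.models.vgg16(weights="IMAGENET1K_V1").eval()
with torch.no_grad():
    feats = vgg.features(torch.randn(1, 3, 224, 224))  # convolutional feature maps
print(feats.shape)                                      # torch.Size([1, 512, 7, 7])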
https://creativecommons.org/publicdomain/zero/1.0/
Recent research on deep neural networks has focused primarily on improving accuracy. For a given accuracy level, it is typically possible to identify multiple DNN architectures that achieve that accuracy level. With equivalent accuracy, smaller DNN architectures offer at least three advantages: (1) Smaller DNNs require less communication across servers during distributed training. (2) Smaller DNNs require less bandwidth to export a new model from the cloud to an autonomous car. (3) Smaller DNNs are more feasible to deploy on FPGAs and other hardware with limited memory. To provide all of these advantages, we propose a small DNN architecture called SqueezeNet. SqueezeNet achieves AlexNet-level accuracy on ImageNet with 50x fewer parameters. Additionally, with model compression techniques we are able to compress SqueezeNet to less than 0.5MB (510x smaller than AlexNet).
Authors: Forrest N. Iandola, Song Han, Matthew W. Moskewicz, Khalid Ashraf, William J. Dally, Kurt Keutzer
https://arxiv.org/abs/1602.07360
SqueezeNet Architecture: https://imgur.com/WV7Ru4Q.jpg
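A quick, hedged check of the parameter-count claim using torchvision's implementations (close to, though not identical to, the original Caffe models).

import torchvision

def n_params(m):
    return sum(p.numel() for p in m.parameters())

squeezenet = torchvision.models.squeezenet1_0(weights=None)
alexnet = torchvision.models.alexnet(weights=None)
print(n_params(alexnet) / n_params(squeezenet))  # roughly the ~50x reduction cited above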
Dataset Card for tiny-imagenet
Dataset Summary
Tiny ImageNet contains 100000 images of 200 classes (500 per class), downsized to 64×64 color images. Each class has 500 training images, 50 validation images, and 50 test images.
Languages
The class labels in the dataset are in English.
Dataset Structure
Data Instances
{ 'image': <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=64x64 at 0x1A800E8E190>, 'label': 15 }… See the full description on the dataset page: https://huggingface.co/datasets/zh-plus/tiny-imagenet.
Context
This is a Vector Quantized Variational AutoEncoder (VQ-VAE) model trained on a subset of the ImageNet dataset.
Content
This notebook shows the architecture and training of the model.
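The notebook itself is not reproduced here, so the following is only a generic sketch of the vector-quantization step at the heart of any VQ-VAE (illustrative sizes, not the author's configuration): each encoder output vector is snapped to its nearest codebook entry.

import torch

codebook = torch.randn(512, 64)   # 512 learned code vectors of dimension 64 (illustrative)
z = torch.randn(1024, 64)         # flattened encoder outputs
dist = torch.cdist(z, codebook)   # distance from every latent to every code
indices = dist.argmin(dim=1)      # index of the nearest code per latent
z_q = codebook[indices]           # quantized latents passed on to the decoder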
ImageNet SDXL Quantized
This repository provides the ImageNet-1K dataset pre-encoded with the Stable Diffusion XL VAE encoder and quantized to uint8, allowing for faster training of latent diffusion models by eliminating the need for on-the-fly encoding.
Key Features
Reduces quantization error by 2 dB PSNR compared to a linear encoding scheme
Provided in both 256 and 512 resolutions
Compatible with NumPy, JAX, and PyTorch
Usage
Loading the dataset… See the full description on the dataset page: https://huggingface.co/datasets/jon-kyl/imagenet-sdxl-quantized.
ObjectNet (ImageNet-1k Overlapping)
A webp (lossless) encoded version of ObjectNet-1.0 at original resolution, containing only the images for the 113 classes that overlap with ImageNet-1k classes.
License / Usage Terms
ObjectNet is free to use for both research and commercial applications. The authors own the source images and allow their use under a license derived from Creative Commons Attribution 4.0 with only two additional clauses.
ObjectNet may never be used to… See the full description on the dataset page: https://huggingface.co/datasets/timm/objectnet-in1k.
Despite recent advances in object detection using deep learning neural networks, these neural networks still struggle to identify objects in art images such as paintings and drawings. This challenge is known as the cross depiction problem and it stems in part from the tendency of neural networks to prioritize identification of an object's texture over its shape. In this paper we propose and evaluate a process for training neural networks to localize objects - specifically people - in art images. We generated a large dataset for training and validation by modifying the images in the COCO dataset using AdaIn style transfer (style-coco.tar.xz). This dataset was used to fine-tune a Faster R-CNN object detection network (2020-12-10_09-45-15_58672_resnet152_stylecoco_epoch_15.pth), which is then tested on the existing People-Art testing dataset (PeopleArt-Coco.tar.xz). The result is a significant improvement on the state of the art and a new way forward for creating datasets to train neural networks to process art images.
2020-12-10_09-45-15_58672_resnet152_stylecoco_epoch_15.pth
: Trained object detection network (Faster R-CNN with a ResNet-152 backbone pretrained on ImageNet) for use with PyTorch
PeopleArt-Coco.tar.xz
: People-Art dataset with COCO-formatted annotations (original at https://github.com/BathVisArtData/PeopleArt)
style-coco.tar.xz
: Stylized COCO dataset containing only the person category. Used to train 2020-12-10_09-45-15_58672_resnet152_stylecoco_epoch_15.pth
The code is available on GitHub at https://github.com/dkadish/Style-Transfer-for-Object-Detection-in-Art
If you are using this code or the concept of style transfer for object detection in art, please cite our paper (https://arxiv.org/abs/2102.06529):
D. Kadish, S. Risi, and A. S. Løvlie, “Improving Object Detection in Art Images Using Only Style Transfer,” Feb. 2021.
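A loading sketch under stated assumptions: the checkpoint is treated as a torchvision-style Faster R-CNN with a ResNet-152 FPN backbone and a two-class (background + person) head; the authoritative construction and loading code is in the GitHub repository above.

import torch
from torchvision.models.detection import FasterRCNN
from torchvision.models.detection.backbone_utils import resnet_fpn_backbone

backbone = resnet_fpn_backbone(backbone_name="resnet152", weights=None)
model = FasterRCNN(backbone, num_classes=2)   # assumed: background + person
state = torch.load("2020-12-10_09-45-15_58672_resnet152_stylecoco_epoch_15.pth",
                   map_location="cpu")
# the file may store either a full module or a bare state dict
model.load_state_dict(state if isinstance(state, dict) else state.state_dict())
model.eval()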