MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
ImageNet-R
This repo is made to facilitate the evaluation of various pretrained models. It is constructed from the source file provided by the official implementation.
Usage
from datasets import load_dataset
dataset = load_dataset('axiong/imagenet-r')
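Once loaded, the returned DatasetDict can be inspected to confirm which splits and features the repo actually ships; a minimal sketch (the split layout is repo-specific, so it is printed rather than assumed):
from datasets import load_dataset

dataset = load_dataset('axiong/imagenet-r')
print(dataset)  # shows the splits and features this repo ships

# Inspect the first example of the first available split.
first_split = next(iter(dataset))
print(dataset[first_split][0])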
Dataset Summary
ImageNet-R(endition) contains art, cartoons, deviantart, graffiti, embroidery, graphics, origami, paintings, patterns, plastic objects, plush objects, sculptures, sketches, tattoos, toys, and video… See the full description on the dataset page: https://huggingface.co/datasets/axiong/imagenet-r.
ImageNet-R is a set of images labelled with ImageNet labels that were obtained by collecting art, cartoons, deviantart, graffiti, embroidery, graphics, origami, paintings, patterns, plastic objects, plush objects, sculptures, sketches, tattoos, toys, and video game renditions of ImageNet classes. ImageNet-R has renditions of 200 ImageNet classes, resulting in 30,000 images. For more details, please refer to the paper.
The label space is the same as that of ImageNet-2012; each example is represented as a dictionary of features.
To use this dataset:
import tensorflow_datasets as tfds

# ImageNet-R ships a single 'test' split in TFDS (it is an evaluation set).
ds = tfds.load('imagenet_r', split='test')
for ex in ds.take(4):
    print(ex)
See the guide for more information on tensorflow_datasets.
Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/imagenet_r-0.2.0.png
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Tiny-ImageNet-R is a down-sampled subset of ImageNet-R(enditions). It contains roughly 12,000 samples in 64 classes (a subset of the Tiny-ImageNet classes), spread across multiple visual domains such as art, cartoons, sculptures, origami, graffiti, and embroidery.
MrZilinXiao/MMEB-eval-ImageNet-R-beir-v3 is a dataset hosted on Hugging Face and contributed by the HF Datasets community.
The datasets used in the paper are classification datasets, specifically DomainNet, ImageNet-R, ImageNet-B, and ImageNet-A.
ImageNet-W(atermark) is a test set to evaluate models’ reliance on the newly found watermark shortcut in ImageNet, which is used to predict the carton class. ImageNet-W is created by overlaying transparent watermarks on the ImageNet validation set. Two metrics are used to evaluate watermark shortcut reliance: (1) IN-W Gap: the top-1 accuracy drop from ImageNet to ImageNet-W, (2) Carton Gap: carton class accuracy increase from ImageNet to ImageNet-W. Combining ImageNet-W with previous out-of-distribution variants of ImageNet (e.g., Stylized ImageNet, ImageNet-R, ImageNet-9) forms a comprehensive suite of multi-shortcut evaluation on ImageNet.
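Both gap metrics are plain differences of accuracies. A minimal sketch of how they would be computed, assuming the four accuracies have already been measured elsewhere (all names and numbers here are hypothetical):
def in_w_gap(top1_imagenet: float, top1_imagenet_w: float) -> float:
    # Top-1 accuracy drop from ImageNet to ImageNet-W; a larger gap
    # indicates heavier reliance on the watermark shortcut.
    return top1_imagenet - top1_imagenet_w

def carton_gap(carton_acc_imagenet: float, carton_acc_imagenet_w: float) -> float:
    # Increase in carton-class accuracy once watermarks are overlaid.
    return carton_acc_imagenet_w - carton_acc_imagenet

# Illustration with made-up numbers:
print(in_w_gap(0.76, 0.65))    # ~0.11, an 11-point drop
print(carton_gap(0.55, 0.80))  # ~0.25, a 25-point increase on the carton class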
The dataset used in the paper is not explicitly named, but it is implied to be ImageNet-R/A/Sketch.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Imagenet Mini Dataset
This dataset is a subset of the ImageNet validation set containing 26,000 images. It has been curated to have an equal class distribution, with 26 randomly sampled images from each of the 1,000 classes. All images have been resized to 224×224 pixels and are in RGB format.
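A minimal loading sketch, assuming the Hub ID richwardle/reduced-imagenet from the dataset page cited below (the available split names are whatever the repo defines):
from datasets import load_dataset

ds = load_dataset('richwardle/reduced-imagenet')
print(ds)  # expect 26,000 images total, 26 per class, at 224x224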
Citation
If you use this dataset in your research, please cite the original Imagenet dataset: Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., & Fei-Fei, L. (2009). Imagenet: A large-scale… See the full description on the dataset page: https://huggingface.co/datasets/richwardle/reduced-imagenet.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Towards Realistic Out-of-Distribution Detection: A Novel Evaluation Framework for Improving Generalization in OOD Detection:
This paper presents a novel evaluation framework for Out-of-Distribution (OOD) detection that aims to assess the performance of machine learning models in more realistic settings. We observed that the real-world requirements for testing OOD detection methods are not satisfied by the current testing protocols. They usually encourage methods to have a strong bias towards a low level of diversity in normal data. To address this limitation, we propose new OOD test datasets (CIFAR-10-R, CIFAR-100-R, and ImageNet-30-R) that can allow researchers to benchmark OOD detection performance under realistic distribution shifts. Additionally, we introduce a Generalizability Score (GS) to measure the generalization ability of a model during OOD detection. Our experiments demonstrate that improving the performance on existing benchmark datasets does not necessarily improve the usability of OOD detection models in real-world scenarios. While leveraging deep pre-trained features has been identified as a promising avenue for OOD detection research, our experiments show that state-of-the-art pre-trained models tested on our proposed datasets suffer a significant drop in performance. To address this issue, we propose a post-processing stage for adapting pre-trained features under these distribution shifts before calculating the OOD scores, which significantly enhances the performance of state-of-the-art pre-trained models on our benchmarks.
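The headline experiment evaluates the same detector on a standard OOD test set and on its -R counterpart, comparing a threshold-free metric such as AUROC on both. A minimal sketch, in which the per-sample scores stand in for any OOD detector rather than the paper's specific method:
import numpy as np
from sklearn.metrics import roc_auc_score

def ood_auroc(scores_in: np.ndarray, scores_out: np.ndarray) -> float:
    """AUROC for separating in-distribution (label 0) from OOD (label 1)
    samples, given scores where higher means 'more out-of-distribution'."""
    y_true = np.concatenate([np.zeros(len(scores_in)), np.ones(len(scores_out))])
    y_score = np.concatenate([scores_in, scores_out])
    return roc_auc_score(y_true, y_score)

# A detector that looks strong on the standard benchmark but drops sharply
# when the OOD set is swapped for its realistic -R variant generalizes poorly.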
No license specified (https://academictorrents.com/nolicensespecified)
You have been granted access for non-commercial research/educational use. By accessing the data, you have agreed to the following terms. You (the "Researcher") have requested permission to use the ImageNet database (the "Database") at Princeton University and Stanford University. In exchange for such permission, Researcher hereby agrees to the following terms and conditions: 1. Researcher shall use the Database only for non-commercial research and educational purposes. 2. Princeton University and Stanford University make no representations or warranties regarding the Database, including but not limited to warranties of non-infringement or fitness for a particular purpose. 3. Researcher accepts full responsibility for his or her use of the Database and shall defend and indemnify Princeton University and Stanford University, including their employees, Trustees, officers and agents, against any and all claims arising from Researcher's use of the Database, including but
ImageNet Subsets
The ImageNet dataset is a large-scale image database that contains over 14 million images, each labeled with one of 21,841 categories.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
UCIT benchmark
This benchmark is used to train and evaluate the Continual Instruction Tuning capabilities of MLLMs and was proposed by HiDe-LLaVA (ACL 2025). This repository mainly contains the training and testing instructions for the datasets used, as well as the images of the ImageNet-R and Flickr30k datasets. For images of the other datasets, please refer to the links provided in our GitHub. If you use our benchmarks, please cite our work: @article{guo2025hide, title={Hide-llava:… See the full description on the dataset page: https://huggingface.co/datasets/HaiyangGuo/UCIT.
This dataset was created by David R. Pugh
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Rows show the performance of each learning machine (SVM with linear kernel and SVM with RBF kernel) on each image view (head, dorsum, and profile). Columns show accuracy, average precision, and minimum precision for each label on top lists. H = head view; D = dorsal view; P = profile view; SVM-L = SVM with linear kernel; SVM-R = SVM with RBF kernel.
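For context on the two learning machines being compared, a minimal self-contained sklearn sketch of the linear-vs-RBF kernel comparison; the synthetic features here are a hypothetical stand-in for the real per-view image features:
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Toy stand-in for one view's feature matrix and labels.
X, y = make_classification(n_samples=300, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for name, clf in [("SVM-L", SVC(kernel="linear")), ("SVM-R", SVC(kernel="rbf"))]:
    clf.fit(X_train, y_train)
    print(name, accuracy_score(y_test, clf.predict(X_test)))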
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
This is only a test dataset; simply change all the data paths in the code to the test set for testing. Example: https://github.com/CreamyLong/stable-diffusion
ldm/data/imagenet.py
# Excerpt from ldm/data/imagenet.py. Requires the taming-transformers package
# (taming.data.utils and retrieve); ImageNetBase is defined earlier in the file.
import os
import glob

import taming.data.utils as tdu
from taming.data.imagenet import retrieve


class ImageNetTrain(ImageNetBase):
    NAME = "ILSVRC2012_validation"  # originally "ILSVRC2012_train"
    URL = "http://www.image-net.org/challenges/LSVRC/2012/"
    AT_HASH = "a306397ccf9c2ead27155983c254227c0fd938e2"
    FILES = [
        "ILSVRC2012_img_train.tar",
    ]
    SIZES = [
        147897477120,
    ]

    def __init__(self, process_images=True, data_root=None, **kwargs):
        self.process_images = process_images
        self.data_root = data_root
        super().__init__(**kwargs)

    def _prepare(self):
        if self.data_root:
            self.root = os.path.join(self.data_root, self.NAME)
            # print(self.root)  # data/myimages/ILSVRC2012_validation
        else:
            # cachedir = os.environ.get("XDG_CACHE_HOME", os.path.expanduser("~/.cache"))
            # self.root = os.path.join(cachedir, "autoencoders/data", self.NAME)
            print("Do not download ILSVRC2012 online; it is too large.")
            exit()
        self.datadir = os.path.join(self.root, "data")
        # print(self.datadir)  # data/myimages/ILSVRC2012_validation/data
        self.txt_filelist = os.path.join(self.root, "me_images.txt")
        print("================", self.txt_filelist)
        self.expected_length = 1281167
        self.random_crop = retrieve(self.config, "ImageNetTrain/random_crop",
                                    default=True)
        if not tdu.is_prepared(self.root):
            # Build the file list; the original download/extraction logic via
            # academictorrents is left disabled below.
            print("Preparing dataset {} in {}".format(self.NAME, self.root))
            datadir = self.datadir
            # if not os.path.exists(datadir):
            #     path = os.path.join(self.root, self.FILES[0])
            #     if not os.path.exists(path) or not os.path.getsize(path) == self.SIZES[0]:
            #         import academictorrents as at
            #         atpath = at.get(self.AT_HASH, datastore=self.root)
            #         assert atpath == path
            #     print("Extracting {} to {}".format(path, datadir))
            #     os.makedirs(datadir, exist_ok=True)
            #     with tarfile.open(path, "r:") as tar:
            #         tar.extractall(path=datadir)
            #     print("Extracting sub-tars.")
            #     subpaths = sorted(glob.glob(os.path.join(datadir, "*.tar")))
            #     for subpath in tqdm(subpaths):
            #         subdir = subpath[:-len(".tar")]
            #         os.makedirs(subdir, exist_ok=True)
            #         with tarfile.open(subpath, "r:") as tar:
            #             tar.extractall(path=subdir)
            # filelist = glob.glob(os.path.join(datadir, "**", "*.JPEG"))
            # filelist = glob.glob(os.path.join(datadir, "*.JPEG"))
            filelist = glob.glob(os.path.join(datadir, "*", "*.JPEG"))
            filelist = [os.path.relpath(p, start=datadir) for p in filelist]
            filelist = sorted(filelist)
            filelist = "\n".join(filelist) + "\n"
            with open(self.txt_filelist, "w") as f:
                f.write(filelist)
            tdu.mark_prepared(self.root)
${XDG_CACHE}/autoencoders/data/ILSVRC2012_{split}/data/
├── n01440764
│   ├── n01440764_10026.JPEG
│   ├── n01440764_10027.JPEG
│   ├── ...
├── n01443537
│   ├── n01443537_10007.JPEG
│   ├── n01443537_10014.JPEG
│   ├── ...
├── ...
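With these modifications in place, pointing the class at a local copy amounts to passing data_root, so that root becomes data_root/NAME. A minimal usage sketch (any further constructor arguments follow ImageNetBase and are elided here):
# Assumes the layout above: data/myimages/ILSVRC2012_validation/data/<wnid>/*.JPEG
ds = ImageNetTrain(process_images=False, data_root="data/myimages")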
Despite recent advances in object detection using deep learning neural networks, these networks still struggle to identify objects in art images such as paintings and drawings. This challenge is known as the cross-depiction problem, and it stems in part from the tendency of neural networks to prioritize identification of an object's texture over its shape. In this paper we propose and evaluate a process for training neural networks to localize objects, specifically people, in art images. We generated a large dataset for training and validation by modifying the images in the COCO dataset using AdaIN style transfer (style-coco.tar.xz). This dataset was used to fine-tune a Faster R-CNN object detection network (2020-12-10_09-45-15_58672_resnet152_stylecoco_epoch_15.pth), which was then tested on the existing People-Art test dataset (PeopleArt-Coco.tar.xz). The result is a significant improvement on the state of the art and a new way forward for creating datasets to train neural networks to process art images.
2020-12-10_09-45-15_58672_resnet152_stylecoco_epoch_15.pth
: Trained object detection network (Faster R-CNN with a ResNet152 backbone pretrained on ImageNet) for use with PyTorch
PeopleArt-Coco.tar.xz
: People-Art dataset with COCO-formatted annotations (original at https://github.com/BathVisArtData/PeopleArt)
style-coco.tar.xz
: Stylized COCO dataset containing only the person category. Used to train 2020-12-10_09-45-15_58672_resnet152_stylecoco_epoch_15.pth
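To run the trained network, the checkpoint can be loaded into a torchvision Faster R-CNN with a ResNet152 FPN backbone. A minimal sketch, assuming the .pth file stores a plain state_dict, that the detector uses two classes (person plus background), and the older torchvision pretrained= argument style; see the GitHub repo below for the authors' exact loading code:
import torch
from torchvision.models.detection import FasterRCNN
from torchvision.models.detection.backbone_utils import resnet_fpn_backbone

# ResNet152 + FPN backbone, as described above; weights come from the checkpoint.
backbone = resnet_fpn_backbone("resnet152", pretrained=False)
# num_classes=2 (person + background) is an assumption for this person-only detector.
model = FasterRCNN(backbone, num_classes=2)

state = torch.load("2020-12-10_09-45-15_58672_resnet152_stylecoco_epoch_15.pth",
                   map_location="cpu")
model.load_state_dict(state)  # assumes the file is a bare state_dict
model.eval()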
The code is available on github at https://github.com/dkadish/Style-Transfer-for-Object-Detection-in-Art
If you are using this code or the concept of style transfer for object detection in art, please cite our paper (https://arxiv.org/abs/2102.06529):
D. Kadish, S. Risi, and A. S. Løvlie, “Improving Object Detection in Art Images Using Only Style Transfer,” Feb. 2021.