Over the past few years, we have witnessed the success of deep learning in image recognition thanks to the availability of large-scale human-annotated datasets such as PASCAL VOC, ImageNet, and COCO. Although these datasets cover a wide range of object categories, a significant number of objects are still not included. Can we perform the same task without a lot of human annotation? In this paper, we are interested in few-shot object segmentation, where the number of annotated training examples is limited to only 5. To evaluate and validate the performance of our approach, we have built a few-shot segmentation dataset, FSS-1000, which consists of 1,000 object classes with pixelwise ground-truth segmentation annotations. Unique to FSS-1000, our dataset contains a significant number of objects that have never been seen or annotated in previous datasets, such as tiny daily objects, merchandise, cartoon characters, logos, etc.
Object Classes We first referred to the classes in ILSVRC when choosing object categories for FSS-1000. Consequently, 584 of the 1,000 classes in FSS-1000 overlap with classes in the ILSVRC dataset. We find that the ILSVRC dataset is heavily biased toward animals, both in terms of the distribution of categories and the number of images. Therefore, we fill in the other 486 with new classes unseen in any existing dataset. Specifically, we include more daily objects so that network models trained on FSS-1000 can learn from diverse artificial and man-made objects/features in addition to the natural and organic objects/features emphasized by existing large-scale datasets.
This dataset was created by Xiang Li, Tianhan Wei, Yau Pun Chen, Yu-Wing Tai, and Chi-Keung Tang. Refer to the paper FSS-1000: A 1000-Class Dataset for Few-Shot Segmentation for more details.
A version of the Kaggle BCCD White Blood Cell (WBC) dataset modified for out-of-domain few-shot classification.
We recommend using this dataset as an out-of-domain testing target for few-shot classification. As such, no training/testing splits are published.
The original version can be found here. All credit goes to the original authors of this dataset (Shenggan and Paul Mooney).
This dataset is structured as an image folder dataset, meaning each folder represents a class and contains images of the respective class.
Total Number of Images: 3500
Total Number of Classes: 5
Images per Class: 700
Image Size: 84x84 px
In-domain Training Set: mini-ImageNet training set
Testing Settings: 5-way 1-shot and 5-way 5-shot
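As a rough illustration of how the folder-per-class layout can be consumed under these settings, the following minimal Python sketch samples N-way K-shot episodes with torchvision's ImageFolder. The root path, transform, and query-set size are placeholders and not part of the dataset release.

```python
import random
from collections import defaultdict
from torchvision import datasets, transforms

# Hypothetical local path to the extracted folder-per-class dataset.
root = "wbc_dataset/"

transform = transforms.Compose([
    transforms.Resize((84, 84)),  # images ship at 84x84; resize defensively
    transforms.ToTensor(),
])
dataset = datasets.ImageFolder(root, transform=transform)

# Group sample indices by class label.
by_class = defaultdict(list)
for idx, (_, label) in enumerate(dataset.samples):
    by_class[label].append(idx)

def sample_episode(n_way=5, k_shot=5, n_query=15):
    """Draw one N-way K-shot episode as (support, query) index lists."""
    classes = random.sample(list(by_class), n_way)
    support, query = [], []
    for c in classes:
        picked = random.sample(by_class[c], k_shot + n_query)
        support += picked[:k_shot]
        query += picked[k_shot:]
    return support, query

support_idx, query_idx = sample_episode(n_way=5, k_shot=1)  # 5-way 1-shot episode
```

Since the dataset has exactly 5 classes, a 5-way episode simply uses all of them; the sampling only decides which images land in the support and query sets.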
CC0 1.0 (Public Domain) https://creativecommons.org/publicdomain/zero/1.0/
This dataset features a curated collection of rare Chinese figure painting styles, specifically designed to facilitate research in few-shot classification and deep feature fusion strategies. The dataset comprises a total of 2,585 images, each representing a distinct artistic style that reflects the diversity and complexity of traditional Chinese figure painting, with attention to brushwork, color palettes, texture, and composition.
The Met Dataset was created by: Nikolaos-Antonios Ypsilantis, Noa Garcia, Guangxing Han, Sarah Ibrahimi, Nanne van Noord, Giorgos Tolias.
README Sourced from here
The images of the dataset and the ground-truth files can be downloaded from the links below. All images have been resized so that their largest side is 500 pixels.
* Met exhibit (train) images (28 GB)
* mini Met exhibit (train) images (3 GB)
* Met query images (32 MB)
* other query images (1 GB)
* no-art query images (1 GB)
* ground-truth (.json) files (3 MB)
Code is provided on GitHub to offer support for:
* using the dataset
* performing the evaluation
* reproducing experiments in the NeurIPS 2021 paper
The Met Dataset: Instance-level Recognition for Artworks [ pdf | suppl | bib | poster | video ]
N.A. Ypsilantis, N. Garcia, G. Han, S. Ibrahimi, N. van Noord, G. Tolias
Accepted at NeurIPS 2021 Track on Datasets and Benchmarks.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Still under processing. This dataset is a curated collection of character images extracted from classic Bengali comics, designed to support deep learning tasks such as character classification, re-identification, style transfer, and more.
📁 Dataset Structure
The dataset is organized into five folders:
Each folder contains image crops of a specific character, extracted manually or using computer vision pipelines.
🎯 Use Cases
This dataset is versatile and can be used for:
* Fine-tuning character classification models (e.g., with Flux.jl, PyTorch, TensorFlow)
* Image clustering & few-shot learning
* Vision transformers on sketch/comic data
* Domain adaptation & stylized generation
* Contrastive learning (same character across scenes)
* Graph-based re-ID of characters in comic panels
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Surface or texture-based anomaly detection is a field within industrial quality inspection focused on detecting defects related to the texture of objects, for instance stains, color mismatches, cracks, etc. These anomalies may lead to a faulty or even dangerous final product depending on the field, making quality inspection necessary. Inspection is commonly performed with cameras or other sensors that can capture the patterns of the object's texture. Being able to automate this process is crucial, as manual inspection has proven to be tedious, slow, and inaccurate for many industries.
Moreover, wooden textures present high variability in structure and color, making it very difficult to simply "learn the correct wooden textures" in order to detect anomalies. At the same time, industrial users usually cannot compile all the possible anomalies that might occur during production, along with their appearances, as there are potentially infinite possibilities. Furthermore, manual labeling is a rather subjective and time-consuming task. Taking this into account, one-class classification is the most feasible approach for addressing the problem from a practical industrial perspective.
With this approach, creating a new anomaly detection process only requires gathering a set of objects that are considered "correct" and training an algorithm on them. Every other object under inspection is then compared to this initial set; given a significant deviation from normality, the object can be automatically rejected.
In particular, this dataset addresses the anomaly detection problem by capturing several views of wooden textured objects from the same class. The anomalous regions of these images were manually labeled at pixel level. Furthermore, each image was labeled by two different teams, and the final labels were merged in order to increase confidence. The objects contained 4 possible anomalies: crack, stain, porosity, and knot. These four were combined into a single label, namely anomalous.
To summarize, the goal is to develop an algorithm that learns what nominal means from nominal wooden textures and infers whether a wooden texture under inspection is anomalous. The algorithm takes an image and outputs an error map (the same size as the input image) with an anomaly score for each pixel.
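To make that input/output contract concrete, here is a minimal sketch of a naive per-pixel baseline: it fits per-channel color statistics on the nominal training images and scores each test pixel by its deviation. This is only an illustration of the expected error-map shape, not the detection method intended by the dataset authors, and the directory paths are placeholders.

```python
import glob

import numpy as np
from PIL import Image

# Fit simple per-channel statistics on nominal ("good") training images.
# Paths are hypothetical; adjust to the actual train/images directory.
train_paths = glob.glob("wood_dataset/train/images/*.png")
pixels = np.concatenate(
    [np.asarray(Image.open(p).convert("RGB"), dtype=np.float32).reshape(-1, 3)
     for p in train_paths]
)
mu, sigma = pixels.mean(axis=0), pixels.std(axis=0) + 1e-6

def anomaly_map(image_path):
    """Return a per-pixel anomaly score map with the same height/width as the input."""
    img = np.asarray(Image.open(image_path).convert("RGB"), dtype=np.float32)
    # Normalized deviation per pixel under a diagonal Gaussian of nominal colors.
    z = (img - mu) / sigma
    return np.sqrt((z ** 2).sum(axis=-1))  # shape (H, W)
```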
The main partitions are train and test. Train contains only good samples. Test combines both good and anomalous samples.
The test partitions are divided into test_easy, test_medium, and test, in increasing order of difficulty and size.
The "incremental_" prefix modifies the one-class core concept of the dataset to allow a few-shot approach. This implies that there are some few images from the test partition that are being used to additionally refine the models. Therefore, the corresponding test_easy, test_medium, and test partitions must be adapted to remove these training samples, thus the partitions incremental_test_easy, etc.
"genai_dreambooth" partition is a set of 512 imágenes generated using dreambooth for fine-tuning a Stable Diffusion v1.5 model. The model was trained for 15000 iterations using the 802 images in the "train" partition, thus the images are likely to be mostly nominal samples.
"train" directory only contains a "images" directory as there is no need for label masks. However, the test directories contains two folders: "images", "masks" and "ignore_masks". The "images" directory contains the test RGB images, "masks" contains the masks of only the anomalous images, and "ignore_masks" contains the regions of the image that can be ignored for the metrics. Should an image name from the "images" directory not be in "masks", it implies that the image isn't anomalous.
Regarding the ignore masks, these regions were extracted during the labeling phase. Two teams labeled the complete set of images: regions labeled unanimously by both teams correspond to the "masks", and regions labeled by only one team correspond to the "ignore_masks". This division allows a fairer evaluation of the algorithms: we cannot require an algorithm to detect areas that not even humans agree upon, yet if the algorithm marks any pixel within such a region as anomalous, its score should not be penalized either. Hence, these regions should be omitted from the evaluation metrics, as sketched below.
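A minimal sketch of one possible pixel-level metric that honors the ignore masks; pixel-level AUROC is an assumption here, since the dataset description does not fix a specific metric.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def pixel_auroc(score_map, gt_mask, ignore_mask=None):
    """Pixel-level AUROC, excluding pixels covered by the ignore mask.

    score_map:   (H, W) float anomaly scores predicted by the algorithm.
    gt_mask:     (H, W) bool, True where the image is labeled anomalous
                 (all False for good images, i.e. images with no mask file).
    ignore_mask: (H, W) bool, True where the two labeling teams disagreed;
                 these pixels are dropped from the evaluation entirely.
    """
    keep = np.ones_like(gt_mask, dtype=bool)
    if ignore_mask is not None:
        keep &= ~ignore_mask
    # Note: AUROC needs at least one anomalous and one normal pixel among
    # the kept pixels, so in practice it is computed over a whole test split.
    return roc_auc_score(gt_mask[keep].ravel(), score_map[keep].ravel())
```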
Images were acquired using a ZG3D device. Please use the following reference when citing: Perez-Cortes, J. C., Perez, A. J., Saez-Barona, S., Guardiola, J. L., & Salvador, I. (2018). A System for In-Line 3D Inspection without Hidden Surfaces. Sensors, 18(9), 2993.