License: Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0), https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
GPR1200: A Benchmark for General-Purpose Content-Based Image Retrieval (ArXiv)
As with most vision-related tasks, deep learning models have come to dominate content-based image retrieval (CBIR) over the last decade. However, most publications that aim to optimise neural networks for CBIR train and test their models on domain-specific datasets, so it remains unclear whether those networks can serve as general-purpose image feature extractors. After analyzing popular image retrieval test sets, we decided to manually curate GPR1200, an easy-to-use and accessible yet challenging benchmark dataset with 1200 categories and 10 examples per class. Classes and images were manually selected from six publicly available datasets covering different image domains, ensuring high class diversity and clean class boundaries.
[Image: GPR1200 overview, https://github.com/Visual-Computing/GPR1200/raw/main/images/GPR_main_pic.jpg]
Benchmark your Image Retrieval Models on It
[Image: GPR1200 retrieval results table, https://github.com/Visual-Computing/GPR1200/raw/main/images/result_table.JPG]
Did you ever go through your vacation photos and ask yourself: What is the name of this temple I visited in China? Who created this monument I saw in France? Landmark recognition can help! This technology can predict landmark labels directly from image pixels, to help people better understand and organize their photo collections. Today, a great obstacle to landmark recognition research is the lack of large annotated datasets. This motivated us to release Google-Landmarks, the largest worldwide dataset to date, to foster progress in this problem.
The dataset is divided into two sets of images, to evaluate two different computer vision tasks: recognition and retrieval. The data was originally described in [1], and published as part of the Google Landmark Recognition Challenge and Google Landmark Retrieval Challenge. Additionally, to spur research in this field, we have open-sourced Deep Local Features (DELF), an attentive local feature descriptor that we believe is especially suited for this kind of task. DELF's code is available on GitHub.
UPDATE: We have now also made available the Google Landmark Boxes dataset, containing 86 thousand bounding boxes.
If you make use of the Google Landmarks dataset in your research, please consider citing:
H. Noh, A. Araujo, J. Sim, T. Weyand, B. Han, "Large-Scale Image Retrieval with Attentive Deep Local Features", Proc. ICCV'17
If you make use of the Google Landmark Boxes dataset in your research, please consider citing:
M. Teichmann*, A. Araujo*, M. Zhu and J. Sim, “Detect-to-Retrieve: Efficient Regional Aggregation for Image Search”, Proc. CVPR'19
The two challenges associated with this dataset can be found at the following links:
The Landmark Recognition Workshop at CVPR 2018 will discuss recent progress on landmark recognition and image retrieval, taking into account the results of the above-mentioned challenges. Top submissions for the challenges will be invited to give talks at the workshop.
The dataset contains URLs of images which are publicly available online (this Python script may be useful to download the images). Note that no image data is released, only URLs.
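For orientation, here is a minimal download sketch, assuming a CSV with 'id' and 'url' columns as in the released train/index/test files; the official Python script mentioned above handles retries and parallelism and remains the better starting point.

```python
import csv
import os
import requests

def download_images(csv_path, out_dir):
    """Fetch each image URL listed in the CSV into out_dir, skipping failures."""
    os.makedirs(out_dir, exist_ok=True)
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            target = os.path.join(out_dir, row["id"] + ".jpg")
            if os.path.exists(target):
                continue  # already downloaded
            try:
                resp = requests.get(row["url"], timeout=10)
                resp.raise_for_status()
                with open(target, "wb") as out:
                    out.write(resp.content)
            except requests.RequestException:
                pass  # URLs may have gone stale; skip and move on

download_images("train.csv", "train_images")
```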
The dataset contains test images, training images and index images. The test images are used in both tasks: for the recognition task, a landmark label may be predicted for each test image; for the retrieval task, relevant index images may be retrieved for each test image. The training images are associated to landmark labels, and can be used to train models for the recognition and retrieval challenges (for a visualization of the geographic distribution of training images, see [3]). The index images are used in the retrieval task, composing the set from which images should be retrieved.
Note that the test set for both the recognition and retrieval tasks is the same, to encourage researchers to experiment with both. We also encourage participants to use the training data from the recognition task to train models which could be useful for the retrieval task. Note, however, that there are no landmarks in common between the training/index sets of the two tasks.
The images listed in the dataset are not directly in our control, so their availability may change over time, and the dataset files may be updated to remove URLs which no longer work.
The training and index sets were constructed by clustering photos with respect to their geolocation and visual similarity using an algorithm similar to the one described in [4]. Matches between training images were established using local feature matching. Note that there may be multiple clusters per landmark, which typically correspond to different views or different parts of the landmark. To avoid bias, no computer vision algorithms were used for ground truth generation. Instead, we established ground truth correspondences between test images and landmarks using human annotators.
The images listed in this dataset are publicly available on the web, and may have different licenses. Google does not own their copyright.
...
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
NeSy4VRD
NeSy4VRD is a multifaceted, multipurpose resource designed to foster neurosymbolic AI (NeSy) research, particularly NeSy research using Semantic Web technologies such as OWL ontologies, OWL-based knowledge graphs and OWL-based reasoning as symbolic components. The NeSy4VRD research resource pertains to the computer vision field of AI and, within that field, to the application tasks of visual relationship detection (VRD) and scene graph generation.
Whilst the core motivation of the NeSy4VRD research resource is to foster computer vision-based NeSy research using Semantic Web technologies such as OWL ontologies and OWL-based knowledge graphs, AI researchers can readily use NeSy4VRD to either: 1) pursue computer vision-based NeSy research without involving Semantic Web technologies as symbolic components, or 2) pursue computer vision research without NeSy (i.e. pursue research that focuses purely on deep learning alone, without involving symbolic components of any kind). This is the sense in which we describe NeSy4VRD as being multipurpose: it can readily be used by diverse groups of computer vision-based AI researchers with diverse interests and objectives.
The NeSy4VRD research resource in its entirety is distributed across two locations: Zenodo and GitHub.
NeSy4VRD on Zenodo: the NeSy4VRD dataset package
This entry on Zenodo hosts the NeSy4VRD dataset package, which includes the NeSy4VRD dataset and its companion NeSy4VRD ontology, an OWL ontology called VRD-World.
The NeSy4VRD dataset consists of an image dataset with associated visual relationship annotations. The images of the NeSy4VRD dataset are the same as those that were once publicly available as part of the VRD dataset. The NeSy4VRD visual relationship annotations are a highly customised and quality-improved version of the original VRD visual relationship annotations. The NeSy4VRD dataset is designed for computer vision-based research that involves detecting objects in images and predicting relationships between ordered pairs of those objects. A visual relationship for an image of the NeSy4VRD dataset has the form <'subject', 'predicate', 'object'>, where the 'subject' and 'object' are two objects in the image, and the 'predicate' describes some relation between them. Both the 'subject' and 'object' objects are specified in terms of bounding boxes and object classes. For example, representative annotated visual relationships are <'person', 'ride', 'horse'>, <'hat', 'on', 'teddy bear'> and <'cat', 'under', 'pillow'>.
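Purely for illustration (the exact on-disk layout and bounding-box convention of the NeSy4VRD annotation files may differ), one such visual relationship could be represented in Python as follows.

```python
# One annotated visual relationship for one image: the 'subject' and
# 'object' each carry an object class and a bounding box. The coordinate
# convention [ymin, ymax, xmin, xmax] is an assumption for illustration only.
relationship = {
    "subject": {"class": "person", "bbox": [120, 340, 60, 210]},
    "predicate": "ride",
    "object": {"class": "horse", "bbox": [200, 460, 40, 300]},
}
```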
Visual relationship detection is pursued as a computer vision application task in its own right, and as a building block capability for the broader application task of scene graph generation. Scene graph generation, in turn, is commonly used as a precursor to a variety of enriched, downstream visual understanding and reasoning application tasks, such as image captioning, visual question answering, image retrieval, image generation and multimedia event processing.
The NeSy4VRD ontology, VRD-World, is a rich, well-aligned, companion OWL ontology engineered specifically for use with the NeSy4VRD dataset. It directly describes the domain of the NeSy4VRD dataset, as reflected in the NeSy4VRD visual relationship annotations. More specifically, all of the object classes that feature in the NeSy4VRD visual relationship annotations have corresponding classes within the VRD-World OWL class hierarchy, and all of the predicates that feature in the NeSy4VRD visual relationship annotations have corresponding properties within the VRD-World OWL object property hierarchy. The rich structure of the VRD-World class hierarchy and the rich characteristics and relationships of the VRD-World object properties together give the VRD-World OWL ontology rich inference semantics. These provide ample opportunity for OWL reasoning to be meaningfully exercised and exploited in NeSy research that uses OWL ontologies and OWL-based knowledge graphs as symbolic components. There is also ample potential for NeSy researchers to explore supplementing the OWL reasoning capabilities afforded by the VRD-World ontology with Datalog rules and reasoning.
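As a rough sketch of how such reasoning might be exercised from Python with owlready2, assuming the ontology has been saved locally as vrd_world.owl (the actual filename in the dataset package may differ):

```python
from owlready2 import get_ontology, sync_reasoner

# Load the VRD-World ontology from a local file (filename is an assumption).
onto = get_ontology("file://./vrd_world.owl").load()

# Run an OWL reasoner (HermiT, bundled with owlready2) so that inferred
# class memberships and property assertions are materialised in 'onto'.
with onto:
    sync_reasoner()

print(list(onto.classes())[:10])             # some classes from the hierarchy
print(list(onto.object_properties())[:10])   # some predicates as object properties
```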
Use of the NeSy4VRD ontology, VRD-World, in conjunction with the NeSy4VRD dataset is, of course, purely optional, however. Computer vision AI researchers who have no interest in NeSy, or NeSy researchers who have no interest in OWL ontologies and OWL-based knowledge graphs, can ignore the NeSy4VRD ontology and use the NeSy4VRD dataset by itself.
All computer vision-based AI research user groups can, if they wish, also avail themselves of the other components of the NeSy4VRD research resource available on GitHub.
NeSy4VRD on GitHub: open source infrastructure supporting extensibility, and sample code
The NeSy4VRD research resource incorporates additional components that are companions to the NeSy4VRD dataset package here on Zenodo. These companion components are available at NeSy4VRD on GitHub. These companion components consist of:
The NeSy4VRD infrastructure supporting extensibility consists of:
The purpose behind providing comprehensive infrastructure to support extensibility of the NeSy4VRD visual relationship annotations is to make it easy for researchers to take the NeSy4VRD dataset in new directions, by further enriching the annotations or by tailoring them to introduce new or additional data conditions that better suit their particular research needs and interests. The option to use the NeSy4VRD extensibility infrastructure in this way applies equally well to each of the diverse potential NeSy4VRD user groups already mentioned.
The NeSy4VRD extensibility infrastructure, however, may be of particular interest to NeSy researchers interested in using the NeSy4VRD ontology, VRD-World, in conjunction with the NeSy4VRD dataset. These researchers can of course tailor the VRD-World ontology if they wish without needing to modify or extend the NeSy4VRD visual relationship annotations in any way. But their degrees of freedom for doing so will be limited by the need to maintain alignment with the NeSy4VRD visual relationship annotations and the particular set of object classes and predicates to which they refer. If NeSy researchers want full freedom to tailor the VRD-World ontology, they may well need to tailor the NeSy4VRD visual relationship annotations first, in order that alignment be maintained.
To illustrate our point, and to illustrate our vision of how the NeSy4VRD extensibility infrastructure can be used, let us consider a simple example. It is common in computer vision to distinguish between thing objects (that have well-defined shapes) and stuff objects (that are amorphous). Suppose a researcher wishes to have a greater number of stuff object classes with which to work. Water is such a stuff object. Many VRD images contain water but it is not currently one of the annotated object classes and hence is never referenced in any visual relationship annotations. So adding a Water class to the class hierarchy of the VRD-World ontology would be pointless because it would never acquire any instances (because an object detector would never detect any). However, our hypothetical researcher could choose to do the following:
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The LAION-400M dataset is completely open and freely accessible.
Check https://laion.ai/laion-400-open-dataset/ for the full description of this dataset.
All images and texts in the LAION-400M dataset have been filtered with OpenAI's CLIP by computing the cosine similarity between the text and image embeddings and dropping pairs with a similarity below 0.3.
The threshold of 0.3 was determined through human evaluation and appears to be a good heuristic for estimating semantic image-text matching.
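A minimal sketch of this kind of filtering with OpenAI's reference CLIP implementation; the ViT-B/32 checkpoint and the single-pair loop shown here are illustrative assumptions, not necessarily the exact configuration LAION used.

```python
import torch
import clip  # OpenAI's CLIP reference implementation
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

image = preprocess(Image.open("example.jpg")).unsqueeze(0).to(device)
text = clip.tokenize(["alt text scraped for this image"]).to(device)

with torch.no_grad():
    img_emb = model.encode_image(image)
    txt_emb = model.encode_text(text)

# Cosine similarity = dot product of L2-normalised embeddings.
img_emb = img_emb / img_emb.norm(dim=-1, keepdim=True)
txt_emb = txt_emb / txt_emb.norm(dim=-1, keepdim=True)
similarity = (img_emb * txt_emb).sum(dim=-1).item()

keep = similarity >= 0.3  # the LAION-400M filtering threshold
```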
The image-text-pairs have been extracted from the Common Crawl webdata dump and are from random web pages crawled between 2014 and 2021.
Use img2dataset to download subsets of this.
LAION-400M, and the even bigger releases to follow, are in fact datasets of datasets. For instance, LAION-400M can be filtered by image size into smaller datasets like this:
Number of unique samples: 413M
Height or width >= 1024: 26M
Height and width >= 1024: 9.6M
Height or width >= 512: 112M
Height and width >= 512: 67M
Height or width >= 256: 268M
Height and width >= 256: 211M
Using the KNN index, specialized datasets can also be extracted by domain of interest. These are (or will be) sufficient in size to train domain-specialized models.
http://gallerytest.christoph-schuhmann.de/photos/index.php?/category/4 (todo: replace link with local gallery) and https://rom1504.github.io/clip-retrieval/ offer a simple visualization of the dataset. There you can search the dataset using CLIP and a KNN index.
We produced the dataset in several formats to address the various use cases:
In this Kaggle dataset we provide the URL and caption metadata. Check https://laion.ai/laion-400-open-dataset/ for the other formats and the full explanation.
We provide 32 parquet files of size around 1GB (total 50GB) with the image URLs, the associated texts and additional metadata in the following format:
SAMPLE_ID | URL | TEXT | LICENSE | NSFW | similarity | WIDTH | HEIGHT
where
SAMPLE_ID: a unique identifier
LICENSE: if a Creative Commons license could be extracted from the image data, it is named here, e.g. "creativecommons.org/licenses/by-nc-sa/3.0/"; otherwise the field contains "?"
NSFW: CLIP was used to estimate whether the image has NSFW content. The estimation is deliberately conservative, reducing the number of false negatives at the cost of more false positives. Possible values are "UNLIKELY", "UNSURE" and "NSFW"
similarity: the cosine similarity between the text and image embeddings
WIDTH and HEIGHT: the image size at the time the image was embedded; originals larger than 4K were resized to 4K
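For example, one of the subsets from the table above (height and width >= 1024) can be carved out of this metadata with pandas; the shard filename below is a placeholder.

```python
import pandas as pd

# Read one of the 32 metadata shards (placeholder filename).
df = pd.read_parquet("laion400m_meta_part_00000.parquet")

# Keep large, well-matched, SFW-classified image-text pairs.
subset = df[
    (df["WIDTH"] >= 1024)
    & (df["HEIGHT"] >= 1024)
    & (df["similarity"] >= 0.3)
    & (df["NSFW"] == "UNLIKELY")
]
subset[["SAMPLE_ID", "URL", "TEXT"]].to_parquet("laion_subset.parquet", index=False)
```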
This metadata dataset is best used to redownload the whole dataset or a subset of it. The img2dataset tool can be used to efficiently download such subsets.
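A sketch of downloading such a filtered subset with img2dataset's Python entry point; the parameter names follow the img2dataset documentation at the time of writing, so check the current README before relying on them.

```python
from img2dataset import download

download(
    url_list="laion_subset.parquet",   # filtered metadata produced above
    input_format="parquet",
    url_col="URL",
    caption_col="TEXT",
    output_format="webdataset",
    output_folder="laion_subset_images",
    image_size=256,
    processes_count=8,
    thread_count=32,
)
```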
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
MuMu is a Multimodal Music dataset with multi-label genre annotations that combines information from the Amazon Reviews dataset and the Million Song Dataset (MSD). The former contains millions of album customer reviews and album metadata gathered from Amazon.com. The latter is a collection of metadata and precomputed audio features for a million songs.
To map the information from both datasets we use MusicBrainz. This process yields a final set of 147,295 songs belonging to 31,471 albums. For the mapped set of albums, there are 447,583 customer reviews from the Amazon dataset. The dataset has been used for multi-label music genre classification experiments in the related publication. In addition to genre annotations, this dataset provides further information about each album, such as the average rating, selling rank, similar products, and cover image URL. For every text review, it also provides the review's helpfulness score, rating, and summary.
The mapping between the three datasets (Amazon, MusicBrainz and MSD), genre annotations, metadata, data splits, text reviews and links to images are available here. Images and audio files cannot be released due to copyright issues.
MuMu dataset (mapping, metadata, annotations and text reviews)
Data splits and multimodal feature embeddings for ISMIR multi-label classification experiments
These data can be used together with the Tartarus deep learning library https://github.com/sergiooramas/tartarus.
NOTE: This version provides simplified files with metadata and splits.
Scientific References
Please cite the following papers if using MuMu dataset or Tartarus library.
Oramas, S., Barbieri, F., Nieto, O., and Serra, X. (2018). Multimodal Deep Learning for Music Genre Classification. Transactions of the International Society for Music Information Retrieval, V(1).
Oramas S., Nieto O., Barbieri F., & Serra X. (2017). Multi-label Music Genre Classification from audio, text and images using Deep Features. In Proceedings of the 18th International Society for Music Information Retrieval Conference (ISMIR 2017). https://arxiv.org/abs/1707.04916
License: CC0 1.0 Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
This brain tumor dataset contains 3064 T1-weighted contrast-enhanced images from 233 patients with three kinds of brain tumor: meningioma (708 slices), glioma (1426 slices), and pituitary tumor (930 slices). Due to the file size limit of the repository, we split the whole dataset into 4 subsets and archive them in 4 .zip files, each containing 766 slices. The 5-fold cross-validation indices are also provided.
This data is organized in MATLAB data format (.mat files). Each file stores a struct containing the following fields for an image:
This data was used in the following paper: Cheng, Jun, et al. "Enhanced Performance of Brain Tumor Classification via Tumor Region Augmentation and Partition." PLoS ONE 10(10) (2015).
MATLAB source code is available on GitHub: https://github.com/chengjun583/brainTumorRetrieval
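For Python users, here is a minimal sketch for reading one slice. It assumes the .mat files are saved in MATLAB v7.3 (HDF5) format and that the struct and field names ('cjdata', 'image', 'label', 'tumorMask'), as well as the label coding in the comments, match the dataset description; verify these against the files and README you actually download.

```python
import h5py
import numpy as np

with h5py.File("1.mat", "r") as f:  # example filename
    data = f["cjdata"]
    image = np.array(data["image"], dtype=np.float32)   # T1-weighted contrast-enhanced slice
    label = int(np.array(data["label"]).item())          # assumed: 1=meningioma, 2=glioma, 3=pituitary
    mask = np.array(data["tumorMask"], dtype=np.uint8)   # binary tumor region

print(image.shape, label, mask.sum())
```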
This is a reference global feature extraction model for the Google Landmark Retrieval 2020 Competition. You can use it as an initial submission to the competition or to better understand the model submission format and requirements.
To create a submission to the competition, download the dataset, and zip its contents.
This dataset contains a simplified version of DELG (a ResNet-101 backbone with ArcFace). It outputs global features only, and has been exported as a TensorFlow SavedModel with the competition's required serving signature, serving_default (the default when creating a SavedModel), and the required output, global_descriptor.
The model takes as input a single arbitrarily sized uint8 tensor of an RGB image, and outputs the embedding for the image as a float tensor with shape (2048,) to global_descriptor.
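A minimal inference sketch, assuming the SavedModel has been unpacked to a local directory (the path and image filename below are placeholders):

```python
import tensorflow as tf

model = tf.saved_model.load("delg_global_savedmodel")   # placeholder path
serving_fn = model.signatures["serving_default"]

# An arbitrarily sized RGB image as a uint8 tensor, no batch dimension.
image = tf.image.decode_jpeg(tf.io.read_file("query.jpg"), channels=3)

# If a positional call is rejected, inspect serving_fn.structured_input_signature
# and pass the tensor by its input name instead.
embedding = serving_fn(image)["global_descriptor"]   # float tensor, shape (2048,)
print(embedding.shape)
```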
DELG (GitHub):
"Unifying Deep Local and Global Features for Image Search",
B. Cao*, A. Araujo* and J. Sim,
arxiv:2001.05027
"Google Landmarks Dataset v2 - A Large-Scale Benchmark for Instance-Level Recognition and Retrieval",
T. Weyand*, A. Araujo*, B. Cao and J. Sim,
Proc. CVPR'20
GLDv2-clean (Kaggle dataset):
"Large-scale Landmark Retrieval/Recognition under a Noisy and Diverse Dataset",
K. Ozaki, S. Yokoo
License: CC0 1.0 Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
Indonesian textile craftsmanship has evolved over millennia, progressing from basic utilitarian weaving techniques around 2500 BC to intricate patterns carrying religious and social symbolism, with production hubs across regions such as Sumatra, Borneo, Java, Celebes, Nusa Tenggara, and Bali. These textiles evolved from utilitarian items into carriers of sacred meaning, divided into secular and sacred cloths, both renowned for their aesthetic beauty. They played a pivotal role in individuals' cultural journeys, symbolizing life stages such as maternity, matrimony, and mortality, with designs reflecting religious beliefs and the influences of their era. The Batik technique, a hallmark of Indonesian textile artistry, involves creating intricate patterns using a wax-resist method. Traditionally, artisans used a tool called a canting to draw patterns on fabric, a process known as batik tulis (drawn batik). After the drawing phase, the cloth was dyed using natural dyes and then subjected to the "lorot" process, in which the wax is boiled out of the fabric.
Batik making is revered for its complexity and demands high craftsmanship, requiring precise hand gestures and mastery of the canting tool. It stands as one of the most challenging pattern-making techniques in textile artistry. [1]
The primary objective of this dataset is to serve as a resource for research, academic, and educational purposes rather than commercial endeavors. The dataset was meticulously compiled to include high-quality images representative of various types of Batik, encompassing the rich diversity of Batik Nusantara (Indonesian Batik) from the Aceh to Papua regions.
Andrew has noted that the cornerstone of effective machine learning lies in the quality of the data: meticulously curated datasets have the power to unlock valuable insights and drive meaningful results; in other words, data is more important than models. In contrast, datasets lacking in quality may hinder the learning process and lead to suboptimal outcomes. Prioritizing data quality is therefore paramount, as it lays the foundation for successful machine learning initiatives [2]. Sebastian has likewise observed that the effectiveness of a machine learning algorithm depends greatly on the quality of the data and the richness of the information it encapsulates [3].
This dataset was carefully collected with the assistance of Ultralytics. The ownership of all images within this dataset belongs to the respective parties, to whom we extend our gratitude for contributing these visually captivating images.
[Dataset creator's name]. ([Year & Month of dataset creation]). [Name of the dataset], [Version of the dataset]. Retrieved [Date Retrieved] from [URL of the dataset].
Comprising 40 raw images per class with image dimension of 224 x 224, this dataset encompasses a wide array of Batik designs, each representing a distinct category. The classes include 'Aceh PintuAceh', 'Bali Barong', 'Bali Merak', 'DKI OndelOndel', 'JawaBarat Megamendung', 'JawaTimur Pring', 'Kalimantan Dayak', 'Lampung Gajah', 'Madura Mataketeran', 'Maluku Pala', 'NTB Lumbung', 'Papua Asmat', 'Papua Cendrawasih', 'Papua Tifa', 'Solo Parang', 'SulawesiSelatan Lontara', 'SumateraBarat Rumah Minang', 'SumateraUtara Boraspati', 'Yogyakarta Kawung', and 'Yogyakarta Parang' [2][3][4][5][6][7]. These classes collectively portray the rich heritage of Batik Nusantara or Batik Indonesia, spanning from the Aceh to Papua regions.
Feel free to explore image augmentation techniques to further enhance the dataset.
Simple example code is available in the accompanying Git repository and is written with Colab in mind. For reference, the following pre-trained architectures have been added: VGG16, ResNet50, Xception, and MobileNetV2, along with Content-Based Image Retrieval (CBIR), Random Forest, a CNN architecture and modeling, and an MLP. The code is also available in the Kaggle Dataset Notebooks (Code).
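As a starting point, here is a minimal transfer-learning sketch with one of the listed backbones (MobileNetV2); the directory name and the 80/20 train/validation split are assumptions to adapt to however you extract the archive.

```python
import tensorflow as tf

IMG_SIZE = (224, 224)   # matches the dataset's image dimensions
BATCH = 16

# On-the-fly 80/20 train/validation split from the extracted training folder
# ("batik/train" is a placeholder path).
train_ds = tf.keras.utils.image_dataset_from_directory(
    "batik/train", validation_split=0.2, subset="training",
    seed=42, image_size=IMG_SIZE, batch_size=BATCH)
val_ds = tf.keras.utils.image_dataset_from_directory(
    "batik/train", validation_split=0.2, subset="validation",
    seed=42, image_size=IMG_SIZE, batch_size=BATCH)

base = tf.keras.applications.MobileNetV2(
    input_shape=IMG_SIZE + (3,), include_top=False, weights="imagenet", pooling="avg")
base.trainable = False   # use the backbone as a frozen feature extractor

model = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1.0 / 127.5, offset=-1),   # MobileNetV2 expects inputs in [-1, 1]
    base,
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(len(train_ds.class_names), activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(train_ds, validation_data=val_ds, epochs=5)
```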
Below are steps to utilise the dataset using either Google Colab or Jupyter Notebook:
1. Begin by downloading the dataset.
2. Upon extraction, you'll find separate folders for training and testing data. Should you require validation data, either manually split a portion (approximately 20%) from the training set and store it separately, or perform an on-the-fly split during coding.
3. If splitting validation data manually, remember to re-zip the dataset after the separation process.
4....