8 datasets found
  1. GPR1200 Dataset

    • kaggle.com
    zip
    Updated Jan 4, 2022
    Cite
    Mathurin Aché (2022). GPR1200 Dataset [Dataset]. https://www.kaggle.com/datasets/mathurinache/gpr1200-dataset/data
    Explore at:
    Available download formats: zip (1248744484 bytes)
    Dataset updated
    Jan 4, 2022
    Authors
    Mathurin Aché
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    Context

    GPR1200: A Benchmark for General-Purpose Content-Based Image Retrieval (ArXiv)

    Content

    As with most vision-related tasks, deep learning models have taken over the field of content-based image retrieval (CBIR) over the course of the last decade. However, most publications that aim to optimise neural networks for CBIR train and test their models on domain-specific datasets, so it is unclear whether those networks can be used as general-purpose image feature extractors. After analyzing popular image retrieval test sets, we decided to manually curate GPR1200, an easy-to-use, accessible, yet challenging benchmark dataset with 1200 categories and 10 examples per class. Classes and images were manually selected from six publicly available datasets covering different image domains, ensuring high class diversity and clean class boundaries.

    Acknowledgements

    Overview image: https://github.com/Visual-Computing/GPR1200/raw/main/images/GPR_main_pic.jpg

    Inspiration

    Benchmark your image retrieval models on it. Result table: https://github.com/Visual-Computing/GPR1200/raw/main/images/result_table.JPG
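
    The dataset page does not ship an evaluation script, but for a benchmark with 1200 categories of 10 images each, the usual protocol is to embed every image, rank all other images by similarity, and report full mean average precision (mAP). The sketch below only illustrates that protocol under the assumption that you already have L2-normalised embeddings and integer category labels from your own feature extractor; the variable names are placeholders.

        import numpy as np

        def mean_average_precision(features, labels):
            """features: (N, D) L2-normalised embeddings; labels: (N,) integer category ids."""
            sims = features @ features.T              # cosine similarities (embeddings are normalised)
            np.fill_diagonal(sims, -np.inf)           # a query must never retrieve itself
            ap_sum = 0.0
            for i in range(len(labels)):
                ranking = np.argsort(-sims[i])                        # gallery ranked for query i
                hits = (labels[ranking] == labels[i]).astype(float)   # 1 where the category matches
                precision_at_k = np.cumsum(hits) / np.arange(1, len(hits) + 1)
                ap_sum += (precision_at_k * hits).sum() / hits.sum()  # average precision for this query
            return ap_sum / len(labels)

        # Hypothetical usage: `embeddings` from your model, `labels` derived from the class folders.
        # print(mean_average_precision(embeddings, labels))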

  2. Google-Landmarks Dataset

    • kaggle.com
    zip
    Updated Jul 15, 2022
    Cite
    Google (2022). Google-Landmarks Dataset [Dataset]. https://www.kaggle.com/google/google-landmarks-dataset
    Explore at:
    Available download formats: zip (1147 bytes)
    Dataset updated
    Jul 15, 2022
    Dataset authored and provided by
    Google (http://google.com/)
    Description

    Note: The Google Landmarks Dataset v1 is deprecated and no longer available. Please consider using the Google Landmarks Dataset v2 instead.

    Did you ever go through your vacation photos and ask yourself: What is the name of this temple I visited in China? Who created this monument I saw in France? Landmark recognition can help! This technology can predict landmark labels directly from image pixels, to help people better understand and organize their photo collections. Today, a great obstacle to landmark recognition research is the lack of large annotated datasets. This motivated us to release Google-Landmarks, the largest worldwide dataset to date, to foster progress in this problem.

    The dataset is divided into two sets of images, to evaluate two different computer vision tasks: recognition and retrieval. The data was originally described in [1] and published as part of the Google Landmark Recognition Challenge and Google Landmark Retrieval Challenge. Additionally, to spur research in this field, we have open-sourced Deep Local Features (DELF), an attentive local feature descriptor that we believe is especially suited for this kind of task. DELF's code can be found on GitHub.

    UPDATE: We have now also made available the Google Landmark Boxes dataset, containing 86 thousand bounding boxes.

    If you make use of the Google Landmarks dataset in your research, please consider citing:

    H. Noh, A. Araujo, J. Sim, T. Weyand, B. Han, "Large-Scale Image Retrieval with Attentive Deep Local Features", Proc. ICCV'17

    If you make use of the Google Landmark Boxes dataset in your research, please consider citing:

    M. Teichmann*, A. Araujo*, M. Zhu and J. Sim, “Detect-to-Retrieve: Efficient Regional Aggregation for Image Search”, Proc. CVPR'19

    Challenges

    The two challenges associated with this dataset are the Google Landmark Recognition Challenge and the Google Landmark Retrieval Challenge mentioned above.

    CVPR'18 Workshop

    The Landmark Recognition Workshop at CVPR 2018 will discuss recent progress on landmark recognition and image retrieval, taking into account the results of the above-mentioned challenges. Top submissions for the challenges will be invited to give talks at the workshop.

    Content

    The dataset contains URLs of images which are publicly available online (the accompanying Python script may be useful for downloading the images). Note that no image data is released, only URLs.
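
    Since only URLs are released, the images must be fetched separately. Below is a minimal, illustrative download sketch assuming the requests library is available; the CSV layout with 'id' and 'url' columns and the file names are assumptions rather than part of the official release, so adapt them to the actual files.

        import csv
        import os
        import requests

        def download_images(csv_path, out_dir, timeout=10.0):
            """Download each image listed in a CSV with 'id' and 'url' columns."""
            os.makedirs(out_dir, exist_ok=True)
            with open(csv_path, newline="") as f:
                for row in csv.DictReader(f):
                    target = os.path.join(out_dir, f"{row['id']}.jpg")
                    if os.path.exists(target):
                        continue                      # skip files we already have
                    try:
                        resp = requests.get(row["url"], timeout=timeout)
                        resp.raise_for_status()
                        with open(target, "wb") as img:
                            img.write(resp.content)
                    except requests.RequestException:
                        pass                          # URLs go stale over time; just skip failures

        # download_images("train.csv", "train_images")   # hypothetical file names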

    The dataset contains test images, training images and index images. The test images are used in both tasks: for the recognition task, a landmark label may be predicted for each test image; for the retrieval task, relevant index images may be retrieved for each test image. The training images are associated to landmark labels, and can be used to train models for the recognition and retrieval challenges (for a visualization of the geographic distribution of training images, see [3]). The index images are used in the retrieval task, composing the set from which images should be retrieved.

    Note that the test set for both the recognition and retrieval tasks is the same, to encourage researchers to experiment with both. We also encourage participants to use the training data from the recognition task to train models which could be useful for the retrieval task. Note, however, that there are no landmarks in common between the training/index sets of the two tasks.

    The images listed in the dataset are not directly in our control, so their availability may change over time, and the dataset files may be updated to remove URLs which no longer work.

    Dataset construction

    The training and index sets were constructed by clustering photos with respect to their geolocation and visual similarity using an algorithm similar to the one described in [4]. Matches between training images were established using local feature matching. Note that there may be multiple clusters per landmark, which typically correspond to different views or different parts of the landmark. To avoid bias, no computer vision algorithms were used for ground truth generation. Instead, we established ground truth correspondences between test images and landmarks using human annotators.

    License

    The images listed in this dataset are publicly available on the web, and may have different licenses. Google does not own their copyright.

    ...

  3. Data from: NeSy4VRD: A Multifaceted Resource for Neurosymbolic AI Research...

    • zenodo.org
    • data-staging.niaid.nih.gov
    • +1more
    zip
    Updated Sep 8, 2025
    Cite
    David Herron; Ernesto Jimenez-Ruiz; Giacomo Tarroni; Tillman Weyde (2025). NeSy4VRD: A Multifaceted Resource for Neurosymbolic AI Research using Knowledge Graphs in Visual Relationship Detection [Dataset]. http://doi.org/10.5281/zenodo.17076303
    Explore at:
    Available download formats: zip
    Dataset updated
    Sep 8, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    David Herron; Ernesto Jimenez-Ruiz; Giacomo Tarroni; Tillman Weyde
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Sep 8, 2025
    Description

    NeSy4VRD

    NeSy4VRD is a multifaceted, multipurpose resource designed to foster neurosymbolic AI (NeSy) research, particularly NeSy research using Semantic Web technologies such as OWL ontologies, OWL-based knowledge graphs and OWL-based reasoning as symbolic components. The NeSy4VRD research resource pertains to the computer vision field of AI and, within that field, to the application tasks of visual relationship detection (VRD) and scene graph generation.

    Whilst the core motivation of the NeSy4VRD research resource is to foster computer vision-based NeSy research using Semantic Web technologies such as OWL ontologies and OWL-based knowledge graphs, AI researchers can readily use NeSy4VRD to either: 1) pursue computer vision-based NeSy research without involving Semantic Web technologies as symbolic components, or 2) pursue computer vision research without NeSy (i.e. pursue research that focuses purely on deep learning alone, without involving symbolic components of any kind). This is the sense in which we describe NeSy4VRD as being multipurpose: it can readily be used by diverse groups of computer vision-based AI researchers with diverse interests and objectives.

    The NeSy4VRD research resource in its entirety is distributed across two locations: Zenodo and GitHub.

    NeSy4VRD on Zenodo: the NeSy4VRD dataset package

    This entry on Zenodo hosts the NeSy4VRD dataset package, which includes the NeSy4VRD dataset and its companion NeSy4VRD ontology, an OWL ontology called VRD-World.

    The NeSy4VRD dataset consists of an image dataset with associated visual relationship annotations. The images of the NeSy4VRD dataset are the same as those that were once publicly available as part of the VRD dataset. The NeSy4VRD visual relationship annotations are a highly customised and quality-improved version of the original VRD visual relationship annotations. The NeSy4VRD dataset is designed for computer vision-based research that involves detecting objects in images and predicting relationships between ordered pairs of those objects. A visual relationship for an image of the NeSy4VRD dataset has the form <'subject', 'predicate', 'object'>, where the 'subject' and 'object' are two objects in the image, and the 'predicate' describes some relation between them. Both the 'subject' and 'object' objects are specified in terms of bounding boxes and object classes. For example, representative annotated visual relationships are <'person', 'ride', 'horse'>, <'hat', 'on', 'teddy bear'> and <'cat', 'under', 'pillow'>.
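
    To make the annotation format concrete, a single annotated image might look roughly like the sketch below. The field names and the bounding-box convention shown here are illustrative assumptions, not the published NeSy4VRD schema; consult the dataset files for the exact layout.

        # Illustrative only: the field names and the bounding-box convention below are
        # assumptions, not the published NeSy4VRD annotation schema.
        annotation = {
            "image": "000001.jpg",
            "relationships": [
                {
                    "subject": {"class": "person", "bbox": [120, 340, 60, 210]},  # e.g. [ymin, ymax, xmin, xmax]
                    "predicate": "ride",
                    "object": {"class": "horse", "bbox": [200, 480, 40, 300]},
                },
            ],
        }

        # Each entry reads as <'subject', 'predicate', 'object'>, e.g. <'person', 'ride', 'horse'>.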

    Visual relationship detection is pursued as a computer vision application task in its own right, and as a building block capability for the broader application task of scene graph generation. Scene graph generation, in turn, is commonly used as a precursor to a variety of enriched, downstream visual understanding and reasoning application tasks, such as image captioning, visual question answering, image retrieval, image generation and multimedia event processing.

    The NeSy4VRD ontology, VRD-World, is a rich, well-aligned, companion OWL ontology engineered specifically for use with the NeSy4VRD dataset. It directly describes the domain of the NeSy4VRD dataset, as reflected in the NeSy4VRD visual relationship annotations. More specifically, all of the object classes that feature in the NeSy4VRD visual relationship annotations have corresponding classes within the VRD-World OWL class hierarchy, and all of the predicates that feature in the NeSy4VRD visual relationship annotations have corresponding properties within the VRD-World OWL object property hierarchy. The rich structure of the VRD-World class hierarchy and the rich characteristics and relationships of the VRD-World object properties together give the VRD-World OWL ontology rich inference semantics. These provide ample opportunity for OWL reasoning to be meaningfully exercised and exploited in NeSy research that uses OWL ontologies and OWL-based knowledge graphs as symbolic components. There is also ample potential for NeSy researchers to explore supplementing the OWL reasoning capabilities afforded by the VRD-World ontology with Datalog rules and reasoning.
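
    As a hedged illustration of how such an annotated relationship could be materialised in an OWL-based knowledge graph, the following sketch uses rdflib with a hypothetical namespace; the real VRD-World IRIs, class names and property names may differ.

        from rdflib import Graph, Namespace, RDF

        # Hypothetical namespace: the real VRD-World IRIs, classes and properties may differ.
        VRD = Namespace("http://example.org/vrd-world#")

        g = Graph()
        person = VRD["person_01"]          # an individual object detected in some image
        horse = VRD["horse_01"]
        g.add((person, RDF.type, VRD.Person))
        g.add((horse, RDF.type, VRD.Horse))
        g.add((person, VRD.ride, horse))   # the annotated relation <'person', 'ride', 'horse'>

        print(g.serialize(format="turtle"))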

    Use of the NeSy4VRD ontology, VRD-World, in conjunction with the NeSy4VRD dataset is, of course, purely optional, however. Computer vision AI researchers who have no interest in NeSy, or NeSy researchers who have no interest in OWL ontologies and OWL-based knowledge graphs, can ignore the NeSy4VRD ontology and use the NeSy4VRD dataset by itself.

    All computer vision-based AI research user groups can, if they wish, also avail themselves of the other components of the NeSy4VRD research resource available on GitHub.

    NeSy4VRD on GitHub: open source infrastructure supporting extensibility, and sample code

    The NeSy4VRD research resource incorporates additional components that are companions to the NeSy4VRD dataset package here on Zenodo. These companion components, available at NeSy4VRD on GitHub, consist of:

    • comprehensive open source Python-based infrastructure supporting the extensibility of the NeSy4VRD visual relationship annotations (and, thereby, the extensibility of the NeSy4VRD ontology, VRD-World, as well)
    • open source Python sample code showing how one can work with the NeSy4VRD visual relationship annotations in conjunction with the NeSy4VRD ontology, VRD-World, and RDF knowledge graphs.

    The NeSy4VRD infrastructure supporting extensibility consists of:

    • open source Python code for conducting deep and comprehensive analyses of the NeSy4VRD dataset (the VRD images and their associated NeSy4VRD visual relationship annotations)
    • an open source, custom-designed NeSy4VRD protocol for specifying visual relationship annotation customisation instructions declaratively, in text files
    • an open source, custom-designed NeSy4VRD workflow, implemented using Python scripts and modules, for applying small or large volumes of customisations or extensions to the NeSy4VRD visual relationship annotations in a configurable, managed, automated and repeatable process.

    The purpose behind providing comprehensive infrastructure to support extensibility of the NeSy4VRD visual relationship annotations is to make it easy for researchers to take the NeSy4VRD dataset in new directions, by further enriching the annotations, or by tailoring them to introduce new or additional data conditions that better suit their particular research needs and interests. The option to use the NeSy4VRD extensibility infrastructure in this way applies equally well to each of the diverse potential NeSy4VRD user groups already mentioned.

    The NeSy4VRD extensibility infrastructure, however, may be of particular interest to NeSy researchers interested in using the NeSy4VRD ontology, VRD-World, in conjunction with the NeSy4VRD dataset. These researchers can of course tailor the VRD-World ontology if they wish without needing to modify or extend the NeSy4VRD visual relationship annotations in any way. But their degrees of freedom for doing so will be limited by the need to maintain alignment with the NeSy4VRD visual relationship annotations and the particular set of object classes and predicates to which they refer. If NeSy researchers want full freedom to tailor the VRD-World ontology, they may well need to tailor the NeSy4VRD visual relationship annotations first, in order that alignment be maintained.

    To illustrate our point, and to illustrate our vision of how the NeSy4VRD extensibility infrastructure can be used, let us consider a simple example. It is common in computer vision to distinguish between thing objects (that have well-defined shapes) and stuff objects (that are amorphous). Suppose a researcher wishes to have a greater number of stuff object classes with which to work. Water is such a stuff object. Many VRD images contain water but it is not currently one of the annotated object classes and hence is never referenced in any visual relationship annotations. So adding a Water class to the class hierarchy of the VRD-World ontology would be pointless because it would never acquire any instances (because an object detector would never detect any). However, our hypothetical researcher could choose to do the following:

    • use the analysis functionality of the NeSy4VRD extensibility infrastructure to find images containing water (by, say, searching for images whose visual relationships refer to object classes such as 'boat', 'surfboard', 'sand', 'umbrella', etc.);
    • use free image analysis software (such as GIMP, at gimp.org) to get bounding boxes for instances of water in these images;
    • use the NeSy4VRD protocol to specify new visual relationships for these images that refer to the new 'water' objects (e.g. <'boat', 'on', 'water'>);
    • use the NeSy4VRD workflow to introduce the new object class 'water' and to apply the specified new visual relationships to the sets of annotations for the affected images;
    • introduce class Water to the class hierarchy of the VRD-World ontology (using, say, the free Protege ontology editor);
    • continue experimenting, now with the added benefit of the additional stuff object class 'water';
    • contribute the enriched set of NeSy4VRD visual relationship

  4. laion-400M

    • kaggle.com
    • opendatalab.com
    • +1more
    zip
    Updated Sep 5, 2021
    Cite
    Romain Beaumont (2021). laion-400M [Dataset]. https://www.kaggle.com/romainbeaumont/laion400m
    Explore at:
    Available download formats: zip (48835402620 bytes)
    Dataset updated
    Sep 5, 2021
    Authors
    Romain Beaumont
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Concept

    The LAION-400M dataset is completely open and freely accessible.

    Check https://laion.ai/laion-400-open-dataset/ for the full description of this dataset.

    All images and texts in the LAION-400M dataset have been filtered with OpenAI's CLIP by calculating the cosine similarity between the text and image embeddings and dropping those with a similarity below 0.3.

    The threshold of 0.3 was determined through human evaluations and seems to be a good heuristic for estimating semantic image-text content matching.
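
    As a rough illustration, the filter reduces to a cosine similarity test between L2-normalised CLIP embeddings. The sketch below assumes you already have an image embedding and a text embedding as NumPy arrays; how they were produced (e.g. with OpenAI's CLIP) is outside its scope.

        import numpy as np

        def keep_pair(image_emb, text_emb, threshold=0.3):
            """Return True if the image/text pair passes a LAION-style CLIP similarity filter."""
            image_emb = image_emb / np.linalg.norm(image_emb)
            text_emb = text_emb / np.linalg.norm(text_emb)
            similarity = float(np.dot(image_emb, text_emb))   # cosine similarity in [-1, 1]
            return similarity >= threshold

        # Hypothetical usage with precomputed embeddings:
        # keep = keep_pair(img_embeddings[i], txt_embeddings[i])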

    The image-text pairs were extracted from the Common Crawl web data dump and come from random web pages crawled between 2014 and 2021.

    Use img2dataset to download subsets of this.

    Dataset Statistics

    LAION-400M (and the even bigger datasets to come) is in fact a dataset of datasets. For instance, it can be filtered by image size into smaller subsets:

    • Number of unique samples: 413M
    • Number with height or width >= 1024: 26M
    • Number with height and width >= 1024: 9.6M
    • Number with height or width >= 512: 112M
    • Number with height and width >= 512: 67M
    • Number with height or width >= 256: 268M
    • Number with height and width >= 256: 211M

    By using the KNN index, specialized datasets can also be extracted for domains of interest. They are (or will be) sufficient in size to train domain-specialized models.

    Random Samples from the dataset

    Random samples can be browsed at http://gallerytest.christoph-schuhmann.de/photos/index.php?/category/4 (todo: replace link with local gallery). https://rom1504.github.io/clip-retrieval/ is a simple visualization of the dataset; there you can search the dataset using CLIP and a KNN index.

    LAION-400M Open Dataset structure

    We produced the dataset in several formats to address the various use cases:

    • a 50GB URL+caption metadata dataset in parquet files. This can be used to compute statistics and to re-download parts of the dataset
    • a 10TB webdataset with 256x256 images, captions and metadata. This is the full version of the dataset and can be used directly for training
    • a 1TB set of the 400M text and image CLIP embeddings, useful for rebuilding new KNN indices
    • two 4GB KNN indices allowing easy search within the dataset

    On Kaggle, we provide the URL and caption metadata dataset. Check https://laion.ai/laion-400-open-dataset/ for the other formats and the full explanation.

    URL and caption metadata dataset

    We provide 32 parquet files of size around 1GB (total 50GB) with the image URLs, the associated texts and additional metadata in the following format:

    SAMPLE_ID | URL | TEXT | LICENSE | NSFW | similarity | WIDTH | HEIGHT

    where

    • SAMPLE_ID: a unique identifier
    • LICENSE: if a Creative Commons license could be extracted from the image data, it is named here, e.g. "creativecommons.org/licenses/by-nc-sa/3.0/"; otherwise the field contains "?"
    • NSFW: CLIP was used to estimate whether the image has NSFW content. The estimation is fairly conservative, reducing the number of false negatives at the cost of more false positives. Possible values are "UNLIKELY", "UNSURE" and "NSFW"
    • similarity: value of the cosine similarity between the text and image embeddings
    • WIDTH and HEIGHT: image size as the image was embedded. Originals larger than 4K were resized to 4K

    This metadata dataset is best used to redownload the whole dataset or a subset of it. The img2dataset tool can be used to efficiently download such subsets.
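
    A hedged sketch of that workflow, assuming pandas (with a parquet engine) is installed and using the column names listed above; the shard and output file names are placeholders.

        import pandas as pd

        # Load one metadata shard (placeholder name for one of the 32 parquet files).
        df = pd.read_parquet("part-00000.parquet")

        # Keep well-matched, reasonably large images, mirroring the size buckets listed above.
        subset = df[(df["similarity"] >= 0.3) & (df["WIDTH"] >= 512) & (df["HEIGHT"] >= 512)]

        # Write the filtered rows back out; a tool such as img2dataset can then
        # re-download just this subset from the URL column.
        subset.to_parquet("laion_subset.parquet")
        print(f"kept {len(subset)} of {len(df)} rows")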

  5. MuMu: Multimodal Music Dataset

    • data.niaid.nih.gov
    • zenodo.org
    Updated Dec 6, 2022
    Cite
    Oramas, Sergio (2022). MuMu: Multimodal Music Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_831188
    Explore at:
    Dataset updated
    Dec 6, 2022
    Dataset provided by
    Universitat Pompeu Fabra
    Authors
    Oramas, Sergio
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    MuMu is a Multimodal Music dataset with multi-label genre annotations that combines information from the Amazon Reviews dataset and the Million Song Dataset (MSD). The former contains millions of album customer reviews and album metadata gathered from Amazon.com. The latter is a collection of metadata and precomputed audio features for a million songs.

    To map the information from both datasets we use MusicBrainz. This process yields a final set of 147,295 songs belonging to 31,471 albums. For the mapped set of albums there are 447,583 customer reviews from the Amazon dataset. The dataset has been used for multi-label music genre classification experiments in the related publication. In addition to genre annotations, the dataset provides further information about each album, such as average rating, selling rank, similar products, and cover image URL. For every text review it also provides the review's helpfulness score, rating, and summary.

    The mapping between the three datasets (Amazon, MusicBrainz and MSD), genre annotations, metadata, data splits, text reviews and links to images are available here. Images and audio files cannot be released due to copyright issues.

    MuMu dataset (mapping, metadata, annotations and text reviews)

    Data splits and multimodal feature embeddings for ISMIR multi-label classification experiments

    These data can be used together with the Tartarus deep learning library https://github.com/sergiooramas/tartarus.

    NOTE: This version provides simplified files with metadata and splits.

    Scientific References

    Please cite the following papers if using the MuMu dataset or the Tartarus library.

    Oramas, S., Barbieri, F., Nieto, O., and Serra, X. (2018). Multimodal Deep Learning for Music Genre Classification, Transactions of the International Society for Music Information Retrieval, V(1).

    Oramas S., Nieto O., Barbieri F., & Serra X. (2017). Multi-label Music Genre Classification from audio, text and images using Deep Features. In Proceedings of the 18th International Society for Music Information Retrieval Conference (ISMIR 2017). https://arxiv.org/abs/1707.04916

  6. Brain Tumor .npy

    • kaggle.com
    zip
    Updated Apr 18, 2020
    + more versions
    Cite
    Awsaf (2020). Brain Tumor .npy [Dataset]. https://www.kaggle.com/awsaf49/brain-tumor
    Explore at:
    Available download formats: zip (880096105 bytes)
    Dataset updated
    Apr 18, 2020
    Authors
    Awsaf
    License

    CC0 1.0 Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    This brain tumor dataset contains 3064 T1-weighted contrast-enhanced images from 233 patients with three kinds of brain tumor: meningioma (708 slices), glioma (1426 slices), and pituitary tumor (930 slices). Due to the file size limit of the repository, the whole dataset is split into 4 subsets, archived in 4 .zip files, each containing 766 slices. The 5-fold cross-validation indices are also provided.

    Content

    The data is organized in MATLAB data format (.mat files). Each file stores a struct containing the following fields for an image:

    • cjdata.label: 1 for meningioma, 2 for glioma, 3 for pituitary tumor
    • cjdata.PID: patient ID
    • cjdata.image: image data
    • cjdata.tumorBorder: a vector storing the coordinates of discrete points on the tumor border, e.g. [x1, y1, x2, y2, ...] where each (xi, yi) pair is a planar coordinate on the tumor border. The border was delineated manually, so it can be used to generate a binary tumor mask (a sketch follows this list).
    • cjdata.tumorMask: a binary image with 1s indicating tumor region
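
    As referenced in the cjdata.tumorBorder item, the following is a minimal sketch of rasterising a border vector into a binary mask. It assumes scikit-image is available and that the border has already been read from the file; whether x maps to columns or rows is an assumption, so swap the two coordinate columns if the masks come out transposed.

        import numpy as np
        from skimage.draw import polygon

        def border_to_mask(tumor_border, image_shape):
            """Rasterise a [x1, y1, x2, y2, ...] border vector into a binary tumor mask."""
            points = np.asarray(tumor_border, dtype=float).reshape(-1, 2)   # (x, y) pairs
            # Assumption: x is the column and y is the row; swap the columns below if needed.
            rows, cols = polygon(points[:, 1], points[:, 0], shape=image_shape)
            mask = np.zeros(image_shape, dtype=np.uint8)
            mask[rows, cols] = 1
            return mask

        # Hypothetical usage once cjdata has been read from a .mat (or converted .npy) file:
        # mask = border_to_mask(tumor_border, image.shape)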

    Acknowledgements

    This data was used in the following papers:

    1. Cheng, Jun, et al. "Enhanced Performance of Brain Tumor Classification via Tumor Region Augmentation and Partition." PLoS ONE 10(10) (2015).

    2. Cheng, Jun, et al. "Retrieval of Brain Tumors by Adaptive Spatial Pooling and Fisher Vector Representation." PLoS ONE 11(6) (2016).

    MATLAB source code is available on GitHub: https://github.com/chengjun583/brainTumorRetrieval

  7. Baseline Landmark Retrieval Model

    • kaggle.com
    zip
    Updated Jul 1, 2020
    Cite
    Cam Askew (2020). Baseline Landmark Retrieval Model [Dataset]. https://www.kaggle.com/camaskew/baseline-landmark-retrieval-model
    Explore at:
    Available download formats: zip (174041401 bytes)
    Dataset updated
    Jul 1, 2020
    Authors
    Cam Askew
    Description

    About

    This is a reference global feature extraction model for the Google Landmark Retrieval 2020 Competition. You can use it as an initial submission to the competition or to better understand the model submission format and requirements.

    To create a submission to the competition, download the dataset, and zip its contents.

    Content

    This dataset contains a simplified version of DELG (ResNet-101 backbone with ArcFace). It outputs global features only, and has been exported as a TensorFlow SavedModel with the competition's required serving signature, serving_default (the default when creating a SavedModel), and the required output, global_descriptor.

    The model takes as input a single arbitrarily sized uint8 tensor of an RGB image, and outputs the embedding for the image as a float tensor with shape (2048,) to global_descriptor.
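
    A hedged sketch of running the exported SavedModel on one image, assuming TensorFlow 2.x and that the archive has been unzipped into a directory named saved_model; the paths below are placeholders.

        import tensorflow as tf

        # Load the exported SavedModel and grab its required serving signature.
        model = tf.saved_model.load("saved_model")          # placeholder path to the unzipped model
        infer = model.signatures["serving_default"]

        # Decode one RGB image as an arbitrarily sized uint8 tensor.
        image_bytes = tf.io.read_file("query.jpg")          # placeholder image file
        image = tf.io.decode_jpeg(image_bytes, channels=3)  # shape (H, W, 3), dtype uint8

        # The signature returns a dict containing the required 'global_descriptor' output.
        # If the call complains about named inputs, pass the tensor by its input name instead.
        outputs = infer(image)
        descriptor = outputs["global_descriptor"]           # float tensor of shape (2048,)
        print(descriptor.shape)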

    Acknowledgements

    DELG (GitHub). Paper:

    "Unifying Deep Local and Global Features for Image Search",
    B. Cao*, A. Araujo* and J. Sim,
    arxiv:2001.05027
    

    GLDv2. Paper:

    "Google Landmarks Dataset v2 - A Large-Scale Benchmark for Instance-Level Recognition and Retrieval",
    T. Weyand*, A. Araujo*, B. Cao and J. Sim,
    Proc. CVPR'20
    

    GLDv2 clean (Kaggle dataset). Paper:

    "Large-scale Landmark Retrieval/Recognition under a Noisy and Diverse Dataset",
    K. Ozaki, S. Yokoo
    
  8. Batik Nusantara (Batik Indonesia) Dataset

    • kaggle.com
    zip
    Updated Feb 17, 2024
    Cite
    HendryHB (2024). Batik Nusantara (Batik Indonesia) Dataset [Dataset]. https://www.kaggle.com/datasets/hendryhb/batik-nusantara-batik-indonesia-dataset
    Explore at:
    Available download formats: zip (105554919 bytes)
    Dataset updated
    Feb 17, 2024
    Authors
    HendryHB
    License

    CC0 1.0 Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Batik Background

    Indonesian textile craftsmanship has evolved over millennia, transitioning from basic utilitarian weaving techniques around 2500 BC to more intricate patterns carrying religious, social, and cultural symbolism, with production hubs across regions like Sumatra, Borneo, Java, Celebes, Nusa Tenggara, and Bali. These textiles evolved from utilitarian items to carriers of sacred meanings, divided into secular and sacred cloths, both renowned for their aesthetic beauty. They played a pivotal role in individuals' cultural journeys, symbolizing life stages like maternity, matrimony, and mortality, with designs reflecting religious beliefs and the era's influence. The Batik technique, a hallmark of Indonesian textile artistry, involves creating intricate patterns using a wax-resist method. Traditionally, artisans used a tool called a canting to draw patterns on fabric, a process known as batik tulis (drawn batik). Following the drawing phase, the cloth was dyed using natural dyes and then subjected to the "lorot" process, in which the wax is boiled out of the fabric. Batik making is revered for its complexity and demands high craftsmanship, requiring precise hand gestures and mastery of the canting tool. It stands as one of the most challenging pattern-making techniques in textile artistry. [1]

    Dataset Collection

    The primary objective of this dataset is to serve as a resource for research, academic, or educational purposes rather than commercial endeavors. The dataset was meticulously compiled to include high-quality images representative of various types of Batik, encompassing the rich diversity of Batik Nusantara (Indonesian Batik) from the Aceh to Papua regions.

    Andrew has noted that the cornerstone of effective machine learning lies in the quality of the data: meticulously curated datasets hold the power to unlock valuable insights and drive meaningful results; in other words, data is more important than models. In contrast, datasets lacking in quality may hinder the learning process and lead to suboptimal outcomes. Therefore, prioritizing data quality is paramount, as it lays the foundation for successful machine learning initiatives [2]. Sebastian likewise notes that the effectiveness of a machine learning algorithm greatly depends on the quality of the data and the richness of the information it encapsulates [3].

    Acknowledgments

    This dataset was carefully collected with the assistance of Ultralytics. Ownership of all images within this dataset belongs to the respective parties, to whom we extend our gratitude for contributing these visually captivating images.

    To cite from Kaggle:

    [Dataset creator's name]. ([Year & Month of dataset creation]). [Name of the dataset], [Version of the dataset]. Retrieved [Date Retrieved] from [URL of the dataset].

    Dataset

    Comprising 40 raw images per class, each with dimensions of 224 x 224, this dataset encompasses a wide array of Batik designs, each representing a distinct category. The classes include 'Aceh PintuAceh', 'Bali Barong', 'Bali Merak', 'DKI OndelOndel', 'JawaBarat Megamendung', 'JawaTimur Pring', 'Kalimantan Dayak', 'Lampung Gajah', 'Madura Mataketeran', 'Maluku Pala', 'NTB Lumbung', 'Papua Asmat', 'Papua Cendrawasih', 'Papua Tifa', 'Solo Parang', 'SulawesiSelatan Lontara', 'SumateraBarat Rumah Minang', 'SumateraUtara Boraspati', 'Yogyakarta Kawung', and 'Yogyakarta Parang' [2][3][4][5][6][7]. These classes collectively portray the rich heritage of Batik Nusantara (Batik Indonesia), spanning from the Aceh to Papua regions. Feel free to explore image augmentation techniques to further enhance the dataset.

    Simple example code is available on Git and assumes Colab is used. For reference, the following pre-trained architectures have been added: VGG16, ResNet50, Xception, and MobileNetV2, along with Content-Based Image Retrieval (CBIR), Random Forest, a CNN architecture and modeling, and an MLP. The code is also available in the Kaggle dataset notebooks (Code).

    Instructions for Dataset Usage

    Below are the steps to use the dataset with either Google Colab or Jupyter Notebook:

    1. Begin by downloading the dataset.
    2. Upon extraction, you'll find separate folders for training and testing data. Should you require validation data, either manually split a portion (approximately 20%) from the training set and store it separately, or perform on-the-fly splitting during coding (a minimal sketch follows this list).
    3. If splitting validation data manually, remember to re-zip the dataset after the separation process.
    4. ...
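
    As referenced in step 2, a minimal sketch of the on-the-fly validation split, assuming TensorFlow/Keras and that the extracted training folder is named train; paths and parameters are placeholders.

        import tensorflow as tf

        IMG_SIZE = (224, 224)    # the dataset's native image dimensions

        # Split 20% of the training folder off as validation data on the fly.
        train_ds = tf.keras.utils.image_dataset_from_directory(
            "train",                      # placeholder path to the extracted training folder
            validation_split=0.2,
            subset="training",
            seed=42,
            image_size=IMG_SIZE,
            batch_size=32,
        )
        val_ds = tf.keras.utils.image_dataset_from_directory(
            "train",
            validation_split=0.2,
            subset="validation",
            seed=42,                      # same seed so the two subsets do not overlap
            image_size=IMG_SIZE,
            batch_size=32,
        )
        print(train_ds.class_names)       # the 20 Batik classes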
