46 datasets found
  1. Similix Image Dataset

    • kaggle.com
    zip
    Updated Feb 19, 2025
    Cite
    Ashish Patel (2025). Similix Image Dataset [Dataset]. https://www.kaggle.com/datasets/ashishpatel8736/similix-image-dataset
    Explore at:
    zip (64685706 bytes)
    Dataset updated
    Feb 19, 2025
    Authors
    Ashish Patel
    License

    MIT License, https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    The Similix Image Dataset is a curated collection of 1,803 high-quality images featuring a diverse range of objects, including bikes, cars, horses, cats, and humans. This dataset is designed to support image similarity search and computer vision projects, particularly for applications involving deep learning, feature extraction, and visual search engines.

    The dataset is ideal for researchers, data scientists, and developers working on:

    • Image similarity and search algorithms
    • Deep learning feature extraction models
    • Content-based image retrieval (CBIR)
    • Object recognition and classification tasks

    The dataset is provided in PNG format, ensuring compatibility with popular machine learning frameworks like TensorFlow, PyTorch, and OpenCV. It complements projects like "Similix," an image similarity search system using deep learning and FAISS for fast similarity searches.

    Feel free to reach out with questions or to share your project results using this dataset.
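
    As a rough illustration of the FAISS-based similarity search workflow the description refers to (not code shipped with the dataset), the sketch below indexes placeholder feature vectors and retrieves nearest neighbours; the embedding dimensionality and the random features are assumptions standing in for real deep features extracted from the 1,803 images:

    import faiss
    import numpy as np

    d = 512                                     # assumed embedding dimensionality
    features = np.random.rand(1803, d).astype("float32")  # placeholder deep features
    faiss.normalize_L2(features)                # so inner product behaves like cosine

    index = faiss.IndexFlatIP(d)
    index.add(features)

    query = features[:1]                        # query with the first image's vector
    scores, neighbors = index.search(query, 5)  # top-5 most similar images
    print(neighbors[0])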

  2. Reverse Image Search

    • openwebninja.com
    json
    Updated Jun 30, 2021
    Cite
    OpenWeb Ninja (2021). Reverse Image Search [Dataset]. https://www.openwebninja.com/api/reverse-image-search
    Explore at:
    json
    Dataset updated
    Jun 30, 2021
    Dataset authored and provided by
    OpenWeb Ninja
    Area covered
    Global Visual Search
    Description

    This dataset provides comprehensive reverse image search capabilities across the web. It allows you to find visually similar images by providing an image URL, returning matching images along with their source URLs and metadata. The data includes information about where images appear online, helping track image usage and find original sources. Users can leverage this dataset for image tracking, duplicate detection, copyright monitoring, and building visual search features. Whether you're tracking brand assets, finding unauthorized image usage, or building visual search tools, this dataset provides current and reliable reverse image search results. The dataset is delivered in a JSON format via REST API.
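
    For orientation only, a hypothetical sketch of calling such a REST endpoint from Python follows. The endpoint URL is taken from the citation above, but the query parameter name, authentication header and response fields are assumptions, not documented API details:

    import requests

    API_ENDPOINT = "https://www.openwebninja.com/api/reverse-image-search"

    response = requests.get(
        API_ENDPOINT,
        params={"url": "https://example.com/query-image.jpg"},  # assumed parameter name
        headers={"Authorization": "Bearer YOUR_API_KEY"},        # assumed auth scheme
        timeout=30,
    )
    response.raise_for_status()
    payload = response.json()  # results are delivered as JSON per the description

    for match in payload.get("results", []):  # "results" key is an assumption
        print(match)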

  3. AI Image Similarity Challenge: Descriptor Track

    • kaggle.com
    zip
    Updated Jul 21, 2021
    Cite
    Preeti Rana (2021). AI Image Similarity Challenge: Descriptor Track [Dataset]. https://www.kaggle.com/priyasi/ai-image-similarity-challenge-descriptor-track
    Explore at:
    zip (54755784 bytes)
    Dataset updated
    Jul 21, 2021
    Authors
    Preeti Rana
    Description

    You will receive a reference set of 1 million images and a query set of 50,000 images. Some of the query images are derived from images in the reference set, and the rest are not.

    For this Descriptor Track, your task is to compute image descriptors (embeddings) for both the 50,000 query and 1 million reference images. Your descriptors will be floating-point vectors of up to 256 dimensions.
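
    As a hedged baseline (not the challenge's prescribed method), one way to produce descriptors within the 256-dimension limit is to reduce high-dimensional CNN features with PCA and L2-normalize them. The feature arrays below are random stand-ins, far smaller than the real 1 million reference and 50,000 query sets:

    import numpy as np
    from sklearn.decomposition import PCA

    reference_feats = np.random.rand(10000, 2048).astype("float32")  # stand-in CNN features
    query_feats = np.random.rand(500, 2048).astype("float32")

    pca = PCA(n_components=256).fit(reference_feats)   # fit the projection on references
    ref_desc = pca.transform(reference_feats)
    qry_desc = pca.transform(query_feats)

    # L2-normalize so inner product equals cosine similarity
    ref_desc /= np.linalg.norm(ref_desc, axis=1, keepdims=True)
    qry_desc /= np.linalg.norm(qry_desc, axis=1, keepdims=True)
    print(qry_desc.shape)  # (500, 256): within the 256-dimension limit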

  4. Distributed Vector Search System Report

    • datainsightsmarket.com
    doc, pdf, ppt
    Updated Jun 9, 2025
    Cite
    Data Insights Market (2025). Distributed Vector Search System Report [Dataset]. https://www.datainsightsmarket.com/reports/distributed-vector-search-system-502348
    Explore at:
    pdf, ppt, doc
    Dataset updated
    Jun 9, 2025
    Dataset authored and provided by
    Data Insights Market
    License

    https://www.datainsightsmarket.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The Distributed Vector Search (DVS) system market is experiencing rapid growth, driven by the increasing adoption of artificial intelligence (AI) and machine learning (ML) applications across diverse sectors. The market's expansion is fueled by the need for efficient and scalable solutions to manage and query large-scale vector databases, crucial for applications like recommendation engines, image and video search, and natural language processing. While precise market sizing data is unavailable, considering the high CAGR (let's assume a conservative 30% based on industry trends for similar rapidly growing technologies) and a likely 2025 market size in the low billions (e.g., $2 billion), we can project substantial growth in the coming years. Key drivers include the rising volume of unstructured data, advancements in deep learning models generating high-dimensional vectors, and the need for real-time search capabilities. The market is segmented by deployment (cloud, on-premise), application (recommendation systems, similarity search), and organization size (SMEs, large enterprises). Companies like Pinecone, Vespa, Zilliz, Weaviate, Elastic, Meta, Microsoft, Qdrant, and Spotify are major players, fostering competition and innovation within the space. However, challenges such as the complexity of implementing DVS systems and the need for specialized expertise can act as restraints to broader adoption. The forecast period (2025-2033) promises even more significant market expansion, driven by continuous technological advancements and increased awareness of DVS solutions' potential. The increasing integration of DVS into various industry verticals – from e-commerce to healthcare – will further fuel growth. While challenges exist, the potential benefits, including improved search accuracy, faster query response times, and better scalability, are compelling enterprises to invest in DVS systems. The competitive landscape is dynamic, with both established tech giants and specialized startups vying for market share. This dynamic environment will likely lead to further innovation and improved accessibility of DVS technology, driving even faster market growth in the coming decade.

  5. numpy Weights for fashion product images dataset

    • kaggle.com
    zip
    Updated Nov 9, 2022
    Cite
    kalash jindal (2022). numpy Weights for fashion product images dataset [Dataset]. https://www.kaggle.com/datasets/kalashj16/numpy-weights-for-fashion-product-images-dataset
    Explore at:
    zip (323701898 bytes)
    Dataset updated
    Nov 9, 2022
    Authors
    kalash jindal
    Description

    Context

    The growing e-commerce industry presents us with a large dataset waiting to be scraped and researched. In addition to professionally shot high-resolution product images, we also have multiple label attributes describing the product, which were manually entered while cataloging. To add to this, we also have descriptive text that comments on the product characteristics.

    Content

    Each product is identified by an ID like 42431. You will find a map to all the products in styles.csv. From there, you can fetch the image for this product from images/42431.jpg and the complete metadata from styles/42431.json.

    To get started easily, we have also exposed some of the key product categories and their display names in styles.csv.

    If this dataset is too large, you can start with a smaller (280MB) version here: https://www.kaggle.com/paramaggarwal/fashion-product-images-small

    Inspiration

    So what can you try building? Here are some suggestions:

    • Start with an image classifier. Use the masterCategory column from styles.csv and train a convolutional neural network.
    • The same can be achieved via NLP: extract the product descriptions from styles/42431.json and then run a classifier to get the masterCategory.
    • Try adding more sophisticated classification by predicting the other category labels in styles.csv.

    Transfer Learning is your friend; use it wisely. You can even take things much further from here:

    • Is it possible to build a GAN that takes a category as input and outputs an image?
    • Can you auto-encode the image attributes to build a visual search engine that converts an image into a small encoding, which is sent to the server to perform visual search?
    • Visual similarity search: given an image, suggest other similar images.

    Preprocessing was done on the Fashion Product Images Dataset (https://www.kaggle.com/datasets/paramaggarwal/fashion-product-images-dataset) and the result was saved in numpy format so people can use it.

    from keras.applications.resnet import preprocess_input
    import cv2
    import numpy as np
    from tqdm import tqdm

    IMAGE_DIMS = (60, 60, 3)

    def load_image(imagePath):
        # Read, resize to 60x60, convert BGR->RGB, then apply ResNet preprocessing
        image = cv2.imread(imagePath)
        image = cv2.resize(image, (IMAGE_DIMS[1], IMAGE_DIMS[0]))
        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
        image = preprocess_input(image)
        return image

    # image_ids is the list of image file paths from the dataset
    image_data = []
    for img_path in tqdm(image_ids):
        image_data.append(load_image(img_path))

    image_data = np.array(image_data, dtype="float")
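
    A minimal sketch of persisting and reloading the preprocessed array in numpy format, as this dataset does; the file name is an assumption:

    # Save once, then reload in later sessions without repeating the preprocessing.
    np.save("fashion_images_60x60.npy", image_data)   # hypothetical file name
    image_data = np.load("fashion_images_60x60.npy")
    print(image_data.shape)  # expected (num_images, 60, 60, 3)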

  6. Reproducible experiments on Learned Metric Index – proposition of learned...

    • data.mendeley.com
    Updated Jun 30, 2022
    + more versions
    Cite
    Terézia Slanináková (2022). Reproducible experiments on Learned Metric Index – proposition of learned indexing for unstructured data [Dataset]. http://doi.org/10.17632/8wp73zxr47.1
    Explore at:
    Dataset updated
    Jun 30, 2022
    Authors
    Terézia Slanináková
    License

    MIT License, https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    With this collection of code and configuration files (contained in "LMIF" = 'Learned Metric Index Framework'), outputs ("output-files") and datasets ("datasets") we set out to explore whether a learned approach to building a metric index is a viable alternative to the traditional way of constructing metric indexes. Specifically, we build the index as a series of interconnected machine learning models. This collection serves as the basis for the reproducibility paper accompanying our parent paper -- "Learned metric index—proposition of learned indexing for unstructured data" [1].

    1. In "datasets/" we make publicly available a collection of 3 individual dataset descriptors -- CoPhIR (1 million objects, 282 columns), Profimedia (1 million objects, 4096 columns), and MoCap (~350k objects, 4096 columns) -- together with "labels" obtained from a template index (M-tree or M-index), "queries" used to perform experimental searches, and "ground-truths" to evaluate the approximate k-NN performance of the index. Within "test" we include dummy data to ease the integration of any custom dataset (examples in "LMIF/*.ipynb") that a reader may want to integrate into our solution. In CoPhIR [2], each of the vectors is obtained by concatenating five MPEG-7 global visual descriptors extracted from an image downloaded from Flickr. The Profimedia image dataset [3] contains Caffe visual descriptors extracted from photo-stock images by a convolutional neural network. MoCap (motion capture data) [4] descriptors contain sequences of 3D skeleton poses extracted from 3+ hrs of recordings capturing actors performing more than 70 different motion scenarios. The dataset's size is 43 GB upon decompression.

    [1] Antol, Matej, et al. "Learned metric index—proposition of learned indexing for unstructured data." Information Systems 100 (2021): 101774. [2] Batko, Michal, et al. "Building a web-scale image similarity search system." Multimedia Tools and Applications 47.3 (2010): 599-629. [3] Budikova, Petra et al. "Evaluation platform for content-based image retrieval systems." International Conference on Theory and Practice of Digital Libraries. Springer, Berlin, Heidelberg, 2011. [4] Müller, Meinard, et al. "Documentation mocap database hdm05." (2007).

    2. "LMIF" contains a user-friendly environment to reproduce the experiments in [1]. LMIF consists of three components:
       • an implementation of the Learned Metric Index (distributed under the MIT license),
       • a collection of scripts and configuration setups necessary for re-running the experiments in [1], and
       • instructions for creating the reproducibility environment (Docker).
       For a thorough description of "LMIF", please refer to our reproducibility paper -- "Reproducible experiments on Learned Metric Index – proposition of learned indexing for unstructured data".

    3. "output-files" contains the reproduced outputs for each experiment, with generated figures and a concise ".html" report (as presented in [1]).
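
    As a hedged illustration of how the "queries" and "ground-truths" above can be used to score an approximate k-NN index (the exact file formats used by LMIF are not described here), a simple recall@k computation might look like this:

    import numpy as np

    def recall_at_k(retrieved_ids, ground_truth_ids, k):
        """Fraction of the true top-k neighbours found in the retrieved top-k."""
        recalls = []
        for retrieved, truth in zip(retrieved_ids, ground_truth_ids):
            hits = len(set(retrieved[:k]) & set(truth[:k]))
            recalls.append(hits / k)
        return float(np.mean(recalls))

    # Dummy example with two queries and k=3 (identifier lists are assumed inputs):
    retrieved = [[1, 5, 9], [2, 4, 8]]
    truth = [[1, 9, 7], [4, 2, 6]]
    print(recall_at_k(retrieved, truth, k=3))  # 2/3 hits per query -> 0.666...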

  7. Large Shoe Dataset (UT Zappos50K)

    • gts.ai
    json/zip
    Updated Apr 19, 2024
    Cite
    GLOBOSE TECHNOLOGY SOLUTIONS PRIVATE LIMITED (2024). Large Shoe Dataset (UT Zappos50K) [Dataset]. https://gts.ai/dataset-download/large-shoe-dataset-ut-zappos50k/
    Explore at:
    json/zip
    Dataset updated
    Apr 19, 2024
    Dataset authored and provided by
    GLOBOSE TECHNOLOGY SOLUTIONS PRIVATE LIMITED
    License

    CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Variables measured
    Brand, Gender, Material, GIST descriptors, Visual orientation, Color features (LAB), Background consistency, Category (shoe, sandal, slipper, boot)
    Measurement technique
    Fine-grained image classification and metadata labeling for retail visual search and similarity analysis.
    Description

    UT Zappos50K (UT-Zap50K) is a large shoe dataset with 50,025 product images categorized into shoes, sandals, slippers, and boots. Each image includes metadata such as gender, material, and brand, ideal for training AI models in e-commerce and visual similarity search.
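
    As a rough illustration only (the dataset's own extraction pipeline is not described here), a simple LAB color descriptor of the kind listed under "Variables measured" could be computed with OpenCV as follows; the file name is a placeholder:

    import cv2
    import numpy as np

    def lab_color_features(image_path):
        bgr = cv2.imread(image_path)
        lab = cv2.cvtColor(bgr, cv2.COLOR_BGR2LAB)
        means, stds = cv2.meanStdDev(lab)                          # per-channel statistics
        return np.concatenate([means.flatten(), stds.flatten()])   # 6-dim descriptor

    features = lab_color_features("shoe_0001.jpg")  # hypothetical file name
    print(features)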

  8. User interaction and search depth for recurrence period, in days.

    • figshare.com
    xls
    Updated Jul 1, 2024
    Cite
    João António; Jorge Valente; Carlos Mora; Artur Almeida; Sandra Jardim (2024). User interaction and search depth for recurrence period, in days. [Dataset]. http://doi.org/10.1371/journal.pone.0304915.t002
    Explore at:
    xls
    Dataset updated
    Jul 1, 2024
    Dataset provided by
    PLOS, http://plos.org/
    Authors
    João António; Jorge Valente; Carlos Mora; Artur Almeida; Sandra Jardim
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    User interaction and search depth for recurrence period, in days.

  9. Data for "Prediction of Search Targets From Fixations in Open-World...

    • darus.uni-stuttgart.de
    Updated Oct 28, 2022
    Cite
    Andreas Bulling (2022). Data for "Prediction of Search Targets From Fixations in Open-World Settings" [Dataset]. http://doi.org/10.18419/DARUS-3226
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Oct 28, 2022
    Dataset provided by
    DaRUS
    Authors
    Andreas Bulling
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0), https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Area covered
    World
    Dataset funded by
    DFG
    Cluster of Excellence on Multimodal Computing and Interaction (MMCI) at Saarland University
    Description

    We designed a human study to collect fixation data during visual search. We opted for a task that involved searching for a single image (the target) within a synthesised collage of images (the search set). Each collage is a random permutation of a finite set of images. To explore the impact of the similarity in appearance between target and search set on both fixation behaviour and automatic inference, we created three different search tasks covering a range of similarities. In prior work, colour was found to be a particularly important cue for guiding search to targets and target-similar objects. Therefore, for the first task, we selected 78 coloured O'Reilly book covers to compose the collages. These covers show a woodcut of an animal at the top and the title of the book in a characteristic font underneath. Given that overall cover appearance was very similar, this task allows us to analyse fixation behaviour when colour is the most discriminative feature. For the second task we use a set of 84 book covers from Amazon. In contrast to the first task, the appearance of these covers is more diverse. This makes it possible to analyse fixation behaviour when both structure and colour information could be used by participants to find the target. Finally, for the third task, we use a set of 78 mugshots from a public database of suspects. In contrast to the other tasks, we transformed the mugshots to grey-scale so that they did not contain any colour information. This allows analysis of fixation behaviour when colour information was not available at all. We found faces to be particularly interesting given the relevance of searching for faces in many practical applications.

    18 participants (9 males), aged 18-30. Gaze data was recorded with a stationary Tobii TX300 eye tracker. More information about the dataset can be found in the README file.

  10. COCO, LVIS, Open Images V4 classes mapping

    • data.europa.eu
    • data.niaid.nih.gov
    • +1more
    unknown
    Updated Nov 23, 2023
    Cite
    Zenodo (2023). COCO, LVIS, Open Images V4 classes mapping [Dataset]. https://data.europa.eu/data/datasets/oai-zenodo-org-7194300?locale=en
    Explore at:
    unknown (1372)
    Dataset updated
    Nov 23, 2023
    Dataset authored and provided by
    Zenodo, http://zenodo.org/
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This repository contains a mapping between the classes of the COCO, LVIS, and Open Images V4 datasets into a unique set of 1460 classes. COCO [Lin et al. 2014] contains 80 classes, LVIS [Gupta et al. 2019] contains 1460 classes, and Open Images V4 [Kuznetsova et al. 2020] contains 601 classes. We built a mapping of these classes using a semi-automatic procedure in order to have a unique final list of 1460 classes. We also generated a hierarchy for each class using WordNet.

    This repository contains the following files:

    • coco_classes_map.txt, contains the mapping for the 80 COCO classes
    • lvis_classes_map.txt, contains the mapping for the 1460 LVIS classes
    • openimages_classes_map.txt, contains the mapping for the 601 Open Images classes
    • classname_hyperset_definition.csv, contains the final set of 1460 classes, their definition and hierarchy
    • all-classnames.xlsx, contains a side-by-side view of all classes considered

    This mapping was used in VISIONE [Amato et al. 2021, Amato et al. 2022], a content-based retrieval system that supports various search functionalities (text search, object/color-based search, semantic and visual similarity search, temporal search). For object detection, VISIONE uses three pre-trained models: VfNet [Zhang et al. 2021], Mask R-CNN [He et al. 2017], and a Faster R-CNN+Inception ResNet (trained on Open Images V4).

    This repository is released under a Creative Commons Attribution license; please cite the following paper if you use it in your work in any form:

    @article{amato2021visione,
      title={The VISIONE video search system: exploiting off-the-shelf text search engines for large-scale video retrieval},
      author={Amato, Giuseppe and Bolettieri, Paolo and Carrara, Fabio and Debole, Franca and Falchi, Fabrizio and Gennaro, Claudio and Vadicamo, Lucia and Vairo, Claudio},
      journal={Journal of Imaging},
      volume={7},
      number={5},
      pages={76},
      year={2021},
      publisher={Multidisciplinary Digital Publishing Institute}
    }

    References:

    [Amato et al. 2022] Amato, G. et al. (2022). VISIONE at Video Browser Showdown 2022. In: MultiMedia Modeling. MMM 2022. Lecture Notes in Computer Science, vol 13142. Springer, Cham. https://doi.org/10.1007/978-3-030-98355-0_52
    [Amato et al. 2021] Amato, G., Bolettieri, P., Carrara, F., Debole, F., Falchi, F., Gennaro, C., Vadicamo, L. and Vairo, C., 2021. The VISIONE video search system: exploiting off-the-shelf text search engines for large-scale video retrieval. Journal of Imaging, 7(5), p.76.
    [Gupta et al. 2019] Gupta, A., Dollár, P. and Girshick, R., 2019. LVIS: A dataset for large vocabulary instance segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 5356-5364).
    [He et al. 2017] He, K., Gkioxari, G., Dollár, P. and Girshick, R., 2017. Mask R-CNN. In Proceedings of the IEEE International Conference on Computer Vision (pp. 2961-2969).
    [Kuznetsova et al. 2020] Kuznetsova, A., Rom, H., Alldrin, N., Uijlings, J., Krasin, I., Pont-Tuset, J., Kamali, S., Popov, S., Malloci, M., Kolesnikov, A. and Duerig, T., 2020. The Open Images Dataset V4. International Journal of Computer Vision, 128(7), pp.1956-1981.
    [Lin et al. 2014] Lin, T.Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P. and Zitnick, C.L., 2014. Microsoft COCO: Common objects in context. In European Conference on Computer Vision (pp. 740-755). Springer, Cham.
    [Zhang et al. 2021] Zhang, H., Wang, Y., Dayoub, F. and Sünderhauf, N., 2021. VarifocalNet: An IoU-aware dense object detector. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 8514-8523).
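
    A minimal, hedged sketch for inspecting classname_hyperset_definition.csv with Python's csv module; the exact column layout is not documented in this listing, so the code only reads the header and counts rows:

    import csv

    with open("classname_hyperset_definition.csv", newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader)       # column names (layout assumed unknown here)
        rows = list(reader)

    print("columns:", header)
    print("number of classes:", len(rows))  # expected to be 1460 per the description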

  11. Datasets for "From authority to similarity: how Google transformed its...

    • orda.shef.ac.uk
    csv
    Updated Jul 6, 2025
    Cite
    Warren Pearce (2025). Datasets for "From authority to similarity: how Google transformed its knowledge infrastructure using computer vision" [Dataset]. http://doi.org/10.15131/shef.data.29481173.v2
    Explore at:
    csv
    Dataset updated
    Jul 6, 2025
    Dataset provided by
    The University of Sheffield
    Authors
    Warren Pearce
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0), https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description

    We investigate the impact of computer vision models, a prominent artificial intelligence tool, on critical knowledge infrastructure, using the case of Google search engines. We answer the following research question: How do search results for Google Images compare internationally with those for Google Search, and how can these results be explained by changes in Google’s knowledge infrastructure? To answer this question, we carry out four steps: 1) theorise the relationship between web epistemology, calculative technology, platform vernacular and issue configuration, illustrating the dynamics of critical knowledge infrastructures on the web; 2) provide a potted history of Google’s use of computer vision in search; 3) undertake the first international comparison of search results from Google Search with Google Images; 4) analyse the visual content of search results from Google Images. Using quanti-quali digital methods including visual content analysis, social semiotics and computer vision network analysis, we analyse search results related to environmental change across six countries, with two key findings. First, Google Images search results contain fewer authoritative sources than Google Search across all countries. Second, Google Images results constitute a narrow, homogenised visual repertoire across all countries. This constitutes a transformation in web epistemology from ranking-by-authority to ranking-by-similarity, driven by a shift in calculative technology from web links (Google Search) to computer vision (Google Images). Our findings and theoretical model open up new questions regarding the impact of computer vision on the public availability of knowledge in our increasingly image-saturated digital societies.

  12. FashionLocalTriplets

    • registry.opendata.aws
    Updated Jun 3, 2021
    Cite
    Amazon (2021). FashionLocalTriplets [Dataset]. https://registry.opendata.aws/fashionlocaltriplets/
    Explore at:
    Dataset updated
    Jun 3, 2021
    Dataset provided by
    Amazon.com, http://amazon.com/
    Description

    Fine-grained localized visual similarity and search for fashion.
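
    The dataset name suggests (anchor, positive, negative) image triplets, which are typically trained with a triplet margin loss; the sketch below shows that loss on random placeholder embeddings and is not specific to this release:

    import numpy as np

    def triplet_margin_loss(anchor, positive, negative, margin=0.2):
        # Pull anchor-positive pairs together, push anchor-negative pairs apart.
        pos_dist = np.linalg.norm(anchor - positive, axis=1)
        neg_dist = np.linalg.norm(anchor - negative, axis=1)
        return np.maximum(pos_dist - neg_dist + margin, 0.0).mean()

    rng = np.random.default_rng(0)
    a, p, n = (rng.normal(size=(4, 128)) for _ in range(3))  # placeholder embeddings
    print(triplet_margin_loss(a, p, n))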

  13. Amazon Product Images Dataset 2025

    • kaggle.com
    zip
    Updated Sep 20, 2025
    Cite
    Umer Haddii (2025). Amazon Product Images Dataset 2025 [Dataset]. https://www.kaggle.com/datasets/umerhaddii/amazon-product-images-dataset-2025
    Explore at:
    zip (2843492692 bytes)
    Dataset updated
    Sep 20, 2025
    Authors
    Umer Haddii
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0), https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    Context

    Product images are a core part of e-commerce. They contain rich information about the product’s appearance, packaging, labeling, and category. These images are widely used for building computer vision systems in online retail, such as automated product classification, recommendation engines, and visual search.

    This dataset provides a large collection of 15,000+ product images across multiple categories (healthcare, beauty, supplements, electronics, household items, etc.). It is intended as a benchmark resource for computer vision research and e-commerce applications.

    Content

    Images: 15,000+ Amazon product images.

    Format: JPEG

    Categories: Mixed — includes consumer goods, healthcare, beauty, electronics, etc.

    Resolution: Varies, mostly medium to high quality.

    (The dataset may contain duplicate images; a simple near-duplicate detection sketch appears after the Potential Uses section below.)

    Potential Uses

    This dataset can be applied in:

    Computer Vision Research

    • Product classification
    • Object detection / segmentation
    • Image retrieval & similarity search

    E-commerce Applications

    • Automated catalog management
    • Visual product recommendation
    • Duplicate/near-duplicate product detection

    Deep Learning

    • Transfer learning with CNNs or Vision Transformers
    • Generative tasks (synthetic product image generation, augmentation)
    • Multimodal research (images + potential text labels)
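
    Since the dataset may contain duplicates and near-duplicate detection is listed as a use case, a hedged sketch of a simple average-hash comparison follows; the file names are placeholders, and a production system might prefer a dedicated hashing library or embedding-based similarity:

    from PIL import Image
    import numpy as np

    def average_hash(path, hash_size=8):
        # Downscale to an 8x8 grayscale thumbnail and threshold by its mean.
        img = Image.open(path).convert("L").resize((hash_size, hash_size))
        pixels = np.asarray(img, dtype=np.float32)
        return pixels > pixels.mean()

    def hamming_distance(h1, h2):
        return int(np.count_nonzero(h1 != h2))

    h1 = average_hash("product_a.jpg")  # hypothetical file names
    h2 = average_hash("product_b.jpg")
    if hamming_distance(h1, h2) <= 5:   # small distance suggests near-duplicates
        print("likely near-duplicates")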

    Notes

    Dataset contains images only (no metadata such as title, description, or price).

    Provided for educational and research purposes only.

    Original product rights remain with their respective owners.

    Acknowledgements

    The Dataset Images belong to Amazon sellers.

  14. UoS Buildings Image Dataset for Computer Vision Algorithms

    • salford.figshare.com
    application/x-gzip
    Updated Jan 23, 2025
    Cite
    Ali Alameer; Mazin Al-Mosawy (2025). UoS Buildings Image Dataset for Computer Vision Algorithms [Dataset]. http://doi.org/10.17866/rd.salford.20383155.v2
    Explore at:
    application/x-gzip
    Dataset updated
    Jan 23, 2025
    Dataset provided by
    University of Salford
    Authors
    Ali Alameer; Mazin Al-Mosawy
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The dataset for this project consists of photos of the buildings of the University of Salford, taken with a mobile phone camera from different angles and at different distances. Even though this task sounds easy, it encountered some challenges, summarized below:

    1. Obstacles.
       a. Fixed or unremovable objects. When taking several photos of a building or a landscape from different angles and directions, some of these angles are blocked by fixed objects such as trees and plants, light poles, signs, statues, cabins, bicycle shades, scooter stands, generators/transformers, construction barriers, construction equipment and other service equipment, so it is unavoidable that some photos include these objects. This raises three questions:
       • Will these objects confuse the model/application we intend to create, i.e. will an obstacle prevent the model/application from identifying the designated building?
       • Or will the photos be more representative with these objects included, giving the model/application the capability to identify these buildings even with obstacles present?
       • How far is the maximum distance for detection? In other words, how far can the mobile device running the application be from the building and still (or no longer) detect the designated building?
       b. Removable and moving objects.
       • Any university is crowded with staff and students, especially during the rush hours of the day, so it is hard to take some photos without a person appearing in them at certain times of day. Due to privacy issues and out of respect for those people, such photos are better excluded.
       • Parked vehicles, trollies and service equipment can be obstacles and might appear in the images; they can also block access to some areas, so an image from a certain angle cannot be obtained.
       • Animals such as dogs, cats, birds or even squirrels cannot be avoided in some photos, which raises the same questions as above.
    2. Weather. In a deep learning project, more data means more accuracy and less error. At this stage of the project it was agreed to have 50 photos per building; the number of photos could be increased for more accurate results, but due to the time limitation of this project it was agreed on 50 per building only. These photos were taken on cloudy days. To expand the work (as future work and recommendations), photos on sunny, rainy, foggy, snowy and other weather conditions can be included, as well as photos at different times of day such as night, dawn and sunset, to give the designated model every possibility of identifying these buildings in all available circumstances.

    3. The selected buildings. It was agreed to select only 10 buildings from the University of Salford for this project, with at least 50 images per building. The selected buildings and the number of images taken are:
       • Chapman: 74 images
       • Clifford Whitworth Library: 60 images
       • Cockcroft: 67 images
       • Maxwell: 80 images
       • Media City Campus: 92 images
       • New Adelphi: 93 images
       • New Science, Engineering & Environment: 78 images
       • Newton: 92 images
       • Sports Centre: 55 images
       • University House: 60 images
       The Peel building is an important figure of the University of Salford due to its distinct and striking exterior design, but unfortunately it was excluded from the selection because of maintenance activities at the time the photos were collected: it was partially covered with scaffolding, with a lot of movement by personnel and equipment. If the supervisor suggests that this would be another challenge to include in the project, then it is mandatory to collect its photos. There are many other buildings at the University of Salford, and to expand the project in the future we could include all of them. The full list of the university's buildings can be reviewed on the interactive map at: www.salford.ac.uk/find-us

    4. Expand Further. This project can be improved further with many capabilities; again, due to the time limitation given to this project, these improvements can be implemented later as future work. In simple words, this project is to create an application that can display a building's name when a mobile device with a camera is pointed at that building. Future features to be added:
       a. Address/location: this will require collecting additional data, namely the longitude and latitude of each building included, or the postcode (which may be the same), taking into consideration how close these buildings appear in interactive map applications such as Google Maps, Google Earth or iMaps.
       b. Description of the building: what is the building for, which school occupies it, and what facilities does it include?
       c. Interior images: all the photos at this stage were taken of the exterior of the buildings. Will interior photos make an impact on the model/application? For example, if the user is inside Newton or Chapman and opens the application, will the building be identified, especially since the interiors of these buildings have a high level of similarity across corridors, rooms, halls and labs? Will the furniture and assets act as obstacles or as identification marks?
       d. Directions to a specific area/floor inside the building: if the interior images succeed with the model/application, it would be a good idea to add a search option so it can guide the user to a specific area, showing directions to that area. For example, if the user is inside the Newton building and searches for lab 141, it will direct them to the first floor with an interactive arrow that updates as the user approaches the destination. Or, if the application can identify the building from its interior, a drop-down list will be activated with each floor of this building; for example, if the model/application identifies the Newton building, pressing the drop-down list will present interactive tabs for each floor, and selecting a floor will display the facilities on that floor. Furthermore, if the model/application identifies another building, it should activate a different number of floors, as buildings differ in the number of floors. This feature could be improved with a voice assistant that directs the user after a search (similar to the voice assistant in Google Maps, but applied to the interiors of the university's buildings).
       e. Top view: if a drone with a camera can be afforded, it can provide aerial images and top views of the buildings to add to the model/application, but these images may face the same situation as the interior images: the buildings can look similar to each other from the top, with other obstacles included such as water tanks and AC units.

    5. Other Questions:
       • Will the model/application be reproducible? The presumed answer to this question should be YES, IF the model/application is fed with the proper data (images), such as images of restaurants, schools, supermarkets, hospitals, government facilities, etc.

  15. Theoretical comparison of online image search tools; state-of-the-art search...

    • plos.figshare.com
    xls
    Updated Jun 21, 2023
    Cite
    Umer Rashid; Maha Saddal; Ghazanfar Farooq; Muazzam Ali Khan; Naveed Ahmad (2023). Theoretical comparison of online image search tools; state-of-the-art search and exploration tools discussed in the literature and our proposed SUI approach. [Dataset]. http://doi.org/10.1371/journal.pone.0280400.t004
    Explore at:
    xls
    Dataset updated
    Jun 21, 2023
    Dataset provided by
    PLOS, http://plos.org/
    Authors
    Umer Rashid; Maha Saddal; Ghazanfar Farooq; Muazzam Ali Khan; Naveed Ahmad
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Theoretical comparison of online image search tools; state-of-the-art search and exploration tools discussed in the literature and our proposed SUI approach.

  16. Fashion Product Images Dataset

    • kaggle.com
    zip
    Updated Mar 14, 2019
    + more versions
    Cite
    Param Aggarwal (2019). Fashion Product Images Dataset [Dataset]. https://www.kaggle.com/datasets/paramaggarwal/fashion-product-images-dataset/discussion
    Explore at:
    zip (24771215740 bytes)
    Dataset updated
    Mar 14, 2019
    Authors
    Param Aggarwal
    License

    MIT License, https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Context

    The growing e-commerce industry presents us with a large dataset waiting to be scraped and researched. In addition to professionally shot high-resolution product images, we also have multiple label attributes describing the product, which were manually entered while cataloging. To add to this, we also have descriptive text that comments on the product characteristics.

    Content

    Each product is identified by an ID like 42431. You will find a map to all the products in styles.csv. From here, you can fetch the image for this product from images/42431.jpg and the complete metadata from styles/42431.json.

    To get started easily, we have also exposed some of the key product categories and their display names in styles.csv.

    If this dataset is too large, you can start with a smaller (280MB) version here: https://www.kaggle.com/paramaggarwal/fashion-product-images-small

    Inspiration

    So what can you try building? Here are some suggestions:

    • Start with an image classifier. Use the masterCategory column from styles.csv and train a convolutional neural network.
    • The same can be achieved via NLP. Extract the product descriptions from styles/42431.json and then run a classifier to get the masterCategory.
    • Try adding more sophisticated classification by predicting the other category labels in styles.csv

    Transfer Learning is your friend; use it wisely. You can even take things much further from here:

    • Is it possible to build a GAN that takes a category as input and outputs an image?
    • Can you auto-encode the image attributes to build a visual search engine that converts an image into a small encoding, which is sent to the server to perform visual search?
    • Visual similarity search? Given an image, suggest other similar images.
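
    A minimal, hedged sketch of the first suggestion above (an image classifier on the masterCategory column): the styles.csv and images/<id>.jpg paths follow the Content section, while the "id" column name, image size, architecture and training settings are illustrative assumptions:

    import pandas as pd
    import tensorflow as tf

    styles = pd.read_csv("styles.csv", on_bad_lines="skip")           # skip any malformed rows
    styles["path"] = "images/" + styles["id"].astype(str) + ".jpg"    # "id" column name assumed
    labels = sorted(styles["masterCategory"].unique())
    label_to_idx = {name: i for i, name in enumerate(labels)}

    def load_example(path, label):
        img = tf.io.decode_jpeg(tf.io.read_file(path), channels=3)
        img = tf.image.resize(img, (96, 96)) / 255.0                  # assumed input size
        return img, label

    ds = tf.data.Dataset.from_tensor_slices(
        (styles["path"].values, styles["masterCategory"].map(label_to_idx).values)
    ).map(load_example).batch(32)

    model = tf.keras.Sequential([
        tf.keras.layers.Conv2D(32, 3, activation="relu", input_shape=(96, 96, 3)),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(64, 3, activation="relu"),
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(len(labels), activation="softmax"),
    ])
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
    model.fit(ds, epochs=1)
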
  17. Shoe Dataset

    • cubig.ai
    zip
    Updated Jun 12, 2025
    Cite
    CUBIG (2025). Shoe Dataset [Dataset]. https://cubig.ai/store/products/440/shoe-dataset
    Explore at:
    zip
    Dataset updated
    Jun 12, 2025
    Dataset authored and provided by
    CUBIG
    License

    https://cubig.ai/store/terms-of-service

    Measurement technique
    Synthetic data generation using AI techniques for model training, Privacy-preserving data transformation via differential privacy
    Description

    1) Data Introduction • The Shoe Dataset is a multi-class image classification dataset for computer vision tasks, designed to classify various types of shoes. It consists of 6 distinct classes: boots, sneakers, flip-flops, loafers, sandals, and soccer shoes.

    2) Data Utilization (1) Characteristics of the Shoe Dataset: • The dataset contains images captured under various angles, backgrounds, and lighting conditions, making it suitable for evaluating model generalization performance in real-world environments. • Each class has clearly distinguishable visual features, which makes the dataset effective for both model training and testing.

    (2) Applications of the Shoe Dataset: • Shoe classification model development: The dataset can be used to train AI models that automatically classify shoe types using a variety of deep learning architectures. • Recommendation and search system research: It can be applied to develop product recommendation algorithms, image-based similarity search systems, and targeted advertising solutions on social media platforms.

  18. SHREC'14 Track: Large Scale Comprehensive 3D Shape Retrieval

    • data.nist.gov
    • gimi9.com
    • +1more
    Updated Apr 14, 2020
    Cite
    Afzal A. Godil (2020). SHREC'14 Track: Large Scale Comprehensive 3D Shape Retrieval [Dataset]. http://identifiers.org/ark:/88434/mds2-2219
    Explore at:
    Dataset updated
    Apr 14, 2020
    Dataset provided by
    National Institute of Standards and Technology, http://www.nist.gov/
    Authors
    Afzal A. Godil
    License

    https://www.nist.gov/open/license

    Description

    Objective: The objective of this track is to evaluate the performance of 3D shape retrieval approaches on a large-scale comprehensive 3D shape database which contains different types of models, such as generic, articulated, CAD and architecture models.

    Introduction: With the increasing number of 3D models created every day and stored in databases, the development of effective and scalable 3D search algorithms has become an important research area. In this contest, the task will be retrieving 3D models similar to a complete 3D model query from a new integrated large-scale comprehensive 3D shape benchmark including various types of models. Owing to the integration of the most important existing benchmarks to date, the newly created benchmark is the most exhaustive to date in terms of the number of semantic query categories covered, as well as the variations of model types. The shape retrieval contest will allow researchers to evaluate results of different 3D shape retrieval approaches when applied to a large-scale comprehensive 3D database. The benchmark is motivated by a recent large collection of human sketches built by Eitz et al. [1]. To explore how humans draw sketches and how humans recognise sketches, they collected 20,000 human-drawn sketches, categorized into 250 classes, each with 80 sketches. This sketch dataset is exhaustive in terms of the number of object categories. Thus, we believe that a 3D model retrieval benchmark based on their object categorizations will be more comprehensive and appropriate than currently available 3D retrieval benchmarks to more objectively and accurately evaluate the real practical performance of a comprehensive 3D model retrieval algorithm if implemented and used in the real world. Considering this, we built the SHREC'14 Large Scale Comprehensive Track Benchmark (SHREC14LSGTB) by collecting relevant models from the major previously proposed 3D object retrieval benchmarks. Our target is to find models for as many of the 250 classes as possible, and to find as many models as possible for each class. These previous benchmarks were compiled with different goals in mind and have not, to date, been considered in their sum. Our work is the first to integrate them to form a new, larger benchmark corpus for comprehensive 3D shape retrieval.

    Dataset: The SHREC'14 Large Scale Comprehensive Retrieval Track Benchmark has 8,987 models, categorized into 171 classes. We adopt a voting scheme to classify models. For each classification, we have at least two votes. If these two votes agree with each other, we confirm that the classification is correct; otherwise, we perform a third vote to finalize the classification. All the models are categorized according to the classifications in Eitz et al. [1], based on visual similarity.

    Evaluation Method: To have a comprehensive evaluation of the retrieval algorithms, we employ seven commonly adopted performance metrics in 3D model retrieval.

    Please cite the papers:

    [1] Bo Li, Yijuan Lu, Chunyuan Li, Afzal Godil, Tobias Schreck, Masaki Aono, Martin Burtscher, Qiang Chen, Nihad Karim Chowdhury, Bin Fang, Hongbo Fu, Takahiko Furuya, Haisheng Li, Jianzhuang Liu, Henry Johan, Ryuichi Kosaka, Hitoshi Koyanagi, Ryutarou Ohbuchi, Atsushi Tatsuma, Yajuan Wan, Chaoli Zhang, Changqing Zou. A Comparison of 3D Shape Retrieval Methods Based on a Large-scale Benchmark Supporting Multimodal Queries. Computer Vision and Image Understanding, November 4, 2014.
[2] Bo Li, Yijuan Lu, Chunyuan Li, Afzal Godil, Tobias Schreck, Masaki Aono, Qiang Chen, Nihad Karim Chowdhury, Bin Fang, Takahiko Furuya, Henry Johan, Ryuichi Kosaka, Hitoshi Koyanagi, Ryutarou Ohbuchi, Atsushi Tatsuma. SHREC' 14 Track: Large Scale Comprehensive 3D Shape Retrieval. Eurographics Workshop on 3D Object Retrieval 2014 (3DOR 2014): 131-140, 2014.

  19. Vector Embeddings Management Platform Market Research Report 2033

    • dataintelo.com
    csv, pdf, pptx
    Updated Sep 30, 2025
    Cite
    Dataintelo (2025). Vector Embeddings Management Platform Market Research Report 2033 [Dataset]. https://dataintelo.com/report/vector-embeddings-management-platform-market
    Explore at:
    pdf, pptx, csv
    Dataset updated
    Sep 30, 2025
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Vector Embeddings Management Platform Market Outlook




    According to our latest research, the global vector embeddings management platform market size reached USD 1.42 billion in 2024, with a robust year-on-year growth fueled by the rapid adoption of artificial intelligence and machine learning solutions across multiple industries. The market is expected to expand at a CAGR of 27.8% from 2025 to 2033, reaching a projected value of USD 13.24 billion by 2033. This impressive growth trajectory is primarily driven by the increasing need for efficient management and utilization of high-dimensional data representations, particularly as organizations accelerate their digital transformation initiatives and AI-powered applications become mainstream.




    One of the most significant growth factors propelling the vector embeddings management platform market is the exponential rise in unstructured data generation. As enterprises deploy advanced AI and machine learning models for natural language processing, recommendation engines, and multimedia analysis, the demand for platforms that can efficiently store, search, and manage vector embeddings has surged. These platforms enable organizations to handle complex data representations, making it possible to derive actionable insights from vast data lakes. The growing sophistication of AI models, particularly in fields like generative AI, semantic search, and personalized content delivery, has made vector embedding management platforms indispensable for maintaining scalability, speed, and accuracy in data-driven operations.




    Another critical driver for market growth is the increasing integration of vector embeddings management platforms with cloud-native architectures and microservices. As organizations migrate workloads to the cloud and embrace hybrid or multi-cloud strategies, the need for scalable, distributed, and highly available vector database solutions has become paramount. Cloud deployment not only offers flexibility and cost-efficiency but also simplifies the integration of AI/ML pipelines, thereby accelerating time-to-value for enterprises. Furthermore, the proliferation of open-source vector database technologies and APIs has democratized access, empowering both large enterprises and SMEs to leverage advanced vector search and similarity capabilities without prohibitive upfront investments.




    The expansion of use cases across verticals such as BFSI, healthcare, retail, and media has also contributed to the rapid growth of the vector embeddings management platform market. In the BFSI sector, for instance, these platforms are instrumental in fraud detection, customer segmentation, and risk assessment, while in healthcare, they facilitate medical image analysis and patient data retrieval. Retail and e-commerce players are leveraging vector embeddings for personalized recommendations and visual search, improving customer engagement and conversion rates. As AI adoption deepens across sectors, the demand for robust, secure, and high-performance vector embeddings management solutions is expected to intensify, further fueling market expansion.




    Regionally, North America continues to dominate the global vector embeddings management platform market, accounting for the largest revenue share in 2024. This leadership is attributed to the presence of leading AI technology vendors, high IT spending, and a mature ecosystem for AI/ML innovation. However, Asia Pacific is emerging as the fastest-growing region, driven by rapid digitalization, rising investments in AI infrastructure, and a burgeoning startup landscape. Europe also demonstrates significant potential, particularly in regulated industries where data privacy and compliance are pivotal. The Middle East & Africa and Latin America are gradually catching up, propelled by government initiatives and increasing awareness of AI's transformative potential. Overall, the global market is poised for sustained growth, with regional dynamics playing a crucial role in shaping competitive strategies and investment priorities.



    Component Analysis




    The vector embeddings management platform market is segmented by component into software and services, each playing a pivotal role in enabling organizations to harness the full potential of vector-based data representations. The software segment constitutes the core of the market, encompassing vector databases, indexing engines, similarity search algorithms, and integration APIs. These softwa

  20. AI Footage Search Platforms Market Research Report 2033

    • dataintelo.com
    csv, pdf, pptx
    Updated Sep 30, 2025
    Cite
    Dataintelo (2025). AI Footage Search Platforms Market Research Report 2033 [Dataset]. https://dataintelo.com/report/ai-footage-search-platforms-market
    Explore at:
    pptx, pdf, csv
    Dataset updated
    Sep 30, 2025
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    AI Footage Search Platforms Market Outlook



    According to our latest research, the AI Footage Search Platforms market size reached USD 1.42 billion globally in 2024, driven by the rapid adoption of artificial intelligence in media management and content retrieval. The market is forecasted to grow at a robust CAGR of 18.7% from 2025 to 2033, reaching an estimated USD 7.36 billion by the end of the forecast period. This remarkable growth is primarily fueled by the increasing volumes of digital video content, the demand for efficient search tools in broadcasting and media production, and the integration of advanced AI technologies for metadata tagging and content discovery.



    One of the primary growth factors for the AI Footage Search Platforms market is the exponential increase in video content across multiple industries, particularly in media & entertainment, advertising, and corporate communications. With the proliferation of smart devices and high-speed internet, organizations and individuals are generating and consuming more video content than ever before. This surge has created a pressing need for advanced search capabilities to manage, retrieve, and utilize vast libraries of footage efficiently. AI-powered search platforms leverage machine learning, computer vision, and natural language processing to automate metadata extraction, facial recognition, and scene detection, significantly reducing manual effort and time spent on content curation. As a result, businesses are able to streamline workflows, improve productivity, and enhance user experiences, further accelerating market growth.



    Another significant driver is the increasing adoption of AI technologies by broadcasting companies, production houses, and enterprises seeking to monetize their video assets. AI Footage Search Platforms enable these organizations to unlock the value of their archives by making content easily discoverable and reusable. This is particularly relevant for broadcasters and production companies with extensive historical footage, as AI-driven search tools can quickly surface relevant clips based on complex queries, visual similarity, or contextual metadata. Additionally, the integration of AI search platforms with cloud storage solutions facilitates remote access, collaboration, and scalability, making them indispensable for global teams and distributed production environments. The ongoing digital transformation across sectors is expected to further boost the demand for intelligent footage search solutions.



    The market is also benefiting from the growing emphasis on compliance, content moderation, and brand safety, especially in regulated industries and public sector organizations. AI Footage Search Platforms play a crucial role in enabling organizations to track, filter, and manage sensitive or inappropriate content in real time. This capability is essential for educational institutions, government agencies, and corporate enterprises that must adhere to strict regulatory standards or protect their reputations. As data privacy regulations and content standards become more stringent worldwide, the adoption of AI-powered search and moderation tools is expected to rise, fueling further market expansion.



    Regionally, North America dominates the AI Footage Search Platforms market, accounting for the largest revenue share in 2024, followed by Europe and Asia Pacific. The strong presence of leading technology providers, early adoption of AI solutions, and the concentration of major broadcasting and media companies contribute to North America's leadership. Europe is witnessing steady growth, driven by increasing investments in digital media infrastructure and regulatory support for AI innovation. Meanwhile, Asia Pacific is emerging as a high-growth region, supported by the rapid digitization of media, expanding internet penetration, and the rise of local content creators. Latin America and the Middle East & Africa are gradually embracing AI search platforms, with opportunities for future growth as digital ecosystems mature.



    Component Analysis



    The AI Footage Search Platforms market is segmented by component into Software and Services, with software solutions currently holding the lion’s share of the market revenue. AI-powered software platforms form the backbone of this market, providing the core functionality for video indexing, automated tagging, and semantic search. These platforms are continually evolving, incorporating advanced AI al
