This dataset contains 80 million high-quality vector images (SVG, EPS, AI formats), offering a vast collection for use in computer vision, machine learning, and creative applications. Each image is copyright-cleared and legally sourced through authorized channels, with transparent usage rights for both commercial and academic purposes. The dataset features a wide variety of vector content (icons, illustrations, infographics, and more) with excellent color fidelity and scalable resolution. It is suited to AI model training (e.g., image classification, object recognition), generative design models, and creative design inspiration, and its traceable IP rights support large-scale use in real-world environments.
VectorEdits: A Dataset and Benchmark for Instruction-Based Editing of Vector Graphics
Paper (Soon)
We introduce a large-scale dataset for instruction-guided vector image editing, consisting of over 270,000 pairs of SVG images, each accompanied by a natural language edit instruction. Our dataset enables training and evaluation of models that modify vector graphics based on textual commands. We describe the data collection process, including image pairing via CLIP similarity and instruction… See the full description on the dataset page: https://huggingface.co/datasets/authoranonymous321/VectorEdits.
This dataset comprises 80 million vector images. The resources are diverse in type, with excellent color accuracy and rich detail. All materials have been legally obtained through authorized channels, with clear indications of copyright ownership and the scope of usage authorization. The entire collection provides commercial-grade usage rights and has been granted permission for scientific research use, ensuring clear and traceable intellectual property attribution. The image resources support a wide range of applications, including computer vision research, training of image recognition algorithms, and sourcing materials for creative design.
Data size
80 million images
Image type
posters, patterns, cartoons, backgrounds and other categories
Data format
.eps
Data content
genuine image works released by the author
We created a novel database of mosquito images by sampling live mosquitoes from established colonies maintained by the Malaria Research and Reference Reagent Resource (MR4)/Biodefense and Emerging Infections (BEI) Resources at the Centers for Disease Control and Prevention (CDC) in Atlanta, GA. Adults of both sexes were imaged from 15 species of mosquitoes from three genera: 13 Anopheles, 2 Culex and 1 Aedes. There are a total of 1,709 images. We included an additional strain of An. gambiae s.s., resulting in two categories of this species: G3 and KISUMU1. Finally, for An. stephensi we captured images of mosquitoes using the two methods of storing mosquitoes: frozen versus dried samples. Images are organized in folders labeled by genus, species, strain, sex and storage method.
https://www.shibatadb.com/license/data/proprietary/v1.0/license.txt
Yearly citation counts for the publication titled "Feature-specific vector quantization of images".
This dataset provides browse images of the NASA Scatterometer (NSCAT) Level 3 daily gridded ocean wind vectors, which are provided at 0.5 degree spatial resolution for ascending and descending passes; wind vectors are averaged at points where adjacent passes overlap. This is the most up-to-date version, which reflects the final phase of calibration, validation and science data processing, completed in November 1998 on behalf of the JPL NSCAT Project; wind vectors are processed using the NSCAT-2 geophysical model function. Information on and access to the Level 3 source data used to generate these browse images are available at: http://podaac.jpl.nasa.gov/dataset/NSCAT%20LEVEL%203.
https://choosealicense.com/licenses/cc0-1.0/
Dataset Card for TreeOfLife-10M Vector database
Persistent files for vector Database created with chromadb containing the embeddings for all images in the imageomics/TreeOfLife-10M dataset.
Dataset Details
This dataset contains the generated vector database built using ChromaDb as the backend vector database solution for the entire TreeOfLife-10M dataset. The rationale behind creating a vector database was to enable blazingly fast nearest neighbor search. The vector… See the full description on the dataset page: https://huggingface.co/datasets/imageomics/tree-of-life-vector-db.
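As a rough illustration of what a nearest neighbor search over image embeddings does, the sketch below ranks stored vectors by cosine similarity to a query vector. It is a brute-force toy in plain Python, not the ChromaDB API and not how this database is queried; the function names, keys, and three-dimensional vectors are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_neighbors(query, embeddings, k=2):
    """Return the keys of the k stored vectors most similar to the query."""
    scored = [(cosine_similarity(query, vec), key) for key, vec in embeddings.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [key for _, key in scored[:k]]


# Hypothetical toy embeddings; real image embeddings have many more dimensions.
toy_embeddings = {
    "image_a": [1.0, 0.0, 0.0],
    "image_b": [0.9, 0.1, 0.0],
    "image_c": [0.0, 1.0, 0.0],
}

print(nearest_neighbors([1.0, 0.05, 0.0], toy_embeddings, k=2))
```

A brute-force scan like this compares the query against every stored vector, which is the cost a dedicated vector database is meant to avoid at the scale of millions of images.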
https://creativecommons.org/publicdomain/zero/1.0/
This dataset is a repository of sign language images taken by the Anki Vector robot. To understand the American sign language for the English alphabet, please take a look at the following video: https://www.youtube.com/watch?v=a5BD8SjhPSg
The dataset contains roughly 8500 images. Images are labelled according to the sign shown, for example all images named a_*.png are pictures of the sign for the letter 'a' taken by Vector. All images of the background (with no sign) are labelled as background_a.
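The naming convention described above suggests that a label can be read from the start of each file name. The following is a minimal sketch under that assumption: it treats everything before the first underscore as the label, so background images come out as `background`. The helper name and the example file names are hypothetical and should be checked against the actual files.

```python
from pathlib import Path


def label_from_filename(path):
    """Return the text before the first underscore in the file name.

    Assumes names such as 'a_<something>.png'; this is an interpretation
    of the dataset description, not a documented rule.
    """
    stem = Path(path).stem
    return stem.split("_", 1)[0]


# Hypothetical file names for illustration only.
for name in ["a_0001.png", "z_17.png", "background_a.png"]:
    print(name, "->", label_from_filename(name))
```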
Thanks to the entire ex-Anki team for working on a fantastic robot and making the SDK available for free.
Let's train a model to enable robots to accurately understand human sign language.
More material on this dataset is available in my online course, 'Learn AI with a robot', at http://robotics.thinkific.com
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Specialized collection of 0 free data visualization SVG illustrations from the technology & electronics category. Data visualization illustrations including bar charts, network graphs, and information graphics. Examples include: bar chart, network graph.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
OPEN-PMC
Arxiv | Code: Open-PMC Github | Model Checkpoint: Hugging Face
Dataset Summary
This dataset consists of image-text pairs extracted from medical papers available on PubMed Central. It has been curated to support research in medical image understanding, particularly in natural language processing (NLP) and computer vision tasks related to medical imagery. The dataset includes:
Extracted images from research articles.… See the full description on the dataset page: https://huggingface.co/datasets/vector-institute/open-pmc.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The 14 datasets used to build SVM and LS-SVM classification models of FLS.
The statistic shows the computer graphics software market value in the vector graphics segment from 2009 to 2013. In 2010, there was a market value of *** million U.S. dollars.
This child item contains the Mathworks Matlab mat-file outputs from the scripts described in the Ancillary Scripts child item. Each file contains the results for a particular field site. See the FGDC metadata Process Steps section for more information about opening these files. The mat-files included here have a standard set of output variables and include a variable named "zzVariableDescriptions" in each mat-file which describes the contents of the file. The following variables and descriptions are included in each mat-file (extracted from the "zzVariableDescriptions" variable):
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
CQ100 is a diverse and high-quality dataset of color images that can be used to develop, test, and compare color quantization algorithms. The dataset can also be used in other color image processing tasks, including filtering and segmentation.
If you find CQ100 useful, please cite the following publication: M. E. Celebi and M. L. Perez-Delgado, “CQ100: A High-Quality Image Dataset for Color Quantization Research,” Journal of Electronic Imaging, vol. 32, no. 3, 033019, 2023.
You may download the above publication free of charge from: https://www.spiedigitallibrary.org/journals/journal-of-electronic-imaging/volume-32/issue-3/033019/cq100--a-high-quality-image-dataset-for-color-quantization/10.1117/1.JEI.32.3.033019.full?SSO=1
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
This dataset was created by Глеб Мехряков
Released under MIT
According to our latest research, the global vector database market size reached USD 1.12 billion in 2024, demonstrating robust momentum driven by the surging adoption of artificial intelligence and machine learning applications. The market is experiencing a remarkable expansion, registering a CAGR of 22.4% from 2025 to 2033. By 2033, the market is forecasted to reach USD 8.43 billion, underscoring the transformative role of vector databases in powering next-generation data-driven solutions. This extraordinary growth trajectory is fueled by the increasing need for high-performance search and analytics capabilities across industries, as organizations pivot towards leveraging unstructured and semi-structured data for strategic advantage.
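The relationship between a start value, an end value, and a compound annual growth rate can be sketched as below. This is only an illustration of the standard compound-growth formula; the helper names are hypothetical, and the number of compounding periods is an assumption, since the passage gives a 2024 base value and a 2025 to 2033 forecast window without stating how many periods the rate is applied over.

```python
def project(start_value, cagr, periods):
    """Value after compounding start_value at rate cagr for the given periods."""
    return start_value * (1 + cagr) ** periods


def implied_cagr(start_value, end_value, periods):
    """Annual growth rate that takes start_value to end_value over the periods."""
    return (end_value / start_value) ** (1 / periods) - 1


# Figures from the passage, in USD billions.
base_2024 = 1.12
forecast_2033 = 8.43
stated_cagr = 0.224

# The period count is an assumption; both candidate readings are printed.
for periods in (9, 10):
    projected = project(base_2024, stated_cagr, periods)
    rate = implied_cagr(base_2024, forecast_2033, periods)
    print(f"periods={periods}: projected={projected:.2f}, implied CAGR={rate:.1%}")
```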
A primary growth factor for the vector database market is the exponential increase in the volume and complexity of unstructured data generated by enterprises. As organizations accumulate vast amounts of images, videos, text, and other rich media, traditional relational databases struggle to provide the speed and scalability required for real-time analysis and retrieval. Vector databases, designed specifically to handle high-dimensional vector representations, have become essential for enabling advanced search and recommendation systems. The proliferation of AI-powered applications, such as semantic search, natural language processing, and image recognition, is amplifying the demand for vector databases, as these systems rely on vector embeddings to deliver accurate and contextually relevant results. Furthermore, the integration of vector databases with popular machine learning frameworks is streamlining the development and deployment of intelligent solutions, accelerating market adoption.
Another significant driver is the rapid digital transformation across key verticals, including BFSI, healthcare, retail and e-commerce, and IT and telecommunications. Enterprises in these sectors are increasingly leveraging vector databases to enhance customer experiences, improve operational efficiency, and unlock new revenue streams. For instance, in retail and e-commerce, vector databases power personalized recommendation engines and visual search capabilities, driving higher conversion rates and customer satisfaction. In healthcare, they enable advanced medical image analysis and patient data retrieval, supporting better diagnostics and treatment outcomes. The growing emphasis on data-driven decision-making and the need to derive actionable insights from complex datasets are compelling organizations to invest in vector database technologies, further propelling market growth.
The evolution of deployment models and the rise of cloud-native architectures have also contributed to the expansion of the vector database market. Organizations are increasingly opting for cloud-based vector database solutions to benefit from scalability, flexibility, and cost efficiency. Cloud deployment enables seamless integration with existing IT infrastructure and allows enterprises to scale resources dynamically based on workload demands. This shift is particularly pronounced among small and medium enterprises (SMEs), which often lack the capital and expertise to maintain on-premises infrastructure. The availability of managed vector database services from major cloud providers is lowering the barrier to entry, democratizing access to advanced data management capabilities, and fueling widespread adoption across diverse industry segments.
The financial services sector is increasingly recognizing the potential of vector search technology. Vector search is changing how institutions manage and analyze vast datasets, enabling more accurate risk assessments and personalized customer interactions. By leveraging high-dimensional vector representations, financial organizations can enhance fraud detection, streamline compliance processes, and deliver tailored financial products. This technology is particularly beneficial in real-time trading environments, where rapid data retrieval and analysis are crucial. As the financial industry continues to evolve, the adoption of vector search solutions is set to accelerate.
From a regional perspective, North America continues to dominate the vector database market, driven by the p
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0) https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
The outbreak of dengue fever in recent years has become a grave public health concern as it has spread to 20 countries in South America and South Asia. As vectors of the flavivirus, several mosquito species belonging to the Aedes genus are responsible for transmitting dengue fever. Effective vector surveillance and control are essential in reducing dengue outbreaks. However, due to their minute variations in anatomical structure, it is challenging to identify Aedes mosquitoes without expert entomologists using a microscope. In this regard, deep learning algorithms can play a vital role in identifying mosquitoes using smartphone-captured images and pave the way for deskilling automated vector surveillance, provided that sufficient training examples are available.
A graphical representation of our working pipeline:
https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F15444916%2F19a5be5f62e8c13d80e4eeb60f83b5c9%2FA%20flow-diagram%20of%20the%20proposed%20mosquito%20detection%20system.png?generation=1718534243741585&alt=media
In this study, we developed the “Aedes Mosquito Image Dataset,” a collection of smartphone-captured mosquito images with 8 class labels: Aedes aegypti, Aedes koreicus, Aedes albopictus, Culex pipiens, Armigeres subalbatus, Culex quinquefasciatus, Aedes japonicus, and others (non-mosquito). The images were collected by trapping mosquitoes in several locations in Dhaka, followed by image capture and expert annotation in collaboration with ICDDR,B. Additional image data was collected from open-access online repositories.
Some sample images from dataset:
https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F15444916%2F95a785845e1d50ee6fc90683c1dbe58b%2FDifferent%20types%20of%20Mosquito%20Species.png?generation=1718537439037243&alt=media
The dataset contains a total of 31,999 images from 3 sources. The class distribution is presented as follows:

| Class label | No. of Images from Mosquito Alert | No. of Images from ICDDR,B | No. of Images from WHO | Total No. of Images (Class label Wise) |
|---|---|---|---|---|
| Aedes aegypti | 73 | 247 | 499 | 819 |
| Aedes albopictus | 15,268 | 7 | 500 | 15,775 |
| Aedes japonicus | 153 | 0 | 0 | 153 |
| Aedes koreicus | 38 | 0 | 0 | 38 |
| Armigeres subalbatus | 0 | 42 | 0 | 42 |
| Culex pipiens | 6,180 | 231 | 0 | 6,411 |
| Culex quinquefasciatus | 0 | 0 | 500 | 500 |
| Others (non-mosquito) | 8,261 | 0 | 0 | 8,261 |
| Total No. of Images (Source Wise) | 29,973 | 527 | 1,499 | 31,999 |
This dataset has 8 subfolders: 7 contain images of the different kinds of mosquito, and 1 folder (Others) contains non-mosquito images.
The dataset was constructed by sourcing images from Mosquito Alert, a WHO-accredited breeding laboratory, and traps set up by ICDDR,B. Each image was recorded along with its source information to facilitate further verification and ensure proper attribution. Copyright considerations were addressed following appropriate protocols to safeguard intellectual property rights.
Each image is assigned a name following the format of SourceCode_ClassLabel_Cropped_CroppingNumber_Resized. The corresponding source codes assigned to each source are: Mosquito Alert -> MSA; ICDDR,B -> ICD; WHO accredited bre...
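The naming format above can be split back into its parts with a short parser. The sketch below assumes that `Cropped` and `Resized` are literal tokens, that underscores separate the fields, and that the class label may itself contain spaces or underscores; none of this is confirmed by the description. Only the two source codes given in the text are mapped, and the example file name is hypothetical.

```python
from typing import NamedTuple


class ImageName(NamedTuple):
    source_code: str
    class_label: str
    cropping_number: str


# Only the source codes stated in the description; others are left out.
SOURCE_CODES = {"MSA": "Mosquito Alert", "ICD": "ICDDR,B"}


def parse_image_name(stem):
    """Split 'SourceCode_ClassLabel_Cropped_CroppingNumber_Resized'.

    Splitting from the right keeps any underscores inside the class
    label intact. Raises ValueError if there are too few fields.
    """
    head, _cropped, cropping_number, _resized = stem.rsplit("_", 3)
    source_code, class_label = head.split("_", 1)
    return ImageName(source_code, class_label, cropping_number)


# Hypothetical file name (extension already removed).
parsed = parse_image_name("MSA_Aedes albopictus_Cropped_1_Resized")
print(parsed, SOURCE_CODES.get(parsed.source_code, "unknown source"))
```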
300 million images, each corresponding to a description. All are genuine image works published by photographers. The vast majority of descriptions are in English, with very few in Chinese.
Data size
300 million images, each paired with a textual description. The complete image library (photographic and vector images) totals nearly 300 million; the full dataset available for generative AI training (curated photographic and vector images, excluding editorial/news images) comprises approximately 100 million.
Data formats
Image formats: .jpg, .png, .svg; Description format: .txt
Data content
Original copyrighted image works officially released by creators, accompanying descriptions authored by content creators.
Data types
Photographic images and vector illustrations, covering diverse scene categories.
Data resolution
4K and above
Description languages
Predominantly English, with a small Chinese portion.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Specialized collection of 0 free presentation SVG illustrations from the office & workplace category. Presentation scene illustrations with speakers at podiums, slide deck presentations, and product demonstrations. Examples include: speaker at podium, slide deck on screen, product demo.