Dataset Card for "new-image-dataset"
More Information needed
https://www.futurebeeai.com/policies/ai-data-license-agreement
Introducing the Bahasa Product Image Dataset - a diverse and comprehensive collection of images meticulously curated to propel the advancement of text recognition and optical character recognition (OCR) models designed specifically for the Bahasa language.
Dataset Content & Diversity: Containing a total of 2000 images, this Bahasa OCR dataset offers a diverse distribution across different types of product front images. In this dataset, you'll find a variety of text that includes product names, taglines, logos, company names, addresses, product content, etc. Images in this dataset showcase distinct fonts, writing formats, colors, designs, and layouts.
To ensure the diversity of the dataset and to build a robust text recognition model, we allow only a limited number (fewer than five) of unique images from a single source. Stringent measures have been taken to exclude any personally identifiable information (PII) and to ensure that in each image a minimum of 80% of the space contains visible Bahasa text.
Images have been captured under varying lighting conditions – both day and night – along with different capture angles and backgrounds, to build a balanced OCR dataset. The collection features images in portrait and landscape modes.
All images were captured by native Bahasa speakers to ensure text quality and to avoid toxic content and PII. We used the latest iOS and Android mobile devices with cameras above 5 MP to capture the images and maintain image quality. Images in this training dataset are available in both JPEG and HEIC formats.
Metadata: Along with the image data, you will also receive detailed structured metadata in CSV format. For each image, it includes metadata such as image orientation, country, language, and device information. Each image is renamed to correspond to its metadata entry.
The metadata serves as a valuable tool for understanding and characterizing the data, facilitating informed decision-making in the development of Bahasa text recognition models.
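As a minimal sketch of how such metadata might be used, the snippet below counts images per orientation and per device. The column names (`file_name`, `orientation`, `language`, `device`) and the sample rows are hypothetical, since the actual CSV header is not shown here; adjust them to match the delivered file.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows; the real metadata file's columns may differ.
SAMPLE_CSV = """file_name,orientation,language,device
img_0001.jpg,portrait,Bahasa,Android
img_0002.heic,landscape,Bahasa,iOS
img_0003.jpg,portrait,Bahasa,iOS
"""


def load_metadata(text):
    """Parse metadata CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def count_by(rows, column):
    """Count how many images share each value of the given column."""
    return Counter(row[column] for row in rows)


rows = load_metadata(SAMPLE_CSV)
orientation_counts = count_by(rows, "orientation")
device_counts = count_by(rows, "device")
print(orientation_counts)
print(device_counts)
```

Counts like these can help check whether a training split is balanced across orientations or devices before fitting a model.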
Update & Custom Collection: We're committed to expanding this dataset by continuously adding more images with the assistance of our native Bahasa crowd community.
If you require a custom product image OCR dataset tailored to your guidelines or specific device distribution, feel free to contact us. We're equipped to curate specialized data to meet your unique needs.
Furthermore, we can annotate the images with bounding boxes or transcribe the text in the images to align with your specific project requirements using our crowd community.
License: This image dataset, created by FutureBeeAI, is now available for commercial use.
Conclusion: Leverage the power of this product image OCR dataset to elevate the training and performance of text recognition, text detection, and optical character recognition models within the realm of the Bahasa language. Your journey to enhanced language understanding and processing starts here.
https://www.sapien.io/terms
High-quality image and video datasets for AI training in computer vision applications, including object recognition, scene understanding, and more.
Open Images is a dataset of ~9M images that have been annotated with image-level labels and object bounding boxes.
The training set of V4 contains 14.6M bounding boxes for 600 object classes on 1.74M images, making it the largest existing dataset with object location annotations. The boxes have been largely manually drawn by professional annotators to ensure accuracy and consistency. The images are very diverse and often contain complex scenes with several objects (8.4 per image on average). Moreover, the dataset is annotated with image-level labels spanning thousands of classes.
To use this dataset:
import tensorflow_datasets as tfds
ds = tfds.load('open_images_v4', split='train')
for ex in ds.take(4):
    print(ex)
See the guide for more information on tensorflow_datasets.
Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/open_images_v4-original-2.0.0.png
The image aesthetic benchmark [18] consists of 10800 Flickr photos of four categories, i.e., “animals”, “urban”, “people” and “nature”, and is constructed originally to retrieve beautiful yet unpopular images in social networks. The ground truths of the photos in the benchmark are five aesthetic grades: “Unacceptable” - images with extremely low quality, out of focus or underexposed, “Flawed” - images with some technical flaws and without any artistic value, “Ordinary” - standard quality images without technical flaws, “Professional” - professional-quality images with some artistic value, and “Exceptional” - very appealing images showing both outstanding professional quality and high artistic value.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The BIQ2021 dataset is a large-scale blind image quality assessment database, consisting of 12,000 authentically distorted images. Each image in the dataset has been quality rated by 30 observers, resulting in a total of 360,000 quality ratings. This dataset was created in a controlled laboratory environment, ensuring consistent and reliable subjective scoring. Moreover, the dataset provides a train/test split that researchers can use to report their results for benchmarking. The dataset is openly available and serves as a valuable resource for evaluating and benchmarking image quality assessment algorithms. The paper providing a detailed description of the dataset and its creation process is openly accessible: BIQ2021: A large-scale blind image quality assessment database.
The paper can be cited as:
Ahmed, N., & Asif, S. (2022). BIQ2021: a large-scale blind image quality assessment database. Journal of Electronic Imaging, 31(5), 053010.
Images: The dataset contains a folder named images with 12,000 images to be used for training and testing.
Train (Images and MOS): A CSV file containing the randomly partitioned training set of 10,000 images with their corresponding mean opinion scores (MOS).
Test (Images and MOS): A CSV file containing the randomly partitioned test set of 2,000 images with their corresponding MOS.
Benchmarking: To evaluate a predictive model trained on the dataset, Pearson's and Spearman's correlation coefficients can be computed and compared with the existing approaches and CNN models listed in the following GitHub repository: https://github.com/nisarahmedrana/BIQ2021
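The two correlation measures mentioned above can be sketched in plain Python as follows. This is an illustrative implementation, not the benchmarking code from the repository; the `mos` and `predicted` values are made-up sample numbers, and in practice a library routine such as those in SciPy would normally be used.

```python
import math


def pearson(x, y):
    """Pearson linear correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up example values: ground-truth MOS and model predictions.
mos = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.2, 1.9, 3.5, 3.9, 6.0]
plcc = pearson(mos, predicted)
srocc = spearman(mos, predicted)
print(f"PLCC={plcc:.3f} SROCC={srocc:.3f}")
```

Pearson's coefficient measures how linearly the predictions track the MOS, while Spearman's depends only on their rank order, so a model that preserves the ordering of image quality scores well can reach a high Spearman value even when its predictions are not linearly calibrated.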
https://www.futurebeeai.com/policies/ai-data-license-agreement
Welcome to the Native American Facial Images from Past Dataset, meticulously curated to enhance face recognition models and support the development of advanced biometric identification systems, KYC models, and other facial recognition technologies.
This dataset comprises over 5,000 images, divided into participant-wise sets with each set including:
The dataset includes contributions from a diverse network of individuals across Native American countries:
To ensure high utility and robustness, all images are captured under varying conditions:
Each image set is accompanied by detailed metadata for each participant, including:
This metadata is essential for training models that can accurately recognize and identify Native American faces across different demographics and conditions.
This facial image dataset is ideal for various applications in the field of computer vision, including but not limited to:
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
## Overview
DATASET IMAGE is a dataset for object detection tasks - it contains FISH WcSP annotations for 4,230 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [Public Domain license](https://creativecommons.org/publicdomain/zero/1.0/).
https://www.futurebeeai.com/policies/ai-data-license-agreement
Introducing the Finnish Product Image Dataset - a diverse and comprehensive collection of images meticulously curated to propel the advancement of text recognition and optical character recognition (OCR) models designed specifically for the Finnish language.
Dataset Content & Diversity: Containing a total of 2000 images, this Finnish OCR dataset offers a diverse distribution across different types of product front images. In this dataset, you'll find a variety of text that includes product names, taglines, logos, company names, addresses, product content, etc. Images in this dataset showcase distinct fonts, writing formats, colors, designs, and layouts.
To ensure the diversity of the dataset and to build a robust text recognition model, we allow only a limited number (fewer than five) of unique images from a single source. Stringent measures have been taken to exclude any personally identifiable information (PII) and to ensure that in each image a minimum of 80% of the space contains visible Finnish text.
Images have been captured under varying lighting conditions – both day and night – along with different capture angles and backgrounds, to build a balanced OCR dataset. The collection features images in portrait and landscape modes.
All images were captured by native Finnish speakers to ensure text quality and to avoid toxic content and PII. We used the latest iOS and Android mobile devices with cameras above 5 MP to capture the images and maintain image quality. Images in this training dataset are available in both JPEG and HEIC formats.
Metadata: Along with the image data, you will also receive detailed structured metadata in CSV format. For each image, it includes metadata such as image orientation, country, language, and device information. Each image is renamed to correspond to its metadata entry.
The metadata serves as a valuable tool for understanding and characterizing the data, facilitating informed decision-making in the development of Finnish text recognition models.
Update & Custom Collection: We're committed to expanding this dataset by continuously adding more images with the assistance of our native Finnish crowd community.
If you require a custom product image OCR dataset tailored to your guidelines or specific device distribution, feel free to contact us. We're equipped to curate specialized data to meet your unique needs.
Furthermore, we can annotate the images with bounding boxes or transcribe the text in the images to align with your specific project requirements using our crowd community.
License: This image dataset, created by FutureBeeAI, is now available for commercial use.
Conclusion: Leverage the power of this product image OCR dataset to elevate the training and performance of text recognition, text detection, and optical character recognition models within the realm of the Finnish language. Your journey to enhanced language understanding and processing starts here.
Dataset Card for "pagoda-text-and-image-dataset-small"
More Information needed
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Resume Images is a dataset for object detection tasks - it contains Resume annotations for 2,694 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
https://cubig.ai/store/terms-of-service
1) Data Introduction
• The Egg Image Dataset is constructed by collecting images of eggs captured in real-world environments, classified based on whether the eggs are damaged or not damaged.
2) Data Utilization
(1) Characteristics of the Egg Image Dataset:
• It includes images collected from various real-world settings such as kitchens, farms, and markets, making it highly effective for model training and improving data generalization.
• The dataset provides a clear distinction between damaged and undamaged eggs, making it suitable for solving problems related to object recognition and quality inspection.
(2) Applications of the Egg Image Dataset:
• Development of Object Recognition and Quality Classification Models: It can be used to train AI models to automatically detect and classify eggs based on their damage status.
• Utilization in Research and Development (R&D): The dataset can be applied to various R&D projects, including product quality management and the development of automated inspection systems.
Open Database License (ODbL) v1.0 https://www.opendatacommons.org/licenses/odbl/1.0/
License information was derived automatically
Humans have long struggled to distinguish between these majestic big cat species, and now we invite data scientists, researchers, and enthusiasts to unleash the power of artificial intelligence. Your mission is to craft and fine-tune models that transcend human perception, differentiating between diverse big cat species with unrivaled accuracy.
Our meticulously curated dataset lays the foundation for this remarkable undertaking. With images meticulously sourced from various habitats, the dataset forms a comprehensive compendium of big cat diversity. As a participant, you'll harness this trove of data to create models that decipher the intricate features distinguishing lions, tigers, cheetahs, and more. Through your innovative approach and algorithmic prowess, the challenge aims to crown the model that can elegantly navigate the spectrum of big cat species.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset includes over 2000 images of shipwrecks and shipwreck relics, which provide a unique insight into the State's maritime history. In addition to the images, the dataset includes an extract from the South Australian Register of Historic Shipwrecks. The database includes all known shipwrecks located in South Australian and Australian waters adjacent to South Australia. It includes information pertaining to Historic Shipwrecks and Historic Relics as described under the (Commonwealth) Historic Shipwrecks Act 1976 and the (South Australian) Historic Shipwrecks Act 1981. The dataset includes shipwrecks that have not yet been declared under either of these Acts. Filtering may take place to restrict the location of sensitive shipwrecks where condition assessments are pending. The Maritime Register (XML) contains the image URL, which can be matched to the image name. The register also includes the shipwreck name, historical background, description of the relic, location of the shipwreck, and other details. See also: https://data.sa.gov.au/data/dataset/shipwrecks
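Matching register entries to image files by URL could look roughly like the sketch below. The element names (`register`, `shipwreck`, `name`, `image_url`) and the sample records are hypothetical, because the actual schema of the Maritime Register XML is not described here; only the idea of deriving the image file name from the URL comes from the description above.

```python
import posixpath
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Hypothetical register extract; the real element names may differ.
SAMPLE_XML = """<register>
  <shipwreck>
    <name>Example Wreck A</name>
    <image_url>https://example.org/images/wreck_a_01.jpg</image_url>
  </shipwreck>
  <shipwreck>
    <name>Example Wreck B</name>
    <image_url>https://example.org/images/wreck_b_07.jpg</image_url>
  </shipwreck>
</register>"""


def index_by_image_name(xml_text):
    """Map each image file name (last URL path segment) to its shipwreck name."""
    root = ET.fromstring(xml_text)
    index = {}
    for record in root.iter("shipwreck"):
        url = record.findtext("image_url")
        if not url:
            continue  # skip records without an image
        image_name = posixpath.basename(urlparse(url).path)
        index[image_name] = record.findtext("name")
    return index


index = index_by_image_name(SAMPLE_XML)
print(index.get("wreck_a_01.jpg"))
```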
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Dataset Description: This dataset contains images of 192 different scene categories, with both AI-generated and real-world images for each class. It is designed for research and benchmarking in computer vision, deep learning, and AI-generated image detection.
Key Features:
📸 192 Scene Classes: Includes diverse environments like forests, cities, beaches, deserts, and more.
🤖 AI-Generated vs. Real Images: Each class contains images generated by AI models as well as real-world photographs.
🖼️ High-Quality Images: The dataset ensures a variety of resolutions and sources to improve model generalization.
🏆 Perfect for Research: Ideal for training models in AI-generated image detection, scene classification, and image authenticity verification.
Potential Use Cases:
🔍 AI-generated vs. real image classification
🏙️ Scene recognition and segmentation
🖥️ Training deep learning models for synthetic image detection
📊 Analyzing AI image generation trends
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Explore the TIME Image Dataset, featuring 144 classes of synthetically generated clock images designed for time-based image recognition tasks.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Real Image is a dataset for object detection tasks - it contains Decaycavity Earlycavity annotations for 951 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
GUI-based software written in Python to automate image stitching and alignment from a set of tile images for high-throughput image analytics. It implements a series of algorithms: 1) deskewing images acquired at an oblique view angle, 2) row alignment of images that have drifted geometrically due to acquisition errors, by detecting crop rows using the Hough transform, and 3) options for omnidirectional overlap trimming and resizing. Resources in this dataset: Resource Title: iStitch: GUI-based Image Stitching Software. File Name: iStitch.zip
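To illustrate the idea behind Hough-based row detection, here is a minimal point-based sketch in plain Python. It is not the iStitch implementation: the function name, the bin sizes, and the sample points are assumptions made for illustration, and a real pipeline would typically run on edge or vegetation pixels extracted from the image, for example with an image-processing library.

```python
import math
from collections import Counter


def hough_dominant_line(points, theta_step_deg=1, rho_step=1.0):
    """Return (rho, theta_deg) of the line receiving the most votes.

    Lines are parameterised as rho = x*cos(theta) + y*sin(theta),
    with theta sampled in [0, 180) degrees.
    """
    votes = Counter()
    for x, y in points:
        for theta_deg in range(0, 180, theta_step_deg):
            theta = math.radians(theta_deg)
            rho = x * math.cos(theta) + y * math.sin(theta)
            # Quantise rho so nearby values vote for the same bin.
            rho_bin = round(rho / rho_step)
            votes[(rho_bin, theta_deg)] += 1
    (rho_bin, theta_deg), _ = votes.most_common(1)[0]
    return rho_bin * rho_step, theta_deg


# Hypothetical data: a horizontal "crop row" at y = 5 plus a few stray points.
row_points = [(x, 5) for x in range(0, 200, 10)]
noise_points = [(3, 40), (77, 12), (150, 90)]
rho, theta_deg = hough_dominant_line(row_points + noise_points)
print(rho, theta_deg)
```

Here theta is the angle of the line's normal, so a horizontal row appears at theta = 90 degrees. A detected angle that deviates from the expected row direction indicates how much the tile would need to be rotated or shifted to realign it.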
About Dataset
• The file contains 24K unique figures obtained from various Google resources
• Meticulously curated figures ensuring diversity and representativeness
• Provides a solid foundation for developing robust and precise figure allocation algorithms
• Encourages exploration in the fascinating field of feed figure allocation
Unparalleled Diversity: Dive into a vast collection spanning culinary landscapes worldwide. Immerse yourself in a diverse array of cuisines, from Italian pasta to Japanese sushi. Explore a rich tapestry of food imagery, meticulously curated for accuracy and breadth.
Precision Labeling: Benefit from meticulous labeling, ensuring each image is tagged with precision. Access detailed metadata for seamless integration into your machine learning projects. Empower your algorithms with the clarity they need to excel in food recognition tasks.
Endless Applications: Fuel advancements in machine learning and computer vision with this comprehensive dataset. Revolutionize food industry automation, from inventory management to quality control. Enable innovative applications in health monitoring and dietary analysis for a healthier tomorrow.
Seamless Integration: Seamlessly integrate our dataset into your projects with user-friendly access and documentation. Enjoy high-resolution images optimized for compatibility with a range of AI frameworks. Access support and resources to maximize the potential of our dataset for your specific needs.
Conclusion: Embark on a culinary journey through the lens of artificial intelligence and unlock the potential of feed figure allocation with our SEO-optimized file. Elevate your research, elevate your projects, and elevate the way we perceive and interact with food in the digital age. Dive in today and savor the possibilities!
This dataset is sourced from Kaggle.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Curated RGB image dataset for our analysis, split into training and evaluation sets. Based on the ImageNet ILSVRC dataset (Russakovsky O, Deng J, Su H, Krause J, Satheesh S, Ma S, Huang Z, Karpathy A, Khosla A, Bernstein M, Berg AC, 2015).