Leaves from genetically unique Juglans regia plants were scanned using X-ray micro-computed tomography (microCT) on the X-ray μCT beamline (8.3.2) at the Advanced Light Source (ALS), Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA, USA. Soil samples were collected in Fall 2017 from the riparian oak forest located at the Russell Ranch Sustainable Agricultural Institute at the University of California, Davis. The soil was sieved through a 2 mm mesh and air dried before imaging. A single soil aggregate was scanned at 23 keV using the 10x objective lens with a pixel resolution of 650 nanometers on beamline 8.3.2 at the ALS. Additionally, a drought-stressed almond flower bud (Prunus dulcis) from a plant housed at the University of California, Davis, was scanned using a 4x lens with a pixel resolution of 1.72 µm on beamline 8.3.2 at the ALS. Raw tomographic image data were reconstructed using TomoPy. Reconstructions were converted to 8-bit tif or png format using ImageJ or the PIL package in Python before further processing.

Images were annotated using Intel's Computer Vision Annotation Tool (CVAT) and ImageJ; both are free to use and open source. Leaf images were annotated following Théroux-Rancourt et al. (2020): hand labeling was done directly in ImageJ by drawing around each tissue, with 5 images annotated per leaf. Care was taken to cover a range of anatomical variation to help improve the generalizability of the models to other leaves. All slices were labeled by Dr. Mina Momayyezi and Fiona Duong. To annotate the flower bud and soil aggregate, images were imported into CVAT. The exterior border of the bud (i.e., bud scales) and the flower were annotated in CVAT and exported as masks. Similarly, the exterior of the soil aggregate and particulate organic matter identified by eye were annotated in CVAT and exported as masks. To annotate air spaces in both the bud and soil aggregate, images were imported into ImageJ. A Gaussian blur was applied to decrease noise, and the air space was then segmented using thresholding (see the sketch following this record). After applying the threshold, the selected air space region was converted to a binary image, with white representing air space and black representing everything else. This binary image was overlaid on the original image, and the air space within the flower bud and aggregate was selected using the "free hand" tool; air space outside of the region of interest for both image sets was eliminated. The quality of the air space annotation was then visually inspected against the underlying original image; incomplete annotations were corrected using the brush or pencil tool to paint missing air space white and incorrectly identified air space black. Once the annotation was satisfactorily corrected, the binary image of the air space was saved. Finally, the annotations of the bud and flower, or aggregate and organic matter, were opened in ImageJ and the associated air space mask was overlaid on top of them, forming a three-layer mask suitable for training the fully convolutional network. All labeling of the soil aggregate images was done by Dr. Devin Rippner. These images and annotations are for training deep learning models to identify different constituents in leaves, almond buds, and soil aggregates.

Limitations: For the walnut leaves, some tissues (stomata, etc.) are not labeled, and the annotated slices represent only a small portion of a full leaf. Similarly, the almond bud and the soil aggregate each represent a single sample. The bud tissues are divided only into bud scales, flower, and air space; many other tissues remain unlabeled. For the soil aggregate, labels were assigned by eye with no chemical information, so particulate organic matter identification may be incorrect.

Resources in this dataset:

Resource Title: Annotated X-ray CT images and masks of a Forest Soil Aggregate.
File Name: forest_soil_images_masks_for_testing_training.zip
Resource Description: This aggregate was collected from the riparian oak forest at the Russell Ranch Sustainable Agricultural Facility. The aggregate was scanned using X-ray micro-computed tomography (microCT) on the X-ray μCT beamline (8.3.2) at the Advanced Light Source (ALS), Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA, USA, using the 10x objective lens with a pixel resolution of 650 nanometers. For masks, the background has a value of 0,0,0; pore spaces have a value of 250,250,250; mineral solids have a value of 128,0,0; and particulate organic matter has a value of 0,128,0. These files were used for training a model to segment the forest soil aggregate and for testing the accuracy, precision, recall, and F1 score of the model.

Resource Title: Annotated X-ray CT images and masks of an Almond bud (P. dulcis).
File Name: Almond_bud_tube_D_P6_training_testing_images_and_masks.zip
Resource Description: A drought-stressed almond flower bud (Prunus dulcis) from a plant housed at the University of California, Davis, was scanned by X-ray micro-computed tomography (microCT) on the X-ray μCT beamline (8.3.2) at the Advanced Light Source (ALS), Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA, USA, using the 4x lens with a pixel resolution of 1.72 µm. For masks, the background has a value of 0,0,0; air spaces have a value of 255,255,255; bud scales have a value of 128,0,0; and flower tissues have a value of 0,128,0. These files were used for training a model to segment the almond bud and for testing the accuracy, precision, recall, and F1 score of the model.
Resource Software Recommended: Fiji (ImageJ), url: https://imagej.net/software/fiji/downloads

Resource Title: Annotated X-ray CT images and masks of Walnut leaves (J. regia).
File Name: 6_leaf_training_testing_images_and_masks_for_paper.zip
Resource Description: Stems were collected from genetically unique J. regia accessions at the USDA-ARS-NCGR in Wolfskill Experimental Orchard, Winters, California, USA to use as scion, and were grafted by Sierra Gold Nursery onto a commonly used commercial rootstock, RX1 (J. microcarpa × J. regia). We used a common rootstock to eliminate any own-root effects and to simulate conditions in a commercial walnut orchard setting, where rootstocks are commonly used. The grafted saplings were repotted and transferred to the Armstrong lathe house facility at the University of California, Davis in June 2019, and kept under natural light and temperature. Leaves from each accession and treatment were scanned using X-ray micro-computed tomography (microCT) on the X-ray μCT beamline (8.3.2) at the Advanced Light Source (ALS), Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA, USA, using the 10x objective lens with a pixel resolution of 650 nanometers. For masks, the background has a value of 170,170,170; Epidermis has a value of 85,85,85; Mesophyll has a value of 0,0,0; Bundle Sheath Extension has a value of 152,152,152; Vein has a value of 220,220,220; and Air has a value of 255,255,255.
Resource Software Recommended: Fiji (ImageJ), url: https://imagej.net/software/fiji/downloads
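The blur-and-threshold step described above was performed interactively in ImageJ. As a rough illustration only, the sketch below reproduces a comparable workflow in Python, assuming an 8-bit grayscale reconstruction slice in which air space appears darker than the surrounding tissue; the file name and parameter values are hypothetical and would need tuning per dataset.

```python
# Hedged sketch approximating the ImageJ blur-and-threshold step in Python.
import numpy as np
from PIL import Image
from scipy.ndimage import gaussian_filter

# Load one 8-bit reconstruction slice as a float array (hypothetical file name).
img = np.asarray(Image.open("slice_0500.png").convert("L"), dtype=np.float32)

# Gaussian blur to suppress reconstruction noise (sigma chosen by inspection).
blurred = gaussian_filter(img, sigma=2.0)

# Global threshold: assuming air space is darker than tissue/mineral, pixels
# below the cutoff become white (255) in the binary air-space mask.
threshold = 60  # hypothetical value
air_mask = np.where(blurred < threshold, 255, 0).astype(np.uint8)

Image.fromarray(air_mask).save("slice_0500_airspace_mask.png")
```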
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Cvat is a dataset for computer vision tasks - it contains 1 annotations for 386 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
https://www.marketreportanalytics.com/privacy-policy
The global data annotation and labeling tool market is experiencing robust growth, driven by the increasing demand for high-quality training data in artificial intelligence (AI) and machine learning (ML) applications. The market, estimated at $2 billion in 2025, is projected to achieve a Compound Annual Growth Rate (CAGR) of 25% from 2025 to 2033, reaching approximately $10 billion by 2033. This expansion is fueled by several key factors. Firstly, the proliferation of AI applications across diverse sectors such as automotive (autonomous driving), healthcare (medical image analysis), and finance (fraud detection) is creating an insatiable need for accurate and efficiently labeled data. Secondly, the advancement of deep learning techniques requires massive datasets, further boosting demand for annotation and labeling tools. Finally, the emergence of sophisticated tools offering automated and semi-supervised annotation capabilities is streamlining the process and reducing costs, making the technology accessible to a broader range of organizations. However, market growth is not without its challenges. Data privacy concerns and the need for robust data security protocols pose significant restraints. The high cost associated with specialized expertise in data annotation can also limit adoption, particularly for smaller companies. Despite these challenges, the market segmentation reveals opportunities. The automatic annotation segment is anticipated to grow rapidly due to its efficiency gains, while applications within the healthcare and automotive sectors are expected to dominate the market share, reflecting the considerable investment in AI across these industries. Leading players like Labelbox, Scale AI, and SuperAnnotate are strategically positioning themselves to capitalize on this growth by focusing on developing advanced tools, expanding their partnerships, and entering new geographic markets. The North American market currently holds the largest share, but the Asia-Pacific region is projected to experience the fastest growth due to increased investment in AI research and development across countries such as China and India.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
CVAT Coco is a dataset for object detection tasks - it contains Defect Distance Event annotations for 9,899 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
CVAT Upload is a dataset for object detection tasks - it contains 123 annotations for 1,370 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
General information
The dataset consists of 4403 labelled subscenes from 155 Sentinel-2 (S2) Level-1C (L1C) products distributed over the Northern European terrestrial area. Each S2 product was oversampled at 10 m resolution to produce 512 x 512 pixel subscenes. Six L1C S2 products were labelled fully; for the other 149 S2 products, the ~10 most challenging subscenes per product were selected for labelling. In total the dataset comprises 4403 labelled Sentinel-2 subscenes, each 512 x 512 pixels at 10 m resolution. The dataset contains around 30 S2 products per month from April to August and 3 S2 products per month for September and October. Each selected L1C S2 product represents different cloud types, such as cumulus, stratus, or cirrus, spread over various geographical locations in Northern Europe.
The pixel-wise classification map consists of the following categories:
The dataset was labelled using the Computer Vision Annotation Tool (CVAT) and Segments.ai. With the possibility of integrating an active learning process in Segments.ai, the labelling was performed semi-automatically.
The dataset limitations must be considered: the data covers only terrestrial regions and does not include water areas; the dataset does not include winter conditions; the dataset represents summer conditions, and September and October therefore contain only test products used for validation. The current subscenes do not have georeferencing; however, we are working towards including it in the next version.
More details about the dataset structure can be found in README.
Contributions and Acknowledgements
The data were annotated by Fariha Harun and Olga Wold. Data verification and software development were performed by Indrek Sünter, Heido Trofimov, Anton Kostiukhin, Marharyta Domnich, Mihkel Järveoja, and Olga Wold. The methodology was developed by Kaupo Voormansik, Indrek Sünter, and Marharyta Domnich.
We would like to thank Segments.ai for prompt and individual customer support. We are grateful to the European Space Agency for reviews and suggestions. We would like to extend our thanks to Prof. Gholamreza Anbarjafari for his feedback and directions.
The project was funded by the European Space Agency, Contract No. 4000132124/20/I-DT.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Overview:
The Manually Annotated Drone Imagery Dataset (MADRID) consists of hand-annotated, high-resolution RGB images taken in 2022-2023 at two different types of coast in Poland: a cliff coast in Miedzyzdroje and a dune coast in Mrzezyno. All images were converted to a uniform format of 1440x2560 pixels, annotated with polylines, and organized into a file structure suited for semantic segmentation tasks (see the "Usage" notes below for more details).
The raw images were captured with a Zenmuse L1 sensor (RGB) mounted on a DJI Matrice 300 RTK drone. A total of 4895 images were captured; however, the dataset contains 3876 images, each annotated with a coastline. The dataset only includes images with coastlines that are visually identifiable to the human eye. The images were annotated using the open-source software CVAT v2.13.
Usage:
The compressed RAR file contains two folders, train and test. Each folder contains files whose names encode the date at which the image was captured (year, month, day), the image number, and the name of the drone used to capture the image; for example, DJI_20220111140051_0051_Zenmuse-L1-mission and DJI_20220111140105_0053_Zenmuse-L1-mission. Additionally, the test folder contains annotations (one per image) which are extracted from the original XML annotation file provided in the CVAT 1.1 image format.
The archives were compressed using RAR compression and can be decompressed in a terminal by opening and extracting Madrid_v0.1_Data.zip.
A subset of the data, Madrid_subset_data.zip, has been added; it contains a small portion of the train and test images so the dataset can be inspected without downloading it in full.
The images for both the training and testing data are structured as follows.
Train/
└── images/
    └── DJI_20220111140051_0051_Zenmuse-L1-mission.JPG
    └── DJI_20220111140105_0053_Zenmuse-L1-mission.JPG
    └── ...
└── masks/
    └── DJI_20220111140051_0051_Zenmuse-L1-mission.PNG
    └── DJI_20220111140105_0053_Zenmuse-L1-mission.PNG
    └── ...
Test/
└── images/
    └── DJI_20220111140051_0051_Zenmuse-L1-mission.JPG
    └── DJI_20220111140105_0053_Zenmuse-L1-mission.JPG
    └── ...
└── masks/
    └── DJI_20220111140051_0051_Zenmuse-L1-mission.PNG
    └── DJI_20220111140105_0053_Zenmuse-L1-mission.PNG
    └── ...
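As a loose illustration of consuming this layout, the following Python sketch pairs each training image with its mask by filename stem. The directory names follow the structure above, while the extraction root path is hypothetical; this is not an official loader.

```python
# Minimal sketch, assuming the archive has been extracted to ./Madrid/ with the
# Train/images and Train/masks layout shown above.
from pathlib import Path

root = Path("Madrid/Train")  # hypothetical extraction path
pairs = []
for img_path in sorted((root / "images").glob("*.JPG")):
    mask_path = root / "masks" / (img_path.stem + ".PNG")
    if mask_path.exists():  # keep only images that have a corresponding mask
        pairs.append((img_path, mask_path))

print(f"found {len(pairs)} image/mask pairs")
```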
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Data
The dataset consists of 5538 images of public spaces, annotated with steps, stairs, ramps, and grab bars for stairs and ramps. The dataset has 3564 annotations of steps, 1492 of stairs, 143 of ramps, and 922 of grab bars.
Each step annotation is attributed with an estimate of the height of the step, falling into one of three categories: less than 3cm, 3cm to 7cm, or more than 7cm. Additionally, it is attributed with a 'type', with the possibilities 'doorstep', 'curb', or 'other'.
Stair annotations are attributed with the number of steps in the stair.
Ramps are attributed with an estimate of their width, also falling into three categories: less than 50cm, 50cm to 100cm and more than 100cm.
In order to preserve all additional attributes of the labels, the data is published in the CVAT XML format for images.
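A rough sketch of reading such a CVAT-for-images XML file with Python's standard library is shown below. The element layout follows CVAT's usual box/attribute schema, but the file name, label names, and attribute names used here are assumptions, not documentation of this dataset.

```python
# Minimal sketch, assuming a CVAT-for-images export named "annotations.xml"
# with <box> elements carrying nested <attribute> elements.
import xml.etree.ElementTree as ET

tree = ET.parse("annotations.xml")
boxes = []
for image in tree.getroot().iter("image"):
    for box in image.iter("box"):
        attrs = {a.get("name"): a.text for a in box.iter("attribute")}
        boxes.append({
            "file": image.get("name"),
            "label": box.get("label"),        # e.g. "step", "stair", "ramp" (hypothetical names)
            "bbox": tuple(float(box.get(k)) for k in ("xtl", "ytl", "xbr", "ybr")),
            "attributes": attrs,              # e.g. {"height": "3cm to 7cm"} (hypothetical)
        })

print(f"parsed {len(boxes)} bounding boxes")
```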
Annotating Process
The labelling has been done using bounding boxes around the objects. This format is compatible with many popular object detection models, e.g. the YOLO object detection model. A bounding box is placed so that it contains exactly the visible part of the respective object. This implies that only objects that are visible in the photo are annotated. In particular, a photo of a stair or step from above, where the object cannot be seen, has not been annotated, even when a human viewer could infer that there is a stair or a step from other features in the photo.
Steps
A step is annotated when there is a vertical increment that functions as a passage between two surface areas intended for human or vehicle traffic. This means that we have not included:
In particular, the bounding box of a step object contains exactly the incremental part of the step, and does not extend into the top or bottom horizontal surface any more than necessary to enclose the incremental part entirely. This was chosen for consistency, as including parts of the horizontal surfaces would imply a non-trivial choice of how much to include, which we deemed would most likely lead to more inconsistent annotations.
The heights of the steps are estimated by the annotators and are therefore not guaranteed to be accurate.
The type of a step typically falls into the category 'doorstep' or 'curb'. Steps that are in a doorway, entrance, or similar are attributed as doorsteps. We also include in this category steps immediately leading to a doorway, within a proximity of 1-2m. Steps between different types of pathways, e.g. between streets and sidewalks, are annotated as curbs. Any other type of step is annotated as 'other'. Many of the 'other' steps are, for example, steps to terraces.
Stairs
The stair label is used whenever two or more steps directly follow each other in a consistent pattern. All vertical increments are enclosed in the bounding box, as well as the intermediate surfaces of the steps. However, the top and bottom surfaces are not included more than necessary, for the same reason as for steps, as described in the previous section.
The annotator counts the number of steps and attributes this count to the stair object label.
Ramps
Ramps have been annotated when a sloped passageway has been placed or built to connect two surface areas intended for human or vehicle traffic. This implies the same considerations as with steps. Likewise, only the sloped part of a ramp is annotated, not the bottom or top surface area.
For each ramp, the annotator makes an assessment of the width of the ramp in three categories: less than 50cm, 50cm to 100cm, and more than 100cm. This parameter is visually hard to assess, and sometimes impossible depending on the view of the ramp.
Grab Bars
Grab bars are annotated for handrails and similar objects that are in direct connection to a stair or a ramp. While horizontal grab bars could also have been included, this was omitted due to the implied ambiguities with fences and similar objects. As the grab bar was originally intended as attribute information for stairs and ramps, we chose to keep this focus. The bounding box encloses the part of the grab bar that functions as a handrail for the stair or ramp.
Usage
As is often the case when annotating data, much information depends on the subjective assessment of the annotator. As each data point in this dataset has been annotated by only one person, caution should be taken when the data is applied.
Generally speaking, the mindset and usage guiding the annotations have been wheelchair accessibility. While we have strived to annotate at an object level, hopefully making the data more widely applicable, we state this explicitly as it may have swayed non-trivial annotation choices.
The attribute data, such as step height or ramp width, are highly subjective estimations. We still provide these data to give a post-hoc method for adjusting which annotations to use; e.g., for some purposes one may be interested in detecting only steps that are indeed more than 3cm high. The attribute data make it possible to filter out the steps of less than 3cm, so that a machine learning algorithm can be trained on a dataset more appropriate for that use case (see the sketch below). We stress, however, that one cannot expect to train accurate machine learning algorithms to infer the attribute data, as these are not accurate data in the first place.
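For instance, given annotations parsed into simple dictionaries (as in the earlier XML sketch), a filter on the step-height attribute might look like the following; the dictionary layout and attribute values are hypothetical examples.

```python
# Minimal sketch of filtering parsed annotations by the step-height attribute.
boxes = [
    {"label": "step", "attributes": {"height": "less than 3cm"}},
    {"label": "step", "attributes": {"height": "3cm to 7cm"}},
    {"label": "stair", "attributes": {"number_of_steps": "4"}},
]

tall_steps = [
    b for b in boxes
    if b["label"] == "step"
    and b["attributes"].get("height") in ("3cm to 7cm", "more than 7cm")
]
print(f"kept {len(tall_steps)} step annotations of at least 3cm")
```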
We hope this dataset will be a useful building block in the endeavours for automating barrier detection and documentation.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Existing image/video datasets for cattle behavior recognition are mostly small, lack well-defined labels, or are collected in unrealistic controlled environments. This limits the utility of machine learning (ML) models learned from them. Therefore, we introduce a new dataset, called Cattle Visual Behaviors (CVB), that consists of 502 video clips, each fifteen seconds long, captured in natural lighting conditions and annotated with eleven visually perceptible behaviors of grazing cattle. By creating and sharing CVB, our aim is to develop improved models capable of recognizing all important behaviors accurately and to assist other researchers and practitioners in developing and evaluating new ML models for cattle behavior classification using video data. The dataset is presented in the following three sub-directories: 1. raw_frames: contains 450 frames in each sub-folder, representing a 15 s video taken at a frame rate of 30 FPS; 2. annotations: contains the JSON files corresponding to the raw_frames folder, with one JSON file per video containing the bounding box annotations for each cattle and their associated behaviors; 3. CVB_in_AVA_format: contains the CVB data in the standard AVA dataset format, which we have used to apply the SlowFast model. Lineage: We use the Computer Vision Annotation Tool (CVAT) to collect our annotations. To make the procedure more efficient, we perform an initial detection and tracking of cattle in the videos using appropriate pre-trained models. The results are corrected by domain experts along with cattle behavior labeling in CVAT. The pre-hoc detection and tracking step significantly reduces the manual annotation time and effort.
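Standard AVA-style annotation files are plain CSV rows of the form video_id, timestamp, x1, y1, x2, y2, action_id[, entity_id], with box coordinates normalized to [0, 1]. A minimal reader sketch is shown below; the file name is hypothetical, and the exact column set used in CVB_in_AVA_format may differ.

```python
# Minimal sketch of reading AVA-style CSV annotations; the file name is
# hypothetical and a trailing entity/person id column may or may not be present.
import csv

rows = []
with open("cvb_train.csv", newline="") as f:
    for rec in csv.reader(f):
        video_id, timestamp = rec[0], float(rec[1])
        x1, y1, x2, y2 = map(float, rec[2:6])   # normalized box corners
        action_id = int(rec[6])
        rows.append((video_id, timestamp, (x1, y1, x2, y2), action_id))

print(f"loaded {len(rows)} behavior annotations")
```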
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
This dataset contains 8,992 images of Uno cards and 26,976 labeled examples on various textured backgrounds.
This dataset was collected, processed, and released by Roboflow user Adam Crawshaw, released with a modified MIT license: https://firstdonoharm.dev/
https://i.imgur.com/P8jIKjb.jpg" alt="Image example">
Adam used this dataset to create an auto-scoring Uno application:
Fork or download this dataset and follow our How to train state of the art object detector YOLOv4 for more.
See here for how to use the CVAT annotation tool.
Roboflow makes managing, preprocessing, augmenting, and versioning datasets for computer vision seamless. Developers reduce 50% of their boilerplate code when using Roboflow's workflow, save training time, and increase model reproducibility.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Circular TwAIn WEEE Pilot Dataset v1.0
The following is a description of the dataset and testbed. This repository contains data sampled in the WEEE Pilot research and development. A dataset testbed was designed to sample data on PCs. The testbed consists of two RGB cameras for capturing the front and top views of the PCs: the front camera captures the chassis, and the top camera captures the motherboard. We used the open-audit tool to identify the different hardware components of the PC (when the PC is functioning). We annotated the top-view image of the PC using the CVAT image annotation tool.
The dataset has four different data modalities:
- RGB image of the front view of a PC
- RGB image of the top view of the PC
- XML metadata about the different components of a PC
- YOLO format annotations of different components of a PC for the top-view RGB image
The PC component classes are: Fan (CPU cooler/fan), Motherboard, Screw (Star), Screw (Torch), HDD, CD/DVD, BIOS Cell, PSU (Power Supply Unit), CPU, Chip.
Description of data files: The CT_WEEE_PILOT_DATASET_v1.0.rar contains four directories:
- front: RGB images of the front view of the PC
- top: RGB images of the top view of the PC
- labels: YOLO format annotations for PC hardware
- openaudit: PC hardware metadata stored in XML
The circular_twain_wee_dataset_v1.0.csv file provides information on:
- sample_name: unique name of each sample
- case_type: type of PC chassis
- manufacturer: PC manufacturer name
- serial_number: serial number of the sampled PC (not always correct)
- date: date of sampling
- model_name: model name of the sampled PC
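YOLO-format label files are plain text with one object per line: a class index followed by the normalized box centre and size. A minimal reader sketch follows; the label file name used here is an assumption.

```python
# Minimal sketch of parsing a YOLO-format label file, where each line is
# "cls cx cy w h" with coordinates normalized to [0, 1].
def read_yolo_labels(path):
    objects = []
    with open(path) as f:
        for line in f:
            parts = line.split()
            if len(parts) != 5:
                continue  # skip blank or malformed lines
            cls = int(parts[0])
            cx, cy, w, h = map(float, parts[1:])
            objects.append((cls, cx, cy, w, h))
    return objects

print(read_yolo_labels("labels/sample_0001.txt"))  # hypothetical file name
```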
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
For a detailed description of this dataset, based on the Datasheets for Datasets (Gebru, Timnit, et al. "Datasheets for datasets." Communications of the ACM 64.12 (2021): 86-92.), check the VINEPICs_datasheet.md file.
For what purpose was the dataset created?
VINEPICs was developed specifically for the purpose of detecting grape bunches in RGB images and facilitating tasks such as object detection, semantic segmentation, and instance segmentation. The detection of grape bunches serves as the initial phase in an analysis pipeline designed for vine plant phenotyping. The dataset encompasses a wide range of lighting conditions, camera orientations, plant defoliation levels, species variations, and cultivation methods. Consequently, this dataset presents an opportunity to explore the influence of each source of variability on grape bunch detection.
What do the instances that comprise the dataset represent?
The dataset consists of RGB images showcasing various species of vine plants. Specifically, the images represent three different Vitis vinifera varieties:
- Red Globe, a type of table grape
- Cabernet Sauvignon, a red wine grape
- Ortrugo, a white wine grape
These images have been collected over different years and dates at the vineyard facility of Università Cattolica del Sacro Cuore in Piacenza, Italy. You can find the images stored in the "data/images" directory, organized into subdirectories based on the starting time of data collection, indicating the day (and, if available, the approximate time in minutes). Images collected in 2022 are named using timestamps with nanosecond precision.
Is there a label or target associated with each instance?
Each image has undergone manual annotation using the Computer Vision Annotation Tool (CVAT) (https://github.com/opencv/cvat). Grape bunches have been meticulously outlined with polygon annotations. These annotations belong to a single class, "bunch," and have been saved in a JSON file using the COCO Object Detection format, including segmentation masks (https://cocodataset.org/#format-data).
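Since the polygon annotations are stored in COCO format, they can be loaded, for example, with the pycocotools package. The sketch below is a generic illustration; the annotation file path is an assumption rather than the dataset's documented layout.

```python
# Minimal sketch, assuming pycocotools is installed and the COCO JSON sits at
# the hypothetical path "annotations/instances.json".
from pycocotools.coco import COCO

coco = COCO("annotations/instances.json")
bunch_ids = coco.getCatIds(catNms=["bunch"])          # the single "bunch" class
img_ids = coco.getImgIds(catIds=bunch_ids)
print(f"{len(img_ids)} images contain at least one bunch annotation")

# Convert the polygon segmentation of one annotation into a binary mask.
ann_ids = coco.getAnnIds(imgIds=img_ids[:1], catIds=bunch_ids)
anns = coco.loadAnns(ann_ids)
mask = coco.annToMask(anns[0])                        # numpy array of 0/1 values
print(mask.shape, mask.sum(), "bunch pixels")
```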
What mechanisms or procedures were used to collect the data?
The data was collected using a D435 Intel Realsense camera, which was mounted on a four-wheeled skid-steering robot. The robot was teleoperated during the data collection process. The data was recorded by streaming the camera's feed into rosbag format. Specifically, the camera was connected via a USB 3.0 interface to a PC running Ubuntu 18.04 and ROS Melodic.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
The Tasmanian Orange Roughy Stereo Image Machine Learning Dataset is a collection of annotated stereo image pairs collected by a net-attached Acoustic and Optical System (AOS) during orange roughy (Hoplostethus atlanticus) biomass surveys off the northeast coast of Tasmania, Australia in July 2019. The dataset consists of expertly annotated imagery from six AOS deployments (OP12, OP16, OP20, OP23, OP24, and OP32), representing a variety of conditions including different fish densities, benthic substrates, and altitudes above the seafloor. Each image was manually annotated with bounding boxes identifying orange roughy and other marine species. For all annotated images, paired stereo images from the opposite camera have been included where available to enable stereo vision analysis. This dataset was specifically developed to investigate the effectiveness of machine learning-based object detection techniques for automating fish detection under variable real-world conditions, providing valuable resources for advancing automated image processing in fisheries science. Lineage: Data were obtained onboard the 32 m Fishing Vessel Saxon Onward during an orange roughy acoustic biomass survey off the northeast coast of Tasmania in July 2019. Stereo image pairs were collected using a net-attached Acoustic and Optical System (AOS), which is a self-contained autonomous system with multi-frequency and optical capabilities mounted on the headline of a standard commercial orange roughy demersal trawl. Images were acquired by a pair of Prosilica GX3300 Gigabit Ethernet cameras with Zeiss F2.8 lenses (25 mm focal length), separated by 90 cm and angled inward at 7° to provide 100% overlap at a 5 m range. Illumination was provided by two synchronised quantum trio strobes. Stereo pairs were recorded at 1 Hz in JPG format with a resolution of 3296 x 2472 pixels and a 24-bit depth.
Human experts manually annotated images from the six deployments using both the CVAT annotation tool (producing COCO format annotations) and LabelImg tool (producing XML format annotations). Only port camera views were annotated for all deployments. Annotations included bounding boxes for "orange roughy" and "orange roughy edge" (for partially visible fish), as well as other marine species (brittle star, coral, eel, miscellaneous fish, etc.). Prior to annotation, under-exposed images were enhanced based on altitude above the seafloor using a Dark Channel Prior (DCP) approach, and images taken above 10 m altitude were discarded due to poor visibility.
For all annotated images, the paired stereo images (from the opposite camera) have been included where available to enable stereo vision applications. The dataset represents varying conditions of fish density (1-59 fish per image), substrate types (light vs. dark), and altitudes (2.0-10.0 m above seafloor), making it particularly valuable for training and evaluating object detection models under variable real-world conditions.
The final standardised COCO dataset contains 1051 annotated port-side images, 849 paired images (without annotations), and 14414 total annotations across 17 categories. The dataset's category distribution includes orange roughy (9887), orange roughy edge (2928), mollusc (453), cnidaria (359), misc fish (337), sea anemone (136), sea star (105), sea feather (100), sea urchin (45), coral (22), eel (15), oreo (10), brittle star (8), whiptail (4), chimera (2), siphonophore (2), and shark (1).
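Category counts like those above can be reproduced from a COCO-format annotation file using the Python standard library alone; the sketch below is generic and the file name is an assumption.

```python
# Minimal sketch: tally annotations per category in a COCO JSON file
# (hypothetical file name).
import json
from collections import Counter

with open("annotations_coco.json") as f:
    coco = json.load(f)

id_to_name = {c["id"]: c["name"] for c in coco["categories"]}
counts = Counter(id_to_name[a["category_id"]] for a in coco["annotations"])

for name, n in counts.most_common():
    print(f"{name}: {n}")
```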
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
BWILD is a dataset tailored to train Artificial Intelligence applications to automate beach seagrass wrack detection in RGB images. It includes oblique RGB images captured by SIRENA beach video-monitoring systems, along with corresponding annotations, auxiliary data and a README file. BWILD encompasses data from two microtidal sandy beaches in the Balearic Islands, Spain. The dataset consists of images with varying fields of view (9 cameras), beach wrack abundance, degrees of occupation, and diverse meteoceanic and lighting conditions. The annotations categorise image pixels into five classes: i) Landwards, ii) Seawards, iii) Diffuse wrack, iv) Intermediate wrack, and v) Dense wrack.
The BWILD version 1.1.0 is packaged in a compressed file (BWILD_v1.1.0.zip). A total of 3286 RGB images are shared in PNG format, along with corresponding annotations and masks in various formats (PNG, XML, JSON, TXT), and the README file in PDF format.
The BWILD dataset utilizes snapshot images from two SIRENA beach video-monitoring systems. To facilitate annotation while maintaining a diverse range of scenarios, the original 1280x960 pixel images were cropped to smaller regions, with a uniform resolution of 640x480 pixels. A subset of images was carefully curated to minimize annotation workload while ensuring representation of various time periods, distances to camera, and environmental conditions. Image selection involved filtering for quality, clustering for diversity, and prioritizing scenes containing beach seagrass wracks. Further details are available in the README file.
Data splitting requirements may vary depending on the chosen Artificial Intelligence approach (e.g., splitting by entire images or by image patches). Researchers should use a consistent method and document the approach and splits used in publications, enabling reproducible results and facilitating comparisons between studies.
The BWILD dataset has been labelled manually using the 'Computer Vision Annotation Tool' (CVAT), categorising pixels into five labels of interest using polygon annotations.
| Label | Description |
| --- | --- |
| Landwards | Pixels that are towards the land side with respect to the shoreline |
| Seawards | Pixels that are towards the sea side with respect to the shoreline |
| Diffuse wrack | Pixels that potentially resembled beach wracks based on colour and shape, but that the annotator could not confirm with certainty |
| Intermediate wrack | Pixels with low-density beach wracks or mixed beach wracks and sand surfaces |
| Dense wrack | Pixels with high-density beach wracks |
Annotations were exported from CVAT in four different formats: (i) CVAT for images (XML); (ii) Segmentation Mask 1.0 (PNG); (iii) COCO (JSON); (iv) Ultralytics YOLO Segmentation 1.0 (TXT). These diverse annotation formats can be used for various applications including object detection and segmentation, and simplify the interaction with the dataset, making it more user-friendly. Further details are available in the README file.
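Of these exports, the Ultralytics YOLO segmentation TXT files are the simplest to read by hand: each line is a class index followed by a flat list of normalized polygon x/y coordinates. A minimal parsing sketch follows; the file name is hypothetical.

```python
# Minimal sketch of reading an Ultralytics YOLO segmentation label file, where
# each line is "cls x1 y1 x2 y2 ..." with coordinates normalized to [0, 1].
def read_yolo_polygons(path):
    polygons = []
    with open(path) as f:
        for line in f:
            vals = line.split()
            if len(vals) < 7:          # need a class id plus at least 3 points
                continue
            cls = int(vals[0])
            coords = list(map(float, vals[1:]))
            points = list(zip(coords[0::2], coords[1::2]))  # (x, y) pairs
            polygons.append((cls, points))
    return polygons

print(read_yolo_polygons("labels/clm_example.txt"))  # hypothetical file name
```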
RGB values or any transformation in the colour space can be used as parameters.
A SIRENA system consists of a set of RGB cameras mounted at the top of buildings on the beachfront. These cameras take oblique pictures of the beach, with overlapping sights, at 7.5 FPS during the first 10 minutes of each hour in daylight hours. From these pictures, different products are generated, including snapshots, which correspond to the frame of the video at the 5th minute. In the Balearic Islands, SIRENA stations are managed by the Balearic Islands Coastal Observing and Forecasting System (SOCIB), and are mounted at the top of hotels located in front of the coastline. The present dataset includes snapshots from the SIRENA systems operating since 2011 at Cala Millor (5 cameras) and Son Bou (4 cameras) beaches, located in Mallorca and Menorca islands (Balearic Islands, Spain), respectively. All latest and historical SIRENA images are available at the Beamon app viewer (https://apps.socib.es/beamon).
All images included in BWILD have been supervised by the authors of the dataset. However, the variable presence of beach seagrass wracks across different beach segments and seasons imposes a variable distribution of images across the different SIRENA stations and cameras. Users of the BWILD dataset must be aware of this variance. Further details are available in the README file.
The resolution of the images in BWILD is 640x480 pixels.
The BWILD version 1.1.0 contains data from two SIRENA beach video-monitoring stations, encompassing two microtidal sandy beaches in the Balearic Islands, Spain. These are: Cala Millor (clm) and Son Bou (snb).
| SIRENA station | Longitude | Latitude |
| --- | --- | --- |
| clm | 3.383 | 39.596 |
| snb | 4.077 | 39.898 |
For further technical inquiries or additional information about the annotated dataset, please contact jsoriano@socib.es.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is the dataset and model used for Tiny Towns Scorer, a computer vision project completed as part of CS 4664: Data-Centric Computing Capstone at Virginia Tech. The goal of the project was to calculate player scores in the board game Tiny Towns.
The dataset consists of 226 images and associated annotations, intended for object detection. The images are photographs of players' game boards over the course of a game of Tiny Towns, as well as photos of individual game pieces taken after the game. Photos were taken using hand-held smartphones. Images are in JPG and PNG formats. The annotations are provided in TFRecord 1.0 and CVAT for Images 1.1 formats.
The weights for the trained RetinaNet-portion of the model are also provided.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Butterfly Wing VIS CVAT is a dataset for instance segmentation tasks - it contains Wing 9Y6m annotations for 582 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
One of the primary challenges in Crown-of-Thorns Starfish (COTS) management is the detection of COTS. Additionally, identifying COTS scars is crucial for locating areas that may be affected by COTS and estimating the probability of their presence. This dataset comprises annotations for COTS and COTS scars derived from images collected by the Australian Institute of Marine Science (AIMS) using their innovative ReefScan platform. The images were thoroughly reviewed by domain experts to identify COTS and COTS scars. For efficiency in annotation, not all images were annotated for COTS scars; however, when an image was annotated for scars, all visible scars within that image were included. For COTS annotations, all visible COTS were annotated. The dataset contains a total of 11,243 images, including annotations for 9,208 COTS and 18,142 scar polygons that can be used to develop an AI model for COTS and COTS scar detection. Lineage: The data was collected by the Australian Institute of Marine Science (AIMS) using a novel COTS Surveillance System, a towed platform developed by AIMS. This system was designed to meet the needs of the Crown-of-Thorns Starfish (COTS) Control Teams, who are responsible for the majority of starfish control efforts along the Great Barrier Reef (GBR). The platform captures stereo still images at a resolution of 5312x3040 at 4 frames per second (fps). Since 2021, AIMS conducted more than six field trips across the Great Barrier Reef to collect data, which was provided to CSIRO for annotation. Domain experts meticulously reviewed the video frames and annotated them using the Computer Vision Annotation Tool (CVAT). CVAT incorporates the Segment Anything Model (SAM), enabling accurate polygon annotations for COTS scars and bounding box annotations for COTS. The annotated dataset is divided into training and test sets. Care was taken to ensure a reasonable number of COTS instances (~61) were included in the test set to facilitate robust evaluation of object detection and tracking algorithms. The training set comprises 8,204 bounding boxes for COTS and 16,321 scar polygons, while the test set contains 1,004 bounding boxes for COTS and 1,821 scar polygons. This annotated dataset will support the development and validation of machine learning models for automated COTS monitoring, contributing to reef conservation efforts.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Authors marked with an asterisk (*) have contributed equally to this publication.
We annotated a dataset for the detection of drainage outlets and ventilations on flat rooftops. The underlying high-resolution aerial images are orthophotos with a ground sampling distance of 7.5 cm, provided by the Office for Land Management and Geoinformation of the City of Bonn, Germany. The dataset was created through manual annotation using the Computer Vision Annotation Tool (CVAT) and comprises 740 image pairs. Each pair consists of a rooftop image and a corresponding annotated mask indicating the drainage outlets and ventilations. Since rooftops vary in size, we aimed to create image pairs that capture a single rooftop per image without overlaps or cutoffs. Consequently, the dimensions of each image pair differ. The dataset is split randomly into 80% for training, 10% for validation, and 10% for testing.
We provide the dataset in the Common Objects in Context (COCO) format for object detection tasks. In addition to the COCO-formatted dataset, we provide the dataset in its original, pairwise, format to support various machine learning tasks, such as semantic segmentation and panoptic segmentation, as well as to accommodate different data-loading requirements for diverse deep learning models.
If your object detection approach requires the 'category_id' to start from 0 instead of 1, please refer to the following guide: https://github.com/obss/sahi/discussions/336
For conversion to a completely different dataset format, such as YOLO, please see the repository: https://github.com/ultralytics/JSON2YOLO
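As an alternative to the linked guide, shifting COCO category ids so they start at 0 can also be done with a few lines of Python; the sketch below is generic and the file names are assumptions.

```python
# Minimal sketch: shift COCO category ids to start at 0 instead of 1
# (hypothetical input/output file names).
import json

with open("annotations/instances_train.json") as f:
    coco = json.load(f)

offset = min(c["id"] for c in coco["categories"])
for c in coco["categories"]:
    c["id"] -= offset
for a in coco["annotations"]:
    a["category_id"] -= offset

with open("annotations/instances_train_zero_indexed.json", "w") as f:
    json.dump(coco, f)
```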
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset captures the flying motion of birds (budgerigars) in a controlled environment. Trajectories of birds flying from one perch to another were recorded using video cameras. The dataset contains 355 clips of individual events of a bird's flying motion, along with annotated images and 3D data generated from the events.
Only the bird in motion was annotated, and 3D trajectories were calculated for all five annotated parts of the bird. The annotations were done manually using the Computer Vision Annotation Tool (CVAT). Matlab (MathWorks®) was used to generate the 3D trajectories of the bird's flight motion. The files for the five 3D trajectories are labelled as follows: point3D_bird for the bird's body, point3D_head for the head, point3D_tail for the tail, and point_3D_left_wing and point_3D_right_wing for the left and right wing, respectively. The annotation file format is JSON, and the annotation format is Microsoft COCO. The 3D coordinates for the bird's trajectories are in .mat file format.
Dataset folder structure:
clip_1
--> point3D_bird
--> point3D_head
--> point3D_tail
--> point3D_left_wing
--> point3D_right_wing
--> left.mp4
--> right.mp4
--> left
--> --> images (includes all the frames of clip 1 left)
--> --> annotations (includes .json file for annotation of clip 1 left)
--> right
--> --> images (includes all the frames of clip 1 right)
--> --> annotations (includes .json file for annotation of clip 1 right)
...
clip_355
There are three zip files: 1. Budges355 (Random 10 clips), containing 10 randomly selected clips for previewing the dataset; 2. Budges355 (clip 1-150).zip, containing clip 1 to clip 150; 3. Budges355 (clip 151-355).zip, containing clip 151 to clip 355.
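The 3D trajectory .mat files can be read in Python with scipy. The sketch below assumes the file path shown and inspects whatever variable names the file contains, since the internal variable naming is not documented here.

```python
# Minimal sketch of loading one clip's 3D body trajectory with scipy;
# the path "clip_1/point3D_bird.mat" is a hypothetical example.
from scipy.io import loadmat

data = loadmat("clip_1/point3D_bird.mat")
# loadmat returns a dict of MATLAB variables plus header keys starting with "__".
var_names = [k for k in data if not k.startswith("__")]
print("variables in file:", var_names)

traj = data[var_names[0]]          # expected to hold per-frame XYZ coordinates
print("trajectory array shape:", traj.shape)
```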
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
As mentioned in the reference paper:
Dust storms are considered a severe meteorological disaster, especially in arid and semi-arid regions; they are characterized by dust-aerosol-filled air and strong winds across an extensive area. Every year, a large number of aerosols are released from dust storms into the atmosphere, exerting a deleterious impact on both the environment and human lives. Although increasing emphasis has been placed on dust storms over the last fifty years, owing to the rapid change in global climate, by utilizing measurements from the moderate-resolution imaging spectroradiometer (MODIS), the possibility of utilizing MODIS true-color composite images for the task has not yet been sufficiently discussed.
This data publication contains MODIS true-color dust images collected through an extensive visual inspection procedure to test the above hypothesis. The dataset includes a subset of the full dataset of RGB images, each with visually recognizable dust storm incidents at high latitude, temporally ranging from 2003 to 2019, over land as well as ocean throughout the world. All RGB images are manually annotated for dust storm detection using the CVAT tool, such that the dust-susceptible pixel area in the image is masked with (255, 255, 255) in RGB space (white) and the non-susceptible pixel area is masked with (0, 0, 0) in RGB space (black).
This dataset contains 160 satellite true-colour images and their corresponding ground-truth label bitmaps, organized in two folders: images and annotations. The associated notebook presents the image data visualization, statistical data augmentation, and a U-Net-based model to detect dust storms in a semantic segmentation fashion.
The dataset of true-colour dust images, consisting of airborne dust and weaker dust traces, was collected from the MODIS database through an extensive visual inspection procedure. The dataset can be used without additional permissions or fees.
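Because the label bitmaps are pure black/white RGB images, converting one into a 0/1 mask for a segmentation model takes only a few lines; the file name below is hypothetical.

```python
# Minimal sketch: convert a white/black ground-truth bitmap into a 0/1 mask
# (hypothetical file name).
import numpy as np
from PIL import Image

label = np.asarray(Image.open("annotations/dust_0001.png").convert("RGB"))
mask = np.all(label == 255, axis=-1).astype(np.uint8)   # 1 = dust-susceptible pixel

print("dust pixel fraction:", mask.mean())
```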
If you use these data in a publication, presentation, or other research product please use the following citation:
N. Bandara, “Ensemble deep learning for automated dust storm detection using satellite images,” in 2022 International Research Conference on Smart Computing and Systems Engineering (SCSE), vol. 5. IEEE, 2022, pp. 178–183.
For interested researchers, please note that the paper is openly accessible at conference proceedings and/or here.
As described here, ``` You are free to: Share — copy and redistribute the material in any medium or format Adapt — remix, transform, and build upon the material for any purpose, even commercially. This license is acceptable for Free Cultural Works. The licensor cannot revoke these freedoms as long as you follow the license terms.
Under the following terms: Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
ShareAlike — If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original.
No additional restrictions — You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits. ```