Voila-COCO Dataset Instruction
File structure
```
.
|-- README.md
|-- voila_anno.json        # annotation file containing traces and corresponding QA pairs
|-- voila_image.json       # compressed image data
|-- voila_meta.json        # meta index of conversations
`-- voilagaze_dataset.py   # our torch dataset implementation; use this to quickly leverage the data
```
Get dataset sample
You can directly run voilagaze_dataset.py to get a sample in the following structure: example = {… See the full description on the dataset page: https://huggingface.co/datasets/skywang/VOILA-COCO.
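The sample schema is truncated above, so as a quick sanity check you can peek at the raw files directly. The sketch below is only an assumption-light starting point: it loads the JSON files named in the file structure and reports their top-level types, without guessing any field names; the full sample assembly is done by voilagaze_dataset.py.

```python
import json

# Minimal sketch: peek at the raw annotation and meta files without assuming
# any field names. The full sample (traces, QA pairs, decompressed images) is
# assembled by the provided voilagaze_dataset.py, which you can run directly:
#   python voilagaze_dataset.py
with open("voila_anno.json") as f:
    anno = json.load(f)
with open("voila_meta.json") as f:
    meta = json.load(f)

print("annotation container:", type(anno).__name__, "with", len(anno), "entries")
print("meta container:", type(meta).__name__, "with", len(meta), "entries")
```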
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Vehicles Coco is a dataset for object detection tasks; it contains vehicle annotations for 18,998 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data used for our paper "WormSwin: Instance Segmentation of C. elegans using Vision Transformer". This publication is divided into three parts:
CSB-1 Dataset
Synthetic Images Dataset
MD Dataset
The CSB-1 Dataset consists of frames extracted from videos of Caenorhabditis elegans (C. elegans) annotated with binary masks. Each C. elegans is separately annotated, providing accurate annotations even for overlapping instances. All annotations are provided in binary mask format and as COCO Annotation JSON files (see COCO website).
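Since the annotations are distributed as standard COCO JSON files, a minimal sketch for loading per-worm binary masks with pycocotools might look like the following; the path "annotations.json" is a placeholder for whichever CSB-1 annotation file you downloaded.

```python
from pycocotools.coco import COCO

# Minimal sketch, assuming a COCO annotation file from the CSB-1 Dataset;
# "annotations.json" is a placeholder path.
coco = COCO("annotations.json")

img_ids = coco.getImgIds()
img_info = coco.loadImgs(img_ids[0])[0]

# Load every worm instance annotated on this frame and decode its binary mask.
ann_ids = coco.getAnnIds(imgIds=img_info["id"])
for ann in coco.loadAnns(ann_ids):
    mask = coco.annToMask(ann)  # HxW uint8 binary mask for one worm
    print(ann["id"], int(mask.sum()), "foreground pixels")
```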
The videos are named after the following pattern:
<"worm age in hours"_"mutation"_"irradiated (binary)"_"video index (zero based)">
For mutation the following values are possible:
wild type
csb-1 mutant
csb-1 with rescue mutation
An example video name is 24_1_1_2, denoting 24-hour-old, irradiated C. elegans carrying the csb-1 mutation (video index 2).
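As a small illustration, a video name can be split into its four fields programmatically. The numeric mutation coding below follows the listed order and matches the 24_1_1_2 example, but it is an assumption rather than something stated explicitly.

```python
# Minimal sketch: parse a CSB-1 video name into its four fields.
# The mutation-code mapping (0 = wild type, 1 = csb-1 mutant, 2 = csb-1 with
# rescue mutation) is assumed from the listed order and the 24_1_1_2 example.
MUTATIONS = {0: "wild type", 1: "csb-1 mutant", 2: "csb-1 with rescue mutation"}

def parse_video_name(name: str) -> dict:
    age, mutation, irradiated, index = (int(x) for x in name.split("_"))
    return {
        "age_hours": age,
        "mutation": MUTATIONS.get(mutation, f"unknown ({mutation})"),
        "irradiated": bool(irradiated),
        "video_index": index,
    }

print(parse_video_name("24_1_1_2"))
# {'age_hours': 24, 'mutation': 'csb-1 mutant', 'irradiated': True, 'video_index': 2}
```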
Video data was provided by M. Rieckher; instance segmentation annotations were created under supervision of K. Bozek and M. Deserno. The Synthetic Images Dataset was created by cutting out C. elegans (foreground objects) from the CSB-1 Dataset and placing them randomly on background images also taken from the CSB-1 Dataset. Foreground objects were flipped, rotated and slightly blurred before being placed on the background images. The same was done with the binary mask annotations taken from the CSB-1 Dataset so that they match the foreground objects in the synthetic images. Additionally, we added rings of random color, size, thickness and position to the background images to simulate petri-dish edges.
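For readers who want to reproduce a similar augmentation, the following is a minimal cut-and-paste sketch of the idea described above, not the authors' generation code; `fg`, `fg_mask` and `bg` stand for a cropped worm, its binary mask and a background frame.

```python
import random
import numpy as np
from PIL import Image, ImageFilter

# Minimal sketch of the cut-and-paste augmentation described above; not the
# authors' generation code. fg / fg_mask / bg are placeholders for a cropped
# worm, its binary mask, and a background frame from the CSB-1 Dataset.
def paste_worm(bg: Image.Image, fg: Image.Image, fg_mask: Image.Image):
    # Random flip and rotation applied to both image and mask; a slight blur
    # is applied only to the image so the mask stays binary.
    angle = random.uniform(0, 360)
    if random.random() < 0.5:
        fg = fg.transpose(Image.Transpose.FLIP_LEFT_RIGHT)
        fg_mask = fg_mask.transpose(Image.Transpose.FLIP_LEFT_RIGHT)
    fg = fg.rotate(angle, expand=True).filter(ImageFilter.GaussianBlur(radius=0.5))
    fg_mask = fg_mask.rotate(angle, expand=True)

    # Paste at a random position and build a full-size mask for the new instance.
    x = random.randint(0, max(0, bg.width - fg.width))
    y = random.randint(0, max(0, bg.height - fg.height))
    out = bg.copy()
    out.paste(fg, (x, y), fg_mask)
    full_mask = Image.new("L", bg.size, 0)
    full_mask.paste(fg_mask, (x, y))
    return out, np.array(full_mask) > 0
```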
The synthetic dataset was generated by M. Deserno. The Mating Dataset (MD) consists of 450 grayscale image patches of 1,012 x 1,012 px showing C. elegans with high overlap, crawling on a petri dish. We took the patches from a 10 min long video of size 3,036 x 3,036 px. The video was downsampled from 25 fps to 5 fps before selecting 50 random frames for annotation and patching. Like the other datasets, worms were annotated with binary masks and annotations are provided as COCO Annotation JSON files.
The video data was provided by X.-L. Chu; instance segmentation annotations were created under supervision of K. Bozek and M. Deserno.
Further details about the datasets can be found in our paper.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
The 3D surgical tool dataset (3dStool) has been constructed to assist the development of computer vision techniques for the operating room. Note that functions for visualising, processing, and splitting the dataset can be found in the relevant GitHub repository.
Specifically, even though laparoscopic scenes have received a lot of attention in terms of labelled images, surgical tools used at the initial stages of an operation, such as scalpels and scissors, have lacked such datasets.
3dStool includes 5370 images, accompanied by manually drawn polygon labels, as well as information on the 3D pose of these tools in operation. The tools were recorded while operating on a cadaveric knee. A RealSense D415 was used for image collection, while an optical tracker was employed for 3D pose recording. Four surgical tools are included so far:
Scalpel
Scissors
Forceps
Electric Burr
An annotation JSON file (in COCO format) exists for the images, containing the masks, boxes, and other relevant information. Furthermore, pose information is provided in two different ways.
Firstly, a CSV in the following format:
CSV Structure

| Column | Value |
|---|---|
| 1 | X (m) |
| 2 | Y (m) |
| 3 | Z (m) |
| 4 | qi |
| 5 | qj |
| 6 | qk |
| 7 | ql |
| 8 | Class |
| 9 | Image Name |
Position and orientation are both provided in the coordinate frame of the camera used to obtain the data (RealSense D415, Intel, USA). Orientation is provided in the form of quaternions; however, it is possible to convert this format into other notations.
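A minimal sketch for reading the pose CSV and converting each quaternion to a rotation matrix with SciPy is shown below. It assumes the nine-column layout from the table, a header-less CSV, and that ql is the scalar quaternion component (SciPy expects scalar-last order); the file path is a placeholder.

```python
import csv
from scipy.spatial.transform import Rotation

# Minimal sketch, assuming the 9-column layout from the table above, no header
# row, and that ql is the scalar quaternion component (SciPy uses scalar-last
# order). "poses.csv" is a placeholder path.
with open("poses.csv", newline="") as f:
    for row in csv.reader(f):
        x, y, z, qi, qj, qk, ql = map(float, row[:7])
        tool_class, image_name = row[7], row[8]

        R = Rotation.from_quat([qi, qj, qk, ql])  # camera-frame orientation
        print(image_name, tool_class, "position (m):", (x, y, z))
        print("rotation matrix:\n", R.as_matrix())
        break  # only show the first row
```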
The pose data can also be combined with the masks in a final JSON file, in order to obtain a COCO-format JSON with object poses as well. In the data provided, each of the test, train and validation subsets has its own COCO-like JSON file with the poses fused in, whereas the "original" JSONs only provide the image masks.
The files and directories are structured as follows. Note that this example is based on the "train" directory, but a similar structure has been created for the test and val sets:
Train
manual_json - Contains the JSON created when manually annotating the images; therefore no pose data is included
pose - Contains the CSV file with the poses of the relevant images, explained in the table above
pose_json - Contains the fused json that includes both the annotations and the pose data for each image
surgical2020 - Contains the images in jpg format
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This page only provides the drone-view image dataset.
The dataset contains drone-view RGB images, depth maps and instance segmentation labels collected from different scenes. Data from each scene is stored in a separate .7z file, along with a color_palette.xlsx file, which contains the RGB_id and corresponding RGB values.
All files follow the naming convention {central_tree_id}_{timestamp}, where {central_tree_id} represents the ID of the tree centered in the image, which is typically in a prominent position, and {timestamp} indicates the time when the data was collected.
Specifically, each 7z file includes the following folders:
rgb: This folder contains the RGB images (PNG) of the scenes and their metadata (TXT). The metadata describes the weather conditions and the world time when the image was captured. An example metadata entry is: Weather:Snow_Blizzard,Hour:10,Minute:56,Second:36.
depth_pfm: This folder contains absolute depth information of the scenes, which can be used to reconstruct the point cloud of the scene through reprojection.
instance_segmentation: This folder stores instance segmentation labels (PNG) for each tree in the scene, along with metadata (TXT) that maps tree_id to RGB_id. The tree_id can be used to look up detailed information about each tree in obj_info_final.xlsx, while the RGB_id can be matched to the corresponding RGB values in color_palette.xlsx. This mapping allows for identifying which tree corresponds to a specific color in the segmentation image (a lookup sketch follows this list).
obj_info_final.xlsx: This file contains detailed information about each tree in the scene, such as position, scale, species, and various parameters, including trunk diameter (in cm), tree height (in cm), and canopy diameter (in cm).
landscape_info.txt: This file contains the ground location information within the scene, sampled every 0.5 meters.
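The sketch below illustrates the color lookup described for the instance_segmentation folder. The file paths and the column names in color_palette.xlsx ("RGB_id", "R", "G", "B") are assumptions and may need to be adjusted to the actual spreadsheet.

```python
import numpy as np
import pandas as pd
from PIL import Image

# Minimal sketch of the color -> RGB_id lookup described above. File paths and
# the column names in color_palette.xlsx ("RGB_id", "R", "G", "B") are assumed.
palette = pd.read_excel("color_palette.xlsx")
color_to_rgb_id = {
    (int(row.R), int(row.G), int(row.B)): row.RGB_id
    for row in palette.itertuples()
}

seg = np.array(Image.open("instance_segmentation/example.png").convert("RGB"))
for color in np.unique(seg.reshape(-1, 3), axis=0):
    key = tuple(int(v) for v in color)
    rgb_id = color_to_rgb_id.get(key)
    if rgb_id is None:
        # Unmatched colors correspond to the "!"-prefixed 7z issue noted below.
        print("color not found in palette:", key)
    else:
        print("color", key, "-> RGB_id", rgb_id)
```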
For birch_forest, broadleaf_forest, redwood_forest and rainforest, we also provided COCO-format annotation files (.json). Two such files can be found in these datasets:
⚠️: 7z files that begin with "!" indicate that the RGB values in the images within the instance_segmentation folder cannot be found in color_palette.xlsx. Consequently, this prevents matching the trees in the segmentation images to their corresponding tree information, which may hinder the application of the dataset to certain tasks. This issue is related to a bug in Colosseum/AirSim, which has been reported in link1 and link2.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description from the "SaRNet: A Dataset for Deep Learning Assisted Search and Rescue with Satellite Imagery" GitHub repository. (The "Note" below was added by the Roboflow team.)
This is a single-class dataset consisting of tiles of satellite imagery labeled with potential 'targets'. Labelers were instructed to draw boxes around anything they suspected may be a paraglider wing, missing in a remote area of Nevada. Volunteers were shown examples of similar objects already in the environment for comparison. The missing wing, as it was found after 3 weeks, is shown below.
![anomaly](https://michaeltpublic.s3.amazonaws.com/images/anomaly_small.jpg)
The dataset contains the following:
Set | Images | Annotations |
---|---|---|
Train | 1808 | 3048 |
Validate | 490 | 747 |
Test | 254 | 411 |
Total | 2552 | 4206 |
The data is in the COCO format and is directly compatible with Faster R-CNN as implemented in Facebook's Detectron2.
Download the data here: sarnet.zip
Or follow these steps
# download the dataset
wget https://michaeltpublic.s3.amazonaws.com/sarnet.zip
# extract the files
unzip sarnet.zip
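To use the extracted data with Detectron2, the COCO annotations can be registered as named datasets. The paths below are placeholders for wherever the JSON files and image folders end up after unzipping.

```python
from detectron2.data.datasets import register_coco_instances

# Minimal sketch, assuming the extracted sarnet.zip contains COCO-style
# annotation JSONs and image folders; the paths below are placeholders.
register_coco_instances(
    "sarnet_train", {}, "sarnet/annotations/train.json", "sarnet/images/train"
)
register_coco_instances(
    "sarnet_val", {}, "sarnet/annotations/val.json", "sarnet/images/val"
)

# The registered names can then be referenced in a Detectron2 config, e.g.
#   cfg.DATASETS.TRAIN = ("sarnet_train",)
#   cfg.DATASETS.TEST = ("sarnet_val",)
```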
**Note:** with Roboflow, you can download the data here (original, raw images, with annotations): https://universe.roboflow.com/roboflow-public/sarnet-search-and-rescue/ (download v1, original_raw-images). Download the dataset in COCO JSON format, or another format of choice, and import it to Roboflow after unzipping the folder to get started on your project.
Get started with a Faster R-CNN model pretrained on SaRNet: SaRNet_Demo.ipynb
Source code for the paper is located here: SaRNet_train_test.ipynb
@misc{thoreau2021sarnet,
title={SaRNet: A Dataset for Deep Learning Assisted Search and Rescue with Satellite Imagery},
author={Michael Thoreau and Frazer Wilson},
year={2021},
eprint={2107.12469},
archivePrefix={arXiv},
primaryClass={eess.IV}
}
The source data was generously provided by Planet Labs, Airbus Defence and Space, and Maxar Technologies.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Mechanical Parts Dataset
The dataset consists of a total of 2250 images downloaded from various internet platforms. Among the images in the dataset, there are 714 images with bearings, 632 images with bolts, 616 images with gears and 586 images with nuts. A total of 10,597 manual labels were created, including 2,099 labels belonging to the bearing class, 2,734 to the bolt class, 2,662 to the gear class and 3,102 to the nut class.
Folder Content
The dataset is split into 80% train, 10% validation and 10% test. In the "Mechanical Parts Dataset" folder, there are three separate folders named "train", "test" and "val". Each of these three folders contains folders named "images" and "labels". Images are kept in the "images" folder and label information is kept in the "labels" folder.
Finally, inside the folder there is a YAML file named "mech_parts_data" for the YOLO algorithm. This file contains the number of classes and the class names.
Images and Labels
The dataset was prepared in accordance with the YOLOv5 format.
For example, the label information for the image named "2a0xhkr_jpg.rf.45a11bf63c40ad6e47da384fdf6bb7a1.jpg" is stored in a txt file with the same name. The label information (coordinates) in the txt file is given as "class x_center y_center width height".
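A minimal sketch for converting one normalized YOLO label line into pixel-space corner coordinates (the image size in the example is made up):

```python
# Minimal sketch: convert one normalized YOLO label line
# ("class x_center y_center width height") into pixel-space corner coordinates.
def yolo_to_pixel_bbox(line: str, img_w: int, img_h: int):
    cls, xc, yc, w, h = line.split()
    xc, yc = float(xc) * img_w, float(yc) * img_h
    w, h = float(w) * img_w, float(h) * img_h
    x_min, y_min = xc - w / 2, yc - h / 2
    return int(cls), (x_min, y_min, x_min + w, y_min + h)

# Example with made-up numbers on a 640x480 image:
print(yolo_to_pixel_bbox("2 0.5 0.5 0.25 0.25", 640, 480))
# (2, (240.0, 180.0, 400.0, 300.0))
```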
Update 05.01.2023
**Pascal VOC and COCO JSON formats have been added.**
Related paper: doi.org/10.5281/zenodo.7496767
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains figure bounding boxes corresponding to the bioRxiv 10k dataset.
It provides annotations in two formats:
COCO format (JSON)
JATS XML with GROBID's "coords" attribute
The COCO format contains bounding boxes in rendered pixel units, as well as PDF user units. The latter uses field names with the "pt_" prefix.
The "coords" attribute uses the PDF user units.
The dataset was generated by using an algorithm to find the figure images within the rendered PDF pages. The main algorithm used for that purpose is SIFT. As a fallback, OpenCV's template matching (with multi-scaling) was used. There may be some error cases in the dataset. Very few documents were excluded, where neither algorithm was able to find any match for one of the figure images (six documents in the train subset, two documents in the test subset).
Figure images may appear next to a figure description, but they may also appear as "attachments". The latter usually appear at the end of the document (but not always) and often on pages with dimensions different from the regular page size (but not always).
This dataset itself doesn't contain any images. The PDFs from which the pages are rendered can be found in the bioRxiv 10k dataset.
The dataset is intended for training or evaluation of semantic figure extraction. An evaluation score can be calculated by comparing the extracted bounding boxes with those provided in this dataset (example implementation: ScienceBeam Judge).
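As an illustration of the kind of box comparison such an evaluation performs, here is a plain intersection-over-union in COCO [x, y, width, height] order; this is not the ScienceBeam Judge implementation, just a sketch.

```python
# Minimal sketch: intersection-over-union between an extracted box and a
# reference box, both in COCO [x, y, width, height] order.
def iou(box_a, box_b):
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    ix = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = ix * iy
    union = aw * ah + bw * bh - inter
    return inter / union if union else 0.0

print(iou([10, 10, 100, 50], [20, 15, 100, 50]))  # partial overlap, ~0.68
```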
The dataset was created as part of eLife's ScienceBeam project.
Methods
Cotton plants were grown in a well-controlled greenhouse in the NC State Phytotron as described previously (Pierce et al, 2019). Flowers were tagged on the day of anthesis and harvested three days post anthesis (3 DPA). The distinct fiber shapes had already formed by 2 DPA (Stiff and Haigler, 2016; Graham and Haigler, 2021), and fibers were still relatively short at 3 DPA, which facilitated the visualization of multiple fiber tips in one image.
Cotton fiber sample preparation, digital image collection, and image analysis:
Ovules with attached fiber were fixed in the greenhouse. The fixative previously used (Histochoice) (Stiff and Haigler, 2016; Pierce et al., 2019; Graham and Haigler, 2021) is obsolete, which led to testing and validation of another low-toxicity, formalin-free fixative (#A5472; Sigma-Aldrich, St. Louis, MO; Fig. S1). The boll wall was removed without damaging the ovules. (Using a razor blade, cut away the top 3 mm of the boll. Make about 1 mm deep longitudinal incisions between the locule walls, and finally cut around the base of the boll.) All of the ovules with attached fiber were lifted out of the locules and fixed (1 h, RT, 1:10 tissue:fixative ratio) prior to optional storage at 4°C. Immediately before imaging, ovules were examined under a stereo microscope (incident light, black background, 31X) to select three vigorous ovules from each boll while avoiding drying. Ovules were rinsed (3 x 5 min) in buffer [0.05 M PIPES, 12 mM EGTA, 5 mM EDTA and 0.1% (w/v) Tween 80, pH 6.8], which had lower osmolarity than a microtubule-stabilizing buffer used previously for aldehyde-fixed fibers (Seagull, 1990; Graham and Haigler, 2021). While steadying an ovule with forceps, one to three small pieces of its chalazal end with attached fibers were dissected away using a small knife (#10055-12; Fine Science Tools, Foster City, CA). Each ovule piece was placed in a single well of a 24-well slide (#63430-04; Electron Microscopy Sciences, Hatfield, PA) containing a single drop of buffer prior to applying and sealing a 24 x 60 mm coverslip with vaseline.
Samples were imaged with brightfield optics and default settings for the 2.83 mega-pixel, color, CCD camera of the Keyence BZ-X810 imaging system (www.keyence.com; housed in the Cellular and Molecular Imaging Facility of NC State). The location of each sample in the 24-well slides was identified visually using a 2X objective and mapped using the navigation function of the integrated Keyence software. Using the 10X objective lens (plan-apochromatic; NA 0.45) and 60% closed condenser aperture setting, a region with many fiber apices was selected for imaging using the multi-point and z-stack capture functions. The precise location was recorded by the software prior to visual setting of the limits of the z-plane range (1.2 µm step size). Typically, three 24-sample slides (representing three accessions) were set up in parallel prior to automatic image capture. The captured z-stacks for each sample were processed into one two-dimensional image using the full-focus function of the software. (Occasional samples contained too much debris for computer vision to be effective, and these were reimaged.)
Resource Title: Deltapine 90 - Manually Annotated Training Set.
File Name: GH3 DP90 Keyence 1_45 JPEG.zip
Resource Description: These images were manually annotated in Labelbox.
Resource Title: Deltapine 90 - AI-Assisted Annotated Training Set.
File Name: GH3 DP90 Keyence 46_101 JPEG.zip
Resource Description: These images were AI-labeled in RoboFlow and then manually reviewed in RoboFlow.
Resource Title: Deltapine 90 - Manually Annotated Training-Validation Set.
File Name: GH3 DP90 Keyence 102_125 JPEG.zip
Resource Description: These images were manually labeled in LabelBox, and then used for training-validation for the machine learning model.
Resource Title: Phytogen 800 - Evaluation Test Images.
File Name: Gb cv Phytogen 800.zip
Resource Description: These images were used to validate the machine learning model. They were manually annotated in ImageJ.
Resource Title: Pima 3-79 - Evaluation Test Images.
File Name: Gb cv Pima 379.zip
Resource Description: These images were used to validate the machine learning model. They were manually annotated in ImageJ.
Resource Title: Pima S-7 - Evaluation Test Images.
File Name: Gb cv Pima S7.zip
Resource Description: These images were used to validate the machine learning model. They were manually annotated in ImageJ.
Resource Title: Coker 312 - Evaluation Test Images.
File Name: Gh cv Coker 312.zip
Resource Description: These images were used to validate the machine learning model. They were manually annotated in ImageJ.
Resource Title: Deltapine 90 - Evaluation Test Images.
File Name: Gh cv Deltapine 90.zip
Resource Description: These images were used to validate the machine learning model. They were manually annotated in ImageJ.
Resource Title: Half and Half - Evaluation Test Images.
File Name: Gh cv Half and Half.zip
Resource Description: These images were used to validate the machine learning model. They were manually annotated in ImageJ.
Resource Title: Fiber Tip Annotations - Manual.
File Name: manual_annotations.coco_.json
Resource Description: Annotations in COCO.json format for fibers. Manually annotated in Labelbox.
Resource Title: Fiber Tip Annotations - AI-Assisted.
File Name: ai_assisted_annotations.coco_.json
Resource Description: Annotations in COCO.json format for fibers. AI annotated with human review in Roboflow.
Resource Title: Model Weights (iteration 600).
File Name: model_weights.zip
Resource Description: The final model, provided as a zipped PyTorch .pth file. It was chosen at training iteration 600. The model weights can be imported in Python for use with the fiber-tip type detection neural network.
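A minimal loading sketch is shown below. The .pth file name inside model_weights.zip and the detector architecture are not specified here, so only the state-dict loading step is illustrated.

```python
import torch

# Minimal sketch: load the provided weights after unzipping model_weights.zip.
# "model_weights.pth" is a placeholder file name; the model architecture class
# is defined by the accompanying training code, not by this dataset page.
state_dict = torch.load("model_weights.pth", map_location="cpu")

# model = YourDetectorClass(...)   # hypothetical; supplied by the project code
# model.load_state_dict(state_dict)
# model.eval()
print(type(state_dict))
```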
Resource Software Recommended: Google Colab, URL: https://research.google.com/colaboratory/
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
The Growing Strawberries Dataset (GSD) is a curated multiple-object tracking dataset inspired by the growth monitoring of strawberries. The frames were taken at hourly intervals by six cameras for a total of 16 months in 2021 and 2022, covering 12 plants in two greenhouses respectively. The dataset consists of hourly images collected during the cultivation period, bounding box (bbox) annotations of strawberry fruits, and precise identification and tracking of strawberries over time. GSD contains two types of images - RGB (color) and OCN (orange, cyan, near-infrared). These images were captured throughout the cultivation period. Each image sequence represents all the images captured by one camera during the year of cultivation. These sequences are named using the format "