Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is a keypoint-only subset of the COCO 2017 dataset. You can access the original COCO dataset from here
This dataset contains three folders: annotations, val2017, and train2017. - The annotations folder contains two JSON files, one for val and one for train. Each JSON file contains information such as the image ID, bounding box, and keypoint locations. - The val2017 and train2017 folders contain the filtered images, i.e. those with num_keypoints > 0 according to the annotation file.
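As a minimal Python sketch of how that filter can be reproduced (the file name below assumes the standard COCO keypoint naming and may differ in this subset):

```python
import json

# Hypothetical path: the exact file names inside the annotations folder may differ.
with open("annotations/person_keypoints_val2017.json") as f:
    coco = json.load(f)

# Keep only annotations with at least one labelled keypoint, mirroring the
# filtering described above.
kept = [a for a in coco["annotations"] if a.get("num_keypoints", 0) > 0]
image_ids = {a["image_id"] for a in kept}
print(f"{len(kept)} annotations across {len(image_ids)} images")
```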
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This remarkable dataset of lunar images captured by the LRO Camera has been meticulously labeled in COCO format for object detection tasks in computer vision. The COCO annotation format provides a standardized way of describing objects in the images, including their locations and class labels, enabling machine learning algorithms to learn to recognize and detect objects in the images more accurately.
This dataset captures a wide variety of lunar features, including craters, mountains, and other geological formations, all labeled with precise and consistent COCO annotation. The dataset's comprehensive coverage of craters and other geological features on the Moon provides a treasure trove of data and insights into the evolution of our closest celestial neighbor.
The COCO annotation format is particularly well-suited for handling complex scenes with multiple objects, occlusions, and overlapping objects. With the precise labeling of objects provided by COCO annotation, this dataset enables researchers and scientists to train machine learning algorithms to automatically detect and analyze these features in large datasets.
In conclusion, this valuable dataset of lunar images labeled in COCO annotation format provides a powerful tool for research and discovery in the field of planetary science. With its comprehensive coverage and precise labeling of lunar features, it offers a wealth of data and insights into the evolution of the Moon's landscape, facilitating research and understanding of this enigmatic celestial body.
This dataset was created by Prateek_ag
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Furniture_detection
Introduction
This is a furniture detection dataset consisting of synthetic images generated in Unreal Engine and annotated with coco-annotator.
Dataset Structure
/UnrealEngine_Furniture/
    Images/
        HighresScreenshot00000.png
        HighresScreenshot00001.png
        ...
        HighresScreenshot00219.png
    coco_annotator.json
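A small Python sketch for inspecting the annotation file (assuming the layout above; the per-class counting is illustrative):

```python
import json
from collections import Counter

# Path follows the layout above; adjust to where you unpacked the dataset.
with open("UnrealEngine_Furniture/coco_annotator.json") as f:
    coco = json.load(f)

names = {c["id"]: c["name"] for c in coco["categories"]}
per_class = Counter(names[a["category_id"]] for a in coco["annotations"])
print(per_class.most_common())  # instance counts per furniture class
```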
Annotation examples
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The MS COCO (Microsoft Common Objects in Context) dataset is a large-scale object detection, segmentation, key-point detection, and captioning dataset. The dataset consists of 164K images.
This is the original version from 2014, made available here for easy access on Kaggle, since it no longer seems to be available on the COCO dataset website. It has been retrieved from the mirror that Joseph Redmon set up on his own website.
The 2014 version of the COCO dataset is an excellent object detection dataset with 80 classes, 82,783 training images and 40,504 validation images. This dataset contains all of this imagery in two folders, as well as the annotations with the class and location (bounding box) of the objects contained in each image.
The initial split provides training (83K), validation (41K) and test (41K) sets. Since the split between training and validation was not optimal in the original dataset, there are also two text (.part) files with a new split that keeps only 5,000 images for validation and uses the rest for training. The test set has no labels and can be used for visual validation or pseudo-labelling.
This is mostly inspired by Erik Linder-Norén and [Joseph Redmon](https://pjreddie.com/darknet/yolo).
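A sketch of applying the re-split, assuming each .part file lists one image path per line (the file names below follow the darknet convention and may differ in this copy):

```python
# Assumes each .part file lists one image path per line, as in the
# darknet COCO setup; the file names below are hypothetical for this copy.
def read_split(path):
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]

val_images = read_split("5k.part")              # the 5,000 validation images
train_images = read_split("trainvalno5k.part")  # everything else
print(len(train_images), len(val_images))
```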
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset provides annotated very-high-resolution satellite RGB images extracted from Google Earth to train deep learning models to perform instance segmentation of Juniperus communis L. and Juniperus sabina L. shrubs. All images are from the high mountain of Sierra Nevada in Spain. The dataset contains 810 images (.jpg) of size 224x224 pixels. We also provide partitioning of the data into Train (567 images), Test (162 images), and Validation (81 images) subsets. Their annotations are provided in three different .json files following the COCO annotation format.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
This dataset is designed for object detection tasks and follows the COCO format. It contains 300 images and corresponding annotation files in JSON format. The dataset is split into training, validation, and test sets, ensuring a balanced distribution for model evaluation.
train/ (70% - 210 images)
valid/ (15% - 45 images)
test/ (15% - 45 images)
Each split folder contains:
Images in JPEG/PNG format.
A corresponding _annotations.coco.json file that includes bounding box annotations.
The dataset has undergone several preprocessing and augmentation steps to enhance model generalization:
Auto-orientation applied
Resized to 640x640 pixels (stretched)
Flip: Horizontal flipping
Crop: 0% minimum zoom, 5% maximum zoom
Rotation: Between -5° and +5°
Saturation: Adjusted between -4% and +4%
Brightness: Adjusted between -10% and +10%
Blur: Up to 0px
Noise: Up to 0.1% of pixels
Bounding Box Augmentations:
Flipping, cropping, rotation, brightness adjustments, blur, and noise applied accordingly to maintain annotation consistency.
The dataset follows the COCO (Common Objects in Context) format, which includes:
images section: Contains image metadata such as filename, width, and height.
annotations section: Includes bounding boxes, category IDs, and segmentation masks (if applicable).
categories section: Defines class labels.
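As an illustration of these three sections, a minimal COCO skeleton written as a Python dict (all values are made up):

```python
# Illustrative skeleton of the three COCO sections (values are made up).
coco = {
    "images": [
        {"id": 1, "file_name": "img_001.jpg", "width": 640, "height": 640}
    ],
    "annotations": [
        {
            "id": 1,
            "image_id": 1,
            "category_id": 1,
            "bbox": [120.0, 80.0, 50.0, 90.0],  # [x, y, width, height] in pixels
            "area": 4500.0,
            "iscrowd": 0,
        }
    ],
    "categories": [
        {"id": 1, "name": "example_class", "supercategory": "none"}
    ],
}
```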
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This paper introduces a dataset for object detection training, which includes 12 types of construction machinery commonly operated at civil engineering sites. To collect this dataset, a housing development construction site in South Korea was selected, and a video collection system was operated over a six-month period. Frames were extracted from the collected video footage on a daily basis, resulting in a total of 87,766 images in the full training dataset. Using the COCO Annotator tool, labels and bounding box annotations were processed, generating a total of 856,485 objects.
https://www.marketreportanalytics.com/privacy-policy
The global data annotation and labeling tool market is experiencing robust growth, driven by the increasing demand for high-quality training data in artificial intelligence (AI) and machine learning (ML) applications. The market, estimated at $2 billion in 2025, is projected to achieve a Compound Annual Growth Rate (CAGR) of 25% from 2025 to 2033, reaching approximately $10 billion by 2033. This expansion is fueled by several key factors. Firstly, the proliferation of AI applications across diverse sectors such as automotive (autonomous driving), healthcare (medical image analysis), and finance (fraud detection) is creating an insatiable need for accurate and efficiently labeled data. Secondly, the advancement of deep learning techniques requires massive datasets, further boosting demand for annotation and labeling tools. Finally, the emergence of sophisticated tools offering automated and semi-supervised annotation capabilities is streamlining the process and reducing costs, making the technology accessible to a broader range of organizations.

However, market growth is not without its challenges. Data privacy concerns and the need for robust data security protocols pose significant restraints. The high cost associated with specialized expertise in data annotation can also limit adoption, particularly for smaller companies.

Despite these challenges, the market segmentation reveals opportunities. The automatic annotation segment is anticipated to grow rapidly due to its efficiency gains, while applications within the healthcare and automotive sectors are expected to dominate the market share, reflecting the considerable investment in AI across these industries. Leading players like Labelbox, Scale AI, and SuperAnnotate are strategically positioning themselves to capitalize on this growth by focusing on developing advanced tools, expanding their partnerships, and entering new geographic markets. The North American market currently holds the largest share, but the Asia-Pacific region is projected to experience the fastest growth due to increased investment in AI research and development across countries such as China and India.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data used for our paper "WormSwin: Instance Segmentation of C. elegans using Vision Transformer". This publication is divided into three parts:
CSB-1 Dataset
Synthetic Images Dataset
MD Dataset
The CSB-1 Dataset consists of frames extracted from videos of Caenorhabditis elegans (C. elegans) annotated with binary masks. Each C. elegans is separately annotated, providing accurate annotations even for overlapping instances. All annotations are provided in binary mask format and as COCO Annotation JSON files (see COCO website).
The videos are named after the following pattern:
<"worm age in hours"_"mutation"_"irradiated (binary)"_"video index (zero based)">
For mutation the following values are possible:
wild type
csb-1 mutant
csb-1 with rescue mutation
An example video name would be 24_1_1_2, meaning the third video (zero-based index 2) of irradiated, 24-hour-old C. elegans carrying the csb-1 mutation.
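A small Python helper for decoding such names (the numeric code for "csb-1 mutant" is confirmed by the example above; the codes assumed here for the other two mutation values are a guess):

```python
def parse_video_name(name: str) -> dict:
    """Decode video names like '24_1_1_2' following the pattern above."""
    # Code 1 = csb-1 mutant matches the example; codes 0 and 2 are assumptions.
    mutations = {0: "wild type", 1: "csb-1 mutant", 2: "csb-1 with rescue mutation"}
    age, mutation, irradiated, index = (int(p) for p in name.split("_"))
    return {
        "worm_age_hours": age,
        "mutation": mutations[mutation],
        "irradiated": bool(irradiated),
        "video_index": index,
    }

print(parse_video_name("24_1_1_2"))
```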
Video data was provided by M. Rieckher; instance segmentation annotations were created under the supervision of K. Bozek and M. Deserno.
The Synthetic Images Dataset was created by cutting out C. elegans (foreground objects) from the CSB-1 Dataset and placing them randomly on background images also taken from the CSB-1 Dataset. Foreground objects were flipped, rotated and slightly blurred before being placed on the background images. The same was done with the binary mask annotations taken from the CSB-1 Dataset so that they match the foreground objects in the synthetic images. Additionally, we added rings of random color, size, thickness and position to the background images to simulate petri-dish edges.
This synthetic dataset was generated by M. Deserno.
The Mating Dataset (MD) consists of 450 grayscale image patches of 1,012 x 1,012 px showing C. elegans with high overlap, crawling on a petri dish. We took the patches from a 10-minute-long video of size 3,036 x 3,036 px. The video was downsampled from 25 fps to 5 fps before selecting 50 random frames for annotating and patching. Like the other datasets, worms were annotated with binary masks and annotations are provided as COCO Annotation JSON files.
The video data was provided by X.-L. Chu; Instance Segmentation Annotations were created under supervision of K. Bozek and M. Deserno.
Further details about the datasets can be found in our paper.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
DoPose (Dortmund Pose) is a dataset of highly cluttered and closely stacked objects, saved in the BOP format. The dataset includes RGB images, depth images, 6D poses of objects, segmentation masks (all and visible), COCO JSON annotations, camera transformations, and 3D models of all objects. The dataset contains 2 different types of scenes (table and bin), and each scene contains different view angles. For the bin scenes, the data contains 183 scenes with 2,150 image views: 35 scenes contain 2 views, 20 contain 3 views, and 128 contain 16 views. For the table scenes, the data contains 118 scenes with 1,175 image views: 20 scenes contain 3 views, 50 contain 6 views, and 48 contain 17 views. In total, the data contains 301 scenes and 3,325 view images. Most of the scenes contain mixed objects. The dataset contains 19 objects in total.
For more info about the dataset content and collection process, please refer to our arXiv preprint.
If you have any questions about the dataset, please contact anas.gouda@tu-dortmund.de
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This synthetic Siberian larch tree crown dataset was created for upscaling and machine learning purposes as part of the SiDroForest (Siberia Drone Forest Inventory) project. The SiDroForest data collection (https://www.pangaea.de/?q=keyword%3A%22SiDroForest%22) consists of vegetation plots covered in Siberia during a two-month fieldwork expedition in 2018 by the Alfred Wegener Institute Helmholtz Centre for Polar and Marine Research in Germany. During fieldwork, fifty-six 50 x 50 m vegetation plots were covered by Unmanned Aerial Vehicle (UAV) flights, and Red Green Blue (RGB) and Red Green Near Infrared (RGNIR) photographs were taken with a consumer-grade DJI Phantom 4 quadcopter.

The synthetic dataset provided here contains larch (Larix gmelinii (Rupr.) Rupr. and Larix cajanderi Mayr.) tree crowns extracted from the onboard-camera RGB UAV images of five selected vegetation plots from this expedition, placed on top of full-sized images from the same RGB flights. The extracted tree crowns have been rotated, rescaled and repositioned across the images, resulting in a diverse synthetic dataset that contains 10,000 images for training and 2,000 images for validation of complex machine learning neural networks. In addition, the data is saved in Microsoft's Common Objects in Context (COCO) format (Lin et al., 2013) and can easily be loaded as a dataset for networks such as Mask R-CNN, U-Net or Faster R-CNN. These are neural networks for instance segmentation tasks that have become more frequently used for forest monitoring purposes over the years.

The images included in this dataset are from the field plots EN18062 (62.17° N 127.81° E), EN18068 (63.07° N 117.98° E), EN18074 (62.22° N 117.02° E), EN18078 (61.57° N 114.29° E) and EN18083 (59.97° N 113° E), located in Central Yakutia, Siberia. These five sites were selected based on their vegetation content, their spectral differences in color, the UAV flight angles, and the clarity of the UAV images, which were taken with automatic shutter and white balancing (Brieger et al. 2019). From each site, 35 raw UAV images were selected in order of acquisition, starting at the fifteenth image of the flight, to make up the backgrounds for the dataset; the first fifteen images were excluded because they often contain a visual representation of the research team, and any other images in which the research team is visible were likewise excluded. The 117 tree crowns were manually cut out in the GIMP software to ensure that they were all Larix trees. Of the tree crowns, 15% lie at the margin of the image, to make sure that the algorithm does not rely on a full tree crown in order to detect a tree.

The raw UAV images were cropped to 640 by 480 pixels at a resolution of 72 dpi and were later rescaled to 448 by 448 pixels in the process of dataset creation, giving 175 cropped backgrounds in total. The synthetic images and their corresponding annotations and masks were created using the cocosynth Python software provided by Adam Kelly (2019). The software is open source and available on GitHub: https://github.com/akTwelve/cocosynth.
The software takes the tree crowns, rescales and transforms them, and places up to three tree crowns on each of the provided backgrounds. The software also creates matching masks that are used by instance segmentation and object detection algorithms to learn the shapes and locations of the synthetic crowns. COCO annotation files with information about the crowns' names and labels are also generated. This format can be loaded into a variety of neural networks for training purposes.
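A simplified sketch of that compositing step using Pillow (cocosynth itself does more; the file names are hypothetical, and crowns are assumed to be RGBA cut-outs smaller than the background):

```python
import random
from PIL import Image

# Hypothetical file names; assumes the crown cut-out is an RGBA image
# smaller than the 448 x 448 background.
background = Image.open("backgrounds/EN18062_015.jpg").convert("RGB")
crown = Image.open("crowns/crown_001.png").convert("RGBA")

# Random rescale and rotation, as described above.
scale = random.uniform(0.5, 1.0)
crown = crown.resize((int(crown.width * scale), int(crown.height * scale)))
crown = crown.rotate(random.uniform(0, 360), expand=True)

# Random reposition; the alpha channel doubles as the instance mask.
x = random.randint(0, background.width - crown.width)
y = random.randint(0, background.height - crown.height)
background.paste(crown, (x, y), mask=crown)

mask = Image.new("L", background.size, 0)
mask.paste(crown.getchannel("A"), (x, y))  # binary mask for this instance
```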
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
MJ-COCO-2025 is a modified version of the MS-COCO-2017 dataset, in which the annotation errors have been automatically corrected using model-driven methods. The name "MJ" originates from the initials of Min Je Kim, the individual who updated the dataset. "MJ" also stands for "Modification & Justification," emphasizing that the modifications were not manually edited but were systematically validated through machine learning models to increase reliability and quality. Thus, MJ-COCO-2025 reflects both a personal identity and a commitment to improving the dataset through thoughtful modification, ensuring improved accuracy, reliability and consistency. The comparative results of the MS-COCO and MJ-COCO datasets are presented in Table 1 and Figure 1. The MJ-COCO-2025 dataset features several improvements, including fixes for group annotations, addition of missing annotations, and removal of redundant or overlapping labels. These refinements aim to improve training and evaluation performance in object detection tasks.
The re-labeled MJ-COCO-2025 dataset exhibits notable improvements in annotation quality compared to the original MS-COCO-2017 dataset. As shown in Table 1, it includes substantial increases in categories such as previously missing annotations and group annotations. At the same time, the dataset has been refined by reducing annotation noise through the removal of duplicates, resolution of challenging or debatable cases, and elimination of non-existent object annotations.
Table 1: Comparison of Class-wise Annotations: MS-COCO-2017 and MJ-COCO-2025.

| Class Names | MS-COCO | MJ-COCO | Difference | Class Names | MS-COCO | MJ-COCO | Difference |
|---|---|---|---|---|---|---|---|
| Airplane | 5,135 | 5,810 | 675 | Kite | 9,076 | 15,092 | 6,016 |
| Apple | 5,851 | 19,527 | 13,676 | Knife | 7,770 | 6,697 | -1,073 |
| Backpack | 8,720 | 10,029 | 1,309 | Laptop | 4,970 | 5,280 | 310 |
| Banana | 9,458 | 49,705 | 40,247 | Microwave | 1,673 | 1,755 | 82 |
| Baseball Bat | 3,276 | 3,517 | 241 | Motorcycle | 8,725 | 10,045 | 1,320 |
| Baseball Glove | 3,747 | 3,440 | -307 | Mouse | 2,262 | 2,377 | 115 |
| Bear | 1,294 | 1,311 | 17 | Orange | 6,399 | 18,416 | 12,017 |
| Bed | 4,192 | 4,177 | -15 | Oven | 3,334 | 4,310 | 976 |
| Bench | 9,838 | 9,784 | -54 | Parking Meter | 1,285 | 1,355 | 70 |
| Bicycle | 7,113 | 7,853 | 740 | Person | 262,465 | 435,252 | 172,787 |
| Bird | 10,806 | 13,346 | 2,540 | Pizza | 5,821 | 6,049 | 228 |
| Boat | 10,759 | 13,386 | 2,627 | Potted Plant | 8,652 | 11,252 | 2,600 |
| Book | 24,715 | 35,712 | 10,997 | Refrigerator | 2,637 | 2,728 | 91 |
| Bottle | 24,342 | 32,455 | 8,113 | Remote | 5,703 | 5,428 | -275 |
| Bowl | 14,358 | 13,591 | -767 | Sandwich | 4,373 | 3,925 | -448 |
| Broccoli | 7,308 | 14,275 | 6,967 | Scissors | 1,481 | 1,558 | 77 |
| Bus | 6,069 | 7,132 | 1,063 | Sheep | 9,509 | 12,813 | 3,304 |
| Cake | 6,353 | 8,968 | 2,615 | Sink | 5,610 | 5,969 | 359 |
| Car | 43,867 | 51,662 | 7,795 | Skateboard | 5,543 | 5,761 | 218 |
| Carrot | 7,852 | 15,411 | 7,559 | Skis | 6,646 | 8,945 | 2,299 |
| Cat | 4,768 | 4,895 | 127 | Snowboard | 2,685 | 2,565 | -120 |
| Cell Phone | 6,434 | 6,642 | 208 | Spoon | 6,165 | 6,156 | -9 |
| Chair | 38,491 | 56,750 | 18,259 | Sports Ball | 6,347 | 6,060 | -287 |
| Clock | 6,334 | 7,618 | 1,284 | Stop Sign | 1,983 | 2,684 | 701 |
| Couch | 5,779 | 5,598 | -181 | Suitcase | 6,192 | 7,447 | 1,255 |
| Cow | 8,147 | 8,990 | 843 | Surfboard | 6,126 | 6,175 | 49 |
| Cup | 20,650 | 22,545 | 1,895 | Teddy Bear | 4,793 | 6,432 | 1,639 |
| Dining Table | 15,714 | 16,569 | 855 | Tennis Racket | 4,812 | 4,932 | 120 |
| Dog | 5,508 | 5,870 | 362 | Tie | 6,496 | 6,048 | -448 |
| Donut | 7,179 | 11,622 | 4,443 | … | … | … | … |
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This Zenodo dataset contains the Common Objects in Context (COCO) files linked to the following publication:
Verhaegen, G., Cimoli, E., & Lindsay, D. (2021). Life beneath the ice: jellyfish and ctenophores from the Ross Sea, Antarctica, with an image-based training set for machine learning. Biodiversity Data Journal.
Each COCO zip folder contains an "annotations" folder with a JSON file and an "images" folder with the annotated images.
Details on each COCO zip folder:
Beroe_sp_A_images-coco 1.0.zip
COCO annotations of Beroe sp. A for the following 114 images:
MCMEC2018_20181116_NIKON_Beroe_sp_A_c_1 to MCMEC2018_20181116_NIKON_Beroe_sp_A_c_16, MCMEC2018_20181125_NIKON_Beroe_sp_A_d_1 to MCMEC2018_20181125_NIKON_Beroe_sp_A_d_57, MCMEC2018_20181127_NIKON_Beroe_sp_A_e_1 to MCMEC2018_20181127_NIKON_Beroe_sp_A_e_2, MCMEC2019_20191116_SONY_Beroe_sp_A_a_1 to MCMEC2019_20191116_SONY_Beroe_sp_A_a_28, and MCMEC2019_20191127_SONY_Beroe_sp_A_f_1 to MCMEC2019_20191127_SONY_Beroe_sp_A_f_12
Beroe_sp_B_images-coco 1.0.zip
COCO annotations of Beroe sp. B for the following 2 images:
MCMEC2019_20191115_SONY_Beroe_sp_B_a_1 and MCMEC2019_20191115_SONY_Beroe_sp_B_a_2
Callianira_cristata_images-coco 1.0.zip
COCO annotations of Callianira cristata for the following 21 images:
MCMEC2019_20191120_SONY_Callianira_cristata_b_1 to MCMEC2019_20191120_SONY_Callianira_cristata_b_21
Diplulmaris_antarctica_images-coco 1.0.zip
COCO annotations of Diplulmaris antarctica for the following 83 images:
MCMEC2019_20191116_SONY_Diplulmaris_antarctica_a_1 to MCMEC2019_20191116_SONY_Diplulmaris_antarctica_a_9, and MCMEC2019_20191201_SONY_Diplulmaris_antarctica_c_1 to MCMEC2019_20191201_SONY_Diplulmaris_antarctica_c_74
Koellikerina_maasi_images-coco 1.0.zip
COCO annotations of Koellikerina maasi for the following 49 images:
MCMEC2018_20181127_NIKON_Koellikerina_maasi_b_1 to MCMEC2018_20181127_NIKON_Koellikerina_maasi_b_4, MCMEC2018_20181129_NIKON_Koellikerina_maasi_c_1 to MCMEC2018_20181129_NIKON_Koellikerina_maasi_c_29, and MCMEC2019_20191126_SONY_Koellikerina_maasi_a_1 to MCMEC2019_20191126_SONY_Koellikerina_maasi_a_16
Leptomedusa_sp_A-coco 1.0.zip
COCO annotations of Leptomedusa sp. A for Figure 5 (see paper).
Leuckartiara_brownei_images-coco 1.0.zip
COCO annotations of Leuckartiara brownei for the following 48 images:
MCMEC2018_20181129_NIKON_Leuckartiara_brownei_b_1 to MCMEC2018_20181129_NIKON_Leuckartiara_brownei_b_27, MCMEC2018_20181129_NIKON_Leuckartiara_brownei_c_1 to MCMEC2018_20181129_NIKON_Leuckartiara_brownei_c_6, and MCMEC2019_20191116_SONY_Leuckartiara_brownei_a_1 to MCMEC2019_20191116_SONY_Leuckartiara_brownei_a_15
MCMEC2019_20191115_SONY_Mertensiidae_sp_A_a_3-coco 1.0.zip
COCO annotations of Mertensiidae sp. A for the following video (total of 1847 frames): MCMEC2019_20191115_SONY_Mertensiidae_sp_A_a_3 (https://youtu.be/0W2HHLW71Pw)
MCMEC2019_20191116_SONY_Leuckartiara_brownei_a_3-coco 1.0.zip
COCO annotations of Leuckartiara brownei for the following video (total of 1367 frames): MCMEC2019_20191116_SONY_Leuckartiara_brownei_a_3 (https://youtu.be/dEIbVYlF_TQ)
MCMEC2019_20191122_SONY_Callianira_cristata_a_1-coco 1.0.zip
COCO annotations of Callianira cristata for the following video (total of 2423 frames): MCMEC2019_20191122_SONY_Callianira_cristata_a_1 (https://youtu.be/30g9CvYh5JE)
MCMEC2019_20191122_SONY_Leptomedusa_sp_B_a_1-coco 1.0.zip
COCO annotations of Leptomedusa sp. B for the following video (total of 1164 frames): MCMEC2019_20191122_SONY_Leptomedusa_sp_B_a_1 (https://youtu.be/hrufuPQ7F8U)
MCMEC2019_20191126_SONY_Koellikerina_maasi_a_1-coco 1.0.zip
COCO annotations of Koellikerina maasi for the following video (total of 1643 frames): MCMEC2019_20191126_SONY_Koellikerina_maasi_a_1 (https://youtu.be/QiBPf_HYrQ8)
MCMEC2019_20191129_SONY_Mertensiidae_sp_A_b_1-coco 1.0.zip
COCO annotations of Mertensiidae sp. A for the following video (total of 239 frames): MCMEC2019_20191129_SONY_Mertensiidae_sp_A_b_1 (https://youtu.be/pvXYlQGZIVg)
MCMEC2019_20191129_SONY_Pyrostephos_vanhoeffeni_b_2-coco 1.0.zip
COCO annotations of Pyrostephos vanhoeffeni for the following video (total of 444 frames): MCMEC2019_20191129_SONY_Pyrostephos_vanhoeffeni_b_2 (https://youtu.be/2rrQCybEg0Q)
MCMEC2019_20191129_SONY_Pyrostephos_vanhoeffeni_b_3-coco 1.0.zip
COCO annotations of Pyrostephos vanhoeffeni for the following video (total of 683 frames): MCMEC2019_20191129_SONY_Pyrostephos_vanhoeffeni_b_3 (https://youtu.be/G9tev_gdUvQ)
MCMEC2019_20191129_SONY_Pyrostephos_vanhoeffeni_b_4-coco 1.0.zip
COCO annotations of Pyrostephos vanhoeffeni for the following video (total of 1127 frames): MCMEC2019_20191129_SONY_Pyrostephos_vanhoeffeni_b_4 (https://youtu.be/NfJjKBRh5Hs)
MCMEC2019_20191130_SONY_Beroe_sp_A_b_1-coco 1.0.zip
COCO annotations of Beroe sp. A for the following video (total of 2171 frames): MCMEC2019_20191130_SONY_Beroe_sp_A_b_1 (https://youtu.be/kGBUQ7ZtH9U)
MCMEC2019_20191130_SONY_Beroe_sp_A_b_2-coco 1.0.zip
COCO annotations of Beroe sp. A for the following video (total of 359 frames): MCMEC2019_20191130_SONY_Beroe_sp_A_b_2 (https://youtu.be/Vbl_KEmPNmU)
Mertensiidae_sp_A_images-coco 1.0.zip
COCO annotations of Mertensiidae sp. A for the following 49 images:
MCMEC2018_20181127_NIKON_Mertensiidae_sp_A_c_1 to MCMEC2018_20181127_NIKON_Mertensiidae_sp_A_c_2, MCMEC2018_20181127_NIKON_Mertensiidae_sp_A_f_1 to MCMEC2018_20181127_NIKON_Mertensiidae_sp_A_f_8, MCMEC2018_20181129_NIKON_Mertensiidae_sp_A_d_1 to MCMEC2018_20181129_NIKON_Mertensiidae_sp_A_d_13, MCMEC2018_20181201_ROV_Mertensiidae_sp_A_e_1 to MCMEC2018_20181201_ROV_Mertensiidae_sp_A_e_15, and MCMEC2019_20191115_SONY_Mertensiidae_sp_A_a_1 to MCMEC2019_20191115_SONY_Mertensiidae_sp_A_a_11
Pyrostephos_vanhoeffeni_images-coco 1.0.zip
COCO annotations of Pyrostephos vanhoeffeni for the following 14 images: MCMEC2019_20191125_SONY_Pyrostephos_vanhoeffeni_a_1 to MCMEC2019_20191125_SONY_Pyrostephos_vanhoeffeni_a_8, MCMEC2019_20191129_SONY_Pyrostephos_vanhoeffeni_b_1 to MCMEC2019_20191129_SONY_Pyrostephos_vanhoeffeni_b_6
Solmundella_bitentaculata_images-coco 1.0.zip
COCO annotations of Solmundella bitentaculata for the following 13 images: MCMEC2018_20181127_NIKON_Solmundella_bitentaculata_a_1 to MCMEC2018_20181127_NIKON_Solmundella_bitentaculata_a_13
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset comprises 4 different tennis actions; each action has 500 images and a COCO-format JSON file. The images were extracted frame by frame from self-recorded videos and manually classified according to the different tennis actions.
The actions in this dataset (the COCO-format category name is given in brackets):
1. backhand shot (backhand)
2. forehand shot (forehand)
3. ready position (ready_position)
4. serve (serve)
We organize two main directories: annotations and images.
- annotations: the COCO-format JSON files of the actions
- images: the images of the actions, classified into four folders according to the four actions
We used COCO-Annotator to annotate and categorize the human actions. The annotated keypoints are the following (referring to OpenPose's annotation): ["nose", "left_eye", "right_eye", "left_ear", "right_ear", "left_shoulder", "right_shoulder", "left_elbow", "right_elbow", "left_wrist", "right_wrist", "left_hip", "right_hip", "left_knee", "right_knee", "left_ankle", "right_ankle", "neck"]
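A sketch of reading these keypoints back from one of the JSON files (the file name is hypothetical; COCO stores keypoints as a flat [x, y, v] list):

```python
import json

KEYPOINTS = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle", "neck",
]

# Hypothetical file name. v is the visibility flag (0 = not labelled,
# 1 = labelled but not visible, 2 = labelled and visible).
with open("annotations/forehand.json") as f:
    coco = json.load(f)

ann = coco["annotations"][0]
triplets = zip(*[iter(ann["keypoints"])] * 3)
for name, (x, y, v) in zip(KEYPOINTS, triplets):
    print(f"{name}: ({x}, {y}) visibility={v}")
```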
Size on disk is 508 MB (533,372,928 bytes).
National Taichung University of Science and Technology, National Kaohsiung University of Science and Technology
Computer Vision, Image Processing, Tennis, Action Recognition
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Abstract
This paper presents SubPipe, an underwater dataset for SLAM, object detection, and image segmentation. SubPipe has been recorded using a lightweight autonomous underwater vehicle (LAUV), operated by OceanScan MST, and carrying a sensor suite including two cameras, a side-scan sonar, and an inertial navigation system, among other sensors. The AUV has been deployed in a pipeline inspection environment with a submarine pipe partially covered by sand. The AUV's pose ground truth is estimated from the navigation sensors. The side-scan sonar and RGB images include object detection and segmentation annotations, respectively. State-of-the-art segmentation, object detection, and SLAM methods are benchmarked on SubPipe to demonstrate the dataset's challenges and opportunities for leveraging computer vision algorithms. To the authors' knowledge, this is the first annotated underwater dataset providing a real pipeline inspection scenario. The dataset and experiments are publicly available online.
On Zenodo we provide three versions of SubPipe: the full version (SubPipe.zip, ~80 GB unzipped) and two subsamples (SubPipeMini.zip, ~12 GB unzipped, and SubPipeMini2.zip, ~16 GB unzipped). Both subsamples are parts of the entire dataset (SubPipe.zip). SubPipeMini is a subset containing the semantic segmentation data, with camera footage of the underwater pipeline. SubPipeMini2, on the other hand, focuses on the underwater side-scan sonar images of the seabed, including ground-truth object detection bounding boxes for the pipeline.
For (re-)using/publishing SubPipe, please include the following copyright text:
SubPipe is a public dataset of a submarine outfall pipeline, property of Oceanscan-MST. This dataset was acquired with a Light Autonomous Underwater Vehicle by Oceanscan-MST, within the scope of Challenge Camp 1 of the H2020 REMARO project.
More information about OceanScan-MST can be found at this link.
Cam0 — GoPro Hero 10
Camera parameters:
Resolution: 1520×2704
fx = 1612.36
fy = 1622.56
cx = 1365.43
cy = 741.27
k1, k2, p1, p2 = [−0.247, 0.0869, −0.006, 0.001]
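These parameters can be assembled into an intrinsic matrix for undistortion, e.g. with OpenCV (a sketch; the image file name is hypothetical):

```python
import cv2
import numpy as np

# Intrinsics and distortion assembled from the Cam0 parameters above.
K = np.array([[1612.36,    0.0, 1365.43],
              [   0.0, 1622.56,  741.27],
              [   0.0,    0.0,     1.0]])
dist = np.array([-0.247, 0.0869, -0.006, 0.001])  # k1, k2, p1, p2

frame = cv2.imread("cam0_frame.png")  # hypothetical file name
undistorted = cv2.undistort(frame, K, dist)
```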
Side-scan Sonars
Each sonar image was created after every 20 pings (i.e. every 20 new lines), which corresponds to approximately 1 image per second.
Regarding the object detection annotations, we provide both COCO and YOLO formats for each annotation. A single COCO annotation file is provided per chunk and per frequency (low frequency vs. high frequency), whereas the YOLO annotations are provided per SSS image file.
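For reference, the two box conventions relate as in this small sketch (COCO uses absolute top-left pixel coordinates, YOLO uses normalised centre coordinates; the example box is made up):

```python
def coco_bbox_to_yolo(bbox, img_w, img_h):
    """COCO [x, y, w, h] (pixels, top-left origin) -> YOLO
    [x_center, y_center, w, h], normalised to [0, 1]."""
    x, y, w, h = bbox
    return [(x + w / 2) / img_w, (y + h / 2) / img_h, w / img_w, h / img_h]

# e.g. a hypothetical box on a 2500 x 500 low-frequency side-scan image
print(coco_bbox_to_yolo([1200.0, 200.0, 100.0, 60.0], 2500, 500))
```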
Metadata about the side-scan sonar images contained in this dataset:
| | Low frequency (LF) | High frequency (HF) | Total |
|---|---|---|---|
| Images for object detection | 5,000 (image size 2500 × 500) | 5,030 (image size 5000 × 500) | 10,030 |
| Annotations | 3,163 | 3,172 | 6,335 |
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset comprises the following components:
1. SHdataset: It contains 12,051 microscopic images taken from 103 urine samples, along with their corresponding segmentation masks manually annotated for Schistosoma haematobium eggs. The dataset is randomly partitioned into 80-20 train-test splits.
2. diagnosis_test_dataset: This dataset includes 65 clinical urine samples. Each sample consists of 117 Field-of-View (FoV) images required to capture the entire filter membrane. Additionally, the dataset includes the diagnosis results provided by an expert microscopist.
Samples were obtained from school-age children who had observed the presence of blood in their urine. These clinical urine samples were collected in 20 mL sterile universal containers as part of a field study conducted in the Federal Capital Territory (FCT), Abuja, Nigeria, in collaboration with the University of Lagos, Nigeria. The study received ethical approval from the Federal Capital Territory Health Research Ethics Committee (FCT-HREC) Nigeria (Reference No. FHREC/2019/01/73/18-07-19).
The standard urine filtration procedure was used to process the clinical urine samples. Specifically, 10 mL of urine was passed through a 13 mm diameter filter membrane with a pore size of 0.2 μm. After filtration, the membrane was placed on a microscopy glass slide and covered with a coverslip to enhance the flatness of the membrane for image capture. The images were acquired using a digital microscope called the Schistoscope and were saved in PNG format with a resolution of 2028 × 1520 pixels and a size of approximately 2 MB.
The annotation and microscopy analysis were performed by a team of two experts from the ANDI Centre of Excellence for Malaria Diagnosis, College of Medicine, University of Lagos, and Centre de Recherches Medicales des Lambaréné, CERMEL, Lambarene. The experts used the COCO annotation tool to annotate the 12,051 images, creating polygons around the Schistosoma haematobium eggs. The output of the annotation process was a JSON file containing specific details about the image storage location, size, filename, and coordinates of all annotated regions.
The segmentation mask images were generated from the JSON file using a Python program. The SHdataset was used to develop an automated diagnosis framework for urogenital schistosomiasis, while the diagnosis_test_dataset was used to compare the performance of the developed framework with the results from the expert microscopist.
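A sketch of how such masks can be regenerated from the JSON file with pycocotools (the original program is not distributed here; the file path is hypothetical):

```python
import numpy as np
from pycocotools.coco import COCO

# Hypothetical annotation path; one binary mask per image, as described above.
coco = COCO("SHdataset/annotations.json")
img_info = coco.loadImgs(coco.getImgIds()[0])[0]
anns = coco.loadAnns(coco.getAnnIds(imgIds=img_info["id"]))

mask = np.zeros((img_info["height"], img_info["width"]), dtype=np.uint8)
for ann in anns:
    mask = np.maximum(mask, coco.annToMask(ann))  # union of all egg polygons
```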
For further details about the dataset, more information can be found in the following articles:
1. Oyibo, P., Jujjavarapu, S., Meulah, B., Agbana, T., Braakman, I., van Diepen, A., Bengtson, M., van Lieshout, L., Oyibo, W., Vdovine, G., and Diehl, J.C. (2022). "Schistoscope: an automated microscope with artificial intelligence for detection of Schistosoma haematobium eggs in resource-limited settings." Micromachines, 13(5), p.643.
2. Oyibo, P., Meulah, B., Bengtson, M., van Lieshout, L., Oyibo, W., Diehl, J.C., Vdovine, G., and Agbana, T. (2023). "Two-stage automated diagnosis framework for urogenital schistosomiasis in microscopy images from low-resource settings." Journal of Medical Imaging. [Accepted Manuscript]
https://data.4tu.nl/info/fileadmin/user_upload/Documenten/4TU.ResearchData_Restricted_Data_2022.pdf
This file contains the annotations for the ConfLab dataset, including actions (speaking status), pose, and F-formations.
------------------
./actions/speaking_status:
./processed: the processed speaking status files, aggregated into a single data frame per segment. Skipped rows in the raw data (see https://josedvq.github.io/covfee/docs/output for details) have been imputed using the code at: https://github.com/TUDelft-SPC-Lab/conflab/tree/master/preprocessing/speaking_status
The processed annotations consist of:
./speaking: The first row contains person IDs matching the sensor IDs; the remaining rows contain binary speaking status annotations at 60 fps for the corresponding 2-min video segment (7200 frames).
./confidence: Same as above. These annotations reflect the continuous-valued rating of confidence of the annotators in their speaking annotation.
To load these files with pandas: pd.read_csv(p, index_col=False)
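Expanding on that hint, a short sketch (the segment file name is hypothetical):

```python
import pandas as pd

# Hypothetical file name. Columns are person IDs (taken from the first
# row); each subsequent row is one frame at 60 fps (7200 frames per
# 2-minute segment).
df = pd.read_csv("actions/speaking_status/processed/speaking/seg1.csv",
                 index_col=False)
speaking_fraction = df.mean()  # fraction of frames each person spoke
print(speaking_fraction.sort_values(ascending=False).head())
```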
./raw-covfee.zip: the raw outputs from speaking status annotation for each of the eight annotated 2-min video segments. These were output by the covfee annotation tool (https://github.com/josedvq/covfee)
Annotations were done at 60 fps.
--------------------
./pose:
./coco: the processed pose files in coco JSON format, aggregated into a single data frame per video segment. These files have been generated from the raw files using the code at: https://github.com/TUDelft-SPC-Lab/conflab-keypoints
To load in Python: f = json.load(open('/path/to/cam2_vid3_seg1_coco.json'))
The skeleton structure (limbs) is contained within each file in:
f['categories'][0]['skeleton']
and keypoint names at:
f['categories'][0]['keypoints']
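Putting the two together (assuming, as is usual for COCO, that the skeleton indices are 1-based):

```python
import json

f = json.load(open('/path/to/cam2_vid3_seg1_coco.json'))
category = f['categories'][0]
names = category['keypoints']

# Print each limb as a pair of keypoint names; 1-based indexing is an
# assumption carried over from the standard COCO convention.
for a, b in category['skeleton']:
    print(names[a - 1], '->', names[b - 1])
```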
./raw-covfee.zip: the raw outputs from continuous pose annotation. These were output by the covfee annotation tool (https://github.com/josedvq/covfee)
Annotations were done at 60 fps.
---------------------
./f_formations:
seg 2: 14:00 onwards, for videos of the form x2xxx.MP4 in /video/raw/ for the relevant cameras (2,4,6,8,10).
seg 3: for videos of the form x3xxx.MP4 in /video/raw/ for the relevant cameras (2,4,6,8,10).
Note that camera 10 doesn't include meaningful subject information/body parts that are not already covered in camera 8.
First column: time stamp
Second column: "()" delineates groups, "<>" delineates subjects, cam X indicates the best camera view for which a particular group exists.
phone.csv: time stamp (pertaining to seg3), corresponding group, ID of person using the phone
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Fish Occlusion Dataset (FOD) is a large-scale image dataset specifically designed for training and evaluating instance segmentation models under challenging underwater occlusion scenarios. It contains 14,376 underwater images and 144,894 pixel-level annotated fish instances. The dataset includes 1,376 real images acquired under controlled conditions and 13,000 synthetic images generated using compositing techniques to simulate diverse occlusion patterns and fish densities. All fish instances are labeled with both segmentation masks and occlusion-level categories (whole, part, fragment), allowing for fine-grained performance evaluation under different occlusion levels. The dataset follows the MS COCO annotation format and is intended to support the development of robust segmentation and occlusion-reasoning models in underwater environments.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Cell microscopy images with cell and nucleus segmentations in COCO annotation format