37 datasets found
  1. MS COCO Dataset

    • paperswithcode.com
    Updated Apr 15, 2024
    Cite
    Tsung-Yi Lin; Michael Maire; Serge Belongie; Lubomir Bourdev; Ross Girshick; James Hays; Pietro Perona; Deva Ramanan; C. Lawrence Zitnick; Piotr Dollár, MS COCO Dataset [Dataset]. https://paperswithcode.com/dataset/coco
    Dataset updated
    Apr 15, 2024
    Authors
    Tsung-Yi Lin; Michael Maire; Serge Belongie; Lubomir Bourdev; Ross Girshick; James Hays; Pietro Perona; Deva Ramanan; C. Lawrence Zitnick; Piotr Dollár
    Description

    The MS COCO (Microsoft Common Objects in Context) dataset is a large-scale object detection, segmentation, key-point detection, and captioning dataset. The dataset consists of 328K images.

    Splits: The first version of the MS COCO dataset was released in 2014. It contains 164K images split into training (83K), validation (41K) and test (41K) sets. In 2015, an additional test set of 81K images was released, including all the previous test images and 40K new images.

    Based on community feedback, in 2017 the training/validation split was changed from 83K/41K to 118K/5K. The new split uses the same images and annotations. The 2017 test set is a subset of 41K images of the 2015 test set. Additionally, the 2017 release contains a new unannotated dataset of 123K images.

    Annotations: The dataset has annotations for:

    • object detection: bounding boxes and per-instance segmentation masks for 80 object categories
    • captioning: natural language descriptions of the images (see MS COCO Captions)
    • keypoint detection: more than 200,000 images and 250,000 person instances labeled with keypoints (17 possible keypoints, such as left eye, nose, right hip, right ankle)
    • stuff image segmentation: per-pixel segmentation masks with 91 stuff categories, such as grass, wall, sky (see MS COCO Stuff)
    • panoptic segmentation: full scene segmentation, with 80 thing categories (such as person, bicycle, elephant) and a subset of 91 stuff categories (grass, sky, road)
    • dense pose: more than 39,000 images and 56,000 person instances labeled with DensePose annotations, where each labeled person is annotated with an instance id and a mapping between the image pixels belonging to that person's body and a template 3D model

    The annotations are publicly available only for training and validation images.
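
    As a quick illustration, here is a minimal Python sketch of reading these annotations with the widely used pycocotools package; it assumes the standard 2017 annotation files have been downloaded locally (the path below is an assumption):

    from pycocotools.coco import COCO

    coco = COCO("annotations/instances_val2017.json")  # assumed local path

    # List the 80 object detection categories.
    cats = coco.loadCats(coco.getCatIds())
    print(len(cats), "categories, e.g.", cats[0]["name"])

    # Load all annotations (boxes and segmentation masks) for one image.
    img_id = coco.getImgIds()[0]
    anns = coco.loadAnns(coco.getAnnIds(imgIds=img_id))
    for ann in anns:
        print(ann["category_id"], ann["bbox"])  # bbox is [x, y, width, height]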

  2. Microsoft Coco Dataset

    • universe.roboflow.com
    zip
    Updated Mar 23, 2025
    Cite
    Microsoft (2025). Microsoft Coco Dataset [Dataset]. https://universe.roboflow.com/microsoft/coco/model/3
    Available download formats: zip
    Dataset updated
    Mar 23, 2025
    Dataset authored and provided by
    Microsoft (http://microsoft.com/)
    Variables measured
    Object Bounding Boxes
    Description

    Microsoft Common Objects in Context (COCO) Dataset

    The Common Objects in Context (COCO) dataset is a widely recognized collection designed to spur object detection, segmentation, and captioning research. Created by Microsoft, COCO provides annotations, including object categories, keypoints, and more, making it a valuable asset for machine learning practitioners and researchers. Today, many model architectures are benchmarked against COCO, which has enabled a standard system by which architectures can be compared.

    While COCO is often touted to comprise over 300k images, it's important to understand that this count spans its diverse annotation formats (keypoints, among others). Specifically, the labeled dataset for object detection stands at 123,272 images.

    The full object detection labeled dataset is made available here, ensuring researchers have access to the most comprehensive data for their experiments. With that said, COCO has not released their test set annotations, meaning the test data doesn't come with labels. Thus, this data is not included in the dataset.

    The Roboflow team has worked extensively with COCO. Here are a few links that may be helpful as you get started working with this dataset:

  3. Databases in MS COCO (json) format

    • figshare.com
    • springernature.figshare.com
    zip
    Updated Nov 20, 2020
    Cite
    Robert Klopfleisch; Andreas Maier; Marc Aubreville; Christof Bertram; Christian Marzahl (2020). Databases in MS COCO (json) format [Dataset]. http://doi.org/10.6084/m9.figshare.12805244.v1
    Available download formats: zip
    Dataset updated
    Nov 20, 2020
    Dataset provided by
    figshare
    Authors
    Robert Klopfleisch; Andreas Maier; Marc Aubreville; Christof Bertram; Christian Marzahl
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Databases in MS COCO (json) format

  4. Style Transfer for Object Detection in Art

    • kaggle.com
    zip
    Updated Mar 11, 2021
    Cite
    David Kadish (2021). Style Transfer for Object Detection in Art [Dataset]. https://www.kaggle.com/davidkadish/style-transfer-for-object-detection-in-art
    Available download formats: zip (3762347804 bytes)
    Dataset updated
    Mar 11, 2021
    Authors
    David Kadish
    Description

    Context

    Despite recent advances in object detection using deep learning neural networks, these neural networks still struggle to identify objects in art images such as paintings and drawings. This challenge is known as the cross depiction problem and it stems in part from the tendency of neural networks to prioritize identification of an object's texture over its shape. In this paper we propose and evaluate a process for training neural networks to localize objects - specifically people - in art images. We generated a large dataset for training and validation by modifying the images in the COCO dataset using AdaIN style transfer (style-coco.tar.xz). This dataset was used to fine-tune a Faster R-CNN object detection network (2020-12-10_09-45-15_58672_resnet152_stylecoco_epoch_15.pth), which was then tested on the existing People-Art testing dataset (PeopleArt-Coco.tar.xz). The result is a significant improvement on the state of the art and a new way forward for creating datasets to train neural networks to process art images.

    Content

    • 2020-12-10_09-45-15_58672_resnet152_stylecoco_epoch_15.pth: trained object detection network (Faster R-CNN with a ResNet152 backbone pretrained on ImageNet) for use with PyTorch
    • PeopleArt-Coco.tar.xz: People-Art dataset with COCO-formatted annotations (original at https://github.com/BathVisArtData/PeopleArt)
    • style-coco.tar.xz: stylized COCO dataset containing only the person category, used to train the network above
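
    A hedged sketch of loading that checkpoint with torchvision follows; the backbone construction and the two-class head (background + person) are assumptions on our part, so consult the repository linked under "Code" for the authors' actual scripts:

    import torch
    from torchvision.models.detection import FasterRCNN
    from torchvision.models.detection.backbone_utils import resnet_fpn_backbone

    # Rebuild a Faster R-CNN with a ResNet152-FPN backbone; num_classes=2
    # (background + person) is an assumption for this person-only dataset.
    backbone = resnet_fpn_backbone(backbone_name="resnet152", weights=None)
    model = FasterRCNN(backbone, num_classes=2)

    checkpoint = torch.load(
        "2020-12-10_09-45-15_58672_resnet152_stylecoco_epoch_15.pth",
        map_location="cpu",
    )
    model.load_state_dict(checkpoint)  # assumes the file holds a plain state_dict
    model.eval()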

    Code

    The code is available on github at https://github.com/dkadish/Style-Transfer-for-Object-Detection-in-Art

    Citing

    If you are using this code or the concept of style transfer for object detection in art, please cite our paper (https://arxiv.org/abs/2102.06529):

    D. Kadish, S. Risi, and A. S. Løvlie, “Improving Object Detection in Art Images Using Only Style Transfer,” Feb. 2021.

  5. Parcel2D Real - A real-world image dataset of cuboid-shaped parcels with 2D and 3D annotations

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Jul 13, 2023
    Cite
    Alexander Naumann; Felix Hertlein; Benchun Zhou; Laura Dörr; Kai Furmans (2023). Parcel2D Real - A real-world image dataset of cuboid-shaped parcels with 2D and 3D annotations [Dataset]. http://doi.org/10.5281/zenodo.8031971
    Available download formats: zip
    Dataset updated
    Jul 13, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Alexander Naumann; Felix Hertlein; Benchun Zhou; Laura Dörr; Kai Furmans
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Real-world dataset of ~400 images of cuboid-shaped parcels with full 2D and 3D annotations in the COCO format.

    Relevant computer vision tasks:

    • bounding box detection
    • instance segmentation
    • keypoint estimation
    • 3D bounding box estimation
    • 3D voxel reconstruction (.binvox files)
    • 3D reconstruction (.obj files)

    For details, see our paper and project page.

    If you use this resource for scientific research, please consider citing

    @inproceedings{naumannScrapeCutPasteLearn2022,
      title    = {Scrape, Cut, Paste and Learn: Automated Dataset Generation Applied to Parcel Logistics},
      author    = {Naumann, Alexander and Hertlein, Felix and Zhou, Benchun and Dörr, Laura and Furmans, Kai},
      booktitle  = {{{IEEE Conference}} on {{Machine Learning}} and Applications ({{ICMLA}})},
      date     = 2022
    }

  6. Parcel3D - A Synthetic Dataset of Damaged and Intact Parcel Images with 2D and 3D Annotations

    • zenodo.org
    • explore.openaire.eu
    • +1more
    zip
    Updated Jul 13, 2023
    Cite
    Alexander Naumann; Felix Hertlein; Laura Dörr; Kai Furmans (2023). Parcel3D - A Synthetic Dataset of Damaged and Intact Parcel Images with 2D and 3D Annotations [Dataset]. http://doi.org/10.5281/zenodo.8032204
    Available download formats: zip
    Dataset updated
    Jul 13, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Alexander Naumann; Felix Hertlein; Laura Dörr; Kai Furmans
    Description

    Synthetic dataset of over 13,000 images of damaged and intact parcels with full 2D and 3D annotations in the COCO format. For details see our paper and for visual samples our project page.


    Relevant computer vision tasks:

    • bounding box detection
    • classification
    • instance segmentation
    • keypoint estimation
    • 3D bounding box estimation
    • 3D voxel reconstruction
    • 3D reconstruction

    The dataset is for academic research use only, since it uses resources with restrictive licenses.
    For a detailed description of how the resources are used, we refer to our paper and project page.

    Licenses of the resources in detail:

    You can use our textureless models (i.e. the obj files) of damaged parcels under CC BY 4.0 (note that this does not apply to the textures).

    If you use this resource for scientific research, please consider citing

    @inproceedings{naumannParcel3DShapeReconstruction2023,
      author  = {Naumann, Alexander and Hertlein, Felix and D\"orr, Laura and Furmans, Kai},
      title   = {Parcel3D: Shape Reconstruction From Single RGB Images for Applications in Transportation Logistics},
      booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
      month   = {June},
      year   = {2023},
      pages   = {4402-4412}
    }

  7. Small Object Aerial Person Detection Dataset

    • data.niaid.nih.gov
    • zenodo.org
    Updated Apr 5, 2023
    Cite
    Rafael Makrigiorgis (2023). Small Object Aerial Person Detection Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7740080
    Dataset updated
    Apr 5, 2023
    Dataset provided by
    Rafael Makrigiorgis
    Panayiotis Kolios
    Christos Kyrkou
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Small Object Aerial Person Detection Dataset:

    The aerial dataset publication comprises a collection of frames captured from unmanned aerial vehicles (UAVs) during flights over the University of Cyprus campus and Civil Defense exercises. The dataset is primarily intended for people detection, with a focus on detecting small objects due to the top-view perspective of the images. The dataset includes annotations generated in popular formats such as YOLO, COCO, and VOC, making it highly versatile and accessible for a wide range of applications. Overall, this aerial dataset publication represents a valuable resource for researchers and practitioners working in the field of computer vision and machine learning, particularly those focused on people detection and related applications.

        Subset       Images   People
        Training     2092     40687
        Validation   523      10589
        Testing      521      10432

    It is advised to further enhance the dataset by probabilistically applying random augmentations to each image before adding it to the training batch. Possible transformations include geometric ones (rotations, translations, horizontal axis mirroring, cropping, and zooming) as well as image manipulations (illumination changes, color shifting, blurring, sharpening, and shadowing).
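
    A short sketch of such a pipeline using torchvision; the probabilities and magnitudes below are illustrative assumptions, not values prescribed by the dataset authors:

    import torchvision.transforms as T

    # Illustrative probabilistic augmentations; note that for detection,
    # geometric transforms must also be applied to the bounding boxes
    # (e.g. with a library such as albumentations that supports bbox targets).
    augment = T.Compose([
        T.RandomHorizontalFlip(p=0.5),                          # mirroring
        T.RandomApply([T.RandomRotation(degrees=10)], p=0.3),   # rotation
        T.RandomApply([T.ColorJitter(0.4, 0.4, 0.4)], p=0.3),   # illumination/color
        T.RandomApply([T.GaussianBlur(kernel_size=3)], p=0.2),  # blurring
    ])
    # augmented = augment(image)  # applied per PIL image before batching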

  8. MetaGraspNet Difficulty 1

    • kaggle.com
    zip
    Updated Mar 19, 2022
    Cite
    Yuhao Chen (2022). MetaGraspNet Difficulty 1 [Dataset]. https://www.kaggle.com/datasets/metagrasp/metagraspnetdifficulty1-easy
    Available download formats: zip (4103890817 bytes)
    Dataset updated
    Mar 19, 2022
    Authors
    Yuhao Chen
    License

    Attribution-NoDerivs 4.0 (CC BY-ND 4.0): https://creativecommons.org/licenses/by-nd/4.0/
    License information was derived automatically

    Description

    MetaGraspNet dataset

    This repository contains the MetaGraspNet Dataset described in the paper "MetaGraspNet: A Large-Scale Benchmark Dataset for Vision-driven Robotic Grasping via Physics-based Metaverse Synthesis" (https://arxiv.org/abs/2112.14663).

    There has been increasing interest in smart factories powered by robotics systems to tackle repetitive, laborious tasks. One particularly impactful yet challenging task in robotics-powered smart factory applications is robotic grasping: using robotic arms to grasp objects autonomously in different settings. Robotic grasping requires a variety of computer vision tasks such as object detection, segmentation, grasp prediction, pick planning, etc. While significant progress has been made in leveraging machine learning for robotic grasping, particularly with deep learning, a big challenge remains in the need for large-scale, high-quality RGBD datasets that cover a wide diversity of scenarios and permutations.

    To tackle this big, diverse data problem, we are inspired by the recent rise of the metaverse concept, which has greatly closed the gap between virtual worlds and the physical world. In particular, metaverses allow us to create digital twins of real-world manufacturing scenarios and to virtually create different scenarios from which large volumes of data can be generated for training models. We present MetaGraspNet: a large-scale benchmark dataset for vision-driven robotic grasping via physics-based metaverse synthesis. The proposed dataset contains 100,000 images and 25 different object types, and is split into 5 difficulties to evaluate object detection and segmentation model performance in different grasping scenarios. We also propose a new layout-weighted performance metric alongside the dataset for evaluating object detection and segmentation performance in a manner that is more appropriate for robotic grasp applications compared to existing general-purpose performance metrics. This repository contains the first phase of the MetaGraspNet benchmark dataset, which includes detailed object detection, segmentation, and layout annotations, and a script for the layout-weighted performance metric (https://github.com/y2863/MetaGraspNet).

    https://raw.githubusercontent.com/y2863/MetaGraspNet/main/.github/500.png

    Citing MetaGraspNet

    If you use the MetaGraspNet dataset or metric in your research, please use the following BibTeX entry:

    @article{chen2021metagraspnet,
      author  = {Yuhao Chen and E. Zhixuan Zeng and Maximilian Gilles and Alexander Wong},
      title   = {MetaGraspNet: a large-scale benchmark dataset for vision-driven robotic grasping via physics-based metaverse synthesis},
      journal = {arXiv preprint arXiv:2112.14663},
      year    = {2021}
    }

    File Structure

    This dataset is arranged in the following file structure:

    root
    |-- meta-grasp
      |-- scene0
        |-- 0_camera_params.json
        |-- 0_depth.png
        |-- 0_rgb.png
        |-- 0_order.csv
        ...
      |-- scene1
      ...
    |-- difficulty-n-coco-label.json
    

    Each scene is a unique arrangement of objects, which we then display from various different angles. For each shot of a scene, we provide the camera parameters (x_camera_params.json), a depth image (x_depth.png), an rgb image (x_rgb.png), as well as a matrix representation of the ordering of the objects (x_order.csv). The full labels for the images are available in difficulty-n-coco-label.json (where n is the difficulty level of the dataset) in the COCO data format.

    Understanding order.csv

    The matrix describes a pairwise obstruction relationship between each object within the image. Given a "parent" object covering a "child" object: relationship_matrix[child_id, parent_id] = -1
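
    A minimal sketch of reading such a matrix with numpy; the exact CSV delimiter and layout are assumptions, so check the repository linked above:

    import numpy as np

    # Load the pairwise obstruction matrix for one shot of a scene.
    order = np.loadtxt("meta-grasp/scene0/0_order.csv", delimiter=",")

    # order[child_id, parent_id] == -1 means `parent` covers `child`.
    children, parents = np.where(order == -1)
    for c, p in zip(children, parents):
        print(f"object {p} covers object {c}")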

  9. Sarnet Search And Rescue Dataset

    • universe.roboflow.com
    zip
    Updated Jun 16, 2022
    Cite
    Roboflow Public (2022). Sarnet Search And Rescue Dataset [Dataset]. https://universe.roboflow.com/roboflow-public/sarnet-search-and-rescue
    Available download formats: zip
    Dataset updated
    Jun 16, 2022
    Dataset provided by
    Roboflow
    Authors
    Roboflow Public
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    SaR Bounding Boxes
    Description

    Description from the SaRNet: A Dataset for Deep Learning Assisted Search and Rescue with Satellite Imagery GitHub repository (the "Note" below was added by the Roboflow team).

    Satellite Imagery for Search And Rescue Dataset - ArXiv

    This is a single-class dataset consisting of tiles of satellite imagery labeled with potential 'targets'. Labelers were instructed to draw boxes around anything they suspected might be a paraglider wing, missing in a remote area of Nevada. Volunteers were shown examples of similar objects already in the environment for comparison. The missing wing, as it was found after 3 weeks, is shown below.

    https://michaeltpublic.s3.amazonaws.com/images/anomaly_small.jpg

    The dataset contains the following:

    Set        Images   Annotations
    Train      1808     3048
    Validate   490      747
    Test       254      411
    Total      2552     4206

    The data is in the COCO format, and is directly compatible with Faster R-CNN as implemented in Facebook's Detectron2.
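
    For example, once unzipped, the dataset can be registered with Detectron2 roughly as follows (the json and image paths are assumptions about the archive layout):

    from detectron2.data.datasets import register_coco_instances

    # Hypothetical paths inside the unzipped sarnet folder.
    register_coco_instances(
        "sarnet_train",                   # name referenced in configs
        {},                               # no extra metadata
        "sarnet/train/annotations.json",  # assumed COCO json path
        "sarnet/train/images",            # assumed image directory
    )
    # Then e.g. cfg.DATASETS.TRAIN = ("sarnet_train",) in a Faster R-CNN config.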

    Getting hold of the Data

    Download the data here: sarnet.zip

    Or follow these steps

    # download the dataset
    wget https://michaeltpublic.s3.amazonaws.com/sarnet.zip
    
    # extract the files
    unzip sarnet.zip
    

    Note: with Roboflow, you can download the data here (original, raw images, with annotations): https://universe.roboflow.com/roboflow-public/sarnet-search-and-rescue/ (download v1, original_raw-images). Download the dataset in COCO JSON format, or another format of choice, and import it into Roboflow after unzipping the folder to get started on your project.

    Getting started

    Get started with a Faster R-CNN model pretrained on SaRNet: SaRNet_Demo.ipynb

    Source Code for Paper

    Source code for the paper is located here: SaRNet_train_test.ipynb

    Cite this dataset

    @misc{thoreau2021sarnet,
       title={SaRNet: A Dataset for Deep Learning Assisted Search and Rescue with Satellite Imagery}, 
       author={Michael Thoreau and Frazer Wilson},
       year={2021},
       eprint={2107.12469},
       archivePrefix={arXiv},
       primaryClass={eess.IV}
    }
    

    Acknowledgment

    The source data was generously provided by Planet Labs, Airbus Defence and Space, and Maxar Technologies.

  10. Pothole Object Detection Dataset - raw

    • public.roboflow.com
    zip
    Updated Nov 1, 2020
    + more versions
    Cite
    Atikur Rahman Chitholian (2020). Pothole Object Detection Dataset - raw [Dataset]. https://public.roboflow.com/object-detection/pothole/1
    Available download formats: zip
    Dataset updated
    Nov 1, 2020
    Dataset authored and provided by
    Atikur Rahman Chitholian
    License

    Open Database License (ODbL) v1.0: https://www.opendatacommons.org/licenses/odbl/1.0/
    License information was derived automatically

    Variables measured
    Bounding Boxes of potholes
    Description

    Pothole Dataset

    Example image: https://i.imgur.com/7Xz8d5M.gif

    This is a collection of 665 images of roads with the potholes labeled. The dataset was created and shared by Atikur Rahman Chitholian as part of his undergraduate thesis and was originally shared on Kaggle.

    Note: The original dataset did not contain a validation set; we have re-shuffled the images into a 70/20/10 train-valid-test split.

    Usage

    This dataset could be used for automatically finding and categorizing potholes in city streets so the worst ones can be fixed faster.

    The dataset is provided in a wide variety of formats for various common machine learning models.

  11. Mechanical Parts Dataset 2022

    • zenodo.org
    Updated Jan 5, 2023
    Cite
    Mübarek Mazhar Çakır; Mübarek Mazhar Çakır (2023). Mechanical Parts Dataset 2022 [Dataset]. http://doi.org/10.5281/zenodo.7504801
    Dataset updated
    Jan 5, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Mübarek Mazhar Çakır; Mübarek Mazhar Çakır
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Mechanical Parts Dataset

    The dataset consists of a total of 2250 images downloaded from various internet platforms. Among the images in the dataset, there are 714 images with bearings, 632 images with bolts, 616 images with gears and 586 images with nuts. A total of 10597 manual labels were created, including 2099 labels belonging to the bearing class, 2734 labels belonging to the bolt class, 2662 labels belonging to the gear class and 3102 labels belonging to the nut class.

    Folder Content

    The dataset is divided into three splits: 80% train, 10% validation and 10% test. In the "Mechanical Parts Dataset" folder, there are three separate folders named "train", "test" and "val". Each of these three folders contains folders named "images" and "labels". Images are kept in the "images" folder and label information is kept in the "labels" folder.

    Finally, inside the folder there is a yaml file named "mech_parts_data" for the YOLO algorithm. This file contains the number of classes and the class names.

    Images and Labels

    The dataset was prepared in accordance with the YOLOv5 algorithm.
    For example, the label information for the image named "2a0xhkr_jpg.rf.45a11bf63c40ad6e47da384fdf6bb7a1.jpg" is stored in the txt file of the same name. Each line of the txt file holds the label information (coordinates) in the form "class x_center y_center width height".
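
    As a sketch, one label line can be decoded back to pixel coordinates as follows (the image size is an assumed example):

    def yolo_to_pixels(line, img_w, img_h):
        """Convert 'class x_center y_center width height' (normalized 0-1)
        to (class_id, x_min, y_min, x_max, y_max) in pixels."""
        cls, xc, yc, w, h = line.split()
        xc, w = float(xc) * img_w, float(w) * img_w
        yc, h = float(yc) * img_h, float(h) * img_h
        return int(cls), xc - w / 2, yc - h / 2, xc + w / 2, yc + h / 2

    with open("2a0xhkr_jpg.rf.45a11bf63c40ad6e47da384fdf6bb7a1.txt") as f:
        for line in f:
            print(yolo_to_pixels(line, img_w=640, img_h=640))  # assumed size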

    Update 05.01.2023

    Pascal VOC and COCO JSON formats have been added.

    Related paper: doi.org/10.5281/zenodo.7496767

  12. ABOships-PLUS

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Jan 8, 2024
    Cite
    Winsten Jesper; Iancu Bogdan; Soloviev Valentin; Lilius Johan (2024). ABOships-PLUS [Dataset]. http://doi.org/10.5281/zenodo.10469672
    Available download formats: zip
    Dataset updated
    Jan 8, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Winsten Jesper; Iancu Bogdan; Soloviev Valentin; Lilius Johan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    08.01.2024: Updated the annotations to the correct ones.

    ABOships-PLUS is an improved iteration of the original ABOships dataset. It includes 9,880 images capturing maritime scenes, showcasing various types of maritime objects such as powerboats, ships, sailboats, and stationary objects. Detailed category definitions and images can be found in the associated reference paper. In total, ABOships-PLUS contains 33,227 annotated objects across these categories, including four types of ships.

    Several key changes and improvements have been made to ABOships-PLUS:

    • Object Size Filtering: In ABOships-PLUS, a filtering process was applied to exclude very small objects, specifically those with an occupied pixel area of less than 16×16 pixels. This filtering ensures that the dataset primarily consists of more discernible maritime objects, contributing to improved data quality.
    • Superclass Aggregation: A notable transformation in ABOships-PLUS is the grouping of objects into four superclasses based on their distinct visual characteristics. This superclass aggregation facilitates the use of both transfer learning and learning from scratch, making the dataset more versatile for various machine learning applications.
    • Semantic Relevance: The categorization into superclasses in ABOships-PLUS was guided by semantic relevance, with human supervision. The objective was to create more meaningful superclasses, both from a semantic and visual perspective, enhancing the dataset's utility for maritime object detection research.
    • Format Transition: A significant change occurred in the data format. ABOships-PLUS adopts the COCO format for object detection: https://cocodataset.org/#format-data. This format transition enhances compatibility with a broader range of machine learning frameworks and tools, in contrast to the original ABOships dataset, which used CSV format.

    To create ABOships-PLUS, images were extracted from videos recorded in MPEG format, with a resolution of 720p at 15 frames per second (FPS). An image was extracted every 15 seconds, equivalent to every 225 frames, from videos filmed in the Finnish Archipelago using a camera attached to a moving watercraft known as a waterbus or "vesibussi" in Finnish.

    The distribution of labels within ABOships-PLUS is as follows: powerboat (21.8%), ship (46.0%), sailboat (24.2%), and stationary objects (8.1%). These changes aim to enhance the dataset's usability for maritime object detection research and applications.

    Reference article: https://doi.org/10.3390/jmse11091638

  13. Shoes Dataset

    • universe.roboflow.com
    zip
    Updated Jul 28, 2022
    + more versions
    Cite
    up (2022). Shoes Dataset [Dataset]. https://universe.roboflow.com/up-7wdzo/shoes-w3c67/dataset/4
    Available download formats: zip
    Dataset updated
    Jul 28, 2022
    Dataset authored and provided by
    up
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Shoes Bounding Boxes
    Description

    Shoes

    ## Overview
    
    Shoes is a dataset for object detection tasks - it contains Shoes annotations for 527 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
      ## License
    
      This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  14. Trojan Detection Software Challenge - object-detection-jul2022-train

    • data.nist.gov
    • gimi9.com
    • +1more
    Updated Jul 24, 2022
    Cite
    National Institute of Standards and Technology (2022). Trojan Detection Software Challenge - object-detection-jul2022-train [Dataset]. http://doi.org/10.18434/mds2-2783
    Dataset updated
    Jul 24, 2022
    Dataset provided by
    National Institute of Standards and Technology (http://www.nist.gov/)
    License

    https://www.nist.gov/open/license

    Description

    Round 10 Train Dataset

    This is the training data used to create and evaluate trojan detection software solutions. This data, generated at NIST, consists of object detection AIs trained on the COCO dataset. A known percentage of these trained AI models have been poisoned with a known trigger which induces incorrect behavior. This data will be used to develop software solutions for detecting which trained AI models have been poisoned via embedded triggers. This dataset consists of 144 AI models using a small set of model architectures. Half (50%) of the models have been poisoned with an embedded trigger which causes misclassification of the input when the trigger is present.

  15. License Plates Object Detection Dataset - Original License Plates

    • public.roboflow.com
    zip
    Updated Oct 15, 2022
    + more versions
    Cite
    Roboflow (2022). License Plates Object Detection Dataset - Original License Plates [Dataset]. https://public.roboflow.com/object-detection/license-plates-us-eu/3
    Available download formats: zip
    Dataset updated
    Oct 15, 2022
    Dataset authored and provided by
    Roboflow
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Bounding Boxes of Plates
    Description

    Overview

    The License Plates dataset is an object detection dataset of different vehicles (i.e. cars, vans, etc.) and their respective license plates. Annotations also include examples of "vehicle" and "license-plate". This dataset has a train/validation/test split of 245/70/35 images respectively.

    Example image: https://i.imgur.com/JmRgjBq.png

    Use Cases

    This dataset could be used to create a vehicle and license plate detection object detection model. Roboflow provides a great guide on creating a license plate and vehicle object detection model.

    Using this Dataset

    This dataset is a subset of the Open Images Dataset. The annotations are licensed by Google LLC under CC BY 4.0 license. Some annotations have been combined or removed using Roboflow's annotation management tools to better align the annotations with the purpose of the dataset. The images have a CC BY 2.0 license.

    About Roboflow

    Roboflow creates tools that make computer vision easy to use for any developer, even if you're not a machine learning expert. You can use it to organize, label, inspect, convert, and export your image datasets, and even to train and deploy computer vision models with no code required.

    https://roboflow.com

  16. ResNet-18

    • kaggle.com
    Updated Dec 12, 2017
    + more versions
    Cite
    PyTorch (2017). ResNet-18 [Dataset]. https://www.kaggle.com/pytorch/resnet18/activity
    Available download formats: Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Dec 12, 2017
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    PyTorch
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    ResNet-18

    Deep Residual Learning for Image Recognition

    Deeper neural networks are more difficult to train. We present a residual learning framework to ease the training of networks that are substantially deeper than those used previously. We explicitly reformulate the layers as learning residual functions with reference to the layer inputs, instead of learning unreferenced functions. We provide comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth. On the ImageNet dataset we evaluate residual nets with a depth of up to 152 layers---8x deeper than VGG nets but still having lower complexity.

    An ensemble of these residual nets achieves 3.57% error on the ImageNet test set. This result won the 1st place on the ILSVRC 2015 classification task. We also present analysis on CIFAR-10 with 100 and 1000 layers.

    The depth of representations is of central importance for many visual recognition tasks. Solely due to our extremely deep representations, we obtain a 28% relative improvement on the COCO object detection dataset. Deep residual nets are foundations of our submissions to ILSVRC & COCO 2015 competitions, where we also won the 1st places on the tasks of ImageNet detection, ImageNet localization, COCO detection, and COCO segmentation.

    Authors: Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun
    https://arxiv.org/abs/1512.03385

    Architecture visualization: http://ethereon.github.io/netscope/#/gist/db945b393d40bfa26006

    https://imgur.com/nyYh5xH.jpg

    What is a Pre-trained Model?

    A pre-trained model has been previously trained on a dataset and contains the weights and biases that represent the features of whichever dataset it was trained on. Learned features are often transferable to different data. For example, a model trained on a large dataset of bird images will contain learned features, like edges or horizontal lines, that are often transferable to your dataset.
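
    As a concrete sketch, here is how the pretrained weights can be reused with torchvision, with a hypothetical 10-class head replacing the ImageNet classifier:

    import torch.nn as nn
    from torchvision.models import resnet18, ResNet18_Weights

    model = resnet18(weights=ResNet18_Weights.IMAGENET1K_V1)

    # Freeze the pretrained backbone so only the new head is trained.
    for param in model.parameters():
        param.requires_grad = False

    # Swap the final layer for a hypothetical 10-class task.
    model.fc = nn.Linear(model.fc.in_features, 10)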

    Why use a Pre-trained Model?

    Pre-trained models are beneficial to us for many reasons. By using a pre-trained model you are saving time. Someone else has already spent the time and compute resources to learn a lot of features and your model will likely benefit from it.

  17. MaVeCoDD Dataset: Marine Vessel Hull Corrosion in Dry-Dock Images

    • data.mendeley.com
    • narcis.nl
    Updated Apr 6, 2021
    Cite
    Georgios Chliveros (2021). MaVeCoDD Dataset: Marine Vessel Hull Corrosion in Dry-Dock Images [Dataset]. http://doi.org/10.17632/ry392rp8cj.1
    Dataset updated
    Apr 6, 2021
    Authors
    Georgios Chliveros
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Following SOLAS regulations, sea-going vessels have to undergo at least two dry dockings every three (operational) years. The process refers to a vessel being brought onto dry land so that submerged portions of the hull can be cleaned and inspected. Both the docking process and the defect inspection are time consuming and expensive. Human experts perform the inspection visually. Several image processing algorithms have been proposed to perform corrosion detection and could be used for vessel defect detection. However, to the best of our knowledge, there are no image sequences for benchmarking the performance of any algorithm and method. The purpose of this dataset is precisely to provide a benchmark dataset for current and future use.

    This dataset was collected and took its current form over the summers of 2019 and 2020. The images were collected during dry docking of large vessels via two different cameras. The image folder contains high resolution images in one folder and low resolution images in a second folder, alongside the labeled images that can be used as ground truth. Other issues, such as changing lighting conditions and general surface artifacts, are also evident, particularly in the low resolution images folder. Visual inspections were performed by trained professionals. The collected images correspond to hull areas that were deemed problematic by the human inspector. The inspector then highlights the regions of interest by manually labeling the regions identified as corroded. Note that these manually labeled regions are deemed to be corroded and/or could produce rust on the surface of the hull in the (near) future.

    You can use the dataset provided herein to test any machine vision / deep learning algorithm. For that purpose, we further offer a Python script (under the utils folder) to transform our image labels into COCO-format JSON annotations for use with deep learning frameworks (e.g. the Keras API).
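
    For orientation, a COCO-style annotation file of the kind such a script emits looks roughly like this; the ids, file name, and polygon below are placeholders, not values from the dataset:

    import json

    coco = {
        "images": [{"id": 1, "file_name": "hull_001.jpg",
                    "width": 1920, "height": 1080}],
        "categories": [{"id": 1, "name": "corrosion"}],
        "annotations": [{
            "id": 1, "image_id": 1, "category_id": 1,
            "segmentation": [[100, 100, 200, 100, 200, 200, 100, 200]],  # polygon
            "bbox": [100, 100, 100, 100],  # [x, y, width, height]
            "area": 10000, "iscrowd": 0,
        }],
    }

    with open("mavecodd_coco.json", "w") as f:
        json.dump(coco, f)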

  18. Img2brain: Predicting the neural responses to visual stimuli of naturalistic scenes using machine learning

    • data.niaid.nih.gov
    • zenodo.org
    Updated Oct 16, 2023
    Cite
    Ayala-Ruano, Sebastian (2023). Img2brain: Predicting the neural responses to visual stimuli of naturalistic scenes using machine learning [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7979729
    Dataset updated
    Oct 16, 2023
    Dataset authored and provided by
    Ayala-Ruano, Sebastian
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The data for this project is part of the Natural Scenes Dataset (NSD), a massive dataset of 7T fMRI responses to images of natural scenes coming from the COCO dataset. The training dataset consists of brain responses measured at 10,000 brain locations (voxels) to 8857 images (in jpg format) for one subject. The 10,000 voxels are distributed around the visual pathway and may encode perceptual and semantic features in different proportions. The test dataset comprises 984 images (in jpg format), and the goal is to predict the brain responses to these images.

    The zip file contains the following folders:

    1. trainingIMG: contains the training images (8857) in jpg format. The numbering corresponds to the order of the rows in the brain response matrix.

    2. testIMG: contains test images (984) in jpg format.

    3. trainingfMRI: contains an npy file with the fMRI responses measured at 10,000 brain locations (voxels) to the training images. The matrix has 8857 rows (one for each image) and 10,000 columns (one for each voxel).
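
    A hedged baseline sketch for this prediction task: ridge regression from deliberately trivial image features to all voxels at once. The npy file name is an assumption; the folder names match the list above.

    import numpy as np
    from PIL import Image
    from sklearn.linear_model import Ridge

    fmri = np.load("trainingfMRI/responses.npy")  # assumed name; shape (8857, 10000)

    def features(path):
        # Toy features: 32x32 grayscale pixels, scaled to [0, 1].
        img = Image.open(path).convert("L").resize((32, 32))
        return np.asarray(img, dtype=np.float32).ravel() / 255.0

    # Image numbering is assumed to follow the row order described above.
    X = np.stack([features(f"trainingIMG/{i}.jpg") for i in range(1, len(fmri) + 1)])

    model = Ridge(alpha=1.0).fit(X, fmri)  # one linear map to all 10,000 voxels
    pred = model.predict(features("testIMG/1.jpg")[None, :])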

  19. SDGSAT-1 Misalignment dataset for Object Detection

    • ieee-dataport.org
    Updated Dec 22, 2024
    Cite
    Pei Tan (2024). SDGSAT-1 Misalignment dataset for Object Detection [Dataset]. http://doi.org/10.21227/0jzw-c416
    Dataset updated
    Dec 22, 2024
    Dataset provided by
    IEEE Dataport
    Authors
    Pei Tan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    We annotated 1,000 misalignments from SDGSAT-1 glimmer imagery, divided into train, valid, and test sets with a 7:2:1 ratio for the object detection task. This dataset contains only one type of object: misalignment. We used a 32×32 window to crop the raw SDGSAT-1 Level-1 glimmer imagery and converted the TIFF format to JPEG format. At each window, a column number was randomly selected, and the corresponding pixels to the right of this column were shifted vertically either upward or downward by 2 to 8 pixels. The annotations were done in COCO format using LabelImg, with each TXT label file corresponding one-to-one with the JPEG image files.
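
    Written out from that description, the simulation can be reproduced roughly as follows (a sketch; the authors' exact shifting procedure may differ, e.g. in how shifted-out pixels are filled):

    import numpy as np

    rng = np.random.default_rng(0)

    def add_misalignment(window):
        """Shift all columns right of a random column up or down by 2-8 px."""
        out = window.copy()
        col = rng.integers(1, window.shape[1])                # random split column
        shift = int(rng.integers(2, 9)) * rng.choice([-1, 1])  # 2..8 px, up or down
        out[:, col:] = np.roll(out[:, col:], shift, axis=0)   # wraps at the edges
        return out

    patch = rng.integers(0, 256, size=(32, 32), dtype=np.uint8)  # stand-in crop
    misaligned = add_misalignment(patch)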

  20. Data from: Life beneath the ice: jellyfish and ctenophores from the Ross Sea, Antarctica, with an image-based training set for machine learning

    • zenodo.org
    • data.niaid.nih.gov
    Updated Jul 30, 2021
    + more versions
    Cite
    Gerlien Verhaegen; Gerlien Verhaegen; Emiliano Cimoli; Emiliano Cimoli; Dhugal J Lindsay; Dhugal J Lindsay (2021). Life beneath the ice: jellyfish and ctenophores from the Ross Sea, Antarctica, with an image-based training set for machine learning [Dataset]. http://doi.org/10.5281/zenodo.5118013
    Dataset updated
    Jul 30, 2021
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Gerlien Verhaegen; Gerlien Verhaegen; Emiliano Cimoli; Emiliano Cimoli; Dhugal J Lindsay; Dhugal J Lindsay
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Antarctica, Ross Sea
    Description

    This Zenodo dataset contains the Common Objects in Context (COCO) files linked to the following publication:

    Verhaegen, G, Cimoli, E, & Lindsay, D (2021). Life beneath the ice: jellyfish and ctenophores from the Ross Sea, Antarctica, with an image-based training set for machine learning. Biodiversity Data Journal.

    Each COCO zip folder contains an "annotations" folder including a json file and an "images" folder containing the annotated images.
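
    A minimal sketch of inspecting one unzipped folder with only the Python standard library; the json file name inside "annotations" is an assumption:

    import json
    from collections import Counter

    with open("Beroe_sp_A_images-coco 1.0/annotations/instances_default.json") as f:
        coco = json.load(f)

    print(len(coco["images"]), "images,", len(coco["annotations"]), "annotations")

    # Count annotated objects per image.
    per_image = Counter(ann["image_id"] for ann in coco["annotations"])
    print("max objects in one image:", max(per_image.values()))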

    Details on each COCO zip folder:

    • Beroe_sp_A_images-coco 1.0.zip

    COCO annotations of Beroe sp. A for the following 114 images:

    MCMEC2018_20181116_NIKON_Beroe_sp_A_c_1 to MCMEC2018_20181116_NIKON_Beroe_sp_A_c_16, MCMEC2018_20181125_NIKON_Beroe_sp_A_d_1 to MCMEC2018_20181125_NIKON_Beroe_sp_A_d_57, MCMEC2018_20181127_NIKON_Beroe_sp_A_e_1 to MCMEC2018_20181127_NIKON_Beroe_sp_A_e_2, MCMEC2019_20191116_SONY_Beroe_sp_A_a_1 to MCMEC2019_20191116_SONY_Beroe_sp_A_a_28, and MCMEC2019_20191127_SONY_Beroe_sp_A_f_1 to MCMEC2019_20191127_SONY_Beroe_sp_A_f_12

    • Beroe_sp_B_images-coco 1.0.zip

    COCO annotations of Beroe sp. B for the following 2 images:

    MCMEC2019_20191115_SONY_Beroe_sp_B_a_1 and MCMEC2019_20191115_SONY_Beroe_sp_B_a_2

    • Callianira_cristata_images-coco 1.0.zip

    COCO annotations of Callianira cristata for the following 21 images:

    MCMEC2019_20191120_SONY_Callianira_cristata_b_1 to MCMEC2019_20191120_SONY_Callianira_cristata_b_21

    • Diplulmaris_antarctica_images-coco 1.0.zip

    COCO annotations of Diplulmaris antarctica for the following 83 images:

    MCMEC2019_20191116_SONY_Diplulmaris_antarctica_a_1 to MCMEC2019_20191116_SONY_Diplulmaris_antarctica_a_9, and MCMEC2019_20191201_SONY_Diplulmaris_antarctica_c_1 to MCMEC2019_20191201_SONY_Diplulmaris_antarctica_c_74

    • Koellikerina_maasi_images-coco 1.0.zip

    COCO annotations of Koellikerina maasi for the following 49 images:

    MCMEC2018_20181127_NIKON_Koellikerina_maasi_b_1 to MCMEC2018_20181127_NIKON_Koellikerina_maasi_b_4, MCMEC2018_20181129_NIKON_Koellikerina_maasi_c_1 to MCMEC2018_20181129_NIKON_Koellikerina_maasi_c_29, and MCMEC2019_20191126_SONY_Koellikerina_maasi_a_1 to MCMEC2019_20191126_SONY_Koellikerina_maasi_a_16

    • Leptomedusa_sp_A-coco 1.0.zip

    COCO annotations of Leptomedusa sp. A for Figure 5 (see paper).

    • Leuckartiara_brownei_images-coco 1.0.zip

    COCO annotations of Leuckartiara brownei for the following 48 images:

    MCMEC2018_20181129_NIKON_Leuckartiara_brownei_b_1 to MCMEC2018_20181129_NIKON_Leuckartiara_brownei_b_27, MCMEC2018_20181129_NIKON_Leuckartiara_brownei_c_1 to MCMEC2018_20181129_NIKON_Leuckartiara_brownei_c_6, and MCMEC2019_20191116_SONY_Leuckartiara_brownei_a_1 to MCMEC2019_20191116_SONY_Leuckartiara_brownei_a_15

    • MCMEC2019_20191115_SONY_Mertensiidae_sp_A_a_3-coco 1.0.zip

    COCO annotations of Mertensiidae sp. A for the following video (total of 1847 frames): MCMEC2019_20191115_SONY_Mertensiidae_sp_A_a_3 (https://youtu.be/0W2HHLW71Pw)

    • MCMEC2019_20191116_SONY_Leuckartiara_brownei_a_3-coco 1.0.zip

    COCO annotations of Leuckartiara brownei for the following video (total of 1367 frames): MCMEC2019_20191116_SONY_Leuckartiara_brownei_a_3 (https://youtu.be/dEIbVYlF_TQ)

    • MCMEC2019_20191122_SONY_Callianira_cristata_a_1-coco 1.0.zip

    COCO annotations of Callianira cristata for the following video (total of 2423 frames): MCMEC2019_20191122_SONY_Callianira_cristata_a_1 (https://youtu.be/30g9CvYh5JE)

    • MCMEC2019_20191122_SONY_Leptomedusa_sp_B_a_1-coco 1.0.zip

    COCO annotations of Leptomedusa sp. B for the following video (total of 1164 frames): MCMEC2019_20191122_SONY_Leptomedusa_sp_B_a_1 (https://youtu.be/hrufuPQ7F8U)

    • MCMEC2019_20191126_SONY_Koellikerina_maasi_a_1-coco 1.0.zip

    COCO annotations of Koellikerina maasi for the following video (total of 1643 frames): MCMEC2019_20191126_SONY_Koellikerina_maasi_a_1 (https://youtu.be/QiBPf_HYrQ8)

    • MCMEC2019_20191129_SONY_Mertensiidae_sp_A_b_1-coco 1.0.zip

    COCO annotations of Mertensiidae sp. A for the following video (total of 239 frames): MCMEC2019_20191129_SONY_Mertensiidae_sp_A_b_1 (https://youtu.be/pvXYlQGZIVg)

    • MCMEC2019_20191129_SONY_Pyrostephos_vanhoeffeni_b_2-coco 1.0.zip

    COCO annotations of Pyrostephos vanhoeffeni for the following video (total of 444 frames): MCMEC2019_20191129_SONY_Pyrostephos_vanhoeffeni_b_2 (https://youtu.be/2rrQCybEg0Q)

    • MCMEC2019_20191129_SONY_Pyrostephos_vanhoeffeni_b_3-coco 1.0.zip

    COCO annotations of Pyrostephos vanhoeffeni for the following video (total of 683 frames): MCMEC2019_20191129_SONY_Pyrostephos_vanhoeffeni_b_3 (https://youtu.be/G9tev_gdUvQ)

    • MCMEC2019_20191129_SONY_Pyrostephos_vanhoeffeni_b_4-coco 1.0.zip

    COCO annotations of Pyrostephos vanhoeffeni for the following video (total of 1127 frames): MCMEC2019_20191129_SONY_Pyrostephos_vanhoeffeni_b_4 (https://youtu.be/NfJjKBRh5Hs)

    • MCMEC2019_20191130_SONY_Beroe_sp_A_b_1-coco 1.0.zip

    COCO annotations of Beroe sp. A for the following video (total of 2171 frames): MCMEC2019_20191130_SONY_Beroe_sp_A_b_1 (https://youtu.be/kGBUQ7ZtH9U)

    • MCMEC2019_20191130_SONY_Beroe_sp_A_b_2-coco 1.0.zip

    COCO annotations of Beroe sp. A for the following video (total of 359 frames): MCMEC2019_20191130_SONY_Beroe_sp_A_b_2 (https://youtu.be/Vbl_KEmPNmU)

    • Mertensiidae_sp_A_images-coco 1.0.zip

    COCO annotations of Mertensiidae sp. A for the following 49 images:

    MCMEC2018_20181127_NIKON_Mertensiidae_sp_A_c_1 to MCMEC2018_20181127_NIKON_Mertensiidae_sp_A_c_2, MCMEC2018_20181127_NIKON_Mertensiidae_sp_A_f_1 to MCMEC2018_20181127_NIKON_Mertensiidae_sp_A_f_8, MCMEC2018_20181129_NIKON_Mertensiidae_sp_A_d_1 to MCMEC2018_20181129_NIKON_Mertensiidae_sp_A_d_13, MCMEC2018_20181201_ROV_Mertensiidae_sp_A_e_1 to MCMEC2018_20181201_ROV_Mertensiidae_sp_A_e_15, and MCMEC2019_20191115_SONY_Mertensiidae_sp_A_a_1 to MCMEC2019_20191115_SONY_Mertensiidae_sp_A_a_11

    • Pyrostephos_vanhoeffeni_images-coco 1.0.zip

    COCO annotations of Pyrostephos vanhoeffeni for the following 14 images: MCMEC2019_20191125_SONY_Pyrostephos_vanhoeffeni_a_1 to MCMEC2019_20191125_SONY_Pyrostephos_vanhoeffeni_a_8, MCMEC2019_20191129_SONY_Pyrostephos_vanhoeffeni_b_1 to MCMEC2019_20191129_SONY_Pyrostephos_vanhoeffeni_b_6

    • Solmundella_bitentaculata_images-coco 1.0.zip

    COCO annotations of Solmundella bitentaculata for the following 13 images: MCMEC2018_20181127_NIKON_Solmundella_bitentaculata_a_1 to MCMEC2018_20181127_NIKON_Solmundella_bitentaculata_a_13
