51 datasets found
  1. Sartorius COCO Format Dataset

    • kaggle.com
    zip
    Updated Oct 28, 2021
    + more versions
    Cite
    Ari (2021). Sartorius COCO Format Dataset [Dataset]. https://www.kaggle.com/vexxingbanana/sartorius-coco-format-dataset
    Explore at:
    Available download formats: zip (9,798,602 bytes)
    Dataset updated
    Oct 28, 2021
    Authors
    Ari
    Description

    Dataset

    This dataset was created by Ari. It contains COCO annotation format JSON files for the Sartorius Cell Segmentation Competition.

    Contents

  2. Microsoft Coco Dataset

    • universe.roboflow.com
    zip
    Updated Mar 23, 2025
    Cite
    Microsoft (2025). Microsoft Coco Dataset [Dataset]. https://universe.roboflow.com/microsoft/coco/model/3
    Explore at:
    Available download formats: zip
    Dataset updated
    Mar 23, 2025
    Dataset authored and provided by
    Microsoft (http://microsoft.com/)
    Variables measured
    Object Bounding Boxes
    Description

    Microsoft Common Objects in Context (COCO) Dataset

    The Common Objects in Context (COCO) dataset is a widely recognized collection designed to spur object detection, segmentation, and captioning research. Created by Microsoft, COCO provides annotations including object categories, keypoints, and more, making it a valuable asset for machine learning practitioners and researchers. Today, many model architectures are benchmarked against COCO, which has provided a standard system by which architectures can be compared.

    While COCO is often touted to comprise over 300k images, it's important to understand that this figure counts images across annotation types, such as keypoints. Specifically, the labeled dataset for object detection stands at 123,272 images.

    The full object detection labeled dataset is made available here, ensuring researchers have access to the most comprehensive data for their experiments. That said, COCO has not released its test set annotations, meaning the test data comes without labels; it is therefore not included in this dataset.

    The Roboflow team has worked extensively with COCO. Here are a few links that may be helpful as you get started working with this dataset:

  3. COCO dataset and neural network weights for micro-FTIR particle detection on...

    • zenodo.org
    • data.niaid.nih.gov
    bin, zip
    Updated Aug 13, 2024
    Cite
    Thibault Schowing (2024). COCO dataset and neural network weights for micro-FTIR particle detection on filters [Dataset]. http://doi.org/10.5281/zenodo.10839527
    Explore at:
    Available download formats: bin, zip
    Dataset updated
    Aug 13, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Thibault Schowing
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The IMPTOX project has received funding from the EU's H2020 framework programme for research and innovation under grant agreement No. 965173. IMPTOX is part of the European MNP cluster on human health.

    More information about the project here.

    Description: This repository includes the trained weights and a custom COCO-formatted dataset used for developing and testing a Faster R-CNN R_50_FPN_3x object detector, specifically designed to identify particles in micro-FTIR filter images.

    Contents:

    1. Weights File (neuralNetWeights_V3.pth):

      • Format: .pth
      • Description: This file contains the trained weights for a Faster R-CNN model with a ResNet-50 backbone and a Feature Pyramid Network (FPN), trained with a 3x schedule. These weights are specifically tuned for detecting particles in micro-FTIR filter images.
    2. Custom COCO Dataset (uFTIR_curated_square.v5-uftir_curated_square_2024-03-14.coco-segmentation.zip):

      • Format: .zip
      • Description: This zip archive contains a custom COCO-formatted dataset, including JPEG images and their corresponding annotation file. The dataset consists of images of micro-FTIR filters with annotated particles.
      • Contents:
        • Images: JPEG format images of micro-FTIR filters.
        • Annotations: A JSON file in COCO format providing detailed annotations of the particles in the images.
      • Management: The dataset can be managed and manipulated using the Pycocotools library, facilitating easy integration with existing COCO tools and workflows.
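
    A minimal sketch of inspecting the extracted annotations with pycocotools (the annotation filename inside the zip is hypothetical):

    from pycocotools.coco import COCO

    # Load the COCO-format annotation file extracted from the zip archive.
    coco = COCO("_annotations.coco.json")  # hypothetical filename
    img_ids = coco.getImgIds()
    ann_ids = coco.getAnnIds(imgIds=img_ids[:1])
    print(f"{len(img_ids)} images; {len(ann_ids)} annotations on the first image")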

    Applications: The provided weights and dataset are intended for researchers and practitioners in the field of microscopy and particle detection. The dataset and model can be used for further training, validation, and fine-tuning of object detection models in similar domains.

    Usage Notes:

    • The neuralNetWeights_V3.pth file should be loaded into a PyTorch model compatible with the Faster R-CNN architecture, such as Detectron2.
    • The contents of uFTIR_curated_square.v5-uftir_curated_square_2024-03-14.coco-segmentation.zip should be extracted and can be used with any COCO-compatible object detection framework for training and evaluation purposes.
    • Code can be found in the related GitHub repository.
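
    Below is a hedged sketch of loading the weights with Detectron2; the class count and score threshold are assumptions, not stated above:

    from detectron2 import model_zoo
    from detectron2.config import get_cfg
    from detectron2.engine import DefaultPredictor

    cfg = get_cfg()
    # Match the architecture family of the trained weights: Faster R-CNN R_50_FPN_3x.
    cfg.merge_from_file(model_zoo.get_config_file("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml"))
    cfg.MODEL.WEIGHTS = "neuralNetWeights_V3.pth"
    cfg.MODEL.ROI_HEADS.NUM_CLASSES = 1  # assumption: a single "particle" class
    cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5
    predictor = DefaultPredictor(cfg)  # predictor(bgr_image) returns detected instances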

  4. Esefjorden Marine Vegetation Segmentation Dataset (EMVSD)

    • figshare.com
    bin
    Updated Dec 9, 2024
    Cite
    Bjørn Christian Weinbach (2024). Esefjorden Marine Vegetation Segmentation Dataset (EMVSD) [Dataset]. http://doi.org/10.6084/m9.figshare.24072606.v4
    Explore at:
    Available download formats: bin
    Dataset updated
    Dec 9, 2024
    Dataset provided by
    figshare
    Authors
    Bjørn Christian Weinbach
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Esefjorden Marine Vegetation Segmentation Dataset (EMVSD): comprising 17,000 meticulously labeled images, this dataset is suited for instance segmentation tasks and represents a significant leap forward for marine research in the region. The images are stored in YOLO and COCO formats, ensuring compatibility with widely recognized and adopted object detection frameworks. Our decision to make this dataset publicly accessible underscores our commitment to collaborative research and the advancement of the broader scientific community.

    Dataset Structure:

    • Images:
      • Organized into three subsets: train, val, and test, located under the images/ directory.
      • Each subset contains high-resolution images optimized for object detection and segmentation tasks.
    • Annotations:
      • Available in YOLO txt and COCO formats for compatibility with major object detection frameworks.
      • Organized into three subsets: train, val, and test, located under the labels/ directory.
      • Additional metadata:
        • counts.txt: Summary of label distributions.
        • Cache files (train.cache, val.cache, test.cache) for efficient dataset loading.
    • Metadata:
      • classes.txt: Definitions for all annotated classes in the dataset.
      • Detailed COCO-format annotations in train_annotations.json, val_annotations.json, and test_annotations.json.
    • Configuration File:
      • EMVSD.yaml: Configuration file for seamless integration with machine learning libraries.

    Example Directory Structure:

    EMVSD/
    ├── images/
    │   ├── train/
    │   ├── val/
    │   └── test/
    ├── labels/
    │   ├── train/
    │   ├── val/
    │   ├── test/
    │   ├── counts.txt
    │   ├── train.cache
    │   ├── val.cache
    │   └── test.cache
    ├── classes.txt
    ├── train_annotations.json
    ├── val_annotations.json
    ├── test_annotations.json
    └── EMVSD.yaml
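
    As a quick sanity check, the COCO-format annotation files above can be read with the standard json module (a sketch; the key layout follows the COCO spec):

    import json

    # Count images and instances in the training split (paths per the layout above).
    with open("EMVSD/train_annotations.json") as f:
        train = json.load(f)
    print(len(train["images"]), "train images,", len(train["annotations"]), "instances")
    print("classes:", [c["name"] for c in train["categories"]])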

  5. Taco: Trash Annotations In Context Dataset

    • universe.roboflow.com
    • zenodo.org
    zip
    Updated Aug 1, 2024
    Cite
    Mohamed Traore (2024). Taco: Trash Annotations In Context Dataset [Dataset]. https://universe.roboflow.com/mohamed-traore-2ekkp/taco-trash-annotations-in-context/model/13
    Explore at:
    Available download formats: zip
    Dataset updated
    Aug 1, 2024
    Dataset authored and provided by
    Mohamed Traore
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Trash Polygons
    Description

    TACO: Trash Annotations in Context Dataset

    From: Pedro F. Proença; Pedro Simões

    TACO is a growing image dataset of trash in the wild. It contains segmented images of litter taken under diverse environments: woods, roads and beaches. These images are manually labeled according to a hierarchical taxonomy to train and evaluate object detection algorithms. Annotations are provided in a format similar to the COCO dataset.

    The model in action:

    https://raw.githubusercontent.com/wiki/pedropro/TACO/images/teaser.gif (GIF of the model running inference)

    Examples images from the dataset:

    https://raw.githubusercontent.com/wiki/pedropro/TACO/images/2.png (Example Image #2) and https://raw.githubusercontent.com/wiki/pedropro/TACO/images/5.png (Example Image #5)

    For more details and to cite the authors:

    • Paper: https://arxiv.org/abs/2003.06975
    • Paper Citation:

    @article{taco2020,
      title={TACO: Trash Annotations in Context for Litter Detection},
      author={Pedro F Proença and Pedro Simões},
      journal={arXiv preprint arXiv:2003.06975},
      year={2020}
    }
  6. Parcel2D Real - A real-world image dataset of cuboid-shaped parcels with 2D...

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Jul 13, 2023
    Cite
    Alexander Naumann; Felix Hertlein; Benchun Zhou; Laura Dörr; Kai Furmans (2023). Parcel2D Real - A real-world image dataset of cuboid-shaped parcels with 2D and 3D annotations [Dataset]. http://doi.org/10.5281/zenodo.8031971
    Explore at:
    Available download formats: zip
    Dataset updated
    Jul 13, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Alexander Naumann; Felix Hertlein; Benchun Zhou; Laura Dörr; Kai Furmans
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Real-world dataset of ~400 images of cuboid-shaped parcels with full 2D and 3D annotations in the COCO format.

    Relevant computer vision tasks:

    • bounding box detection
    • instance segmentation
    • keypoint estimation
    • 3D bounding box estimation
    • 3D voxel reconstruction (.binvox files)
    • 3D reconstruction (.obj files)

    For details, see our paper and project page.

    If you use this resource for scientific research, please consider citing

    @inproceedings{naumannScrapeCutPasteLearn2022,
      title    = {Scrape, Cut, Paste and Learn: Automated Dataset Generation Applied to Parcel Logistics},
      author    = {Naumann, Alexander and Hertlein, Felix and Zhou, Benchun and Dörr, Laura and Furmans, Kai},
      booktitle  = {{{IEEE Conference}} on {{Machine Learning}} and Applications ({{ICMLA}})},
      date     = 2022
    }

  7. Data from: Dataset of very-high-resolution satellite RGB images to train...

    • zenodo.org
    • produccioncientifica.ugr.es
    • +1 more
    zip
    Updated Jul 6, 2022
    + more versions
    Cite
    Rohaifa Khaldi; Sergio Puertas; Siham Tabik; Domingo Alcaraz-Segura (2022). Dataset of very-high-resolution satellite RGB images to train deep learning models to detect and segment high-mountain juniper shrubs in Sierra Nevada (Spain) [Dataset]. http://doi.org/10.5281/zenodo.6793457
    Explore at:
    Available download formats: zip
    Dataset updated
    Jul 6, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Rohaifa Khaldi; Sergio Puertas; Siham Tabik; Domingo Alcaraz-Segura
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Sierra Nevada, Spain
    Description

    This dataset provides annotated very-high-resolution satellite RGB images extracted from Google Earth to train deep learning models to perform instance segmentation of Juniperus communis L. and Juniperus sabina L. shrubs. All images are from the high mountain of Sierra Nevada in Spain. The dataset contains 810 images (.jpg) of size 224x224 pixels. We also provide partitioning of the data into Train (567 images), Test (162 images), and Validation (81 images) subsets. Their annotations are provided in three different .json files following the COCO annotation format.
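
    A small sketch to verify the published split sizes from the COCO-format annotation files (the filenames are hypothetical; the expected counts are from the description above):

    import json

    for split, expected in [("train", 567), ("test", 162), ("validation", 81)]:
        with open(f"{split}_annotations.json") as f:  # hypothetical filenames
            n_images = len(json.load(f)["images"])
        print(split, n_images, "images (expected", expected, ")")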

  8. TACO Dataset

    • paperswithcode.com
    • opendatalab.com
    Updated Nov 7, 2024
    Cite
    Pedro F. Proença; Pedro Simões (2024). TACO Dataset [Dataset]. https://paperswithcode.com/dataset/taco
    Explore at:
    Dataset updated
    Nov 7, 2024
    Authors
    Pedro F. Proença; Pedro Simões
    Description

    TACO is a growing image dataset of waste in the wild. It contains images of litter taken under diverse environments: woods, roads and beaches. These images are manually labelled and segmented according to a hierarchical taxonomy to train and evaluate object detection algorithms. The annotations are provided in COCO format.

  9. Custom Yolov7 On Kaggle On Custom Dataset

    • universe.roboflow.com
    zip
    Updated Jan 29, 2023
    Cite
    Owais Ahmad (2023). Custom Yolov7 On Kaggle On Custom Dataset [Dataset]. https://universe.roboflow.com/owais-ahmad/custom-yolov7-on-kaggle-on-custom-dataset-rakiq/dataset/2
    Explore at:
    Available download formats: zip
    Dataset updated
    Jan 29, 2023
    Dataset authored and provided by
    Owais Ahmad
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Person Car Bounding Boxes
    Description

    Custom Training with YOLOv7 🔥

    Some Important links

    Contact Information

    Objective

    To showcase custom object detection on the given dataset by training and running inference with the newly launched YOLOv7.

    Data Acquisition

    The goal of this task is to train a model that can localize and classify each instance of Person and Car as accurately as possible.

    from IPython.display import Markdown, display

    # Render the dataset README that ships with the Roboflow download.
    display(Markdown(filename="../input/Car-Person-v2-Roboflow/README.roboflow.txt"))
    

    Custom Training with YOLOv7 🔥

    In this notebook, I have processed the images with Roboflow because the COCO-formatted dataset had images of varying dimensions and was not split into the required subsets. To train a custom YOLOv7 model we need to recognize the objects in the dataset. To do so I have taken the following steps:

    • Export the dataset to YOLOv7
    • Train YOLOv7 to recognize the objects in our dataset
    • Evaluate our YOLOv7 model's performance
    • Run test inference to view performance of YOLOv7 model at work

    📦 YOLOv7

    https://raw.githubusercontent.com/Owaiskhan9654/Yolo-V7-Custom-Dataset-Train-on-Kaggle/main/car-person-2.PNG

    Image Credit - jinfagang

    Step 1: Install Requirements

    !git clone https://github.com/WongKinYiu/yolov7 # Downloading YOLOv7 repository and installing requirements
    %cd yolov7
    !pip install -qr requirements.txt
    !pip install -q roboflow
    

    Downloading the YOLOv7 starting checkpoint

    !wget "https://github.com/WongKinYiu/yolov7/releases/download/v0.1/yolov7.pt"
    
    import os
    import glob
    import wandb
    import torch
    from roboflow import Roboflow
    from kaggle_secrets import UserSecretsClient
    from IPython.display import Image, clear_output, display # to display images
    
    
    
    print(f"Setup complete. Using torch {torch._version_} ({torch.cuda.get_device_properties(0).name if torch.cuda.is_available() else 'CPU'})")
    

    https://camo.githubusercontent.com/dd842f7b0be57140e68b2ab9cb007992acd131c48284eaf6b1aca758bfea358b/68747470733a2f2f692e696d6775722e636f6d2f52557469567a482e706e67

    I will be integrating W&B for visualizations and logging artifacts and comparisons of different models!

    YOLOv7-Car-Person-Custom

    try:
      user_secrets = UserSecretsClient()
      wandb_api_key = user_secrets.get_secret("wandb_api")
      wandb.login(key=wandb_api_key)
      anonymous = None
    except Exception:
      wandb.login(anonymous='must')
      print('To use your W&B account, go to Add-ons -> Secrets and provide your '
         'W&B access token with the label name WANDB. '
         'Get your token from: https://wandb.ai/authorize')

    wandb.init(project="YOLOvR", name="7. YOLOv7-Car-Person-Custom-Run-7")
    

    Step 2: Assemble Our Dataset

    https://uploads-ssl.webflow.com/5f6bc60e665f54545a1e52a5/615627e5824c9c6195abfda9_computer-vision-cycle.png

    In order to train our custom model, we need to assemble a dataset of representative images with bounding box annotations around the objects that we want to detect. And we need our dataset to be in YOLOv7 format.

    In Roboflow, we can choose between two paths:

    Version v2 (Aug 12, 2022) looks like this:

    https://raw.githubusercontent.com/Owaiskhan9654/Yolo-V7-Custom-Dataset-Train-on-Kaggle/main/Roboflow.PNG

    user_secrets = UserSecretsClient()
    roboflow_api_key = user_secrets.get_secret("roboflow_api")
    
    rf = Roboflow(api_key=roboflow_api_key)
    project = rf.workspace("owais-ahmad").project("custom-yolov7-on-kaggle-on-custom-dataset-rakiq")
    dataset = project.version(2).download("yolov7")
    

    Step 3: Train a Custom YOLOv7 Model from Pretrained Weights

    Here, I am able to pass a number of arguments:

    • img: define input image size
    • batch: determine
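
    A typical training invocation might look like the following; the values are illustrative and the flag names follow the YOLOv7 repository's train.py:

    !python train.py --epochs 55 --batch-size 16 --img-size 640 640 --data {dataset.location}/data.yaml --weights yolov7.pt --device 0 --name yolov7-car-person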

  10. Tracking Plant Growth Using Image Sequence Analysis- Datasets

    • data.mendeley.com
    Updated Jan 10, 2025
    + more versions
    Cite
    Yiftah Szoke (2025). Tracking Plant Growth Using Image Sequence Analysis- Datasets [Dataset]. http://doi.org/10.17632/z2fp5kbgbh.1
    Explore at:
    Dataset updated
    Jan 10, 2025
    Authors
    Yiftah Szoke
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset consists of five subsets with annotated images in COCO format, designed for object detection and tracking plant growth:

    1. Cucumber_Train Dataset (for Faster R-CNN)
      • Includes training, validation, and test images of cucumbers from different angles.
      • Annotations: Bounding boxes in COCO format for object detection tasks.
    2. Tomato Dataset
      • Contains images of tomato plants for 24 hours at hourly intervals from a fixed angle.
      • Annotations: Bounding boxes in COCO format.
    3. Pepper Dataset
      • Contains images of pepper plants for 24 hours at hourly intervals from a fixed angle.
      • Annotations: Bounding boxes in COCO format.
    4. Cannabis Dataset
      • Contains images of cannabis plants for 24 hours at hourly intervals from a fixed angle.
      • Annotations: Bounding boxes in COCO format.
    5. Cucumber Dataset
      • Contains images of cucumber plants for 24 hours at hourly intervals from a fixed angle.
      • Annotations: Bounding boxes in COCO format.

    This dataset supports training and evaluation of object detection models across diverse crops.

  11. ActiveHuman Part 1

    • zenodo.org
    • data.niaid.nih.gov
    Updated Nov 14, 2023
    + more versions
    Cite
    Charalampos Georgiadis (2023). ActiveHuman Part 1 [Dataset]. http://doi.org/10.5281/zenodo.8359766
    Explore at:
    Dataset updated
    Nov 14, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Charalampos Georgiadis
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is Part 1/2 of the ActiveHuman dataset! Part 2 can be found here.

    Dataset Description

    ActiveHuman was generated using Unity's Perception package.

    It consists of 175,428 RGB images and their semantic segmentation counterparts, taken in different environments, lighting conditions, camera distances, and angles. In total, the dataset contains images for 8 environments, 33 humans, 4 lighting conditions, 7 camera distances (1 m to 4 m), and 36 camera angles (0° to 360° at 10-degree intervals).

    The dataset does not include images at every single combination of available camera distances and angles, since for some values the camera would collide with another object or go outside the confines of an environment. As a result, some combinations of camera distances and angles do not exist in the dataset.

    Alongside each image, 2D Bounding Box, 3D Bounding Box and Keypoint ground truth annotations are also generated via the use of Labelers and are stored as a JSON-based dataset. These Labelers are scripts that are responsible for capturing ground truth annotations for each captured image or frame. Keypoint annotations follow the COCO format defined by the COCO keypoint annotation template offered in the perception package.

    Folder configuration

    The dataset consists of 3 folders:

    • JSON Data: Contains all the generated JSON files.
    • RGB Images: Contains the generated RGB images.
    • Semantic Segmentation Images: Contains the generated semantic segmentation images.

    Essential Terminology

    • Annotation: Recorded data describing a single capture.
    • Capture: One completed rendering process of a Unity sensor which stored the rendered result to data files (e.g. PNG, JPG, etc.).
    • Ego: Object or person on which a collection of sensors is attached to (e.g., if a drone has a camera attached to it, the drone would be the ego and the camera would be the sensor).
    • Ego coordinate system: Coordinates with respect to the ego.
    • Global coordinate system: Coordinates with respect to the global origin in Unity.
    • Sensor: Device that captures the dataset (in this instance the sensor is a camera).
    • Sensor coordinate system: Coordinates with respect to the sensor.
    • Sequence: Time-ordered series of captures. This is very useful for video capture where the time-order relationship of two captures is vital.
    • UUID: Universally Unique Identifier. It is a unique hexadecimal identifier that can represent an individual instance of a capture, ego, sensor, annotation, labeled object or keypoint, or keypoint template.

    Dataset Data

    The dataset includes 4 types of JSON annotation files:

    • annotation_definitions.json: Contains annotation definitions for all of the active Labelers of the simulation stored in an array. Each entry consists of a collection of key-value pairs which describe a particular type of annotation and contain information about that specific annotation describing how its data should be mapped back to labels or objects in the scene. Each entry contains the following key-value pairs:
      • id: Integer identifier of the annotation's definition.
      • name: Annotation name (e.g., keypoints, bounding box, bounding box 3D, semantic segmentation).
      • description: Description of the annotation's specifications.
      • format: Format of the file containing the annotation specifications (e.g., json, PNG).
      • spec: Format-specific specifications for the annotation values generated by each Labeler.

    Most Labelers generate different annotation specifications in the spec key-value pair:

    • BoundingBox2DLabeler/BoundingBox3DLabeler:
      • label_id: Integer identifier of a label.
      • label_name: String identifier of a label.
    • KeypointLabeler:
      • template_id: Keypoint template UUID.
      • template_name: Name of the keypoint template.
      • key_points: Array containing all the joints defined by the keypoint template. This array includes the key-value pairs:
        • label: Joint label.
        • index: Joint index.
        • color: RGBA values of the keypoint.
        • color_code: Hex color code of the keypoint.
      • skeleton: Array containing all the skeleton connections defined by the keypoint template. Each skeleton connection defines a connection between two different joints. This array includes the key-value pairs:
        • label1: Label of the first joint.
        • label2: Label of the second joint.
        • joint1: Index of the first joint.
        • joint2: Index of the second joint.
        • color: RGBA values of the connection.
        • color_code: Hex color code of the connection.
    • SemanticSegmentationLabeler:
      • label_name: String identifier of a label.
      • pixel_value: RGBA values of the label.
      • color_code: Hex color code of the label.

    • captures_xyz.json: Each of these files contains an array of ground truth annotations generated by each active Labeler for each capture separately, as well as extra metadata that describes the state of each active sensor present in the scene. Each array entry contains the following key-value pairs:
      • id: UUID of the capture.
      • sequence_id: UUID of the sequence.
      • step: Index of the capture within a sequence.
      • timestamp: Timestamp (in ms) since the beginning of a sequence.
      • sensor: Properties of the sensor. This entry contains a collection with the following key-value pairs:
        • sensor_id: Sensor UUID.
        • ego_id: Ego UUID.
        • modality: Modality of the sensor (e.g., camera, radar).
        • translation: 3D vector that describes the sensor's position (in meters) with respect to the global coordinate system.
        • rotation: Quaternion variable that describes the sensor's orientation with respect to the ego coordinate system.
        • camera_intrinsic: Matrix containing the camera's intrinsic calibration (if one exists).
        • projection: Projection type used by the camera (e.g., orthographic, perspective).
      • ego: Attributes of the ego. This entry contains a collection with the following key-value pairs:
        • ego_id: Ego UUID.
        • translation: 3D vector that describes the ego's position (in meters) with respect to the global coordinate system.
        • rotation: Quaternion variable containing the ego's orientation.
        • velocity: 3D vector containing the ego's velocity (in meters per second).
        • acceleration: 3D vector containing the ego's acceleration (in meters per second squared).
      • format: Format of the file captured by the sensor (e.g., PNG, JPG).
      • annotations: Key-value pair collections, one for each active Labeler. These key-value pairs are as follows:
        • id: Annotation UUID.
        • annotation_definition: Integer identifier of the annotation's definition.
        • filename: Name of the file generated by the Labeler. This entry is only present for Labelers that generate an image.
        • values: List of key-value pairs containing annotation data for the current Labeler.

    Each Labeler generates different annotation specifications in the values key-value pair:

    • BoundingBox2DLabeler:
      • label_id: Integer identifier of a label.
      • label_name: String identifier of a label.
      • instance_id: UUID of one instance of an object. Each object with the same label that is visible on the same capture has different instance_id values.
      • x: Position of the 2D bounding box on the X axis.
      • y: Position of the 2D bounding box position on the Y axis.
      • width: Width of the 2D bounding box.
      • height: Height of the 2D bounding box.
    • BoundingBox3DLabeler:
      • label_id: Integer identifier of a label.
      • label_name: String identifier of a label.
      • instance_id: UUID of one instance of an object. Each object with the same label that is visible on the same capture has different instance_id values.
      • translation: 3D vector containing the location of the center of the 3D bounding box with respect to the sensor coordinate system (in meters).
      • size: 3D
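
    A minimal sketch for walking these capture files (the filename pattern follows the description above; the top-level "captures" key is an assumption):

    import json
    from pathlib import Path

    # Iterate over all captures_xyz.json files in the JSON Data folder.
    for path in sorted(Path("JSON Data").glob("captures_*.json")):
        captures = json.load(path.open())["captures"]  # assumption: top-level key
        for cap in captures:
            print(cap["id"], cap["sensor"]["modality"], len(cap["annotations"]))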

  12. SUN-RGBD-IS Dataset

    • paperswithcode.com
    Updated Jan 2, 2025
    Cite
    Aecheon Jung; Soyun Choi; Junhong Min; Sungeun Hong (2025). SUN-RGBD-IS Dataset [Dataset]. https://paperswithcode.com/dataset/sun-rgbd-is
    Explore at:
    Dataset updated
    Jan 2, 2025
    Authors
    Aecheon Jung; Soyun Choi; Junhong Min; Sungeun Hong
    Description

    An RGB-D dataset converted from SUN-RGBD into COCO-style instance segmentation format. To transform SUN-RGBD into an instance segmentation benchmark (i.e., SUN-RGBD-IS), we employed a pipeline similar to that of NYUDv2-IS. We selected 17 categories from the original 37 classes, carefully omitting non-instance categories like ceilings and walls. Images lacking any identifiable object instances were filtered out to keep the dataset relevant for instance segmentation tasks. We systematically converted the segmentation annotations into COCO format, generating precise bounding boxes, instance masks, and object attributes.

  13. openpits asbestos

    • data.mendeley.com
    Updated Nov 27, 2023
    Cite
    Mikhail Ronkin (2023). openpits asbestos [Dataset]. http://doi.org/10.17632/pfdbfpfygh.3
    Explore at:
    Dataset updated
    Nov 27, 2023
    Authors
    Mikhail Ronkin
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The database includes images of open-pit sites taken at the Bazhenovskoye field, Russia. All images were taken under different weather and daylight conditions. All data are labeled for instance segmentation (as well as object detection) problems, with labeling in the COCO format. The archive contains all data in the images folder and annotations in the annotations folder. The labeling was performed manually in the CVAT software. The image size is 2592 × 2048.

  14. YOGData: Labelled data (YOLO and Mask R-CNN) for yogurt cup identification...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jun 29, 2022
    Cite
    Fotis K. Konstantinidis (2022). YOGData: Labelled data (YOLO and Mask R-CNN) for yogurt cup identification within production lines [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6773530
    Explore at:
    Dataset updated
    Jun 29, 2022
    Dataset provided by
    Dimitrios Tsilis
    Fotis K. Konstantinidis
    Symeon Symeonidis
    Vasiliki Balaska
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data abstract: The YogDATA dataset contains images from an industrial laboratory production line producing yogurts. The case study for the recognition of yogurt cups requires training Mask R-CNN and YOLO v5.0 models with a set of corresponding images, so it is important to collect those images for training and evaluation. Specifically, the YogDATA dataset includes the same labeled data for the Mask R-CNN (COCO format) and YOLO models. For the YOLO architecture, the training and validation datasets include sets of images in jpg format and their annotations in txt file format. For the Mask R-CNN architecture, the annotations of the same sets of images are included in json file format (80% of the images and annotations of each subset are in the training set and 20% are in the test set).

    Paper abstract: The explosion of the digitisation of the traditional industrial processes and procedures is consolidating a positive impact on modern society by offering a critical contribution to its economic development. In particular, the dairy sector consists of various processes, which are very demanding and thorough. It is crucial to leverage modern automation tools and through-engineering solutions to increase their efficiency and continuously meet challenging standards. Towards this end, in this work, an intelligent algorithm based on machine vision and artificial intelligence, which identifies dairy products within production lines, is presented. Furthermore, in order to train and validate the model, the YogDATA dataset was created that includes yogurt cups within a production line. Specifically, we evaluate two deep learning models (Mask R-CNN and YOLO v5.0) to recognise and detect each yogurt cup in a production line, in order to automate the packaging processes of the products. According to our results, the performance precision of the two models is similar, estimated at 99%.

  15. Annotations for ConfLab A Rich Multimodal Multisensor Dataset of...

    • data.4tu.nl
    Updated Jun 8, 2022
    Cite
    Chirag Raman; Jose Vargas Quiros; Stephanie Tan; Ashraful Islam; Ekin Gedik; Hayley Hung (2022). Annotations for ConfLab A Rich Multimodal Multisensor Dataset of Free-Standing Social Interactions In-the-Wild [Dataset]. http://doi.org/10.4121/20017664.v1
    Explore at:
    Dataset updated
    Jun 8, 2022
    Dataset provided by
    4TU.ResearchData
    Authors
    Chirag Raman; Jose Vargas Quiros; Stephanie Tan; Ashraful Islam; Ekin Gedik; Hayley Hung
    License

    https://data.4tu.nl/info/fileadmin/user_upload/Documenten/4TU.ResearchData_Restricted_Data_2022.pdf

    Description

    This file contains the annotations for the ConfLab dataset, including actions (speaking status), pose, and F-formations.

    ------------------

    ./actions/speaking_status:

    ./processed: the processed speaking status files, aggregated into a single data frame per segment. Skipped rows in the raw data (see https://josedvq.github.io/covfee/docs/output for details) have been imputed using the code at: https://github.com/TUDelft-SPC-Lab/conflab/tree/master/preprocessing/speaking_status

    The processed annotations consist of:

    ./speaking: The first row contains person IDs matching the sensor IDs; the remaining rows contain binary speaking status annotations at 60 fps for the corresponding 2-min video segment (7200 frames).

    ./confidence: Same as above. These annotations reflect the continuous-valued rating of confidence of the annotators in their speaking annotation.

    To load these files with pandas: pd.read_csv(p, index_col=False)
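
    For example (a sketch; the exact per-segment filename is hypothetical):

    import pandas as pd

    # Columns are person IDs; rows are 60 fps binary speaking labels
    # (7200 frames for one 2-min segment).
    df = pd.read_csv("actions/speaking_status/processed/speaking/seg1.csv", index_col=False)
    print(df.shape)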


    ./raw.zip: the raw outputs from speaking status annotation for each of the eight annotated 2-min video segments. These were output by the covfee annotation tool (https://github.com/josedvq/covfee)

    Annotations were done at 60 fps.

    --------------------

    ./pose:

    ./coco: the processed pose files in coco JSON format, aggregated into a single data frame per video segment. These files have been generated from the raw files using the code at: https://github.com/TUDelft-SPC-Lab/conflab-keypoints

    To load in Python: f = json.load(open('/path/to/cam2_vid3_seg1_coco.json'))

    The skeleton structure (limbs) is contained within each file in:

    f['categories'][0]['skeleton']

    and keypoint names at:

    f['categories'][0]['keypoints']

    ./raw.zip: the raw outputs from continuous pose annotation. These were output by the covfee annotation tool (https://github.com/josedvq/covfee)

    Annotations were done at 60 fps.

    ---------------------

    ./f_formations:

    seg 2: 14:00 onwards, for videos of the form x2xxx.MP4 in /video/raw/ for the relevant cameras (2,4,6,8,10).

    seg 3: for videos of the form x3xxx.MP4 in /video/raw/ for the relevant cameras (2,4,6,8,10).

    Note that camera 10 doesn't include meaningful subject information/body parts that are not already covered in camera 8.

    First column: time stamp

    Second column: "()" delineates groups, "<>" delineates subjects, cam X indicates the best camera view for which a particular group exists.


    phone.csv: time stamp (pertaining to seg3), corresponding group, ID of person using the phone

  16. MEDISEG

    • city.figshare.com
    application/x-gzip
    Updated Mar 14, 2025
    Cite
    William Chu (2025). MEDISEG [Dataset]. http://doi.org/10.25383/city.28574786.v1
    Explore at:
    Available download formats: application/x-gzip
    Dataset updated
    Mar 14, 2025
    Dataset provided by
    City, University of London
    Authors
    William Chu
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dataset Overview

    MEDISEG (MEDication Image SEGmentation) is a high-quality, real-world dataset designed for the development and evaluation of pill recognition models. It contains two subsets:

    • MEDISEG (3-Pills): A controlled dataset featuring three pill types with subtle differences in shape and color.
    • MEDISEG (32-Pills): A more diverse dataset containing 32 distinct pill classes, reflecting real-world challenges such as occlusions, varied lighting conditions, and multiple medications in a single frame.

    Each subset includes COCO-format annotations with instance segmentation masks, bounding boxes, and class labels.

    Dataset Structure

    The dataset is organized as follows:

    MEDISEG/
    ├── LICENSE
    ├── metadata.csv
    ├── 3pills/
    │   ├── annotations.json
    │   └── images/
    │       ├── image1.jpg
    │       └── image2.jpg
    └── 32pills/
        ├── annotations.json
        └── images/
            ├── image1.jpg
            └── image2.jpg

    • LICENSE: The CC BY 4.0 license under which the dataset is distributed.
    • metadata.csv: Supplementary drug information, including registration numbers, brand names, active ingredients, regulatory classifications, and official URLs.
    • annotations.json: COCO-format annotation files providing segmentation masks, bounding boxes, and class labels.
    • images/: High-resolution JPG images of medications.

    Acknowledgements

    If you use this dataset, please cite the corresponding publication:

    @inproceedings{MEDISEG2025,
      title = {MEDISEG: A large-scale dataset of medication images with instance segmentation masks for preventing adverse drug events},
      author = {Chu, Wai Ip and Hirani, Shashi and Tarroni, Giacomo and Li, Ling},
      journal = {Nature Scientific Data},
      year = {2025},
      url = {https://example.com}
    }
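
    A short sketch pairing the COCO annotations with the supplementary drug metadata (metadata column names are not specified above, so none are assumed here):

    import json
    import pandas as pd

    meta = pd.read_csv("MEDISEG/metadata.csv")
    with open("MEDISEG/32pills/annotations.json") as f:
        coco = json.load(f)
    classes = [c["name"] for c in coco["categories"]]
    print(len(classes), "pill classes;", len(meta), "metadata rows")
    print(meta.columns.tolist())  # inspect the supplementary fields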

  17. Cow Pose Estimation Dataset

    • paperswithcode.com
    Updated Mar 5, 2025
    + more versions
    Cite
    (2025). Cow Pose Estimation Dataset [Dataset]. https://paperswithcode.com/dataset/cow-pose-estimation-dataset
    Explore at:
    Dataset updated
    Mar 5, 2025
    Description

    Description:

    👉 Download the dataset here

    This dataset has been specifically curated for cow pose estimation, designed to enhance animal behavior analysis and monitoring through computer vision techniques. The dataset is annotated with 12 keypoints on the cow's body, enabling precise tracking of body movements and posture. It is structured in the COCO format, making it compatible with popular deep learning models like YOLOv8, OpenPose, and others designed for object detection and keypoint estimation tasks.

    Applications:

    This dataset is ideal for agricultural tech solutions, veterinary care, and animal behavior research. It can be used in various use cases such as health monitoring, activity tracking, and early disease detection in cattle. Accurate pose estimation can also assist in optimizing livestock management by understanding animal movement patterns and detecting anomalies in their gait or behavior.


    Keypoint Annotations:

    The dataset includes the following 12 keypoints, strategically marked to represent significant anatomical features of cows:

    Nose: Essential for head orientation and overall movement tracking.

    Right Eye: Helps in head pose estimation.

    Left Eye: Complements the right eye for accurate head direction.

    Neck (side): Marks the side of the neck, key for understanding head and body coordination.

    Left Front Hoof: Tracks the front left leg movement.

    Right Front Hoof: Tracks the front right leg movement.

    Left Back Hoof: Important for understanding rear leg motion.

    Right Back Hoof: Completes the leg movement tracking for both sides.

    Backbone (side): Vital for posture and overall body orientation analysis.

    Tail Root: Used for tracking tail movements and posture shifts.

    Backpose Center (near tail's midpoint): Marks the midpoint of the back, crucial for body stability and movement analysis.

    Stomach (center of side pose): Helps in identifying body alignment and weight distribution.

    Dataset Format:

    The data is structured in the COCO format, with annotations that include image coordinates for each keypoint. This format is highly suitable for integration into popular deep learning frameworks. Additionally, the dataset includes metadata like bounding boxes, image sizes, and segmentation masks to provide detailed context for each cow in an image.
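
    A minimal sketch of reading the keypoints from the COCO-format annotations (the filename is hypothetical; the flat [x, y, visibility] triplet layout is standard COCO):

    import json

    with open("annotations.json") as f:  # hypothetical filename
        coco = json.load(f)
    names = coco["categories"][0]["keypoints"]  # the 12 keypoints listed above
    ann = coco["annotations"][0]
    xs, ys, vis = ann["keypoints"][0::3], ann["keypoints"][1::3], ann["keypoints"][2::3]
    for name, x, y, v in zip(names, xs, ys, vis):
        print(f"{name}: ({x}, {y}) visibility={v}")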

    Compatibility:

    This dataset is optimized for use with cutting-edge pose estimation models such as YOLOv8 and other keypoint detection models like DeepLabCut and HRNet, enabling efficient training and inference for cow pose tracking. It can be seamlessly integrated into existing machine learning pipelines for both real-time and post-processed analysis.

    This dataset is sourced from Kaggle.

  18. SDGSAT-1 Misalignment dataset for Object Detection

    • ieee-dataport.org
    Updated Dec 22, 2024
    Cite
    Pei Tan (2024). SDGSAT-1 Misalignment dataset for Object Detection [Dataset]. http://doi.org/10.21227/0jzw-c416
    Explore at:
    Dataset updated
    Dec 22, 2024
    Dataset provided by
    IEEE Dataport
    Authors
    Pei Tan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    1,000 misalignments were annotated from SDGSAT-1 glimmer imagery and divided into train, validation, and test sets at a 7:2:1 ratio for the object detection task. This dataset contains only one type of object: misalignment. We used a 32×32 window to crop the raw SDGSAT-1 Level-1 glimmer imagery and converted the TIFF format to JPEG format. At each window, a column number was randomly selected, and the pixels to the right of this column were shifted vertically, either upward or downward, by 2 to 8 pixels. The annotations were done in COCO format using LabelImg, with each TXT label file corresponding one-to-one with the JPEG image files.
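
    A sketch of the described corruption procedure on a single 32×32 tile (assumptions: NumPy arrays and a random shift direction; np.roll wraps at the border, whereas the actual preprocessing may pad instead):

    import numpy as np

    def add_misalignment(tile: np.ndarray, rng: np.random.Generator) -> np.ndarray:
        """Shift all pixels right of a random column up or down by 2-8 pixels."""
        out = tile.copy()
        col = int(rng.integers(1, out.shape[1]))  # random split column
        shift = int(rng.integers(2, 9)) * int(rng.choice([-1, 1]))
        out[:, col:] = np.roll(out[:, col:], shift, axis=0)
        return out

    rng = np.random.default_rng(0)
    tile = rng.integers(0, 256, size=(32, 32), dtype=np.uint8)
    corrupted = add_misalignment(tile, rng)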

  19. Sarnet Search And Rescue Dataset

    • universe.roboflow.com
    zip
    Updated Jun 16, 2022
    Cite
    Roboflow Public (2022). Sarnet Search And Rescue Dataset [Dataset]. https://universe.roboflow.com/roboflow-public/sarnet-search-and-rescue
    Explore at:
    Available download formats: zip
    Dataset updated
    Jun 16, 2022
    Dataset provided by
    Roboflow
    Authors
    Roboflow Public
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    SaR Bounding Boxes
    Description

    Description from the SaRNet: A Dataset for Deep Learning Assisted Search and Rescue with Satellite Imagery GitHub repository. *The "Note" below was added by the Roboflow team.

    Satellite Imagery for Search And Rescue Dataset - ArXiv

    This is a single class dataset consisting of tiles of satellite imagery labeled with potential 'targets'. Labelers were instructed to draw boxes around anything they suspect may be a paraglider wing, missing in a remote area of Nevada. Volunteers were shown examples of similar objects already in the environment for comparison. The missing wing, as it was found after 3 weeks, is shown below.

    https://michaeltpublic.s3.amazonaws.com/images/anomaly_small.jpg

    The dataset contains the following:

    Set       Images   Annotations
    Train     1808     3048
    Validate  490      747
    Test      254      411
    Total     2552     4206

    The data is in the COCO format, and is directly compatible with Faster R-CNN as implemented in Facebook's Detectron2.

    Getting hold of the Data

    Download the data here: sarnet.zip

    Or follow these steps

    # download the dataset
    wget https://michaeltpublic.s3.amazonaws.com/sarnet.zip
    
    # extract the files
    unzip sarnet.zip
    

    *Note:* with Roboflow, you can download the data here (original, raw images, with annotations): https://universe.roboflow.com/roboflow-public/sarnet-search-and-rescue/ (download v1, original_raw-images). Download the dataset in COCO JSON format, or another format of choice, and import it to Roboflow after unzipping the folder to get started on your project.

    Getting started

    Get started with a Faster R-CNN model pretrained on SaRNet: SaRNet_Demo.ipynb

    Source Code for Paper

    Source code for the paper is located here: SaRNet_train_test.ipynb

    Cite this dataset

    @misc{thoreau2021sarnet,
       title={SaRNet: A Dataset for Deep Learning Assisted Search and Rescue with Satellite Imagery}, 
       author={Michael Thoreau and Frazer Wilson},
       year={2021},
       eprint={2107.12469},
       archivePrefix={arXiv},
       primaryClass={eess.IV}
    }
    

    Acknowledgment

    The source data was generously provided by Planet Labs, Airbus Defence and Space, and Maxar Technologies.

  20. ref_coco

    • tensorflow.org
    • opendatalab.com
    Updated May 31, 2024
    Cite
    (2024). ref_coco [Dataset]. https://www.tensorflow.org/datasets/catalog/ref_coco
    Explore at:
    Dataset updated
    May 31, 2024
    Description

    A collection of 3 referring expression datasets based on images in the COCO dataset. A referring expression is a piece of text that describes a unique object in an image. These datasets are collected by asking human raters to disambiguate objects delineated by bounding boxes in the COCO dataset.

    RefCoco and RefCoco+ are from Kazemzadeh et al. 2014. RefCoco+ expressions are strictly appearance-based descriptions, which they enforced by preventing raters from using location-based descriptions (e.g., "person to the right" is not a valid description for RefCoco+). RefCocoG is from Mao et al. 2016, and has richer descriptions of objects compared to RefCoco due to differences in the annotation process. In particular, RefCoco was collected in an interactive game-based setting, while RefCocoG was collected in a non-interactive setting. On average, RefCocoG has 8.4 words per expression while RefCoco has 3.5 words.

    Each dataset has different split allocations that are typically all reported in papers. The "testA" and "testB" sets in RefCoco and RefCoco+ contain only people and only non-people respectively. Images are partitioned into the various splits. In the "google" split, objects, not images, are partitioned between the train and non-train splits. This means that the same image can appear in both the train and validation split, but the objects being referred to in the image will be different between the two sets. In contrast, the "unc" and "umd" splits partition images between the train, validation, and test split. In RefCocoG, the "google" split does not have a canonical test set, and the validation set is typically reported in papers as "val*".

    Stats for each dataset and split ("refs" is the number of referring expressions, and "images" is the number of images):

    dataset    partition   split   refs    images
    refcoco    google      train   40000   19213
    refcoco    google      val     5000    4559
    refcoco    google      test    5000    4527
    refcoco    unc         train   42404   16994
    refcoco    unc         val     3811    1500
    refcoco    unc         testA   1975    750
    refcoco    unc         testB   1810    750
    refcoco+   unc         train   42278   16992
    refcoco+   unc         val     3805    1500
    refcoco+   unc         testA   1975    750
    refcoco+   unc         testB   1798    750
    refcocog   google      train   44822   24698
    refcocog   google      val     5000    4650
    refcocog   umd         train   42226   21899
    refcocog   umd         val     2573    1300
    refcocog   umd         test    5023    2600

    To use this dataset:

    import tensorflow_datasets as tfds
    
    ds = tfds.load('ref_coco', split='train')
    for ex in ds.take(4):
     print(ex)
    

    See the guide for more information on tensorflow_datasets.

    Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/ref_coco-refcoco_unc-1.1.0.png
