100+ datasets found
  1. Microsoft COCO 2017 Object Detection Dataset - raw

    • public.roboflow.com
    zip
    Updated Feb 1, 2025
    Cite
    Microsoft (2025). Microsoft COCO 2017 Object Detection Dataset - raw [Dataset]. https://public.roboflow.com/object-detection/microsoft-coco-subset/2
    Explore at:
    Available download formats
    zip
    Dataset updated
    Feb 1, 2025
    Dataset authored and provided by
    Microsoft: http://microsoft.com/
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Bounding Boxes of coco-objects
    Description

    This is the full 2017 COCO object detection dataset (train and valid), which is a subset of the most recent 2020 COCO object detection dataset.

    COCO is a large-scale object detection, segmentation, and captioning dataset of many object types easily recognizable by a 4-year-old. The data was initially collected and published by Microsoft. The original source of the data is here and the paper introducing the COCO dataset is here.

  2. MS-COCO 2017 dataset - YOLO format

    • kaggle.com
    zip
    Updated Nov 1, 2025
    Cite
    Shahariar Alif (2025). MS-COCO 2017 dataset - YOLO format [Dataset]. https://www.kaggle.com/datasets/alifshahariar/ms-coco-2017-dataset-yolo-format
    Explore at:
    Available download formats
    zip (26509567635 bytes)
    Dataset updated
    Nov 1, 2025
    Authors
    Shahariar Alif
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    I wanted to train a custom YOLO object detection model, but the MS-COCO dataset was not in a suitable format, so I parsed the instances JSON files in the MS-COCO annotations and converted the dataset to a YOLO-friendly format.

    I downloaded the dataset from the COCO website. You can download any split you need from there.

    Directory info:
    1. test: contains only the test images
    2. train: has two subfolders: images (the training images) and labels (one .txt label file per training image)
    3. val: has two subfolders: images (the validation images) and labels (one .txt label file per validation image)

    I do not own the dataset in any way. I merely parsed it into a ready-to-train YOLO format. Download the original dataset from the COCO website.
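The conversion script itself is not part of this listing, so the following is only a minimal sketch of the usual COCO-to-YOLO box conversion. It assumes COCO boxes are `[x_min, y_min, width, height]` in pixels and YOLO label lines are `class x_center y_center width height` normalized by the image size. The function names and the sample image size are hypothetical and are not taken from the author's code.

```python
def coco_bbox_to_yolo(bbox, img_w, img_h):
    """Convert a COCO box [x_min, y_min, width, height] in pixels
    to a YOLO box (x_center, y_center, width, height) normalized to 0..1."""
    x_min, y_min, w, h = bbox
    return (
        (x_min + w / 2) / img_w,
        (y_min + h / 2) / img_h,
        w / img_w,
        h / img_h,
    )


def yolo_label_line(class_index, bbox, img_w, img_h):
    """Format one line of a YOLO .txt label file."""
    coords = coco_bbox_to_yolo(bbox, img_w, img_h)
    return f"{class_index} " + " ".join(f"{c:.6f}" for c in coords)


if __name__ == "__main__":
    # Hypothetical 640x480 image with one box at (100, 120), 200 px wide and 240 px tall.
    print(yolo_label_line(0, [100, 120, 200, 240], 640, 480))
```

Each image would get one such line per annotated object in its .txt file; the class index is assumed to be a 0-based contiguous index rather than the raw COCO category id.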

  3. coco

    • tensorflow.org
    • huggingface.co
    Updated Jun 1, 2024
    Cite
    (2024). coco [Dataset]. https://www.tensorflow.org/datasets/catalog/coco
    Explore at:
    Dataset updated
    Jun 1, 2024
    Description

    COCO is a large-scale object detection, segmentation, and captioning dataset.

    Note:
    * Some images from the train and validation sets don't have annotations.
    * COCO 2014 and 2017 use the same images, but different train/val/test splits.
    * The test split doesn't have any annotations (only images).
    * COCO defines 91 classes, but the data only uses 80 classes.
    * Panoptic annotations define 200 classes but only use 133.
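Because only 80 of the 91 defined classes are used, the category ids that appear in the annotations have gaps, and training code commonly remaps them to contiguous indices. The sketch below shows one way to do that; the helper name is hypothetical and the three-entry category list is only an illustrative excerpt, not the full COCO list.

```python
def build_contiguous_index(categories):
    """Map sparse category ids to contiguous 0-based indices, ordered by id."""
    ids = sorted(cat["id"] for cat in categories)
    return {cat_id: index for index, cat_id in enumerate(ids)}


if __name__ == "__main__":
    # Illustrative excerpt of a COCO-style "categories" list with a gap in the ids.
    categories = [
        {"id": 1, "name": "person"},
        {"id": 11, "name": "fire hydrant"},
        {"id": 13, "name": "stop sign"},
    ]
    print(build_contiguous_index(categories))
```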

    To use this dataset:

    import tensorflow_datasets as tfds
    
    ds = tfds.load('coco', split='train')
    for ex in ds.take(4):
        print(ex)
    

    See the guide for more information on tensorflow_datasets.

    Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/coco-2014-1.1.0.png

  4. coco2017

    • huggingface.co
    • opendatalab.com
    Cite
    Padilla, coco2017 [Dataset]. https://huggingface.co/datasets/rafaelpadilla/coco2017
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Authors
    Padilla
    Description

    This dataset contains all COCO 2017 images and annotations, split into training (118287 images) and validation (5000 images).

  5. dataset_coco

    • app.ikomia.ai
    Updated Dec 19, 2023
    Cite
    Ikomia (2023). dataset_coco [Dataset]. https://app.ikomia.ai/hub/algorithms/dataset_coco/
    Explore at:
    Dataset updated
    Dec 19, 2023
    Dataset authored and provided by
    Ikomia
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Load COCO 2017 dataset. Load any dataset in COCO format into Ikomia format. Then, any training algorithm from the Ikomia marketplace can be connected to this converter....

  6. COCO

    • datasets.activeloop.ai
    • huggingface.co
    deeplake
    Updated Feb 5, 2022
    Cite
    Tsung-Yi Lin (2022). COCO [Dataset]. https://datasets.activeloop.ai/docs/ml/datasets/coco-dataset/
    Explore at:
    Available download formats
    deeplake
    Dataset updated
    Feb 5, 2022
    Authors
    Tsung-Yi Lin
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 1, 2014 - Dec 31, 2015
    Dataset funded by
    Microsoft Research
    Description

    The COCO dataset is a large dataset of labeled images and annotations. It is a popular dataset for machine learning and artificial intelligence research. The dataset consists of 330,000 images and 500,000 object annotations. The annotations include the bounding boxes of objects in the images, as well as the labels of the objects.

  7. COCO 2017 Object Detection Dataset

    • kaggle.com
    zip
    Updated Aug 9, 2022
    Cite
    Moein Shariatnia (2022). COCO 2017 Object Detection Dataset [Dataset]. https://www.kaggle.com/datasets/moeinshariatnia/coco-2017-object-detection-dataset
    Explore at:
    Available download formats
    zip (19209582473 bytes)
    Dataset updated
    Aug 9, 2022
    Authors
    Moein Shariatnia
    Description

    COCO Object Detection Dataset | 2017

    Downloaded from here and it includes Train images for now.

  8. COCO_Person

    • huggingface.co
    Updated May 4, 2024
    Cite
    Abdelrahman Hamdy (2024). COCO_Person [Dataset]. https://huggingface.co/datasets/Hamdy20002/COCO_Person
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    May 4, 2024
    Authors
    Abdelrahman Hamdy
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    This dataset is a subset of COCO 2017 train images using the "Crowd" and "person" labels, with the first caption of each one.

    COCO Summary: The COCO dataset is a comprehensive collection designed for object detection, segmentation, and captioning tasks. It comprises over 200,000 images, encompassing a diverse array of everyday scenes and objects. Each image features multiple objects and scenes across 80 distinct object categories, all of which are annotated with descriptive image captions.

  9. COCO minitrain

    • kaggle.com
    zip
    Updated Dec 3, 2022
    Cite
    Phạm Thành Trung (2022). COCO minitrain [Dataset]. https://www.kaggle.com/datasets/trungit/coco25k
    Explore at:
    Available download formats
    zip (4066483999 bytes)
    Dataset updated
    Dec 3, 2022
    Authors
    Phạm Thành Trung
    Description

    COCO minitrain is a curated mini training set (25K images ≈ 20% of train2017) for COCO.

    @inproceedings{HoughNet,
      author = {Nermin Samet and Samet Hicsonmez and Emre Akbas},
      title = {HoughNet: Integrating near and long-range evidence for bottom-up object detection},
      booktitle = {European Conference on Computer Vision (ECCV)},
      year = {2020},
    }
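As a quick check of the "≈ 20%" figure in the description above, the share can be computed from the 25K image count and the train2017 size of 118287 images quoted in the coco2017 listing. This is just arithmetic on those two numbers; the variable names are illustrative.

```python
MINITRAIN_IMAGES = 25_000   # "25K images" from the minitrain description
TRAIN2017_IMAGES = 118_287  # train2017 size quoted in the coco2017 listing

share = MINITRAIN_IMAGES / TRAIN2017_IMAGES
print(f"minitrain covers {share:.1%} of train2017")
```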

  10. Microsoft Coco 2017 Dataset

    • universe.roboflow.com
    zip
    Updated Feb 1, 2025
    Cite
    Jacob Solawetz (2025). Microsoft Coco 2017 Dataset [Dataset]. https://universe.roboflow.com/jacob-solawetz/microsoft-coco/model/9
    Explore at:
    Available download formats
    zip
    Dataset updated
    Feb 1, 2025
    Dataset authored and provided by
    Jacob Solawetz
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Coco Objects Bounding Boxes
    Description

    This is the full 2017 COCO object detection dataset (train and valid), which is a subset of the most recent 2020 COCO object detection dataset.

    COCO is a large-scale object detection, segmentation, and captioning dataset of many object types easily recognizable by a 4-year-old. The data was initially collected and published by Microsoft. The original source of the data is here and the paper introducing the COCO dataset is here.

  11. COCO8 Ultralytics

    • kaggle.com
    Updated Sep 27, 2024
    Cite
    Ultralytics (2024). COCO8 Ultralytics [Dataset]. http://doi.org/10.34740/kaggle/dsv/9497018
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Sep 27, 2024
    Dataset provided by
    Kaggle: http://kaggle.com/
    Authors
    Ultralytics
    License

    http://www.gnu.org/licenses/agpl-3.0.html

    Description

    Ultralytics COCO8 is a small, but versatile object detection dataset composed of the first 8 images of the COCO train 2017 set, 4 for training and 4 for validation. This dataset is ideal for testing and debugging object detection models, or for experimenting with new detection approaches. With 8 images, it is small enough to be easily manageable, yet diverse enough to test training pipelines for errors and act as a sanity check before training larger datasets.

    To train a YOLOv8n model on the COCO8 dataset for 100 epochs with an image size of 640, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model Training page.

    Train Example

    # Start training from a pretrained *.pt model
    yolo detect train data=coco8.yaml model=yolov8n.pt epochs=100 imgsz=640
    
  12. ref_coco

    • tensorflow.org
    • opendatalab.com
    Updated May 31, 2024
    Cite
    (2024). ref_coco [Dataset]. https://www.tensorflow.org/datasets/catalog/ref_coco
    Explore at:
    Dataset updated
    May 31, 2024
    Description

    A collection of 3 referring expression datasets based off images in the COCO dataset. A referring expression is a piece of text that describes a unique object in an image. These datasets are collected by asking human raters to disambiguate objects delineated by bounding boxes in the COCO dataset.

    RefCoco and RefCoco+ are from Kazemzadeh et al. 2014. RefCoco+ expressions are strictly appearance-based descriptions, which they enforced by preventing raters from using location-based descriptions (e.g., "person to the right" is not a valid description for RefCoco+). RefCocoG is from Mao et al. 2016, and has richer descriptions of objects compared to RefCoco due to differences in the annotation process. In particular, RefCoco was collected in an interactive game-based setting, while RefCocoG was collected in a non-interactive setting. On average, RefCocoG has 8.4 words per expression while RefCoco has 3.5 words.

    Each dataset has different split allocations that are typically all reported in papers. The "testA" and "testB" sets in RefCoco and RefCoco+ contain only people and only non-people respectively. Images are partitioned into the various splits. In the "google" split, objects, not images, are partitioned between the train and non-train splits. This means that the same image can appear in both the train and validation split, but the objects being referred to in the image will be different between the two sets. In contrast, the "unc" and "umd" splits partition images between the train, validation, and test split. In RefCocoG, the "google" split does not have a canonical test set, and the validation set is typically reported in papers as "val*".
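The difference between partitioning images and partitioning objects can be illustrated with a toy example. The sketch below is not how the actual "google", "unc" or "umd" splits were produced; the record fields (`image_id`, `object_id`), the function names and the random shuffling are all assumptions made for illustration.

```python
import random


def split_by_image(refs, train_fraction, seed=0):
    """Partition whole images: all objects of one image land in the same split."""
    image_ids = sorted({r["image_id"] for r in refs})
    random.Random(seed).shuffle(image_ids)
    cut = int(len(image_ids) * train_fraction)
    train_images = set(image_ids[:cut])
    train = [r for r in refs if r["image_id"] in train_images]
    held_out = [r for r in refs if r["image_id"] not in train_images]
    return train, held_out


def split_by_object(refs, train_fraction, seed=0):
    """Partition objects: one image may contribute objects to both splits."""
    shuffled = list(refs)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def shared_images(train, held_out):
    """Image ids that occur in both splits."""
    return {r["image_id"] for r in train} & {r["image_id"] for r in held_out}


if __name__ == "__main__":
    # Toy data: 4 images with 3 referred objects each.
    refs = [{"image_id": i, "object_id": i * 10 + j} for i in range(4) for j in range(3)]
    print("image-level overlap:", shared_images(*split_by_image(refs, 0.5)))
    print("object-level overlap:", shared_images(*split_by_object(refs, 0.5)))
```

With an image-level split the overlap is always empty, whereas an object-level split can place different objects from the same image on both sides.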

    Stats for each dataset and split ("refs" is the number of referring expressions, and "images" is the number of images):

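    | dataset | partition | split | refs | images |
    |---------|-----------|-------|------|--------|
    | refcoco | google | train | 40000 | 19213 |
    | refcoco | google | val | 5000 | 4559 |
    | refcoco | google | test | 5000 | 4527 |
    | refcoco | unc | train | 42404 | 16994 |
    | refcoco | unc | val | 3811 | 1500 |
    | refcoco | unc | testA | 1975 | 750 |
    | refcoco | unc | testB | 1810 | 750 |
    | refcoco+ | unc | train | 42278 | 16992 |
    | refcoco+ | unc | val | 3805 | 1500 |
    | refcoco+ | unc | testA | 1975 | 750 |
    | refcoco+ | unc | testB | 1798 | 750 |
    | refcocog | google | train | 44822 | 24698 |
    | refcocog | google | val | 5000 | 4650 |
    | refcocog | umd | train | 42226 | 21899 |
    | refcocog | umd | val | 2573 | 1300 |
    | refcocog | umd | test | 5023 | 2600 |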

    To use this dataset:

    import tensorflow_datasets as tfds
    
    ds = tfds.load('ref_coco', split='train')
    for ex in ds.take(4):
        print(ex)
    

    See the guide for more information on tensorflow_datasets.

    Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/ref_coco-refcoco_unc-1.1.0.png

  13. Coco Train Sample Dataset

    • universe.roboflow.com
    zip
    Updated Mar 11, 2024
    Cite
    COCO (2024). Coco Train Sample Dataset [Dataset]. https://universe.roboflow.com/coco-va583/coco-train-sample/dataset/1
    Explore at:
    Available download formats
    zip
    Dataset updated
    Mar 11, 2024
    Dataset authored and provided by
    COCO
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Bicycle Car Person Bounding Boxes
    Description

    COCO Train Sample

    ## Overview
    
    COCO Train Sample is a dataset for object detection tasks - it contains Bicycle Car Person annotations for 8,057 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
    ## License
    
    This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  14. Val Creation For Coco + Landing Pad Image Dataset Dataset

    • universe.roboflow.com
    zip
    Updated Feb 12, 2023
    Cite
    UWARG YOLOv7 (2023). Val Creation For Coco + Landing Pad Image Dataset Dataset [Dataset]. https://universe.roboflow.com/uwarg-yolov7/old-train-val-dataset-creation-for-coco-landing-pad-image-dataset/dataset/1
    Explore at:
    Available download formats
    zip
    Dataset updated
    Feb 12, 2023
    Dataset authored and provided by
    UWARG YOLOv7
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    COCO And LandingPads Bounding Boxes
    Description

    Val Dataset Creation For COCO + Landing Pad Image Dataset

    ## Overview
    
    Val Dataset Creation For COCO + Landing Pad Image Dataset is a dataset for object detection tasks - it contains COCO And LandingPads annotations for 1,852 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
    ## License
    
    This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  15. HuBMap COCO Dataset 512x512 Tiled

    • kaggle.com
    zip
    Updated Nov 20, 2020
    Cite
    Sreevishnu Damodaran (2020). HuBMap COCO Dataset 512x512 Tiled [Dataset]. https://www.kaggle.com/datasets/sreevishnudamodaran/hubmap-coco-dataset-512x512-tiled
    Explore at:
    Available download formats
    zip (739767398 bytes)
    Dataset updated
    Nov 20, 2020
    Authors
    Sreevishnu Damodaran
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This Dataset contains HuBMap Dataset in COCO format to use in any Object Detection and Instance Segmentation Task.

    The COCO format is readily supported by segmentation frameworks such as AdelaiDet, Detectron2, TensorFlow, etc.

    The dataset is structured with images split into directories and no downscaling was done.

    The following notebook explains how to convert custom annotations to COCO format:

    https://www.kaggle.com/sreevishnudamodaran/build-custom-coco-annotations-512x512-tiled

    Thanks to the Kaggle community and staff for all the support!

    Please don't forget to upvote and comment if you like my work :)

    Hope everyone finds this useful!

    Directory Structure:

       - coco_train
         - images (contains images in jpg format)
           - original_tiff_image_name
             - tile_column_number
               - image
               ...
             ...
           ...
         - train.json (contains all the segmentation annotations in COCO format, with the proper relative paths of the images)
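For readers unfamiliar with the layout of a COCO-format annotation file such as train.json, the sketch below builds a minimal one in memory. The top-level keys (`images`, `annotations`, `categories`) and the per-annotation fields follow the standard COCO detection format; the file name, category name and coordinates are made up for illustration and do not come from this dataset.

```python
import json
from collections import defaultdict

coco = {
    "images": [
        # file_name is a relative path, mirroring the tiled layout above (hypothetical names).
        {"id": 1, "file_name": "images/sample_tiff/0/tile_0.jpg", "width": 512, "height": 512},
    ],
    "annotations": [
        {
            "id": 1,
            "image_id": 1,
            "category_id": 1,
            "bbox": [100.0, 150.0, 80.0, 60.0],  # [x_min, y_min, width, height] in pixels
            "area": 4800.0,
            "iscrowd": 0,
            # Polygon as a flat list of x, y pairs (here simply the box corners).
            "segmentation": [[100.0, 150.0, 180.0, 150.0, 180.0, 210.0, 100.0, 210.0]],
        },
    ],
    "categories": [{"id": 1, "name": "object"}],  # hypothetical category name
}


def annotations_by_image(dataset):
    """Group annotation records by the image they belong to."""
    grouped = defaultdict(list)
    for ann in dataset["annotations"]:
        grouped[ann["image_id"]].append(ann)
    return dict(grouped)


if __name__ == "__main__":
    text = json.dumps(coco, indent=2)  # what would be written to a .json file
    print(len(text), "characters;", len(annotations_by_image(coco)[1]), "annotation(s) on image 1")
```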
    
  16. Coco 2017_train Image Dataset

    • universe.roboflow.com
    zip
    Updated Feb 21, 2024
    Cite
    COCOFinal (2024). Coco 2017_train Image Dataset [Dataset]. https://universe.roboflow.com/cocofinal-a52ez/coco-2017_train-image/dataset/1
    Explore at:
    Available download formats
    zip
    Dataset updated
    Feb 21, 2024
    Dataset authored and provided by
    COCOFinal
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Person Car Dog Cake Bounding Boxes
    Description

    COCO 2017_Train Image

    ## Overview
    
    COCO 2017_Train Image is a dataset for object detection tasks - it contains Person Car Dog Cake annotations for 300 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
    ## License
    
    This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  17. YOGData: Labelled data (YOLO and Mask R-CNN) for yogurt cup identification...

    • zenodo.org
    • data.niaid.nih.gov
    bin, zip
    Updated Jun 29, 2022
    Cite
    Symeon Symeonidis; Vasiliki Balaska; Dimitrios Tsilis; Fotis K. Konstantinidis (2022). YOGData: Labelled data (YOLO and Mask R-CNN) for yogurt cup identification within production lines [Dataset]. http://doi.org/10.5281/zenodo.6773531
    Explore at:
    Available download formats
    bin, zip
    Dataset updated
    Jun 29, 2022
    Dataset provided by
    Zenodo: http://zenodo.org/
    Authors
    Symeon Symeonidis; Vasiliki Balaska; Dimitrios Tsilis; Fotis K. Konstantinidis
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data abstract:
    The YogDATA dataset contains images from an industrial laboratory production line operating to quality-check yogurts. The case study for the recognition of yogurt cups requires training Mask R-CNN and YOLO v5.0 models with a set of corresponding images. Thus, it is important to collect the corresponding images to train and evaluate the class. Specifically, the YogDATA dataset includes the same labeled data for Mask R-CNN (COCO format) and YOLO models. For the YOLO architecture, the training and validation datasets include sets of images in jpg format and their annotations in txt file format. For the Mask R-CNN architecture, the annotations of the same sets of images are included in json file format (80% of the images and annotations of each subset are in the training set and 20% of the images of each subset are in the test set).
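The 80/20 division described above can be sketched as follows. This is a generic shuffled split, not the authors' procedure; the function name, the fixed seed and the file names are assumptions for illustration, and it would be applied to each subset separately.

```python
import random


def split_80_20(items, train_fraction=0.8, seed=0):
    """Shuffle a list of items and cut it into a training part and a test part."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


if __name__ == "__main__":
    # Hypothetical image names for one subset.
    names = [f"img_{i:03d}.jpg" for i in range(10)]
    train, test = split_80_20(names)
    print(len(train), "training images,", len(test), "test images")
```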

    Paper abstract:
    The explosion of the digitisation of the traditional industrial processes and procedures is consolidating a positive impact on modern society by offering a critical contribution to its economic development. In particular, the dairy sector consists of various processes, which are very demanding and thorough. It is crucial to leverage modern automation tools and through-engineering solutions to increase their efficiency and continuously meet challenging standards. Towards this end, in this work, an intelligent algorithm based on machine vision and artificial intelligence, which identifies dairy products within production lines, is presented. Furthermore, in order to train and validate the model, the YogDATA dataset, which includes yogurt cups within a production line, was created. Specifically, we evaluate two deep learning models (Mask R-CNN and YOLO v5.0) to recognise and detect each yogurt cup in a production line, in order to automate the packaging processes of the products. According to our results, the precision of the two models is similar, estimated at 99%.

  18. Coco 2017_train 300 Dataset

    • universe.roboflow.com
    zip
    Updated Feb 26, 2024
    Cite
    coc (2024). Coco 2017_train 300 Dataset [Dataset]. https://universe.roboflow.com/coc-qq6ry/coco-2017_train-300/dataset/1
    Explore at:
    Available download formats
    zip
    Dataset updated
    Feb 26, 2024
    Dataset authored and provided by
    coc
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Dog Person Car Cake Polygons
    Description

    COCO 2017_train 300

    ## Overview
    
    COCO 2017_train 300 is a dataset for instance segmentation tasks - it contains Dog Person Car Cake annotations for 300 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
    ## License
    
    This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  19. COCO King

    • kaggle.com
    zip
    Updated Apr 26, 2025
    Cite
    Phani Ratan (2025). COCO King [Dataset]. https://www.kaggle.com/datasets/knowledgeforyou/coco-king
    Explore at:
    Available download formats
    zip (656138535 bytes)
    Dataset updated
    Apr 26, 2025
    Authors
    Phani Ratan
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset Overview

    COCO-King is a large-scale dataset for reference-guided image completion tasks, derived from the COCO dataset. It features images with masked objects and corresponding reference images of those objects, enabling models to learn how to replace or complete masked regions with guidance from reference images.

    Dataset Size and Structure

    Total size: 690MB
    Images: 9,558 total images (8,134 training + 1,424 validation)
    Categories: 170 diverse object categories
    Directory structure:

    coco-king/
    ├── train/
    │   ├── images/      # Original images with objects to be masked
    │   ├── mask/        # Binary masks (white background, black object)
    │   └── reference/   # Augmented reference images of masked objects
    ├── val/
    │   ├── images/      # Validation images
    │   ├── mask/        # Validation masks
    │   └── reference/   # Validation reference images
    ├── metadata.json            # Complete dataset metadata
    ├── train_annotations.json   # COCO-format training annotations
    └── val_annotations.json     # COCO-format validation annotations

    Unique Features

    Specially Curated Masks

    Smoothed Contours: Each mask features smooth, rounded edges to mimic human-drawn masks rather than pixel-perfect segmentations

    Processing Pipeline: Masks underwent morphological operations and Gaussian blurring to create natural-looking boundaries

    Single Masked Object per Image: Each image has one primary object masked (the largest that meets size criteria), despite containing multiple objects (avg. 7 objects per image)

    Rich Reference Images

    Paint by Example Style Augmentations: Reference images are augmented similarly to the Paint by Example paper:

    • Mild color jittering (brightness, contrast, saturation, hue)
    • Random horizontal flips
    • Small random rotations (up to 10 degrees)
    • Mild perspective transformations
    • Occasional equalization and auto-contrast

    Balanced Object Selection

    • Size Range: Objects cover 0.89% to 42% of image area (average: ~25%)
    • Multiple Objects: Every image contains multiple objects (ranging from 2 to 29)
    • Diverse Categories: Well-distributed across 170 object categories
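The rule of masking the largest object that meets the size criteria can be sketched as below. The dataset's actual selection code is not shown in this listing, so this is only an illustration: it uses bounding-box area as a stand-in for object area, treats the reported 0.89% to 42% range as hypothetical thresholds, and the function names are invented.

```python
def area_fraction(bbox, img_w, img_h):
    """Fraction of the image covered by a COCO-style box [x, y, width, height]."""
    _, _, w, h = bbox
    return (w * h) / (img_w * img_h)


def pick_object_to_mask(boxes, img_w, img_h, min_frac=0.0089, max_frac=0.42):
    """Return the largest box whose area fraction lies in [min_frac, max_frac], or None."""
    eligible = [b for b in boxes if min_frac <= area_fraction(b, img_w, img_h) <= max_frac]
    return max(eligible, key=lambda b: b[2] * b[3], default=None)


if __name__ == "__main__":
    # Hypothetical 640x480 image with a tiny, a medium and a dominant object.
    boxes = [[0, 0, 10, 10], [50, 40, 300, 200], [0, 0, 600, 400]]
    print(pick_object_to_mask(boxes, 640, 480))
```

Here the tiny box falls below the lower bound and the dominant box exceeds the upper bound, so the medium box is selected.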

    Dataset Highlights

    • Person is the most common category (1,138 training, 184 validation)
    • Top categories include sky, trees, clouds, road, grass, walls, buildings
    • Average of 7 objects per image provides context and complexity
    • Bounding boxes are strategically sized to be neither too small nor too dominant
    • Each image-mask-reference triplet is carefully curated to ensure quality

    Applications

    This dataset is ideal for:

    Exemplar-based image inpainting/completion: Using reference images to guide the filling of masked regions

    Reference-guided object placement: Learning to place objects in scenes with proper perspective and lighting

    Object replacement: Replacing objects in images with new objects while maintaining scene coherence

    Style/appearance transfer: Learning to transfer appearance characteristics to objects in new scenes

    Research on Paint by Example or similar architectures: Models that aim to fill masked regions based on reference images

    Data Processing

    Derived from COCO dataset with additional processing

    Each image triplet (image, mask, reference) was processed to ensure:

    • The masked object is of appropriate size
    • Masks have smooth, natural contours
    • Reference images maintain object identity while providing variation through augmentation

    This dataset offers a unique resource for developing and benchmarking models that can intelligently replace or complete portions of images based on reference examples.

  20. Face Features Test Dataset

    • universe.roboflow.com
    zip
    Updated Dec 6, 2021
    Cite
    Peter Lin (2021). Face Features Test Dataset [Dataset]. https://universe.roboflow.com/peter-lin/face-features-test/dataset/1
    Explore at:
    Available download formats
    zip
    Dataset updated
    Dec 6, 2021
    Dataset authored and provided by
    Peter Lin
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Variables measured
    Face Features Bounding Boxes
    Description

    A simple dataset for benchmarking CreateML object detection models. The images are sampled from COCO dataset with eyes and nose bounding boxes added. It's not meant to be serious or useful in a real application. The purpose is to look at how long it takes to train CreateML models with varying dataset and batch sizes.

    Training performance is affected by model configuration, dataset size and batch configuration. Larger models and batches require more memory. I used CreateML object detection project to compare the performance.

    Hardware

    M1 Macbook Air
    * 8 GPU
    * 4/4 CPU
    * 16G memory
    * 512G SSD

    M1 Max Macbook Pro
    * 24 GPU
    * 2/8 CPU
    * 32G memory
    * 2T SSD

    Small Dataset
    Train: 144 Valid: 16 Test: 8

    Results

    | batch | M1 ET | M1Max ET | peak mem G |
    |--------|:------|:---------|:-----------|
    | 16 | 16 | 11 | 1.5 |
    | 32 | 29 | 17 | 2.8 |
    | 64 | 56 | 30 | 5.4 |
    | 128 | 170 | 57 | 12 |

    Larger Dataset
    Train: 301 Valid: 29 Test: 18

    Results

    | batch | M1 ET | M1Max ET | peak mem G |
    |--------|:------|:---------|:-----------|
    | 16 | 21 | 10 | 1.5 |
    | 32 | 42 | 17 | 3.5 |
    | 64 | 85 | 30 | 8.4 |
    | 128 | 281 | 54 | 16.5 |

    CreateML Settings

    For all tests, training was set to Full Network. I closed CreateML between each run to make sure memory issues didn't cause a slow down. There is a bug with Monterey as of 11/2021 that leads to memory leak. I kept an eye on the memory usage. If it looked like there was a memory leak, I restarted MacOS.

    Observations

    In general, more GPU and memory with MBP reduces the training time. Having more memory lets you train with larger datasets. On M1 Macbook Air, the practical limit is 12G before memory pressure impacts performance. On M1 Max MBP, the practical limit is 26G before memory pressure impacts performance. To work around memory pressure, use smaller batch sizes.

    On the larger dataset with batch size 128, the M1Max is 5x faster than the Macbook Air. Keep in mind a real dataset should have thousands of samples like Coco or Pascal. Ideally, you want a dataset with 100K images for experimentation and millions for the real training. The new M1 Max Macbooks are a cost-effective alternative to building a Windows/Linux workstation with an RTX 3090 24G. For most of 2021, the price of an RTX 3090 with 24G was around $3,000.00. That means an equivalent Windows workstation would cost the same as the M1Max Macbook Pro I used to run the benchmarks.
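The "5x faster" figure can be reproduced from the larger-dataset results reported earlier in this entry. The snippet below simply divides the M1 elapsed time by the M1Max elapsed time for each batch size; the numbers are copied from those results and the variable names are illustrative.

```python
# Elapsed times (M1 ET, M1Max ET) from the larger-dataset results; units as in the source.
larger_dataset_et = {16: (21, 10), 32: (42, 17), 64: (85, 30), 128: (281, 54)}


def speedup(m1_et, m1max_et):
    """How many times faster the M1Max run finished compared with the M1 run."""
    return m1_et / m1max_et


speedups = {batch: speedup(*ets) for batch, ets in larger_dataset_et.items()}

for batch, ratio in speedups.items():
    print(f"batch {batch}: M1Max is {ratio:.1f}x faster")
```

The ratio grows with batch size, from about 2x at batch 16 to about 5x at batch 128, which matches the memory-pressure explanation given in the observations.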

    Full Network vs Transfer Learning

    As of CreateML 3, training with full network doesn't fully utilize the GPU. I don't know why it works that way. You have to select transfer learning to fully use the GPU. The results of transfer learning with the larger dataset follow. In general, the training time is faster and loss is better.

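    | batch | ET min | Train Acc | Val Acc | Test Acc | Top IU Train | Top IU Valid | Top IU Test | Peak mem G | loss |
    |-------|--------|-----------|---------|----------|--------------|--------------|-------------|------------|------|
    | 16 | 4 | 75 | 19 | 12 | 78 | 23 | 13 | 1.5 | 0.41 |
    | 32 | 8 | 75 | 21 | 10 | 78 | 26 | 11 | 2.76 | 0.02 |
    | 64 | 13 | 75 | 23 | 8 | 78 | 24 | 9 | 5.3 | 0.017 |
    | 128 | 25 | 75 | 22 | 13 | 78 | 25 | 14 | 8.4 | 0.012 |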

    Github Project

    The source code and full results are up on Github https://github.com/woolfel/createmlbench
