Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is the full 2017 COCO object detection dataset (train and validation), which is a subset of the most recent 2020 COCO object detection dataset.
COCO is a large-scale object detection, segmentation, and captioning dataset of many object types easily recognizable by a 4-year-old. The data was originally collected and published by Microsoft; see the COCO website for the original source of the data and the paper introducing the dataset.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
I wanted to train a custom YOLO object detection model, but the MS-COCO dataset was not in a suitable format, so I parsed the instances JSON files in the MS-COCO annotations and processed the dataset into a YOLO-friendly format.
I downloaded the dataset from the COCO website; you can download any split you need there.
Directory info:
1. test: contains only the test images.
2. train: has two subfolders: images (the training images) and labels (one .txt label file per training image).
3. val: has two subfolders: images (the validation images) and labels (one .txt label file per validation image).
I do not own the dataset in any way; I merely parsed it into a ready-to-train YOLO format. Download the original dataset from the COCO website.
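As a rough illustration of the parsing described above, here is a minimal sketch of converting COCO instance annotations to YOLO label files; the file paths are hypothetical and the details of the original conversion may differ:

import json
from pathlib import Path

# Hypothetical paths; point these at the downloaded COCO annotations.
ANNOTATIONS = Path("annotations/instances_train2017.json")
LABELS_DIR = Path("train/labels")
LABELS_DIR.mkdir(parents=True, exist_ok=True)

coco = json.loads(ANNOTATIONS.read_text())
images = {img["id"]: img for img in coco["images"]}
# Map sparse COCO category ids (1-90) to contiguous YOLO class ids (0-79).
class_of = {cid: i for i, cid in enumerate(sorted(c["id"] for c in coco["categories"]))}

for ann in coco["annotations"]:
    img = images[ann["image_id"]]
    w, h = img["width"], img["height"]
    x, y, bw, bh = ann["bbox"]  # COCO boxes: top-left x, y, width, height in pixels
    # YOLO format: class cx cy w h, all normalized to [0, 1]
    line = (f"{class_of[ann['category_id']]} "
            f"{(x + bw / 2) / w:.6f} {(y + bh / 2) / h:.6f} "
            f"{bw / w:.6f} {bh / h:.6f}\n")
    with (LABELS_DIR / (Path(img["file_name"]).stem + ".txt")).open("a") as f:
        f.write(line)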
COCO is a large-scale object detection, segmentation, and captioning dataset.
Note:
* Some images from the train and validation sets don't have annotations.
* COCO 2014 and 2017 use the same images, but different train/val/test splits.
* The test split doesn't have any annotations (only images).
* COCO defines 91 classes, but the data only uses 80 of them.
* Panoptic annotations define 200 classes but only use 133.
To use this dataset:
import tensorflow_datasets as tfds
ds = tfds.load('coco', split='train')
for ex in ds.take(4):
  print(ex)
See the guide for more information on tensorflow_datasets.
![Visualization](https://storage.googleapis.com/tfds-data/visualization/fig/coco-2014-1.1.0.png)
This dataset contains all COCO 2017 images and annotations, split into training (118,287 images) and validation (5,000 images).
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Load COCO 2017 dataset: loads any dataset in COCO format into Ikomia format. Any training algorithm from the Ikomia marketplace can then be connected to this converter...
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The COCO dataset is a large collection of labeled images and annotations, and a popular dataset for machine learning and artificial intelligence research. It consists of 330,000 images and 500,000 object annotations. The annotations include the bounding boxes of objects in the images, as well as the labels of the objects.
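For reference, a single object annotation in COCO's JSON format looks roughly like the following; the field names follow the COCO format, while the concrete values are made up for illustration:

# One COCO object annotation (illustrative values)
annotation = {
    "id": 1,
    "image_id": 397133,
    "category_id": 18,                    # category ids are sparse (1-90)
    "bbox": [102.5, 48.0, 210.0, 320.0],  # [x, y, width, height] in pixels
    "area": 67200.0,
    "iscrowd": 0,
}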
COCO Object Detection Dataset | 2017
Downloaded from the COCO website; it includes the train images for now.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset is a subset of the COCO 2017 train images, filtered to the "crowd" and "person" labels, with the first caption of each image.
COCO Summary: The COCO dataset is a comprehensive collection designed for object detection, segmentation, and captioning tasks. It comprises over 200,000 images, encompassing a diverse array of everyday scenes and objects. Each image features multiple objects and scenes across 80 distinct object categories, all of which are annotated with descriptive image captions.
COCO minitrain is a curated mini training set (25K images, roughly 20% of train2017) for COCO.
@inproceedings{HoughNet,
author = {Nermin Samet and Samet Hicsonmez and Emre Akbas},
title = {HoughNet: Integrating near and long-range evidence for bottom-up object detection},
booktitle = {European Conference on Computer Vision (ECCV)},
year = {2020},
}
GNU AGPL-3.0: http://www.gnu.org/licenses/agpl-3.0.html
Ultralytics COCO8 is a small but versatile object detection dataset composed of the first 8 images of the COCO train 2017 set: 4 for training and 4 for validation. This dataset is ideal for testing and debugging object detection models, or for experimenting with new detection approaches. With 8 images, it is small enough to be easily manageable, yet diverse enough to exercise training pipelines for errors and act as a sanity check before training on larger datasets.
To train a YOLOv8n model on the COCO8 dataset for 100 epochs with an image size of 640, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model Training page.
# Start training from a pretrained *.pt model
yolo detect train data=coco8.yaml model=yolov8n.pt epochs=100 imgsz=640
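The same run can also be launched from Python; a minimal sketch using the ultralytics package (assuming it is installed):

from ultralytics import YOLO

# Start from a pretrained checkpoint and train on COCO8.
model = YOLO("yolov8n.pt")
model.train(data="coco8.yaml", epochs=100, imgsz=640)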
A collection of 3 referring expression datasets based on images in the COCO dataset. A referring expression is a piece of text that describes a unique object in an image. These datasets were collected by asking human raters to disambiguate objects delineated by bounding boxes in the COCO dataset.
RefCoco and RefCoco+ are from Kazemzadeh et al. 2014. RefCoco+ expressions are strictly appearance-based descriptions, which they enforced by preventing raters from using location-based descriptions (e.g., "person to the right" is not a valid description for RefCoco+). RefCocoG is from Mao et al. 2016, and has richer descriptions of objects compared to RefCoco due to differences in the annotation process. In particular, RefCoco was collected in an interactive game-based setting, while RefCocoG was collected in a non-interactive setting. On average, RefCocoG has 8.4 words per expression while RefCoco has 3.5 words.
Each dataset has different split allocations that are typically all reported in papers. The "testA" and "testB" sets in RefCoco and RefCoco+ contain only people and only non-people, respectively. Images are partitioned into the various splits. In the "google" split, objects, not images, are partitioned between the train and non-train splits. This means that the same image can appear in both the train and validation splits, but the objects being referred to will differ between the two sets. In contrast, the "unc" and "umd" splits partition images between the train, validation, and test splits. In RefCocoG, the "google" split does not have a canonical test set, and the validation set is typically reported in papers as "val*".
Stats for each dataset and split ("refs" is the number of referring expressions, and "images" is the number of images):
| dataset | partition | split | refs | images |
|---|---|---|---|---|
| refcoco | google | train | 40000 | 19213 |
| refcoco | google | val | 5000 | 4559 |
| refcoco | google | test | 5000 | 4527 |
| refcoco | unc | train | 42404 | 16994 |
| refcoco | unc | val | 3811 | 1500 |
| refcoco | unc | testA | 1975 | 750 |
| refcoco | unc | testB | 1810 | 750 |
| refcoco+ | unc | train | 42278 | 16992 |
| refcoco+ | unc | val | 3805 | 1500 |
| refcoco+ | unc | testA | 1975 | 750 |
| refcoco+ | unc | testB | 1798 | 750 |
| refcocog | google | train | 44822 | 24698 |
| refcocog | google | val | 5000 | 4650 |
| refcocog | umd | train | 42226 | 21899 |
| refcocog | umd | val | 2573 | 1300 |
| refcocog | umd | test | 5023 | 2600 |
To use this dataset:
import tensorflow_datasets as tfds
ds = tfds.load('ref_coco', split='train')
for ex in ds.take(4):
  print(ex)
See the guide for more information on tensorflow_datasets.
![Visualization](https://storage.googleapis.com/tfds-data/visualization/fig/ref_coco-refcoco_unc-1.1.0.png)
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
COCO Train Sample is a dataset for object detection tasks - it contains Bicycle, Car, and Person annotations for 8,057 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Val Dataset Creation For COCO + Landing Pad Image Dataset is a dataset for object detection tasks - it contains COCO and LandingPads annotations for 1,852 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
CC0 1.0 Public Domain: https://creativecommons.org/publicdomain/zero/1.0/
The dataset is structured with images split into directories; no downscaling was done.
The following notebook explains how to convert custom annotations to COCO format:
https://www.kaggle.com/sreevishnudamodaran/build-custom-coco-annotations-512x512-tiled
- coco_train
  - images (contains images in jpg format)
    - original_tiff_image_name
      - tile_column_number
        - image
        - …
  - train.json (contains all the segmentation annotations in COCO format, with proper relative paths to the images)
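As an illustrative sketch, the resulting train.json can be loaded with pycocotools; the path below is an assumption based on the layout above:

from pycocotools.coco import COCO

# Load the COCO-format annotation file described above.
coco = COCO("coco_train/train.json")
img_ids = coco.getImgIds()
print(f"{len(img_ids)} images, {len(coco.getAnnIds())} annotations")

# Fetch the segmentation annotations for the first image.
anns = coco.loadAnns(coco.getAnnIds(imgIds=img_ids[:1]))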
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
COCO 2017_Train Image is a dataset for object detection tasks - it contains Person, Car, Dog, and Cake annotations for 300 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data abstract:
The YogDATA dataset contains images from an industrial laboratory production line during the quality inspection of yogurts. The case study of recognizing yogurt cups requires training Mask R-CNN and YOLO v5.0 models with a set of corresponding images, so it is important to collect such images to train and evaluate the class. Specifically, the YogDATA dataset includes the same labeled data for Mask R-CNN (COCO format) and YOLO models. For the YOLO architecture, the training and validation datasets include sets of images in jpg format and their annotations in txt file format. For the Mask R-CNN architecture, the annotations of the same sets of images are included in json file format (80% of the images and annotations of each subset are in the training set and 20% of the images of each subset are in the test set).
Paper abstract:
The explosion of the digitisation of traditional industrial processes and procedures is consolidating a positive impact on modern society by offering a critical contribution to its economic development. In particular, the dairy sector consists of various processes, which are very demanding and thorough. It is crucial to leverage modern automation tools and through-engineering solutions to increase their efficiency and continuously meet challenging standards. Towards this end, in this work, an intelligent algorithm based on machine vision and artificial intelligence, which identifies dairy products within production lines, is presented. Furthermore, in order to train and validate the model, the YogDATA dataset was created, which includes yogurt cups within a production line. Specifically, we evaluate two deep learning models (Mask R-CNN and YOLO v5.0) to recognise and detect each yogurt cup in a production line, in order to automate the packaging processes of the products. According to our results, the performance precision of the two models is similar, estimated at 99%.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
COCO 2017_train 300 is a dataset for instance segmentation tasks - it contains Dog, Person, Car, and Cake annotations for 300 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Dataset Overview
COCO-King is a large-scale dataset for reference-guided image completion tasks, derived from the COCO dataset. It features images with masked objects and corresponding reference images of those objects, enabling models to learn how to replace or complete masked regions with guidance from reference images.
Dataset Size and Structure
Total size: 690MB
Images: 9,558 total (8,134 training + 1,424 validation)
Categories: 170 diverse object categories
Directory structure:
coco-king/
├── train/
│   ├── images/     # Original images with objects to be masked
│   ├── mask/       # Binary masks (white background, black object)
│   └── reference/  # Augmented reference images of masked objects
├── val/
│   ├── images/     # Validation images
│   ├── mask/       # Validation masks
│   └── reference/  # Validation reference images
├── metadata.json            # Complete dataset metadata
├── train_annotations.json   # COCO-format training annotations
└── val_annotations.json     # COCO-format validation annotations
Unique Features
Specially Curated Masks
Smoothed Contours: Each mask features smooth, rounded edges to mimic human-drawn masks rather than pixel-perfect segmentations
Processing Pipeline: Masks underwent morphological operations and Gaussian blurring to create natural-looking boundaries (see the sketch after this list)
Single Masked Object per Image: Each image has one primary object masked (the largest that meets size criteria), despite containing multiple objects (avg. 7 objects per image)
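A minimal sketch of this kind of mask smoothing, assuming OpenCV; the file name, kernel sizes, and threshold here are illustrative guesses, not the dataset authors' values:

import cv2

# Masks are described as white background / black object, so invert first
# to make the object the white foreground that morphology operates on.
mask = cv2.imread("train/mask/example.png", cv2.IMREAD_GRAYSCALE)
obj = cv2.bitwise_not(mask)

# Morphological closing rounds off jagged contour pixels.
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (15, 15))
obj = cv2.morphologyEx(obj, cv2.MORPH_CLOSE, kernel)

# Gaussian blur plus re-thresholding softens the boundary further.
obj = cv2.GaussianBlur(obj, (21, 21), 0)
_, obj = cv2.threshold(obj, 127, 255, cv2.THRESH_BINARY)

# Restore the white-background convention.
smoothed = cv2.bitwise_not(obj)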
Rich Reference Images
Paint by Example Style Augmentations: reference images are augmented similarly to the Paint by Example paper:
- Mild color jittering (brightness, contrast, saturation, hue)
- Random horizontal flips
- Small random rotations (up to 10 degrees)
- Mild perspective transformations
- Occasional equalization and auto-contrast
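A plausible recreation of this augmentation recipe with torchvision transforms; the exact magnitudes and probabilities are assumptions:

from torchvision import transforms

reference_aug = transforms.Compose([
    transforms.ColorJitter(brightness=0.1, contrast=0.1, saturation=0.1, hue=0.05),
    transforms.RandomHorizontalFlip(),
    transforms.RandomRotation(degrees=10),
    transforms.RandomPerspective(distortion_scale=0.1, p=0.3),
    transforms.RandomEqualize(p=0.1),      # occasional histogram equalization
    transforms.RandomAutocontrast(p=0.1),  # occasional auto-contrast
])

# Usage: augmented = reference_aug(pil_image) for a PIL reference image.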
Balanced Object Selection
Size Range: objects cover 0.89% to 42% of image area (average: ~25%)
Multiple Objects: every image contains multiple objects (ranging from 2 to 29)
Diverse Categories: well-distributed across 170 object categories
Dataset Highlights
Applications
This dataset is ideal for:
- Exemplar-based image inpainting/completion: using reference images to guide the filling of masked regions
- Reference-guided object placement: learning to place objects in scenes with proper perspective and lighting
- Object replacement: replacing objects in images with new objects while maintaining scene coherence
- Style/appearance transfer: learning to transfer appearance characteristics to objects in new scenes
- Research on Paint by Example or similar architectures: models that aim to fill masked regions based on reference images
Data Processing
Derived from the COCO dataset with additional processing. Each image triplet (image, mask, reference) was processed to ensure:
- The masked object is of appropriate size
- Masks have smooth, natural contours
- Reference images maintain object identity while providing variation through augmentation
This dataset offers a unique resource for developing and benchmarking models that can intelligently replace or complete portions of images based on reference examples.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
A simple dataset for benchmarking CreateML object detection models. The images are sampled from the COCO dataset, with eyes and nose bounding boxes added. It's not meant to be serious or useful in a real application; the purpose is to look at how long it takes to train CreateML models with varying dataset and batch sizes.
Training performance is affected by model configuration, dataset size, and batch configuration. Larger models and batches require more memory. I used a CreateML object detection project to compare the performance.
Hardware
M1 MacBook Air: 8 GPU cores, 4/4 CPU cores, 16G memory, 512G SSD
M1 Max MacBook Pro: 24 GPU cores, 2/8 CPU cores, 32G memory, 2T SSD
Small Dataset: Train: 144, Valid: 16, Test: 8

Results:

| batch | M1 ET | M1Max ET | peak mem G |
|-------|:------|:---------|:-----------|
| 16    | 16    | 11       | 1.5        |
| 32    | 29    | 17       | 2.8        |
| 64    | 56    | 30       | 5.4        |
| 128   | 170   | 57       | 12         |
Larger Dataset: Train: 301, Valid: 29, Test: 18

Results:

| batch | M1 ET | M1Max ET | peak mem G |
|-------|:------|:---------|:-----------|
| 16    | 21    | 10       | 1.5        |
| 32    | 42    | 17       | 3.5        |
| 64    | 85    | 30       | 8.4        |
| 128   | 281   | 54       | 16.5       |
CreateML Settings
For all tests, training was set to Full Network. I closed CreateML between each run to make sure memory issues didn't cause a slowdown. There is a bug in Monterey as of 11/2021 that leads to a memory leak. I kept an eye on the memory usage; if it looked like there was a memory leak, I restarted macOS.
Observations
In general, the extra GPU cores and memory of the MBP reduce the training time, and having more memory lets you train with larger datasets. On the M1 MacBook Air, the practical limit is 12G before memory pressure impacts performance; on the M1 Max MBP, the practical limit is 26G. To work around memory pressure, use smaller batch sizes.
On the larger dataset with batch size 128, the M1 Max is 5x faster than the MacBook Air. Keep in mind a real dataset should have thousands of samples, like COCO or Pascal; ideally, you want a dataset with 100K images for experimentation and millions for the real training. The M1 Max MacBook is a cost-effective alternative to building a Windows/Linux workstation with an RTX 3090 24G. For most of 2021, the price of an RTX 3090 with 24G was around $3,000.00, which means an equivalent Windows workstation would cost about the same as the M1 Max MacBook Pro I used to run the benchmarks.
Full Network vs Transfer Learning
As of CreateML 3, training with Full Network doesn't fully utilize the GPU; I don't know why it works that way. You have to select transfer learning to fully use the GPU. The table below shows the results of transfer learning with the larger dataset. In general, the training time is faster and the loss is better.
| batch | ET min | Train Acc | Val Acc | Test Acc | Top IU Train | Top IU Valid | Top IU Test | Peak mem G | loss |
|---|---|---|---|---|---|---|---|---|---|
| 16 | 4 | 75 | 19 | 12 | 78 | 23 | 13 | 1.5 | 0.41 |
| 32 | 8 | 75 | 21 | 10 | 78 | 26 | 11 | 2.76 | 0.02 |
| 64 | 13 | 75 | 23 | 8 | 78 | 24 | 9 | 5.3 | 0.017 |
| 128 | 25 | 75 | 22 | 13 | 78 | 25 | 14 | 8.4 | 0.012 |
Github Project
The source code and full results are up on Github https://github.com/woolfel/createmlbench