50 datasets found
  1. T

    coco

    • tensorflow.org
    • huggingface.co
    Updated Jun 1, 2024
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    (2024). coco [Dataset]. https://www.tensorflow.org/datasets/catalog/coco
    Explore at:
    Dataset updated
    Jun 1, 2024
    Description

    COCO is a large-scale object detection, segmentation, and captioning dataset.

    Note: * Some images from the train and validation sets don't have annotations. * Coco 2014 and 2017 uses the same images, but different train/val/test splits * The test split don't have any annotations (only images). * Coco defines 91 classes but the data only uses 80 classes. * Panotptic annotations defines defines 200 classes but only uses 133.

    To use this dataset:

    import tensorflow_datasets as tfds
    
    ds = tfds.load('coco', split='train')
    for ex in ds.take(4):
     print(ex)
    

    See the guide for more informations on tensorflow_datasets.

    https://storage.googleapis.com/tfds-data/visualization/fig/coco-2014-1.1.0.png" alt="Visualization" width="500px">

  2. R

    Microsoft Coco Dataset

    • universe.roboflow.com
    zip
    Updated Jul 23, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Microsoft (2025). Microsoft Coco Dataset [Dataset]. https://universe.roboflow.com/microsoft/coco/model/3
    Explore at:
    zipAvailable download formats
    Dataset updated
    Jul 23, 2025
    Dataset authored and provided by
    Microsoft
    Variables measured
    Object Bounding Boxes
    Description

    Microsoft Common Objects in Context (COCO) Dataset

    The Common Objects in Context (COCO) dataset is a widely recognized collection designed to spur object detection, segmentation, and captioning research. Created by Microsoft, COCO provides annotations, including object categories, keypoints, and more. The model it a valuable asset for machine learning practitioners and researchers. Today, many model architectures are benchmarked against COCO, which has enabled a standard system by which architectures can be compared.

    While COCO is often touted to comprise over 300k images, it's pivotal to understand that this number includes diverse formats like keypoints, among others. Specifically, the labeled dataset for object detection stands at 123,272 images.

    The full object detection labeled dataset is made available here, ensuring researchers have access to the most comprehensive data for their experiments. With that said, COCO has not released their test set annotations, meaning the test data doesn't come with labels. Thus, this data is not included in the dataset.

    The Roboflow team has worked extensively with COCO. Here are a few links that may be helpful as you get started working with this dataset:

  3. R

    Coco Class Dataset

    • universe.roboflow.com
    zip
    Updated Mar 29, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    pick perfect (2025). Coco Class Dataset [Dataset]. https://universe.roboflow.com/pick-perfect-gngtx/coco-class/dataset/2
    Explore at:
    zipAvailable download formats
    Dataset updated
    Mar 29, 2025
    Dataset authored and provided by
    pick perfect
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Maturity Bounding Boxes
    Description

    Coco Class

    ## Overview
    
    Coco Class is a dataset for object detection tasks - it contains Maturity annotations for 6,187 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
      ## License
    
      This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/CC BY 4.0).
    
  4. Z

    COCO, LVIS, Open Images V4 classes mapping

    • data.niaid.nih.gov
    • zenodo.org
    • +1more
    Updated Oct 13, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Giuseppe Amato; Paolo Bolettieri; Fabio Carrara; Fabrizio Falchi; Claudio Gennaro; Nicola Messina; Lucia Vadicamo; Claudio Vairo (2022). COCO, LVIS, Open Images V4 classes mapping [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7194299
    Explore at:
    Dataset updated
    Oct 13, 2022
    Dataset provided by
    ISTI-CNR
    Authors
    Giuseppe Amato; Paolo Bolettieri; Fabio Carrara; Fabrizio Falchi; Claudio Gennaro; Nicola Messina; Lucia Vadicamo; Claudio Vairo
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This repository contains a mapping between the classes of COCO, LVIS, and Open Images V4 datasets into a unique set of 1460 classes.

    COCO [Lin et al 2014] contains 80 classes, LVIS [gupta2019lvis] contains 1460 classes, Open Images V4 [Kuznetsova et al. 2020] contains 601 classes.

    We built a mapping of these classes using a semi-automatic procedure in order to have a unique final list of 1460 classes. We also generated a hierarchy for each class, using wordnet

    This repository contains the following files:

    coco_classes_map.txt, contains the mapping for the 80 coco classes

    lvis_classes_map.txt, contains the mapping for the 1460 coco classes

    openimages_classes_map.txt, contains the mapping for the 601 coco classes

    classname_hyperset_definition.csv, contains the final set of 1460 classes, their definition and hierarchy

    all-classnames.xlsx, contains a side-by-side view of all classes considered

    This mapping was used in VISIONE [Amato et al. 2021, Amato et al. 2022] that is a content-based retrieval system that supports various search functionalities (text search, object/color-based search, semantic and visual similarity search, temporal search). For the object detection VISIONE uses three pre-trained models: VfNet Zhang et al. 2021, Mask R-CNN He et al. 2017, and a Faster R-CNN+Inception ResNet (trained on the Open Images V4).

    This is repository is released under a Creative Commons Attribution license, please cite the following paper if you use it in your work in any form:

    @inproceedings{amato2021visione, title={The visione video search system: exploiting off-the-shelf text search engines for large-scale video retrieval}, author={Amato, Giuseppe and Bolettieri, Paolo and Carrara, Fabio and Debole, Franca and Falchi, Fabrizio and Gennaro, Claudio and Vadicamo, Lucia and Vairo, Claudio}, journal={Journal of Imaging}, volume={7}, number={5}, pages={76}, year={2021}, publisher={Multidisciplinary Digital Publishing Institute} }

    References:

    [Amato et al. 2022] Amato, G. et al. (2022). VISIONE at Video Browser Showdown 2022. In: , et al. MultiMedia Modeling. MMM 2022. Lecture Notes in Computer Science, vol 13142. Springer, Cham. https://doi.org/10.1007/978-3-030-98355-0_52

    [Amato et al. 2021] Amato, G., Bolettieri, P., Carrara, F., Debole, F., Falchi, F., Gennaro, C., Vadicamo, L. and Vairo, C., 2021. The visione video search system: exploiting off-the-shelf text search engines for large-scale video retrieval. Journal of Imaging, 7(5), p.76.

    [Gupta et al.2019] Gupta, A., Dollar, P. and Girshick, R., 2019. Lvis: A dataset for large vocabulary instance segmentation. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 5356-5364).

    [He et al. 2017] He, K., Gkioxari, G., Dollár, P. and Girshick, R., 2017. Mask r-cnn. In Proceedings of the IEEE international conference on computer vision (pp. 2961-2969).

    [Kuznetsova et al. 2020] Kuznetsova, A., Rom, H., Alldrin, N., Uijlings, J., Krasin, I., Pont-Tuset, J., Kamali, S., Popov, S., Malloci, M., Kolesnikov, A. and Duerig, T., 2020. The open images dataset v4. International Journal of Computer Vision, 128(7), pp.1956-1981.

    [Lin et al. 2014] Lin, T.Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P. and Zitnick, C.L., 2014, September. Microsoft coco: Common objects in context. In European conference on computer vision (pp. 740-755). Springer, Cham.

    [Zhang et al. 2021] Zhang, H., Wang, Y., Dayoub, F. and Sunderhauf, N., 2021. Varifocalnet: An iou-aware dense object detector. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 8514-8523).

  5. MJ-COCO-2025 Dataset

    • kaggle.com
    zip
    Updated May 28, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    MJ-COCO-2025 (2025). MJ-COCO-2025 Dataset [Dataset]. https://www.kaggle.com/datasets/mjcoco2025/mj-coco-2025
    Explore at:
    zip(91583066 bytes)Available download formats
    Dataset updated
    May 28, 2025
    Authors
    MJ-COCO-2025
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    MJ-COCO-2025 is a modified version of the MS-COCO-2017 dataset, in which the annotation errors have been automatically corrected using model-driven methods. The name "MJ" originates from the initials of Min Je Kim, the individual who updated the dataset. "MJ" also stands for "Modification & Justification," emphasizing that the modifications were not manually edited but were systematically validated through machine learning models to increase reliability and quality. Thus, MJ-COCO-2025 reflects both a personal identity and a commitment to improving the dataset through thoughtful modification, ensuring improved accuracy, reliability and consistency. The comparative results of MS-COCO and MJ-COCO datasets are presented in Table 1 and Figure 1. The MJ-COCO-2025 dataset features the improvements, including fixes for group annotations, addition of missing annotations, removal of redundant or overlapping labels, etc. These refinements aim to improve training and evaluation performance in object detection tasks.

    Summary of Improvements:

    The re-labeled MJ-COCO-2025 dataset exhibits notable improvements in annotation quality compared to the original MS-COCO-2017 dataset. As shown in Table 1, it includes substantial increases in categories such as previously missing annotations and group annotations. At the same time, the dataset has been refined by reducing annotation noise through the removal of duplicates, resolution of challenging or debatable cases, and elimination of non-existent object annotations.

    Table 1: Comparison of Class-wise Annotations: MS-COCO-2017 and MJ-COCO-2025. Class Names | MS-COCO | MJ-COCO | Difference | Class Names | MS-COCO | MJ-COCO | Difference ---------------------|---------|---------|------------|----------------------|---------|---------|------------ Airplane | 5,135 | 5,810 | 675 | Kite | 9,076 | 15,092 | 6,016 Apple | 5,851 | 19,527 | 13,676 | Knife | 7,770 | 6,697 | -1,073 Backpack | 8,720 | 10,029 | 1,309 | Laptop | 4,970 | 5,280 | 310 Banana | 9,458 | 49,705 | 40,247 | Microwave | 1,673 | 1,755 | 82 Baseball Bat | 3,276 | 3,517 | 241 | Motorcycle | 8,725 | 10,045 | 1,320 Baseball Glove | 3,747 | 3,440 | -307 | Mouse | 2,262 | 2,377 | 115 Bear | 1,294 | 1,311 | 17 | Orange | 6,399 | 18,416 | 12,017 Bed | 4,192 | 4,177 | -15 | Oven | 3,334 | 4,310 | 976 Bench | 9,838 | 9,784 | -54 | Parking Meter | 1,285 | 1,355 | 70 Bicycle | 7,113 | 7,853 | 740 | Person | 262,465 | 435,252 | 172,787 Bird | 10,806 | 13,346 | 2,540 | Pizza | 5,821 | 6,049 | 228 Boat | 10,759 | 13,386 | 2,627 | Potted Plant | 8,652 | 11,252 | 2,600 Book | 24,715 | 35,712 | 10,997 | Refrigerator | 2,637 | 2,728 | 91 Bottle | 24,342 | 32,455 | 8,113 | Remote | 5,703 | 5,428 | -275 Bowl | 14,358 | 13,591 | -767 | Sandwich | 4,373 | 3,925 | -448 Broccoli | 7,308 | 14,275 | 6,967 | Scissors | 1,481 | 1,558 | 77 Bus | 6,069 | 7,132 | 1,063 | Sheep | 9,509 | 12,813 | 3,304 Cake | 6,353 | 8,968 | 2,615 | Sink | 5,610 | 5,969 | 359 Car | 43,867 | 51,662 | 7,795 | Skateboard | 5,543 | 5,761 | 218 Carrot | 7,852 | 15,411 | 7,559 | Skis | 6,646 | 8,945 | 2,299 Cat | 4,768 | 4,895 | 127 | Snowboard | 2,685 | 2,565 | -120 Cell Phone | 6,434 | 6,642 | 208 | Spoon | 6,165 | 6,156 | -9 Chair | 38,491 | 56,750 | 18,259 | Sports Ball | 6,347 | 6,060 | -287 Clock | 6,334 | 7,618 | 1,284 | Stop Sign | 1,983 | 2,684 | 701 Couch | 5,779 | 5,598 | -181 | Suitcase | 6,192 | 7,447 | 1,255 Cow | 8,147 | 8,990 | 843 | Surfboard | 6,126 | 6,175 | 49 Cup | 20,650 | 22,545 | 1,895 | Teddy Bear | 4,793 | 6,432 | 1,639 Dining Table | 15,714 | 16,569 | 855 | Tennis Racket | 4,812 | 4,932 | 120 Dog | 5,508 | 5,870 | 362 | Tie | 6,496 | 6,048 | -448 Donut | 7,179 | 11,622 | 4,443 ...

  6. a

    COCO

    • datasets.activeloop.ai
    • huggingface.co
    deeplake
    Updated Feb 5, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Tsung-Yi Lin (2022). COCO [Dataset]. https://datasets.activeloop.ai/docs/ml/datasets/coco-dataset/
    Explore at:
    deeplakeAvailable download formats
    Dataset updated
    Feb 5, 2022
    Authors
    Tsung-Yi Lin
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 1, 2014 - Dec 31, 2015
    Dataset funded by
    Microsoft Research
    Description

    The COCO dataset is a large dataset of labeled images and annotations. It is a popular dataset for machine learning and artificial intelligence research. The dataset consists of 330,000 images and 500,000 object annotations. The annotations include the bounding boxes of objects in the images, as well as the labels of the objects.

  7. h

    coco2017

    • huggingface.co
    • opendatalab.com
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Padilla, coco2017 [Dataset]. https://huggingface.co/datasets/rafaelpadilla/coco2017
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Authors
    Padilla
    Description

    This dataset contains all COCO 2017 images and annotations split in training (118287 images) and validation (5000 images).

  8. R

    Coco Reduced Classes Dataset

    • universe.roboflow.com
    zip
    Updated Nov 11, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    yolov7 (2022). Coco Reduced Classes Dataset [Dataset]. https://universe.roboflow.com/yolov7-zqhde/coco-reduced-classes
    Explore at:
    zipAvailable download formats
    Dataset updated
    Nov 11, 2022
    Dataset authored and provided by
    yolov7
    License

    MIT Licensehttps://opensource.org/licenses/MIT
    License information was derived automatically

    Variables measured
    People And Vehicles Bounding Boxes
    Description

    Coco Reduced Classes

    ## Overview
    
    Coco Reduced Classes is a dataset for object detection tasks - it contains People And Vehicles annotations for 4,952 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
      ## License
    
      This dataset is available under the [MIT license](https://creativecommons.org/licenses/MIT).
    
  9. COCO King

    • kaggle.com
    zip
    Updated Apr 26, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Phani Ratan (2025). COCO King [Dataset]. https://www.kaggle.com/datasets/knowledgeforyou/coco-king
    Explore at:
    zip(656138535 bytes)Available download formats
    Dataset updated
    Apr 26, 2025
    Authors
    Phani Ratan
    License

    MIT Licensehttps://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset Overview

    COCO-King is a large-scale dataset for reference-guided image completion tasks, derived from the COCO dataset. It features images with masked objects and corresponding reference images of those objects, enabling models to learn how to replace or complete masked regions with guidance from reference images.

    Dataset Size and Structure

    Total size: 690MB Images: 9,558 total images (8,134 training + 1,424 validation) Categories: 170 diverse object categories Directory structure:

    coco-king/ ├── train/ │ ├── images/ # Original images with objects to be masked │ ├── mask/ # Binary masks (white background, black object) │ └── reference/ # Augmented reference images of masked objects ├── val/ │ ├── images/ # Validation images │ ├── mask/ # Validation masks │ └── reference/ # Validation reference images ├── metadata.json # Complete dataset metadata ├── train_annotations.json # COCO-format training annotations └── val_annotations.json # COCO-format validation annotations

    Unique Features

    Specially Curated Masks

    Smoothed Contours: Each mask features smooth, rounded edges to mimic human-drawn masks rather than pixel-perfect segmentations

    Processing Pipeline: Masks underwent morphological operations and Gaussian blurring to create natural-looking boundaries

    Single Masked Object per Image: Each image has one primary object masked (the largest that meets size criteria), despite containing multiple objects (avg. 7 objects per image)

    Rich Reference Images

    Paint by Example Style Augmentations: Reference images are augmented similar to the Paint by Example paper:

    Mild color jittering (brightness, contrast, saturation, hue) Random horizontal flips Small random rotations (up to 10 degrees) Mild perspective transformations Occasional equalization and auto-contrast

    Balanced Object Selection

    Size Range: Objects cover 0.89% to 42% of image area (average: ~25%) Multiple Objects: Every image contains multiple objects (ranging from 2 to 29) Diverse Categories: Well-distributed across 170 object categories

    Dataset Highlights

    • Person is the most common category (1,138 training, 184 validation)
    • Top categories include sky, trees, clouds, road, grass, walls, buildings
    • Average of 7 objects per image provides context and complexity
    • Bounding boxes are strategically sized to be neither too small nor too dominant
    • Each image-mask-reference triplet is carefully curated to ensure quality

    Applications

    This dataset is ideal for: Exemplar-based image inpainting/completion: Using reference images to guide the filling of masked regions Reference-guided object placement: Learning to place objects in scenes with proper perspective and lighting

    Object replacement: Replacing objects in images with new objects while maintaining scene coherence

    Style/appearance transfer: Learning to transfer appearance characteristics to objects in new scenes

    Research on Paint by Example or similar architectures: Models that aim to fill masked regions based on reference images

    Data Processing

    Derived from COCO dataset with additional processing

    Each image triplet (image, mask, reference) was processed to ensure: The masked object is of appropriate size Masks have smooth, natural contours Reference images maintain object identity while providing variation through augmentation

    This dataset offers a unique resource for developing and benchmarking models that can intelligently replace or complete portions of images based on reference examples.

  10. h

    coco

    • huggingface.co
    Updated Mar 3, 2023
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Detection datasets (2023). coco [Dataset]. https://huggingface.co/datasets/detection-datasets/coco
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Mar 3, 2023
    Dataset authored and provided by
    Detection datasets
    Description

    detection-datasets/coco dataset hosted on Hugging Face and contributed by the HF Datasets community

  11. Microsoft COCO 2017 Object Detection Dataset - raw

    • public.roboflow.com
    zip
    Updated Feb 1, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Microsoft (2025). Microsoft COCO 2017 Object Detection Dataset - raw [Dataset]. https://public.roboflow.com/object-detection/microsoft-coco-subset/2
    Explore at:
    zipAvailable download formats
    Dataset updated
    Feb 1, 2025
    Dataset authored and provided by
    Microsofthttp://microsoft.com/
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Bounding Boxes of coco-objects
    Description

    This is the full 2017 COCO object detection dataset (train and valid), which is a subset of the most recent 2020 COCO object detection dataset.

    COCO is a large-scale object detection, segmentation, and captioning dataset of many object types easily recognizable by a 4-year-old. The data is initially collected and published by Microsoft. The original source of the data is here and the paper introducing the COCO dataset is here.

  12. MS-COCO 2017 dataset - YOLO format

    • kaggle.com
    zip
    Updated Nov 1, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Shahariar Alif (2025). MS-COCO 2017 dataset - YOLO format [Dataset]. https://www.kaggle.com/datasets/alifshahariar/ms-coco-2017-dataset-yolo-format
    Explore at:
    zip(26509567635 bytes)Available download formats
    Dataset updated
    Nov 1, 2025
    Authors
    Shahariar Alif
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0)https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    I wanted to train a custom YOLO object detection model, but the MS-COCO dataset was not in a good format. So I parsed the instances json files in the MS-COCO annotations and processed the dataset to be a YOLO friendly format.

    I downloaded the dataset from COCO webste. You can download any split you need from the COCO dataset website

    Directory info: 1. test: Only contains the test images 2. train: Has two sub folders, images - contains the training images, labels - contains the training labels in a .txt file for each train image 3. val: Has two sub folders, images - contains the validation images, labels - contains the validation labels in a .txt file for each validation image

    I do not own the dataset in any way. I merely parsed the dataset to a be in a ready to train YOLO format. Download the original dataset from the COCO webste

  13. R

    Conversion Of Format And Classes To Coco Dataset

    • universe.roboflow.com
    zip
    Updated Aug 25, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    North South University (2022). Conversion Of Format And Classes To Coco Dataset [Dataset]. https://universe.roboflow.com/north-south-university-8gvqa/conversion-of-format-and-classes-to-coco/dataset/3
    Explore at:
    zipAvailable download formats
    Dataset updated
    Aug 25, 2022
    Dataset authored and provided by
    North South University
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Objects Bounding Boxes
    Description

    Conversion Of Format And Classes To Coco

    ## Overview
    
    Conversion Of Format And Classes To Coco is a dataset for object detection tasks - it contains Objects annotations for 7,460 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
      ## License
    
      This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/CC BY 4.0).
    
  14. T

    ref_coco

    • tensorflow.org
    • opendatalab.com
    Updated May 31, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    (2024). ref_coco [Dataset]. https://www.tensorflow.org/datasets/catalog/ref_coco
    Explore at:
    Dataset updated
    May 31, 2024
    Description

    A collection of 3 referring expression datasets based off images in the COCO dataset. A referring expression is a piece of text that describes a unique object in an image. These datasets are collected by asking human raters to disambiguate objects delineated by bounding boxes in the COCO dataset.

    RefCoco and RefCoco+ are from Kazemzadeh et al. 2014. RefCoco+ expressions are strictly appearance based descriptions, which they enforced by preventing raters from using location based descriptions (e.g., "person to the right" is not a valid description for RefCoco+). RefCocoG is from Mao et al. 2016, and has more rich description of objects compared to RefCoco due to differences in the annotation process. In particular, RefCoco was collected in an interactive game-based setting, while RefCocoG was collected in a non-interactive setting. On average, RefCocoG has 8.4 words per expression while RefCoco has 3.5 words.

    Each dataset has different split allocations that are typically all reported in papers. The "testA" and "testB" sets in RefCoco and RefCoco+ contain only people and only non-people respectively. Images are partitioned into the various splits. In the "google" split, objects, not images, are partitioned between the train and non-train splits. This means that the same image can appear in both the train and validation split, but the objects being referred to in the image will be different between the two sets. In contrast, the "unc" and "umd" splits partition images between the train, validation, and test split. In RefCocoG, the "google" split does not have a canonical test set, and the validation set is typically reported in papers as "val*".

    Stats for each dataset and split ("refs" is the number of referring expressions, and "images" is the number of images):

    datasetpartitionsplitrefsimages
    refcocogoogletrain4000019213
    refcocogoogleval50004559
    refcocogoogletest50004527
    refcocounctrain4240416994
    refcocouncval38111500
    refcocounctestA1975750
    refcocounctestB1810750
    refcoco+unctrain4227816992
    refcoco+uncval38051500
    refcoco+unctestA1975750
    refcoco+unctestB1798750
    refcocoggoogletrain4482224698
    refcocoggoogleval50004650
    refcocogumdtrain4222621899
    refcocogumdval25731300
    refcocogumdtest50232600

    To use this dataset:

    import tensorflow_datasets as tfds
    
    ds = tfds.load('ref_coco', split='train')
    for ex in ds.take(4):
     print(ex)
    

    See the guide for more informations on tensorflow_datasets.

    https://storage.googleapis.com/tfds-data/visualization/fig/ref_coco-refcoco_unc-1.1.0.png" alt="Visualization" width="500px">

  15. Coco 128 Dataset

    • universe.roboflow.com
    zip
    Updated Sep 28, 2021
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Team Roboflow (2021). Coco 128 Dataset [Dataset]. https://universe.roboflow.com/team-roboflow/coco-128/dataset/2
    Explore at:
    zipAvailable download formats
    Dataset updated
    Sep 28, 2021
    Dataset provided by
    Roboflowhttps://roboflow.com/
    Authors
    Team Roboflow
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Common Objects Bounding Boxes
    Description

    COCO 128 is a subset of 128 images of the larger COCO dataset. It reuses the training set for both validation and testing, with the purpose of proving that your training pipeline is working properly and can overfit this small dataset.

    COCO 128 is a great dataset to use the first time you are testing out a new model.

  16. h

    mscoco

    • huggingface.co
    Updated Aug 22, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Romrawin (Jin) Chumpu (2023). mscoco [Dataset]. https://huggingface.co/datasets/romrawinjp/mscoco
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Aug 22, 2023
    Authors
    Romrawin (Jin) Chumpu
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Common Objects in Context (COCO) Dataset

    This dataset is English captions of COCO dataset. The splits in this dataset is set according to Andrej Karpathy's split from dataset_coco.json file. The collection was created specifically for simplicity of use in training and evaluation pipeline by non-commercial and research purposes. The COCO images dataset is licensed under a Creative Commons Attribution 4.0 License.

      Reference
    

    @misc{lin2015microsoftcococommonobjects… See the full description on the dataset page: https://huggingface.co/datasets/romrawinjp/mscoco.

  17. COCO 2014 Dataset (for YOLOv3)

    • kaggle.com
    zip
    Updated Sep 9, 2021
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Jeff Faudi (2021). COCO 2014 Dataset (for YOLOv3) [Dataset]. https://www.kaggle.com/datasets/jeffaudi/coco-2014-dataset-for-yolov3/discussion
    Explore at:
    zip(26852690979 bytes)Available download formats
    Dataset updated
    Sep 9, 2021
    Authors
    Jeff Faudi
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Context

    The MS COCO (Microsoft Common Objects in Context) dataset is a large-scale object detection, segmentation, key-point detection, and captioning dataset. The dataset consists of 164K images.

    This is the original version from 2014 made available here for easy access in Kaggle and because it does not seem to be still available on the COCO Dataset website. This has been retrieved from the mirror that Joseph Redmon has setup on this own website.

    Content

    The 2014 version of the COCO dataset is an excellent object detection dataset with 80 classes, 82,783 training images and 40,504 validation images. This dataset contains all this imagery on two folders as well as the annotation with the class and location (bounding box) of the objects contained in each image.

    The initial split provides training (83K), validation (41K) and test (41K) sets. Since the split between training and validation was not optimal in the original dataset, there is also two text (.part) files with a new split with only 5,000 images for validation and the rest for training. The test set has no labels and can be used for visual validation or pseudo-labelling.

    Acknowledgements

    This is mostly inspired by Erik Linder-Norén and [Joseph Redmon](https://pjreddie.com/darknet/yolo

  18. Common Object Detection

    • hub.arcgis.com
    • sdiinnovation-geoplatform.hub.arcgis.com
    Updated Feb 28, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Esri (2023). Common Object Detection [Dataset]. https://hub.arcgis.com/content/a91bed8bc0fe4e1bb8db45c23959e5f1
    Explore at:
    Dataset updated
    Feb 28, 2023
    Dataset authored and provided by
    Esrihttp://esri.com/
    Area covered
    Description

    This is an open source object detection model by TensorFlow in TensorFlow Lite format. While it is not recommended to use this model in production surveys, it can be useful for demonstration purposes and to get started with smart assistants in ArcGIS Survey123. You are responsible for the use of this model. When using Survey123, it is your responsibility to review and manually correct outputs.This object detection model was trained using the Common Objects in Context (COCO) dataset. COCO is a large-scale object detection dataset that is available for use under the Creative Commons Attribution 4.0 License.The dataset contains 80 object categories and 1.5 million object instances that include people, animals, food items, vehicles, and household items. For a complete list of common objects this model can detect, see Classes.The model can be used in ArcGIS Survey123 to detect common objects in photos that are captured with the Survey123 field app. Using the modelFollow the guide to use the model. You can use this model to detect or redact common objects in images captured with the Survey123 field app. The model must be configured for a survey in Survey123 Connect.Fine-tuning the modelThis model cannot be fine-tuned using ArcGIS tools.InputCamera feed (either low-resolution preview or high-resolution capture).OutputImage with common object detections written to its EXIF metadata or an image with detected objects redacted.Model architectureThis is an open source object detection model by TensorFlow in TensorFlow Lite format with MobileNet architecture. The model is available for use under the Apache License 2.0.Sample resultsHere are a few results from the model.

  19. R

    Coco Person Dataset

    • universe.roboflow.com
    zip
    Updated Sep 28, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    dataset (2025). Coco Person Dataset [Dataset]. https://universe.roboflow.com/dataset-uutxr/coco-person-gi27a
    Explore at:
    zipAvailable download formats
    Dataset updated
    Sep 28, 2025
    Dataset authored and provided by
    dataset
    License

    MIT Licensehttps://opensource.org/licenses/MIT
    License information was derived automatically

    Variables measured
    Coco Person Bounding Boxes
    Description

    Coco Person

    ## Overview
    
    Coco Person is a dataset for object detection tasks - it contains Coco Person annotations for 5,081 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
      ## License
    
      This dataset is available under the [MIT license](https://creativecommons.org/licenses/MIT).
    
  20. coco dataset

    • kaggle.com
    zip
    Updated Jul 5, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    ProgramerSalar (2025). coco dataset [Dataset]. https://www.kaggle.com/datasets/salargamer/coco-dataset
    Explore at:
    zip(20043918455 bytes)Available download formats
    Dataset updated
    Jul 5, 2025
    Authors
    ProgramerSalar
    License

    MIT Licensehttps://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    The COCO dataset is a foundational large-scale benchmark for object detection, segmentation, captioning, and keypoint analysis. Created by Microsoft, it features complex everyday scenes with common objects in their natural contexts. With over 330,000 images and 2.5 million labeled instances, it has become the gold standard for training and evaluating computer vision models.

    File Information

    images/
    Contains 2 subdirectories split by usage:
    train2017/: Main training set (118K images)
    val2017/: Validation set (5K images)
    File Naming: 000000000009.jpg (12-digit zero-padded IDs)
    Formats: JPEG images with varying resolutions (average 640×480)
    
    annotations/
    Contains task-specific JSON files with consistent naming:
    captions_*.json: 5 human-generated descriptions per image
    
Share
FacebookFacebook
TwitterTwitter
Email
Click to copy link
Link copied
Close
Cite
(2024). coco [Dataset]. https://www.tensorflow.org/datasets/catalog/coco

coco

Explore at:
6 scholarly articles cite this dataset (View in Google Scholar)
Dataset updated
Jun 1, 2024
Description

COCO is a large-scale object detection, segmentation, and captioning dataset.

Note: * Some images from the train and validation sets don't have annotations. * Coco 2014 and 2017 uses the same images, but different train/val/test splits * The test split don't have any annotations (only images). * Coco defines 91 classes but the data only uses 80 classes. * Panotptic annotations defines defines 200 classes but only uses 133.

To use this dataset:

import tensorflow_datasets as tfds

ds = tfds.load('coco', split='train')
for ex in ds.take(4):
 print(ex)

See the guide for more informations on tensorflow_datasets.

https://storage.googleapis.com/tfds-data/visualization/fig/coco-2014-1.1.0.png" alt="Visualization" width="500px">

Search
Clear search
Close search
Google apps
Main menu