58 datasets found
  1. COCO 2014 Dataset (for YOLOv3)

    • gts.ai
    json
    + more versions
    Cite
    GLOBOSE TECHNOLOGY SOLUTIONS PRIVATE LIMITED, COCO 2014 Dataset (for YOLOv3) [Dataset]. https://gts.ai/dataset-download/coco-2014-dataset-for-yolov3/
    Available download formats: json
    Dataset authored and provided by
    GLOBOSE TECHNOLOGY SOLUTIONS PRIVATE LIMITED
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    The MS COCO (Microsoft Common Objects in Context) 2014 dataset is a large-scale benchmark for object detection, segmentation, and key-point detection. It contains 164,000+ annotated images across 80 object categories.

  2. coco-30-val-2014

    • huggingface.co
    Cite
    Sayak Paul, coco-30-val-2014 [Dataset]. https://huggingface.co/datasets/sayakpaul/coco-30-val-2014
    Authors
    Sayak Paul
    Description

    Dataset Card for "coco-30-val-2014"

    This dataset contains 30k randomly sampled image-caption pairs from the COCO 2014 val split. It is useful for image generation benchmarks (FID, CLIPScore, etc.). Refer to the gist to see how the dataset was created: https://gist.github.com/sayakpaul/0c4435a1df6eb6193f824f9198cabaa5.
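
    The kind of random sampling described above might look like the following minimal sketch. It assumes the pairs are already loaded into a Python list; the `sample_pairs` helper, the seed, and the toy data are hypothetical and are not taken from the linked gist.

```python
import random


def sample_pairs(pairs, k, seed=0):
    """Return k pairs drawn uniformly at random without replacement.

    `pairs` is assumed to be a list of (image_id, caption) tuples;
    this layout is an illustration, not the gist's actual format.
    """
    if k > len(pairs):
        raise ValueError("cannot sample more pairs than are available")
    rng = random.Random(seed)  # local RNG so the sample is reproducible
    return rng.sample(pairs, k)


# Toy stand-in for the val split; the real subset would use k = 30_000.
toy_pairs = [(i, f"caption for image {i}") for i in range(100)]
subset = sample_pairs(toy_pairs, k=30)
print(len(subset))
```

    Fixing the seed keeps the subset stable across runs, which matters when the same sample is reused to compare benchmark scores.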

  3. COCO 2014 Val Subset

    • kaggle.com
    zip
    Updated Apr 21, 2025
    Cite
    Masoud Ilkhani (2025). COCO 2014 Val Subset [Dataset]. https://www.kaggle.com/datasets/masoudilkhani/coco-2014-val-subset
    Available download formats: zip (6858196257 bytes)
    Dataset updated
    Apr 21, 2025
    Authors
    Masoud Ilkhani
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Source: This dataset is a subset of the MS COCO dataset, originally released by Microsoft under the CC BY 4.0 License. This subset was extracted for educational and research purposes.

  4. COCO

    • huggingface.co
    • datasets.activeloop.ai
    Updated Feb 6, 2023
    + more versions
    Cite
    HuggingFaceM4 (2023). COCO [Dataset]. https://huggingface.co/datasets/HuggingFaceM4/COCO
    Dataset updated
    Feb 6, 2023
    Dataset authored and provided by
    HuggingFaceM4
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    MS COCO is a large-scale object detection, segmentation, and captioning dataset. COCO has several features: Object segmentation, Recognition in context, Superpixel stuff segmentation, 330K images (>200K labeled), 1.5 million object instances, 80 object categories, 91 stuff categories, 5 captions per image, 250,000 people with keypoints.

  5. coco

    • tensorflow.org
    • huggingface.co
    Updated Jun 1, 2024
    + more versions
    Cite
    (2024). coco [Dataset]. https://www.tensorflow.org/datasets/catalog/coco
    Dataset updated
    Jun 1, 2024
    Description

    COCO is a large-scale object detection, segmentation, and captioning dataset.

    Note:
    * Some images from the train and validation sets don't have annotations.
    * COCO 2014 and 2017 use the same images, but different train/val/test splits.
    * The test split doesn't have any annotations (only images).
    * COCO defines 91 classes, but the data only uses 80 classes.
    * Panoptic annotations define 200 classes but only use 133.

    To use this dataset:

    import tensorflow_datasets as tfds

    # Load the training split and print the first four examples.
    ds = tfds.load('coco', split='train')
    for ex in ds.take(4):
        print(ex)

    See the guide for more information on tensorflow_datasets.

    Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/coco-2014-1.1.0.png

  6. MS-COCO-2014

    • kaggle.com
    zip
    Updated Apr 14, 2024
    Cite
    rabbabansh (2024). MS-COCO-2014 [Dataset]. https://www.kaggle.com/datasets/rabbabansh/ms-coco-2014
    Available download formats: zip (44094014209 bytes)
    Dataset updated
    Apr 14, 2024
    Authors
    rabbabansh
    Description

    Dataset

    This dataset was created by rabbabansh


  7. MS-COCO-2014-Captions

    • kaggle.com
    zip
    Updated Sep 15, 2024
    Cite
    Sepehr Noey (2024). MS-COCO-2014-Captions [Dataset]. https://www.kaggle.com/datasets/sepehrnoey/ms-coco-2014-captions
    Available download formats: zip (20021579 bytes)
    Dataset updated
    Sep 15, 2024
    Authors
    Sepehr Noey
    Description

    Dataset

    This dataset was created by Sepehr Noey


  8. Arabic-COCO2014-Validation

    • huggingface.co
    Cite
    Lina A. Alhuri, Arabic-COCO2014-Validation [Dataset]. https://huggingface.co/datasets/LinaAlhuri/Arabic-COCO2014-Validation
    Authors
    Lina A. Alhuri
    Description

    Arabic Translated COCO Validation Dataset

      Overview
    

    Welcome to the Arabic Translated COCO Validation Dataset! This dataset is a version of the Common Objects in Context (COCO) dataset, specifically translated into Arabic. The COCO dataset is a widely used benchmark for image captioning and object detection tasks, and this translation aims to facilitate research and development in the Arabic language.

      Contents
    

    coco_url: This column includes images URL which… See the full description on the dataset page: https://huggingface.co/datasets/LinaAlhuri/Arabic-COCO2014-Validation.

  9. MS-COCO2014

    • kaggle.com
    zip
    Updated Jan 10, 2024
    Cite
    SRI RAM M S (2024). MS-COCO2014 [Dataset]. https://www.kaggle.com/datasets/isriramms6/ms-coco2014
    Available download formats: zip (13678959464 bytes)
    Dataset updated
    Jan 10, 2024
    Authors
    SRI RAM M S
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Dataset

    This dataset was created by SRI RAM M S

    Released under CC0: Public Domain


  10. coco_body_part

    • huggingface.co
    Updated Jun 10, 2025
    Cite
    C (2025). coco_body_part [Dataset]. https://huggingface.co/datasets/Xuban/coco_body_part
    Dataset updated
    Jun 10, 2025
    Authors
    C
    Description

    COCO 2014 DensePose Relabeling with Body Parts

    This dataset is formatted for Ultralytics YOLO and is ready for training. Important: update the paths in the YAML file inside the dataset folder.

      Demo
    

    Here is what inference looks like:

      Based on:
    

    GitHub Repository Paper

      Classes:
    

    { 1: "Person", 2: "Torso", 3: "Hand", 4: "Foot", 5: "Upper Leg", 6: "Lower Leg", 7: "Upper Arm", 8: "Lower Arm", 9: "Head" }… See the full description on the dataset page: https://huggingface.co/datasets/Xuban/coco_body_part.

  11. COCO, LVIS, Open Images V4 classes mapping

    • data.niaid.nih.gov
    • zenodo.org
    • +1 more
    Updated Oct 13, 2022
    Cite
    Giuseppe Amato; Paolo Bolettieri; Fabio Carrara; Fabrizio Falchi; Claudio Gennaro; Nicola Messina; Lucia Vadicamo; Claudio Vairo (2022). COCO, LVIS, Open Images V4 classes mapping [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7194299
    Dataset updated
    Oct 13, 2022
    Dataset provided by
    ISTI-CNR
    Authors
    Giuseppe Amato; Paolo Bolettieri; Fabio Carrara; Fabrizio Falchi; Claudio Gennaro; Nicola Messina; Lucia Vadicamo; Claudio Vairo
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This repository contains a mapping between the classes of COCO, LVIS, and Open Images V4 datasets into a unique set of 1460 classes.

    COCO [Lin et al. 2014] contains 80 classes, LVIS [Gupta et al. 2019] contains 1460 classes, Open Images V4 [Kuznetsova et al. 2020] contains 601 classes.

    We built a mapping of these classes using a semi-automatic procedure in order to have a unique final list of 1460 classes. We also generated a hierarchy for each class, using WordNet.

    This repository contains the following files:

    coco_classes_map.txt, contains the mapping for the 80 COCO classes

    lvis_classes_map.txt, contains the mapping for the 1460 LVIS classes

    openimages_classes_map.txt, contains the mapping for the 601 Open Images V4 classes

    classname_hyperset_definition.csv, contains the final set of 1460 classes, their definition and hierarchy

    all-classnames.xlsx, contains a side-by-side view of all classes considered

    This mapping was used in VISIONE [Amato et al. 2021, Amato et al. 2022], a content-based retrieval system that supports various search functionalities (text search, object/color-based search, semantic and visual similarity search, temporal search). For object detection, VISIONE uses three pre-trained models: VfNet [Zhang et al. 2021], Mask R-CNN [He et al. 2017], and a Faster R-CNN+Inception ResNet (trained on Open Images V4).

    This repository is released under a Creative Commons Attribution license. Please cite the following paper if you use it in your work in any form:

    @inproceedings{amato2021visione,
      title={The visione video search system: exploiting off-the-shelf text search engines for large-scale video retrieval},
      author={Amato, Giuseppe and Bolettieri, Paolo and Carrara, Fabio and Debole, Franca and Falchi, Fabrizio and Gennaro, Claudio and Vadicamo, Lucia and Vairo, Claudio},
      journal={Journal of Imaging},
      volume={7},
      number={5},
      pages={76},
      year={2021},
      publisher={Multidisciplinary Digital Publishing Institute}
    }

    References:

    [Amato et al. 2022] Amato, G. et al. (2022). VISIONE at Video Browser Showdown 2022. In: , et al. MultiMedia Modeling. MMM 2022. Lecture Notes in Computer Science, vol 13142. Springer, Cham. https://doi.org/10.1007/978-3-030-98355-0_52

    [Amato et al. 2021] Amato, G., Bolettieri, P., Carrara, F., Debole, F., Falchi, F., Gennaro, C., Vadicamo, L. and Vairo, C., 2021. The visione video search system: exploiting off-the-shelf text search engines for large-scale video retrieval. Journal of Imaging, 7(5), p.76.

    [Gupta et al. 2019] Gupta, A., Dollar, P. and Girshick, R., 2019. Lvis: A dataset for large vocabulary instance segmentation. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 5356-5364).

    [He et al. 2017] He, K., Gkioxari, G., Dollár, P. and Girshick, R., 2017. Mask r-cnn. In Proceedings of the IEEE international conference on computer vision (pp. 2961-2969).

    [Kuznetsova et al. 2020] Kuznetsova, A., Rom, H., Alldrin, N., Uijlings, J., Krasin, I., Pont-Tuset, J., Kamali, S., Popov, S., Malloci, M., Kolesnikov, A. and Duerig, T., 2020. The open images dataset v4. International Journal of Computer Vision, 128(7), pp.1956-1981.

    [Lin et al. 2014] Lin, T.Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P. and Zitnick, C.L., 2014, September. Microsoft coco: Common objects in context. In European conference on computer vision (pp. 740-755). Springer, Cham.

    [Zhang et al. 2021] Zhang, H., Wang, Y., Dayoub, F. and Sunderhauf, N., 2021. Varifocalnet: An iou-aware dense object detector. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 8514-8523).

  12. COCO Caption 2014

    • kaggle.com
    Updated Aug 3, 2025
    Cite
    ChuLiJ (2025). COCO Caption 2014 [Dataset]. https://www.kaggle.com/datasets/chulij/cocodatasets
    Dataset updated
    Aug 3, 2025
    Dataset provided by
    Kaggle: http://kaggle.com/
    Authors
    ChuLiJ
    Description

    This dataset consists of the training and validation sets of COCO Caption 2014 and contains only images. The corresponding captions can be obtained from 'https://storage.googleapis.com/sfr-vision-language-research/datasets/coco_karpathy_train.json' and 'https://storage.googleapis.com/sfr-vision-language-research/datasets/coco_karpathy_val.json', which come from the official BLIP code. The official dataset link is http://cocodataset.org/

  13. coco-2014-square

    • huggingface.co
    Cite
    Eric Chen, coco-2014-square [Dataset]. https://huggingface.co/datasets/emc348/coco-2014-square
    Authors
    Eric Chen
    Description

    emc348/coco-2014-square dataset hosted on Hugging Face and contributed by the HF Datasets community

  14. T.-Y. Lin, M. Maire, S. Belongie, L. Bourdev, R. Girshick, J. Hays, P....

    • service.tib.eu
    Updated Dec 16, 2024
    Cite
    (2024). T.-Y. Lin, M. Maire, S. Belongie, L. Bourdev, R. Girshick, J. Hays, P. Perona, D. Ramanan, C. L. Zitnick, P. Dollár (2024). Dataset: Microsoft COCO 2014 and 2017. https://doi.org/10.57702/ch13xmig [Dataset]. https://service.tib.eu/ldmservice/dataset/microsoft-coco-2014-and-2017
    Dataset updated
    Dec 16, 2024
    Description

    Microsoft COCO 2014 and 2017 datasets for object detection, segmentation, and captioning

  15. SPEECH-COCO

    • live.european-language-grid.eu
    audio wav
    Updated Dec 10, 2023
    + more versions
    Cite
    (2023). SPEECH-COCO [Dataset]. https://live.european-language-grid.eu/catalogue/corpus/7686
    Available download formats: audio wav
    Dataset updated
    Dec 10, 2023
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Introduction: Our corpus is an extension of the MS COCO image recognition and captioning dataset. MS COCO comprises images paired with a set of five captions. Yet, it does not include any speech. Therefore, we used Voxygen's text-to-speech system to synthesise the available captions. The addition of speech as a new modality enables MSCOCO to be used for research in the fields of language acquisition, unsupervised term discovery, keyword spotting, or semantic embedding using speech and vision. Our corpus is licensed under a Creative Commons Attribution 4.0 License.

    Data Set: This corpus contains 616,767 spoken captions from MSCOCO's val2014 and train2014 subsets (414,113 for train2014 and 202,654 for val2014). We used 8 different voices: 4 have a British accent (Paul, Bronwen, Judith, and Elizabeth) and the other 4 have an American accent (Phil, Bruce, Amanda, Jenny). To make the captions sound more natural, we used the SoX tempo command, which changes the speed without changing the pitch: 1/3 of the captions are 10% slower than the original pace, 1/3 are 10% faster, and the last third was kept untouched. We also modified approximately 30% of the original captions by adding disfluencies such as "um", "uh", "er" so that the captions would sound more natural.

    Each WAV file is paired with a JSON file containing various information: timecode of each word in the caption, name of the speaker, name of the WAV file, etc. The JSON files have the following data structure: {"duration": float, "speaker": string, "synthesisedCaption": string, "timecode": list, "speed": float, "wavFilename": string, "captionID": int, "imgID": int, "disfluency": list}. On average, each caption comprises 10.79 tokens, disfluencies included. The WAV files are on average 3.52 seconds long.
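
    The JSON structure quoted above could be read into a typed record along the following lines. This is only a sketch: the `SpokenCaption` class and `parse_caption` function are hypothetical names, the element layout of the `timecode` and `disfluency` lists is not given in the description and is left untyped, and the sample record is invented for illustration.

```python
import json
from dataclasses import dataclass
from typing import Any, List


@dataclass
class SpokenCaption:
    # Field names and top-level types follow the structure quoted above.
    duration: float
    speaker: str
    synthesisedCaption: str
    timecode: List[Any]    # element layout is not specified in the description
    speed: float
    wavFilename: str
    captionID: int
    imgID: int
    disfluency: List[Any]  # element layout is not specified in the description


def parse_caption(text: str) -> SpokenCaption:
    """Parse one metadata JSON document into a SpokenCaption."""
    return SpokenCaption(**json.loads(text))


# Invented example record; every value below is a placeholder.
example = json.dumps({
    "duration": 3.5, "speaker": "Paul",
    "synthesisedCaption": "a placeholder caption",
    "timecode": [], "speed": 1.0, "wavFilename": "placeholder.wav",
    "captionID": 1, "imgID": 2, "disfluency": [],
})
record = parse_caption(example)
print(record.speaker, record.duration)
```

    Because the dataclass constructor rejects missing or unexpected keys, a record that deviates from the documented structure fails loudly instead of being silently accepted.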

  16. COCO 2014

    • kaggle.com
    zip
    Updated May 24, 2025
    Cite
    Hồ Minh Quang (2025). COCO 2014 [Dataset]. https://www.kaggle.com/datasets/hydroq/coco-2014
    Available download formats: zip (2065006586 bytes)
    Dataset updated
    May 24, 2025
    Authors
    Hồ Minh Quang
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Dataset

    This dataset was created by Hồ Minh Quang

    Released under CC0: Public Domain


  17. coco-val2014-captioning

    • huggingface.co
    Cite
    Aryan Tomar, coco-val2014-captioning [Dataset]. https://huggingface.co/datasets/aryntmr/coco-val2014-captioning
    Authors
    Aryan Tomar
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    COCO Val2014 - Image Captioning & Object Detection

    This dataset contains COCO 2014 validation set with captions and object annotations.

      Dataset Structure
    

    image_id: COCO image ID
    image: The image file
    input_prompt: Instruction prompt for the model
    gt_objects: List of ground truth object categories
    gt_captions: List of ground truth captions (5 per image)

      Usage
    

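    from datasets import load_dataset

    # "your-username/dataset-name" is the placeholder from the original card; substitute the dataset's repository ID.
    dataset = load_dataset("your-username/dataset-name")

    … See the full description on the dataset page: https://huggingface.co/datasets/aryntmr/coco-val2014-captioning.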

  18. Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva...

    • service.tib.eu
    Updated Dec 3, 2024
    Cite
    (2024). Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollár, C Lawrence Zitnick (2024). Dataset: Microsoft COCO Dataset. https://doi.org/10.57702/spl4y042 [Dataset]. https://service.tib.eu/ldmservice/dataset/microsoft-coco-dataset
    Dataset updated
    Dec 3, 2024
    Description

    The MS COCO 2014 dataset covers 91 object categories and contains 82,783 training images, 40,504 validation images, and 40,775 testing images.
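
    As a quick arithmetic check on the split sizes quoted above (a sketch; the `splits` dictionary is only an illustrative name), the three counts sum to 164,062 images:

```python
# Split sizes as stated in the description above.
splits = {"train": 82783, "val": 40504, "test": 40775}

total = sum(splits.values())
print(total)  # 164062

for name, count in splits.items():
    # Share of each split, as a percentage of all images.
    print(f"{name}: {count / total:.1%}")
```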

  19. COCO 2014 test

    • kaggle.com
    zip
    Updated Jul 23, 2019
    Cite
    akashdeepjassal (2019). COCO 2014 test [Dataset]. https://www.kaggle.com/akashdeepjassal/coco-2014-test
    Available download formats: zip (0 bytes)
    Dataset updated
    Jul 23, 2019
    Authors
    akashdeepjassal
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    What is COCO?

    COCO is a large-scale object detection, segmentation, and captioning dataset. COCO has several features:

    MS-COCO website

  20. COCO-AB

    • huggingface.co
    Updated Jan 12, 2022
    Cite
    Seong Joon Oh (2022). COCO-AB [Dataset]. https://huggingface.co/datasets/coallaoh/COCO-AB
    Dataset updated
    Jan 12, 2022
    Authors
    Seong Joon Oh
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    General Information

    Title: COCO-AB

    Description: The COCO-AB dataset is an extension of the COCO 2014 training set, enriched with additional annotation byproducts (AB). The data includes 82,765 reannotated images from the original COCO 2014 training set. It has relevance in computer vision, specifically in object detection and location. The aim of the dataset is to provide a richer understanding of the images (without extra costs) by recording additional actions and interactions… See the full description on the dataset page: https://huggingface.co/datasets/coallaoh/COCO-AB.

COCO 2014 Dataset (for YOLOv3)

3 scholarly articles cite this dataset.