3 datasets found

COCO, LVIS, Open Images V4 classes mapping
data.niaid.nih.gov
zenodo.org
Updated Oct 13, 2022
Giuseppe Amato; Paolo Bolettieri; Fabio Carrara; Fabrizio Falchi; Claudio Gennaro; Nicola Messina; Lucia Vadicamo; Claudio Vairo (2022). COCO, LVIS, Open Images V4 classes mapping [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7194299
Dataset updated
Oct 13, 2022
Dataset provided by
ISTI-CNR
Authors
Giuseppe Amato; Paolo Bolettieri; Fabio Carrara; Fabrizio Falchi; Claudio Gennaro; Nicola Messina; Lucia Vadicamo; Claudio Vairo
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This repository contains a mapping of the classes of the COCO, LVIS, and Open Images V4 datasets into a single unified set of 1460 classes.

COCO [Lin et al. 2014] contains 80 classes, LVIS [Gupta et al. 2019] contains 1460 classes, and Open Images V4 [Kuznetsova et al. 2020] contains 601 classes.

We built the mapping with a semi-automatic procedure to obtain a single final list of 1460 classes. We also generated a hierarchy for each class using WordNet.

This repository contains the following files:

coco_classes_map.txt: the mapping for the 80 COCO classes

lvis_classes_map.txt: the mapping for the 1460 LVIS classes

openimages_classes_map.txt: the mapping for the 601 Open Images V4 classes

classname_hyperset_definition.csv: the final set of 1460 classes, with their definitions and hierarchy

all-classnames.xlsx: a side-by-side view of all classes considered
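As a rough illustration of how a per-dataset mapping file might be consumed, the sketch below parses a mapping and relabels detector outputs into the unified class set. The layout of the .txt files is not described in this listing, so the tab-separated "source_class<TAB>unified_class" layout, the function names, and the sample class names are all assumptions made for illustration; inspect the actual files before relying on it.

```python
from typing import Dict, Iterable, List


def load_class_map(lines: Iterable[str], sep: str = "\t") -> Dict[str, str]:
    """Parse 'source_class<sep>unified_class' lines into a dict.

    ASSUMPTION: one mapping per line, separated by a tab. The real
    *_classes_map.txt files may use a different layout.
    """
    mapping: Dict[str, str] = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        source, _, unified = line.partition(sep)
        if not unified:
            raise ValueError(f"line has no separator or target class: {raw!r}")
        mapping[source.strip()] = unified.strip()
    return mapping


def unify_labels(labels: List[str], mapping: Dict[str, str]) -> List[str]:
    """Relabel detector outputs; labels without a mapping are kept unchanged."""
    return [mapping.get(label, label) for label in labels]


# Hypothetical sample lines standing in for the contents of a mapping file.
sample_lines = ["tv\ttelevision_set", "", "couch\tsofa"]
coco_map = load_class_map(sample_lines)
print(unify_labels(["tv", "couch", "dog"], coco_map))
```

Because load_class_map accepts any iterable of lines, an open file handle can be passed directly in place of the sample list.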

This mapping was used in VISIONE [Amato et al. 2021, Amato et al. 2022], a content-based retrieval system that supports several search functionalities (text search, object/color-based search, semantic and visual similarity search, temporal search). For object detection, VISIONE uses three pre-trained models: VfNet [Zhang et al. 2021], Mask R-CNN [He et al. 2017], and a Faster R-CNN+Inception ResNet (trained on Open Images V4).

This repository is released under a Creative Commons Attribution license. Please cite the following paper if you use it in your work in any form:

@article{amato2021visione,
  title={The visione video search system: exploiting off-the-shelf text search engines for large-scale video retrieval},
  author={Amato, Giuseppe and Bolettieri, Paolo and Carrara, Fabio and Debole, Franca and Falchi, Fabrizio and Gennaro, Claudio and Vadicamo, Lucia and Vairo, Claudio},
  journal={Journal of Imaging},
  volume={7},
  number={5},
  pages={76},
  year={2021},
  publisher={Multidisciplinary Digital Publishing Institute}
}

References:

[Amato et al. 2022] Amato, G. et al. (2022). VISIONE at Video Browser Showdown 2022. In: , et al. MultiMedia Modeling. MMM 2022. Lecture Notes in Computer Science, vol 13142. Springer, Cham. https://doi.org/10.1007/978-3-030-98355-0_52

[Amato et al. 2021] Amato, G., Bolettieri, P., Carrara, F., Debole, F., Falchi, F., Gennaro, C., Vadicamo, L. and Vairo, C., 2021. The visione video search system: exploiting off-the-shelf text search engines for large-scale video retrieval. Journal of Imaging, 7(5), p.76.

[Gupta et al. 2019] Gupta, A., Dollár, P. and Girshick, R., 2019. Lvis: A dataset for large vocabulary instance segmentation. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 5356-5364).

[He et al. 2017] He, K., Gkioxari, G., Dollár, P. and Girshick, R., 2017. Mask r-cnn. In Proceedings of the IEEE international conference on computer vision (pp. 2961-2969).

[Kuznetsova et al. 2020] Kuznetsova, A., Rom, H., Alldrin, N., Uijlings, J., Krasin, I., Pont-Tuset, J., Kamali, S., Popov, S., Malloci, M., Kolesnikov, A. and Duerig, T., 2020. The open images dataset v4. International Journal of Computer Vision, 128(7), pp.1956-1981.

[Lin et al. 2014] Lin, T.Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P. and Zitnick, C.L., 2014, September. Microsoft coco: Common objects in context. In European conference on computer vision (pp. 740-755). Springer, Cham.

[Zhang et al. 2021] Zhang, H., Wang, Y., Dayoub, F. and Sunderhauf, N., 2021. Varifocalnet: An iou-aware dense object detector. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 8514-8523).
Microsoft COCO (Zhao et al 2017)
kaggle.com
zip
Updated Oct 21, 2019
Rachael Tatman (2019). Microsoft COCO (Zhao et al 2017) [Dataset]. https://www.kaggle.com/rtatman/ms-coco
Available download formats: zip (19282796 bytes)
Dataset updated
Oct 21, 2019
Authors
Rachael Tatman
Description
Context

This dataset contains pickled Python objects with data from the annotations of the Microsoft (MS) COCO dataset. COCO is a large-scale object detection, segmentation, and captioning dataset.

Content

Except for the objs file, which is a plain text file containing a list of objects, all the data in this dataset is in the pickle format, a way of storing Python objects as binary data files.

Important: These pickles were pickled using Python 2. Since Kernels use Python 3, you will need to specify the encoding when unpickling these files. The Python utility scripts here have been updated to correctly unpickle these files.

# the correct syntax to read these pickled files into Python 3
import pickle
data = pickle.load(open('file_path', 'rb'), encoding="latin1")
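The sketch below wraps that call in a small helper and shows why the encoding argument matters. The helper name load_py2_pickle and the hand-written sample bytes are illustrative assumptions, not part of the dataset's utility scripts; the sample bytes stand in for a Python 2 pickle so the example runs without the Kaggle files.

```python
import pickle


def load_py2_pickle(path):
    """Load a pickle written by Python 2 from Python 3.

    encoding="latin1" maps every byte value 0-255 to a character, so
    Python 2 byte strings decode without raising UnicodeDecodeError.
    """
    with open(path, "rb") as f:  # pickles are binary, so open with "rb"
        return pickle.load(f, encoding="latin1")


# A hand-written protocol-2 pickle of the Python 2 byte string 'caf\xe9'
# (illustrative data only, not taken from the dataset).
PY2_BYTES = b"\x80\x02U\x04caf\xe9q\x00."

try:
    pickle.loads(PY2_BYTES)  # the default encoding is ASCII
except UnicodeDecodeError:
    print("default ASCII decoding fails on non-ASCII bytes")

print(pickle.loads(PY2_BYTES, encoding="latin1"))
```

Using a with block closes the file handle promptly, which the one-line form above leaves to the garbage collector.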

Acknowledgements

As a derivative of the original COCO dataset, this dataset is distributed under a CC-BY 4.0 license. These files were distributed as part of the supporting materials for Zhao et al 2017. If you use these files in your work, please cite the following paper:

Zhao, J., Wang, T., Yatskar, M., Ordonez, V., & Chang, K. W. (2017). Men Also Like Shopping: Reducing Gender Bias Amplification using Corpus-level Constraints. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (pp. 2979-2989).
Extracted regions of interest from COCO val2017
kaggle.com
zip
Updated Jun 8, 2023
Akash Parua (2023). Extracted regions of interest from COCO val2017 [Dataset]. https://www.kaggle.com/datasets/akashparua/extracted-regions-of-interest-from-coco-val2017
Available download formats: zip (85607292 bytes)
Dataset updated
Jun 8, 2023
Authors
Akash Parua
Description
Extract the file and go to data -> Data. You will find directories named 'bear', 'bicycle', 'bus', 'car', 'cat', 'cow', 'dog', 'elephant', 'fire_hydrant', 'motorcycle', 'person', 'sheep', 'stop_sign', 'traffic_light', 'train', and 'truck'. Each directory contains positive sample images for the class given by the directory name. The images were extracted from the COCO 2017 validation images using the ground-truth annotations. The dataset can be used for one-vs-rest Support Vector Machine classification after applying HOG feature extraction to each image. It can also be used to train region proposal algorithms for object detection. Each image is named in the format (name of the source COCO val2017 image)_(x)_(y)_(w)_(h).jpg, where x, y, w, h are the bounding-box annotation (Xmin, Ymin, width, height).
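The naming scheme above can be decoded with a few lines of Python. This is a minimal sketch: the Roi type, the function names, and the sample filename are hypothetical, and it assumes the four box values are written as plain numbers separated by underscores.

```python
import os
from typing import NamedTuple, Tuple


class Roi(NamedTuple):
    image_name: str  # name of the source COCO val2017 image
    x: float         # Xmin of the bounding box
    y: float         # Ymin of the bounding box
    w: float         # box width
    h: float         # box height


def parse_roi_filename(filename: str) -> Roi:
    """Split '<image>_<x>_<y>_<w>_<h>.jpg' into its five fields."""
    stem, _ext = os.path.splitext(os.path.basename(filename))
    # Split from the right so underscores in the image name are preserved.
    parts = stem.rsplit("_", 4)
    if len(parts) != 5:
        raise ValueError(f"unexpected filename format: {filename!r}")
    name, x, y, w, h = parts
    return Roi(name, float(x), float(y), float(w), float(h))


def to_corners(roi: Roi) -> Tuple[float, float, float, float]:
    """Convert (Xmin, Ymin, width, height) to (Xmin, Ymin, Xmax, Ymax)."""
    return (roi.x, roi.y, roi.x + roi.w, roi.y + roi.h)


# Hypothetical filename following the documented pattern.
roi = parse_roi_filename("stop_sign/000000000724_10_20_30.5_40.jpg")
print(roi)
print(to_corners(roi))
```

The class label is not part of the filename; it comes from the name of the directory that holds the image.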