10 datasets found

COCO, LVIS, Open Images V4 classes mapping
zenodo.org
data.niaid.nih.gov
bin, csv, txt
Updated Oct 13, 2022
Cite
Giuseppe Amato; Paolo Bolettieri; Fabio Carrara; Fabrizio Falchi; Claudio Gennaro; Nicola Messina; Lucia Vadicamo; Claudio Vairo (2022). COCO, LVIS, Open Images V4 classes mapping [Dataset]. http://doi.org/10.5281/zenodo.7194300
Explore at:
Available download formats: csv, txt, bin
Unique identifier
https://doi.org/10.5281/zenodo.7194300
Dataset updated
Oct 13, 2022
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Giuseppe Amato; Paolo Bolettieri; Fabio Carrara; Fabrizio Falchi; Claudio Gennaro; Nicola Messina; Lucia Vadicamo; Claudio Vairo
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This repository contains a mapping from the classes of the COCO, LVIS, and Open Images V4 datasets into a single unified set of 1460 classes.

COCO [Lin et al. 2014] contains 80 classes, LVIS [Gupta et al. 2019] contains 1460 classes, and Open Images V4 [Kuznetsova et al. 2020] contains 601 classes.

We built the mapping using a semi-automatic procedure in order to obtain a single final list of 1460 classes. We also generated a hierarchy for each class using WordNet.

This repository contains the following files:

coco_classes_map.txt, contains the mapping for the 80 COCO classes

lvis_classes_map.txt, contains the mapping for the 1460 LVIS classes

openimages_classes_map.txt, contains the mapping for the 601 Open Images V4 classes

classname_hyperset_definition.csv, contains the final set of 1460 classes, their definition and hierarchy

all-classnames.xlsx, contains a side-by-side view of all classes considered

This mapping was used in VISIONE [Amato et al. 2021, Amato et al. 2022], a content-based retrieval system that supports various search functionalities (text search, object/color-based search, semantic and visual similarity search, temporal search). For object detection, VISIONE uses three pre-trained models: VfNet [Zhang et al. 2021] (trained on the COCO dataset), Mask R-CNN [He et al. 2017] (trained on LVIS), and a Faster R-CNN+Inception ResNet (trained on Open Images V4).
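As a rough illustration of how such a mapping might be applied, the sketch below renames labels coming from several detectors into one shared vocabulary. The two-column, tab-separated layout, the function names `parse_mapping` and `unify_labels`, and the sample class names are all assumptions made for this example; the actual layout of the `*_classes_map.txt` files is not described here and should be checked before use.

```python
from typing import Dict, Iterable, List, Tuple


def parse_mapping(text: str, sep: str = "\t") -> Dict[str, str]:
    """Parse lines of '<source class><sep><unified class>' into a dict.

    ASSUMPTION: the two-column layout is hypothetical, not the documented format.
    """
    mapping: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        source, _, target = line.partition(sep)
        if target.strip():
            mapping[source.strip()] = target.strip()
    return mapping


def unify_labels(
    detections: Iterable[Tuple[str, str, float]],
    mappings: Dict[str, Dict[str, str]],
) -> List[Tuple[str, float]]:
    """Rename (detector, label, score) triples to (unified label, score).

    Labels without a mapping entry are dropped.
    """
    unified: List[Tuple[str, float]] = []
    for detector, label, score in detections:
        target = mappings.get(detector, {}).get(label)
        if target is not None:
            unified.append((target, score))
    return unified


# Hypothetical mapping contents and detections, for illustration only.
coco_map = parse_mapping("tv\ttelevision_set\nperson\tperson")
lvis_map = parse_mapping("television_set\ttelevision_set")
mappings = {"coco": coco_map, "lvis": lvis_map}
detections = [
    ("coco", "tv", 0.91),
    ("lvis", "television_set", 0.78),
    ("coco", "unmapped_label", 0.50),
]
unified = unify_labels(detections, mappings)
```

Dropping unmapped labels is only one possible policy; keeping them under their original name would be equally reasonable.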

This repository is released under a Creative Commons Attribution license. Please cite the following paper if you use it in your work in any form:

@inproceedings{amato2021visione, title={The visione video search system: exploiting off-the-shelf text search engines for large-scale video retrieval}, author={Amato, Giuseppe and Bolettieri, Paolo and Carrara, Fabio and Debole, Franca and Falchi, Fabrizio and Gennaro, Claudio and Vadicamo, Lucia and Vairo, Claudio}, journal={Journal of Imaging}, volume={7}, number={5}, pages={76}, year={2021}, publisher={Multidisciplinary Digital Publishing Institute} }

References:

[Amato et al. 2022] Amato, G. et al. (2022). VISIONE at Video Browser Showdown 2022. In: , et al. MultiMedia Modeling. MMM 2022. Lecture Notes in Computer Science, vol 13142. Springer, Cham. https://doi.org/10.1007/978-3-030-98355-0_52

[Amato et al. 2021] Amato, G., Bolettieri, P., Carrara, F., Debole, F., Falchi, F., Gennaro, C., Vadicamo, L. and Vairo, C., 2021. The visione video search system: exploiting off-the-shelf text search engines for large-scale video retrieval. Journal of Imaging, 7(5), p.76.

[Gupta et al. 2019] Gupta, A., Dollar, P. and Girshick, R., 2019. Lvis: A dataset for large vocabulary instance segmentation. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 5356-5364).

[He et al. 2017] He, K., Gkioxari, G., Dollár, P. and Girshick, R., 2017. Mask r-cnn. In Proceedings of the IEEE international conference on computer vision (pp. 2961-2969).

[Kuznetsova et al. 2020] Kuznetsova, A., Rom, H., Alldrin, N., Uijlings, J., Krasin, I., Pont-Tuset, J., Kamali, S., Popov, S., Malloci, M., Kolesnikov, A. and Duerig, T., 2020. The open images dataset v4. International Journal of Computer Vision, 128(7), pp.1956-1981.

[Lin et al. 2014] Lin, T.Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P. and Zitnick, C.L., 2014, September. Microsoft coco: Common objects in context. In European conference on computer vision (pp. 740-755). Springer, Cham.

[Zhang et al. 2021] Zhang, H., Wang, Y., Dayoub, F. and Sunderhauf, N., 2021. Varifocalnet: An iou-aware dense object detector. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 8514-8523).
Recap-COCO-30K
huggingface.co
Updated Jun 28, 2024
Cite
UCSC-VLAA (2024). Recap-COCO-30K [Dataset]. https://huggingface.co/datasets/UCSC-VLAA/Recap-COCO-30K
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Jun 28, 2024
Dataset authored and provided by
UCSC-VLAA
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
LLaVA-recaptioned COCO2014 validation set.

Used for text-to-image generation evaluation. More details can be found in "What If We Recaption Billions of Web Images with LLaMA-3?"

Dataset Structure

"image_id" (str): the COCO image id.
"coco_url" (image): the COCO image URL.
"caption" (str): the original COCO caption.
"recaption" (str): the LLaVA-recaptioned COCO caption.
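A minimal sketch of working with records that carry the four fields listed above: it checks that each record has the expected keys and compares the average word count of the original and recaptioned text. The function name `mean_caption_lengths` and the sample records are made up for illustration; real records would come from the dataset page.

```python
from typing import Dict, Iterable, Tuple

# Field names as listed in the dataset structure above.
FIELDS = ("image_id", "coco_url", "caption", "recaption")


def mean_caption_lengths(records: Iterable[Dict[str, object]]) -> Tuple[float, float]:
    """Return (mean words in 'caption', mean words in 'recaption')."""
    total_orig = total_recap = count = 0
    for rec in records:
        missing = [f for f in FIELDS if f not in rec]
        if missing:
            raise KeyError(f"record is missing fields: {missing}")
        total_orig += len(str(rec["caption"]).split())
        total_recap += len(str(rec["recaption"]).split())
        count += 1
    if count == 0:
        raise ValueError("no records given")
    return total_orig / count, total_recap / count


# Hypothetical records; 'coco_url' is left as None because the image is not needed here.
sample = [
    {
        "image_id": "000000000001",
        "coco_url": None,
        "caption": "A dog on grass.",
        "recaption": "A brown dog lies on green grass in a sunny park.",
    },
    {
        "image_id": "000000000002",
        "coco_url": None,
        "caption": "Two cats sleeping.",
        "recaption": "Two gray cats sleep side by side on a sofa.",
    },
]
orig_len, recap_len = mean_caption_lengths(sample)
```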

Citation

BibTeX: @article{li2024recapdatacomp, title={What If We Recaption Billions of Web Images with… See the full description on the dataset page: https://huggingface.co/datasets/UCSC-VLAA/Recap-COCO-30K.
coco-detection-strings
huggingface.co
Updated May 26, 2025
Cite
Aritra Roy Gosthipaty (2025). coco-detection-strings [Dataset]. https://huggingface.co/datasets/ariG23498/coco-detection-strings
Explore at:
Dataset updated
May 26, 2025
Authors
Aritra Roy Gosthipaty
Description
COCO bounding boxes converted to PaliGemma-style detection strings. Reference dataset: detection-datasets/coco
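To make the idea concrete, here is a sketch of one way a COCO box could be turned into a PaliGemma-style detection string. It assumes the commonly described convention of four `<locNNNN>` tokens in the order y_min, x_min, y_max, x_max, each coordinate normalized by the image size and quantized into 1024 bins. Whether this dataset uses exactly that binning and ordering is an assumption, and `coco_box_to_loc_string` is a hypothetical helper name.

```python
from typing import Sequence


def coco_box_to_loc_string(
    box: Sequence[float], img_w: int, img_h: int, label: str
) -> str:
    """Convert a COCO [x, y, width, height] box to '<loc....>' tokens plus label.

    ASSUMPTION: 1024 bins, token order y_min, x_min, y_max, x_max.
    """
    x, y, w, h = box

    def to_bin(value: float, size: int) -> int:
        # Normalize to [0, 1], scale to 1024 bins, clamp to the valid range.
        return max(0, min(1023, int(value / size * 1024)))

    tokens = [
        to_bin(y, img_h),
        to_bin(x, img_w),
        to_bin(y + h, img_h),
        to_bin(x + w, img_w),
    ]
    return "".join(f"<loc{t:04d}>" for t in tokens) + f" {label}"


# A box covering the top-left quarter of a 100x200 (width x height) image.
example = coco_box_to_loc_string([0, 0, 50, 100], img_w=100, img_h=200, label="cat")
```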
Taco: Trash Annotations In Context Dataset
universe.roboflow.com
zenodo.org
zip
Updated Aug 1, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Mohamed Traore (2024). Taco: Trash Annotations In Context Dataset [Dataset]. https://universe.roboflow.com/mohamed-traore-2ekkp/taco-trash-annotations-in-context/model/13
Explore at:
Available download formats: zip
Dataset updated
Aug 1, 2024
Dataset authored and provided by
Mohamed Traore
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Variables measured
Trash Polygons
Description
TACO: Trash Annotations in Context Dataset

From: Pedro F. Proença; Pedro Simões

For more information, go to: http://tacodataset.org

https://github.com/pedropro/TACO

[Image: TACO Logo]

TACO is a growing image dataset of trash in the wild. It contains segmented images of litter taken in diverse environments: woods, roads and beaches. These images are manually labeled according to a hierarchical taxonomy to train and evaluate object detection algorithms. Annotations are provided in a format similar to that of the COCO dataset.
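Since the annotations follow a COCO-like layout, a short sketch of reading such a file may help. It relies only on the standard COCO keys `categories` and `annotations` (with `category_id` on each annotation); the inline JSON and its category names are invented for the example and are not taken from TACO.

```python
import json
from collections import Counter
from typing import Dict


def count_annotations_per_category(coco: dict) -> Dict[str, int]:
    """Count annotations per category name in a COCO-style dictionary."""
    id_to_name = {cat["id"]: cat["name"] for cat in coco["categories"]}
    counts = Counter(id_to_name[ann["category_id"]] for ann in coco["annotations"])
    return dict(counts)


# Hypothetical miniature annotation file, for illustration only.
raw = """
{
  "images": [{"id": 1, "file_name": "a.jpg", "width": 640, "height": 480}],
  "categories": [{"id": 0, "name": "Bottle"}, {"id": 1, "name": "Can"}],
  "annotations": [
    {"id": 10, "image_id": 1, "category_id": 0, "bbox": [10, 20, 30, 40]},
    {"id": 11, "image_id": 1, "category_id": 0, "bbox": [50, 60, 20, 20]},
    {"id": 12, "image_id": 1, "category_id": 1, "bbox": [5, 5, 15, 25]}
  ]
}
"""
per_category = count_annotations_per_category(json.loads(raw))
```

With a real file, `json.loads(raw)` would be replaced by `json.load` on the opened annotation file.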

The model in action:

[Image: GIF of the model running inference]

Example images from the dataset:

[Images: Example Image #2 and Example Image #5 from the Dataset]

For more details and to cite the authors:

Paper: https://arxiv.org/abs/2003.06975

Paper Citation: @article{taco2020, title={TACO: Trash Annotations in Context for Litter Detection}, author={Pedro F Proença and Pedro Simões}, journal={arXiv preprint arXiv:2003.06975}, year=
Style Transfer for Object Detection in Art
kaggle.com
Updated Mar 11, 2021
Cite
David Kadish (2021). Style Transfer for Object Detection in Art [Dataset]. https://www.kaggle.com/datasets/davidkadish/style-transfer-for-object-detection-in-art/discussion
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Mar 11, 2021
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
David Kadish
Description
Context

Despite recent advances in object detection using deep learning neural networks, these neural networks still struggle to identify objects in art images such as paintings and drawings. This challenge is known as the cross-depiction problem, and it stems in part from the tendency of neural networks to prioritize identification of an object's texture over its shape. In this paper we propose and evaluate a process for training neural networks to localize objects, specifically people, in art images. We generated a large dataset for training and validation by modifying the images in the COCO dataset using AdaIN style transfer (style-coco.tar.xz). This dataset was used to fine-tune a Faster R-CNN object detection network (2020-12-10_09-45-15_58672_resnet152_stylecoco_epoch_15.pth), which was then tested on the existing People-Art testing dataset (PeopleArt-Coco.tar.xz). The result is a significant improvement on the state of the art and a new way forward for creating datasets to train neural networks to process art images.
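For readers unfamiliar with AdaIN (adaptive instance normalization), the core operation shifts and scales content features so that their mean and standard deviation match those of the style features. The sketch below applies it to a single channel stored as a flat list of numbers; it is a simplified illustration of the general technique, not the code used to build this dataset. Adding `eps` to the standard deviation (rather than to the variance) is a simplification chosen here.

```python
from statistics import fmean, pstdev
from typing import List, Sequence


def adain_channel(
    content: Sequence[float], style: Sequence[float], eps: float = 1e-5
) -> List[float]:
    """Re-normalize one channel of content features to the style statistics.

    Computes style_std * (x - content_mean) / (content_std + eps) + style_mean.
    """
    c_mean, c_std = fmean(content), pstdev(content)
    s_mean, s_std = fmean(style), pstdev(style)
    return [s_std * (x - c_mean) / (c_std + eps) + s_mean for x in content]


# Toy values: the output keeps the content's ordering but takes on the style's statistics.
stylized = adain_channel([1.0, 2.0, 3.0], [10.0, 20.0, 30.0])
```

In a full style-transfer model this would be applied per channel to encoder feature maps before decoding back to an image.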

Content

2020-12-10_09-45-15_58672_resnet152_stylecoco_epoch_15.pth: Trained object detection network (Faster-RCNN using a ResNet152 backbone pretrained on ImageNet) for use with PyTorch
PeopleArt-Coco.tar.xz: People-Art dataset with COCO-formatted annotations (original at https://github.com/BathVisArtData/PeopleArt)
style-coco.tar.xz: Stylized COCO dataset containing only the person category. Used to train 2020-12-10_09-45-15_58672_resnet152_stylecoco_epoch_15.pth

Code

The code is available on GitHub at https://github.com/dkadish/Style-Transfer-for-Object-Detection-in-Art

Citing

If you are using this code or the concept of style transfer for object detection in art, please cite our paper (https://arxiv.org/abs/2102.06529):

D. Kadish, S. Risi, and A. S. Løvlie, “Improving Object Detection in Art Images Using Only Style Transfer,” Feb. 2021.
Activities of Daily Living Object Dataset
figshare.com
bin
Updated Nov 28, 2024
Cite
Md Tanzil Shahria; Mohammad H Rahman (2024). Activities of Daily Living Object Dataset [Dataset]. http://doi.org/10.6084/m9.figshare.27263424.v3
Explore at:
Available download formats: bin
Unique identifier
https://doi.org/10.6084/m9.figshare.27263424.v3
Dataset updated
Nov 28, 2024
Dataset provided by
figshare
Authors
Md Tanzil Shahria; Mohammad H Rahman
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Activities of Daily Living Object Dataset

Overview

The ADL (Activities of Daily Living) Object Dataset is a curated collection of images and annotations specifically focusing on objects commonly interacted with during daily living activities. This dataset is designed to facilitate research and development in assistive robotics in home environments.

Data Sources and Licensing

The dataset comprises images and annotations sourced from four publicly available datasets:

COCO Dataset
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
License Link: https://creativecommons.org/licenses/by/4.0/
Citation: Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., & Zitnick, C. L. (2014). Microsoft COCO: Common Objects in Context. European Conference on Computer Vision (ECCV), 740–755.

Open Images Dataset
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
License Link: https://creativecommons.org/licenses/by/4.0/
Citation: Kuznetsova, A., Rom, H., Alldrin, N., Uijlings, J., Krasin, I., Pont-Tuset, J., Kamali, S., Popov, S., Malloci, M., Duerig, T., & Ferrari, V. (2020). The Open Images Dataset V6: Unified Image Classification, Object Detection, and Visual Relationship Detection at Scale. International Journal of Computer Vision, 128(7), 1956–1981.

LVIS Dataset
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
License Link: https://creativecommons.org/licenses/by/4.0/
Citation: Gupta, A., Dollar, P., & Girshick, R. (2019). LVIS: A Dataset for Large Vocabulary Instance Segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 5356–5364.

Roboflow Universe
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
License Link: https://creativecommons.org/licenses/by/4.0/
Citation: The following repositories from Roboflow Universe were used in compiling this dataset:

Work, U. AI Based Automatic Stationery Billing System Data Dataset. 2022. Accessible at: https://universe.roboflow.com/university-work/ai-based-automatic-stationery-billing-system-data (accessed on 11 October 2024).
Destruction, P.M. Pencilcase Dataset. 2023. Accessible at: https://universe.roboflow.com/project-mental-destruction/pencilcase-se7nb (accessed on 11 October 2024).
Destruction, P.M. Final Project Dataset. 2023. Accessible at: https://universe.roboflow.com/project-mental-destruction/final-project-wsuvj (accessed on 11 October 2024).
Personal. CSST106 Dataset. 2024. Accessible at: https://universe.roboflow.com/personal-pgkq6/csst106 (accessed on 11 October 2024).
New-Workspace-kubz3. Pencilcase Dataset. 2022. Accessible at: https://universe.roboflow.com/new-workspace-kubz3/pencilcase-s9ag9 (accessed on 11 October 2024).
Finespiralnotebook. Spiral Notebook Dataset. 2024. Accessible at: https://universe.roboflow.com/finespiralnotebook/spiral_notebook (accessed on 11 October 2024).
Dairymilk. Classmate Dataset. 2024. Accessible at: https://universe.roboflow.com/dairymilk/classmate (accessed on 11 October 2024).
Dziubatyi, M. Domace Zadanie Notebook Dataset. 2023. Accessible at: https://universe.roboflow.com/maksym-dziubatyi/domace-zadanie-notebook (accessed on 11 October 2024).
One. Stationery Dataset. 2024. Accessible at: https://universe.roboflow.com/one-vrmjr/stationery-mxtt2 (accessed on 11 October 2024).
jk001226. Liplip Dataset. 2024. Accessible at: https://universe.roboflow.com/jk001226/liplip (accessed on 11 October 2024).
jk001226. Lip Dataset. 2024. Accessible at: https://universe.roboflow.com/jk001226/lip-uteep (accessed on 11 October 2024).
Upwork5. Socks3 Dataset. 2022. Accessible at: https://universe.roboflow.com/upwork5/socks3 (accessed on 11 October 2024).
Book. DeskTableLamps Material Dataset. 2024. Accessible at: https://universe.roboflow.com/book-mxasl/desktablelamps-material-rjbgd (accessed on 11 October 2024).
Gary. Medicine Jar Dataset. 2024. Accessible at: https://universe.roboflow.com/gary-ofgwc/medicine-jar (accessed on 11 October 2024).
TEST. Kolmarbnh Dataset. 2023. Accessible at: https://universe.roboflow.com/test-wj4qi/kolmarbnh (accessed on 11 October 2024).
Tube. Tube Dataset. 2024. Accessible at: https://universe.roboflow.com/tube-nv2vt/tube-9ah9t (accessed on 11 October 2024).
Staj. Canned Goods Dataset. 2024. Accessible at: https://universe.roboflow.com/staj-2ipmz/canned-goods-isxbi (accessed on 11 October 2024).
Hussam, M. Wallet Dataset. 2024. Accessible at: https://universe.roboflow.com/mohamed-hussam-cq81o/wallet-sn9n2 (accessed on 14 October 2024).
Training, K. Perfume Dataset. 2022. Accessible at: https://universe.roboflow.com/kdigital-training/perfume (accessed on 14 October 2024).
Keyboards. Shoe-Walking Dataset. 2024. Accessible at: https://universe.roboflow.com/keyboards-tjtri/shoe-walking (accessed on 14 October 2024).
MOMO. Toilet Paper Dataset. 2024. Accessible at: https://universe.roboflow.com/momo-nutwk/toilet-paper-wehrw (accessed on 14 October 2024).
Project-zlrja. Toilet Paper Detection Dataset. 2024. Accessible at: https://universe.roboflow.com/project-zlrja/toilet-paper-detection (accessed on 14 October 2024).
Govorkov, Y. Highlighter Detection Dataset. 2023. Accessible at: https://universe.roboflow.com/yuriy-govorkov-j9qrv/highlighter_detection (accessed on 14 October 2024).
Stock. Plum Dataset. 2024. Accessible at: https://universe.roboflow.com/stock-qxdzf/plum-kdznw (accessed on 14 October 2024).
Ibnu. Avocado Dataset. 2024. Accessible at: https://universe.roboflow.com/ibnu-h3cda/avocado-g9fsl (accessed on 14 October 2024).
Molina, N. Detection Avocado Dataset. 2024. Accessible at: https://universe.roboflow.com/norberto-molina-zakki/detection-avocado (accessed on 14 October 2024).
in Lab, V.F. Peach Dataset. 2023. Accessible at: https://universe.roboflow.com/vietnam-fruit-in-lab/peach-ejdry (accessed on 14 October 2024).
Group, K. Tomato Detection 4 Dataset. 2023. Accessible at: https://universe.roboflow.com/kkabs-group-dkcni/tomato-detection-4 (accessed on 14 October 2024).
Detection, M. Tomato Checker Dataset. 2024. Accessible at: https://universe.roboflow.com/money-detection-xez0r/tomato-checker (accessed on 14 October 2024).
University, A.S. Smart Cam V1 Dataset. 2023. Accessible at: https://universe.roboflow.com/ain-shams-university-byja6/smart_cam_v1 (accessed on 14 October 2024).
EMAD, S. Keysdetection Dataset. 2023. Accessible at: https://universe.roboflow.com/shehab-emad-n2q9i/keysdetection (accessed on 14 October 2024).
Roads. Chips Dataset. 2024. Accessible at: https://universe.roboflow.com/roads-rvmaq/chips-a0us5 (accessed on 14 October 2024).
workspace bgkzo, N. Object Dataset. 2021. Accessible at: https://universe.roboflow.com/new-workspace-bgkzo/object-eidim (accessed on 14 October 2024).
Watch, W. Wrist Watch Dataset. 2024. Accessible at: https://universe.roboflow.com/wrist-watch/wrist-watch-0l25c (accessed on 14 October 2024).
WYZUP. Milk Dataset. 2024. Accessible at: https://universe.roboflow.com/wyzup/milk-onbxt (accessed on 14 October 2024).
AussieStuff. Food Dataset. 2024. Accessible at: https://universe.roboflow.com/aussiestuff/food-al9wr (accessed on 14 October 2024).
Almukhametov, A. Pencils Color Dataset. 2023. Accessible at: https://universe.roboflow.com/almas-almukhametov-hs5jk/pencils-color (accessed on 14 October 2024).

All images and annotations obtained from these datasets are released under the Creative Commons Attribution 4.0 International License (CC BY 4.0). This license permits sharing and adaptation of the material in any medium or format, for any purpose, even commercially, provided that appropriate credit is given, a link to the license is provided, and any changes made are indicated.

Redistribution Permission: As all images and annotations are under the CC BY 4.0 license, we are legally permitted to redistribute this data within our dataset. We have complied with the license terms by:
Providing appropriate attribution to the original creators.
Including links to the CC BY 4.0 license.
Indicating any changes made to the original material.

Dataset Structure

The dataset includes:
Images: High-quality images featuring ADL objects suitable for robotic manipulation.
Annotations: Bounding boxes and class labels formatted in the YOLO (You Only Look Once) Darknet format.

Classes

The dataset focuses on objects commonly involved in daily living activities. A full list of object classes is provided in the classes.txt file.

Format

Images: JPEG format.
Annotations: Text files corresponding to each image, containing bounding box coordinates and class labels in YOLO Darknet format.

How to Use the Dataset

Download the Dataset
Unpack the Dataset
unzip ADL_Object_Dataset.zip

How to Cite This Dataset

If you use this dataset in your research, please cite our paper:

@article{shahria2024activities, title={Activities of Daily Living Object Dataset: Advancing Assistive Robotic Manipulation with a Tailored Dataset}, author={Shahria, Md Tanzil and Rahman, Mohammad H.}, journal={Sensors}, volume={24}, number={23}, pages={7566}, year={2024}, publisher={MDPI}}

License

This dataset is released under the Creative Commons Attribution 4.0 International License (CC BY 4.0).
License Link: https://creativecommons.org/licenses/by/4.0/
By using this dataset, you agree to provide appropriate credit, indicate if changes were made, and not impose additional restrictions beyond those of the original licenses.

Acknowledgments

We gratefully acknowledge the use of data from the following open-source datasets, which were instrumental in the creation of our specialized ADL object dataset:
COCO Dataset: We thank the creators and contributors of the COCO dataset for making their images and annotations publicly available under the CC BY 4.0 license.
Open Images Dataset: We express our gratitude to the Open Images team for providing a comprehensive dataset of annotated images under the CC BY 4.0 license.
LVIS Dataset: We appreciate the efforts of the LVIS dataset creators for releasing their extensive dataset under the CC BY 4.0 license.
Roboflow Universe:
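The YOLO Darknet annotation format mentioned in the description stores one object per line as `class_id x_center y_center width height`, with the four box values normalized to the image size. A minimal sketch of converting such a line to pixel corner coordinates follows; the function name `yolo_line_to_pixel_box` and the sample line are illustrative, and the image dimensions would have to come from the corresponding JPEG.

```python
from typing import Tuple


def yolo_line_to_pixel_box(
    line: str, img_w: int, img_h: int
) -> Tuple[int, Tuple[float, float, float, float]]:
    """Parse one YOLO Darknet line into (class_id, (x_min, y_min, x_max, y_max))."""
    parts = line.split()
    if len(parts) < 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}: {line!r}")
    class_id = int(parts[0])
    xc, yc, w, h = (float(p) for p in parts[1:5])
    # Scale normalized center/size values to pixels, then convert to corners.
    x_min = (xc - w / 2) * img_w
    y_min = (yc - h / 2) * img_h
    x_max = (xc + w / 2) * img_w
    y_max = (yc + h / 2) * img_h
    return class_id, (x_min, y_min, x_max, y_max)


# Hypothetical annotation line for a 100x200 (width x height) image.
class_id, box = yolo_line_to_pixel_box("3 0.5 0.5 0.2 0.4", img_w=100, img_h=200)
```

The class id indexes into the list of names in classes.txt, assuming the usual one-name-per-line, zero-based convention.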
cococon
huggingface.co
Updated Apr 16, 2023
Cite
Adyasha Maharana (2023). cococon [Dataset]. https://huggingface.co/datasets/adymaharana/cococon
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Apr 16, 2023
Authors
Adyasha Maharana
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Dataset Card for CoCoCON

Dataset Description

CocoCON is a challenging dataset for evaluating cross-task consistency in vision-and-language models. We use contrast sets created by modifying COCO test instances for multiple tasks in small but semantically meaningful ways to change the gold label, and… See the full description on the dataset page: https://huggingface.co/datasets/adymaharana/cococon.
MOBDrone: a large-scale drone-view dataset for man overboard detection
explore.openaire.eu
data.niaid.nih.gov
+1 more
Updated Jan 1, 2022
Cite
Donato Cafarelli; Luca Ciampi; Lucia Vadicamo; Claudio Gennaro; Andrea Berton; Marco Paterni; Chiara Benvenuti; Mirko Passera; Fabrizio Falchi (2022). MOBDrone: a large-scale drone-view dataset for man overboard detection [Dataset]. http://doi.org/10.5281/zenodo.5996889
Explore at:
Unique identifier
https://doi.org/10.5281/zenodo.5996889
Dataset updated
Jan 1, 2022
Authors
Donato Cafarelli; Luca Ciampi; Lucia Vadicamo; Claudio Gennaro; Andrea Berton; Marco Paterni; Chiara Benvenuti; Mirko Passera; Fabrizio Falchi
Description
Dataset

The Man OverBoard Drone (MOBDrone) dataset is a large-scale collection of aerial footage images. It contains 126,170 frames extracted from 66 video clips gathered from one UAV flying at an altitude of 10 to 60 meters above the mean sea level. Images are manually annotated with more than 180K bounding boxes localizing objects belonging to 5 categories --- person, boat, lifebuoy, surfboard, wood. More than 113K of these bounding boxes belong to the person category and localize people in the water simulating the need to be rescued.

In this repository, we provide:

66 Full HD video clips (total size: 5.5 GB)
126,170 images extracted from the videos at a rate of 30 FPS (total size: 243 GB)
3 annotation files for the extracted images that follow the MS COCO data format (for more info see https://cocodataset.org/#format-data):
annotations_5_custom_classes.json: this file contains annotations concerning all five categories; please note that class ids do not correspond with the ones provided by the MS COCO standard since we account for two new classes not previously considered in the MS COCO dataset --- lifebuoy and wood
annotations_3_coco_classes.json: this file contains annotations concerning the three classes also accounted by the MS COCO dataset --- person, boat, surfboard. Class ids correspond with the ones provided by the MS COCO standard.
annotations_person_coco_classes.json: this file contains annotations concerning only the 'person' class. Class id corresponds to the one provided by the MS COCO standard.

The MOBDrone dataset is intended as a test data benchmark. However, for researchers interested in using our data also for training purposes, we provide training and test splits:

Test set: All the images whose filename starts with "DJI_0804" (total: 37,604 images)
Training set: All the images whose filename starts with "DJI_0915" (total: 88,568 images)

More details about data generation and the evaluation protocol can be found at our MOBDrone paper: https://arxiv.org/abs/2203.07973
The code to reproduce our results is available at this GitHub Repository: https://github.com/ciampluca/MOBDrone_eval
See also http://aimh.isti.cnr.it/dataset/MOBDrone

Citing the MOBDrone

The MOBDrone is released under a Creative Commons Attribution license, so please cite the MOBDrone if it is used in your work in any form. Published academic papers should use the academic paper citation for our MOBDrone paper, where we evaluated several pre-trained state-of-the-art object detectors focusing on the detection of the overboard people

@inproceedings{MOBDrone2021, title={MOBDrone: a Drone Video Dataset for Man OverBoard Rescue}, author={Donato Cafarelli and Luca Ciampi and Lucia Vadicamo and Claudio Gennaro and Andrea Berton and Marco Paterni and Chiara Benvenuti and Mirko Passera and Fabrizio Falchi}, booktitle={ICIAP2021: 21th International Conference on Image Analysis and Processing}, year={2021} }

and this Zenodo Dataset

@dataset{donato_cafarelli_2022_5996890, author={Donato Cafarelli and Luca Ciampi and Lucia Vadicamo and Claudio Gennaro and Andrea Berton and Marco Paterni and Chiara Benvenuti and Mirko Passera and Fabrizio Falchi}, title = {{MOBDrone: a large-scale drone-view dataset for man overboard detection}}, month = feb, year = 2022, publisher = {Zenodo}, version = {1.0.0}, doi = {10.5281/zenodo.5996890}, url = {https://doi.org/10.5281/zenodo.5996890} }

Personal works, such as machine learning projects/blog posts, should provide a URL to the MOBDrone Zenodo page (https://doi.org/10.5281/zenodo.5996890), though a reference to our MOBDrone paper would also be appreciated.

Contact Information

If you would like further information about the MOBDrone or if you experience any issues downloading files, please contact us at mobdrone[at]isti.cnr.it

Acknowledgements

This work was partially supported by NAUSICAA - "NAUtical Safety by means of Integrated Computer-Assistance Appliances 4.0" project funded by the Tuscany region (CUP D44E20003410009). The data collection was carried out with the collaboration of the Fly&Sense Service of the CNR of Pisa - for the flight operations of remotely piloted aerial systems - and of the Institute of Clinical Physiology (IFC) of the CNR - for the water immersion operations.
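The training/test split described for MOBDrone is defined purely by filename prefix, so it can be reproduced with a few lines. The sketch below uses the two prefixes stated in the description; the function name `split_by_prefix`, the `unassigned` bucket, and the example filenames are assumptions for illustration (the real frame naming beyond the prefix is not specified here).

```python
from typing import Dict, Iterable, List

TEST_PREFIX = "DJI_0804"   # test set prefix, as stated in the description
TRAIN_PREFIX = "DJI_0915"  # training set prefix, as stated in the description


def split_by_prefix(filenames: Iterable[str]) -> Dict[str, List[str]]:
    """Assign each image filename to 'train', 'test', or 'unassigned'."""
    splits: Dict[str, List[str]] = {"train": [], "test": [], "unassigned": []}
    for name in filenames:
        if name.startswith(TEST_PREFIX):
            splits["test"].append(name)
        elif name.startswith(TRAIN_PREFIX):
            splits["train"].append(name)
        else:
            splits["unassigned"].append(name)
    return splits


# Hypothetical filenames; only the prefixes come from the dataset description.
files = [
    "DJI_0804_00001.jpg",
    "DJI_0915_00001.jpg",
    "DJI_0915_00002.jpg",
    "README.txt",
]
splits = split_by_prefix(files)
```

Keeping an explicit `unassigned` bucket makes it easy to notice files that match neither prefix instead of silently dropping them.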
Teleaudiology facilitators: A scoping review (Coco et al., 2020)
asha.figshare.com
pdf
Updated May 31, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Laura Coco; Alyssa Davidson; Nicole Marrone (2023). Teleaudiology facilitators: A scoping review (Coco et al., 2020) [Dataset]. http://doi.org/10.23641/asha.12475796.v1
Explore at:
Available download formats: pdf
Unique identifier
https://doi.org/10.23641/asha.12475796.v1
Dataset updated
May 31, 2023
Dataset provided by
American Speech–Language–Hearing Association
Authors
Laura Coco; Alyssa Davidson; Nicole Marrone
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Purpose: Teleaudiology helps improve access to hearing health care by overcoming the geographic gap between providers and patients. In many teleaudiology encounters, a facilitator is needed at the patient site to help with hands-on aspects of procedures. The aim of this study was to review the scope and nature of research around patient-site facilitators in teleaudiology. We focused on identifying the facilitators’ background, training, and responsibilities.

Method: To conduct this scoping review, we searched PubMed, CINAHL, and Embase. To be included, studies needed to address teleaudiology; be experimental/quasi-experimental, correlational/predictive, or descriptive; be published in English; and include the use of a facilitator at the patient location.

Results: A total of 82 studies met the inclusion criteria. The available literature described a number of different individuals in the role of the patient-site facilitator, including audiologists, students, and local aides. Fifty-seven unique tasks were identified, including orienting the client to the space, assisting with technology, and assisting with audiology procedures. The largest number of studies (n = 42) did not describe the facilitators’ training. When reported, the facilitators’ training was heterogenous in terms of who delivered the training, the length of the training, and the training content.

Conclusions: Across studies, the range of duties performed by patient-site facilitators indicates they may have an important role in teleaudiology. However, details are still needed surrounding their background, responsibilities, and training. Future research is warranted exploring the role of the patient-site facilitator, including their impact on teleaudiology service delivery.

Supplemental Material S1. Summary of review studies.
Supplemental Material S2. Audiology-related sub-specialty duties performed by patient-site facilitators with citations.
Supplemental Material S3. General telehealth duties performed by patient-site facilitators with citations.

Coco, L., Davidson, A., & Marrone, N. (2020). The role of patient-site facilitators in teleaudiology: A scoping review. American Journal of Audiology, 29(3S), 661-675. https://doi.org/10.1044/2020_AJA-19-00070

Publisher Note: This article is part of the Special Issue: 4th International Meeting on Internet and Audiology.
coco2014train_10k
huggingface.co
Cite
Shengfang Zhai, coco2014train_10k [Dataset]. https://huggingface.co/datasets/zsf/coco2014train_10k
Explore at:
Authors
Shengfang Zhai
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
Description

The dataset, randomly selected from MS-COCO, contains a total of 10K image-text pairs. It is intended for use with the paper "Text-to-Image Diffusion Models can be Easily Backdoored through Multimodal Data Poisoning". The dataset has been processed to conform to the format required by the BadT2I code in its GitHub repository.

Citation

If you find it useful in your research, please consider citing our paper: @inproceedings{zhai2023text, title={Text-to-image diffusion… See the full description on the dataset page: https://huggingface.co/datasets/zsf/coco2014train_10k.
Cite

Giuseppe Amato; Paolo Bolettieri; Fabio Carrara; Fabrizio Falchi; Claudio Gennaro; Nicola Messina; Lucia Vadicamo; Claudio Vairo (2022). COCO, LVIS, Open Images V4 classes mapping [Dataset]. http://doi.org/10.5281/zenodo.7194300

COCO, LVIS, Open Images V4 classes mapping

3 scholarly articles cite this dataset (View in Google Scholar)

Available download formats: csv, txt, bin

Unique identifier

https://doi.org/10.5281/zenodo.7194300

Dataset updated

Oct 13, 2022

Dataset provided by

Zenodo (http://zenodo.org/)

Authors

License

Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically

Description

This repository contains a mapping between the classes of COCO, LVIS, and Open Images V4 datasets into a unique set of 1460 classes.

COCO [Lin et al. 2014] contains 80 classes, LVIS [Gupta et al. 2019] contains 1460 classes, and Open Images V4 [Kuznetsova et al. 2020] contains 601 classes.

We built the mapping with a semi-automatic procedure to obtain a single final list of 1460 classes. We also generated a hierarchy for each class using WordNet.

This repository contains the following files:

coco_classes_map.txt contains the mapping for the 80 COCO classes
lvis_classes_map.txt contains the mapping for the 1460 LVIS classes
openimages_classes_map.txt contains the mapping for the 601 Open Images V4 classes
classname_hyperset_definition.csv contains the final set of 1460 classes, with their definitions and hierarchy
all-classnames.xlsx contains a side-by-side view of all classes considered

This mapping was used in VISIONE [Amato et al. 2021, Amato et al. 2022], a content-based retrieval system that supports various search functionalities (text search, object/color-based search, semantic and visual similarity search, temporal search). For object detection, VISIONE uses three pre-trained models: VfNet [Zhang et al. 2021] (trained on the COCO dataset), Mask R-CNN [He et al. 2017] (trained on LVIS), and a Faster R-CNN+Inception ResNet (trained on Open Images V4).
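
As a rough illustration of how such a mapping could be applied, the sketch below relabels detections from two detectors into one shared class set. The layout of the mapping files is not described here, so the tab-separated "source class, unified class" format, the example class names, and the helper names load_class_map and unify are all assumptions made for this example, not the repository's actual format or the VISIONE implementation.

```python
import csv
import io

# Hypothetical excerpts: a "source_class<TAB>unified_class" layout is ASSUMED
# for illustration; the real mapping files may be structured differently.
COCO_MAP_TXT = "cell phone\tcellular_telephone\ncouch\tsofa\n"
OPENIMAGES_MAP_TXT = "Mobile phone\tcellular_telephone\nCouch\tsofa\n"


def load_class_map(text):
    """Parse tab-separated 'source<TAB>unified' lines into a dict."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    return {row[0]: row[1] for row in reader if len(row) >= 2}


def unify(detections, class_map):
    """Relabel (label, score) detections; drop labels that have no mapping."""
    return [
        (class_map[label], score)
        for label, score in detections
        if label in class_map
    ]


coco_map = load_class_map(COCO_MAP_TXT)
openimages_map = load_class_map(OPENIMAGES_MAP_TXT)

# Detections from two different detectors end up sharing one label vocabulary.
merged = unify([("cell phone", 0.91)], coco_map) + unify(
    [("Mobile phone", 0.84), ("Unknown", 0.5)], openimages_map
)
print(merged)
```

With a shared vocabulary, detections from different models can be indexed and searched under the same class names; how unmapped labels should be handled (dropped here) is a design choice that depends on the application.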

This repository is released under a Creative Commons Attribution license. Please cite the following paper if you use it in your work in any form:

@inproceedings{amato2021visione,
 title={The visione video search system: exploiting off-the-shelf text search engines for large-scale video retrieval},
 author={Amato, Giuseppe and Bolettieri, Paolo and Carrara, Fabio and Debole, Franca and Falchi, Fabrizio and Gennaro, Claudio and Vadicamo, Lucia and Vairo, Claudio},
 journal={Journal of Imaging},
 volume={7},
 number={5},
 pages={76},
 year={2021},
 publisher={Multidisciplinary Digital Publishing Institute}
}

References:

[Amato et al. 2022] Amato, G. et al. (2022). VISIONE at Video Browser Showdown 2022. In: , et al. MultiMedia Modeling. MMM 2022. Lecture Notes in Computer Science, vol 13142. Springer, Cham. https://doi.org/10.1007/978-3-030-98355-0_52

[Amato et al. 2021] Amato, G., Bolettieri, P., Carrara, F., Debole, F., Falchi, F., Gennaro, C., Vadicamo, L. and Vairo, C., 2021. The visione video search system: exploiting off-the-shelf text search engines for large-scale video retrieval. Journal of Imaging, 7(5), p.76.

[Gupta et al. 2019] Gupta, A., Dollár, P. and Girshick, R., 2019. Lvis: A dataset for large vocabulary instance segmentation. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 5356-5364).

[He et al. 2017] He, K., Gkioxari, G., Dollár, P. and Girshick, R., 2017. Mask r-cnn. In Proceedings of the IEEE international conference on computer vision (pp. 2961-2969).

[Kuznetsova et al. 2020] Kuznetsova, A., Rom, H., Alldrin, N., Uijlings, J., Krasin, I., Pont-Tuset, J., Kamali, S., Popov, S., Malloci, M., Kolesnikov, A. and Duerig, T., 2020. The open images dataset v4. International Journal of Computer Vision, 128(7), pp.1956-1981.

[Lin et al. 2014] Lin, T.Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P. and Zitnick, C.L., 2014, September. Microsoft coco: Common objects in context. In European conference on computer vision (pp. 740-755). Springer, Cham.

[Zhang et al. 2021] Zhang, H., Wang, Y., Dayoub, F. and Sunderhauf, N., 2021. Varifocalnet: An iou-aware dense object detector. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 8514-8523).

