10 datasets found
  1. COCO, LVIS, Open Images V4 classes mapping

    • zenodo.org
    • data.niaid.nih.gov
    bin, csv, txt
    Updated Oct 13, 2022
    Cite
    Giuseppe Amato; Paolo Bolettieri; Fabio Carrara; Fabrizio Falchi; Claudio Gennaro; Nicola Messina; Lucia Vadicamo; Claudio Vairo (2022). COCO, LVIS, Open Images V4 classes mapping [Dataset]. http://doi.org/10.5281/zenodo.7194300
    Explore at:
    Available download formats: csv, txt, bin
    Dataset updated
    Oct 13, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Giuseppe Amato; Paolo Bolettieri; Fabio Carrara; Fabrizio Falchi; Claudio Gennaro; Nicola Messina; Lucia Vadicamo; Claudio Vairo
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This repository contains a mapping of the classes of the COCO, LVIS, and Open Images V4 datasets into a single unified set of 1460 classes.

    COCO [Lin et al. 2014] contains 80 classes, LVIS [Gupta et al. 2019] contains 1460 classes, and Open Images V4 [Kuznetsova et al. 2020] contains 601 classes.

    We built a mapping of these classes using a semi-automatic procedure in order to obtain a unique final list of 1460 classes. We also generated a hierarchy for each class using WordNet.

    This repository contains the following files (a loading sketch follows the list):

    • coco_classes_map.txt, contains the mapping for the 80 COCO classes
    • lvis_classes_map.txt, contains the mapping for the 1460 LVIS classes
    • openimages_classes_map.txt, contains the mapping for the 601 Open Images V4 classes
    • classname_hyperset_definition.csv, contains the final set of 1460 classes, their definition and hierarchy
    • all-classnames.xlsx, contains a side-by-side view of all classes considered
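
    Below is a minimal Python sketch of how these files might be loaded. The delimiters and column layout inside each file are assumptions, so adjust the parsing to the actual contents.

      import csv
      from pathlib import Path

      # Assumption: classname_hyperset_definition.csv has a header row, and the
      # *_classes_map.txt files contain one class mapping per line.
      with open("classname_hyperset_definition.csv", newline="") as f:
          hyperset = list(csv.DictReader(f))
      print(len(hyperset), "classes in the hyperset")  # expected: 1460

      coco_map = Path("coco_classes_map.txt").read_text().splitlines()
      print(len(coco_map), "COCO mapping lines")  # expected: 80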

    This mapping was used in VISIONE [Amato et al. 2021, Amato et al. 2022], a content-based video retrieval system that supports various search functionalities (text search, object/color-based search, semantic and visual similarity search, temporal search). For object detection, VISIONE uses three pre-trained models: VfNet [Zhang et al. 2021] (trained on the COCO dataset), Mask R-CNN [He et al. 2017] (trained on LVIS), and a Faster R-CNN+Inception ResNet (trained on Open Images V4).

    This repository is released under a Creative Commons Attribution license; please cite the following paper if you use it in your work in any form:

    @article{amato2021visione,
     title={The visione video search system: exploiting off-the-shelf text search engines for large-scale video retrieval},
     author={Amato, Giuseppe and Bolettieri, Paolo and Carrara, Fabio and Debole, Franca and Falchi, Fabrizio and Gennaro, Claudio and Vadicamo, Lucia and Vairo, Claudio},
     journal={Journal of Imaging},
     volume={7},
     number={5},
     pages={76},
     year={2021},
     publisher={Multidisciplinary Digital Publishing Institute}
    }
    

    References:

    [Amato et al. 2022] Amato, G. et al. (2022). VISIONE at Video Browser Showdown 2022. In: , et al. MultiMedia Modeling. MMM 2022. Lecture Notes in Computer Science, vol 13142. Springer, Cham. https://doi.org/10.1007/978-3-030-98355-0_52

    [Amato et al. 2021] Amato, G., Bolettieri, P., Carrara, F., Debole, F., Falchi, F., Gennaro, C., Vadicamo, L. and Vairo, C., 2021. The visione video search system: exploiting off-the-shelf text search engines for large-scale video retrieval. Journal of Imaging, 7(5), p.76.

    [Gupta et al. 2019] Gupta, A., Dollar, P. and Girshick, R., 2019. Lvis: A dataset for large vocabulary instance segmentation. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 5356-5364).

    [He et al. 2017] He, K., Gkioxari, G., Dollár, P. and Girshick, R., 2017. Mask r-cnn. In Proceedings of the IEEE international conference on computer vision (pp. 2961-2969).

    [Kuznetsova et al. 2020] Kuznetsova, A., Rom, H., Alldrin, N., Uijlings, J., Krasin, I., Pont-Tuset, J., Kamali, S., Popov, S., Malloci, M., Kolesnikov, A. and Duerig, T., 2020. The open images dataset v4. International Journal of Computer Vision, 128(7), pp.1956-1981.

    [Lin et al. 2014] Lin, T.Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P. and Zitnick, C.L., 2014, September. Microsoft coco: Common objects in context. In European conference on computer vision (pp. 740-755). Springer, Cham.

    [Zhang et al. 2021] Zhang, H., Wang, Y., Dayoub, F. and Sunderhauf, N., 2021. Varifocalnet: An iou-aware dense object detector. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 8514-8523).

  2. Recap-COCO-30K

    • huggingface.co
    Updated Jun 28, 2024
    Cite
    UCSC-VLAA (2024). Recap-COCO-30K [Dataset]. https://huggingface.co/datasets/UCSC-VLAA/Recap-COCO-30K
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jun 28, 2024
    Dataset authored and provided by
    UCSC-VLAA
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    LLaVA-recaptioned COCO2014 ValSet.

    Used for text-to-image generation evaluation. More detail can be found in What If We Recaption Billions of Web Images with LLaMA-3?

      Dataset Structure
    

    "image_id" (str): COCO image id. "coco_url" (image): the COCO image url. "caption" (str): the original COCO caption. "recaption" (str): the llava recaptioned COCO caption.

      Citation
    

    BibTeX: @article{li2024recapdatacomp, title={What If We Recaption Billions of Web Images with… See the full description on the dataset page: https://huggingface.co/datasets/UCSC-VLAA/Recap-COCO-30K.

  3. coco-detection-strings

    • huggingface.co
    Updated May 26, 2025
    Cite
    Aritra Roy Gosthipaty (2025). coco-detection-strings [Dataset]. https://huggingface.co/datasets/ariG23498/coco-detection-strings
    Explore at:
    Dataset updated
    May 26, 2025
    Authors
    Aritra Roy Gosthipaty
    Description

    Processed the bounding boxes from COCO into PaliGemma-style detection strings (a conversion sketch follows). Reference dataset -> detection-datasets/coco
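
    For context, here is a minimal sketch of the kind of conversion involved, assuming the common PaliGemma convention of four <locNNNN> tokens (coordinates normalized to 0-1023 and ordered y_min, x_min, y_max, x_max); the exact string format used by this dataset may differ.

      def coco_to_paligemma(bbox, img_w, img_h, label):
          # COCO boxes are [x_min, y_min, width, height] in pixels.
          x, y, w, h = bbox

          def loc(value, size):
              # Bin a pixel coordinate into 0..1023 and render one location token.
              return f"<loc{max(0, min(1023, round(value / size * 1023))):04d}>"

          # Assumed PaliGemma order: y_min, x_min, y_max, x_max, then the class label.
          return f"{loc(y, img_h)}{loc(x, img_w)}{loc(y + h, img_h)}{loc(x + w, img_w)} {label}"

      print(coco_to_paligemma([10, 20, 100, 50], 640, 480, "person"))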

  4. Taco: Trash Annotations In Context Dataset

    • universe.roboflow.com
    • zenodo.org
    zip
    Updated Aug 1, 2024
    Cite
    Mohamed Traore (2024). Taco: Trash Annotations In Context Dataset [Dataset]. https://universe.roboflow.com/mohamed-traore-2ekkp/taco-trash-annotations-in-context/model/13
    Explore at:
    Available download formats: zip
    Dataset updated
    Aug 1, 2024
    Dataset authored and provided by
    Mohamed Traore
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Trash Polygons
    Description

    TACO: Trash Annotations in Context Dataset

    From: Pedro F. Proença; Pedro Simões

    TACO is a growing image dataset of trash in the wild. It contains segmented images of litter taken in diverse environments: woods, roads, and beaches. These images are manually labeled according to a hierarchical taxonomy to train and evaluate object detection algorithms. Annotations are provided in a format similar to the COCO dataset (a loading sketch appears at the end of this entry).

    The model in action:

    https://raw.githubusercontent.com/wiki/pedropro/TACO/images/teaser.gif (GIF of the model running inference)

    Example images from the dataset:

    https://raw.githubusercontent.com/wiki/pedropro/TACO/images/2.png and https://raw.githubusercontent.com/wiki/pedropro/TACO/images/5.png

    For more details and to cite the authors:

    • Paper: https://arxiv.org/abs/2003.06975
    • Paper Citation: @article{taco2020, title={TACO: Trash Annotations in Context for Litter Detection}, author={Pedro F Proença and Pedro Simões}, journal={arXiv preprint arXiv:2003.06975}, year={2020}}
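
    Since the annotations follow the COCO convention, they can presumably be browsed with pycocotools; the annotation filename below is a placeholder for whichever JSON file ships with the download.

      from pycocotools.coco import COCO

      coco = COCO("annotations.json")  # placeholder path to the TACO annotation file
      cats = coco.loadCats(coco.getCatIds())
      print([c["name"] for c in cats])  # litter categories in the taxonomy

      img_ids = coco.getImgIds()
      anns = coco.loadAnns(coco.getAnnIds(imgIds=img_ids[:1]))
      print(len(anns), "annotations on the first image")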
  5. Style Transfer for Object Detection in Art

    • kaggle.com
    Updated Mar 11, 2021
    Cite
    David Kadish (2021). Style Transfer for Object Detection in Art [Dataset]. https://www.kaggle.com/datasets/davidkadish/style-transfer-for-object-detection-in-art/discussion
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Mar 11, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    David Kadish
    Description

    Context

    Despite recent advances in object detection using deep learning neural networks, these neural networks still struggle to identify objects in art images such as paintings and drawings. This challenge is known as the cross-depiction problem, and it stems in part from the tendency of neural networks to prioritize identification of an object's texture over its shape. In this paper we propose and evaluate a process for training neural networks to localize objects - specifically people - in art images. We generated a large dataset for training and validation by modifying the images in the COCO dataset using AdaIN style transfer (style-coco.tar.xz). This dataset was used to fine-tune a Faster R-CNN object detection network (2020-12-10_09-45-15_58672_resnet152_stylecoco_epoch_15.pth), which was then tested on the existing People-Art testing dataset (PeopleArt-Coco.tar.xz). The result is a significant improvement on the state of the art and a new way forward for creating datasets to train neural networks to process art images.

    Content

    • 2020-12-10_09-45-15_58672_resnet152_stylecoco_epoch_15.pth: trained object detection network (Faster R-CNN with a ResNet-152 backbone pretrained on ImageNet) for use with PyTorch (a loading sketch follows this list)
    • PeopleArt-Coco.tar.xz: the People-Art dataset with COCO-formatted annotations (original at https://github.com/BathVisArtData/PeopleArt)
    • style-coco.tar.xz: the stylized COCO dataset containing only the person category, used to train the network above
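
    As a rough sketch (not the authors' script), the checkpoint might be restored with torchvision along these lines; whether the file stores a state_dict or a whole pickled model, and the exact class count, are assumptions to verify against the repository.

      import torch
      from torchvision.models.detection import FasterRCNN
      from torchvision.models.detection.backbone_utils import resnet_fpn_backbone

      # Assumed architecture: torchvision Faster R-CNN with a ResNet-152 FPN backbone
      # and two classes (background + person), matching the dataset description.
      backbone = resnet_fpn_backbone("resnet152", pretrained=False)
      model = FasterRCNN(backbone, num_classes=2)

      ckpt = torch.load("2020-12-10_09-45-15_58672_resnet152_stylecoco_epoch_15.pth",
                        map_location="cpu")
      # If the file pickles the whole model, use it directly instead of a state_dict.
      model.load_state_dict(ckpt if isinstance(ckpt, dict) else ckpt.state_dict())
      model.eval()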

    Code

    The code is available on github at https://github.com/dkadish/Style-Transfer-for-Object-Detection-in-Art

    Citing

    If you are using this code or the concept of style transfer for object detection in art, please cite our paper (https://arxiv.org/abs/2102.06529):

    D. Kadish, S. Risi, and A. S. Løvlie, “Improving Object Detection in Art Images Using Only Style Transfer,” Feb. 2021.

  6. Activities of Daily Living Object Dataset

    • figshare.com
    bin
    Updated Nov 28, 2024
    Cite
    Md Tanzil Shahria; Mohammad H Rahman (2024). Activities of Daily Living Object Dataset [Dataset]. http://doi.org/10.6084/m9.figshare.27263424.v3
    Explore at:
    Available download formats: bin
    Dataset updated
    Nov 28, 2024
    Dataset provided by
    figshare
    Authors
    Md Tanzil Shahria; Mohammad H Rahman
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Activities of Daily Living Object Dataset

    Overview

    The ADL (Activities of Daily Living) Object Dataset is a curated collection of images and annotations specifically focusing on objects commonly interacted with during daily living activities. This dataset is designed to facilitate research and development in assistive robotics in home environments.

    Data Sources and Licensing

    The dataset comprises images and annotations sourced from four publicly available datasets:

    • COCO Dataset. License: Creative Commons Attribution 4.0 International (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/. Citation: Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., & Zitnick, C. L. (2014). Microsoft COCO: Common Objects in Context. European Conference on Computer Vision (ECCV), 740–755.
    • Open Images Dataset. License: Creative Commons Attribution 4.0 International (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/. Citation: Kuznetsova, A., Rom, H., Alldrin, N., Uijlings, J., Krasin, I., Pont-Tuset, J., Kamali, S., Popov, S., Malloci, M., Duerig, T., & Ferrari, V. (2020). The Open Images Dataset V6: Unified Image Classification, Object Detection, and Visual Relationship Detection at Scale. International Journal of Computer Vision, 128(7), 1956–1981.
    • LVIS Dataset. License: Creative Commons Attribution 4.0 International (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/. Citation: Gupta, A., Dollar, P., & Girshick, R. (2019). LVIS: A Dataset for Large Vocabulary Instance Segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 5356–5364.
    • Roboflow Universe. License: Creative Commons Attribution 4.0 International (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/. The following repositories from Roboflow Universe were used in compiling this dataset:
      • Work, U. AI Based Automatic Stationery Billing System Data Dataset. 2022. Accessible at: https://universe.roboflow.com/university-work/ai-based-automatic-stationery-billing-system-data (accessed on 11 October 2024).
      • Destruction, P.M. Pencilcase Dataset. 2023. Accessible at: https://universe.roboflow.com/project-mental-destruction/pencilcase-se7nb (accessed on 11 October 2024).
      • Destruction, P.M. Final Project Dataset. 2023. Accessible at: https://universe.roboflow.com/project-mental-destruction/final-project-wsuvj (accessed on 11 October 2024).
      • Personal. CSST106 Dataset. 2024. Accessible at: https://universe.roboflow.com/personal-pgkq6/csst106 (accessed on 11 October 2024).
      • New-Workspace-kubz3. Pencilcase Dataset. 2022. Accessible at: https://universe.roboflow.com/new-workspace-kubz3/pencilcase-s9ag9 (accessed on 11 October 2024).
      • Finespiralnotebook. Spiral Notebook Dataset. 2024. Accessible at: https://universe.roboflow.com/finespiralnotebook/spiral_notebook (accessed on 11 October 2024).
      • Dairymilk. Classmate Dataset. 2024. Accessible at: https://universe.roboflow.com/dairymilk/classmate (accessed on 11 October 2024).
      • Dziubatyi, M. Domace Zadanie Notebook Dataset. 2023. Accessible at: https://universe.roboflow.com/maksym-dziubatyi/domace-zadanie-notebook (accessed on 11 October 2024).
      • One. Stationery Dataset. 2024. Accessible at: https://universe.roboflow.com/one-vrmjr/stationery-mxtt2 (accessed on 11 October 2024).
      • jk001226. Liplip Dataset. 2024. Accessible at: https://universe.roboflow.com/jk001226/liplip (accessed on 11 October 2024).
      • jk001226. Lip Dataset. 2024. Accessible at: https://universe.roboflow.com/jk001226/lip-uteep (accessed on 11 October 2024).
      • Upwork5. Socks3 Dataset. 2022. Accessible at: https://universe.roboflow.com/upwork5/socks3 (accessed on 11 October 2024).
      • Book. DeskTableLamps Material Dataset. 2024. Accessible at: https://universe.roboflow.com/book-mxasl/desktablelamps-material-rjbgd (accessed on 11 October 2024).
      • Gary. Medicine Jar Dataset. 2024. Accessible at: https://universe.roboflow.com/gary-ofgwc/medicine-jar (accessed on 11 October 2024).
      • TEST. Kolmarbnh Dataset. 2023. Accessible at: https://universe.roboflow.com/test-wj4qi/kolmarbnh (accessed on 11 October 2024).
      • Tube. Tube Dataset. 2024. Accessible at: https://universe.roboflow.com/tube-nv2vt/tube-9ah9t (accessed on 11 October 2024).
      • Staj. Canned Goods Dataset. 2024. Accessible at: https://universe.roboflow.com/staj-2ipmz/canned-goods-isxbi (accessed on 11 October 2024).
      • Hussam, M. Wallet Dataset. 2024. Accessible at: https://universe.roboflow.com/mohamed-hussam-cq81o/wallet-sn9n2 (accessed on 14 October 2024).
      • Training, K. Perfume Dataset. 2022. Accessible at: https://universe.roboflow.com/kdigital-training/perfume (accessed on 14 October 2024).
      • Keyboards. Shoe-Walking Dataset. 2024. Accessible at: https://universe.roboflow.com/keyboards-tjtri/shoe-walking (accessed on 14 October 2024).
      • MOMO. Toilet Paper Dataset. 2024. Accessible at: https://universe.roboflow.com/momo-nutwk/toilet-paper-wehrw (accessed on 14 October 2024).
      • Project-zlrja. Toilet Paper Detection Dataset. 2024. Accessible at: https://universe.roboflow.com/project-zlrja/toilet-paper-detection (accessed on 14 October 2024).
      • Govorkov, Y. Highlighter Detection Dataset. 2023. Accessible at: https://universe.roboflow.com/yuriy-govorkov-j9qrv/highlighter_detection (accessed on 14 October 2024).
      • Stock. Plum Dataset. 2024. Accessible at: https://universe.roboflow.com/stock-qxdzf/plum-kdznw (accessed on 14 October 2024).
      • Ibnu. Avocado Dataset. 2024. Accessible at: https://universe.roboflow.com/ibnu-h3cda/avocado-g9fsl (accessed on 14 October 2024).
      • Molina, N. Detection Avocado Dataset. 2024. Accessible at: https://universe.roboflow.com/norberto-molina-zakki/detection-avocado (accessed on 14 October 2024).
      • in Lab, V.F. Peach Dataset. 2023. Accessible at: https://universe.roboflow.com/vietnam-fruit-in-lab/peach-ejdry (accessed on 14 October 2024).
      • Group, K. Tomato Detection 4 Dataset. 2023. Accessible at: https://universe.roboflow.com/kkabs-group-dkcni/tomato-detection-4 (accessed on 14 October 2024).
      • Detection, M. Tomato Checker Dataset. 2024. Accessible at: https://universe.roboflow.com/money-detection-xez0r/tomato-checker (accessed on 14 October 2024).
      • University, A.S. Smart Cam V1 Dataset. 2023. Accessible at: https://universe.roboflow.com/ain-shams-university-byja6/smart_cam_v1 (accessed on 14 October 2024).
      • EMAD, S. Keysdetection Dataset. 2023. Accessible at: https://universe.roboflow.com/shehab-emad-n2q9i/keysdetection (accessed on 14 October 2024).
      • Roads. Chips Dataset. 2024. Accessible at: https://universe.roboflow.com/roads-rvmaq/chips-a0us5 (accessed on 14 October 2024).
      • workspace bgkzo, N. Object Dataset. 2021. Accessible at: https://universe.roboflow.com/new-workspace-bgkzo/object-eidim (accessed on 14 October 2024).
      • Watch, W. Wrist Watch Dataset. 2024. Accessible at: https://universe.roboflow.com/wrist-watch/wrist-watch-0l25c (accessed on 14 October 2024).
      • WYZUP. Milk Dataset. 2024. Accessible at: https://universe.roboflow.com/wyzup/milk-onbxt (accessed on 14 October 2024).
      • AussieStuff. Food Dataset. 2024. Accessible at: https://universe.roboflow.com/aussiestuff/food-al9wr (accessed on 14 October 2024).
      • Almukhametov, A. Pencils Color Dataset. 2023. Accessible at: https://universe.roboflow.com/almas-almukhametov-hs5jk/pencils-color (accessed on 14 October 2024).

    All images and annotations obtained from these datasets are released under the Creative Commons Attribution 4.0 International License (CC BY 4.0). This license permits sharing and adaptation of the material in any medium or format, for any purpose, even commercially, provided that appropriate credit is given, a link to the license is provided, and any changes made are indicated.

    Redistribution Permission

    As all images and annotations are under the CC BY 4.0 license, we are legally permitted to redistribute this data within our dataset. We have complied with the license terms by:

    • Providing appropriate attribution to the original creators.
    • Including links to the CC BY 4.0 license.
    • Indicating any changes made to the original material.

    Dataset Structure

    The dataset includes:

    • Images: High-quality images featuring ADL objects suitable for robotic manipulation.
    • Annotations: Bounding boxes and class labels formatted in the YOLO (You Only Look Once) Darknet format.

    Classes

    The dataset focuses on objects commonly involved in daily living activities. A full list of object classes is provided in the classes.txt file.

    Format

    • Images: JPEG format.
    • Annotations: Text files corresponding to each image, containing bounding box coordinates and class labels in YOLO Darknet format (a parsing sketch appears at the end of this entry).

    How to Use the Dataset

    Download the dataset, then unpack it:

      unzip ADL_Object_Dataset.zip

    How to Cite This Dataset

    If you use this dataset in your research, please cite our paper:

    @article{shahria2024activities,
     title={Activities of Daily Living Object Dataset: Advancing Assistive Robotic Manipulation with a Tailored Dataset},
     author={Shahria, Md Tanzil and Rahman, Mohammad H.},
     journal={Sensors},
     volume={24},
     number={23},
     pages={7566},
     year={2024},
     publisher={MDPI}
    }

    License

    This dataset is released under the Creative Commons Attribution 4.0 International License (CC BY 4.0). License link: https://creativecommons.org/licenses/by/4.0/. By using this dataset, you agree to provide appropriate credit, indicate if changes were made, and not impose additional restrictions beyond those of the original licenses.

    Acknowledgments

    We gratefully acknowledge the use of data from the following open-source datasets, which were instrumental in the creation of our specialized ADL object dataset:

    • COCO Dataset: We thank the creators and contributors of the COCO dataset for making their images and annotations publicly available under the CC BY 4.0 license.
    • Open Images Dataset: We express our gratitude to the Open Images team for providing a comprehensive dataset of annotated images under the CC BY 4.0 license.
    • LVIS Dataset: We appreciate the efforts of the LVIS dataset creators for releasing their extensive dataset under the CC BY 4.0 license.
    • Roboflow Universe:
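
    As referenced in the Format section above, here is a small sketch for parsing YOLO Darknet annotation files; the helper below is illustrative, not part of the dataset.

      from pathlib import Path

      def read_yolo_labels(label_path, img_w, img_h):
          # Each line of a YOLO Darknet .txt file is: class x_center y_center width height,
          # with coordinates normalized to [0, 1].
          boxes = []
          for line in Path(label_path).read_text().splitlines():
              cls, xc, yc, w, h = line.split()
              xc, yc, w, h = float(xc), float(yc), float(w), float(h)
              # Convert to pixel-space [x_min, y_min, x_max, y_max].
              boxes.append((int(cls),
                            (xc - w / 2) * img_w, (yc - h / 2) * img_h,
                            (xc + w / 2) * img_w, (yc + h / 2) * img_h))
          return boxes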

  7. cococon

    • huggingface.co
    Updated Apr 16, 2023
    Cite
    Adyasha Maharana (2023). cococon [Dataset]. https://huggingface.co/datasets/adymaharana/cococon
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Apr 16, 2023
    Authors
    Adyasha Maharana
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dataset Card for CoCoCON

    • Dataset Description
      • Languages
    • Dataset Structure
      • Data Fields
      • Data Splits
    • Dataset Creation
    • Considerations for Using the Data
      • Licensing Information
      • Citation Information

      Dataset Description
    

    CoCoCON is a challenging dataset for evaluating cross-task consistency in vision-and-language models. We use contrast sets created by modifying COCO test instances for multiple tasks in small but semantically meaningful ways to change the gold label, and… See the full description on the dataset page: https://huggingface.co/datasets/adymaharana/cococon.

  8. MOBDrone: a large-scale drone-view dataset for man overboard detection

    • explore.openaire.eu
    • data.niaid.nih.gov
    Updated Jan 1, 2022
    Cite
    Donato Cafarelli; Luca Ciampi; Lucia Vadicamo; Claudio Gennaro; Andrea Berton; Marco Paterni; Chiara Benvenuti; Mirko Passera; Fabrizio Falchi (2022). MOBDrone: a large-scale drone-view dataset for man overboard detection [Dataset]. http://doi.org/10.5281/zenodo.5996889
    Explore at:
    Dataset updated
    Jan 1, 2022
    Authors
    Donato Cafarelli; Luca Ciampi; Lucia Vadicamo; Claudio Gennaro; Andrea Berton; Marco Paterni; Chiara Benvenuti; Mirko Passera; Fabrizio Falchi
    Description

    Dataset

    The Man OverBoard Drone (MOBDrone) dataset is a large-scale collection of aerial footage images. It contains 126,170 frames extracted from 66 video clips gathered from one UAV flying at an altitude of 10 to 60 meters above the mean sea level. Images are manually annotated with more than 180K bounding boxes localizing objects belonging to 5 categories: person, boat, lifebuoy, surfboard, wood. More than 113K of these bounding boxes belong to the person category and localize people in the water simulating the need to be rescued.

    In this repository, we provide:

    • 66 Full HD video clips (total size: 5.5 GB)
    • 126,170 images extracted from the videos at a rate of 30 FPS (total size: 243 GB)
    • 3 annotation files for the extracted images that follow the MS COCO data format (for more info see https://cocodataset.org/#format-data):
      • annotations_5_custom_classes.json: contains annotations concerning all five categories; please note that class ids do not correspond with the ones provided by the MS COCO standard, since we account for two new classes not previously considered in the MS COCO dataset (lifebuoy and wood)
      • annotations_3_coco_classes.json: contains annotations concerning the three classes also accounted for by the MS COCO dataset (person, boat, surfboard); class ids correspond with the ones provided by the MS COCO standard
      • annotations_person_coco_classes.json: contains annotations concerning only the 'person' class; the class id corresponds to the one provided by the MS COCO standard

    The MOBDrone dataset is intended as a test data benchmark. However, for researchers interested in using our data also for training purposes, we provide training and test splits (reproduced in the sketch below):

    • Test set: all images whose filename starts with "DJI_0804" (total: 37,604 images)
    • Training set: all images whose filename starts with "DJI_0915" (total: 88,568 images)

    More details about data generation and the evaluation protocol can be found in our MOBDrone paper: https://arxiv.org/abs/2203.07973. The code to reproduce our results is available at this GitHub repository: https://github.com/ciampluca/MOBDrone_eval. See also http://aimh.isti.cnr.it/dataset/MOBDrone
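
    The filename-based split can be reproduced directly from any of the COCO-format annotation files, as in this small sketch (the "images" and "file_name" keys follow the standard MS COCO layout):

      import json

      with open("annotations_3_coco_classes.json") as f:
          coco = json.load(f)

      # Standard MS COCO layout: coco["images"] entries carry a "file_name" field.
      test_imgs = [im for im in coco["images"] if im["file_name"].startswith("DJI_0804")]
      train_imgs = [im for im in coco["images"] if im["file_name"].startswith("DJI_0915")]
      print(len(test_imgs), "test images,", len(train_imgs), "training images")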
    Citing the MOBDrone

    The MOBDrone is released under a Creative Commons Attribution license, so please cite the MOBDrone if it is used in your work in any form. Published academic papers should use the academic paper citation for our MOBDrone paper, where we evaluated several pre-trained state-of-the-art object detectors, focusing on the detection of the overboard people:

    @inproceedings{MOBDrone2021,
     title={MOBDrone: a Drone Video Dataset for Man OverBoard Rescue},
     author={Donato Cafarelli and Luca Ciampi and Lucia Vadicamo and Claudio Gennaro and Andrea Berton and Marco Paterni and Chiara Benvenuti and Mirko Passera and Fabrizio Falchi},
     booktitle={ICIAP2021: 21th International Conference on Image Analysis and Processing},
     year={2021}
    }

    and this Zenodo dataset:

    @dataset{donato_cafarelli_2022_5996890,
     author={Donato Cafarelli and Luca Ciampi and Lucia Vadicamo and Claudio Gennaro and Andrea Berton and Marco Paterni and Chiara Benvenuti and Mirko Passera and Fabrizio Falchi},
     title={{MOBDrone: a large-scale drone-view dataset for man overboard detection}},
     month=feb,
     year=2022,
     publisher={Zenodo},
     version={1.0.0},
     doi={10.5281/zenodo.5996890},
     url={https://doi.org/10.5281/zenodo.5996890}
    }

    Personal works, such as machine learning projects/blog posts, should provide a URL to the MOBDrone Zenodo page (https://doi.org/10.5281/zenodo.5996890), though a reference to our MOBDrone paper would also be appreciated.

    Contact Information

    If you would like further information about the MOBDrone or if you experience any issues downloading files, please contact us at mobdrone[at]isti.cnr.it

    Acknowledgements

    This work was partially supported by NAUSICAA - "NAUtical Safety by means of Integrated Computer-Assistance Appliances 4.0", a project funded by the Tuscany region (CUP D44E20003410009). The data collection was carried out with the collaboration of the Fly&Sense Service of the CNR of Pisa (for the flight operations of remotely piloted aerial systems) and of the Institute of Clinical Physiology (IFC) of the CNR (for the water immersion operations).

  9. Teleaudiology facilitators: A scoping review (Coco et al., 2020)

    • asha.figshare.com
    pdf
    Updated May 31, 2023
    Cite
    Laura Coco; Alyssa Davidson; Nicole Marrone (2023). Teleaudiology facilitators: A scoping review (Coco et al., 2020) [Dataset]. http://doi.org/10.23641/asha.12475796.v1
    Explore at:
    Available download formats: pdf
    Dataset updated
    May 31, 2023
    Dataset provided by
    American Speech–Language–Hearing Association
    Authors
    Laura Coco; Alyssa Davidson; Nicole Marrone
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Purpose: Teleaudiology helps improve access to hearing health care by overcoming the geographic gap between providers and patients. In many teleaudiology encounters, a facilitator is needed at the patient site to help with hands-on aspects of procedures. The aim of this study was to review the scope and nature of research around patient-site facilitators in teleaudiology. We focused on identifying the facilitators’ background, training, and responsibilities.

    Method: To conduct this scoping review, we searched PubMed, CINAHL, and Embase. To be included, studies needed to address teleaudiology; be experimental/quasi-experimental, correlational/predictive, or descriptive; be published in English; and include the use of a facilitator at the patient location.

    Results: A total of 82 studies met the inclusion criteria. The available literature described a number of different individuals in the role of the patient-site facilitator, including audiologists, students, and local aides. Fifty-seven unique tasks were identified, including orienting the client to the space, assisting with technology, and assisting with audiology procedures. The largest number of studies (n = 42) did not describe the facilitators’ training. When reported, the facilitators’ training was heterogeneous in terms of who delivered the training, the length of the training, and the training content.

    Conclusions: Across studies, the range of duties performed by patient-site facilitators indicates they may have an important role in teleaudiology. However, details are still needed surrounding their background, responsibilities, and training. Future research is warranted exploring the role of the patient-site facilitator, including their impact on teleaudiology service delivery.

    Supplemental Material S1: Summary of review studies.
    Supplemental Material S2: Audiology-related sub-specialty duties performed by patient-site facilitators, with citations.
    Supplemental Material S3: General telehealth duties performed by patient-site facilitators, with citations.

    Coco, L., Davidson, A., & Marrone, N. (2020). The role of patient-site facilitators in teleaudiology: A scoping review. American Journal of Audiology, 29(3S), 661-675. https://doi.org/10.1044/2020_AJA-19-00070

    Publisher Note: This article is part of the Special Issue: 4th International Meeting on Internet and Audiology.

  10. coco2014train_10k

    • huggingface.co
    Cite
    Shengfang Zhai, coco2014train_10k [Dataset]. https://huggingface.co/datasets/zsf/coco2014train_10k
    Explore at:
    Authors
    Shengfang Zhai
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    The dataset, randomly selected from MS-COCO, contains a total of 10K image-text pairs. It can be used with the paper Text-to-Image Diffusion Models can be Easily Backdoored through Multimodal Data Poisoning. The dataset has been processed to conform to the format required by the BadT2I code on GitHub.

      Citation
    

    If you find it useful in your research, please consider citing our paper: @inproceedings{zhai2023text, title={Text-to-image diffusion… See the full description on the dataset page: https://huggingface.co/datasets/zsf/coco2014train_10k.
