100+ datasets found
  1. KITTI object detection benchmark dataset - Dataset - LDM

    • service.tib.eu
    Updated Dec 2, 2024
    Cite
    (2024). KITTI object detection benchmark dataset - Dataset - LDM [Dataset]. https://service.tib.eu/ldmservice/dataset/kitti-object-detection-benchmark-dataset
    Description

    The KITTI object detection benchmark dataset.

  2. Benchmark dataset for small and narrow rectangular object detection from Google Earth imagery

    • ieee-dataport.org
    Updated May 18, 2022
    Cite
    Zhonghua Hong (2022). Benchmark dataset for small and narrow rectangular object detection from Google Earth imagery [Dataset]. https://ieee-dataport.org/documents/benchmark-dataset-small-and-narrow-rectangular-object-detection-google-earth-imagery
    Authors
    Zhonghua Hong
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The benchmark dataset consists of 2...

  3. object-detection-bench

    • huggingface.co
    Updated May 28, 2025
    Cite
    JigsawStack (2025). object-detection-bench [Dataset]. https://huggingface.co/datasets/JigsawStack/object-detection-bench
    Dataset authored and provided by
    JigsawStack
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Object Detection Bench

    This dataset is a customized version of the RealworldQA dataset, specifically tailored for object detection and segmentation benchmarking tasks.

      Dataset Description
    

    This benchmark dataset contains real-world images with questions, answers, and custom prompts designed for evaluating object detection and segmentation models. Each sample includes:

    Image: Real-world photographs
    Question: Original question about the image content
    Answer: Ground truth… See the full description on the dataset page: https://huggingface.co/datasets/JigsawStack/object-detection-bench.

  4. A Benchmark Dataset for Fine-Grained Object Detection and Recognition Based on Single-Look Complex SAR Images (FAIR-CSAR-V1.0)

    • scidb.cn
    Updated Feb 20, 2025
    Cite
    Wu Youming; Diao Wenhui; Suo Yuxi; Sun Xian (2025). A Benchmark Dataset for Fine-Grained Object Detection and Recognition Based on Single-Look Complex SAR Images (FAIR-CSAR-V1.0) [Dataset]. http://doi.org/10.57760/sciencedb.radars.00019
    Available formats: Croissant (a machine-learning dataset format; see mlcommons.org/croissant)
    Dataset provided by
    Science Data Bank
    Authors
    Wu Youming; Diao Wenhui; Suo Yuxi; Sun Xian
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    FAIR-CSAR-V1.0, constructed from single-look complex (SLC) images of the Gaofen-3 satellite, is the largest and most finely annotated SAR image dataset for fine-grained targets to date. FAIR-CSAR-V1.0 aims to advance related technologies in SAR image object detection, recognition, and target characteristic understanding. The dataset was developed by the Key Laboratory of Target Cognition and Application Technology (TCAT) at the Aerospace Information Research Institute, Chinese Academy of Sciences.

    FAIR-CSAR-V1.0 comprises 175 scenes of Gaofen-3 Level-1 SLC products, covering 32 global regions including airports, oil refineries, ports, and rivers. With a total data volume of 250 GB and over 340,000 instances, FAIR-CSAR-V1.0 covers 5 main categories and 22 subcategories, providing detailed annotations for imaging parameters (e.g., radar center frequency, pulse repetition frequency) and target characteristics (e.g., satellite-ground relative azimuthal angle, key scattering point distribution). FAIR-CSAR-V1.0 consists of two sub-datasets: the SL dataset and the FSI dataset. The SL dataset, acquired in spotlight mode with a nominal resolution of 1 meter, contains 170,000 instances across 22 target classes. The FSI dataset, acquired in fine stripmap mode with a nominal resolution of 5 meters, includes 170,000 instances across 3 target classes. Figure 1 presents an overview of the dataset.

    Data paper and citation format:
    [1] Youming Wu, Wenhui Diao, Yuxi Suo, Xian Sun. A Benchmark Dataset for Fine-Grained Object Detection and Recognition Based on Single-Look Complex SAR Images (FAIR-CSAR-V1.0) [OL]. Journal of Radars, 2025. https://radars.ac.cn/web/data/getData?dataType=FAIR_CSAR_en&pageType=en
    [2] Y. Wu, Y. Suo, Q. Meng, W. Dai, T. Miao, W. Zhao, Z. Yan, W. Diao, G. Xie, Q. Ke, Y. Zhao, K. Fu and X. Sun. FAIR-CSAR: A Benchmark Dataset for Fine-Grained Object Detection and Recognition Based on Single-Look Complex SAR Images [J]. IEEE Transactions on Geoscience and Remote Sensing, vol. 63, pp. 1-22, 2025, doi: 10.1109/TGRS.2024.3519891.

  5. Extended Evaluation of SnowPole Detection for Machine-Perceivable Infrastructure for Nordic Winter Conditions: A Comparative Study of Object Detection Models

    • data.mendeley.com
    Updated Jun 30, 2025
    Cite
    Durga Prasad Bavirisetti (2025). Extended Evaluation of SnowPole Detection for Machine-Perceivable Infrastructure for Nordic Winter Conditions: A Comparative Study of Object Detection Models [Dataset]. http://doi.org/10.17632/tt6rbx7s3h.3
    Authors
    Durga Prasad Bavirisetti
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    In this study, we present an extensive evaluation of state-of-the-art YOLO object detection architectures for identifying snow poles in LiDAR-derived imagery captured under challenging Nordic conditions. Building upon our previous work on the SnowPole Detection dataset [1] and our LiDAR–GNSS-based localization framework [2], we expand the benchmark to include six YOLO models (YOLOv5s, YOLOv7-tiny, YOLOv8n, YOLOv9t, YOLOv10n, and YOLOv11n) evaluated across multiple input modalities. Specifically, we assess single-channel modalities (Reflectance, Signal, Near-Infrared) and six pseudo-color combinations derived by mapping these channels to RGB representations. Each model's performance is quantified using Precision, Recall, mAP@50, mAP@50–95, and GPU inference latency. To facilitate systematic comparison, we define a composite Rank Score that integrates detection accuracy and real-time performance in a weighted formulation.

    Experimental results show that YOLOv9t consistently achieves the highest detection accuracy, while YOLOv11n provides the best trade-off between accuracy and inference speed, making it a promising candidate for real-time applications on embedded platforms. Among input modalities, pseudo-color combinations, particularly those fusing Near-Infrared, Signal, and Reflectance channels, outperformed single modalities across most configurations, achieving the highest Rank Scores and mAP metrics. Therefore, we recommend using multimodal LiDAR representations such as Combination 4 and Combination 5 to maximize detection robustness in practical deployments. All datasets, benchmarking code, and trained models are publicly available to support reproducibility and further research through our GitHub repository (a).

    References [1] Durga Prasad Bavirisetti, Gabriel Hanssen Kiss, Petter Arnesen, Hanne Seter, Shaira Tabassum, and Frank Lindseth. Snowpole detection: A comprehensive dataset for detection and localization using lidar imaging in nordic winter conditions. Data in Brief, 59:111403, 2025. [2] Durga Prasad Bavirisetti, Gabriel Hanssen Kiss, and Frank Lindseth. A pole detection and geospatial localization framework using lidar-gnss data fusion. In 2024 27th International Conference on Information Fusion (FUSION), pages 1–8. IEEE, 2024. (a) https://github.com/MuhammadIbneRafiq/Extended-evaluation-snowpole-lidar-dataset
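    A composite score of the kind described above can be sketched as a weighted sum of an accuracy term and a normalized speed term. The listing does not give the study's actual formula, so the weights, latency budget, function name, and example numbers below are purely illustrative assumptions.

```python
# Illustrative sketch only: the study defines a composite Rank Score weighting
# detection accuracy against real-time performance, but the exact formula is
# not given here. Weights, latency budget, and numbers are assumptions.
def rank_score(map_50_95: float, latency_ms: float,
               max_latency_ms: float = 50.0,
               w_acc: float = 0.7, w_speed: float = 0.3) -> float:
    """Higher is better: mAP in [0, 1] plus a normalized speed term."""
    speed_term = max(0.0, 1.0 - latency_ms / max_latency_ms)
    return w_acc * map_50_95 + w_speed * speed_term

# Rank hypothetical (model, mAP@50-95, GPU latency in ms) triples, best first.
models = [("model_a", 0.62, 14.0), ("model_b", 0.60, 7.0), ("model_c", 0.55, 9.0)]
ranked = sorted(models, key=lambda m: rank_score(m[1], m[2]), reverse=True)
```

    With this weighting, a slightly less accurate but much faster model can outrank a more accurate one, which is the trade-off the abstract describes for YOLOv11n versus YOLOv9t.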

  6. Benchmark Dataset

    • universe.roboflow.com
    zip
    Updated May 10, 2025
    + more versions
    Cite
    actions (2025). Benchmark Dataset [Dataset]. https://universe.roboflow.com/actions-kwook/benchmark-astr8/dataset/1
    Dataset authored and provided by
    actions
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Players Bounding Boxes
    Description

    Benchmark

    ## Overview
    
    Benchmark is a dataset for object detection tasks - it contains Players annotations for 256 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
      ## License
    
      This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  7. Yolov5 On Benchmark Dataset

    • universe.roboflow.com
    zip
    Updated Nov 2, 2022
    Cite
    Research1 (2022). Yolov5 On Benchmark Dataset [Dataset]. https://universe.roboflow.com/research1-fcw6m/yolov5-on-benchmark-dataset
    Dataset authored and provided by
    Research1
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Variables measured
    Cracks Dents Bounding Boxes
    Description

    Yolov5 On Benchmark Dataset

    ## Overview
    
    Yolov5 On Benchmark Dataset is a dataset for object detection tasks - it contains Cracks Dents annotations for 201 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
      ## License
    
      This dataset is available under the [CC0 1.0 Public Domain license](https://creativecommons.org/publicdomain/zero/1.0/).
    
  8. Small Object Aerial Person Detection Dataset

    • zenodo.org
    txt, zip
    Updated Apr 5, 2023
    Cite
    Rafael Makrigiorgis; Christos Kyrkou; Panayiotis Kolios (2023). Small Object Aerial Person Detection Dataset [Dataset]. http://doi.org/10.5281/zenodo.7740081
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Rafael Makrigiorgis; Christos Kyrkou; Panayiotis Kolios
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Small Object Aerial Person Detection Dataset:

    The aerial dataset publication comprises a collection of frames captured from unmanned aerial vehicles (UAVs) during flights over the University of Cyprus campus and Civil Defense exercises. The dataset is primarily intended for people detection, with a focus on detecting small objects due to the top-view perspective of the images. The dataset includes annotations generated in popular formats such as YOLO, COCO, and VOC, making it highly versatile and accessible for a wide range of applications. Overall, this aerial dataset publication represents a valuable resource for researchers and practitioners working in the field of computer vision and machine learning, particularly those focused on people detection and related applications.

    Subset      Images  People
    Training    2092    40687
    Validation  523     10589
    Testing     521     10432

    It is advised to further enhance the dataset so that random augmentations are probabilistically applied to each image prior to adding it to the batch for training. Specifically, there are a number of possible transformations such as geometric (rotations, translations, horizontal axis mirroring, cropping, and zooming), as well as image manipulations (illumination changes, color shifting, blurring, sharpening, and shadowing).
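    The probabilistic-augmentation advice above can be sketched as follows. The specific transforms, probabilities, and use of NumPy here are illustrative assumptions, not the authors' pipeline; real training code would more likely use a library such as albumentations or torchvision, and would also transform the bounding boxes alongside the image.

```python
import numpy as np

# Sketch: apply each augmentation with some probability before a frame joins
# the training batch. Transforms and probabilities are illustrative only.
rng = np.random.default_rng(0)

def augment(image: np.ndarray, p: float = 0.5) -> np.ndarray:
    """image: HxWxC uint8 array; returns a possibly-transformed copy."""
    out = image.copy()
    if rng.random() < p:                      # horizontal axis mirroring
        out = out[:, ::-1, :]
    if rng.random() < p:                      # illumination change
        gain = rng.uniform(0.8, 1.2)
        out = np.clip(out.astype(np.float32) * gain, 0, 255).astype(np.uint8)
    if rng.random() < p:                      # 90-degree rotation
        out = np.rot90(out, k=1, axes=(0, 1))
    return out

frame = rng.integers(0, 256, size=(64, 96, 3), dtype=np.uint8)
batch = [augment(frame) for _ in range(8)]    # augmented copies for one batch
```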

  9. Data from: Tiny Robotics Dataset and Benchmark for Continual Object Detection

    • data.niaid.nih.gov
    • zenodo.org
    Updated Mar 11, 2025
    Cite
    Francesco Pasti (2025). Tiny Robotics Dataset and Benchmark for Continual Object Detection [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_13834549
    Dataset authored and provided by
    Francesco Pasti
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dataset for TiROD: Tiny Robotics Dataset and Benchmark for Continual Object Detection

    Official Website -> https://pastifra.github.io/TiROD/

    Code -> https://github.com/pastifra/TiROD_code

    Video -> https://www.youtube.com/watch?v=e76m3ol1i4I

    Paper -> https://arxiv.org/abs/2409.16215

  10. TinyPerson Dataset

    • paperswithcode.com
    • opendatalab.com
    Updated Feb 20, 2021
    Cite
    Xuehui Yu; Yuqi Gong; Nan Jiang; Qixiang Ye; Zhenjun Han (2021). TinyPerson Dataset [Dataset]. https://paperswithcode.com/dataset/tinyperson
    Authors
    Xuehui Yu; Yuqi Gong; Nan Jiang; Qixiang Ye; Zhenjun Han
    Description

    TinyPerson is a benchmark for tiny object detection at long distances and with massive backgrounds. The images in TinyPerson are collected from the Internet. First, high-resolution videos are collected from different websites. Second, images are sampled from the videos every 50 frames. Then images with a certain repetition (homogeneity) are deleted, and the resulting images are annotated by hand with 72,651 object bounding boxes.

  11. MTMS300: a multiple-targets and multiple-scales benchmark dataset for salient object detection

    • scidb.cn
    Updated Apr 14, 2025
    Cite
    Li Chuwei; Zhang Zhilong; Li Shuxin (2025). MTMS300: a multiple-targets and multiple-scales benchmark dataset for salient object detection [Dataset]. http://doi.org/10.57760/sciencedb.j00240.00024
    Available formats: Croissant (a machine-learning dataset format; see mlcommons.org/croissant)
    Dataset provided by
    Science Data Bank
    Authors
    Li Chuwei; Zhang Zhilong; Li Shuxin
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    During the development of salient object detection algorithms, benchmark datasets have played a critical role. However, existing benchmark datasets commonly suffer from dataset bias, making it challenging to fully reflect the performance of different algorithms or capture the technical characteristics of certain typical applications. To address these limitations, we have undertaken two key initiatives: (1) We designed a new benchmark dataset, MTMS300 (Multiple Targets and Multiple Scales), tailored to reconnaissance and surveillance applications. This dataset contains 300 color visible-light images from land, sea, and air scenarios, featuring reduced center bias, a balanced distribution of target-to-image area ratios, diverse image sizes, and multiple targets per image. (2) We curated a new benchmark dataset, DSC (Difficult Scenes in Common), by identifying images from publicly available benchmarks that pose significant challenges (with low metric scores) for most non-deep-learning algorithms. The proposed datasets exhibit distinct characteristics, enabling more comprehensive evaluation of visual saliency algorithms. This advancement will drive the development of visual saliency algorithms toward task-specific applications.

  12. Underwater Objects Dataset

    • universe.roboflow.com
    zip
    Updated May 7, 2023
    Cite
    Roboflow 100 (2023). Underwater Objects Dataset [Dataset]. https://universe.roboflow.com/roboflow-100/underwater-objects-5v7p8/model/1
    Dataset provided by
    Roboflow, Inc.
    Authors
    Roboflow 100
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Underwater Objects Bounding Boxes
    Description

    This dataset was originally created by Yimin Chen. To see the current project, which may have been updated since this version, please go here: https://universe.roboflow.com/workspace-txxpz/underwater-detection.

    This dataset is part of RF100, an Intel-sponsored initiative to create a new object detection benchmark for model generalizability.

    Access the RF100 Github repo: https://github.com/roboflow-ai/roboflow-100-benchmark

  13. Data from: DeepScoresV2

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jun 7, 2023
    Cite
    Pacha, Alexander (2023). DeepScoresV2 [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4012192
    Dataset provided by
    Tuggener, Lukas
    Pacha, Alexander
    Satyawan, Yvan Putra
    Stadelmann, Thilo
    Schmidhuber, JĂĽrgen
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The DeepScoresV2 Dataset for Music Object Detection contains digitally rendered images of written sheet music, together with the corresponding ground truth to fit various types of machine learning models. A total of 151 million different instances of music symbols, belonging to 135 different classes, are annotated. The total dataset contains 255,385 images. For most research purposes, the dense version, containing 1714 of the most diverse and interesting images, should suffice.

    The dataset contains ground truth in the form of:

    Non-oriented bounding boxes

    Oriented bounding boxes

    Semantic segmentation

    Instance segmentation
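    The first two annotation types above are related: an oriented bounding box (center, size, rotation angle) can always be enclosed by a non-oriented, axis-aligned one spanning its rotated corners. A minimal sketch of that conversion; the function names are illustrative, not from the DeepScoresV2 toolkit:

```python
import math

# Convert an oriented bounding box (cx, cy, w, h, angle) into corner points,
# and from there into the smallest enclosing axis-aligned box.
def obb_corners(cx, cy, w, h, angle_rad):
    """Corner coordinates of an oriented box rotated about its center."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [(cx + c * dx - s * dy, cy + s * dx + c * dy)
            for dx, dy in [(-w/2, -h/2), (w/2, -h/2), (w/2, h/2), (-w/2, h/2)]]

def obb_to_aabb(cx, cy, w, h, angle_rad):
    """Smallest axis-aligned box enclosing the oriented box."""
    xs, ys = zip(*obb_corners(cx, cy, w, h, angle_rad))
    return min(xs), min(ys), max(xs), max(ys)
```

    Note the conversion is lossy in one direction: a rotated symbol's axis-aligned box is looser than its oriented box, which is why both annotation types are provided.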

    The accompanying paper, The DeepScoresV2 Dataset and Benchmark for Music Object Detection, published at ICPR 2020, can be found here:

    https://digitalcollection.zhaw.ch/handle/11475/20647

    A toolkit for convenient loading and inspection of the data can be found here:

    https://github.com/yvan674/obb_anns

    Code to train baseline models can be found here:

    https://github.com/tuggeluk/mmdetection/tree/DSV2_Baseline_FasterRCNN

    https://github.com/tuggeluk/DeepWatershedDetection/tree/dwd_old

  14. Urban Zone Aerial Object Detection Dataset

    • kaggle.com
    Updated Sep 26, 2021
    Cite
    Sganderla (2021). Urban Zone Aerial Object Detection Dataset [Dataset]. https://www.kaggle.com/sganderla/urban-zone-aerial-object-detection-dataset/tasks
    Available formats: Croissant (a machine-learning dataset format; see mlcommons.org/croissant)
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Sganderla
    Description

    Context

    As a base for creating object detection models, this dataset focuses on aerial images of urban areas. It contains 187,138 images of 4 object classes and over 4,112,482 annotated objects; annotations are in YOLOv5 format.

    Content

    The dataset is a combination of 3 other datasets: the Stanford Drone Dataset, Vision Meets Drones, and the Unmanned Aerial Vehicles Benchmark for Object Detection and Tracking.

    Object Classes

    The dataset has the following objects of interest: person, small vehicle, medium vehicle and large vehicle.
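    The YOLOv5 annotation format mentioned above stores one object per line as "class x_center y_center width height", with coordinates normalized to [0, 1]. A minimal parser converting such a line into pixel corner coordinates; the class-id-to-name mapping is assumed from the four classes listed:

```python
# Assumed mapping of class ids to the four listed classes of interest.
CLASS_NAMES = ["person", "small vehicle", "medium vehicle", "large vehicle"]

def parse_yolo_line(line: str, img_w: int, img_h: int):
    """Parse one YOLO-format label line into (class name, pixel box)."""
    cls, cx, cy, w, h = line.split()
    cx, cy, w, h = (float(v) for v in (cx, cy, w, h))
    x1 = (cx - w / 2) * img_w
    y1 = (cy - h / 2) * img_h
    x2 = (cx + w / 2) * img_w
    y2 = (cy + h / 2) * img_h
    return CLASS_NAMES[int(cls)], (x1, y1, x2, y2)

label, box = parse_yolo_line("0 0.5 0.5 0.25 0.5", img_w=640, img_h=480)
# box spans a 160x240-pixel region centered in a 640x480 image
```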

    Acknowledgements

    @inproceedings{robicquet2016learning, title={Learning social etiquette: Human trajectory understanding in crowded scenes}, author={Robicquet, Alexandre and Sadeghian, Amir and Alahi, Alexandre and Savarese, Silvio}, booktitle={European conference on computer vision}, pages={549--565}, year={2016}, organization={Springer} }

    @inproceedings{du2018unmanned, title={The unmanned aerial vehicle benchmark: Object detection and tracking}, author={Du, Dawei and Qi, Yuankai and Yu, Hongyang and Yang, Yifan and Duan, Kaiwen and Li, Guorong and Zhang, Weigang and Huang, Qingming and Tian, Qi}, booktitle={Proceedings of the European Conference on Computer Vision (ECCV)}, pages={370--386}, year={2018} }

    @article{zhu2020vision, title={Vision meets drones: Past, present and future}, author={Zhu, Pengfei and Wen, Longyin and Du, Dawei and Bian, Xiao and Hu, Qinghua and Ling, Haibin}, journal={arXiv preprint arXiv:2001.06303}, year={2020} }

  15. PASCAL Visual Object Classes Challenge - Dataset - LDM

    • service.tib.eu
    Updated Dec 3, 2024
    + more versions
    Cite
    (2024). PASCAL Visual Object Classes Challenge - Dataset - LDM [Dataset]. https://service.tib.eu/ldmservice/dataset/pascal-visual-object-classes-challenge
    Description

    The PASCAL Visual Object Classes Challenge (VOC) is a benchmark dataset for object detection and semantic segmentation.

  16. MetaGraspNet Single Class Mutiple Instance Dataset

    • kaggle.com
    Updated Apr 1, 2022
    + more versions
    Cite
    Yuhao Chen (2022). MetaGraspNet Single Class Mutiple Instance Dataset [Dataset]. https://www.kaggle.com/metagrasp/metagraspnet-single-class-mutiple-instance-dataset/discussion
    Available formats: Croissant (a machine-learning dataset format; see mlcommons.org/croissant)
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Yuhao Chen
    License

    Attribution-NoDerivs 4.0 (CC BY-ND 4.0): https://creativecommons.org/licenses/by-nd/4.0/
    License information was derived automatically

    Description

    MetaGraspNet dataset

    This repository contains the MetaGraspNet Dataset described in the paper "MetaGraspNet: A Large-Scale Benchmark Dataset for Vision-driven Robotic Grasping via Physics-based Metaverse Synthesis" (https://arxiv.org/abs/2112.14663 ).

    There has been increasing interest in smart factories powered by robotics systems to tackle repetitive, laborious tasks. One particularly impactful yet challenging task in robotics-powered smart factory applications is robotic grasping: using robotic arms to grasp objects autonomously in different settings. Robotic grasping requires a variety of computer vision tasks such as object detection, segmentation, grasp prediction, pick planning, etc. While significant progress has been made in leveraging machine learning for robotic grasping, particularly with deep learning, a big challenge remains in the need for large-scale, high-quality RGBD datasets that cover a wide diversity of scenarios and permutations.

    To tackle this big, diverse data problem, we are inspired by the recent rise in the concept of metaverse, which has greatly closed the gap between virtual worlds and the physical world. In particular, metaverses allow us to create digital twins of real-world manufacturing scenarios and to virtually create different scenarios from which large volumes of data can be generated for training models. We present MetaGraspNet: a large-scale benchmark dataset for vision-driven robotic grasping via physics-based metaverse synthesis. The proposed dataset contains 100,000 images and 25 different object types, and is split into 5 difficulties to evaluate object detection and segmentation model performance in different grasping scenarios. We also propose a new layout-weighted performance metric alongside the dataset for evaluating object detection and segmentation performance in a manner that is more appropriate for robotic grasp applications compared to existing general-purpose performance metrics. This repository contains the first phase of MetaGraspNet benchmark dataset which includes detailed object detection, segmentation, layout annotations, and a script for layout-weighted performance metric (https://github.com/y2863/MetaGraspNet ).

    (Overview image: https://raw.githubusercontent.com/y2863/MetaGraspNet/main/.github/500.png)

    Citing MetaGraspNet

    If you use the MetaGraspNet dataset or metric in your research, please use the following BibTeX entry:

    @article{chen2021metagraspnet,
      author  = {Yuhao Chen and E. Zhixuan Zeng and Maximilian Gilles and Alexander Wong},
      title   = {MetaGraspNet: a large-scale benchmark dataset for vision-driven robotic grasping via physics-based metaverse synthesis},
      journal = {arXiv preprint arXiv:2112.14663},
      year    = {2021}
    }

    File Structure

    This dataset is arranged in the following file structure:

    root
    |-- meta-grasp
      |-- scene0
        |-- 0_camera_params.json
        |-- 0_depth.png
        |-- 0_rgb.png
        |-- 0_order.csv
        ...
      |-- scene1
      ...
    |-- difficulty-n-coco-label.json
    

    Each scene is a unique arrangement of objects, which is then displayed at various angles. For each shot of a scene, we provide the camera parameters (x_camera_params.json), a depth image (x_depth.png), an RGB image (x_rgb.png), as well as a matrix representation of the ordering of each object (x_order.csv). The full labels for the images are available in difficulty-n-coco-label.json (where n is the difficulty level of the dataset) in the COCO data format.

    Understanding order.csv

    The matrix describes a pairwise obstruction relationship between each object within the image. Given a "parent" object covering a "child" object: relationship_matrix[child_id, parent_id] = -1
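    Reading such a matrix can be sketched as follows. The 3x3 example values and helper name are made up for illustration; only the convention relationship_matrix[child_id, parent_id] = -1 comes from the description above.

```python
import csv, io

# Parse a small, made-up x_order.csv-style obstruction matrix: a -1 at
# [child, parent] means the parent object covers the child object.
example = "0,-1,0\n0,0,0\n0,-1,0\n"
matrix = [[int(v) for v in row] for row in csv.reader(io.StringIO(example))]

def covered_pairs(m):
    """Return (child_id, parent_id) pairs where the parent covers the child."""
    return [(child, parent)
            for child, row in enumerate(m)
            for parent, val in enumerate(row) if val == -1]

pairs = covered_pairs(matrix)  # in this example, object 1 covers objects 0 and 2
```

    A pick planner could use these pairs to ensure covering ("parent") objects are removed before the objects they obstruct.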

  17. Pigdetect: a diverse and challenging benchmark dataset for the detection of pigs in images

    • service.tib.eu
    Updated May 16, 2025
    Cite
    (2025). Pigdetect: a diverse and challenging benchmark dataset for the detection of pigs in images - Vdataset - LDM [Dataset]. https://service.tib.eu/ldmservice/dataset/goe-doi-10-25625-i6uye9
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Note: To better find the files to download, select "Change View: Tree". The dataset contains:

    2931 images from conventional pig farming with object detection annotations in yolo and coco format, with predefined training, validation and test splits

    Trained model weights for pig detection

    A thorough explanation of all files contained in this data repository can be found in ReadMe.txt.

  18. XS-VID

    • huggingface.co
    Updated Jun 4, 2025
    Cite
    Jiahao s (2025). XS-VID [Dataset]. https://huggingface.co/datasets/lanlanlan23/XS-VID
    Authors
    Jiahao s
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    XS-VID: An Extremely Small Video Object Detection Dataset

      Dataset Description
    

    XS-VID is designed as a benchmark dataset for extremely small video object detection. It is intended to evaluate the performance of video object detection models, particularly focusing on efficiency and effectiveness in resource-limited situations. The dataset includes a variety of videos and scenarios to comprehensively assess model capabilities. [News]: XS-VIDv2 is coming soon! We are excited… See the full description on the dataset page: https://huggingface.co/datasets/lanlanlan23/XS-VID.

  19. Cry, Laugh, or Angry? A Benchmark Dataset for Computer Vision-Based Approach to Infant Facial Emotion Recognition

    • data.mendeley.com
    Updated Mar 10, 2025
    Cite
    Md. Mehedi Hasan (2025). Cry, Laugh, or Angry? A Benchmark Dataset for Computer Vision-Based Approach to Infant Facial Emotion Recognition [Dataset]. http://doi.org/10.17632/hy969mrx9p.1
    Authors
    Md. Mehedi Hasan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is a meticulously curated dataset designed for infant facial emotion recognition, featuring four primary emotional expressions: Angry, Cry, Laugh, and Normal. The dataset aims to facilitate research in machine learning, deep learning, affective computing, and human-computer interaction by providing a large collection of labeled infant facial images.

    Primary Data (1600 Images):
    • Angry: 400
    • Cry: 400
    • Laugh: 400
    • Normal: 400

    Data Augmentation & Expanded Dataset (26,143 Images): To enhance robustness and expand the dataset, 20 augmentation techniques (including HorizontalFlip, VerticalFlip, Rotate, ShiftScaleRotate, BrightnessContrast, GaussNoise, GaussianBlur, Sharpen, HueSaturationValue, CLAHE, GridDistortion, ElasticTransform, GammaCorrection, MotionBlur, ColorJitter, Emboss, Equalize, Posterize, FogEffect, and RainEffect) were applied randomly. This resulted in a significantly larger dataset with:

    • Angry: 5,781
    • Cry: 6,930
    • Laugh: 6,870
    • Normal: 6,562

    Data Collection & Ethical Considerations: The dataset was collected under strict ethical guidelines to ensure compliance with privacy and data protection laws. Key ethical considerations include:
    1. Ethical Approval: The study was reviewed and approved by the Institutional Review Board (IRB) of Daffodil International University under Reference No: REC-FSIT-2024-11-10.
    2. Informed Parental Consent: Written consent was obtained from parents before capturing and utilizing infant facial images for research purposes.
    3. Privacy Protection: No personally identifiable information (PII) is included in the dataset, and images are strictly used for research in AI-driven emotion recognition.

    Data Collection Locations & Geographical Diversity: To ensure diversity in infant facial expressions, data collection was conducted across multiple locations in Bangladesh, covering healthcare centers and educational institutions:

    1. 250-bed District Sadar Hospital, Sherpur (Latitude: 25.019405 & Longitude: 90.013733)
    2. Upazila Health Complex, Baraigram, Natore (Latitude: 24.3083 & Longitude: 89.1700)
    3. Char Bhabna Community Clinic, Sherpur (Latitude: 25.0188 & Longitude: 90.0175)
    4. Jamiatul Amin Mohammad Al-Islamia Cadet Madrasa, Khagan, Dhaka (Latitude: 23.872856 & Longitude: 90.318947)

    Face Detection Methodology: To extract the facial regions efficiently, RetinaNet—a deep learning-based object detection model—was employed. The use of RetinaNet ensures precise facial cropping while minimizing background noise and occlusions.

    Potential Applications:
    1. Affective Computing: Understanding infant emotions for smart healthcare and early childhood development.
    2. Computer Vision: Training deep learning models for automated infant facial expression recognition.
    3. Pediatric & Mental Health Research: Assisting in early autism screening and emotion-aware AI for child psychology.
    4. Human-Computer Interaction (HCI): Designing AI-powered assistive technologies for infants.

  20. Tiny Object Detection in Real-Time Traffic Surveillance (VisDrone Dataset)

    • ieee-dataport.org
    Updated Jun 18, 2025
    Cite
    Bonala Shanmukesh (2025). Tiny Object Detection in Real-Time Traffic Surveillance (VisDrone Dataset) [Dataset]. https://ieee-dataport.org/documents/tiny-object-detection-real-time-traffic-surveillance-visdrone-dataset
    Authors
    Bonala Shanmukesh
    Description

    This dataset supports the manuscript titled "Tiny Object Detection in Aerial Traffic Surveillance using YOLOv8-Nano". It contains training and evaluation resources used to benchmark YOLOv8n and YOLO-MARS models on the VisDrone dataset for real-time object detection. The data includes:
