100+ datasets found
  1. KITTI object detection benchmark dataset - Dataset - LDM

    • service.tib.eu
    Updated Dec 2, 2024
    Cite
    (2024). KITTI object detection benchmark dataset - Dataset - LDM [Dataset]. https://service.tib.eu/ldmservice/dataset/kitti-object-detection-benchmark-dataset
    Description

    The KITTI object detection benchmark dataset.

  2. Benchmark dataset for small and narrow rectangular object detection from Google Earth imagery

    • ieee-dataport.org
    Updated May 18, 2022
    Cite
    Zhonghua Hong (2022). Benchmark dataset for small and narrow rectangular object detection from Google Earth imagery [Dataset]. https://ieee-dataport.org/documents/benchmark-dataset-small-and-narrow-rectangular-object-detection-google-earth-imagery
    Authors
    Zhonghua Hong
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The benchmark dataset consists of 2...

  3. object-detection-bench

    • huggingface.co
    Updated May 28, 2025
    Cite
    JigsawStack (2025). object-detection-bench [Dataset]. https://huggingface.co/datasets/JigsawStack/object-detection-bench
    Dataset authored and provided by
    JigsawStack
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Object Detection Bench

    This dataset is a customized version of the RealworldQA dataset, specifically tailored for object detection and segmentation benchmarking tasks.

      Dataset Description
    

    This benchmark dataset contains real-world images with questions, answers, and custom prompts designed for evaluating object detection and segmentation models. Each sample includes:

    Image: Real-world photographs
    Question: Original question about the image content
    Answer: Ground truth… See the full description on the dataset page: https://huggingface.co/datasets/JigsawStack/object-detection-bench.

  4. A Benchmark Dataset for Fine-Grained Object Detection and Recognition Based on Single-Look Complex SAR Images (FAIR-CSAR-V1.0)

    • scidb.cn
    Updated Feb 20, 2025
    Cite
    Wu Youming; Diao Wenhui; Suo Yuxi; Sun Xian (2025). A Benchmark Dataset for Fine-Grained Object Detection and Recognition Based on Single-Look Complex SAR Images (FAIR-CSAR-V1.0) [Dataset]. http://doi.org/10.57760/sciencedb.radars.00019
    Available formats: Croissant (a machine-learning dataset format; see mlcommons.org/croissant)
    Dataset provided by
    Science Data Bank
    Authors
    Wu Youming; Diao Wenhui; Suo Yuxi; Sun Xian
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    FAIR-CSAR-V1.0, constructed from single-look complex (SLC) images of the Gaofen-3 satellite, is the largest and most finely annotated SAR image dataset for fine-grained targets to date. FAIR-CSAR-V1.0 aims to advance related technologies in SAR image object detection, recognition, and target characteristic understanding. The dataset was developed by the Key Laboratory of Target Cognition and Application Technology (TCAT) at the Aerospace Information Research Institute, Chinese Academy of Sciences.

    FAIR-CSAR-V1.0 comprises 175 scenes of Gaofen-3 Level-1 SLC products, covering 32 global regions including airports, oil refineries, ports, and rivers. With a total data volume of 250 GB and over 340,000 instances, FAIR-CSAR-V1.0 covers 5 main categories and 22 subcategories, providing detailed annotations for imaging parameters (e.g., radar center frequency, pulse repetition frequency) and target characteristics (e.g., satellite-ground relative azimuthal angle, key scattering point distribution). FAIR-CSAR-V1.0 consists of two sub-datasets: the SL dataset and the FSI dataset. The SL dataset, acquired in spotlight mode with a nominal resolution of 1 meter, contains 170,000 instances across 22 target classes. The FSI dataset, acquired in fine stripmap mode with a nominal resolution of 5 meters, includes 170,000 instances across 3 target classes. Figure 1 presents an overview of the dataset.

    Data paper and citation format:
    [1] Youming Wu, Wenhui Diao, Yuxi Suo, Xian Sun. A Benchmark Dataset for Fine-Grained Object Detection and Recognition Based on Single-Look Complex SAR Images (FAIR-CSAR-V1.0) [OL]. Journal of Radars, 2025. https://radars.ac.cn/web/data/getData?dataType=FAIR_CSAR_en&pageType=en
    [2] Y. Wu, Y. Suo, Q. Meng, W. Dai, T. Miao, W. Zhao, Z. Yan, W. Diao, G. Xie, Q. Ke, Y. Zhao, K. Fu and X. Sun. FAIR-CSAR: A Benchmark Dataset for Fine-Grained Object Detection and Recognition Based on Single-Look Complex SAR Images [J]. IEEE Transactions on Geoscience and Remote Sensing, vol. 63, pp. 1-22, 2025, doi: 10.1109/TGRS.2024.3519891.

  5. Extended Evaluation of SnowPole Detection for Machine-Perceivable Infrastructure for Nordic Winter Conditions: A Comparative Study of Object Detection Models

    • data.mendeley.com
    Updated Jun 30, 2025
    Cite
    Durga Prasad Bavirisetti (2025). Extended Evaluation of SnowPole Detection for Machine-Perceivable Infrastructure for Nordic Winter Conditions: A Comparative Study of Object Detection Models [Dataset]. http://doi.org/10.17632/tt6rbx7s3h.3
    Authors
    Durga Prasad Bavirisetti
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    In this study, we present an extensive evaluation of state-of-the-art YOLO object detection architectures for identifying snow poles in LiDAR-derived imagery captured under challenging Nordic conditions. Building upon our previous work on the SnowPole Detection dataset [1] and our LiDAR–GNSS-based localization framework [2], we expand the benchmark to include six YOLO models (YOLOv5s, YOLOv7-tiny, YOLOv8n, YOLOv9t, YOLOv10n, and YOLOv11n) evaluated across multiple input modalities. Specifically, we assess single-channel modalities (Reflectance, Signal, Near-Infrared) and six pseudo-color combinations derived by mapping these channels to RGB representations. Each model's performance is quantified using Precision, Recall, mAP@50, mAP@50–95, and GPU inference latency. To facilitate systematic comparison, we define a composite Rank Score that integrates detection accuracy and real-time performance in a weighted formulation.

    Experimental results show that YOLOv9t consistently achieves the highest detection accuracy, while YOLOv11n provides the best trade-off between accuracy and inference speed, making it a promising candidate for real-time applications on embedded platforms. Among input modalities, pseudo-color combinations, particularly those fusing Near-Infrared, Signal, and Reflectance channels, outperformed single modalities across most configurations, achieving the highest Rank Scores and mAP metrics. Therefore, we recommend using multimodal LiDAR representations such as Combination 4 and Combination 5 to maximize detection robustness in practical deployments. All datasets, benchmarking code, and trained models are publicly available to support reproducibility and further research through our GitHub repository (a).

    References [1] Durga Prasad Bavirisetti, Gabriel Hanssen Kiss, Petter Arnesen, Hanne Seter, Shaira Tabassum, and Frank Lindseth. Snowpole detection: A comprehensive dataset for detection and localization using lidar imaging in nordic winter conditions. Data in Brief, 59:111403, 2025. [2] Durga Prasad Bavirisetti, Gabriel Hanssen Kiss, and Frank Lindseth. A pole detection and geospatial localization framework using lidar-gnss data fusion. In 2024 27th International Conference on Information Fusion (FUSION), pages 1–8. IEEE, 2024. (a) https://github.com/MuhammadIbneRafiq/Extended-evaluation-snowpole-lidar-dataset
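    A composite score of the kind described above can be sketched as a weighted sum of an accuracy term and a normalized speed term. The listing does not give the study's actual formula, so the weights, latency budget, function name, and example numbers below are purely illustrative assumptions.

```python
# Illustrative sketch only: the study defines a composite Rank Score weighting
# detection accuracy against real-time performance, but the exact formula is
# not given here. Weights, latency budget, and numbers are assumptions.
def rank_score(map_50_95: float, latency_ms: float,
               max_latency_ms: float = 50.0,
               w_acc: float = 0.7, w_speed: float = 0.3) -> float:
    """Higher is better: mAP in [0, 1] plus a normalized speed term."""
    speed_term = max(0.0, 1.0 - latency_ms / max_latency_ms)
    return w_acc * map_50_95 + w_speed * speed_term

# Rank hypothetical (model, mAP@50-95, GPU latency in ms) triples, best first.
models = [("model_a", 0.62, 14.0), ("model_b", 0.60, 7.0), ("model_c", 0.55, 9.0)]
ranked = sorted(models, key=lambda m: rank_score(m[1], m[2]), reverse=True)
```

    With this weighting, a slightly less accurate but much faster model can outrank a more accurate one, which is the trade-off the abstract describes for YOLOv11n versus YOLOv9t.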

  6. Benchmark Dataset

    • universe.roboflow.com
    zip
    Updated May 10, 2025
    + more versions
    Cite
    actions (2025). Benchmark Dataset [Dataset]. https://universe.roboflow.com/actions-kwook/benchmark-astr8/dataset/1
    Dataset authored and provided by
    actions
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Players Bounding Boxes
    Description

    Benchmark

    ## Overview
    
    Benchmark is a dataset for object detection tasks - it contains Players annotations for 256 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
      ## License
    
      This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  7. Yolov5 On Benchmark Dataset

    • universe.roboflow.com
    zip
    Updated Nov 2, 2022
    Cite
    Research1 (2022). Yolov5 On Benchmark Dataset [Dataset]. https://universe.roboflow.com/research1-fcw6m/yolov5-on-benchmark-dataset
    Dataset authored and provided by
    Research1
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Variables measured
    Cracks Dents Bounding Boxes
    Description

    Yolov5 On Benchmark Dataset

    ## Overview
    
    Yolov5 On Benchmark Dataset is a dataset for object detection tasks - it contains Cracks Dents annotations for 201 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
      ## License
    
      This dataset is available under the [CC0 1.0 Public Domain license](https://creativecommons.org/publicdomain/zero/1.0/).
    
  8. Small Object Aerial Person Detection Dataset

    • zenodo.org
    txt, zip
    Updated Apr 5, 2023
    Cite
    Rafael Makrigiorgis; Christos Kyrkou; Panayiotis Kolios (2023). Small Object Aerial Person Detection Dataset [Dataset]. http://doi.org/10.5281/zenodo.7740081
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Rafael Makrigiorgis; Christos Kyrkou; Panayiotis Kolios
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Small Object Aerial Person Detection Dataset:

    The aerial dataset publication comprises a collection of frames captured from unmanned aerial vehicles (UAVs) during flights over the University of Cyprus campus and Civil Defense exercises. The dataset is primarily intended for people detection, with a focus on detecting small objects due to the top-view perspective of the images. The dataset includes annotations generated in popular formats such as YOLO, COCO, and VOC, making it highly versatile and accessible for a wide range of applications. Overall, this aerial dataset publication represents a valuable resource for researchers and practitioners working in the field of computer vision and machine learning, particularly those focused on people detection and related applications.

    Subset      Images  People
    Training    2092    40687
    Validation  523     10589
    Testing     521     10432

    It is advised to further enhance the dataset so that random augmentations are probabilistically applied to each image prior to adding it to the batch for training. Specifically, there are a number of possible transformations such as geometric (rotations, translations, horizontal axis mirroring, cropping, and zooming), as well as image manipulations (illumination changes, color shifting, blurring, sharpening, and shadowing).
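    The probabilistic-augmentation advice above can be sketched as follows. The specific transforms, probabilities, and use of NumPy here are illustrative assumptions, not the authors' pipeline; real training code would more likely use a library such as albumentations or torchvision, and would also transform the bounding boxes alongside the image.

```python
import numpy as np

# Sketch: apply each augmentation with some probability before a frame joins
# the training batch. Transforms and probabilities are illustrative only.
rng = np.random.default_rng(0)

def augment(image: np.ndarray, p: float = 0.5) -> np.ndarray:
    """image: HxWxC uint8 array; returns a possibly-transformed copy."""
    out = image.copy()
    if rng.random() < p:                      # horizontal axis mirroring
        out = out[:, ::-1, :]
    if rng.random() < p:                      # illumination change
        gain = rng.uniform(0.8, 1.2)
        out = np.clip(out.astype(np.float32) * gain, 0, 255).astype(np.uint8)
    if rng.random() < p:                      # 90-degree rotation
        out = np.rot90(out, k=1, axes=(0, 1))
    return out

frame = rng.integers(0, 256, size=(64, 96, 3), dtype=np.uint8)
batch = [augment(frame) for _ in range(8)]    # augmented copies for one batch
```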

  9. Data from: Tiny Robotics Dataset and Benchmark for Continual Object Detection

    • data.niaid.nih.gov
    • zenodo.org
    Updated Mar 11, 2025
    Cite
    Francesco Pasti (2025). Tiny Robotics Dataset and Benchmark for Continual Object Detection [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_13834549
    Dataset authored and provided by
    Francesco Pasti
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dataset for TiROD: Tiny Robotics Dataset and Benchmark for Continual Object Detection

    Official Website -> https://pastifra.github.io/TiROD/

    Code -> https://github.com/pastifra/TiROD_code

    Video -> https://www.youtube.com/watch?v=e76m3ol1i4I

    Paper -> https://arxiv.org/abs/2409.16215

  10. TinyPerson Dataset

    • paperswithcode.com
    • opendatalab.com
    Updated Feb 20, 2021
    Cite
    Xuehui Yu; Yuqi Gong; Nan Jiang; Qixiang Ye; Zhenjun Han (2021). TinyPerson Dataset [Dataset]. https://paperswithcode.com/dataset/tinyperson
    Authors
    Xuehui Yu; Yuqi Gong; Nan Jiang; Qixiang Ye; Zhenjun Han
    Description

    TinyPerson is a benchmark for tiny object detection at long distances and with massive backgrounds. The images in TinyPerson are collected from the Internet. First, high-resolution videos are collected from different websites. Second, images are sampled from the videos every 50 frames. Then images with a certain repetition (homogeneity) are deleted, and the resulting images are annotated by hand with 72,651 object bounding boxes.

  11. MTMS300: a multiple-targets and multiple-scales benchmark dataset for salient object detection

    • scidb.cn
    Updated Apr 14, 2025
    Cite
    Li Chuwei; Zhang Zhilong; Li Shuxin (2025). MTMS300: a multiple-targets and multiple-scales benchmark dataset for salient object detection [Dataset]. http://doi.org/10.57760/sciencedb.j00240.00024
    Available formats: Croissant (a machine-learning dataset format; see mlcommons.org/croissant)
    Dataset provided by
    Science Data Bank
    Authors
    Li Chuwei; Zhang Zhilong; Li Shuxin
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    During the development of salient object detection algorithms, benchmark datasets have played a critical role. However, existing benchmark datasets commonly suffer from dataset bias, making it challenging to fully reflect the performance of different algorithms or capture the technical characteristics of certain typical applications. To address these limitations, we have undertaken two key initiatives: (1) We designed a new benchmark dataset, MTMS300 (Multiple Targets and Multiple Scales), tailored to reconnaissance and surveillance applications. This dataset contains 300 color visible-light images from land, sea, and air scenarios, featuring reduced center bias, a balanced distribution of target-to-image area ratios, diverse image sizes, and multiple targets per image. (2) We curated a new benchmark dataset, DSC (Difficult Scenes in Common), by identifying images from publicly available benchmarks that pose significant challenges (with low metric scores) for most non-deep-learning algorithms. The proposed datasets exhibit distinct characteristics, enabling more comprehensive evaluation of visual saliency algorithms. This advancement will drive the development of visual saliency algorithms toward task-specific applications.

  12. Underwater Objects Dataset

    • universe.roboflow.com
    zip
    Updated May 7, 2023
    Cite
    Roboflow 100 (2023). Underwater Objects Dataset [Dataset]. https://universe.roboflow.com/roboflow-100/underwater-objects-5v7p8/model/1
    Dataset provided by
    Roboflow, Inc.
    Authors
    Roboflow 100
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Underwater Objects Bounding Boxes
    Description

    This dataset was originally created by Yimin Chen. To see the current project, which may have been updated since this version, please go here: https://universe.roboflow.com/workspace-txxpz/underwater-detection.

    This dataset is part of RF100, an Intel-sponsored initiative to create a new object detection benchmark for model generalizability.

    Access the RF100 Github repo: https://github.com/roboflow-ai/roboflow-100-benchmark

  13. Data from: DeepScoresV2

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jun 7, 2023
    Cite
    Pacha, Alexander (2023). DeepScoresV2 [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4012192
    Dataset provided by
    Tuggener, Lukas
    Pacha, Alexander
    Satyawan, Yvan Putra
    Stadelmann, Thilo
    Schmidhuber, JĂĽrgen
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The DeepScoresV2 Dataset for Music Object Detection contains digitally rendered images of written sheet music, together with the corresponding ground truth to fit various types of machine learning models. A total of 151 million different instances of music symbols, belonging to 135 different classes, are annotated. The total dataset contains 255,385 images. For most research purposes, the dense version, containing 1714 of the most diverse and interesting images, should suffice.

    The dataset contains ground truth in the form of:

    Non-oriented bounding boxes

    Oriented bounding boxes

    Semantic segmentation

    Instance segmentation
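    The first two annotation types above are related: an oriented bounding box (center, size, rotation angle) can always be enclosed by a non-oriented, axis-aligned one spanning its rotated corners. A minimal sketch of that conversion; the function names are illustrative, not from the DeepScoresV2 toolkit:

```python
import math

# Convert an oriented bounding box (cx, cy, w, h, angle) into corner points,
# and from there into the smallest enclosing axis-aligned box.
def obb_corners(cx, cy, w, h, angle_rad):
    """Corner coordinates of an oriented box rotated about its center."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [(cx + c * dx - s * dy, cy + s * dx + c * dy)
            for dx, dy in [(-w/2, -h/2), (w/2, -h/2), (w/2, h/2), (-w/2, h/2)]]

def obb_to_aabb(cx, cy, w, h, angle_rad):
    """Smallest axis-aligned box enclosing the oriented box."""
    xs, ys = zip(*obb_corners(cx, cy, w, h, angle_rad))
    return min(xs), min(ys), max(xs), max(ys)
```

    Note the conversion is lossy in one direction: a rotated symbol's axis-aligned box is looser than its oriented box, which is why both annotation types are provided.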

    The accompanying paper, The DeepScoresV2 Dataset and Benchmark for Music Object Detection, published at ICPR 2020, can be found here:

    https://digitalcollection.zhaw.ch/handle/11475/20647

    A toolkit for convenient loading and inspection of the data can be found here:

    https://github.com/yvan674/obb_anns

    Code to train baseline models can be found here:

    https://github.com/tuggeluk/mmdetection/tree/DSV2_Baseline_FasterRCNN

    https://github.com/tuggeluk/DeepWatershedDetection/tree/dwd_old

  14. Urban Zone Aerial Object Detection Dataset

    • kaggle.com
    Updated Sep 26, 2021
    Cite
    Sganderla (2021). Urban Zone Aerial Object Detection Dataset [Dataset]. https://www.kaggle.com/sganderla/urban-zone-aerial-object-detection-dataset/tasks
    Available formats: Croissant (a machine-learning dataset format; see mlcommons.org/croissant)
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Sganderla
    Description

    Context

    As a base for creating object detection models, this dataset focuses on aerial images of urban areas. It contains 187,138 images of 4 object classes and over 4,112,482 annotated objects; annotations are in YOLOv5 format.

    Content

    The dataset is a combination of 3 other datasets: the Stanford Drone Dataset, Vision Meets Drones, and the Unmanned Aerial Vehicles Benchmark for Object Detection and Tracking.

    Object Classes

    The dataset has the following objects of interest: person, small vehicle, medium vehicle and large vehicle.
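    The YOLOv5 annotation format mentioned above stores one object per line as "class x_center y_center width height", with coordinates normalized to [0, 1]. A minimal parser converting such a line into pixel corner coordinates; the class-id-to-name mapping is assumed from the four classes listed:

```python
# Assumed mapping of class ids to the four listed classes of interest.
CLASS_NAMES = ["person", "small vehicle", "medium vehicle", "large vehicle"]

def parse_yolo_line(line: str, img_w: int, img_h: int):
    """Parse one YOLO-format label line into (class name, pixel box)."""
    cls, cx, cy, w, h = line.split()
    cx, cy, w, h = (float(v) for v in (cx, cy, w, h))
    x1 = (cx - w / 2) * img_w
    y1 = (cy - h / 2) * img_h
    x2 = (cx + w / 2) * img_w
    y2 = (cy + h / 2) * img_h
    return CLASS_NAMES[int(cls)], (x1, y1, x2, y2)

label, box = parse_yolo_line("0 0.5 0.5 0.25 0.5", img_w=640, img_h=480)
# box spans a 160x240-pixel region centered in a 640x480 image
```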

    Acknowledgements

    @inproceedings{robicquet2016learning, title={Learning social etiquette: Human trajectory understanding in crowded scenes}, author={Robicquet, Alexandre and Sadeghian, Amir and Alahi, Alexandre and Savarese, Silvio}, booktitle={European conference on computer vision}, pages={549--565}, year={2016}, organization={Springer} }

    @inproceedings{du2018unmanned, title={The unmanned aerial vehicle benchmark: Object detection and tracking}, author={Du, Dawei and Qi, Yuankai and Yu, Hongyang and Yang, Yifan and Duan, Kaiwen and Li, Guorong and Zhang, Weigang and Huang, Qingming and Tian, Qi}, booktitle={Proceedings of the European Conference on Computer Vision (ECCV)}, pages={370--386}, year={2018} }

    @article{zhu2020vision, title={Vision meets drones: Past, present and future}, author={Zhu, Pengfei and Wen, Longyin and Du, Dawei and Bian, Xiao and Hu, Qinghua and Ling, Haibin}, journal={arXiv preprint arXiv:2001.06303}, year={2020} }

  15. PASCAL Visual Object Classes Challenge - Dataset - LDM

    • service.tib.eu
    Updated Dec 3, 2024
    + more versions
    Cite
    (2024). PASCAL Visual Object Classes Challenge - Dataset - LDM [Dataset]. https://service.tib.eu/ldmservice/dataset/pascal-visual-object-classes-challenge
    Description

    The PASCAL Visual Object Classes Challenge (VOC) is a benchmark dataset for object detection and semantic segmentation.

  16. MetaGraspNet Single Class Mutiple Instance Dataset

    • kaggle.com
    Updated Apr 1, 2022
    + more versions
    Cite
    Yuhao Chen (2022). MetaGraspNet Single Class Mutiple Instance Dataset [Dataset]. https://www.kaggle.com/metagrasp/metagraspnet-single-class-mutiple-instance-dataset/discussion
    Available formats: Croissant (a machine-learning dataset format; see mlcommons.org/croissant)
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Yuhao Chen
    License

    Attribution-NoDerivs 4.0 (CC BY-ND 4.0): https://creativecommons.org/licenses/by-nd/4.0/
    License information was derived automatically

    Description

    MetaGraspNet dataset

    This repository contains the MetaGraspNet Dataset described in the paper "MetaGraspNet: A Large-Scale Benchmark Dataset for Vision-driven Robotic Grasping via Physics-based Metaverse Synthesis" (https://arxiv.org/abs/2112.14663 ).

    There has been increasing interest in smart factories powered by robotics systems to tackle repetitive, laborious tasks. One particularly impactful yet challenging task in robotics-powered smart factory applications is robotic grasping: using robotic arms to grasp objects autonomously in different settings. Robotic grasping requires a variety of computer vision tasks such as object detection, segmentation, grasp prediction, pick planning, etc. While significant progress has been made in leveraging machine learning for robotic grasping, particularly with deep learning, a big challenge remains in the need for large-scale, high-quality RGBD datasets that cover a wide diversity of scenarios and permutations.

    To tackle this big, diverse data problem, we are inspired by the recent rise in the concept of metaverse, which has greatly closed the gap between virtual worlds and the physical world. In particular, metaverses allow us to create digital twins of real-world manufacturing scenarios and to virtually create different scenarios from which large volumes of data can be generated for training models. We present MetaGraspNet: a large-scale benchmark dataset for vision-driven robotic grasping via physics-based metaverse synthesis. The proposed dataset contains 100,000 images and 25 different object types, and is split into 5 difficulties to evaluate object detection and segmentation model performance in different grasping scenarios. We also propose a new layout-weighted performance metric alongside the dataset for evaluating object detection and segmentation performance in a manner that is more appropriate for robotic grasp applications compared to existing general-purpose performance metrics. This repository contains the first phase of MetaGraspNet benchmark dataset which includes detailed object detection, segmentation, layout annotations, and a script for layout-weighted performance metric (https://github.com/y2863/MetaGraspNet ).

    (Overview image: https://raw.githubusercontent.com/y2863/MetaGraspNet/main/.github/500.png)

    Citing MetaGraspNet

    If you use the MetaGraspNet dataset or metric in your research, please use the following BibTeX entry:

    @article{chen2021metagraspnet,
      author  = {Yuhao Chen and E. Zhixuan Zeng and Maximilian Gilles and Alexander Wong},
      title   = {MetaGraspNet: a large-scale benchmark dataset for vision-driven robotic grasping via physics-based metaverse synthesis},
      journal = {arXiv preprint arXiv:2112.14663},
      year    = {2021}
    }

    File Structure

    This dataset is arranged in the following file structure:

    root
    |-- meta-grasp
      |-- scene0
        |-- 0_camera_params.json
        |-- 0_depth.png
        |-- 0_rgb.png
        |-- 0_order.csv
        ...
      |-- scene1
      ...
    |-- difficulty-n-coco-label.json
    

    Each scene is a unique arrangement of objects, which is then displayed at various angles. For each shot of a scene, we provide the camera parameters (x_camera_params.json), a depth image (x_depth.png), an RGB image (x_rgb.png), as well as a matrix representation of the ordering of each object (x_order.csv). The full labels for the images are available in difficulty-n-coco-label.json (where n is the difficulty level of the dataset) in the COCO data format.

    Understanding order.csv

    The matrix describes a pairwise obstruction relationship between each object within the image. Given a "parent" object covering a "child" object: relationship_matrix[child_id, parent_id] = -1
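    Reading such a matrix can be sketched as follows. The 3x3 example values and helper name are made up for illustration; only the convention relationship_matrix[child_id, parent_id] = -1 comes from the description above.

```python
import csv, io

# Parse a small, made-up x_order.csv-style obstruction matrix: a -1 at
# [child, parent] means the parent object covers the child object.
example = "0,-1,0\n0,0,0\n0,-1,0\n"
matrix = [[int(v) for v in row] for row in csv.reader(io.StringIO(example))]

def covered_pairs(m):
    """Return (child_id, parent_id) pairs where the parent covers the child."""
    return [(child, parent)
            for child, row in enumerate(m)
            for parent, val in enumerate(row) if val == -1]

pairs = covered_pairs(matrix)  # in this example, object 1 covers objects 0 and 2
```

    A pick planner could use these pairs to ensure covering ("parent") objects are removed before the objects they obstruct.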

  17. Pigdetect: a diverse and challenging benchmark dataset for the detection of pigs in images

    • service.tib.eu
    Updated May 16, 2025
    Cite
    (2025). Pigdetect: a diverse and challenging benchmark dataset for the detection of pigs in images - Vdataset - LDM [Dataset]. https://service.tib.eu/ldmservice/dataset/goe-doi-10-25625-i6uye9
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Note: To better find the files to download, select "Change View: Tree". The dataset contains:

    2931 images from conventional pig farming with object detection annotations in yolo and coco format, with predefined training, validation and test splits

    Trained model weights for pig detection

    A thorough explanation of all files contained in this data repository can be found in ReadMe.txt.

  18. XS-VID

    • huggingface.co
    Updated Jun 4, 2025
    Cite
    Jiahao s (2025). XS-VID [Dataset]. https://huggingface.co/datasets/lanlanlan23/XS-VID
    Authors
    Jiahao s
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    XS-VID: An Extremely Small Video Object Detection Dataset

      Dataset Description
    

    XS-VID is designed as a benchmark dataset for extremely small video object detection. It is intended to evaluate the performance of video object detection models, particularly focusing on efficiency and effectiveness in resource-limited situations. The dataset includes a variety of videos and scenarios to comprehensively assess model capabilities. [News]: XS-VIDv2 is coming soon! We are excited… See the full description on the dataset page: https://huggingface.co/datasets/lanlanlan23/XS-VID.

  19. Cry, Laugh, or Angry? A Benchmark Dataset for Computer Vision-Based Approach to Infant Facial Emotion Recognition

    • data.mendeley.com
    Updated Mar 10, 2025
    Cite
    Md. Mehedi Hasan (2025). Cry, Laugh, or Angry? A Benchmark Dataset for Computer Vision-Based Approach to Infant Facial Emotion Recognition [Dataset]. http://doi.org/10.17632/hy969mrx9p.1
    Authors
    Md. Mehedi Hasan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is a meticulously curated dataset designed for infant facial emotion recognition, featuring four primary emotional expressions: Angry, Cry, Laugh, and Normal. The dataset aims to facilitate research in machine learning, deep learning, affective computing, and human-computer interaction by providing a large collection of labeled infant facial images.

    Primary Data (1600 Images):
    • Angry: 400
    • Cry: 400
    • Laugh: 400
    • Normal: 400

    Data Augmentation & Expanded Dataset (26,143 Images): To enhance robustness and expand the dataset, 20 augmentation techniques (including HorizontalFlip, VerticalFlip, Rotate, ShiftScaleRotate, BrightnessContrast, GaussNoise, GaussianBlur, Sharpen, HueSaturationValue, CLAHE, GridDistortion, ElasticTransform, GammaCorrection, MotionBlur, ColorJitter, Emboss, Equalize, Posterize, FogEffect, and RainEffect) were applied randomly. This resulted in a significantly larger dataset with:

    • Angry: 5,781
    • Cry: 6,930
    • Laugh: 6,870
    • Normal: 6,562

    Data Collection & Ethical Considerations: The dataset was collected under strict ethical guidelines to ensure compliance with privacy and data protection laws. Key ethical considerations include:
    1. Ethical Approval: The study was reviewed and approved by the Institutional Review Board (IRB) of Daffodil International University under Reference No: REC-FSIT-2024-11-10.
    2. Informed Parental Consent: Written consent was obtained from parents before capturing and utilizing infant facial images for research purposes.
    3. Privacy Protection: No personally identifiable information (PII) is included in the dataset, and images are strictly used for research in AI-driven emotion recognition.

    Data Collection Locations & Geographical Diversity: To ensure diversity in infant facial expressions, data collection was conducted across multiple locations in Bangladesh, covering healthcare centers and educational institutions:

    1. 250-bed District Sadar Hospital, Sherpur (Latitude: 25.019405 & Longitude: 90.013733)
    2. Upazila Health Complex, Baraigram, Natore (Latitude: 24.3083 & Longitude: 89.1700)
    3. Char Bhabna Community Clinic, Sherpur (Latitude: 25.0188 & Longitude: 90.0175)
    4. Jamiatul Amin Mohammad Al-Islamia Cadet Madrasa, Khagan, Dhaka (Latitude: 23.872856 & Longitude: 90.318947)

    Face Detection Methodology: To extract the facial regions efficiently, RetinaNet—a deep learning-based object detection model—was employed. The use of RetinaNet ensures precise facial cropping while minimizing background noise and occlusions.

    Potential Applications:
    1. Affective Computing: Understanding infant emotions for smart healthcare and early childhood development.
    2. Computer Vision: Training deep learning models for automated infant facial expression recognition.
    3. Pediatric & Mental Health Research: Assisting in early autism screening and emotion-aware AI for child psychology.
    4. Human-Computer Interaction (HCI): Designing AI-powered assistive technologies for infants.

  20. Tiny Object Detection in Real-Time Traffic Surveillance (VisDrone Dataset)

    • ieee-dataport.org
    Updated Jun 18, 2025
    Cite
    Bonala Shanmukesh (2025). Tiny Object Detection in Real-Time Traffic Surveillance (VisDrone Dataset) [Dataset]. https://ieee-dataport.org/documents/tiny-object-detection-real-time-traffic-surveillance-visdrone-dataset
    Authors
    Bonala Shanmukesh
    Description

    This dataset supports the manuscript titled "Tiny Object Detection in Aerial Traffic Surveillance using YOLOv8-Nano". It contains training and evaluation resources used to benchmark YOLOv8n and YOLO-MARS models on the VisDrone dataset for real-time object detection. The data includes:
