License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Oriented Bounding Boxes Dataset is a dataset for object detection tasks - it contains Robot O0Gq annotations for 563 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
Vector dataset extracted using a deep learning oriented object detection model. The model is trained to identify and classify above-ground and below-ground swimming pools.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Boots Oriented Bounding Box is a dataset for object detection tasks - it contains Box annotations for 509 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
License: Attribution-NonCommercial-ShareAlike 3.0 (CC BY-NC-SA 3.0), https://creativecommons.org/licenses/by-nc-sa/3.0/
License information was derived automatically
The paper for this dataset can be found here; the dataset was used in the Gaofen Challenge hosted by the Aerospace Information Research Institute, Chinese Academy of Sciences.
I put this together because a few months ago I had a project that needed such a dataset for vehicle detection, and I found there wasn't much out there with suitable resolution and quality. I ended up using the xView1 dataset, which was pretty good, but I noted at the time that FAIR1M had a lot of potential too.
The main points of difference of FAIR1M compared to many others in this space are:
- Some geographical diversity: Asia, Europe, North America, Cape Town, Sydney; mostly urban
- Oriented bounding boxes
- Most of the imagery is high resolution (0.3 m or 0.6 m), which makes it just enough for small-car detection
For comparison, xView1 is larger and more geographically diverse, but has flat (axis-aligned) bounding boxes. If you want oriented bounding boxes, FAIR1M is worth a try.
I could only find 240,852 spatially unique labels; the rest seem to be duplicates due to overlapping imagery. Some, of course, would be in the hidden test set, which has not been made public. Anyway, that's still a lot of labels, so thanks to the organisers for making these available.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Current remote sensing object detection frameworks often focus solely on the geometric relationship between true and predicted boxes, neglecting the intrinsic shapes of the boxes. In the field of remote sensing detection, there are numerous elongated bounding boxes. Variations in the shape and size of these boxes result in differences in their Intersection over Union (IoU) values, which is particularly noticeable when detecting small objects. Platforms with limited resources, such as satellites and unmanned drones, have strict requirements for detector storage space and computational complexity. This makes it challenging for existing methods to balance detection performance and computational demands. Therefore, this paper presents RS-YOLO, a lightweight framework that enhances You Only Look Once (YOLO) and is specifically designed for deployment on resource-limited platforms. RS-YOLO develops a bounding box regression approach for remote sensing images that focuses on the shape and scale of the bounding boxes. Additionally, to improve the integration of multi-scale spatial features, RS-YOLO introduces a lightweight multi-scale hybrid attention module for cross-space fusion. The DOTA-v1.0 and HRSC2016 datasets were used to test our model, which was then compared to multiple state-of-the-art oriented object detection models. The results indicate that the detector introduced in this article achieves top performance while being lightweight and suitable for deployment on resource-limited platforms.
License: Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0), https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
High-resolution aerial imagery with 16,000+ oriented bounding boxes for vehicle detection, pre-formatted for Ultralytics YOLOv11.
This dataset is a ready-to-use version of the original Eagle Dataset from the German Aerospace Center (DLR). The original dataset was created to benchmark object detection models on challenging aerial imagery, featuring vehicles at various orientations.
This version has been converted to the YOLOv11-OBB (Oriented Bounding Box) format. The conversion makes the dataset directly compatible with modern deep learning frameworks like Ultralytics YOLO, allowing researchers and developers to train state-of-the-art object detectors with minimal setup.
The dataset is ideal for tasks requiring precise localization of rotated objects, such as vehicle detection in parking lots, traffic monitoring, and urban planning from aerial viewpoints.
The dataset is split into training, validation, and test sets, following a standard structure for computer vision tasks.
Dataset Split & Counts: 159 training, 53 validation, and 106 test images.
Directory Structure:
EagleDatasetYOLO/
├── train/
│ ├── images/ # 159 images
│ └── labels/ # 159 .txt obb labels
├── val/
│ ├── images/ # 53 images
│ └── labels/ # 53 .txt obb labels
├── test/
│ ├── images/ # 106 images
│ └── labels/ # 106 .txt obb labels
├── data.yaml
└── license.md
Annotation Format (YOLOv11-OBB):
Each .txt label file contains one object per line. The format for each object is:
<class_id> <x_center> <y_center> <width> <height> <angle>
- <class_id>: The class index (in this case, 0 for 'vehicle').
- <x_center> <y_center>: The normalized center coordinates of the bounding box.
- <width> <height>: The normalized width and height of the bounding box.
- <angle>: The rotation angle of the box in radians, from -π/2 to π/2.
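For illustration, here is a minimal sketch of converting one label line in the format above to pixel-space corner points. The helper name, the example values, and the 1024x1024 image size are assumptions for the example, not part of the dataset.

```python
import math

def obb_to_corners(cx, cy, w, h, angle, img_w, img_h):
    """Convert a normalized (cx, cy, w, h, angle) OBB to four pixel corners."""
    # Denormalize to pixel units.
    cx, w = cx * img_w, w * img_w
    cy, h = cy * img_h, h * img_h
    # Rotate the half-extent offsets of the axis-aligned box by `angle`
    # (note: with image y pointing down, positive angles appear clockwise).
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    return [(cx + dx * cos_a - dy * sin_a, cy + dx * sin_a + dy * cos_a)
            for dx, dy in [(-w / 2, -h / 2), (w / 2, -h / 2),
                           (w / 2, h / 2), (-w / 2, h / 2)]]

# Example: one label line on a hypothetical 1024x1024 image.
class_id, *values = "0 0.5 0.5 0.1 0.05 0.7854".split()
print(int(class_id), obb_to_corners(*map(float, values), 1024, 1024))
```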
data.yaml Configuration:
A data.yaml file is included for easy integration with the Ultralytics framework:
path: ../EagleDatasetYOLO
train: train/images
val: val/images
test: test/images
nc: 1
names: ['vehicle']
This dataset is a conversion of the original work by the German Aerospace Center (DLR). The conversion to YOLOv11-OBB format was performed by Mridankan Mandal.
The dataset is released under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International license (CC BY-NC-SA 4.0).
If you use this dataset in your research, please cite the original creators and acknowledge the conversion work.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The DeepScoresV2 Dataset for Music Object Detection contains digitally rendered images of written sheet music, together with the corresponding ground truth to fit various types of machine learning models. A total of 151 million instances of music symbols, belonging to 135 different classes, are annotated. The full dataset contains 255,385 images. For most research, the dense version, containing 1,714 of the most diverse and interesting images, should suffice.
The dataset contains ground truth in the form of:
Non-oriented bounding boxes
Oriented bounding boxes
Semantic segmentation
Instance segmentation
The accompanying paper, "The DeepScoresV2 Dataset and Benchmark for Music Object Detection", published at ICPR 2020, can be found here:
https://digitalcollection.zhaw.ch/handle/11475/20647
A toolkit for convenient loading and inspection of the data can be found here:
https://github.com/yvan674/obb_anns
Code to train baseline models can be found here:
https://github.com/tuggeluk/mmdetection/tree/DSV2_Baseline_FasterRCNN
https://github.com/tuggeluk/DeepWatershedDetection/tree/dwd_old
Reference: https://www.mvtec.com/company/research/datasets/mvtec-screws
The MVTec Screws dataset has been designed for oriented box detection. It contains 384 images of 13 different types of screws and nuts on a wooden background. All objects are labeled with oriented bounding boxes and their respective category. Overall, there are 4,426 such annotations.
The data is released under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0).
In particular, it is not allowed to use the dataset for commercial purposes. If you are unsure whether or not your application violates the non-commercial use clause of the license, please contact MVTec via the contact form on the dataset page.
License: Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0), https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
A cleaned and reformatted version of the VSAI dataset, specifically adapted for Oriented Bounding Box (OBB) vehicle detection using the YOLOv11 format.
This dataset is designed for aerial/drone-based vehicle detection tasks. It is a modified version of the original VSAI Dataset v1 by the DroneVision Team, adapted by Mridankan Mandal for ease of training object detection models such as YOLOv11-OBB.
The dataset contains two classes: small-vehicle and large-vehicle. All annotations have been converted to the YOLOv11-OBB format, and the data is organized into training, validation, and testing sets.
This dataset improves upon the original by incorporating several key modifications that make it more accessible and useful for modern computer vision tasks.
The dataset is organized in a standard YOLO format for easy integration with popular training frameworks.
YOLOOBBVSAIDataset/
├── train/
│ ├── images/ #Contains 4,297 image files.
│ └── labels/ #Contains 4,297 .txt label files.
├── val/
│ ├── images/ #Contains 537 image files.
│ └── labels/ #Contains 537 .txt label files.
├── test/
│ ├── images/ #Contains 538 image files.
│ └── labels/ #Contains 538 .txt label files.
├── data.yaml #Dataset configuration file.
├── license.md #Full license details.
└── ReadMe.md #Dataset README file.
Each .txt label file contains one or more lines, with each line representing a single object in the YOLOv11-OBB format:
class_id x1 y1 x2 y2 x3 y3 x4 y4
- class_id: An integer representing the object class (0 for small-vehicle, 1 for large-vehicle).
- (x1, y1)...(x4, y4): The four corner points of the oriented bounding box, with coordinates normalized between 0 and 1.

data.yaml: To begin training a YOLO model with this dataset, you can use the provided data.yaml file. Simply update the path to the location of the dataset on your local machine.
#The path to the root dataset directory.
path: /path/to/YOLOOBBVSAIDataset/
train: train/images
val: val/images
test: test/images
#Number of classes.
nc: 2
#Class names.
names:
0: small-vehicle
1: large-vehicle
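Before training, it can help to sanity-check a few labels. Below is a minimal parsing sketch for the eight-point format described above; the function name, file path, and image size are illustrative assumptions, not part of the dataset.

```python
import numpy as np

def load_obb_labels(label_path, img_w, img_h):
    """Parse a YOLOv11-OBB label file into (class_id, corners) pairs in pixels."""
    objects = []
    with open(label_path) as f:
        for line in f:
            parts = line.split()
            if len(parts) != 9:
                continue  # skip malformed lines
            class_id = int(parts[0])
            # Eight normalized values -> four (x, y) corners in pixel units.
            corners = np.array(parts[1:], dtype=np.float32).reshape(4, 2)
            corners *= np.array([img_w, img_h], dtype=np.float32)
            objects.append((class_id, corners))
    return objects

# Example usage on a hypothetical training image:
# for cid, corners in load_obb_labels("train/labels/0001.txt", 1024, 1024):
#     print("small-vehicle" if cid == 0 else "large-vehicle", corners)
```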
This dataset is released under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) license.
When using this dataset, please attribute the original VSAI authors and the conversion. If you use it in your research, you can cite it with the following BibTeX entry:
@dataset{vsai_yolo_obb_2025,
title={VSAI Dataset (YOLOv11-OBB Format)},
author={Mridankan Mandal},
year={2025},
note={Modified from original VSAI v1 dataset by DroneVision},
license={CC BY-NC-SA 4.0}
}
License: Attribution-NonCommercial 4.0 (CC BY-NC 4.0), https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
This is a ready-to-use dataset consisting of X-ray images of the human jaw, with corresponding annotations for individual teeth. Each tooth is labeled using oriented bounding box (OBB) coordinates, making the dataset well-suited for tasks that require precise object localization and orientation awareness. There are a total of 17 classes representing teeth in the upper jaw.
The annotations are formatted specifically for compatibility with YOLO-OBB (Oriented Bounding Box) models, enabling seamless integration into training pipelines for dental detection and analysis tasks.
Reference: https://captain-whu.github.io/DOTA/dataset.html
In the past decade, significant progress in object detection has been made on natural images, but the authors of DOTA v2.0 (Dataset of Object deTection in Aerial images) note that this progress has not extended to aerial images. The main reason for this discrepancy is the substantial variation in object scale and orientation caused by the bird's-eye view of aerial images. One major obstacle to the development of object detection in aerial images (ODAI) is the lack of large-scale benchmark datasets. The DOTA dataset contains 1,793,658 object instances spanning 18 different categories, all annotated with oriented bounding boxes (OBB). These annotations were collected from a total of 11,268 aerial images. Using this extensive and meticulously annotated dataset, the authors establish baselines covering ten state-of-the-art algorithms, each with over 70 different configurations, evaluated for both speed and accuracy.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The cost and effort of modelling existing bridges from point clouds currently outweighs the perceived benefits of the resulting model. There is a pressing need to automate this process. Previous research has achieved the automatic generation of surface primitives combined with rule-based classification to create labelled cuboids and cylinders from point clouds. While these methods work well on synthetic datasets or in idealized cases, they encounter huge challenges when dealing with real-world bridge point clouds, which are often unevenly distributed and suffer from occlusions. In addition, real bridge geometries are complicated. In this paper, we propose a novel top-down method to tackle these challenges for detecting slab, pier, pier cap, and girder components in reinforced concrete bridges. This method uses a slicing algorithm to separate the deck assembly from pier assemblies. It then detects and segments pier caps using their surface normals, and girders using oriented bounding boxes and density histograms. Finally, our method merges over-segments into individually labelled point clusters. The results of 10 real-world bridge point cloud experiments indicate that our method achieves an average detection precision of 98.8%. This is the first method of its kind to achieve robust detection performance for the four component types in reinforced concrete bridges and to directly produce labelled point clusters. Our work provides a solid foundation for future work in generating rich Industry Foundation Classes models from the labelled point clusters.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is a project created to aid land-use classification of properties based on their street-facing facades. It is a dataset oriented toward bounding-box object detection, but the objective is to try semi-supervised techniques so as to use as few annotated image examples as possible.
License: Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0), https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Original Author: MVTec Software GmbH, July 2020.
The dataset contains 384 images of 13 different types of screws and nuts on a wooden background. The objects are labeled with oriented bounding boxes and their respective category. Overall, there are 4,426 such annotations. The instances were selected randomly, such that approximately 70% of the instances of each category are in the training split and 15% each in the validation and test splits.
License: Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0), https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
SemanticSugarBeets is a comprehensive dataset and framework designed for analyzing post-harvest and post-storage sugar beets using monocular RGB images. It supports three key tasks: instance segmentation to identify and delineate individual sugar beets, semantic segmentation to classify specific regions of each beet (e.g., damage, soil adhesion, vegetation, and rot) and oriented object detection to estimate the size and mass of beets using reference objects. The dataset includes 952 annotated images with 2,920 sugar-beet instances, captured both before and after storage. Accompanying the dataset is a demo application and processing code, available on GitHub. For more details, refer to the paper presented at the Agriculture-Vision Workshop at CVPR 2025.
The dataset supports three primary learning tasks, each designed to address specific aspects of sugar-beet analysis: instance segmentation, semantic segmentation, and oriented object detection.
The dataset is organized into the following directories:
File names of images and annotations follow this format:
ssb-
If you use the SemanticSugarBeets dataset or source code in your research, please cite the following paper to acknowledge the authors' contributions:
Croonen, G., Trondl, A., Simon, J., Steininger, D., 2025. SemanticSugarBeets: A Multi-Task Framework and Dataset for Inspecting Harvest and Storage Characteristics of Sugar Beets. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains 4,599 high-quality, annotated images of 25 commonly used chemistry lab apparatuses. The images, each showing apparatuses in real-world settings, have been captured from different angles, backgrounds, and distances, with variations in lighting to aid the robustness of object detection models. Every image has been labeled using bounding-box annotations in TXT (YOLO) format, with class IDs and normalized bounding-box coordinates, making object detection more precise. The annotations and bounding boxes were built using the Roboflow platform.

To support the learning procedure, the dataset has been split into three sub-datasets: training (70%), validation (20%), and testing (10%). In addition, all images are scaled to a standard 640x640 pixels and auto-oriented to rectify rotation discrepancies introduced by EXIF metadata. The dataset is structured in three main folders (train, valid, and test), each containing images/ and labels/ subfolders. Every image has a label file containing class and bounding-box data for each annotated object.

The whole dataset features 6,960 labeled instances across 25 apparatus categories, including beakers, conical flasks, measuring cylinders, and test tubes, among others. The dataset can be used to develop automation systems, real-time monitoring and tracking systems, safety-monitoring tools, and AI educational tools.
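As a quick illustration of the annotation format, the sketch below converts one label line from normalized YOLO TXT coordinates to a pixel-space box; the example class ID and values are made up for demonstration.

```python
def yolo_to_pixel_box(line, img_w, img_h):
    """Convert one YOLO TXT label line to (class_id, x_min, y_min, x_max, y_max)."""
    class_id, cx, cy, w, h = line.split()
    cx, cy = float(cx) * img_w, float(cy) * img_h
    w, h = float(w) * img_w, float(h) * img_h
    return int(class_id), cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2

# Example: class 3 centered in a 640x640 image, covering 20% x 10% of it.
print(yolo_to_pixel_box("3 0.5 0.5 0.2 0.1", 640, 640))
```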
License: other (https://choosealicense.com/licenses/other/)
Aria Everyday Objects Dataset
[Project Page] [Data Explorer] [Code] [Paper]
Aria Everyday Objects (AEO) is a small, challenging 3D object detection dataset for egocentric data. AEO consists of approximately 45 minutes of egocentric data across 25 sequences, captured by non-computer-vision experts in a diverse set of locations throughout the US. Oriented 3D bounding boxes have been annotated for each sequence. Annotation is done in 3D, using the camera calibration, SLAM… See the full description on the dataset page: https://huggingface.co/datasets/projectaria/aria-everyday-objects.
License: Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0), https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
A large-scale merged dataset for oriented vehicle detection in aerial imagery, preformatted for YOLOv11-OBB models.
This dataset combines three distinct aerial imagery collections (VSAI, DroneVehicles, and DIOR-R) into a unified resource for training and benchmarking oriented object detection models. It has been specifically preprocessed and formatted for use with Ultralytics' YOLOv11-OBB models.
The primary goal is to provide a detailed dataset for tasks like aerial surveillance, traffic monitoring, and vehicle detection from a drone's perspective. All annotations have been converted to the YOLO OBB format, and the classes have been simplified for focused vehicle detection tasks.
- Two classes: small-vehicle and large-vehicle.
- A data.yaml configuration file for immediate use in YOLO training pipelines.
- Predefined train, validation, and test sets.
- Simplified class scheme: the vehicle class from the DIOR-R dataset was mapped to large-vehicle.

| Class ID | Class Name | Source Dataset(s) |
|---|---|---|
| 0 | small-vehicle | VSAI, DroneVehicles |
| 1 | large-vehicle | VSAI, DroneVehicles, DIOR-R |
Each image has a corresponding .txt label file. Each line in the file represents one object in the YOLOv11-OBB format:
class_id x1 y1 x2 y2 x3 y3 x4 y4
- class_id: The class index (0 for small-vehicle, 1 for large-vehicle).
- (x1, y1)...(x4, y4): The four corner points of the oriented bounding box, with all coordinates normalized to a range of [0, 1].

The dataset is organized into a standard YOLO directory structure for easy integration with training programs.
RoadVehiclesYOLOOBBDataset/
├── train/
│ ├── images/ #18,274 images
│ └── labels/ #18,274 labels
├── val/
│ ├── images/ #5,420 images
│ └── labels/ #5,420 labels
├── test/
│ ├── images/ #5,431 images
│ └── labels/ #5,431 labels
├── data.yaml #YOLO dataset configuration file.
└── ReadMe.md #Documentation
To use this dataset with YOLOv11 or other compatible frameworks, simply point your training script to the included data.yaml file.
data.yaml:
#Dataset configuration.
path: RoadVehiclesYOLOOBBDataset/
train: train/images
val: val/images
test: test/images
#Number of classes.
nc: 2
#Class names.
names:
0: small-vehicle
1: large-vehicle
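For reference, a minimal training call with the Ultralytics API might look like the sketch below; the pretrained weights file and hyperparameters are illustrative choices, not part of the dataset.

```python
from ultralytics import YOLO

# Start from pretrained YOLO11 OBB weights (an illustrative choice).
model = YOLO("yolo11n-obb.pt")

# Point the trainer at the dataset's data.yaml; adjust the path, epochs,
# and image size for your own hardware and experiments.
model.train(data="RoadVehiclesYOLOOBBDataset/data.yaml", epochs=100, imgsz=640)
```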
This merged dataset is released under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0), which is the most restrictive license among its sources.
When using this dataset, please provide attribution to all original sources as follows:
- VSAI_Dataset: by DroneVision, licensed under CC BY-NC-SA 4.0.
- DroneVehicles Dataset: by Yiming Sun, Bing Cao, Pengfei Zhu, and Qinghua Hu, modified by Mridankan Mandal, licensed under CC BY-NC-SA 4.0.
- DIOR-R dataset: by the DIOR...
This dataset contains 8,085 groups of images across 10 categories. The collection scenes include streets, snack streets, shop entrances, corridors, community entrances, construction sites, etc. The data diversity spans multiple scenes, different time periods (day, night), and different photographic angles. Each image is annotated with rectangular bounding boxes for urban objects. This data can be used for tasks such as urban object detection, smart city management, public safety monitoring, and AI-driven city infrastructure analysis.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is an open-source synthetic dataset for computer vision object detection, focused on people holding knives in public and semi-public environments, viewed from CCTV and surveillance camera perspectives. It is designed to help train and evaluate YOLO, YOLOv8, YOLOWorld, Detectron, and other object detection models for threat recognition, security analytics, and abnormal behavior detection.
Key Features:
- Classes: person, knife
- Annotations: YOLO format (bounding boxes, normalized)
- Image Type: … See the full description on the dataset page: https://huggingface.co/datasets/Simuletic/cctv-knife-detection-dataset.