Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
RetinaNet is a dataset for object detection tasks - it contains Vehicle Bike annotations for 298 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
The model used in this paper is the RetinaNet object detector.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is the dataset and model used for Tiny Towns Scorer, a computer vision project completed as part of CS 4664: Data-Centric Computing Capstone at Virginia Tech. The goal of the project was to calculate player scores in the board game Tiny Towns.
The dataset consists of 226 images and associated annotations, intended for object detection. The images are photographs of players' game boards over the course of a game of Tiny Towns, as well as photos of individual game pieces taken after the game. Photos were taken using hand-held smartphones. Images are in JPG and PNG formats. The annotations are provided in TFRecord 1.0 and CVAT for Images 1.1 formats.
The weights for the trained RetinaNet-portion of the model are also provided.
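The TFRecord annotations can be read directly with TensorFlow. Below is a minimal sketch, assuming the records use TensorFlow Object Detection API-style feature keys and a hypothetical filename; inspect one record to confirm the actual keys used in this dataset.

```python
import tensorflow as tf

# Assumed TFOD-style feature keys; verify against the actual records.
feature_spec = {
    "image/encoded": tf.io.FixedLenFeature([], tf.string),
    "image/object/bbox/xmin": tf.io.VarLenFeature(tf.float32),
    "image/object/bbox/ymin": tf.io.VarLenFeature(tf.float32),
    "image/object/bbox/xmax": tf.io.VarLenFeature(tf.float32),
    "image/object/bbox/ymax": tf.io.VarLenFeature(tf.float32),
    "image/object/class/text": tf.io.VarLenFeature(tf.string),
}

def parse_example(serialized):
    example = tf.io.parse_single_example(serialized, feature_spec)
    image = tf.io.decode_image(example["image/encoded"])
    # Stack the sparse per-box coordinates into an [N, 4] tensor.
    boxes = tf.stack([
        tf.sparse.to_dense(example["image/object/bbox/" + k])
        for k in ("ymin", "xmin", "ymax", "xmax")
    ], axis=-1)
    return image, boxes

# Hypothetical path to one of the dataset's TFRecord files.
dataset = tf.data.TFRecordDataset("annotations.tfrecord").map(parse_example)
```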
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Retinanet Autos Nuevo is a dataset for object detection tasks - it contains Autos annotations for 482 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
The models used in this paper are the Faster R-CNN and RetinaNet object detectors.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Pothole Retinanet is a dataset for object detection tasks - it contains Potholes annotations for 7,223 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A model trained on the Vehicles Dataset using the TFOD (TensorFlow Object Detection) API.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
If you use this dataset, please cite this paper: Puertas, E.; De-Las-Heras, G.; Sánchez-Soriano, J.; Fernández-Andrés, J. Dataset: Variable Message Signal Annotated Images for Object Detection. Data 2022, 7, 41. https://doi.org/10.3390/data7040041
This dataset consists of Spanish road images taken from inside a vehicle, together with XML annotations in PASCAL VOC format that indicate the locations of Variable Message Signals (VMSs) within them. A CSV file is also attached with the geographic position of each image, the folder where it is located, and the sign text in Spanish. The dataset can be used to train supervised computer vision algorithms, such as convolutional neural networks. This work details the process followed to obtain the dataset (image acquisition and labeling) and its specifications. The dataset comprises 1,216 instances (888 positive and 328 negative) in 1,152 JPG images at a resolution of 1280x720 pixels, divided into 576 real images and 576 images created through data augmentation. The purpose of this dataset is to support road-focused computer vision research, since no dataset specific to VMSs previously existed.
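For reference, PASCAL VOC annotation files like those described above can be parsed with the Python standard library. A minimal sketch follows; the annotation filename is hypothetical, and the field names are the standard VOC ones.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation file from the dataset.
tree = ET.parse("image_0001.xml")
for obj in tree.getroot().iter("object"):
    name = obj.findtext("name")                     # e.g. the VMS class label
    box = obj.find("bndbox")
    xmin, ymin = int(box.findtext("xmin")), int(box.findtext("ymin"))
    xmax, ymax = int(box.findtext("xmax")), int(box.findtext("ymax"))
    print(name, (xmin, ymin, xmax, ymax))
```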
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is a meticulously curated dataset designed for infant facial emotion recognition, featuring four primary emotional expressions: Angry, Cry, Laugh, and Normal. The dataset aims to facilitate research in machine learning, deep learning, affective computing, and human-computer interaction by providing a large collection of labeled infant facial images.
Primary Data (1,600 images):
- Angry: 400
- Cry: 400
- Laugh: 400
- Normal: 400
Data Augmentation & Expanded Dataset (26,143 Images): To enhance the dataset's robustness and expand its size, 20 augmentation techniques (including HorizontalFlip, VerticalFlip, Rotate, ShiftScaleRotate, BrightnessContrast, GaussNoise, GaussianBlur, Sharpen, HueSaturationValue, CLAHE, GridDistortion, ElasticTransform, GammaCorrection, MotionBlur, ColorJitter, Emboss, Equalize, Posterize, FogEffect, and RainEffect) were applied randomly, resulting in a significantly larger dataset of 26,143 images.
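Most of the technique names above match transforms in the Albumentations library. Below is a minimal sketch of such a randomized pipeline, shown with a subset of the transforms and illustrative probabilities; the authors' exact library and parameters are not stated, so treat this as an assumption.

```python
import albumentations as A
import cv2

# A subset of the listed techniques, using Albumentations naming;
# probabilities and parameters here are illustrative only.
augment = A.Compose([
    A.HorizontalFlip(p=0.5),
    A.Rotate(limit=30, p=0.5),
    A.RandomBrightnessContrast(p=0.3),   # "BrightnessContrast" in the list above
    A.GaussNoise(p=0.2),
    A.CLAHE(p=0.2),
])

image = cv2.imread("infant_face.jpg")    # hypothetical input image
augmented = augment(image=image)["image"]
```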
Data Collection & Ethical Considerations: The dataset was collected under strict ethical guidelines to ensure compliance with privacy and data protection laws. Key ethical considerations include:
1. Ethical Approval: The study was reviewed and approved by the Institutional Review Board (IRB) of Daffodil International University under Reference No: REC-FSIT-2024-11-10.
2. Informed Parental Consent: Written consent was obtained from parents before capturing and utilizing infant facial images for research purposes.
3. Privacy Protection: No personally identifiable information (PII) is included in the dataset, and images are strictly used for research in AI-driven emotion recognition.
Data Collection Locations & Geographical Diversity: To ensure diversity in infant facial expressions, data collection was conducted across multiple locations in Bangladesh, covering healthcare centers and educational institutions.
Face Detection Methodology: To extract the facial regions efficiently, RetinaNet—a deep learning-based object detection model—was employed. The use of RetinaNet ensures precise facial cropping while minimizing background noise and occlusions.
Potential Applications:
1. Affective Computing: Understanding infant emotions for smart healthcare and early childhood development.
2. Computer Vision: Training deep learning models for automated infant facial expression recognition.
3. Pediatric & Mental Health Research: Assisting in early autism screening and emotion-aware AI for child psychology.
4. Human-Computer Interaction (HCI): Designing AI-powered assistive technologies for infants.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
RetinaNetv8 is a dataset for object detection tasks - it contains Vehicle annotations for 2,701 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
If you use this dataset, please cite this paper: Puertas, E.; De-Las-Heras, G.; Fernández-Andrés, J.; Sánchez-Soriano, J. Dataset: Roundabout Aerial Images for Vehicle Detection. Data 2022, 7, 47. https://doi.org/10.3390/data7040047
This publication presents a dataset of aerial images of Spanish roundabouts taken from a UAV, along with PASCAL VOC XML annotations that indicate the positions of vehicles within them. Additionally, a CSV file is attached containing information on the location and characteristics of the captured roundabouts. This work details the process followed to obtain the data: image capture, processing, and labeling. The dataset consists of 985,260 total instances: 947,400 cars, 19,596 cycles, 9,048 trucks, 7,008 buses, and 2,208 empty roundabouts, in 61,896 JPG images at 1920x1080 px. These are divided into 15,474 images extracted from 8 roundabouts with different traffic flows and 46,422 images created using data augmentation techniques. The purpose of this dataset is to support computer vision research on roads, as such labeled images are not abundant. It can be used to train supervised learning models, such as convolutional neural networks, which are very popular in object detection.
| Roundabout (scenes) | Frames | Car | Truck | Cycle | Bus | Empty |
|---|---|---|---|---|---|---|
| 1 (00001) | 1,996 | 34,558 | 0 | 4,229 | 0 | 0 |
| 2 (00002) | 514 | 743 | 0 | 0 | 0 | 157 |
| 3 (00003-00017) | 1,795 | 4,822 | 58 | 0 | 0 | 0 |
| 4 (00018-00033) | 1,027 | 6,615 | 0 | 0 | 0 | 0 |
| 5 (00034-00049) | 1,261 | 2,248 | 0 | 550 | 0 | 81 |
| 6 (00050-00052) | 5,501 | 180,342 | 1,420 | 120 | 1,376 | 0 |
| 7 (00053) | 2,036 | 5,789 | 562 | 0 | 226 | 92 |
| 8 (00054) | 1,344 | 1,733 | 222 | 0 | 150 | 222 |
| Total | 15,474 | 236,850 | 2,262 | 4,899 | 1,752 | 552 |
| Data augmentation | x4 | x4 | x4 | x4 | x4 | x4 |
| Total | 61,896 | 947,400 | 9,048 | 19,596 | 7,008 | 2,208 |
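As a sanity check on the table, every figure in the bottom row is exactly 4x its pre-augmentation counterpart (each original frame plus three augmented variants), and the augmented class counts sum to the 985,260 instances quoted above. A short verification sketch:

```python
# Pre- and post-augmentation totals copied from the table above.
pre  = {"frames": 15_474, "car": 236_850, "truck": 2_262,
        "cycle": 4_899, "bus": 1_752, "empty": 552}
post = {"frames": 61_896, "car": 947_400, "truck": 9_048,
        "cycle": 19_596, "bus": 7_008, "empty": 2_208}

# Every augmented total is 4x the original (x4 data augmentation row).
assert all(post[k] == 4 * pre[k] for k in pre)

# The augmented instance counts sum to the stated 985,260 total.
assert sum(v for k, v in post.items() if k != "frames") == 985_260
```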
This project consists of two datasets, both of aerial images and videos of dolphins taken by drones. The data was captured at a few locations along the coastlines of Italy and Israel.
The aim of the project is to examine automated dolphin detection and tracking from aerial surveys.
The project description, details and results are presented in the paper (link to the paper).
Each dataset was organized and prepared for a different phase of the project, and each is located in a separate zip file:
1. Detection - Detection.zip
2. Tracking - Tracking.zip
Further information about the datasets' content and annotation format is below.
* To view each file's content, use the preview option; a description also appears later in this section.
Detection Dataset
This dataset contains 1,125 aerial images; a single image can contain several dolphins.
The detection phase of the project uses RetinaNet, a supervised deep-learning algorithm, via the Keras RetinaNet implementation. The data was therefore divided into three parts, Train, Validation, and Test, split 70%, 15%, and 15%, respectively.
The annotation format follows the format required by that implementation (Keras RetinaNet). Each object (a dolphin) is annotated with bounding-box coordinates and a class. For this project the dolphins were not distinguished by species, so each object is annotated as a bounding box classified as 'Dolphin'.
*The annotation format is detailed in the Annotations format section.
Detection zip file content:
Detection
|——————train_set (images)
|——————train_set.csv
|——————validation_set (images)
|——————validation_set.csv
|——————test_set (images)
|——————test_set.csv
└——————class_mapping.csv
Tracking
This dataset contains 5 short videos (10-30 seconds each), trimmed from longer aerial videos captured from a drone.
The tracking phase of the project is done using two metrics.
Both metrics require the videos' frame sequences as input, so the frames of each video were extracted. The first frame was annotated manually for initialization, and the algorithms track from there. As in the Detection dataset, each frame can include several objects (dolphins).
For annotation consistency, the frame sequences were annotated in the same way as the Detection dataset above (details can be found in the Annotations format section). Each video's frames are annotated separately, so the Tracking zip file contains a folder for each video (5 folders in total), named after the video's file name.
The contents of each video folder are shown in the tree below.
The examined videos' descriptions and details are provided in the 'Videos Description.xlsx' file. Use the preview option to display its content.
Tracking zip file content:
Tracking
|——————DJI_0195_trim_0015_0045
| └——————frames (images)
| └——————annotations_DJI_0195_trim_0015_0045.csv
| └——————class_mapping_DJI_0195_trim_0015_0045.csv
| └——————DJI_0195_trim_0015_0045.MP4
|——————DJI_0395_trim_0010_0025
| └——————frames (images)
| └——————annotations_DJI_0395_trim_0010_0025.csv
| └——————class_mapping_DJI_0395_trim_0010_0025.csv
| └——————DJI_0395_trim_0010_0025.MP4
|——————DJI_0395_trim_00140_00150
| └——————frames (images)
| └——————annotations_DJI_0395_trim_00140_00150.csv
| └——————class_mapping_DJI_0395_trim_00140_00150.csv
| └——————DJI_0395_trim_00140_00150.MP4
|——————DJI_0395_trim_0055_0085
| └——————frames (images)
| └——————annotations_DJI_0395_trim_0055_0085.csv
| └——————class_mapping_DJI_0395_trim_0055_0085.csv
| └——————DJI_0395_trim_0055_0085.MP4
└——————HighToLow_trim_0045_0070
└—————frames (images)
└—————annotations_HighToLow_trim_0045_0070.csv
└—————class_mapping_HighToLow_trim_0045_0070.csv
└—————HighToLow_trim_0045_0070.MP4
Annotations format
Both datasets share the annotation format described below, which follows the format required by the Keras RetinaNet implementation used for training in the dolphin detection phase of the project.
Each object (dolphin) is annotated by the left-top and right-bottom coordinates of its bounding box, plus a class. Each image or frame can include several objects. All data was annotated using the Labelbox application.
For each subset (Train, Validation, and Test of the Detection dataset, and each video of the Tracking dataset) there are two corresponding CSV files:
Each line in the Annotations CSV file contains an annotation (bounding box) in an image or frame.
The format of each line of the CSV annotation is:
path/to/image.jpg,x1,y1,x2,y2,class_name
An example from `train_set.csv`:
.\train_set\1146_20170730101_ce1_sc_GOPR3047 103.jpg,506,644,599,681,Dolphin
.\train_set\1146_20170730101_ce1_sc_GOPR3047 103.jpg,394,754,466,826,Dolphin
.\train_set\1147_20170730101_ce1_sc_GOPR3047 104.jpg,613,699,682,781,Dolphin
.\train_set\1147_20170730101_ce1_sc_GOPR3047 104.jpg,528,354,586,443,Dolphin
.\train_set\1147_20170730101_ce1_sc_GOPR3047 104.jpg,633,250,723,307,Dolphin
This defines a dataset with 2 images: the first with two dolphin annotations and the second with three.
Each line in the Class Mapping CSV file contains a mapping:
class_name,id
An example:
Dolphin,0
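For convenience, here is a minimal sketch for loading these two CSV files in Python; the filenames match the Detection zip contents listed above.

```python
import csv

# Map class names to integer ids, e.g. {"Dolphin": 0}.
with open("class_mapping.csv") as f:
    class_map = {name: int(idx) for name, idx in csv.reader(f)}

# Group bounding boxes by image path; one CSV row per annotation.
boxes_per_image = {}
with open("train_set.csv") as f:
    for path, x1, y1, x2, y2, cls in csv.reader(f):
        boxes_per_image.setdefault(path, []).append(
            (int(x1), int(y1), int(x2), int(y2), class_map[cls]))

print(len(boxes_per_image), "annotated images")
```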
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
## Overview
Object_detection_retinanet is a dataset for object detection tasks - it contains Dont annotations for 425 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [MIT license](https://opensource.org/licenses/MIT).
This dataset was created by Claudio Bruzzone
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Accurate and dependable weed detection technology is a prerequisite for weed control robots to perform autonomous weeding. Due to the complexity of the farmland environment and the resemblance between crops and weeds, detecting weeds in the field under natural conditions is a difficult task. Compared to conventional weed detection methods, existing deep learning-based approaches often suffer from issues such as monotonous detection scenes, a lack of image samples and location information for detected items, and low detection accuracy. To address these issues, WeedNet-R, a vision-based network for weed identification and localization in sugar beet fields, is proposed. WeedNet-R adds several context modules to RetinaNet's neck in order to combine context information from multiple feature maps and thereby expand the effective receptive field of the entire network. During model training, a learning-rate adjustment method combining an untuned exponential warmup schedule with cosine annealing is used. As a result, the proposed method is more accurate at weed detection without a considerable increase in model parameters. WeedNet-R was trained and assessed on the OD-SugarBeets dataset, which extends the publicly available agricultural dataset SugarBeet2016 with manually added bounding-box labels. Compared to the original RetinaNet, the mAP of the proposed WeedNet-R on the weed detection task in sugar beet fields increased by 4.65% to 92.30%. WeedNet-R's average precision for weed and sugar beet is 85.70% and 98.89%, respectively. WeedNet-R outperforms other sophisticated object detection algorithms in detection accuracy while matching other single-stage detectors in detection speed.
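For illustration, here is a sketch of the learning-rate policy named above (exponential warmup followed by cosine annealing) in PyTorch. The warmup constant, step counts, base learning rate, and the stand-in model are assumptions for the sketch, not the authors' settings.

```python
import math
import torch

model = torch.nn.Linear(10, 2)                    # stand-in for WeedNet-R
opt = torch.optim.SGD(model.parameters(), lr=0.01)
warmup_steps, total_steps = 500, 10_000           # illustrative values

def lr_factor(step):
    if step < warmup_steps:
        # Exponential warmup: ramps smoothly from 0 toward 1.
        return 1.0 - math.exp(-5.0 * step / warmup_steps)
    # Cosine annealing from the base LR down to 0 over the remaining steps.
    t = min((step - warmup_steps) / (total_steps - warmup_steps), 1.0)
    return 0.5 * (1.0 + math.cos(math.pi * t))

sched = torch.optim.lr_scheduler.LambdaLR(opt, lr_factor)
```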
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
If you use the Tryp dataset, please cite the following: Esla Timothy Anzaku, Mohammed Aliy Mohammed, Utku Ozbulak, Jongbum Won, Hyesoo Hong, Janarthanan Krishnamoorthy, Sofie Van Hoecke, Stefan Magez, Arnout Van Messem, Wesley De Neve. "Tryp: a dataset of microscopy images of unstained thick blood smears for trypanosome detection." Scientific Data 10, 716 (2023). https://doi.org/10.1038/s41597-023-02608-y
The Tryp dataset provides bounding box annotations for detecting Trypanosoma brucei brucei in microscopy images of unstained thick blood smears. Extracting the Tryp.zip file reveals three main folders:
- positive_images
- negative_images
- videos
The videos folder holds all the originally recorded videos, which were used to extract the images in the Tryp dataset and are categorized into positive and negative folders. Inside the positive_images folder are three more folders:
- train
- validation
- test
Each of these contains two more folders, images and labels, and a JSON file. The images and labels folders hold the corresponding images and annotation files compatible with the YOLOv7 model. The JSON files hold annotations in the MS COCO format, which is suitable for training the Faster R-CNN and RetinaNet models using the Torchvision implementation.
The related code is available at https://github.com/esla/trypanosome_parasite_detection
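The COCO-format JSON annotations can be loaded with Torchvision's CocoDetection dataset, as in the minimal sketch below. The folder layout follows the description above, but the exact JSON filename inside Tryp.zip is an assumption and should be checked.

```python
from torchvision.datasets import CocoDetection

# Paths follow the described layout; the JSON filename is hypothetical.
train = CocoDetection(
    root="Tryp/positive_images/train/images",
    annFile="Tryp/positive_images/train/annotations.json",
)

image, targets = train[0]   # a PIL image and a list of COCO annotation dicts
print(len(train), "training images;", len(targets), "boxes in the first one")
```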
This dataset was created by Feanor007
Released under Data files © Original Authors
It contains the following files:
This collection contains the trained models and object detection results of 2 architectures found in the Detectron2 library, on the MS COCO val2017 dataset, under different JPEG compression levels Q = {5, 12, 19, 26, 33, 40, 47, 54, 61, 68, 75, 82, 89, 96} (14 levels per trained model).
Architectures:
- F50 – Faster R-CNN on ResNet-50 with FPN
- R50 – RetinaNet on ResNet-50 with FPN
Training type:
- D2 – Detectron2 Model ZOO pre-trained 1x model (90,000 iterations, batch 16)
- STD – standard 1x training (90,000 iterations) on the original train2017 dataset
- Q20 – 1x training (90,000 iterations) on train2017 degraded to Q=20
- Q40 – 1x training (90,000 iterations) on train2017 degraded to Q=40
- T20 – extra 1x training on top of D2 on train2017 degraded to Q=20
- T40 – extra 1x training on top of D2 on train2017 degraded to Q=40
Model and metrics files:
- models_FasterRCNN.tar.gz (F50-STD, F50-Q20, …)
- models_RetinaNet.tar.gz (R50-STD, R50-Q20, …)
For every model there are 3 files:
- config.yaml – the Detectron2 config of the model
- model_final.pth – the weights (training snapshot) in PyTorch format
- metrics.json – training metrics (like time, total loss, etc.) every 20 iterations
The D2 models were not included, because they are available from the Detectron2 Model ZOO, as faster_rcnn_R_50_FPN_1x (F50-D2) and retinanet_R_50_FPN_1x (R50-D2).
Result files:
- F50-results.tar.gz – results for Faster R-CNN models (including D2)
- R50-results.tar.gz – results for RetinaNet models (including D2)
For every model there are 14 subdirectories, e.g. evaluator_dump_R50x1_005 through evaluator_dump_R50x1_096, one for each of the JPEG Q values. Each such folder contains:
- coco_instances_results.json – all detected objects (image id, bounding box, class index, and confidence)
- results.json – AP metrics as computed by the COCO API
The data can be processed using our code, published at: https://github.com/tgandor/urban_oculus. Additional dependencies for the source code: COCO API, Detectron2.
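A trained snapshot can be restored with the standard Detectron2 config/weights pair described above. Below is a minimal sketch; the directory layout and test image are illustrative, not part of the archives' documented structure.

```python
import cv2
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor

# Hypothetical paths to one extracted model (config.yaml + model_final.pth).
cfg = get_cfg()
cfg.merge_from_file("models/R50-STD/config.yaml")
cfg.MODEL.WEIGHTS = "models/R50-STD/model_final.pth"

predictor = DefaultPredictor(cfg)
outputs = predictor(cv2.imread("val2017_example.jpg"))  # hypothetical image
print(outputs["instances"].pred_boxes)                  # detected boxes
```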
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Performance evaluation metrics for detection models based on confidence thresholds from internal data.
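As an illustration of this kind of threshold-based evaluation, the generic sketch below computes precision and recall at several confidence thresholds. The scores and match flags are synthetic stand-ins; the internal data itself is not published here.

```python
import numpy as np

scores  = np.array([0.95, 0.90, 0.80, 0.60, 0.55, 0.30])  # detection confidences
matched = np.array([1,    1,    0,    1,    0,    1])      # 1 = matches a ground-truth box
n_gt = 5                                                    # total ground-truth objects

for t in (0.25, 0.5, 0.75):
    keep = scores >= t                       # detections surviving the threshold
    tp = matched[keep].sum()                 # true positives at this threshold
    precision = tp / max(keep.sum(), 1)
    recall = tp / n_gt
    print(f"threshold {t:.2f}: precision {precision:.2f}, recall {recall:.2f}")
```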