This dataset is formatted in the PASCAL VOC format, a widely-used structure for object detection tasks. It includes high-quality images, their corresponding bounding box annotations, and predefined splits for training, validation, and testing.
Structure:
VOC/
├── Annotations # XML files with bounding box and class labels
├── JPEGImages # All images in .jpg format
├── ImageSets
│ └── Main # Contains train.txt, val.txt, and test.txt
Content:
* .jpg files stored in JPEGImages.
* .xml files in Annotations containing bounding box coordinates and object class names for each image.
* train.txt, val.txt, and test.txt in ImageSets/Main specify which images belong to each split for training, validation, and testing.
Organized Dataset:
* Standardized XML Files: the <path> tag in each XML file was updated to reflect the correct relative path (e.g., JPEGImages/image1.jpg) or removed if unnecessary; the <folder> tag is standardized (e.g., set to VOC) or removed for compatibility.
* Created Train-Val-Test Splits: train.txt, val.txt, and test.txt files in the ImageSets/Main directory.
* Validated Class Distribution:
This dataset contains the following object classes:
| Class Name (English) | Class Name (Chinese) | Abbreviation |
|---|---|---|
| Manhole Cover | 井盖 (jg) | jg |
| Crossing Light | 人行灯 (rxd) | rxd |
| Pipeline Indicating Pile | 地下管线桩 (dxgx) | dxgx |
| Traffic Signs | 指示牌 (zsp) | zsp |
| Hydrant | 消防栓 (xfs) | xfs |
| Camera | 电子眼 (dzy) | dzy |
| Traffic Light | 红绿灯 (lhd) | lhd |
| Guidepost | 街道路名牌 (jdp) | jdp |
| Traffic Warning Sign | 警示牌 (jsp) | jsp |
| Streetlamp | 路灯 (ld) | ld |
| Communication Box | 通讯箱 (txx) | txx |
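As a convenience, the following minimal sketch (not shipped with the dataset) shows how the splits and annotations described above can be read with Python's standard library; the paths follow the layout shown earlier, and the image IDs come from the split files themselves.

```python
# Minimal sketch: parse one PASCAL VOC annotation and list the images of a split.
import xml.etree.ElementTree as ET
from pathlib import Path

VOC_ROOT = Path("VOC")

def parse_annotation(xml_path):
    """Return (filename, [(class_name, xmin, ymin, xmax, ymax), ...])."""
    root = ET.parse(xml_path).getroot()
    filename = root.findtext("filename")
    boxes = []
    for obj in root.iter("object"):
        name = obj.findtext("name")  # e.g. "jg", "rxd", ...
        bb = obj.find("bndbox")
        boxes.append((name,
                      int(float(bb.findtext("xmin"))),
                      int(float(bb.findtext("ymin"))),
                      int(float(bb.findtext("xmax"))),
                      int(float(bb.findtext("ymax")))))
    return filename, boxes

def read_split(split):
    """Read ImageSets/Main/{train,val,test}.txt into a list of image IDs."""
    with open(VOC_ROOT / "ImageSets" / "Main" / f"{split}.txt") as f:
        return [line.strip() for line in f if line.strip()]

if __name__ == "__main__":
    for image_id in read_split("train")[:5]:
        fname, boxes = parse_annotation(VOC_ROOT / "Annotations" / f"{image_id}.xml")
        print(fname, boxes)
```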
This dataset contains different animal categories such as: buffalo, capybara, cat, cow, deer, dog, elephant, flamingo, giraffe, jaguar, kangaroo, lion, parrot, penguin, rhino, sheep, tiger, turtle and zebra.
Most of the images can be found in existing datasets: https://github.com/freds0/capybara_dataset https://universe.roboflow.com/miguel-narbot-usp-br/capybara-and-animals/dataset/1 https://www.kaggle.com/datasets/hugozanini1/kangaroodataset?resource=download https://github.com/experiencor/kangaroo https://universe.roboflow.com/z-jeans-pig/kangaroo-epscj/dataset/1 https://cvwc2019.github.io/challenge.html# https://www.kaggle.com/datasets/biancaferreira/african-wildlife https://universe.roboflow.com/new-workspace-5kofa/elephant-dataset/dataset/6 https://universe.roboflow.com/nathanael-hutama-harsono/large-cat/dataset/1/images/?split=train https://universe.roboflow.com/giraffs-and-cows/giraffes-and-cows/dataset/1 https://universe.roboflow.com/turtledetector/turtledetector/dataset/2 https://www.kaggle.com/datasets/smaranjitghose/sea-turtle-face-detection https://universe.roboflow.com/fadilyounes-me-gmail-com/zebra---savanna/dataset/1 https://universe.roboflow.com/test-qeryf/yolov5-9snhq https://universe.roboflow.com/or-the-king/two-zebras https://universe.roboflow.com/wild-animals-datasets/zebra-images/dataset/2 https://universe.roboflow.com/zebras/zebras/dataset/2 https://universe.roboflow.com/v2-rabotaem-xkxra/zebras_v2/dataset/5 https://universe.roboflow.com/vijay-vikas-mangena/animal_od_test1/dataset/1 https://universe.roboflow.com/bdoma13-gmail-com/rhino_horn/dataset/7 https://universe.roboflow.com/rudtkd134-naver-com/finalproject2/dataset/2 https://universe.roboflow.com/the-super-nekita/cats-brofl/dataset/2 https://universe.roboflow.com/lihi-gur-arie/pinguin-object-detection/dataset/2 https://universe.roboflow.com/utas-377cc/penguindataset-4dujc/dataset/10 https://universe.roboflow.com/new-workspace-tdyir/penguin-clfnj/dataset/1 https://universe.roboflow.com/utas-wd4sd/kit315_assignment/dataset/7 https://universe.roboflow.com/jeonjuuniv/deer-hqp4i/dataset/1 https://universe.roboflow.com/new-workspace-hqowp/sheeps/dataset/1 https://universe.roboflow.com/ali-eren-altindag/sheepstest2/dataset/1 https://universe.roboflow.com/yaser/sheep-0gudu/dataset/3 https://universe.roboflow.com/ali-eren-altindag/mixed_sheep/dataset/1 https://universe.roboflow.com/pkm-kc-2022/sapi-birahi/dataset/2 https://universe.roboflow.com/ghostikgh/team1_cows/dataset/5 https://universe.roboflow.com/ml-dlq4x/liontrain/dataset/2 https://universe.roboflow.com/animals/lionnew/dataset/2 https://universe.roboflow.com/parrottrening/parrot_trening/dataset/1 https://universe.roboflow.com/uet-hi8bg/parrots-r4tfl/dataset/1 https://universe.roboflow.com/superweight/parrot_poop/dataset/5 https://www.kaggle.com/datasets/tarunbisht11/intruder-detection
From those datasets, the images have been filtered (objects smaller than 32 px were deleted, images with a dimension smaller than 320 px were deleted, and images and labeled objects were renamed). The remaining images have been labeled by me.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
If you use this dataset, please cite this paper: Puertas, E.; De-Las-Heras, G.; Sánchez-Soriano, J.; Fernández-Andrés, J. Dataset: Variable Message Signal Annotated Images for Object Detection. Data 2022, 7, 41. https://doi.org/10.3390/data7040041
This dataset consists of Spanish road images taken from inside a vehicle, as well as annotations in XML files in PASCAL VOC format that indicate the location of Variable Message Signals within them. Also, a CSV file is attached with information regarding the geographic position, the folder where the image is located, and the text in Spanish. This can be used to train supervised learning computer vision algorithms, such as convolutional neural networks. Throughout this work, the process followed to obtain the dataset, image acquisition, and labeling, and its specifications are detailed. The dataset is constituted of 1216 instances, 888 positives, and 328 negatives, in 1152 jpg images with a resolution of 1280x720 pixels. These are divided into 576 real images and 576 images created from the data-augmentation technique. The purpose of this dataset is to help in road computer vision research since there is not one specifically for VMSs.
The folder structure of the dataset is as follows:
In which:
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset consists of CT and PET-CT DICOM images of lung cancer subjects with XML Annotation files that indicate tumor location with bounding boxes. The images were retrospectively acquired from patients with suspicion of lung cancer, and who underwent standard-of-care lung biopsy and PET/CT. Subjects were grouped according to a tissue histopathological diagnosis. Patients with Names/IDs containing the letter 'A' were diagnosed with Adenocarcinoma, 'B' with Small Cell Carcinoma, 'E' with Large Cell Carcinoma, and 'G' with Squamous Cell Carcinoma.
The images were analyzed on the mediastinum (window width, 350 HU; level, 40 HU) and lung (window width, 1,400 HU; level, –700 HU) settings. The reconstructions were made at a 2 mm slice thickness in lung settings. The CT slice interval varies from 0.625 mm to 5 mm. Scanning mode includes plain, contrast and 3D reconstruction.
Before the examination, the patient underwent fasting for at least 6 hours, and the blood glucose of each patient was less than 11 mmol/L. Whole-body emission scans were acquired 60 minutes after the intravenous injection of 18F-FDG (4.44MBq/kg, 0.12mCi/kg), with patients in the supine position in the PET scanner. FDG doses and uptake times were 168.72-468.79MBq (295.8±64.8MBq) and 27-171min (70.4±24.9 minutes), respectively. 18F-FDG with a radiochemical purity of 95% was provided. Patients were allowed to breathe normally during PET and CT acquisitions. Attenuation correction of PET images was performed using CT data with the hybrid segmentation method. Attenuation corrections were performed using a CT protocol (180mAs,120kV,1.0pitch). Each study comprised one CT volume, one PET volume and fused PET and CT images: the CT resolution was 512 × 512 pixels at 1mm × 1mm, the PET resolution was 200 × 200 pixels at 4.07mm × 4.07mm, with a slice thickness and an interslice distance of 1mm. Both volumes were reconstructed with the same number of slices. Three-dimensional (3D) emission and transmission scanning were acquired from the base of the skull to mid femur. The PET images were reconstructed via the TrueX TOF method with a slice thickness of 1mm.
The location of each tumor was annotated by five academic thoracic radiologists with expertise in lung cancer to make this dataset a useful tool and resource for developing algorithms for medical diagnosis. Two of the radiologists had more than 15 years of experience and the others had more than 5 years of experience. After one of the radiologists labeled each subject, the other four radiologists performed a verification, resulting in all five radiologists reviewing each annotation file in the dataset. Annotations were captured using LabelImg. The image annotations are saved as XML files in PASCAL VOC format, which can be parsed using the PASCAL Development Toolkit: https://pypi.org/project/pascal-voc-tools/. Python code to visualize the annotation boxes on top of the DICOM images can be downloaded here.
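The official visualization code is the one linked above; purely as an illustration, a minimal overlay along those lines could look like the sketch below, assuming pydicom and matplotlib and using placeholder file names.

```python
# Hedged sketch (not the official visualization code referenced above): overlay
# PASCAL VOC boxes from an annotation XML onto a DICOM slice using pydicom.
import xml.etree.ElementTree as ET
import matplotlib.pyplot as plt
import matplotlib.patches as patches
import pydicom

dicom_path = "slice_001.dcm"  # placeholder file name
xml_path = "slice_001.xml"    # matching VOC annotation (placeholder)

slice_img = pydicom.dcmread(dicom_path).pixel_array
fig, ax = plt.subplots()
ax.imshow(slice_img, cmap="gray")

for obj in ET.parse(xml_path).getroot().iter("object"):
    bb = obj.find("bndbox")
    xmin, ymin = float(bb.findtext("xmin")), float(bb.findtext("ymin"))
    xmax, ymax = float(bb.findtext("xmax")), float(bb.findtext("ymax"))
    ax.add_patch(patches.Rectangle((xmin, ymin), xmax - xmin, ymax - ymin,
                                   fill=False, edgecolor="red", linewidth=1))
    ax.text(xmin, ymin - 2, obj.findtext("name"), color="red", fontsize=8)

plt.show()
```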
Two deep learning researchers used the images and the corresponding annotation files to train several well-known detection models, which resulted in a mean average precision (mAP) of around 0.87 on the validation set.
Dataset link: https://wiki.cancerimagingarchive.net/pages/viewpage.action?pageId=70224216
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
If you use this dataset, please cite this paper: Puertas, E.; De-Las-Heras, G.; Fernández-Andrés, J.; Sánchez-Soriano, J. Dataset: Roundabout Aerial Images for Vehicle Detection. Data 2022, 7, 47. https://doi.org/10.3390/data7040047
This publication presents a dataset of aerial images of Spanish roundabouts taken from a UAV, along with annotations in PASCAL VOC XML files that indicate the position of vehicles within them. Additionally, a CSV file is attached containing information related to the location and characteristics of the captured roundabouts. This work details the process followed to obtain them: image capture, processing, and labeling. The dataset consists of 985,260 total instances: 947,400 cars, 19,596 cycles, 9,048 trucks, 7,008 buses and 2,208 empty roundabouts, in 61,896 1920x1080 px JPG images. These are divided into 15,474 images extracted from 8 roundabouts with different traffic flows and 46,422 images created using data augmentation techniques. The purpose of this dataset is to help research on computer vision on the road, as such labeled images are not abundant. It can be used to train supervised learning models, such as convolutional neural networks, which are very popular in object detection.
| Roundabout (scenes) | Frames | Car | Truck | Cycle | Bus | Empty |
|---|---|---|---|---|---|---|
| 1 (00001) | 1,996 | 34,558 | 0 | 4,229 | 0 | 0 |
| 2 (00002) | 514 | 743 | 0 | 0 | 0 | 157 |
| 3 (00003-00017) | 1,795 | 4,822 | 58 | 0 | 0 | 0 |
| 4 (00018-00033) | 1,027 | 6,615 | 0 | 0 | 0 | 0 |
| 5 (00034-00049) | 1,261 | 2,248 | 0 | 550 | 0 | 81 |
| 6 (00050-00052) | 5,501 | 180,342 | 1,420 | 120 | 1,376 | 0 |
| 7 (00053) | 2,036 | 5,789 | 562 | 0 | 226 | 92 |
| 8 (00054) | 1,344 | 1,733 | 222 | 0 | 150 | 222 |
| Total | 15,474 | 236,850 | 2,262 | 4,899 | 1,752 | 552 |
| Data augmentation | x4 | x4 | x4 | x4 | x4 | x4 |
| Total | 61,896 | 947,400 | 9,048 | 19,596 | 7,008 | 2,208 |
Just a simple dataset to demonstrate object detection and classification in the retail environment, preferably using computer vision.
This dataset contains resized images which have been annotated using LabelImg. These resized images are found in the directory 'ResizedImages', with corresponding XML annotations in the Pascal VOC format. I used a YOLOv3 model with this data. As of November 13, 2020, only three categories of products exist: 'can', 'shampoo', and 'spice'. Images vary in the number of objects, with some images sporting only one object of one class, others sporting multiple objects of the same class, and lastly, some sporting multiple objects of different classes.
The inspiration for this dataset was the need for a submission to the FLAIRS conference.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
A large dataset containing aerial images from fields in Juchowo, Poland, and Wageningen, the Netherlands, with cows in the images annotated using the Pascal VOC XML annotation format. This dataset has been used to train various deep learning models (Nanonets, YOLOv3 and the like) as part of the GenTORE project (https://www.gentore.eu). Please download all the files, then use 7-Zip to unzip the multi-part archive.
Data abstract: This Zenodo upload contains the Turkish Pedestrian Dataset (TURPED) for benchmarking and developing pedestrian detection methods for autonomous driving assistance systems. There are three folders named "Annotations", "Image Sets" and "JPEGImages". The Annotations folder includes the pedestrian labels for each image in XML format. The standard Pascal VOC XML annotation format is chosen for ease of use. The TXT files in the Image Sets folder describe which images are in the training/validation/test sets. Finally, the images can be found in the JPEGImages folder in JPEG format.
Attribution-NonCommercial 3.0 (CC BY-NC 3.0) https://creativecommons.org/licenses/by-nc/3.0/
License information was derived automatically
The dataset comprises edited recordings of Mexican Sign Language dactylology (29 signs) and the first ten numbers (1 to 10), including static and continuous signs respectively, from person 1 to person 5. The edited recordings are organized for easy access and management. Edited videos and screenshots of static signs are labeled in their file names with the corresponding sign language representations and stored in a consistent order per person, recording cycle, and hand. Static sign images can also be exported in PASCAL VOC format with XML annotations. The dataset is designed to facilitate feature extraction and further analysis in Mexican sign language recognition research.
https://creativecommons.org/publicdomain/zero/1.0/
This dataset contains 313 images of Iranian vehicle plates at 224x224 resolution. The annotations are for the 224x224 images and are in PASCAL VOC XML format.
The original images are 1280x1280 and do not have annotations.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Personal Protective Equipment Dataset (PPED)
This dataset serves as a benchmark for PPE detection in chemical plants. We provide both the dataset and experimental results.
We produced a data set based on the actual needs and relevant regulations in chemical plants. The standard GB 39800.1-2020 formulated by the Ministry of Emergency Management of the People’s Republic of China defines the protective requirements for plants and chemical laboratories. The complete dataset is contained in the folder PPED/data.
1.1. Image collection
We took more than 3,300 pictures, varying the following characteristics: environment, distance, lighting conditions, angle, and the number of people photographed.
Backgrounds: There are 4 backgrounds, including office, near machines, factory and regular outdoor scenes.
Scale: By taking pictures from different distances, the captured PPEs are classified in small, medium and large scales.
Light: Good lighting conditions and poor lighting conditions were studied.
Diversity: Some images contain a single person, and some contain multiple people.
Angle: The pictures we took can be divided into front and side.
In total, more than 3,300 raw photos were taken under all these conditions. All images are located in the folder “PPED/data/JPEGImages”.
1.2. Label
We use LabelImg as the labeling tool, with the PASCAL-VOC labeling format. YOLO uses the TXT format; trans_voc2yolo.py can be used to convert the XML files in PASCAL-VOC format to TXT files. Annotations are stored in the folder PPED/data/Annotations.
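For context, the conversion such a script performs boils down to rewriting each box from corner coordinates into a class index plus normalized center/size values. The sketch below illustrates that core step; it is not the bundled trans_voc2yolo.py, and the class list is a placeholder.

```python
# Sketch of the coordinate conversion a VOC-to-YOLO script performs
# (not the bundled trans_voc2yolo.py).
import xml.etree.ElementTree as ET

CLASSES = ["helmet", "vest"]  # placeholder class list; substitute the dataset's classes

def voc_xml_to_yolo_lines(xml_path):
    root = ET.parse(xml_path).getroot()
    size = root.find("size")
    w, h = float(size.findtext("width")), float(size.findtext("height"))
    lines = []
    for obj in root.iter("object"):
        cls = CLASSES.index(obj.findtext("name"))
        bb = obj.find("bndbox")
        xmin, ymin = float(bb.findtext("xmin")), float(bb.findtext("ymin"))
        xmax, ymax = float(bb.findtext("xmax")), float(bb.findtext("ymax"))
        # YOLO TXT format: class x_center y_center width height, normalized to [0, 1]
        xc, yc = (xmin + xmax) / 2 / w, (ymin + ymax) / 2 / h
        bw, bh = (xmax - xmin) / w, (ymax - ymin) / h
        lines.append(f"{cls} {xc:.6f} {yc:.6f} {bw:.6f} {bh:.6f}")
    return lines
```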
1.3. Dataset Features
The pictures were taken by us under the different conditions mentioned above. The file PPED/data/feature.csv is a CSV file which notes the features of every image, including lighting conditions, angle, background, number of people, and scale.
1.4. Dataset Division
The dataset is divided into training and test sets with a 9:1 ratio.
We provide baseline results with five models, namely Faster R-CNN (R), Faster R-CNN (M), SSD, YOLOv3-spp, and YOLOv5. All code and results are given in folder PPED/experiment.
2.1. Environment and Configuration:
Intel Core i7-8700 CPU
NVIDIA GTX1060 GPU
16 GB of RAM
Python: 3.8.10
pytorch: 1.9.0
pycocotools: pycocotools-win
Windows 10
2.2. Applied Models
The source codes and results of the applied models are given in folder PPED/experiment with sub-folders corresponding to the model names.
2.2.1. Faster R-CNN
Faster R-CNN
backbone: resnet50+fpn
We downloaded the pre-training weights from https://download.pytorch.org/models/fasterrcnn_resnet50_fpn_coco-258fb6c6.pth.
We modified the dataset path, training classes, and training parameters, including the batch size (a minimal sketch of the class-count change is shown after these steps).
We run train_res50_fpn.py to start training.
Then, the weights are trained on the training set.
Finally, we validate the results on the test set.
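To make the class-count modification concrete, here is a hedged sketch (standard torchvision usage, not code taken from the PPED repository) of loading the downloaded COCO weights and replacing the box predictor; NUM_CLASSES is a placeholder.

```python
# Hedged sketch: torchvision Faster R-CNN with ResNet-50 FPN backbone, with the
# box predictor replaced to match a custom class count. NUM_CLASSES is a
# placeholder (dataset classes + 1 for background), not taken from the repo.
import torch
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

NUM_CLASSES = 5 + 1  # placeholder: object classes + background

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=False)
# Load the COCO pre-training weights downloaded from the URL above.
state_dict = torch.load("fasterrcnn_resnet50_fpn_coco-258fb6c6.pth", map_location="cpu")
model.load_state_dict(state_dict)

# Swap the classification/regression head for the custom number of classes.
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, NUM_CLASSES)
```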
backbone: mobilenetv2
The same training method as for resnet50+fpn was applied, but the results were not as good, so this backbone was discarded.
The Faster R-CNN source code used in our experiment is given in folder PPED/experiment/Faster R-CNN. The weights of the fully-trained Faster R-CNN (R) and Faster R-CNN (M) models are stored in files PPED/experiment/trained_models/resNetFpn-model-19.pth and mobile-model.pth. The performance measurements of Faster R-CNN (R) and Faster R-CNN (M) are stored in folders PPED/experiment/results/Faster RCNN(R) and Faster RCNN(M).
2.2.2. SSD
backbone: resnet50
We downloaded pre-training weights from https://download.pytorch.org/models/resnet50-19c8e357.pth.
The same training method as Faster R-CNN is applied.
The SSD source code used in our experiment is given in folder PPED/experiment/ssd. The weights of the fully-trained SSD model are stored in file PPED/experiment/trained_models/SSD_19.pth. The performance measurements of SSD are stored in folder PPED/experiment/results/SSD.
2.2.3. YOLOv3-spp
backbone: DarkNet53
We modified the type information of the XML file to match our application.
We run trans_voc2yolo.py to convert the XML file in VOC format to a txt file.
The weights used are: yolov3-spp-ultralytics-608.pt.
The YOLOv3-spp source code used in our experiment is given in folder PPED/experiment/YOLOv3-spp. The weights of the fully-trained YOLOv3-spp model are stored in file PPED/experiment/trained_models/YOLOvspp-19.pt. The performance measurements of YOLOv3-spp are stored in folder PPED/experiment/results/YOLOv3-spp.
2.2.4. YOLOv5
backbone: CSP_DarkNet
We modified the type information of the XML file to match our application.
We run trans_voc2yolo.py to convert the XML file in VOC format to a txt file.
The weights used are: yolov5s.
The YOLOv5 source code used in our experiment is given in folder PPED/experiment/yolov5. The weights of the fully-trained YOLOv5 model are stored in file PPED/experiment/trained_models/YOLOv5.pt. The performance measurements of YOLOv5 are stored in folder PPED/experiment/results/YOLOv5.
2.3. Evaluation
The computed evaluation metrics as well as the code needed to compute them from our dataset are provided in the folder PPED/experiment/eval.
Faster R-CNN (R and M)
official code: https://github.com/pytorch/vision/blob/main/torchvision/models/detection/faster_rcnn.py
SSD
official code: https://github.com/pytorch/vision/blob/main/torchvision/models/detection/ssd.py
YOLOv3-spp
YOLOv5
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
ODDS Smart Building Depth Dataset
#Introduction:
The goal of this dataset is to facilitate research focusing on recognizing objects in smart buildings using the depth sensor mounted at the ceiling. This dataset contains annotations of depth images for eight frequently seen object classes. The classes are: person, backpack, laptop, gun, phone, umbrella, cup, and box.
#Data Collection:
We collected data from two settings. We had a Kinect mounted on a 9.3-foot ceiling near a 6-foot-wide door. We also used a tripod with a horizontal extender holding the Kinect at a similar height looking downwards. We asked about 20 volunteers to enter and exit a number of times each in different directions (3 times walking straight, 3 times walking towards the left side, 3 times walking towards the right side), holding objects in many different ways and poses underneath the Kinect. Each subject used his/her own backpack, purse, laptop, etc. As a result, we captured variety within the same object class, e.g., for laptops we considered MacBooks, HP laptops, and Lenovo laptops of different years and models, and for backpacks we considered backpacks, side bags, and women's purses. We asked the subjects to walk while holding the objects in many ways, e.g., the laptop was fully open, partially closed, and fully closed while carried. Also, people held laptops in front of and beside their bodies, and underneath their elbows. The subjects carried their backpacks on their backs and at their sides at different levels from foot to shoulder. We wanted to collect data with real guns; however, bringing real guns to the office is prohibited, so we obtained a few Nerf guns and the subjects carried these guns pointing them to the front, side, up, and down while walking.
#Annotated Data Description:
The annotated dataset is created following the structure of the Pascal VOC devkit, so that data preparation becomes simple and it can be used quickly with object detection libraries that are friendly to Pascal VOC style annotations (e.g., Faster-RCNN, YOLO, SSD). The annotated data consists of a set of images; each image has an annotation file giving a bounding box and object class label for each object in one of the eight classes present in the image. Multiple objects from multiple classes may be present in the same image. The dataset has 3 main directories:
1) DepthImages: Contains all the images of the training set and validation set.
2) Annotations: Contains one XML file per image file (e.g., 1.xml for image file 1.png). The XML file includes the bounding box annotations for all objects in the corresponding image.
3) ImagesSets: Contains two text files, training_samples.txt and testing_samples.txt. The training_samples.txt file has the names of images used for training and testing_samples.txt has the names of images used for testing. (We randomly chose an 80%/20% split; a minimal sketch of generating such a split is shown below.)
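For illustration, a random 80%/20% split of the kind described in item 3 could be generated roughly as follows; this sketch is not part of the dataset, and the seed is arbitrary.

```python
# Minimal sketch: build an 80/20 random split from the DepthImages folder,
# writing the split files described above. Directory and file names follow
# the dataset description; the seed is arbitrary.
import random
from pathlib import Path

images = sorted(p.stem for p in Path("DepthImages").glob("*.png"))
random.seed(0)
random.shuffle(images)

cut = int(0.8 * len(images))
Path("ImagesSets").mkdir(exist_ok=True)
Path("ImagesSets/training_samples.txt").write_text("\n".join(images[:cut]) + "\n")
Path("ImagesSets/testing_samples.txt").write_text("\n".join(images[cut:]) + "\n")
```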
#UnAnnotated Data Description:
The un-annotated data consists of several sets of depth images. No ground-truth annotation is available for these images yet. These un-annotated sets contain several challenging scenarios, and no data was collected from this office during annotated dataset construction. Hence, they provide a way to test the generalization performance of an algorithm.
#Citation:
If you use ODDS Smart Building dataset in your work, please cite the following reference in any publications:
@inproceedings{mithun2018odds,
title={ODDS: Real-Time Object Detection using Depth Sensors on Embedded GPUs},
author={Niluthpol Chowdhury Mithun and Sirajum Munir and Karen Guo and Charles Shelton},
booktitle={ ACM/IEEE Conference on Information Processing in Sensor Networks (IPSN)},
year={2018},
}
The PESMOD (PExels Small Moving Object Detection) dataset consists of high-resolution aerial images in which moving objects are labelled manually. It was created from videos selected from the Pexels website. The aim of this dataset is to provide a different and challenging dataset for the evaluation of moving object detection methods. Each moving object is labelled for each frame in PASCAL VOC format in an XML file. The dataset consists of 8 different video sequences.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The larch casebearer, Coleophora laricella, is a moth that mainly attacks larch trees and has caused significant damage in larch stands in Västergötland, Sweden.
The original dataset of aerial drone images of larch forests was modified to remove duplicates and badly annotated images. Only images taken in May (20190527) are present here, as they contain damage level classes. Also, the annotation files in Pascal VOC XML format were converted to YOLO PyTorch TXT format.
Images were taken in 5 areas around Västergötland, Sweden and the names of the files correspond to different areas:
There are 4 classes in total: * H - healthy larch trees * LD - light damage to larch trees * HD - high damage to larch trees * other - non-larch trees
Training, validation and testing images were selected from different acquisition sites to try creating a better generalizing detection model.
If you use these data in a publication or report, please use the following citation:
Swedish Forest Agency (2021): Forest Damages – Larch Casebearer 1.0. National Forest Data Lab. Dataset.
For questions about this data set, contact Halil Radogoshi (halil.radogoshi@skogsstyrelsen.se) at the Swedish Forest Agency.
More information about the dataset can be found on LILA BC page.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Name: ZeroCostDL4Mic - YoloV2 example training and test dataset
(see our Wiki for details)
Data type: 2D grayscale .png images with corresponding bounding box annotations in .xml PASCAL VOC format.
Microscopy data type: Phase contrast microscopy data (brightfield)
Microscope: Inverted Zeiss Axio zoom widefield microscope equipped with an AxioCam MRm camera, an EL Plan-Neofluar 20 × /0.5 NA objective (Carl Zeiss), with a heated chamber (37 °C) and a CO2 controller (5%).
Cell type: MDA-MB-231 cells migrating on cell-derived matrices generated by fibroblasts.
File format: .png (8-bit)
Image size: 1388 x 1040 px (323 nm)
Author(s): Guillaume Jacquemet1,2,3, Lucas von Chamier4,5
Contact email: lucas.chamier.13@ucl.ac.uk and guillaume.jacquemet@abo.fi
Affiliation(s):
1) Faculty of Science and Engineering, Cell Biology, Åbo Akademi University, 20520 Turku, Finland
2) Turku Bioscience Centre, University of Turku and Åbo Akademi University, FI-20520 Turku
3) ORCID: 0000-0002-9286-920X
4) MRC-Laboratory for Molecular Cell Biology. University College London, London, UK
5) ORCID: 0000-0002-9243-912X
Associated publications: Jacquemet et al 2016. DOI: 10.1038/ncomms13297
Funding bodies: G.J. was supported by grants awarded by the Academy of Finland, the Sigrid Juselius Foundation and Åbo Akademi University Research Foundation (CoE CellMech) and by Drug Discovery and Diagnostics strategic funding to Åbo Akademi University.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Notice: We currently have a paper under double-blind review that introduces this dataset. Therefore, we have anonymized the dataset authorship. Once the review process has concluded, we will update the authorship information of this dataset.
Chinese Chemical Safety Signs (CCSS)
This dataset is compiled as a benchmark for recognizing chemical safety signs from images. We provide both the dataset and the experimental results at doi:10.5281/zenodo.5482334.
The complete dataset is contained in the folder ccss/data in archive css_data.zip. The images include signs based on the Chinese standard "Safety Signs and their Application Guidelines" (GB 2894-2008) for safety signs in chemical environments. This standard, in turn, refers to the standards ISO 7010 (Graphical symbols – Safety Colours and Safety Signs – Safety signs used in workplaces and public areas), GB/T 10001 (Public Information Graphic Symbols for Signs), and GB 13495 (Fire Safety Signs)
1.1. Image Collection
We collected photos of commonly used chemical safety signs in chemical laboratories and chemical teaching buildings. For a discussion of the standards on which we base our collection, refer to the book "Talking about Hazardous Chemicals and Safety Signs" for common signs, and refer to the safety signs guidelines (GB 2894-2008).
The shooting was mainly carried out in 6 locations, namely on the road, in a parking lot, at construction walls, in a chemical laboratory, outside near big machines, and inside a factory and corridor.
Shooting scale: Images in which the signs appear in small, medium and large scales were taken for each location by shooting photos from different distances.
Shooting light: good lighting conditions and poor lighting conditions were investigated.
Part of the images contain multiple targets and the other part contains only single signs.
Under all conditions, a total of 4,650 photos were taken as the original data. These were expanded to 27,900 photos via data augmentation. All images are located in folder ccss/data/JPEGImages.
The file ccss/data/features/enhanced_data_to_original_data.csv provides a mapping between the enhanced image name and the corresponding original image.
1.2. Annotation and Labelling
The labelling tool is Labelimg, which uses the PASCAL-VOC labelling format. The annotation is stored in the folder ccss/data/Annotations.
Faster R-CNN and SSD are two algorithms that use this format. When training YOLOv5, you can run trans_voc2yolo.py to convert the XML file in PASCAL-VOC format to a txt file.
We provide further meta-information about the dataset in form of a CSV file features.csv which notes, for each image, which other features it has (lighting conditions, scale, multiplicity, etc.).
1.3. Dataset Features
As stated above, the images have been shot under different conditions. We provide all the feature information in folder ccss/data/features. For each feature, there is a separate list of file names in that folder. The file ccss/data/features/features_on_original_data.csv is a CSV file which notes all the features of each original image.
1.4. Dataset Division
The dataset is divided into fixed training and test sets with a 7:3 ratio. You can find the corresponding image names in the files ccss/data/training_data_file_names.txt and ccss/data/test_data_file_names.txt.
We provide baseline results with the three models Faster R-CNN, SSD, and YOLOv5. All code and results are given in folder ccss/experiment in archive ccss_experiment.
2.2. Environment and Configuration
Single Intel Core i7-8700 CPU
NVIDIA GTX1060 GPU
16 GB of RAM
Python: 3.8.10
pytorch: 1.9.0
pycocotools: pycocotools-win
Visual Studio 2017
Windows 10
2.3. Applied Models
The source codes and results of the applied models are given in folder ccss/experiment with sub-folders corresponding to the model names.
2.3.1. Faster R-CNN
backbone: resnet50+fpn.
We downloaded the pre-training weights from
We modified the type information of the JSON file to match our application.
We run train_res50_fpn.py to start training.
Finally, the weights are trained on the training set.
backbone: mobilenetv2
The same training method as for resnet50+fpn was applied, but the results were not as good, so this backbone was discarded.
The Faster R-CNN source code used in our experiment is given in folder ccss/experiment/sources/faster_rcnn. The weights of the fully-trained Faster R-CNN model are stored in file ccss/experiment/trained_models/faster_rcnn.pth. The performance measurements of Faster R-CNN are stored in folder ccss/experiment/performance_indicators/faster_rcnn.
2.3.2. SSD
backbone: resnet50
We downloaded pre-training weights from
The same training method as for Faster R-CNN is applied.
The SSD source code used in our experiment is given in folder ccss/experiment/sources/ssd. The weights of the fully-trained SSD model are stored in file ccss/experiment/trained_models/ssd.pth. The performance measurements of SSD are stored in folder ccss/experiment/performance_indicators/ssd.
2.3.4. YOLOv5
backbone: CSP_DarkNet
We modified the type information of the YML file to match our application.
We run trans_voc2yolo.py to convert the XML files in VOC format to txt files.
The weights used are: yolov5s.
The YOLOv5 source code used in our experiment is given in folder ccss/experiment/sources/yolov5. The weights of the fully-trained YOLOv5 model are stored in file ccss/experiment/trained_models/yolov5.pt. The performance measurements of YOLOv5 are stored in folder ccss/experiment/performance_indicators/yolov5.
2.4. Evaluation
The computed evaluation metrics as well as the code needed to compute them from our dataset are provided in the folder ccss/experiment/performance_indicators. They are provided over the complete test set as well as separately for each image feature (over the test set).
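For readers reproducing these indicators, recall that detection metrics such as mAP are built on matching predictions to ground truth by intersection over union; the helper below is a generic sketch of that computation, not the evaluation code shipped in this folder.

```python
# Generic sketch: intersection over union (IoU) between two axis-aligned boxes.
def iou(box_a, box_b):
    """Boxes are (xmin, ymin, xmax, ymax) in pixel coordinates."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# Example: two partially overlapping boxes give IoU = 400 / 2800 ~ 0.143
print(iou((10, 10, 50, 50), (30, 30, 70, 70)))
```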
Faster R-CNN
official code:
SSD
official code:
YOLOv5
We are particularly thankful to the author of the GitHub repository WZMIAOMIAO/deep-learning-for-image-processing (with whom we are not affiliated). Their instructive videos and codes were most helpful during our work. In particular, we based our own experimental codes on his work (and obtained permission to include it in this archive).
While our dataset and results are published under the Creative Commons Attribution 4.0 License, this does not hold for the included code sources. These sources are under the particular license of the repository where they have been obtained from (see Section 3 above).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A subset of 124 images has been selected from the DocExplore dataset in order to create a detection dataset of drawings in medieval manuscripts. Since the original dataset has been published without providing any annotations, we selected and annotated 8 different patterns in the subset, which resulted in a total of 268 annotated instances.
The dataset has been split into three sub-sets (train, validation, test) in order to follow the standard procedure followed by modern deep-learning models. Furthermore, each image has one corresponding ".xml" file containing the annotation information of drawings in that image using the Pascal Voc format.
The research for this work was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany's Excellence Strategy – EXC 2176 ‘Understanding Written Artefacts: Material, Interaction and Transmission in Manuscript Cultures', project no. 390893796. The research was conducted within the scope of the Centre for the Study of Manuscript Cultures (CSMC) at Universität Hamburg.
In addition, we thank Aneta Yotova for annotating the samples in this dataset.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Datasets and directories are structured similarly to the PASCAL VOC dataset, avoiding the need to change scripts already available with detection frameworks that are ready to parse PASCAL VOC annotations into their format.
The sub-directory JPEGImages consists of 1,730 images (612x512 pixels) used for training, testing and validation. Each image has at least one annotated fruit. The sub-directory Annotations consists of all the annotation files (records of bounding box coordinates for each image) in XML format, with the same name as the corresponding image. The sub-directory Main consists of the text files that contain the image names (without extension) used for training, testing and validation: the training set (train.txt) lists 1,300 training images, the validation set (val.txt) lists 130 validation images, and the test set (test.txt) lists 300 test images.
Each image has an XML annotation file (filename = image name), and each image set (training, validation and test) has an associated text file (train.txt, val.txt and test.txt) containing the list of image names to be used for training and testing. The XML annotation file contains the image attributes (name, width, height) and the object attributes (class name, object bounding box co-ordinates (xmin, ymin, xmax, ymax)). (xmin, ymin) and (xmax, ymax) are the pixel co-ordinates of the bounding box's top-left and bottom-right corners, respectively.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Tank Detection and Count Dataset contains 760 satellite image tiles of size 512x512 pixels, with one pixel covering 30 cm x 30 cm at ground level. Each tile is associated with an .xml and a .txt file. Both files contain the same oil/gas tank annotations in different formats: the .xml file uses the Pascal VOC format, while in the .txt file every line contains the class of the tank and the four coordinates of the bounding box: xmin, ymin, xmax, ymax.