28 datasets found
  1. PASCAL VOC-Formatted Object Detection Dataset

    • kaggle.com
    Updated Dec 4, 2024
    Cite
    Mcii34 (2024). PASCAL VOC-Formatted Object Detection Dataset [Dataset]. https://www.kaggle.com/datasets/mcii34/utilities-detection-voc-dataset
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Dec 4, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Mcii34
    Description

    This dataset is formatted in the PASCAL VOC format, a widely-used structure for object detection tasks. It includes high-quality images, their corresponding bounding box annotations, and predefined splits for training, validation, and testing.

    • Structure:

      VOC/
      ├── Annotations   # XML files with bounding box and class labels
      ├── JPEGImages    # All images in .jpg format
      ├── ImageSets
      │   └── Main      # Contains train.txt, val.txt, and test.txt

    • Content:

      • Images: High-quality .jpg files stored in JPEGImages.
      • Annotations: .xml files in the Annotations folder that contain bounding box coordinates and object class names for each image (see the parsing sketch after this list).
      • Splits: train.txt, val.txt, and test.txt in ImageSets/Main specify which images belong to each split for training, validation, and testing.
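
    For reference, a minimal Python sketch for reading one of these annotation files with the standard library alone; the file path is a placeholder, and the tag names (object, name, bndbox, xmin, ...) are the standard PASCAL VOC ones described above:

    ```python
    # Read one PASCAL VOC annotation file and return its labeled boxes.
    import xml.etree.ElementTree as ET

    def read_voc_annotation(xml_path):
        """Return a list of (class_name, xmin, ymin, xmax, ymax) tuples."""
        root = ET.parse(xml_path).getroot()
        boxes = []
        for obj in root.findall("object"):
            name = obj.findtext("name")
            bb = obj.find("bndbox")
            coords = tuple(int(float(bb.findtext(k)))
                           for k in ("xmin", "ymin", "xmax", "ymax"))
            boxes.append((name, *coords))
        return boxes

    print(read_voc_annotation("VOC/Annotations/image1.xml"))
    ```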

    What Has Been Done

    1. Organized Dataset:

      • Ensured the dataset is structured according to the VOC format for seamless use with object detection frameworks.
    2. Standardized XML Files:

      • Updated the <path> tag in each XML file to reflect the correct relative path (e.g., JPEGImages/image1.jpg) or removed it if unnecessary.
      • Ensured the <folder> tag is standardized (e.g., set to VOC) or removed for compatibility.
    3. Created Train-Val-Test Splits:

      • Generated train.txt, val.txt, and test.txt files in the ImageSets/Main directory.
      • Applied stratified sampling to ensure an equal representation of object classes in each split.
    4. Validated Class Distribution:

      • Counted the number of images containing each object class in the training, validation, and testing splits to confirm balanced sampling.

    Object Classes

    This dataset contains the following object classes:

    Class Name (English)       Class Name (Chinese)   Abbreviation
    Manhole Cover              井盖 (jg)              jg
    Crossing Light             人行灯 (rxd)           rxd
    Pipeline Indicating Pile   地下管线桩 (dxgx)      dxgx
    Traffic Signs              指示牌 (zsp)           zsp
    Hydrant                    消防栓 (xfs)           xfs
    Camera                     电子眼 (dzy)           dzy
    Traffic Light              红绿灯 (lhd)           lhd
    Guidepost                  街道路名牌 (jdp)       jdp
    Traffic Warning Sign       警示牌 (jsp)           jsp
    Streetlamp                 路灯 (ld)              ld
    Communication Box          通讯箱 (txx)           txx
  2. Zoo animals

    • kaggle.com
    Updated Mar 25, 2023
    Cite
    Jirka Daberger (2023). Zoo animals [Dataset]. https://www.kaggle.com/datasets/jirkadaberger/zoo-animals/code
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Mar 25, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Jirka Daberger
    Description

    This dataset contains different animal categories such as: buffalo, capybara, cat, cow, deer, dog, elephant, flamingo, giraffe, jaguar, kangaroo, lion, parrot, penguin, rhino, sheep, tiger, turtle and zebra.

    Most of the images can be found in existing datasets:

    • https://github.com/freds0/capybara_dataset
    • https://universe.roboflow.com/miguel-narbot-usp-br/capybara-and-animals/dataset/1
    • https://www.kaggle.com/datasets/hugozanini1/kangaroodataset?resource=download
    • https://github.com/experiencor/kangaroo
    • https://universe.roboflow.com/z-jeans-pig/kangaroo-epscj/dataset/1
    • https://cvwc2019.github.io/challenge.html#
    • https://www.kaggle.com/datasets/biancaferreira/african-wildlife
    • https://universe.roboflow.com/new-workspace-5kofa/elephant-dataset/dataset/6
    • https://universe.roboflow.com/nathanael-hutama-harsono/large-cat/dataset/1/images/?split=train
    • https://universe.roboflow.com/giraffs-and-cows/giraffes-and-cows/dataset/1
    • https://universe.roboflow.com/turtledetector/turtledetector/dataset/2
    • https://www.kaggle.com/datasets/smaranjitghose/sea-turtle-face-detection
    • https://universe.roboflow.com/fadilyounes-me-gmail-com/zebra---savanna/dataset/1
    • https://universe.roboflow.com/test-qeryf/yolov5-9snhq
    • https://universe.roboflow.com/or-the-king/two-zebras
    • https://universe.roboflow.com/wild-animals-datasets/zebra-images/dataset/2
    • https://universe.roboflow.com/zebras/zebras/dataset/2
    • https://universe.roboflow.com/v2-rabotaem-xkxra/zebras_v2/dataset/5
    • https://universe.roboflow.com/vijay-vikas-mangena/animal_od_test1/dataset/1
    • https://universe.roboflow.com/bdoma13-gmail-com/rhino_horn/dataset/7
    • https://universe.roboflow.com/rudtkd134-naver-com/finalproject2/dataset/2
    • https://universe.roboflow.com/the-super-nekita/cats-brofl/dataset/2
    • https://universe.roboflow.com/lihi-gur-arie/pinguin-object-detection/dataset/2
    • https://universe.roboflow.com/utas-377cc/penguindataset-4dujc/dataset/10
    • https://universe.roboflow.com/new-workspace-tdyir/penguin-clfnj/dataset/1
    • https://universe.roboflow.com/utas-wd4sd/kit315_assignment/dataset/7
    • https://universe.roboflow.com/jeonjuuniv/deer-hqp4i/dataset/1
    • https://universe.roboflow.com/new-workspace-hqowp/sheeps/dataset/1
    • https://universe.roboflow.com/ali-eren-altindag/sheepstest2/dataset/1
    • https://universe.roboflow.com/yaser/sheep-0gudu/dataset/3
    • https://universe.roboflow.com/ali-eren-altindag/mixed_sheep/dataset/1
    • https://universe.roboflow.com/pkm-kc-2022/sapi-birahi/dataset/2
    • https://universe.roboflow.com/ghostikgh/team1_cows/dataset/5
    • https://universe.roboflow.com/ml-dlq4x/liontrain/dataset/2
    • https://universe.roboflow.com/animals/lionnew/dataset/2
    • https://universe.roboflow.com/parrottrening/parrot_trening/dataset/1
    • https://universe.roboflow.com/uet-hi8bg/parrots-r4tfl/dataset/1
    • https://universe.roboflow.com/superweight/parrot_poop/dataset/5
    • https://www.kaggle.com/datasets/tarunbisht11/intruder-detection

    From those datasets, the images have been filtered: objects smaller than 32 px were deleted, images with a dimension smaller than 320 px were removed, and images and labeled objects were renamed. The remaining images were labeled by me.
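
    The filtering rules above are easy to reproduce. Below is a sketch of the same checks (not the author's actual script), assuming VOC-style XML boxes, Pillow for reading image sizes, and interpreting "size smaller than 32" as the box's smaller side:

    ```python
    # Filtering checks as described: drop images whose smaller dimension is
    # below 320 px, and drop annotated objects smaller than 32 px on a side.
    import xml.etree.ElementTree as ET
    from PIL import Image

    MIN_IMAGE_DIM, MIN_OBJECT_SIDE = 320, 32

    def keep_image(img_path):
        with Image.open(img_path) as im:
            return min(im.size) >= MIN_IMAGE_DIM

    def keep_object(obj):
        bb = obj.find("bndbox")
        w = int(bb.findtext("xmax")) - int(bb.findtext("xmin"))
        h = int(bb.findtext("ymax")) - int(bb.findtext("ymin"))
        return min(w, h) >= MIN_OBJECT_SIDE
    ```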

  3. Variable Message Signal annotated images for object detection

    • zenodo.org
    • portalcientifico.universidadeuropea.com
    zip
    Updated Oct 2, 2022
    Cite
    Gonzalo de las Heras de Matías; Javier Sánchez-Soriano; Enrique Puertas (2022). Variable Message Signal annotated images for object detection [Dataset]. http://doi.org/10.5281/zenodo.5904211
    Explore at:
    zip (available download formats)
    Dataset updated
    Oct 2, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Gonzalo de las Heras de Matías; Javier Sánchez-Soriano; Enrique Puertas
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    If you use this dataset, please cite this paper: Puertas, E.; De-Las-Heras, G.; Sánchez-Soriano, J.; Fernández-Andrés, J. Dataset: Variable Message Signal Annotated Images for Object Detection. Data 2022, 7, 41. https://doi.org/10.3390/data7040041

    This dataset consists of Spanish road images taken from inside a vehicle, as well as annotations in XML files in PASCAL VOC format that indicate the location of Variable Message Signals within them. A CSV file is also attached with information regarding the geographic position, the folder where the image is located, and the text in Spanish. This can be used to train supervised learning computer vision algorithms, such as convolutional neural networks. Throughout this work, the process followed to obtain the dataset (image acquisition and labeling) and its specifications are detailed. The dataset consists of 1216 instances, 888 positives and 328 negatives, in 1152 jpg images with a resolution of 1280x720 pixels. These are divided into 576 real images and 576 images created with data-augmentation techniques. The purpose of this dataset is to help road computer vision research, since there is no dataset specifically for VMSs.

    The folder structure of the dataset is as follows:

    • vms_dataset/
      • data.csv
      • real_images/
        • imgs/
        • annotations/
      • data-augmentation/
        • imgs/
        • annotations/

    In which:

    • data.csv: Each row contains the following information separated by commas (,): image_name, x_min, y_min, x_max, y_max, class_name, lat, long, folder, text (see the loading sketch after this list).
    • real_images: Images extracted directly from the videos.
    • data-augmentation: Images created using data augmentation.
    • imgs: Image files in .jpg format.
    • annotations: Annotation files in .xml format.
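
    Given the column order above, data.csv can be loaded with the standard library alone. A minimal sketch; the path assumes the archive is unpacked into the working directory, and if the file ships with a header row, use that instead of the explicit fieldnames:

    ```python
    # Load data.csv using the documented column order.
    import csv

    FIELDS = ["image_name", "x_min", "y_min", "x_max", "y_max",
              "class_name", "lat", "long", "folder", "text"]

    with open("vms_dataset/data.csv", newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f, fieldnames=FIELDS):
            print(row["image_name"], row["class_name"], row["text"])
    ```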
  4. Ct For Lung Cancer Diagnosis (lung Pet Ct Dx) Pascal Voc Annotions Dataset

    • universe.roboflow.com
    zip
    Updated Jun 26, 2021
    Cite
    Mehmet Fatih AKCA (2021). Ct For Lung Cancer Diagnosis (lung Pet Ct Dx) Pascal Voc Annotions Dataset [Dataset]. https://universe.roboflow.com/mehmet-fatih-akca/yolotransfer/model/1
    Explore at:
    zip (available download formats)
    Dataset updated
    Jun 26, 2021
    Dataset authored and provided by
    Mehmet Fatih AKCA
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Variables measured
    Cancer Bounding Boxes
    Description

    This dataset consists of CT and PET-CT DICOM images of lung cancer subjects with XML Annotation files that indicate tumor location with bounding boxes. The images were retrospectively acquired from patients with suspicion of lung cancer, and who underwent standard-of-care lung biopsy and PET/CT. Subjects were grouped according to a tissue histopathological diagnosis. Patients with Names/IDs containing the letter 'A' were diagnosed with Adenocarcinoma, 'B' with Small Cell Carcinoma, 'E' with Large Cell Carcinoma, and 'G' with Squamous Cell Carcinoma.

    The images were analyzed on the mediastinum (window width, 350 HU; level, 40 HU) and lung (window width, 1,400 HU; level, –700 HU) settings. The reconstructions were made with a 2 mm slice thickness in the lung setting. The CT slice interval varies from 0.625 mm to 5 mm. Scanning modes include plain, contrast and 3D reconstruction.

    Before the examination, the patient underwent fasting for at least 6 hours, and the blood glucose of each patient was less than 11 mmol/L. Whole-body emission scans were acquired 60 minutes after the intravenous injection of 18F-FDG (4.44MBq/kg, 0.12mCi/kg), with patients in the supine position in the PET scanner. FDG doses and uptake times were 168.72-468.79MBq (295.8±64.8MBq) and 27-171min (70.4±24.9 minutes), respectively. 18F-FDG with a radiochemical purity of 95% was provided. Patients were allowed to breathe normally during PET and CT acquisitions. Attenuation correction of PET images was performed using CT data with the hybrid segmentation method. Attenuation corrections were performed using a CT protocol (180mAs,120kV,1.0pitch). Each study comprised one CT volume, one PET volume and fused PET and CT images: the CT resolution was 512 × 512 pixels at 1mm × 1mm, the PET resolution was 200 × 200 pixels at 4.07mm × 4.07mm, with a slice thickness and an interslice distance of 1mm. Both volumes were reconstructed with the same number of slices. Three-dimensional (3D) emission and transmission scanning were acquired from the base of the skull to mid femur. The PET images were reconstructed via the TrueX TOF method with a slice thickness of 1mm.

    The location of each tumor was annotated by five academic thoracic radiologists with expertise in lung cancer to make this dataset a useful tool and resource for developing algorithms for medical diagnosis. Two of the radiologists had more than 15 years of experience and the others had more than 5 years of experience. After one of the radiologists labeled each subject, the other four radiologists performed a verification, resulting in all five radiologists reviewing each annotation file in the dataset. Annotations were captured using LabelImg. The image annotations are saved as XML files in PASCAL VOC format, which can be parsed using the PASCAL Development Toolkit: https://pypi.org/project/pascal-voc-tools/. Python code to visualize the annotation boxes on top of the DICOM images can be downloaded here.
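
    The referenced visualization code is not reproduced here, but the idea is straightforward; a minimal sketch, assuming pydicom and matplotlib are installed (file names are hypothetical):

    ```python
    # Overlay VOC bounding boxes on a DICOM slice.
    import xml.etree.ElementTree as ET
    import matplotlib.pyplot as plt
    from matplotlib.patches import Rectangle
    import pydicom

    ds = pydicom.dcmread("slice.dcm")  # hypothetical file name
    fig, ax = plt.subplots()
    ax.imshow(ds.pixel_array, cmap="gray")
    for obj in ET.parse("slice.xml").getroot().findall("object"):
        bb = obj.find("bndbox")
        x0, y0 = int(bb.findtext("xmin")), int(bb.findtext("ymin"))
        x1, y1 = int(bb.findtext("xmax")), int(bb.findtext("ymax"))
        ax.add_patch(Rectangle((x0, y0), x1 - x0, y1 - y0,
                               fill=False, edgecolor="red"))
    plt.show()
    ```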

    Two deep learning researchers used the images and the corresponding annotation files to train several well-known detection models, which resulted in a maximum mean average precision (mAP) of around 0.87 on the validation set.

    Dataset link: https://wiki.cancerimagingarchive.net/pages/viewpage.action?pageId=70224216

  5. Data from: Roundabout Aerial Images for Vehicle Detection

    • zenodo.org
    • portalcientifico.universidadeuropea.com
    csv, xz
    Updated Oct 2, 2022
    Cite
    Gonzalo De-Las-Heras; Javier Sánchez-Soriano; Enrique Puertas (2022). Roundabout Aerial Images for Vehicle Detection [Dataset]. http://doi.org/10.5281/zenodo.6407460
    Explore at:
    csv, xz (available download formats)
    Dataset updated
    Oct 2, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Gonzalo De-Las-Heras; Javier Sánchez-Soriano; Enrique Puertas
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    If you use this dataset, please cite this paper: Puertas, E.; De-Las-Heras, G.; Fernández-Andrés, J.; Sánchez-Soriano, J. Dataset: Roundabout Aerial Images for Vehicle Detection. Data 2022, 7, 47. https://doi.org/10.3390/data7040047

    This publication presents a dataset of Spanish roundabout aerial images taken from a UAV, along with annotations in PASCAL VOC XML files that indicate the position of vehicles within them. Additionally, a CSV file is attached containing information related to the location and characteristics of the captured roundabouts. This work details the process followed to obtain them: image capture, processing and labeling. The dataset consists of 985,260 total instances: 947,400 cars, 19,596 cycles, 9,048 trucks, 7,008 buses and 2,208 empty roundabouts, in 61,896 1920x1080 px JPG images (the per-class totals below, multiplied by the x4 augmentation, give these figures). These are divided into 15,474 images extracted from 8 roundabouts with different traffic flows and 46,422 images created using data augmentation techniques. The purpose of this dataset is to help research on computer vision on the road, as such labeled images are not abundant. It can be used to train supervised learning models, such as convolutional neural networks, which are very popular in object detection.

    Roundabout (scenes)   Frames    Car       Truck    Cycle    Bus      Empty
    1 (00001)             1,996     34,558    0        4,229    0        0
    2 (00002)             514       743       0        0        0        157
    3 (00003-00017)       1,795     4,822     58       0        0        0
    4 (00018-00033)       1,027     6,615     0        0        0        0
    5 (00034-00049)       1,261     2,248     0        550      0        81
    6 (00050-00052)       5,501     180,342   1,420    120      1,376    0
    7 (00053)             2,036     5,789     562      0        226      92
    8 (00054)             1,344     1,733     222      0        150      222
    Total                 15,474    236,850   2,262    4,899    1,752    552
    Data augmentation     x4        x4        x4       x4       x4       x4
    Total                 61,896    947,400   9,048    19,596   7,008    2,208

  6. Retail Classification

    • kaggle.com
    zip
    Updated Dec 8, 2020
    Cite
    Alitquan Mallick (2020). Retail Classification [Dataset]. https://www.kaggle.com/alitquanmallick/grocery-classifier
    Explore at:
    zip (953455521 bytes; available download formats)
    Dataset updated
    Dec 8, 2020
    Authors
    Alitquan Mallick
    Description

    Context

    Just a simple dataset to demonstrate object detection and classification in the retail environment, preferably using computer vision.

    Content

    This dataset contains resized images which have been annotated using LabelImg. The resized images are found in the directory 'ResizedImages', with the corresponding XML annotations in the Pascal VOC format. I used a YOLOv3 model with this data. As of November 13, 2020, only three categories of products exist: 'can', 'shampoo', and 'spice'. Images vary in number of objects, with some images sporting only one object of one class, others sporting multiple objects of the same class, and lastly, some sporting multiple objects of different classes.

    Inspiration

    The inspiration for this dataset was the need for a submission to the FLAIRS conference.

  7. Data from: Annotated cows in aerial images for use in deep learning models

    • dataverse.harvard.edu
    Updated May 31, 2021
    Cite
    G.J. Franke; Sander Mucher (2021). Annotated cows in aerial images for use in deep learning models [Dataset]. http://doi.org/10.7910/DVN/N7GJYU
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    May 31, 2021
    Dataset provided by
    Harvard Dataverse
    Authors
    G.J. Franke; Sander Mucher
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    A large dataset containing aerial images from fields in Juchowo, Poland, and Wageningen, the Netherlands, with the cows in the images annotated using the Pascal VOC XML annotation format. This dataset has been used to train various deep learning models (Nanonets, YOLOv3 and the like) as part of the GenTORE project (https://www.gentore.eu). Please download all the files, then use 7-Zip to unzip the multi-part archive.

  8. Turkish Pedestrian Dataset - TURPED

    • data.niaid.nih.gov
    Updated Jul 19, 2022
    Cite
    Mustafa (2022). Turkish Pedestrian Dataset - TURPED [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6860744
    Explore at:
    Dataset updated
    Jul 19, 2022
    Dataset provided by
    M. Alper
    Mustafa
    Tuğçe
    Description

    Data abstract: This Zenodo upload contains the Turkish Pedestrian Dataset (TURPED) for benchmarking and developing pedestrian detection methods for autonomous driving assistance systems. There are three folders named "Annotations", "Image Sets" and "JPEGImages". The annotations folder includes the pedestrian labels for each image in an XML format; the standard Pascal VOC XML annotation format was chosen for ease of use. The TXT files in the image sets folder describe which images are in the training/validation/test sets. Finally, the images can be found in the JPEGImages folder in JPEG format.

  9. Mexican Sign Language's Dactylology and Ten First Numbers - Labeled images...

    • data.mendeley.com
    Updated May 30, 2023
    Cite
    Mario Rodriguez (2023). Mexican Sign Language's Dactylology and Ten First Numbers - Labeled images and videos. From person #1 to #5 [Dataset]. http://doi.org/10.17632/5s4mt7xrd9.1
    Explore at:
    Dataset updated
    May 30, 2023
    Authors
    Mario Rodriguez
    License

    Attribution-NonCommercial 3.0 (CC BY-NC 3.0): https://creativecommons.org/licenses/by-nc/3.0/
    License information was derived automatically

    Area covered
    Mexico
    Description

    The dataset comprises edited recordings of Mexican Sign Language's dactylology (29 signs) and first ten numbers (1 to 10), including static and continuous signs respectively, from person 1 to person 5. The edited recordings are organized for easy access and management. Edited videos and screenshots of static signs are labeled in their file names with the corresponding sign language representations and stored in a consistent order per person, recording cycle, and hand. Static sign images can also be exported in PASCAL VOC format with XML annotations. The dataset is designed to facilitate feature extraction and further analysis in Mexican sign language recognition research.

  10. Iran-Vehicle-plate-dataset

    • kaggle.com
    zip
    Updated May 21, 2021
    Cite
    Samyar Rahimi (2021). Iran-Vehicle-plate-dataset [Dataset]. https://www.kaggle.com/samyarr/iranvehicleplatedataset
    Explore at:
    zip (641496850 bytes; available download formats)
    Dataset updated
    May 21, 2021
    Authors
    Samyar Rahimi
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    Iran
    Description

    Context

    This dataset contains 313 images of Iranian vehicle plates at 224x224 resolution. The annotations are for the 224x224 images and are in PASCAL VOC XML format.

    The original images are 1280x1280 and do not have annotations.

    https://github.com/Samyarrahimi/Iran-Vehicle-plate-dataset

  11. Personal Protective Equipment Dataset (PPED)

    • data.niaid.nih.gov
    • zenodo.org
    Updated May 17, 2022
    Cite
    Anonymous (2022). Personal Protective Equipment Dataset (PPED) [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6551757
    Explore at:
    Dataset updated
    May 17, 2022
    Dataset authored and provided by
    Anonymous
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Personal Protective Equipment Dataset (PPED)

    This dataset serves as a benchmark for PPE detection in chemical plants. We provide the dataset and experimental results.

    1. The dataset

    We produced a data set based on the actual needs and relevant regulations in chemical plants. The standard GB 39800.1-2020 formulated by the Ministry of Emergency Management of the People’s Republic of China defines the protective requirements for plants and chemical laboratories. The complete dataset is contained in the folder PPED/data.

    1.1. Image collection

    We took more than 3300 pictures, varying the following characteristics: environment, distance, lighting conditions, angle, and the number of people photographed.

    Backgrounds: There are 4 backgrounds, including office, near machines, factory and regular outdoor scenes.

    Scale: By taking pictures from different distances, the captured PPEs are classified in small, medium and large scales.

    Light: Good lighting conditions and poor lighting conditions were studied.

    Diversity: Some images contain a single person, and some contain multiple people.

    Angle: The pictures we took can be divided into front and side.

    A total of more than 3300 photos were taken in the raw data under all conditions. All images are located in the folder “PPED/data/JPEGImages”.

    1.2. Label

    We used LabelImg as the labeling tool, with the PASCAL-VOC annotation format. YOLO uses the txt format; trans_voc2yolo.py can be used to convert the XML files in PASCAL-VOC format to txt files (the conversion is sketched below). Annotations are stored in the folder PPED/data/Annotations.
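
    trans_voc2yolo.py itself ships with the dataset and is not reproduced here; the following sketch only illustrates the underlying conversion, from VOC pixel corners to YOLO's normalized "class cx cy w h" lines. The class list is a placeholder:

    ```python
    # Convert one VOC XML annotation to YOLO txt lines.
    import xml.etree.ElementTree as ET

    CLASSES = ["helmet", "vest"]  # placeholder; use the dataset's real class list

    def voc_to_yolo_lines(xml_path):
        root = ET.parse(xml_path).getroot()
        w_img = float(root.findtext("size/width"))
        h_img = float(root.findtext("size/height"))
        lines = []
        for obj in root.findall("object"):
            cls = CLASSES.index(obj.findtext("name"))
            bb = obj.find("bndbox")
            x0, y0, x1, y1 = (float(bb.findtext(k))
                              for k in ("xmin", "ymin", "xmax", "ymax"))
            cx, cy = (x0 + x1) / 2 / w_img, (y0 + y1) / 2 / h_img
            w, h = (x1 - x0) / w_img, (y1 - y0) / h_img
            lines.append(f"{cls} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}")
        return lines
    ```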

    1.3. Dataset Features

    The pictures were taken by us under the different conditions mentioned above. The file PPED/data/feature.csv is a CSV file which records every feature of each picture, including lighting conditions, angle, background, number of people and scale.

    1.4. Dataset Division

    The dataset is divided into training and test sets at a 9:1 ratio.

    2. Baseline Experiments

    We provide baseline results with five models, namely Faster R-CNN (R), Faster R-CNN (M), SSD, YOLOv3-spp, and YOLOv5. All code and results are given in folder PPED/experiment.

    2.1. Environment and Configuration:

    Intel Core i7-8700 CPU

    NVIDIA GTX1060 GPU

    16 GB of RAM

    Python: 3.8.10

    pytorch: 1.9.0

    pycocotools: pycocotools-win

    Windows 10

    2.2. Applied Models

    The source codes and results of the applied models is given in folder PPED/experiment with sub-folders corresponding to the model names.

    2.2.1. Faster R-CNN

    Faster R-CNN

    backbone: resnet50+fpn

    We downloaded the pre-training weights from https://download.pytorch.org/models/fasterrcnn_resnet50_fpn_coco-258fb6c6.pth.

    We modified the dataset path, training classes and training parameters including batch size.

    We ran train_res50_fpn.py to start training.

    Then, the weights are trained by the training set.

    Finally, we validate the results on the test set.

    backbone: mobilenetv2

    the same training method as for resnet50+fpn was used, but the results were not as good as with resnet50+fpn, so this backbone was discarded.

    The Faster R-CNN source code used in our experiment is given in folder PPED/experiment/Faster R-CNN. The weights of the fully-trained Faster R-CNN (R) and Faster R-CNN (M) models are stored in files PPED/experiment/trained_models/resNetFpn-model-19.pth and mobile-model.pth, respectively. The performance measurements of Faster R-CNN (R) and Faster R-CNN (M) are stored in folders PPED/experiment/results/Faster RCNN(R) and Faster RCNN(M).

    2.2.2. SSD

    backbone: resnet50

    We downloaded pre-training weights from https://download.pytorch.org/models/resnet50-19c8e357.pth.

    The same training method as Faster R-CNN is applied.

    The SSD source code used in our experiment is given in folder PPED/experiment/ssd. The weights of the fully-trained SSD model are stored in file PPED/experiment/trained_models/SSD_19.pth. The performance measurements of SSD are stored in folder PPED/experiment/results/SSD.

    2.2.3. YOLOv3-spp

    backbone: DarkNet53

    We modified the type information of the XML file to match our application.

    We run trans_voc2yolo.py to convert the XML file in VOC format to a txt file.

    The weights used are: yolov3-spp-ultralytics-608.pt.

    The YOLOv3-spp source code used in our experiment is given in folder PPED/experiment/YOLOv3-spp. The weights of the fully-trained YOLOv3-spp model are stored in file PPED/experiment/trained_models/YOLOvspp-19.pt. The performance measurements of YOLOv3-spp are stored in folder PPED/experiment/results/YOLOv3-spp.

    2.2.4. YOLOv5

    backbone: CSP_DarkNet

    We modified the type information of the XML file to match our application.

    We run trans_voc2yolo.py to convert the XML file in VOC format to a txt file.

    The weights used are: yolov5s.

    The YOLOv5 source code used in our experiment is given in folder PPED/experiment/yolov5. The weights of the fully-trained YOLOv5 model are stored in file PPED/experiment/trained_models/YOLOv5.pt. The performance measurements of YOLOv5 are stored in folder PPED/experiment/results/YOLOv5.

    2.3. Evaluation

    The computed evaluation metrics as well as the code needed to compute them from our dataset are provided in the folder PPED/experiment/eval.

    3. Code Sources

    Faster R-CNN (R and M)

    https://github.com/WZMIAOMIAO/deep-learning-for-image-processing/tree/master/pytorch_object_detection/faster_rcnn

    official code: https://github.com/pytorch/vision/blob/main/torchvision/models/detection/faster_rcnn.py

    SSD

    https://github.com/WZMIAOMIAO/deep-learning-for-image-processing/tree/master/pytorch_object_detection/ssd

    official code: https://github.com/pytorch/vision/blob/main/torchvision/models/detection/ssd.py

    YOLOv3-spp

    https://github.com/WZMIAOMIAO/deep-learning-for-image-processing/tree/master/pytorch_object_detection/yolov3-spp

    YOLOv5

    https://github.com/ultralytics/yolov5

  12. Data from: ODDS: Real-Time Object Detection using Depth Sensors on Embedded...

    • zenodo.org
    • data.niaid.nih.gov
    application/gzip
    Updated Jan 24, 2020
    Cite
    Niluthpol Chowdhury Mithun; Sirajum Munir; Karen Guo; Charles Shelton (2020). ODDS: Real-Time Object Detection using Depth Sensors on Embedded GPUs [Dataset]. http://doi.org/10.5281/zenodo.1163770
    Explore at:
    application/gzip (available download formats)
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Niluthpol Chowdhury Mithun; Sirajum Munir; Karen Guo; Charles Shelton
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    ODDS Smart Building Depth Dataset

    #Introduction:

    The goal of this dataset is to facilitate research focusing on recognizing objects in smart buildings using the depth sensor mounted at the ceiling. This dataset contains annotations of depth images for eight frequently seen object classes. The classes are: person, backpack, laptop, gun, phone, umbrella, cup, and box.


    #Data Collection:

    We collected data from two settings. We had a Kinect mounted on a 9.3-foot ceiling near a 6-foot-wide door. We also used a tripod with a horizontal extender holding the Kinect at a similar height looking downwards. We asked about 20 volunteers to enter and exit a number of times each in different directions (3 times walking straight, 3 times walking towards the left side, 3 times walking towards the right side), holding objects in many different ways and poses underneath the Kinect. Each subject used his/her own backpack, purse, laptop, etc. As a result, we considered varieties within the same object, e.g., for laptops, we considered MacBooks, HP laptops, and Lenovo laptops of different years and models, and for backpacks, we considered backpacks, side bags, and women's purses. We asked the subjects to walk while holding objects in many ways, e.g., the laptop was fully open, partially closed, and fully closed while carried. Also, people held laptops in front and to the side of their bodies, and underneath their elbows. The subjects carried their backpacks on their backs and at their sides at different levels from foot to shoulder. We wanted to collect data with real guns; however, bringing real guns to the office is prohibited, so we obtained a few Nerf guns, and the subjects carried these guns pointing to the front, side, up, and down while walking.


    #Annotated Data Description:

    The annotated dataset is created following the structure of the Pascal VOC devkit, so that data preparation becomes simple and it can be used quickly with different object detection libraries that are friendly to Pascal VOC style annotations (e.g. Faster-RCNN, YOLO, SSD). The annotated data consists of a set of images; each image has an annotation file giving a bounding box and object class label for each object in one of the eight classes present in the image. Multiple objects from multiple classes may be present in the same image. The dataset has 3 main directories:

    1) DepthImages: Contains all the images of the training set and validation set.

    2) Annotations: Contains one XML file per image file (e.g., 1.xml for image file 1.png). The XML file includes the bounding box annotations for all objects in the corresponding image.

    3) ImagesSets: Contains two text files, training_samples.txt and testing_samples.txt. The training_samples.txt file has the names of the images used in training, and testing_samples.txt has the names of the images used for testing. (We randomly chose an 80%/20% split.)
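
    Because the layout mirrors the Pascal VOC devkit, assembling (image, annotation) pairs for a data loader takes only a few lines. A sketch, assuming the dataset is unpacked under ODDS/ and that the split lists store bare image names without extensions:

    ```python
    # Pair depth images with their XML annotations for the training split.
    from pathlib import Path

    root = Path("ODDS")  # hypothetical unpack location
    names = (root / "ImagesSets" / "training_samples.txt").read_text().split()
    pairs = [(root / "DepthImages" / f"{n}.png",
              root / "Annotations" / f"{n}.xml") for n in names]
    print(len(pairs), "training samples")
    ```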


    #UnAnnotated Data Description:

    The un-annotated data consists of several sets of depth images. No ground-truth annotation is available for these images yet. These un-annotated sets contain several challenging scenarios, and no data was collected from this office during annotated dataset construction. Hence, it provides a way to test the generalization performance of an algorithm.


    #Citation:

    If you use ODDS Smart Building dataset in your work, please cite the following reference in any publications:
    @inproceedings{mithun2018odds,
      title={ODDS: Real-Time Object Detection using Depth Sensors on Embedded GPUs},
      author={Niluthpol Chowdhury Mithun and Sirajum Munir and Karen Guo and Charles Shelton},
      booktitle={ACM/IEEE Conference on Information Processing in Sensor Networks (IPSN)},
      year={2018},
    }

  13. PESMOD (PExels Small Moving Object Detection)

    • opendatalab.com
    zip
    Updated Mar 6, 2023
    Cite
    Sakarya University (2023). PESMOD (PExels Small Moving Object Detection) [Dataset]. https://opendatalab.com/PESMOD
    Explore at:
    zip (3501709659 bytes; available download formats)
    Dataset updated
    Mar 6, 2023
    Dataset provided by
    Sakarya University
    Description

    The PESMOD (PExels Small Moving Object Detection) dataset consists of high-resolution aerial images in which moving objects are labelled manually. It was created from videos selected from the Pexels website. The aim of this dataset is to provide a different and challenging dataset for evaluating moving object detection methods. Each moving object is labelled in each frame in PASCAL VOC format in an XML file. The dataset consists of 8 different video sequences.

  14. Larch Casebearer Detection Dataset

    • universe.roboflow.com
    zip
    Updated May 23, 2024
    Cite
    KTUHomeWork (2024). Larch Casebearer Detection Dataset [Dataset]. https://universe.roboflow.com/ktuhomework/larch-casebearer-detection/dataset/1
    Explore at:
    zip (available download formats)
    Dataset updated
    May 23, 2024
    Dataset authored and provided by
    KTUHomeWork
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Trees Bounding Boxes
    Description

    The larch casebearer, Coleophora laricella, is a moth that mainly attacks larch trees and has caused significant damage in larch stands in Västergötland, Sweden.

    The original dataset of aerial drone images of larch forests was modified to remove duplicates and badly annotated images. Only images taken in May (20190527) are present here, as they contain damage-level classes. Also, the annotation files in Pascal VOC XML format were converted to YOLO PyTorch TXT format.

    Images were taken in 5 areas around Västergötland, Sweden and the names of the files correspond to different areas:

    • B01 - Bebehojd
    • B02 - Ekbacka
    • B03 - Jallasvag
    • B04 - Kampe
    • B05 - Nordkap

    There are 4 classes in total:

    • H - healthy larch trees
    • LD - light damage to larch trees
    • HD - high damage to larch trees
    • other - non-larch trees

    Training, validation and testing images were selected from different acquisition sites to help create a detection model that generalizes better.

    If you use these data in a publication or report, please use the following citation:

    Swedish Forest Agency (2021): Forest Damages – Larch Casebearer 1.0. National Forest Data Lab. Dataset.

    For questions about this data set, contact Halil Radogoshi (halil.radogoshi@skogsstyrelsen.se) at the Swedish Forest Agency.

    More information about the dataset can be found on the LILA BC page.

  15. ZeroCostDL4Mic - YoloV2 example training and test dataset

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jul 14, 2020
    Cite
    Lucas von Chamier (2020). ZeroCostDL4Mic - YoloV2 example training and test dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3941907
    Explore at:
    Dataset updated
    Jul 14, 2020
    Dataset provided by
    Lucas von Chamier
    Guillaume Jacquemet
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Name: ZeroCostDL4Mic - YoloV2 example training and test dataset

    (see our Wiki for details)

    Data type: 2D grayscale .png images with corresponding bounding box annotations in .xml files in PASCAL VOC format.

    Microscopy data type: Phase contrast microscopy data (brightfield)

    Microscope: Inverted Zeiss Axio zoom widefield microscope equipped with an AxioCam MRm camera, an EL Plan-Neofluar 20 × /0.5 NA objective (Carl Zeiss), with a heated chamber (37 °C) and a CO2 controller (5%).

    Cell type: MDA-MB-231 cells migrating on cell-derived matrices generated by fibroblasts.

    File format: .png (8-bit)

    Image size: 1388 x 1040 px (323 nm)

    Author(s): Guillaume Jacquemet1,2,3, Lucas von Chamier4,5

    Contact email: lucas.chamier.13@ucl.ac.uk and guillaume.jacquemet@abo.fi

    Affiliation(s):

    1) Faculty of Science and Engineering, Cell Biology, Åbo Akademi University, 20520 Turku, Finland

    2) Turku Bioscience Centre, University of Turku and Åbo Akademi University, FI-20520 Turku

    3) ORCID: 0000-0002-9286-920X

    4) MRC-Laboratory for Molecular Cell Biology. University College London, London, UK

    5) ORCID: 0000-0002-9243-912X

    Associated publications: Jacquemet et al 2016. DOI: 10.1038/ncomms13297

    Funding bodies: G.J. was supported by grants awarded by the Academy of Finland, the Sigrid Juselius Foundation and Åbo Akademi University Research Foundation (CoE CellMech) and by Drug Discovery and Diagnostics strategic funding to Åbo Akademi University.

  16. Chinese Chemical Safety Signs (CCSS)

    • data.niaid.nih.gov
    Updated Mar 21, 2023
    Cite
    Anonymous (2023). Chinese Chemical Safety Signs (CCSS) [Dataset]. https://data.niaid.nih.gov/resources?id=ZENODO_5482333
    Explore at:
    Dataset updated
    Mar 21, 2023
    Dataset authored and provided by
    Anonymous
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Notice: We have currently a paper under double-blind review that introduces this dataset. Therefore, we have anonymized the dataset authorship. Once the review process has concluded, we will update the authorship information of this dataset.

    Chinese Chemical Safety Signs (CCSS)

    This dataset is compiled as a benchmark for recognizing chemical safety signs from images. We provide both the dataset and the experimental results at doi:10.5281/zenodo.5482334.

    1. The Dataset

    The complete dataset is contained in the folder ccss/data in archive css_data.zip. The images include signs based on the Chinese standard "Safety Signs and their Application Guidelines" (GB 2894-2008) for safety signs in chemical environments. This standard, in turn, refers to the standards ISO 7010 (Graphical symbols – Safety Colours and Safety Signs – Safety signs used in workplaces and public areas), GB/T 10001 (Public Information Graphic Symbols for Signs), and GB 13495 (Fire Safety Signs).

    1.1. Image Collection

    We collected photos of commonly used chemical safety signs in chemical laboratories and chemical teaching buildings. For a discussion of the standards on which we base our collection, refer to the book "Talking about Hazardous Chemicals and Safety Signs" for common signs, and to the safety signs guidelines (GB 2894-2008).

    The shooting was mainly carried out in 6 locations: on the road, in a parking lot, at construction walls, in a chemical laboratory, outside near big machines, and inside a factory and a corridor.

    Shooting scale: Images in which the signs appear in small, medium and large scales were taken for each location by shooting photos from different distances.

    Shooting light: good lighting conditions and poor lighting conditions were investigated.

    Part of the images contain multiple targets and the other part contains only single signs.

    Under all conditions, a total of 4650 photos were taken as the original data. These were expanded to 27,900 photos via data augmentation. All images are located in folder ccss/data/JPEGImages.

    The file ccss/data/features/enhanced_data_to_original_data.csv provides a mapping between the enhanced image name and the corresponding original image.

    1.2. Annotation and Labelling

    The labelling tool is LabelImg, which uses the PASCAL-VOC labelling format. The annotations are stored in the folder ccss/data/Annotations.

    Faster R-CNN and SSD are two algorithms that use this format. When training YOLOv5, you can run trans_voc2yolo.py to convert the XML file in PASCAL-VOC format to a txt file.

    We provide further meta-information about the dataset in the form of a CSV file, features.csv, which notes, for each image, which other features it has (lighting conditions, scale, multiplicity, etc.).

    1.3. Dataset Features

    As stated above, the images have been shot under different conditions. We provide all the feature information in folder ccss/data/features. For each feature, there is a separate list of file names in that folder. The file ccss/data/features/features_on_original_data.csv is a CSV file which notes all the features of each original image.

    1.4. Dataset Division

    The dataset has a fixed division into training and test sets at a 7:3 ratio. You can find the corresponding image names in the files ccss/data/training_data_file_names.txt and ccss/data/test_data_file_names.txt.

    2. Baseline Experiments

    We provide baseline results with three models: Faster R-CNN, SSD, and YOLOv5. All code and results are given in folder ccss/experiment in archive ccss_experiment.

    2.2. Environment and Configuration

    Single Intel Core i7-8700 CPU

    NVIDIA GTX1060 GPU

    16 GB of RAM

    Python: 3.8.10

    pytorch: 1.9.0

    pycocotools: pycocotools-win

    Visual Studio 2017

    Windows 10

    2.3. Applied Models

    The source codes and results of the applied models is given in folder ccss/experiment with sub-folders corresponding to the model names.

    2.3.1. Faster R-CNN

    backbone: resnet50+fpn.

    we downloaded the pre-training weights from

    we modified the type information of the JSON file to match our application.

    run train_res50_fpn.py

    finally, the weights were trained on the training set.

    backbone: mobilenetv2

    the same training method as for resnet50+fpn was used, but the results were not as good as with resnet50+fpn, so this backbone was discarded.

    The Faster R-CNN source code used in our experiment is given in folder ccss/experiment/sources/faster_rcnn. The weights of the fully-trained Faster R-CNN model are stored in file ccss/experiment/trained_models/faster_rcnn.pth. The performance measurements of Faster R-CNN are stored in folder ccss/experiment/performance_indicators/faster_rcnn.

    2.3.2. SSD

    backbone: resnet50

    we downloaded pre-training weights from

    the same training method as Faster R-CNN is applied.

    The SSD source code used in our experiment is given in folder ccss/experiment/sources/ssd. The weights of the fully-trained SSD model are stored in file ccss/experiment/trained_models/ssd.pth. The performance measurements of SSD are stored in folder ccss/experiment/performance_indicators/ssd.

    2.3.3. YOLOv5

    backbone: CSP_DarkNet

    we modified the type information of the YML file to match our application

    run trans_voc2yolo.py to convert the XML file in VOC format to a txt file.

    the weights used are: yolov5s.

    The YOLOv5 source code used in our experiment is given in folder ccss/experiment/sources/yolov5. The weights of the fully-trained YOLOv5 model are stored in file ccss/experiment/trained_models/yolov5.pt. The performance measurements of YOLOv5 are stored in folder ccss/experiment/performance_indicators/yolov5.

    2.4. Evaluation

    The computed evaluation metrics, as well as the code needed to compute them from our dataset, are provided in the folder ccss/experiment/performance_indicators. They are provided over the complete test set as well as separately for the image features (over the test set).

    3. Code Sources

    Faster R-CNN

    official code:

    SSD

    official code:

    YOLOv5

    We are particularly thankful to the author of the GitHub repository WZMIAOMIAO/deep-learning-for-image-processing (with whom we are not affiliated). Their instructive videos and codes were most helpful during our work. In particular, we based our own experimental codes on his work (and obtained permission to include it in this archive).

    4. Licensing

    While our dataset and results are published under the Creative Commons Attribution 4.0 License, this does not hold for the included code sources. These sources are under the particular license of the repository where they have been obtained from (see Section 3 above).

  17. RFI_AI4QC dataset

    • zenodo.org
    • data.niaid.nih.gov
    Updated Sep 27, 2024
    Cite
    Zenodo (2024). RFI_AI4QC dataset [Dataset]. http://doi.org/10.5281/zenodo.13848248
    Explore at:
    Dataset updated
    Sep 27, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jul 1, 2024
    Description
    This dataset was used in the AI4QC project (Artificial Intelligence for Quality Control), in the context of RFI detection through an object detection task. It consists of a set of labeled RFIs (radio frequency interferences). These interferences are caused by man-made sources and can lead to an artefact in the satellite image, typically a bright rectangular pattern. Bounding boxes were defined around RFI artefacts in 3940 Sentinel-1 quick-looks (png images). A few "other anomalies" were identified as well, leaving a total of 11724 "RFI" bounding boxes and 301 "Other Anomalies" bounding boxes.
    The labeled RFIs are available in three formats: PASCAL VOC (xml files), COCO (json files) and YOLO (txt files). Each is contained in a different zip file. The S1_images zip file contains the 3940 Sentinel-1 quick-looks. One can combine the label files (in a chosen format) with the S1 images to train object detection algorithms to automatically detect RFIs in a satellite image. A predefined train/test split is available (80% training and 20% testing), with training and testing zip folders containing the images and labels for each subset. The data was split according to 3 criteria: RFI over land vs. sea, the size of the RFI bounding boxes, and geographic location.
  18. Dataset of Drawings in Medieval Manuscripts (DMM)

    • fdr.uni-hamburg.de
    png, zip
    Updated Jan 21, 2023
    Cite
    Hussein Mohammed (2023). Dataset of Drawings in Medieval Manuscripts (DMM) [Dataset]. http://doi.org/10.25592/uhhfdm.11236
    Explore at:
    png, zip (available download formats)
    Dataset updated
    Jan 21, 2023
    Dataset provided by
    Universität Hamburg
    Authors
    Hussein Mohammed
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    A subset of 124 images has been selected from the DocExplore dataset in order to create a detection dataset of drawings in medieval manuscripts. Since the original dataset was published without any annotations, we selected and annotated 8 different patterns in the subset, which resulted in a total of 268 annotated instances.

    The dataset has been split into three subsets (train, validation, test) in order to follow the standard procedure used with modern deep-learning models. Furthermore, each image has one corresponding ".xml" file containing the annotation information for the drawings in that image in the Pascal VOC format.

    The research for this work was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany's Excellence Strategy – EXC 2176 ‘Understanding Written Artefacts: Material, Interaction and Transmission in Manuscript Cultures', project no. 390893796. The research was conducted within the scope of the Centre for the Study of Manuscript Cultures (CSMC) at Universität Hamburg.

    In addition, we thank Aneta Yotova for annotating the samples in this dataset.

  19. MangoYOLO data set

    • researchdata.edu.au
    • acquire.cqu.edu.au
    Updated Apr 8, 2021
    Cite
    Z Wang; Kerry Walsh; C McCarthy; Anand Koirala (2021). MangoYOLO data set [Dataset]. https://researchdata.edu.au/mangoyolo-set
    Explore at:
    Dataset updated
    Apr 8, 2021
    Dataset provided by
    Central Queensland University
    Authors
    Z Wang; Kerry Walsh; C McCarthy; Anand Koirala
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Datasets and directories are structured similarly to the PASCAL VOC dataset, avoiding the need to change scripts already available, since detection frameworks are ready to parse PASCAL VOC annotations into their format.

    The sub-directory JPEGImages consists of 1730 images (612x512 pixels) used for training, testing and validation. Each image has at least one annotated fruit. The sub-directory Annotations consists of all the annotation files (a record of the bounding box coordinates for each image) in XML format, each with the same name as its image. The sub-directory Main consists of the text files that contain the image names (without extension) used for training, testing and validation: the training set (train.txt) lists 1300 images, the validation set (val.txt) lists 130 images, and the test set (test.txt) lists 300 images.

    Each image has an XML annotation file (filename = image name), and each image set (training, validation and test) has an associated text file (train.txt, val.txt and test.txt) containing the list of image names to be used for training and testing. The XML annotation file contains the image attributes (name, width, height) and the object attributes (class name, bounding box co-ordinates xmin, ymin, xmax, ymax). (xmin, ymin) and (xmax, ymax) are the pixel co-ordinates of the bounding box's top-left and bottom-right corners respectively.
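
    Since every image is guaranteed to contain at least one annotated fruit, a quick sanity check is to count the boxes in each annotation file. A sketch, assuming the working directory contains the Annotations sub-directory:

    ```python
    # Count annotated fruit per image across all XML annotation files.
    import xml.etree.ElementTree as ET
    from pathlib import Path

    counts = {p.stem: len(ET.parse(p).getroot().findall("object"))
              for p in Path("Annotations").glob("*.xml")}
    print(sum(counts.values()), "boxes across", len(counts), "images;",
          "minimum per image:", min(counts.values()))
    ```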

  20. Tank Detection-Count Dataset

    • data.mendeley.com
    Updated Feb 17, 2020
    Cite
    JAKARIA RABBI (2020). Tank Detection-Count Dataset [Dataset]. http://doi.org/10.17632/bkxj8z84m9.1
    Explore at:
    Dataset updated
    Feb 17, 2020
    Authors
    JAKARIA RABBI
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The Tank Detection and Count Dataset contains 760 satellite image tiles of size 512x512 pixels, where one pixel covers 30cm x 30cm at ground level. Each tile is associated with .xml and .txt files. Both files contain the same annotations of oil/gas tanks but in different formats: the .xml file uses the Pascal VOC format, and in the .txt file every line contains the class of the tank and the four coordinates of the bounding box: xmin, ymin, xmax, ymax.
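
    Given the line format above and the stated 30 cm-per-pixel resolution, each box can be converted to its ground footprint directly. A sketch, assuming whitespace-separated fields and a hypothetical tile name:

    ```python
    # Read a .txt annotation (class xmin ymin xmax ymax per line) and report
    # each tank's ground-level footprint at 30 cm per pixel.
    GSD_M = 0.30  # metres per pixel, as stated in the description

    with open("tile_0001.txt", encoding="utf-8") as f:
        for line in f:
            cls, xmin, ymin, xmax, ymax = line.split()
            w_m = (float(xmax) - float(xmin)) * GSD_M
            h_m = (float(ymax) - float(ymin)) * GSD_M
            print(f"{cls}: {w_m:.1f} m x {h_m:.1f} m on the ground")
    ```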
