33 datasets found
  1. Image datasets for training, validating and testing deep learning models to...

    • figshare.com
    bin
    Updated May 10, 2023
    Cite
    Kazuhide Mimura; Kentaro Nakamura; Kazutaka Yasukawa; Elizabeth C Sibert; Junichiro Ohta; Takahiro Kitazawa; Yasuhiro Kato (2023). Image datasets for training, validating and testing deep learning models to detect microfossil fish teeth and denticles called ichthyolith using YOLOv7 [Dataset]. http://doi.org/10.6084/m9.figshare.22736609.v1
    Explore at:
    bin (available download formats)
    Dataset updated
    May 10, 2023
    Dataset provided by
    figshare
    Authors
    Kazuhide Mimura; Kentaro Nakamura; Kazutaka Yasukawa; Elizabeth C Sibert; Junichiro Ohta; Takahiro Kitazawa; Yasuhiro Kato
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This repository contains datasets to train, validate and test deep learning models to detect microfossil fish teeth and denticles called "ichthyoliths". All datasets contain images of glass slides prepared from deep-sea sediment obtained from the Pacific Ocean, together with annotation label files in YOLO format.

    01_original_all: contains 12,219 images and 6,945 label files; 6,945 images include at least one ichthyolith and 5,274 images include none. These images and label files were randomly split into three subsets: "train" (9,740 images, 5,551 label files), "val" (1,235 images, 695 label files) and "test" (1,244 images, 699 label files). All images were selected manually.

    02_original_selected: generated from 01_original_all by removing images without ichthyoliths. It contains 6,945 images that include at least one ichthyolith and 6,945 label files, split into "train" (5,551 images and label files), "val" (695 images and label files) and "test" (699 images and label files).

    03_extended_all: generated from 01_original_all by adding 4,463 images detected by deep learning models. It contains 16,682 images and 9,473 label files; 9,473 images include at least one ichthyolith and 7,209 images include none. These were split into "train" (13,332 images, 7,594 label files), "val" (1,690 images, 947 label files) and "test" (1,660 images, 932 label files). Label files were checked manually.

    04_extended_selected: generated from 03_extended_all by removing images without ichthyoliths. It contains 9,473 images that include at least one ichthyolith and 9,473 label files, split into "train" (7,594 images and label files), "val" (947 images and label files) and "test" (932 images and label files).
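
    Because every subset pairs images with optional YOLO label files, the counts above can be sanity-checked after download. A minimal sketch, assuming (purely for illustration) that each subset folder contains images/ and labels/ subfolders; adjust the paths and extension to the actual archive layout:

    from pathlib import Path

    root = Path("01_original_all")  # hypothetical extraction path

    for subset in ("train", "val", "test"):
        images = sorted((root / subset / "images").glob("*.jpg"))
        labels = {p.stem for p in (root / subset / "labels").glob("*.txt")}
        with_label = sum(1 for img in images if img.stem in labels)
        # Images without a label file contain no ichthyolith.
        print(f"{subset}: {len(images)} images, {with_label} with labels, "
              f"{len(images) - with_label} without")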

  2. Chemistry Lab Image Dataset Covering 25 Apparatus Categories

    • figshare.com
    application/x-rar
    Updated May 20, 2025
    Cite
    Md. Sakhawat Hossain; Md. Sadman Haque; Md. Mostafizur Rahman; Md. Mosaddik Mashrafi Mousum; Zobaer Ibn Razzaque; Robiul Awoul Robin (2025). Chemistry Lab Image Dataset Covering 25 Apparatus Categories [Dataset]. http://doi.org/10.6084/m9.figshare.29110433.v1
    Explore at:
    application/x-rar (available download formats)
    Dataset updated
    May 20, 2025
    Dataset provided by
    figshare
    Authors
    Md. Sakhawat Hossain; Md. Sadman Haque; Md. Mostafizur Rahman; Md. Mosaddik Mashrafi Mousum; Zobaer Ibn Razzaque; Robiul Awoul Robin
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains 4,599 high-quality, annotated images of 25 commonly used chemistry lab apparatuses. The images show the apparatuses in real-world settings and were captured from different angles, backgrounds, and distances, with varied lighting, to aid the robustness of object detection models. Every image has been labeled with bounding box annotations in YOLO and COCO format, with class IDs and normalized bounding box coordinates. The annotations and bounding boxes were created using the Roboflow platform.

    To support training, the dataset has been split into three sub-datasets: training, validation, and testing. The training set constitutes 70% of the entire dataset, with validation and testing at 20% and 10% respectively. In addition, all images were scaled to a standard 640x640 pixels and auto-oriented to rectify rotation discrepancies caused by EXIF metadata. The dataset is structured in three main folders (train, valid, and test), each containing images/ and labels/ subfolders. Every image has a corresponding label file containing the class and bounding box data for each annotated object.

    The whole dataset features 6,960 labeled instances across 25 apparatus categories, including beakers, conical flasks, measuring cylinders, and test tubes, among others. The dataset can be utilized for the development of automation systems, real-time monitoring and tracking systems, safety monitoring tools, and AI educational tools.
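
    For reference, each line of a YOLO label file stores one object as a class ID followed by normalized center coordinates and box size. A small sketch converting such a line back to pixel coordinates for the 640x640 images described above (the sample line is made up):

    def yolo_to_pixels(line: str, img_w: int = 640, img_h: int = 640):
        """Convert one YOLO label line to (class_id, x_min, y_min, x_max, y_max) in pixels."""
        class_id, xc, yc, w, h = line.split()
        xc, yc = float(xc) * img_w, float(yc) * img_h
        w, h = float(w) * img_w, float(h) * img_h
        return int(class_id), xc - w / 2, yc - h / 2, xc + w / 2, yc + h / 2

    # Example: class 3 centered in the image, covering half of each dimension.
    print(yolo_to_pixels("3 0.5 0.5 0.5 0.5"))  # (3, 160.0, 160.0, 480.0, 480.0)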

  3. MegaWeeds dataset

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Apr 24, 2025
    Cite
    Sophie Wildeboer; Sophie Wildeboer (2025). MegaWeeds dataset [Dataset]. http://doi.org/10.5281/zenodo.8077195
    Explore at:
    zip (available download formats)
    Dataset updated
    Apr 24, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Sophie Wildeboer; Sophie Wildeboer
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The MegaWeeds dataset consists of seven existing datasets:

    - WeedCrop dataset; Sudars, K., Jasko, J., Namatevs, I., Ozola, L., & Badaukis, N. (2020). Dataset of annotated food crops and weed images for robotic computer vision control. Data in Brief, 31, 105833. https://doi.org/10.1016/j.dib.2020.105833

    - Chicory dataset; Gallo, I., Rehman, A. U., Dehkord, R. H., Landro, N., La Grassa, R., & Boschetti, M. (2022). Weed detection by UAV 416a Image Dataset. https://universe.roboflow.com/chicory-crop-weeds-5m7vo/weed-detection-by-uav-416a/dataset/1

    - Sesame dataset; Utsav, P., Raviraj, P., & Rayja, M. (2020). crop and weed detection data with bounding boxes. https://www.kaggle.com/datasets/ravirajsinh45/crop-and-weed-detection-data-with-bounding-boxes

    - Sugar beet dataset; Wangyongkun. (2020). sugarbeetsAndweeds. https://www.kaggle.com/datasets/wangyongkun/sugarbeetsandweeds

    - Weed-Detection-v2; Tandon, K. (2021, June). Weed_Detection_v2. https://www.kaggle.com/datasets/kushagratandon12/weed-detection-v2

    - Maize dataset; Correa, J. M. L., D. Andújar, M. Todeschini, J. Karouta, JM Begochea, & Ribeiro A. (2021). WeedMaize. Zenodo. https://doi.org/10.5281/ZENODO.5106795

    - CottonWeedDet12 dataset; Dang, F., Chen, D., Lu, Y., & Li, Z. (2023). YOLOWeeds: A novel benchmark of YOLO object detectors for multi-class weed detection in cotton production systems. Computers and Electronics in Agriculture, 205, 107655. https://doi.org/10.1016/j.compag.2023.107655

    All of the datasets contain annotated open-field images of crops and weeds. The annotation files were converted to text files so they can be used with the YOLO model. The datasets were combined into one large dataset with 19,317 images in total, which is split into a training and a validation set.

  4. TACO Dataset YOLO Format

    • kaggle.com
    Updated May 25, 2023
    Cite
    Marionette 👺 (2023). TACO Dataset YOLO Format [Dataset]. https://www.kaggle.com/datasets/vencerlanz09/taco-dataset-yolo-format/discussion
    Explore at:
    Croissant (Croissant is a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    May 25, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Marionette 👺
    Description

    The TACO (Trash Annotations in Context) dataset, now made available in YOLO (You Only Look Once) format on Kaggle, is a comprehensive dataset that is designed for the detection and classification of litter (trash). Originally provided by 'Papers with Code', this version has been processed for direct usage in YOLO-based object detection models.

    TACO comprises a diverse range of high-resolution images of various types of litter in different contexts and environments. The dataset encompasses a broad variety of litter categories that are commonly found in our surroundings, making it a valuable asset for training models for environmental cleanup and monitoring purposes.

    Each image in this dataset is associated with a respective annotation file (.txt file), as per the YOLO dataset standard. These annotation files contain the coordinates of bounding boxes for the litter present in the image and the respective classes of this litter. The bounding box annotations are normalized according to the image size, ranging from 0 to 1.

    The primary goal of this dataset is to support the development of robust and accurate object detection models for litter identification and classification. This can help create effective solutions for environmental problems such as pollution and littering, and potentially contribute to the development of automated cleanup systems.

    Although the dataset isn't split into separate training, validation, or testing subsets, users are encouraged to make such divisions as per their model development requirements.
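
    One simple way to create such a split is to shuffle the images once with a fixed seed and copy each image together with its .txt label (when present) into train/val/test folders. A minimal sketch, assuming flat source folders and an 80/10/10 ratio; both the paths and the ratios are illustrative choices rather than part of the dataset:

    import random
    import shutil
    from pathlib import Path

    random.seed(42)                       # fixed seed for a reproducible split
    src = Path("taco")                    # hypothetical folder holding images and .txt labels
    out = Path("taco_split")
    images = sorted(src.glob("*.jpg"))
    random.shuffle(images)

    n = len(images)
    splits = {"train": images[:int(0.8 * n)],
              "val":   images[int(0.8 * n):int(0.9 * n)],
              "test":  images[int(0.9 * n):]}

    for name, files in splits.items():
        for sub in ("images", "labels"):
            (out / name / sub).mkdir(parents=True, exist_ok=True)
        for img in files:
            shutil.copy(img, out / name / "images" / img.name)
            label = img.with_suffix(".txt")
            if label.exists():            # an image may have no annotated litter
                shutil.copy(label, out / name / "labels" / label.name)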

    Please abide by the terms and conditions specified by the original dataset providers when using this dataset. If you find this dataset beneficial for your research or project, do consider citing the original source to acknowledge the creators' efforts.

  5. Side-view-Pigs

    • huggingface.co
    Updated Jun 19, 2025
    Cite
    Mao (2025). Side-view-Pigs [Dataset]. https://huggingface.co/datasets/MingweiMao/Side-view-Pigs
    Explore at:
    Dataset updated
    Jun 19, 2025
    Authors
    Mao
    License

    https://choosealicense.com/licenses/other/

    Description

    This dataset consists of pig farming images captured from a side-view perspective.

    After downloading the dataset, place the images and labels in the 'JPEGImages' and 'Annotations' folders under 'VOCdevkit/VOC2007'. Running 'VOC.py' will categorize the data into training, validation, and test datasets according to specified ratios in VOC format. Running 'voc-yolo.py' will categorize the data into training, validation, and test datasets in YOLO format… See the full description on the dataset page: https://huggingface.co/datasets/MingweiMao/Side-view-Pigs.
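
    The actual split logic lives in the scripts shipped with the dataset; as a rough illustration of what a VOC-style split script typically does (the 80/10/10 ratios and the .jpg extension are assumptions, see 'VOC.py' for the real behaviour), the image IDs are shuffled and written into ImageSets/Main list files:

    import random
    from pathlib import Path

    voc = Path("VOCdevkit/VOC2007")
    ids = sorted(p.stem for p in (voc / "JPEGImages").glob("*.jpg"))
    random.seed(0)
    random.shuffle(ids)

    n = len(ids)
    sets = {"train": ids[:int(0.8 * n)],
            "val":   ids[int(0.8 * n):int(0.9 * n)],
            "test":  ids[int(0.9 * n):]}

    out_dir = voc / "ImageSets" / "Main"
    out_dir.mkdir(parents=True, exist_ok=True)
    for name, subset in sets.items():
        (out_dir / f"{name}.txt").write_text("\n".join(subset) + "\n")
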
    
  6. Cdd Dataset

    • universe.roboflow.com
    zip
    Updated Sep 5, 2023
    Cite
    hakuna matata (2023). Cdd Dataset [Dataset]. https://universe.roboflow.com/hakuna-matata/cdd-g8a6g/3
    Explore at:
    zip (available download formats)
    Dataset updated
    Sep 5, 2023
    Dataset authored and provided by
    hakuna matata
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Cucumber Disease Detection Bounding Boxes
    Description

    Project Documentation: Cucumber Disease Detection

    1. Title and Introduction Title: Cucumber Disease Detection

    Introduction: A machine learning model for the automatic detection of diseases in cucumber plants is to be developed as part of the "Cucumber Disease Detection" project. This research is crucial because it tackles the issue of early disease identification in agriculture, which can increase crop yield and cut down on financial losses. To train and test the model, we use a dataset of pictures of cucumber plants.

    2. Problem Statement Problem Definition: The research uses image analysis methods to address the issue of automating the identification of diseases, including Downy Mildew, in cucumber plants. Effective disease management in agriculture depends on early illness identification.

    Importance: Early disease diagnosis helps minimize crop losses, stop the spread of diseases, and better allocate resources in farming. Agriculture is a real-world application of this concept.

    Goals and Objectives: Develop a machine learning model to classify cucumber plant images into healthy and diseased categories. Achieve a high level of accuracy in disease detection. Provide a tool for farmers to detect diseases early and take appropriate action.

    3. Data Collection and Preprocessing Data Sources: The dataset comprises pictures of cucumber plants from various sources, including both healthy and damaged specimens.

    Data Collection: Using cameras and smartphones, images from agricultural areas were gathered.

    Data Preprocessing: Data cleaning to remove irrelevant or corrupted images. Handling missing values, if any, in the dataset. Removing outliers that may negatively impact model training. Data augmentation techniques applied to increase dataset diversity.

    4. Exploratory Data Analysis (EDA) The dataset was examined using visuals like scatter plots and histograms. The data was examined for patterns, trends, and correlations. Understanding the distribution of photos of healthy and ill plants was made easier by EDA.

    5. Methodology Machine Learning Algorithms:

    Convolutional Neural Networks (CNNs) were chosen for image classification due to their effectiveness in handling image data. Transfer learning using pre-trained models such as ResNet or MobileNet may be considered. Train-Test Split:

    The dataset was split into training and testing sets with a suitable ratio. Cross-validation may be used to assess model performance robustly.

    6. Model Development The CNN model's architecture consists of layers, units, and activation operations. On the basis of experimentation, hyperparameters including learning rate, batch size, and optimizer were chosen. To avoid overfitting, regularization methods like dropout and L2 regularization were used.

    7. Model Training During training, the model was fed the prepared dataset across a number of epochs. The loss function was minimized using an optimization method. To ensure convergence, early stopping and model checkpoints were used.

    8. Model Evaluation Evaluation Metrics:

    Accuracy, precision, recall, F1-score, and confusion matrix were used to assess model performance. Results were computed for both training and test datasets. Performance Discussion:

    The model's performance was analyzed in the context of disease detection in cucumber plants. Strengths and weaknesses of the model were identified.

    9. Results and Discussion Key project findings include model performance and disease detection precision, a comparison of the models employed showing the benefits and drawbacks of each, and the challenges faced throughout the project along with the methods used to solve them.

    10. Conclusion A recap of the project's key learnings, highlighting the project's importance to early disease detection in agriculture. Future enhancements and potential research directions are suggested.

    11. References Libraries: Pillow, Roboflow, YOLO, scikit-learn, Matplotlib. Datasets: https://data.mendeley.com/datasets/y6d3z6f8z9/1

    12. Code Repository https://universe.roboflow.com/hakuna-matata/cdd-g8a6g

    Rafiur Rahman Rafit EWU 2018-3-60-111

  7. Smoke-Fire-Detection-YOLO

    • kaggle.com
    Updated Jan 27, 2025
    Cite
    Sayed Gamal (2025). Smoke-Fire-Detection-YOLO [Dataset]. https://www.kaggle.com/datasets/sayedgamal99/smoke-fire-detection-yolo/code
    Explore at:
    Croissant (Croissant is a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jan 27, 2025
    Dataset provided by
    Kaggle
    Authors
    Sayed Gamal
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    D-Fire Dataset for Smoke and Fire Detection

    This dataset is an enhanced version of the original D-Fire dataset, designed to facilitate smoke and fire detection tasks. It has been restructured to include a validation split, making it more accessible and user-friendly.

    Explore Flare Guard

    Introducing Flare Guard — an advanced, open-source solution for real-time fire and smoke detection.

    This system uses YOLOv11, an advanced object detection model, to monitor live video feeds and detect fire hazards in real-time. Detected threats trigger instant alerts via Telegram and WhatsApp for rapid response.

    🔗 Quick Access Links

    Example of achieved results:

    https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F12748471%2F632cfe5056cc683123c1873547d670ce%2Falert_20250210-034709-167281.jpg?generation=1742122420748481&alt=media

    Directory Structure

    The dataset is organized as follows:

    • train/
      • images/: Training images
      • labels/: Training labels in YOLO format
    • val/
      • images/: Validation images
      • labels/: Validation labels in YOLO format
    • test/
      • images/: Test images
      • labels/: Test labels in YOLO format

    Classes

    The dataset includes annotations for the following classes:

    • 0: Smoke
    • 1: Fire
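
    Given the directory structure and class IDs above, a YOLO training run only needs a small data.yaml pointing at those folders. A minimal sketch that writes one (the dataset root path is an assumption; adjust it to wherever the dataset is unpacked):

    from pathlib import Path
    from textwrap import dedent

    yaml_text = dedent("""\
        path: smoke-fire-detection-yolo   # assumed extraction folder
        train: train/images
        val: val/images
        test: test/images

        nc: 2
        names: ['Smoke', 'Fire']
    """)
    Path("data.yaml").write_text(yaml_text)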

    Dataset Statistics

    The dataset comprises over 21,000 images, categorized as follows:

    • Only fire: 1,164 images
    • Only smoke: 5,867 images
    • Fire and smoke: 4,658 images
    • None: 9,838 images

    Total bounding boxes:

    • Fire: 14,692
    • Smoke: 11,865

    Data Splits

    The dataset is divided into training, validation, and test sets to support model development and evaluation.

    Citation

    If you use this dataset in your research or projects, please cite the original paper:

    Pedro Vinícius Almeida Borges de Venâncio, Adriano Chaves Lisboa, Adriano Vilela Barbosa. "An automatic fire detection system based on deep convolutional neural networks for low-power, resource-constrained devices." Neural Computing and Applications, vol. 34, no. 18, 2022, pp. 15349–15368. DOI: 10.1007/s00521-022-07467-z.

    Acknowledgments

    Credit for the original dataset goes to the researchers from Gaia, solutions on demand (GAIA). The original dataset and more information can be found in the D-Fire GitHub repository.

  8. Accident Detection Model Dataset

    • universe.roboflow.com
    zip
    Updated Apr 8, 2024
    Cite
    Accident detection model (2024). Accident Detection Model Dataset [Dataset]. https://universe.roboflow.com/accident-detection-model/accident-detection-model/model/1
    Explore at:
    zip (available download formats)
    Dataset updated
    Apr 8, 2024
    Dataset authored and provided by
    Accident detection model
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Accident Bounding Boxes
    Description

    Accident-Detection-Model

    The Accident Detection Model is built using YOLOv8, Google Colab, Python, Roboflow, deep learning, OpenCV, machine learning, and artificial intelligence. It can detect an accident from a live camera feed, an image, or a video. The model is trained on a dataset of 3,200+ images, which were annotated on Roboflow.

    Problem Statement

    • Road accidents are a major problem in India, with thousands of people losing their lives and many more suffering serious injuries every year.
    • According to the Ministry of Road Transport and Highways, India witnessed around 4.5 lakh road accidents in 2019, which resulted in the deaths of more than 1.5 lakh people.
    • The age range that is most severely hit by road accidents is 18 to 45 years old, which accounts for almost 67 percent of all accidental deaths.

    Accidents survey

    https://user-images.githubusercontent.com/78155393/233774342-287492bb-26c1-4acf-bc2c-9462e97a03ca.png

    Literature Survey

    • Sreyan Ghosh (Mar 2019): developed a system using a deep convolutional neural network trained to classify video frames as accident or non-accident.
    • Deeksha Gour (Sep 2019): uses computer vision, neural networks, deep learning, and various approaches and algorithms to detect objects.

    Research Gap

    • Lack of real-world data - we trained the model on more than 3,200 images.
    • Large interpretability time and space needed - we use Google Colab to reduce the time and space required.
    • Outdated versions in previous works - we are using the latest version of YOLOv8.

    Proposed methodology

    • We are using YOLOv8 to train on our custom dataset, which contains 3,200+ images collected from different platforms.
    • After training for 25 iterations, the model is ready to detect an accident with a significant probability.

    Model Set-up

    Preparing Custom dataset

    • We have collected 1,200+ images from different sources like YouTube, Google Images, Kaggle.com, etc.
    • Then we annotated all of them individually on a tool called Roboflow.
    • During annotation we marked the images with no accident as NULL and drew a box around the site of the accident in the images containing one.
    • Then we divided the dataset into train, val, and test in the ratio of 8:1:1.
    • At the final step we downloaded the dataset in YOLOv8 format.
      #### Using Google Colab
    • We are using Google Colaboratory to code this model because Colab provides a GPU, which is faster than local environments.
    • You can use Jupyter notebooks, which let you blend code, text, and visualisations in a single document, to write and run Python code using Google Colab.
    • Users can run individual code cells in Jupyter Notebooks and quickly view the results, which is helpful for experimenting and debugging. Additionally, they enable the development of visualisations that make use of well-known frameworks like Matplotlib, Seaborn, and Plotly.
    • In Google Colab, we first changed the runtime from TPU to GPU.
    • We cross checked it by running command ‘!nvidia-smi’
      #### Coding
    • First of all, we installed YOLOv8 with the command ‘!pip install ultralytics==8.0.20’
    • Then we verified the installation with ‘from ultralytics import YOLO’ and ‘from IPython.display import display, Image’
    • Then we connected and mounted our google drive account by the code ‘from google.colab import drive drive.mount('/content/drive')’
    • Then we ran our main command to run the training process ‘%cd /content/drive/MyDrive/Accident Detection model !yolo task=detect mode=train model=yolov8s.pt data= data.yaml epochs=1 imgsz=640 plots=True’
    • After the training we ran command to test and validate our model ‘!yolo task=detect mode=val model=runs/detect/train/weights/best.pt data=data.yaml’ ‘!yolo task=detect mode=predict model=runs/detect/train/weights/best.pt conf=0.25 source=data/test/images’
    • Further to get result from any video or image we ran this command ‘!yolo task=detect mode=predict model=runs/detect/train/weights/best.pt source="/content/drive/MyDrive/Accident-Detection-model/data/testing1.jpg/mp4"’
    • The results are stored in the runs/detect/predict folder.
      Hence our model is trained, validated and tested to be able to detect accidents on any video or image.
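
    The same workflow can also be driven through the ultralytics Python API instead of the shell commands above; a rough sketch (the paths follow the commands above and may need adjusting to your Drive layout):

    from ultralytics import YOLO

    model = YOLO("yolov8s.pt")                                      # pretrained starting weights
    model.train(data="data.yaml", epochs=1, imgsz=640, plots=True)  # train on the custom dataset

    best = YOLO("runs/detect/train/weights/best.pt")                # best checkpoint from training
    best.val(data="data.yaml")                                      # validate on the val split
    best.predict(source="data/test/images", conf=0.25, save=True)   # save=True writes results to runs/detect/predict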

    Challenges I ran into

    I ran into three main problems while making this model:

    • I had difficulty saving the results in a folder; as YOLOv8 is the latest version, it is still under development. I read some blogs and referred to Stack Overflow, and learned that in the new v8 we need to add an extra argument, ''save=true'', which makes the results save to a folder.
    • I was facing a problem on the CVAT website because I was not sure what
  9. An open flame and smoke detection dataset for deep learning in remote...

    • scidb.cn
    Updated Aug 2, 2022
    Cite
    Ming Wang; Peng Yue; Liangcun Jiang; Dayu Yu; Tianyu Tuo (2022). An open flame and smoke detection dataset for deep learning in remote sensing based fire detection [Dataset]. http://doi.org/10.57760/sciencedb.j00104.00103
    Explore at:
    Croissant (Croissant is a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Aug 2, 2022
    Dataset provided by
    Science Data Bank
    Authors
    Ming Wang; Peng Yue; Liangcun Jiang; Dayu Yu; Tianyu Tuo
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    FASDD is the largest and most generalized Flame And Smoke Detection Dataset for object detection tasks, characterized by highly complex fire scenes, strongly heterogeneous feature distributions, and significant variations in image size and shape. FASDD serves as a benchmark for developing advanced fire detection models, which can be deployed on watchtowers, drones, or satellites in a space-air-ground integrated observation network for collaborative fire warning. This endeavor provides valuable insights for government decision-making and fire rescue operations. FASDD contains fire, smoke, and confusing non-fire/non-smoke images acquired at different distances (near and far), in different scenes (indoor and outdoor), under different light intensities (day and night), and from various visual sensors (surveillance cameras, UAVs, and satellites).

    FASDD consists of three sub-datasets: a Computer Vision (CV) dataset (FASDD_CV), an Unmanned Aerial Vehicle (UAV) dataset (FASDD_UAV), and a Remote Sensing (RS) dataset (FASDD_RS). FASDD comprises 122,634 samples, with 70,581 annotated as positive samples and 52,073 labeled as negative samples. There are 113,154 instances of flame objects and 73,072 instances of smoke objects in the entire dataset. FASDD_CV contains 95,314 samples for general computer vision, FASDD_UAV consists of 25,097 samples captured by UAV, and FASDD_RS comprises 2,223 samples from satellite imagery. FASDD_CV contains 73,297 fire instances and 53,080 smoke instances. The CV dataset exhibits considerable variation in image size, ranging from 78 to 10,600 pixels in width and 68 to 8,858 pixels in height; the aspect ratios of the images also vary significantly, from 1:6.6 to 1:0.18. FASDD_UAV contains 36,308 fire instances and 17,222 smoke instances, with image aspect ratios primarily between 4:3 and 16:9. In FASDD_RS, there are 2,770 smoke instances and 3,549 flame instances, and the remote sensing images are predominantly around 1,000×1,000 pixels.

    FASDD is provided in three compressed files, FASDD_CV.zip, FASDD_UAV.zip, and FASDD_RS.zip, corresponding to the CV, UAV, and RS datasets respectively. Additionally, a FASDD_RS_SWIR.zip folder stores pseudo-color images for detecting flame objects in remote sensing imagery. Each zip file contains two folders: "images" for the source data and "annotations" for the labels. The "annotations" folder provides label files in four formats: YOLO, VOC, COCO, and TDML. Within each label format, the dataset is divided randomly into training, validation, and test sets with ratios of 1/2, 1/3, and 1/6, respectively. In FASDD_CV, FASDD_UAV, and FASDD_RS, images and their corresponding annotation files are individually numbered starting from 0. The flame and smoke objects in FASDD are given the labels "fire" and "smoke" for the object detection task, respectively. The names of all images and annotation files are prefixed with "Fire", "Smoke", "FireAndSmoke", and "NeitherFireNorSmoke", representing different categories for scene classification tasks.

    When using this dataset, please cite the following paper: Wang, M., Yue, P., Jiang, L., Yu, D., Tuo, T., & Li, J. (2024). An open flame and smoke detection dataset for deep learning in remote sensing based fire detection. Geo-spatial Information Science, 1-16.

  10. Camellia oleifera fruit posture detection dataset

    • data.mendeley.com
    Updated Mar 4, 2025
    Cite
    Shouxiang Jin (2025). Camellia oleifera fruit posture detection dataset [Dataset]. http://doi.org/10.17632/65mwb97zwz.1
    Explore at:
    Dataset updated
    Mar 4, 2025
    Authors
    Shouxiang Jin
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The dataset consists of Camellia oleifera fruit images and labels used for YOLO detection model training. The dataset is divided into 3 sub-datasets, of which 2016 samples are ‘training’ samples, 672 samples are ‘validation’ samples, and 671 samples are ‘test’ samples. The labels are in the format required for YOLO11 detection.

  11. Risk Detection 1 Dataset

    • universe.roboflow.com
    zip
    Updated Dec 14, 2024
    Cite
    PBL5MU (2024). Risk Detection 1 Dataset [Dataset]. https://universe.roboflow.com/pbl5mu/risk-detection-1/dataset/2
    Explore at:
    zip (available download formats)
    Dataset updated
    Dec 14, 2024
    Dataset authored and provided by
    PBL5MU
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Crosswalk Bounding Boxes
    Description

    Risk Detection

    Dataset for Training YOLOv11 AI Model to Detect Risky Objects for Assistive Guidance.

    Objective

    The goal of this project is to create a comprehensive dataset to train an AI model using YOLOv11 (You Only Look Once version 11). The model will detect and identify "risky objects" that blind and visually impaired individuals may encounter in indoor and outdoor environments. This dataset serves as the foundation for an assistive technology tool designed to enhance mobility and safety by providing real-time object detection and guidance.

    Dataset Overview

    Targeted Objects

    Objects identified as potentially risky were selected through research and user studies. The dataset focuses on items that could obstruct paths, pose tripping hazards, or cause injury if unnoticed.

    Examples include:
    • Outdoor Risks:
      • Vehicles
      • Bicycles
      • Potholes
      • Curbs
      • Barriers
      • People

    • Indoor Risks:
      • Chairs
      • Tables
      • Shelves
      • Furniture (e.g., cabinet, closet)

    Key Features of the Dataset

    • Comprehensive Annotations:
      Every image is annotated with bounding boxes and labeled with object categories.
    • Diverse Scenarios:
      Images are captured under varied lighting conditions, environments (urban, suburban, rural), and angles to ensure robustness.

    Dataset Structure

    1. Total Images: x labeled images.
    2. Categories: x object classes identified as risky.
    3. Formats: Compatible with YOLO input formats and others.

    Benefits of the AI Model

    Real-Time Hazard Detection

    The YOLOv11 model will process visual data from a wearable or smartphone camera, identifying and alerting the user to risks in real-time.

    Improved Independence

    By providing proactive guidance, the system empowers blind and visually impaired individuals to navigate more independently and safely.

    Technical Details

    Model Training

    • YOLOv11 architecture optimized for high accuracy and real-time performance.
    • The dataset will be split into:
      1. Training (70%)
      2. Validation (20%)
      3. Testing (10%)

    Augmentation Techniques

    • Data Augmentation:
      Techniques such as rotation, scaling, and brightness adjustments to increase dataset robustness.
    • Simulated Obstacles:
      Simulations of real-world obstacles (e.g., occlusion, partial object visibility).

    Evaluation Metrics

    • Precision
    • Recall
    • F1-Score
    • mAP (mean Average Precision)
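
    Precision, recall, and F1 follow directly from the true positive, false positive, and false negative counts gathered during evaluation; a minimal illustration with made-up counts:

    def precision_recall_f1(tp: int, fp: int, fn: int):
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        return precision, recall, f1

    # Example with made-up detection counts:
    print(precision_recall_f1(tp=90, fp=10, fn=30))  # (0.9, 0.75, 0.818...)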

    Conclusion

    This project aims to leverage advanced AI technology to address the unique challenges faced by blind and visually impaired individuals. By creating a specialized dataset for training YOLOv11, the model can detect risky objects with high precision, enhancing safety and mobility. The ultimate outcome is an AI-powered assistive system that provides greater independence and confidence to its users in their everyday lives.

    Credits and Acknowledgments

    This project incorporates images from the following public datasets. We extend our gratitude to the creators and contributors of these datasets for making their work freely available to the research community:

    1. Dataset Name 1
      • Description of the dataset (e.g., type of images, purpose).
      • License: CC BY 4.0.

    We adhere to the terms and conditions of these datasets' licenses and greatly appreciate their contribution to advancing research in AI and assistive technologies.

  12. Personal Protective Equipment Dataset (PPED)

    • data.niaid.nih.gov
    • zenodo.org
    Updated May 17, 2022
    Cite
    Anonymous (2022). Personal Protective Equipment Dataset (PPED) [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6551757
    Explore at:
    Dataset updated
    May 17, 2022
    Dataset authored and provided by
    Anonymous
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Personal Protective Equipment Dataset (PPED)

    This dataset serves as a benchmark for PPE detection in chemical plants. We provide the dataset and experimental results.

    1. The dataset

    We produced a data set based on the actual needs and relevant regulations in chemical plants. The standard GB 39800.1-2020 formulated by the Ministry of Emergency Management of the People’s Republic of China defines the protective requirements for plants and chemical laboratories. The complete dataset is contained in the folder PPED/data.

    1.1. Image collection

    We took more than 3300 pictures. We set the following different characteristics, including different environments, different distances, different lighting conditions, different angles, and the diversity of the number of people photographed.

    Backgrounds: There are 4 backgrounds, including office, near machines, factory and regular outdoor scenes.

    Scale: By taking pictures from different distances, the captured PPEs are classified in small, medium and large scales.

    Light: Good lighting conditions and poor lighting conditions were studied.

    Diversity: Some images contain a single person, and some contain multiple people.

    Angle: The pictures we took can be divided into front and side.

    A total of more than 3300 photos were taken in the raw data under all conditions. All images are located in the folder “PPED/data/JPEGImages”.

    1.2. Label

    We use LabelImg as the labeling tool, with labels in the PASCAL-VOC format. YOLO uses the txt format, so trans_voc2yolo.py can be used to convert the PASCAL-VOC XML files to txt files. Annotations are stored in the folder PPED/data/Annotations.
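
    The conversion performed by trans_voc2yolo.py is the standard PASCAL-VOC-to-YOLO transformation. A minimal sketch of that conversion for a single XML file (the class list and output folder are placeholders, not the repository's exact script):

    import xml.etree.ElementTree as ET
    from pathlib import Path

    CLASSES = ["helmet", "gloves"]  # placeholder names; use the actual PPED class list

    def voc_to_yolo(xml_path: str, out_dir: str = "labels") -> None:
        root = ET.parse(xml_path).getroot()
        w = float(root.find("size/width").text)
        h = float(root.find("size/height").text)
        lines = []
        for obj in root.iter("object"):
            cls = CLASSES.index(obj.find("name").text)
            box = obj.find("bndbox")
            xmin, ymin = float(box.find("xmin").text), float(box.find("ymin").text)
            xmax, ymax = float(box.find("xmax").text), float(box.find("ymax").text)
            # YOLO line: class x_center y_center width height, all normalized to [0, 1]
            lines.append(f"{cls} {(xmin + xmax) / 2 / w:.6f} {(ymin + ymax) / 2 / h:.6f} "
                         f"{(xmax - xmin) / w:.6f} {(ymax - ymin) / h:.6f}")
        out = Path(out_dir) / (Path(xml_path).stem + ".txt")
        out.parent.mkdir(parents=True, exist_ok=True)
        out.write_text("\n".join(lines) + "\n")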

    1.3. Dataset Features

    The pictures were taken by us under the different conditions mentioned above. The file PPED/data/feature.csv is a CSV file that indexes all of the images; it records every feature of each picture, including lighting conditions, angles, backgrounds, number of people and scale.

    1.4. Dataset Division

    The dataset is divided into a training set and a test set at a ratio of 9:1.

    2. Baseline Experiments

    We provide baseline results with five models, namely Faster R-CNN (R), Faster R-CNN (M), SSD, YOLOv3-spp, and YOLOv5. All code and results are given in the folder PPED/experiment.

    2.1. Environment and Configuration:

    Intel Core i7-8700 CPU

    NVIDIA GTX1060 GPU

    16 GB of RAM

    Python: 3.8.10

    pytorch: 1.9.0

    pycocotools: pycocotools-win

    Windows 10

    2.2. Applied Models

    The source codes and results of the applied models is given in folder PPED/experiment with sub-folders corresponding to the model names.

    2.2.1. Faster R-CNN

    Faster R-CNN

    backbone: resnet50+fpn

    We downloaded the pre-training weights from https://download.pytorch.org/models/fasterrcnn_resnet50_fpn_coco-258fb6c6.pth.

    We modified the dataset path, training classes and training parameters including batch size.

    We run train_res50_fpn.py to start training.

    Then, the weights are trained by the training set.

    Finally, we validate the results on the test set.

    backbone: mobilenetv2

    the same training method as resnet50+fpn, but the effect is not as good as resnet50+fpn, so it is directly discarded.

    The Faster R-CNN source code used in our experiment is given in the folder PPED/experiment/Faster R-CNN. The weights of the fully-trained Faster R-CNN (R) and Faster R-CNN (M) models are stored in the files PPED/experiment/trained_models/resNetFpn-model-19.pth and mobile-model.pth. The performance measurements of Faster R-CNN (R) and Faster R-CNN (M) are stored in the folders PPED/experiment/results/Faster RCNN(R) and Faster RCNN(M).

    2.2.2. SSD

    backbone: resnet50

    We downloaded pre-training weights from https://download.pytorch.org/models/resnet50-19c8e357.pth.

    The same training method as Faster R-CNN is applied.

    The SSD source code used in our experiment is given in folder PPED/experiment/ssd. The weights of the fully-trained SSD model are stored in file PPED/experiment/trained_models/SSD_19.pth. The performance measurements of SSD are stored in folder PPED/experiment/results/SSD.

    2.2.3. YOLOv3-spp

    backbone: DarkNet53

    We modified the type information of the XML file to match our application.

    We run trans_voc2yolo.py to convert the XML file in VOC format to a txt file.

    The weights used are: yolov3-spp-ultralytics-608.pt.

    The YOLOv3-spp source code used in our experiment is given in folder PPED/experiment/YOLOv3-spp. The weights of the fully-trained YOLOv3-spp model are stored in file PPED/experiment/trained_models/YOLOvspp-19.pt. The performance measurements of YOLOv3-spp are stored in folder PPED/experiment/results/YOLOv3-spp.

    2.2.4. YOLOv5

    backbone: CSP_DarkNet

    We modified the type information of the XML file to match our application.

    We run trans_voc2yolo.py to convert the XML file in VOC format to a txt file.

    The weights used are: yolov5s.

    The YOLOv5 source code used in our experiment is given in folder PPED/experiment/yolov5. The weights of the fully-trained YOLOv5 model are stored in file PPED/experiment/trained_models/YOLOv5.pt. The performance measurements of YOLOv5 are stored in folder PPED/experiment/results/YOLOv5.

    2.3. Evaluation

    The computed evaluation metrics as well as the code needed to compute them from our dataset are provided in the folder PPED/experiment/eval.

    3. Code Sources

    Faster R-CNN (R and M)

    https://github.com/WZMIAOMIAO/deep-learning-for-image-processing/tree/master/pytorch_object_detection/faster_rcnn

    official code: https://github.com/pytorch/vision/blob/main/torchvision/models/detection/faster_rcnn.py

    SSD

    https://github.com/WZMIAOMIAO/deep-learning-for-image-processing/tree/master/pytorch_object_detection/ssd

    official code: https://github.com/pytorch/vision/blob/main/torchvision/models/detection/ssd.py

    YOLOv3-spp

    https://github.com/WZMIAOMIAO/deep-learning-for-image-processing/tree/master/pytorch_object_detection/yolov3-spp

    YOLOv5

    https://github.com/ultralytics/yolov5

  13. Afo Aerial Of Floating Object Dataset

    • universe.roboflow.com
    zip
    Updated Oct 9, 2023
    + more versions
    Cite
    Large Benchmark Datasets (2023). Afo Aerial Of Floating Object Dataset [Dataset]. https://universe.roboflow.com/large-benchmark-datasets/afo-aerial-dataset-of-floating-object
    Explore at:
    zip (available download formats)
    Dataset updated
    Oct 9, 2023
    Dataset authored and provided by
    Large Benchmark Datasets
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Documents Bounding Boxes
    Description

    The AFO dataset is the first free dataset for training machine learning and deep learning models for maritime Search and Rescue applications. It contains aerial-drone footage with 40,000 hand-annotated persons and objects floating in the water, many of them small, which makes them difficult to detect.

    The AFO dataset contains images taken from fifty video clips of objects floating on the water surface, captured by various drone-mounted cameras (at resolutions from 1280x720 to 3840x2160). From these videos, 3,647 manually annotated images containing 39,991 objects were extracted. These were then split into three parts: the training set (67.4% of objects), the test set (19.12% of objects), and the validation set (13.48% of objects). In order to prevent overfitting of the model to the given data, the test set contains selected frames from nine videos that were not used in either the training or validation sets.

  14. Wider Face Dataset

    • universe.roboflow.com
    • tensorflow.org
    • +3more
    zip
    Updated Apr 3, 2025
    Cite
    Large Benchmark Datasets (2025). Wider Face Dataset [Dataset]. https://universe.roboflow.com/large-benchmark-datasets/wider-face-ndtcz/model/1
    Explore at:
    zip (available download formats)
    Dataset updated
    Apr 3, 2025
    Dataset authored and provided by
    Large Benchmark Datasets
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Aerial Bounding Boxes
    Description

    WIDER FACE: Multimedia Laboratory, Department of Information Engineering, The Chinese University of Hong Kong

    The WIDER FACE dataset is a face detection benchmark dataset whose images are selected from the publicly available WIDER dataset from the Multimedia Laboratory, Department of Information Engineering, The Chinese University of Hong Kong. The dataset consists of 32,203 images and labels 393,703 faces with a high degree of variability in scale, pose and occlusion, as depicted in the sample images. The WIDER FACE dataset is organized based on 61 event classes. For each event class, 40%/10%/50% of the data are randomly selected as training, validation and testing sets.

  15. Egg-Instance-Segmentation

    • huggingface.co
    Updated Mar 26, 2025
    Cite
    Afshin Dini (2025). Egg-Instance-Segmentation [Dataset]. https://huggingface.co/datasets/afshin-dini/Egg-Instance-Segmentation
    Explore at:
    Dataset updated
    Mar 26, 2025
    Authors
    Afshin Dini
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Egg Instance Segmentation

    This is a dataset of egg images that can be used for egg segmentation purposes. The dataset is divided into two classes, white-egg and brown-egg, and is provided in YOLO format. The training and validation images are in the train and val folders respectively, and the polygon annotations specifying the exact boundaries of the eggs are in the corresponding labels folders.

    Goal

    This dataset is collected to train a YOLO model to segment different types of eggs… See the full description on the dataset page: https://huggingface.co/datasets/afshin-dini/Egg-Instance-Segmentation.

  16. Data from: People Detection Dataset

    • kaggle.com
    Updated Jun 15, 2025
    Cite
    Adil Shamim (2025). People Detection Dataset [Dataset]. https://www.kaggle.com/datasets/adilshamim8/people-detection
    Explore at:
    Croissant (Croissant is a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jun 15, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Adil Shamim
    License

    Attribution-NoDerivs 4.0 (CC BY-ND 4.0): https://creativecommons.org/licenses/by-nd/4.0/
    License information was derived automatically

    Description

    Give Machines the Power to See People.

    This isn’t just a dataset — it’s a foundation for building the future of human-aware technology. Carefully crafted and annotated with precision, the People Detection dataset enables AI systems to recognize and understand human presence in dynamic, real-world environments.

    Whether you’re building smart surveillance, autonomous vehicles, crowd analytics, or next-gen robotics, this dataset gives your model the eyes it needs.

    What Makes This Dataset Different?

    • Real-World Images – Diverse environments, realistic lighting, and real human motion
    • High-Quality Annotations – Every person labeled with clean YOLO-format bounding boxes
    • Plug-and-Play – Comes with pre-split training, validation, and test sets — no extra prep needed
    • Speed-Optimized – Perfect for real-time object detection applications

    Built for Visionaries

    • Detect people instantly — in cities, offices, or crowds
    • Build systems that respond to human presence
    • Train intelligent agents to navigate human spaces safely and smartly

    Created using Roboflow. Optimized for clarity, performance, and scale. Source Dataset on Roboflow →

    This is more than a dataset. It’s a step toward a smarter world — One where machines can understand people.

  17. Spatio-Temporal Vehicle Detection Dataset (STVD)

    • zenodo.org
    zip
    Updated Sep 10, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Kristina Telegraph; Christos Kyrkou; Christos Kyrkou; Kristina Telegraph (2024). Spatio-Temporal Vehicle Detection Dataset (STVD) [Dataset]. http://doi.org/10.5281/zenodo.11468690
    Explore at:
    zip (available download formats)
    Dataset updated
    Sep 10, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Kristina Telegraph; Christos Kyrkou; Christos Kyrkou; Kristina Telegraph
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    A dataset suitable for spatiotemporal object detection was constructed using several aerial video clips of traffic on different road segments in Nicosia, Cyprus, captured using UAVs, rather than single areas in low-resolution satellite images as in other datasets.

    By compiling multiple sequences of images extracted from these videos, the dataset accumulates a substantial corpus of 6,600 frames. It encapsulates 3 classes, 'car', 'truck' and 'bus', with a distribution of 81,165, 1,541, and 1,625 instances respectively when only the even-frame annotations are used, which approximately doubles when considering the entire dataset. An additional challenge of the dataset that mirrors real-world application is that the classes are not balanced: there is a significantly larger number of cars compared to trucks and buses, as in a regular transportation network. The images have Full-HD resolution, with object sizes approximately between 20x20 and 150x150 pixels. The dataset was prepared in the YOLO format and split into 80% for training and the remaining 20% for validation. The importance of such a dataset lies in its capability to encapsulate both spatial and temporal nuances. Frames belonging to the same continuous sequence are marked as such, so the dataset can potentially be used to develop approaches that operate on multiple sequential frames for object detection by sampling a number of frames from the same sequence.

    Dataset features:

    • Total Images: ~6,600
    • Image Size: 1920x1080
    • Classes: Car, Bus, Truck
    • Data Collection: Collected from UAVs at different locations in Nicosia, Cyprus
    • Data Format: PNG
    • Labelling Format: YOLO

  18. YOLO v5 format of the Traffic Signs dataset

    • kaggle.com
    Updated Nov 28, 2023
    Cite
    Valentyn Sichkar (2023). YOLO v5 format of the Traffic Signs dataset [Dataset]. http://doi.org/10.34740/kaggle/ds/4059603
    Explore at:
    Croissant (Croissant is a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Nov 28, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Valentyn Sichkar
    Description

    🚩 Details

    The dataset includes image files and appropriate annotations to train a YOLO v5 detector. It is separated into two versions: (1) with 4 classes only, and (2) with all 43 classes.

    Before training, edit the dataset.yaml file and specify the appropriate path there 👇

    # The root directory of the dataset
    # (!) Update the root path according to your location
    path: ..\..\Downloads\ts_yolo_v5_format\ts4classes
    
    train: images\train\   # train images (relative to 'path')
    val: images\validation\  # val images (relative to 'path')
    test: images\test\    # test images (relative to 'path')
    
    # Number of classes and their names
    nc: 4
    names: [ 'prohibitory', 'danger', 'mandatory', 'other']
    


    🎥 Watch video about YOLO format 👇

    https://www.youtube.com/watch?v=-bU0ZBbG8l4


    🎓 YOLO v5: Label, Train and Test. Join the course! 👇

    https://www.udemy.com/course/yolo-v5-label-train-and-test

    Have a look at the abilities that you will obtain:
    📢 Run YOLO v5 to detect objects on image, video and in real time by camera in the first lectures.
    📢 Label-Create-Convert own dataset in YOLO format.
    📢 Train & Test both: in your local machine and in the cloud machine (with custom data and by few lines of the code).


    Concept map of the YOLO v5 course 👇

    https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F3400968%2Fac1893f68be61efb21e376b3c405147c%2Fconcept_map_YOLO_v5.png?generation=1701165575909796&alt=media

    Join the course! 👇

    https://www.udemy.com/course/yolo-v5-label-train-and-test


    Acknowledgements

    The initial data comes from The German Traffic Sign Recognition Benchmark (GTSRB).

  19. Airplane yolo annotated

    • kaggle.com
    Updated Mar 6, 2025
    Cite
    samuel ayman (2025). Airplane yolo annotated [Dataset]. https://www.kaggle.com/datasets/samuelayman/airplane-yolo-annotated/versions/1
    Explore at:
    Croissant (Croissant is a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Mar 6, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    samuel ayman
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    • Dataset Composition:

    – 1000 images, each containing airplane instances.

    – Annotations in YOLO’s bounding box format (class index, x_center, y_center, width, height).

    • Data Split:

    – Training Set: 75% (750 images) for model learning.

    – Validation Set: 20% (200 images) for tuning hyperparameters and checking for overfitting.

    – Test Set: 5% (50 images) for final performance evaluation.

    This arrangement helps ensure effective training, validation, and unbiased testing of the airplane detection model.

  20. Multi-Altitude Aerial Vehicles Dataset

    • data.niaid.nih.gov
    Updated Apr 5, 2023
    Cite
    Panayiotis Kolios (2023). Multi-Altitude Aerial Vehicles Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7736335
    Explore at:
    Dataset updated
    Apr 5, 2023
    Dataset provided by
    Panayiotis Kolios
    Rafael Makrigiorgis
    Christos Kyrkou
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Custom Multi-Altitude Aerial Vehicles Dataset:

    Created for publishing the results of the ICUAS 2023 paper "How High can you Detect? Improved accuracy and efficiency at varying altitudes for Aerial Vehicle Detection"; the abstract of the paper follows.

    Abstract—Object detection in aerial images is a challenging task mainly because of two factors, the objects of interest being really small, e.g. people or vehicles, making them indistinguishable from the background; and the features of objects being quite different at various altitudes. Especially, when utilizing Unmanned Aerial Vehicles (UAVs) to capture footage, the need for increased altitude to capture a larger field of view is quite high. In this paper, we investigate how to find the best solution for detecting vehicles in various altitudes, while utilizing a single CNN model. The conditions for choosing the best solution are the following; higher accuracy for most of the altitudes and real-time processing ( > 20 Frames per second (FPS) ) on an Nvidia Jetson Xavier NX embedded device. We collected footage of moving vehicles from altitudes of 50-500 meters with a 50-meter interval, including a roundabout and rooftop objects as noise for high altitude challenges. Then, a YoloV7 model was trained on each dataset of each altitude along with a dataset including all the images from all the altitudes. Finally, by conducting several training and evaluation experiments and image resizes we have chosen the best method of training objects on multiple altitudes to be the mixup dataset with all the altitudes, trained on a higher image size resolution, and then performing the detection using a smaller image resize to reduce the inference performance. The main results

    The creation of a custom dataset was necessary for altitude evaluation as no other datasets were available. To fulfill the requirements, the footage was captured using a small UAV hovering above a roundabout near the University of Cyprus campus, where several structures and buildings with solar panels and water tanks were visible at varying altitudes. The data were captured during a sunny day, ensuring bright and shadowless images. Images were extracted from the footage, and all data were annotated with a single class labeled as 'Car'. The dataset covered altitudes ranging from 50 to 500 meters with a 50-meter step, and all images were kept at their original high resolution of 3840x2160, presenting challenges for object detection. The data were split into 3 sets for training, validation, and testing, with the number of vehicles increasing as altitude increased, which was expected due to the larger field of view of the camera. Each folder consists of an aerial vehicle dataset captured at the corresponding altitude. For each altitude, the dataset annotations are generated in YOLO, COCO, and VOC formats. The dataset consists of the following images and detection objects:

    Data      Subset    Images    Cars
    50m       Train     130       269
    50m       Test      32        66
    50m       Valid     33        73
    100m      Train     246       937
    100m      Test      61        226
    100m      Valid     62        250
    150m      Train     244       1691
    150m      Test      61        453
    150m      Valid     61        426
    200m      Train     246       1753
    200m      Test      61        445
    200m      Valid     62        424
    250m      Train     245       3326
    250m      Test      61        821
    250m      Valid     61        823
    300m      Train     246       6250
    300m      Test      61        1553
    300m      Valid     62        1585
    350m      Train     246       10741
    350m      Test      61        2591
    350m      Valid     62        2687
    400m      Train     245       20072
    400m      Test      61        4974
    400m      Valid     61        4924
    450m      Train     246       31794
    450m      Test      61        7887
    450m      Valid     61        7880
    500m      Train     270       49782
    500m      Test      67        12426
    500m      Valid     68        12541
    mix_alt   Train     2364      126615
    mix_alt   Test      587       31442
    mix_alt   Valid     593       31613

    It is advised to further enhance the dataset so that random augmentations are probabilistically applied to each image prior to adding it to the batch for training. Specifically, there are a number of possible transformations such as geometric (rotations, translations, horizontal axis mirroring, cropping, and zooming), as well as image manipulations (illumination changes, color shifting, blurring, sharpening, and shadowing).
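
    As a concrete illustration of the suggested on-the-fly augmentation (a sketch, not part of the dataset tooling): a horizontal-axis mirror only requires the YOLO x-center to be flipped, while illumination changes leave the boxes untouched.

    import random
    from PIL import Image, ImageEnhance

    def augment(image, boxes):
        """Randomly mirror/brighten an image; boxes are YOLO rows (cls, xc, yc, w, h)."""
        if random.random() < 0.5:                      # horizontal axis mirroring
            image = image.transpose(Image.FLIP_LEFT_RIGHT)
            boxes = [(c, 1.0 - xc, yc, w, h) for c, xc, yc, w, h in boxes]
        if random.random() < 0.5:                      # illumination change
            image = ImageEnhance.Brightness(image).enhance(random.uniform(0.6, 1.4))
        return image, boxes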
