100+ datasets found
  1. Hard Hat Workers Dataset

    • universe.roboflow.com
    Updated Sep 30, 2022
    Cite
    Joseph Nelson (2022). Hard Hat Workers Dataset [Dataset]. https://universe.roboflow.com/joseph-nelson/hard-hat-workers/model/13
    Explore at:
    58 scholarly articles cite this dataset (Google Scholar)
    Available download formats: zip
    Dataset updated
    Sep 30, 2022
    Dataset authored and provided by
    Joseph Nelson
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Variables measured
    Workers Bounding Boxes
    Description

    Overview

    The Hard Hat dataset is an object detection dataset of workers in workplace settings that require a hard hat. Annotations also include examples of just "person" and "head," for when an individual may be present without a hard hat.

    The original dataset has a 75/25 train-test split.

    Example Image: https://i.imgur.com/7spoIJT.png

    Use Cases

    One could use this dataset to, for example, build a classifier that distinguishes workers who are complying with workplace safety codes from those who may not be. It is also a good general dataset for practice.

    Using this Dataset

    Use the Fork or Download this Dataset button to copy this dataset to your own Roboflow account and export it with new preprocessing settings (perhaps resized for your model's desired format or converted to grayscale), or with additional augmentations to make your model generalize better. This particular dataset is well suited for Roboflow's new advanced Bounding Box Only Augmentations.
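    For scripted workflows, a minimal sketch using the roboflow Python package is shown below; the API key is a placeholder, and the export format string ("coco" here) should be checked against the current Roboflow documentation.

    from roboflow import Roboflow

    # Authenticate with your own key (placeholder, not a real credential).
    rf = Roboflow(api_key="YOUR_API_KEY")

    # Workspace and project slugs taken from this dataset's URL.
    project = rf.workspace("joseph-nelson").project("hard-hat-workers")

    # Download version 13; other format strings (e.g. "voc", "yolov5")
    # are listed in the Roboflow docs.
    dataset = project.version(13).download("coco")
    print(dataset.location)  # local folder with images and annotations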

    Dataset Versions:

    Image Preprocessing | Image Augmentation | Modify Classes

    • v1 (resize-416x416-reflect): generated with the original 75/25 train-test split | No augmentations
    • v2 (raw_75-25_trainTestSplit): generated with the original 75/25 train-test split | These are the raw, original images
    • v3 (v3): generated with the original 75/25 train-test split | Modify Classes used to drop person class | Preprocessing and Augmentation applied
    • v5 (raw_HeadHelmetClasses): generated with a 70/20/10 train/valid/test split | Modify Classes used to drop person class
    • v8 (raw_HelmetClassOnly): generated with a 70/20/10 train/valid/test split | Modify Classes used to drop head and person classes
    • v9 (raw_PersonClassOnly): generated with a 70/20/10 train/valid/test split | Modify Classes used to drop head and helmet classes
    • v10 (raw_AllClasses): generated with a 70/20/10 train/valid/test split | These are the raw, original images
    • v11 (augmented3x-AllClasses-FastModel): generated with a 70/20/10 train/valid/test split | Preprocessing and Augmentation applied | 3x image generation | Trained with Roboflow's Fast Model
    • v12 (augmented3x-HeadHelmetClasses-FastModel): generated with a 70/20/10 train/valid/test split | Preprocessing and Augmentation applied, Modify Classes used to drop person class | 3x image generation | Trained with Roboflow's Fast Model
    • v13 (augmented3x-HeadHelmetClasses-AccurateModel): generated with a 70/20/10 train/valid/test split | Preprocessing and Augmentation applied, Modify Classes used to drop person class | 3x image generation | Trained with Roboflow's Accurate Model
    • v14 (raw_HeadClassOnly): generated with a 70/20/10 train/valid/test split | Modify Classes used to drop person class, and remap/relabel helmet class to head

    Choosing Between Computer Vision Model Sizes | Roboflow Train

    About Roboflow

    Roboflow makes managing, preprocessing, augmenting, and versioning datasets for computer vision seamless.

    Developers reduce 50% of their code when using Roboflow's workflow, automate annotation quality assurance, save training time, and increase model reproducibility.


  2. Dump truck object detection dataset including scale-models - Dataset - B2FIND

    • b2find.dkrz.de
    Updated Mar 18, 2020
    Cite
    (2020). Dump truck object detection dataset including scale-models - Dataset - B2FIND [Dataset]. https://b2find.dkrz.de/dataset/0146293f-468c-5326-ace7-fa741e01cb40
    Dataset updated
    Mar 18, 2020
    Description

    Object detection is a vital part of any autonomous vision system, and data is needed to obtain a high-performing object detector. The object detection task aims to detect and classify different objects using camera input, producing bounding boxes containing the objects as output. This is usually done by utilizing deep neural networks. Training an object detector requires a large amount of data, but it is not always practical to collect large amounts of data. This has led to multiple techniques which decrease the amount of data needed, such as transfer learning and domain adaptation. Working with construction equipment is a time-consuming process, and we wanted to examine whether it is possible to use scale-model data to train a network and then use that network to detect real objects with no additional training. This small dataset contains training and validation data of a scale-model dump truck in different environments, while the test set contains images of a full-size dump truck of a similar model. The aim of the dataset is to train a network to classify wheels, cabs and tipping bodies of a scale-model dump truck and use that to classify the same classes on a full-scale dump truck. The label structure of the dataset is the YOLO v3 structure, where each class corresponds to an integer value: Wheel: 0, Cab: 1, Tipping body: 2.
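    As a hedged illustration of reading these labels, the sketch below parses one YOLO-format .txt file into named classes and pixel-space boxes; the file name and image size are placeholders, not part of the dataset.

    from pathlib import Path

    # Class ids as defined by the dataset's YOLO v3 labels.
    CLASS_NAMES = {0: "wheel", 1: "cab", 2: "tipping body"}

    def read_yolo_labels(label_path, img_w, img_h):
        """Parse one YOLO label file: each line holds
        'class_id x_center y_center width height', all normalized to [0, 1]."""
        boxes = []
        for line in Path(label_path).read_text().splitlines():
            cls, xc, yc, w, h = line.split()
            xc, yc, w, h = (float(v) for v in (xc, yc, w, h))
            xmin, ymin = (xc - w / 2) * img_w, (yc - h / 2) * img_h
            xmax, ymax = (xc + w / 2) * img_w, (yc + h / 2) * img_h
            boxes.append((CLASS_NAMES[int(cls)], (xmin, ymin, xmax, ymax)))
        return boxes

    # Hypothetical usage; adjust path and image size to the actual files:
    # print(read_yolo_labels("train/dump_truck_001.txt", 1280, 720))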

  3. Trojan Detection Software Challenge - object-detection-jul2022-train

    • catalog.data.gov
    • gimi9.com
    Updated Mar 14, 2025
    Cite
    National Institute of Standards and Technology (2025). Trojan Detection Software Challenge - object-detection-jul2022-train [Dataset]. https://catalog.data.gov/dataset/trojan-detection-software-challenge-object-detection-jul2022-train
    Dataset updated
    Mar 14, 2025
    Dataset provided by
    National Institute of Standards and Technology: http://www.nist.gov/
    Description

    Round 10 Train Dataset

    This is the training data used to create and evaluate trojan detection software solutions. This data, generated at NIST, consists of object detection AIs trained on the COCO dataset. A known percentage of these trained AI models have been poisoned with a known trigger which induces incorrect behavior. This data will be used to develop software solutions for detecting which trained AI models have been poisoned via embedded triggers. This dataset consists of 144 AI models using a small set of model architectures. Half (50%) of the models have been poisoned with an embedded trigger which causes misclassification of the input when the trigger is present.

  4. Vehicles Openimages Dataset

    • universe.roboflow.com
    Updated Jun 17, 2022
    Cite
    Roboflow (2022). Vehicles Openimages Dataset [Dataset]. https://universe.roboflow.com/roboflow-gw7yv/vehicles-openimages/dataset/1
    Explore at:
    Available download formats: zip
    Dataset updated
    Jun 17, 2022
    Dataset authored and provided by
    Roboflow
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Vehicles Bounding Boxes
    Description

    Image example: https://i.imgur.com/ztezlER.png

    Overview

    This dataset contains 627 images of various vehicle classes for object detection. These images are derived from the Open Images open source computer vision datasets.

    This dataset only scratches the surface of the Open Images dataset for vehicles!

    Image example: https://i.imgur.com/4ZHN8kk.png

    Use Cases

    • Train an object detector to differentiate between a car, bus, motorcycle, ambulance, and truck.
    • Checkpoint an object detector for an autonomous vehicle detector.
    • Test an object detector on a high density of ambulances among vehicles.
    • Train an ambulance detector.
    • Explore the quality and range of the Open Images dataset.

    Tools Used to Derive Dataset

    Image example: https://i.imgur.com/1U0M573.png

    These images were gathered via the OIDv4 Toolkit. This toolkit allows you to pick an object class and retrieve a set number of images from that class with bounding box labels.

    We provide this dataset as an example of the ability to query the OID for a given subdomain. This dataset can easily be scaled up - please reach out to us if that interests you.

  5. Paimon Dataset YOLO Detection Dataset

    • paperswithcode.com
    • gts.ai
    Updated Mar 18, 2025
    Cite
    (2025). Paimon Dataset YOLO Detection Dataset [Dataset]. https://paperswithcode.com/dataset/paimon-dataset-yolo-detection
    Dataset updated
    Mar 18, 2025
    Description


    This dataset consists of a diverse collection of images featuring Paimon, a popular character from the game Genshin Impact. The images have been sourced from in-game gameplay footage and capture Paimon from various angles and in different sizes (scales), making the dataset suitable for training YOLO object detection models.

    The dataset provides a comprehensive view of Paimon in different lighting conditions, game environments, and positions, ensuring the model can generalize well to similar characters or object detection tasks. While most annotations are accurately labeled, a small number may include minor inaccuracies due to manual labeling errors. The dataset is nonetheless well suited for researchers and developers working on character recognition, object detection in gaming environments, or other AI vision tasks.


    Dataset Features:

    Image Format: .jpg files in 640×320 resolution.

    Annotation Format: .txt files in YOLO format, containing bounding box data with:

    class_id

    x_center

    y_center

    width

    height
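    To sanity-check annotations in this format, a short sketch using Pillow is shown below; it denormalizes the YOLO fields above against the stated 640×320 resolution and draws the boxes. The file names are hypothetical.

    from PIL import Image, ImageDraw

    IMG_W, IMG_H = 640, 320  # resolution stated for this dataset

    def draw_yolo_boxes(image_path, label_path):
        """Overlay YOLO boxes (class_id x_center y_center width height,
        normalized) on the image for a quick visual check."""
        img = Image.open(image_path).convert("RGB")
        draw = ImageDraw.Draw(img)
        for line in open(label_path):
            cls, xc, yc, w, h = (float(v) for v in line.split())
            left, top = (xc - w / 2) * IMG_W, (yc - h / 2) * IMG_H
            right, bottom = (xc + w / 2) * IMG_W, (yc + h / 2) * IMG_H
            draw.rectangle((left, top, right, bottom), outline="red", width=2)
            draw.text((left, max(0, top - 12)), f"class {int(cls)}", fill="red")
        return img

    # Hypothetical file names:
    # draw_yolo_boxes("frames/paimon_0001.jpg", "frames/paimon_0001.txt").show()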

    Use Cases:

    Character Detection in Games: Train YOLO models to detect and identify in-game characters or NPCs.

    Gaming Analytics: Improve recognition of specific game elements for AI-powered game analytics tools.

    Research: Contribute to academic research focused on object detection or computer vision in animated and gaming environments.

    Data Structure:

    Images: High-quality .jpg images captured from multiple perspectives, ensuring robust model training across various orientations and lighting scenarios.

    Annotations: Each image has an associated .txt file that follows the YOLO format. The annotations are structured to include class identification, object location (center coordinates), and bounding box dimensions.

    Key Advantages:

    Varied Angles and Scales: The dataset includes Paimon from multiple perspectives, aiding in creating more versatile and adaptable object detection models.

    Real-World Scenario: Extracted from actual gameplay footage, the dataset simulates real-world detection challenges such as varying backgrounds, motion blur, and changing character scales.

    Training Ready: Suitable for training YOLO models and other deep learning frameworks that require object detection capabilities.

    This dataset is sourced from Kaggle.

  6. Trojan Detection Software Challenge - object-detection-feb2023-train

    • data.nist.gov
    • catalog.data.gov
    Updated Mar 16, 2023
    Cite
    National Institute of Standards and Technology (2023). Trojan Detection Software Challenge - object-detection-feb2023-train [Dataset]. http://doi.org/10.18434/mds2-2959
    Dataset updated
    Mar 16, 2023
    Dataset provided by
    National Institute of Standards and Technology: http://www.nist.gov/
    License

    https://www.nist.gov/open/license

    Description

    Round 13 Train Dataset

    This is the training data used to create and evaluate trojan detection software solutions. This data, generated at NIST, consists of object detection AIs trained both on synthetic image data built from Cityscapes and on the DOTA_v2 dataset. A known percentage of these trained AI models have been poisoned with a known trigger which induces incorrect behavior. This data will be used to develop software solutions for detecting which trained AI models have been poisoned via embedded triggers. This dataset consists of 128 AI models using a small set of model architectures. Half (50%) of the models have been poisoned with an embedded trigger which causes misclassification of the input when the trigger is present.

  7. Replication Data for: Training Deep Convolutional Object Detectors for Images Affected by Lossy Compression

    • dataverse.harvard.edu
    Updated Apr 16, 2022
    Cite
    Tomasz Gandor (2022). Replication Data for: Training Deep Convolutional Object Detectors for Images Affected by Lossy Compression [Dataset]. http://doi.org/10.7910/DVN/UHEP3C
    Explore at:
    Croissant. Croissant is a format for machine-learning datasets; learn more at mlcommons.org/croissant.
    Dataset updated
    Apr 16, 2022
    Dataset provided by
    Harvard Dataverse
    Authors
    Tomasz Gandor
    License

    CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This collection contains the trained models and object detection results of 2 architectures found in the Detectron2 library, on the MS COCO val2017 dataset, under different JPEG compression levels Q = {5, 12, 19, 26, 33, 40, 47, 54, 61, 68, 75, 82, 89, 96} (14 levels per trained model).

    Architectures:

    • F50 – Faster R-CNN on ResNet-50 with FPN
    • R50 – RetinaNet on ResNet-50 with FPN

    Training types:

    • D2 – Detectron2 Model Zoo pre-trained 1x model (90,000 iterations, batch 16)
    • STD – standard 1x training (90,000 iterations) on the original train2017 dataset
    • Q20 – 1x training (90,000 iterations) on the train2017 dataset degraded to Q=20
    • Q40 – 1x training (90,000 iterations) on the train2017 dataset degraded to Q=40
    • T20 – extra 1x training on top of D2 on the train2017 dataset degraded to Q=20
    • T40 – extra 1x training on top of D2 on the train2017 dataset degraded to Q=40

    Model and metrics files:

    • models_FasterRCNN.tar.gz (F50-STD, F50-Q20, …)
    • models_RetinaNet.tar.gz (R50-STD, R50-Q20, …)

    For every model there are 3 files:

    • config.yaml – the Detectron2 config of the model.
    • model_final.pth – the weights (training snapshot) in PyTorch format.
    • metrics.json – training metrics (time, total loss, etc.) every 20 iterations.

    The D2 models were not included, because they are available from the Detectron2 Model Zoo as faster_rcnn_R_50_FPN_1x (F50-D2) and retinanet_R_50_FPN_1x (R50-D2).

    Result files:

    • F50-results.tar.gz – results for the Faster R-CNN models (including D2).
    • R50-results.tar.gz – results for the RetinaNet models (including D2).

    For every model there are 14 subdirectories, e.g. evaluator_dump_R50x1_005 through evaluator_dump_R50x1_096, one for each of the JPEG Q values. Each such folder contains:

    • coco_instances_results.json – all detected objects (image id, bounding box, class index and confidence).
    • results.json – AP metrics as computed by the COCO API.

    The data can be processed using our code, published at https://github.com/tgandor/urban_oculus. Additional dependencies for the source code: COCO API, Detectron2.
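    As a hedged sketch, one of these result folders can be re-scored with the COCO API (pycocotools) as follows; the ground-truth annotation path is an assumption based on the standard COCO layout.

    from pycocotools.coco import COCO
    from pycocotools.cocoeval import COCOeval

    # Ground truth: standard COCO val2017 instances file (path assumed).
    coco_gt = COCO("annotations/instances_val2017.json")

    # Detections: per-model, per-Q results file described above.
    coco_dt = coco_gt.loadRes("evaluator_dump_R50x1_005/coco_instances_results.json")

    evaluator = COCOeval(coco_gt, coco_dt, iouType="bbox")
    evaluator.evaluate()    # match detections to ground truth per image
    evaluator.accumulate()  # aggregate precision/recall over IoU thresholds
    evaluator.summarize()   # print AP/AR numbers, mirroring results.json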

  8. Smartbay Marine Types Object Detection Training dataset

    • zenodo.org
    Updated Oct 25, 2024
    Cite
    Eva Cullen; Eva Cullen (2024). Smartbay Marine Types Object Detection Training dataset [Dataset]. http://doi.org/10.5281/zenodo.13989527
    Explore at:
    Dataset updated
    Oct 25, 2024
    Dataset provided by
    Zenodo: http://zenodo.org/
    Authors
    Eva Cullen; Eva Cullen
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Aug 15, 2024
    Description

    The SmartBay Observatory in Galway Bay is an important contribution by Ireland to the growing global network of real-time data capture systems deployed within the ocean – technology giving us new insights into the ocean which we have not had before.

    The observatory was installed on the seafloor 1.5 km off the coast of Spiddal, County Galway, Ireland. The observatory uses cameras, probes and sensors to permit continuous and remote live underwater monitoring. This observatory equipment allows ocean researchers unique real-time access to monitor ongoing changes in the marine environment. Data relating to the marine environment at the site is transferred in real time from the SmartBay Observatory through a fibre optic telecommunications cable to the Marine Institute headquarters and onward to the internet. The data includes a live video stream, the depth of the observatory node, the sea temperature and salinity, and estimates of the chlorophyll and turbidity levels in the water, which give an indication of the volume of phytoplankton and other particles, such as sediment, in the water.

    The Smartbay Marine Types Object Detection training dataset is an initial bounding-box-annotated image dataset used in training a YOLOv8 object detection model to classify the marine fauna observed in the SmartBay Observatory video footage using broad "Marine Type" classes.

    The imagery used in this training dataset consists of image frame captures from the Smartbay video Archive files, CC-BY imagery from the www.minka-sdg.org website and images taken by Eva Cullen in the "Galway Atlantaquaria" Aquarium in Galway, Ireland.

    The imagery was annotated using CVAT, collated on Roboflow and exported in YOLOv8 training dataset format.

  9. Image-Guided Object Detection using OWL-ViT and Enhanced Query Embedding Extraction

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Sep 24, 2024
    Cite
    Melih Serin (2024). Image-Guided Object Detection using OWL-ViT and Enhanced Query Embedding Extraction [Dataset]. http://doi.org/10.7910/DVN/PRHQMK
    Dataset updated
    Sep 24, 2024
    Dataset provided by
    Harvard Dataverse
    Authors
    Melih Serin
    Description

    Computer vision has been receiving increasing attention with the recent complex generative AI models released by tech industry giants such as OpenAI and Google. However, there is a specific subfield that we wanted to focus on: image-guided object detection. A detailed literature survey directed us towards a successful study called Simple Open-Vocabulary Object Detection with Vision Transformers (OWL-ViT) [1], a multifunctional complex model that can also perform image-guided object detection as a side function. In this study, experiments were conducted utilizing the OWL-ViT architecture as the base model, manipulating the necessary parts to achieve better one-shot performance. Code and models are available on GitHub.

  10. 15M+ Images | AI Training Data | Annotated imagery data for AI | Object & Scene Detection | Global Coverage

    • data.imagedatasets.ai
    Cite
    Image Datasets, 15M+ Images | AI Training Data | Annotated imagery data for AI | Object & Scene Detection | Global Coverage [Dataset]. https://data.imagedatasets.ai/products/2m-images-annotated-imagery-data-full-exif-data-object-image-datasets
    Dataset authored and provided by
    Image Datasets
    Area covered
    Czechia, Gabon, Israel, Singapore, Brazil, Belize, Marshall Islands, Gambia, Martinique, Senegal
    Description

    A comprehensive dataset of 15M+ images sourced globally, featuring full EXIF data, including camera settings and photography details. Enriched with object and scene detection metadata, this dataset is ideal for AI model training in image recognition, classification, and segmentation.

  11. Chess Pieces Dataset

    • universe.roboflow.com
    Updated Apr 1, 2021
    Cite
    Joseph Nelson (2021). Chess Pieces Dataset [Dataset]. https://universe.roboflow.com/joseph-nelson/chess-pieces-new/model/19
    Explore at:
    Available download formats: zip
    Dataset updated
    Apr 1, 2021
    Dataset authored and provided by
    Joseph Nelson
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Variables measured
    Pieces Bounding Boxes
    Description

    Overview

    This is a dataset of Chess board photos and various pieces. All photos were captured from a constant angle, a tripod to the left of the board. The bounding boxes of all pieces are annotated as follows: white-king, white-queen, white-bishop, white-knight, white-rook, white-pawn, black-king, black-queen, black-bishop, black-knight, black-rook, black-pawn. There are 2894 labels across 292 images.

    Chess Example: https://i.imgur.com/nkjobw1.png

    Follow this tutorial to see an example of training an object detection model using this dataset or jump straight to the Colab notebook.

    Use Cases

    At Roboflow, we built a chess piece object detection model using this dataset.

    ChessBoss: https://blog.roboflow.ai/content/images/2020/01/chess-detection-longer.gif

    You can see a video demo of that here. (We did struggle with pieces that were occluded, i.e. the state of the board at the very beginning of a game has many pieces obscured - let us know how your results fare!)

    Using this Dataset

    We're releasing the data free on a public license.

    About Roboflow

    Roboflow makes managing, preprocessing, augmenting, and versioning datasets for computer vision seamless.

    Developers reduce 50% of their boilerplate code when using Roboflow's workflow, save training time, and increase model reproducibility.


  12. Style Transfer for Object Detection in Art

    • kaggle.com
    Updated Mar 11, 2021
    Cite
    David Kadish (2021). Style Transfer for Object Detection in Art [Dataset]. https://www.kaggle.com/davidkadish/style-transfer-for-object-detection-in-art
    Explore at:
    Available download formats: zip (3762347804 bytes)
    Dataset updated
    Mar 11, 2021
    Authors
    David Kadish
    Description

    Context

    Despite recent advances in object detection using deep learning neural networks, these neural networks still struggle to identify objects in art images such as paintings and drawings. This challenge is known as the cross-depiction problem, and it stems in part from the tendency of neural networks to prioritize identification of an object's texture over its shape. In this paper we propose and evaluate a process for training neural networks to localize objects - specifically people - in art images. We generated a large dataset for training and validation by modifying the images in the COCO dataset using AdaIN style transfer (style-coco.tar.xz). This dataset was used to fine-tune a Faster R-CNN object detection network (2020-12-10_09-45-15_58672_resnet152_stylecoco_epoch_15.pth), which was then tested on the existing People-Art testing dataset (PeopleArt-Coco.tar.xz). The result is a significant improvement on the state of the art and a new way forward for creating datasets to train neural networks to process art images.

    Content

    • 2020-12-10_09-45-15_58672_resnet152_stylecoco_epoch_15.pth: trained object detection network (Faster R-CNN with a ResNet152 backbone pretrained on ImageNet) for use with PyTorch.
    • PeopleArt-Coco.tar.xz: People-Art dataset with COCO-formatted annotations (original at https://github.com/BathVisArtData/PeopleArt).
    • style-coco.tar.xz: stylized COCO dataset containing only the person category; used to train 2020-12-10_09-45-15_58672_resnet152_stylecoco_epoch_15.pth.

    Code

    The code is available on github at https://github.com/dkadish/Style-Transfer-for-Object-Detection-in-Art

    Citing

    If you are using this code or the concept of style transfer for object detection in art, please cite our paper (https://arxiv.org/abs/2102.06529):

    D. Kadish, S. Risi, and A. S. Løvlie, “Improving Object Detection in Art Images Using Only Style Transfer,” Feb. 2021.

  13. 15M+ Images | AI Training Data | Annotated imagery data for AI | Object & Scene Detection | Global Coverage

    • datarade.ai
    Cite
    Image Datasets, 15M+ Images | AI Training Data | Annotated imagery data for AI | Object & Scene Detection | Global Coverage [Dataset]. https://datarade.ai/data-products/2m-images-annotated-imagery-data-full-exif-data-object-image-datasets
    Explore at:
    Available download formats: .bin, .json, .xml, .csv, .xls, .sql, .txt
    Dataset authored and provided by
    Image Datasets
    Area covered
    Malta, United States Minor Outlying Islands, Albania, Chad, Georgia, Brunei Darussalam, Mexico, New Zealand, Qatar, Anguilla
    Description

    This dataset features over 15,000,000 high-quality images sourced from photographers worldwide. Designed to support AI and machine learning applications, it provides a diverse and richly annotated collection of imagery.

    Key Features:

    1. Comprehensive Metadata: the dataset includes full EXIF data, detailing camera settings such as aperture, ISO, shutter speed, and focal length. Additionally, each image is pre-annotated with object and scene detection metadata, making it ideal for tasks like classification, detection, and segmentation. Popularity metrics, derived from engagement on our proprietary platform, are also included.

    2. Unique Sourcing Capabilities: the images are collected through a proprietary gamified platform for photographers. Competitions focused on flower photography ensure fresh, relevant, and high-quality submissions. Custom datasets can be sourced on-demand within 72 hours, allowing for specific requirements such as particular flower species or geographic regions to be met efficiently.

    3. Global Diversity: photographs have been sourced from contributors in over 100 countries, ensuring a vast array of flower species, colors, and environmental settings. The images feature varied contexts, including natural habitats, gardens, bouquets, and urban landscapes, providing an unparalleled level of diversity.

    4. High-Quality Imagery: the dataset includes images with resolutions ranging from standard to high-definition to meet the needs of various projects. Both professional and amateur photography styles are represented, offering a mix of artistic and practical perspectives suitable for a variety of applications.

    5. Popularity Scores: each image is assigned a popularity score based on its performance in GuruShots competitions. This unique metric reflects how well the image resonates with a global audience, offering an additional layer of insight for AI models focused on user preferences or engagement trends.

    6. AI-Ready Design: this dataset is optimized for AI applications, making it ideal for training models in tasks such as image recognition, classification, and segmentation. It is compatible with a wide range of machine learning frameworks and workflows, ensuring seamless integration into your projects.

    7. Licensing & Compliance: the dataset complies fully with data privacy regulations and offers transparent licensing for both commercial and academic use.

    Use Cases:

    1. Training AI systems for plant recognition and classification.
    2. Enhancing agricultural AI models for plant health assessment and species identification.
    3. Building datasets for educational tools and augmented reality applications.
    4. Supporting biodiversity and conservation research through AI-powered analysis.

    This dataset offers a comprehensive, diverse, and high-quality resource for training AI and ML models, tailored to deliver exceptional performance for your projects. Customizations are available to suit specific project needs. Contact us to learn more!

  14. ZeroCostDL4Mic - YoloV2 example training and test dataset

    • zenodo.org
    • data.niaid.nih.gov
    Updated Jul 14, 2020
    Cite
    Guillaume Jacquemet; Guillaume Jacquemet; Lucas von Chamier; Lucas von Chamier (2020). ZeroCostDL4Mic - YoloV2 example training and test dataset [Dataset]. http://doi.org/10.5281/zenodo.3941908
    Dataset updated
    Jul 14, 2020
    Dataset provided by
    Zenodo: http://zenodo.org/
    Authors
    Guillaume Jacquemet; Guillaume Jacquemet; Lucas von Chamier; Lucas von Chamier
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Name: ZeroCostDL4Mic - YoloV2 example training and test dataset

    (see our Wiki for details)

    Data type: 2D grayscale .png images with corresponding bounding box annotations in .xml PASCAL VOC format.

    Microscopy data type: Phase contrast microscopy data (brightfield)

    Microscope: Inverted Zeiss Axio zoom widefield microscope equipped with an AxioCam MRm camera, an EL Plan-Neofluar 20 × /0.5 NA objective (Carl Zeiss), with a heated chamber (37 °C) and a CO2 controller (5%).

    Cell type: MDA-MB-231 cells migrating on cell-derived matrices generated by fibroblasts.

    File format: .png (8-bit)

    Image size: 1388 x 1040 px (323 nm)

    Author(s): Guillaume Jacquemet (1,2,3), Lucas von Chamier (4,5)

    Contact email: lucas.chamier.13@ucl.ac.uk and guillaume.jacquemet@abo.fi

    Affiliation(s):

    1) Faculty of Science and Engineering, Cell Biology, Åbo Akademi University, 20520 Turku, Finland

    2) Turku Bioscience Centre, University of Turku and Åbo Akademi University, FI-20520 Turku

    3) ORCID: 0000-0002-9286-920X

    4) MRC-Laboratory for Molecular Cell Biology. University College London, London, UK

    5) ORCID: 0000-0002-9243-912X

    Associated publications: Jacquemet et al 2016. DOI: 10.1038/ncomms13297

    Funding bodies: G.J. was supported by grants awarded by the Academy of Finland, the Sigrid Juselius Foundation and Åbo Akademi University Research Foundation (CoE CellMech) and by Drug Discovery and Diagnostics strategic funding to Åbo Akademi University.
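    Since the annotations ship as PASCAL VOC .xml files (see Data type above), a minimal parsing sketch using Python's standard library is shown below; the example file name is a placeholder, not part of the dataset.

    import xml.etree.ElementTree as ET

    def read_voc_boxes(xml_path):
        """Parse a PASCAL VOC .xml annotation into (label, bbox) tuples,
        with bbox as (xmin, ymin, xmax, ymax) pixel coordinates."""
        root = ET.parse(xml_path).getroot()
        boxes = []
        for obj in root.findall("object"):
            label = obj.find("name").text
            bb = obj.find("bndbox")
            coords = tuple(int(float(bb.find(k).text))
                           for k in ("xmin", "ymin", "xmax", "ymax"))
            boxes.append((label, coords))
        return boxes

    # Hypothetical file name:
    # print(read_voc_boxes("annotations/migration_0001.xml"))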

  15. Underwater Object Detection Dataset Dataset

    • paperswithcode.com
    Updated Feb 18, 2025
    Cite
    (2025). Underwater Object Detection Dataset Dataset [Dataset]. https://paperswithcode.com/dataset/underwater-object-detection-dataset
    Dataset updated
    Feb 18, 2025
    Description


    This dataset is designed for advanced underwater object detection and classification. It provides a comprehensive collection of images featuring underwater objects, each precisely annotated with bounding boxes. The dataset aims to support a wide range of research applications, from environmental monitoring to underwater robotics.


    Classes:

    Fish (individual and grouped)

    Crab

    Human Diver

    Trash (marine pollution)

    Jellyfish

    Coral Reef

    Sea Turtle

    Starfish

    Dataset Structure:

    Training Set (70%): A robust sample for building detection models.

    Validation Set (10%): Used to fine-tune model performance.

    Test Set (20%): A carefully selected set of images for evaluating model accuracy.

    Pre-processing Techniques:

    Auto-Orientation: Ensures all images are correctly aligned.

    Resizing: Images are scaled to 640×640 pixels for uniformity.

    Brightness Normalization: Corrects for underwater lighting conditions.

    Contrast Stretching: Enhances visibility for objects in murky or low-contrast scenes.
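    A hedged sketch of comparable pre-processing with OpenCV is shown below; the dataset's exact operations are not specified beyond the list above, so the function choices here are illustrative.

    import cv2
    import numpy as np

    def preprocess(image_path):
        """Resize to 640x640 and stretch contrast, mirroring the listed steps."""
        img = cv2.imread(image_path)
        img = cv2.resize(img, (640, 640))  # uniform input size

        # Contrast stretching: rescale intensities to the full 0-255 range,
        # which also evens out brightness in dim underwater scenes.
        img = cv2.normalize(img, None, alpha=0, beta=255,
                            norm_type=cv2.NORM_MINMAX)
        return img.astype(np.uint8)

    # Hypothetical file name:
    # out = preprocess("images/reef_0001.jpg")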

    New Annotation Techniques:

    Polygonal Segmentation: Introduces more precise segmentation for irregular shapes such as coral reefs.

    3D Depth Mapping: For enhanced understanding of object placement in underwater space.

    Dataset Use Cases:

    Marine Ecology: Assessing species diversity and tracking the impact of environmental changes.

    Pollution Analysis: Detecting and classifying marine trash, aiding in cleanup efforts.

    Underwater Robotics: Training AUVs to recognize and navigate around complex underwater structures like coral reefs or large groups of fish.

    Conclusion:

    The expanded Underwater Object Detection Dataset provides a rich resource for researchers, environmentalists, and engineers working on underwater object detection and classification. Its enhanced classes, precise annotations, and preprocessing techniques make it a valuable asset for developing robust models in marine exploration and conservation.

    This dataset is sourced from Kaggle.

  16. The Semantic PASCAL-Part Dataset

    • zenodo.org
    • data.niaid.nih.gov
    Updated Jan 20, 2022
    Cite
    Ivan Donadello; Ivan Donadello; Luciano Serafini; Luciano Serafini (2022). The Semantic PASCAL-Part Dataset [Dataset]. http://doi.org/10.5281/zenodo.5878773
    Explore at:
    Available download formats: zip
    Dataset updated
    Jan 20, 2022
    Dataset provided by
    Zenodo: http://zenodo.org/
    Authors
    Ivan Donadello; Ivan Donadello; Luciano Serafini; Luciano Serafini
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The Semantic PASCAL-Part dataset

    The Semantic PASCAL-Part dataset is the RDF version of the famous PASCAL-Part dataset used for object detection in Computer Vision. Each image is annotated with bounding boxes containing a single object. Pairs of bounding boxes are annotated with the part-whole relationship. For example, the bounding box of a car has the part-whole annotation with the bounding boxes of its wheels.

    This original release joins Computer Vision with Semantic Web as the objects in the dataset are aligned with concepts from:

    • the provided supporting ontology;
    • the WordNet database through its synsets;
    • the Yago ontology.

    The provided Python 3 code (see the GitHub repo) is able to browse the dataset and convert it into RDF knowledge graph format. This new format makes it easy to foster research in both the Semantic Web and Machine Learning fields.

    Structure of the semantic PASCAL-Part Dataset

    This is the folder structure of the dataset:

    • semanticPascalPart: it contains the refined images and annotations (e.g., small specific parts are merged into bigger parts) of the PASCAL-Part dataset in PASCAL VOC style.
      • Annotations_set: the test set annotations in .xml format. For further information, see the PASCAL VOC format here.
      • Annotations_trainval: the train and validation set annotations in .xml format. For further information, see the PASCAL VOC format here.
      • JPEGImages_test: the test set images in .jpg format.
      • JPEGImages_trainval: the train and validation set images in .jpg format.
      • test.txt: the 2416 image filenames in the test set.
      • trainval.txt: the 7687 image filenames in the train and validation set.

    The PASCAL-Part Ontology

    The PASCAL-Part OWL ontology formalizes, through logical axioms, the part-of relationship between whole objects (22 classes) and their parts (39 classes). The ontology contains 85 logical axioms in Description Logic of (for example) the following form:

    Every potted_plant has exactly 1 plant AND
              has exactly 1 pot
    

    We provide two versions of the ontology, with and without cardinality constraints, to allow users to experiment with either. The WordNet alignment is encoded in the ontology as annotations. We further provide the WordNet_Yago_alignment.csv file with both WordNet and Yago alignments.

    The ontology can be browsed with many Semantic Web tools such as:

    • Protégé: a graphical tool for ontology modelling;
    • OWLAPI: Java API for manipulating OWL ontologies;
    • rdflib: Python API for working with the RDF format;
    • RDF stores: databases for storing and semantically retrieving RDF triples. See here for some examples.
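    For example, a minimal rdflib sketch is shown below; the graph file name, its serialization format, and the part-whole property filter are assumptions for illustration, not the dataset's actual identifiers.

    from rdflib import Graph

    # Load the knowledge graph (file name assumed; format assumed RDF/XML).
    g = Graph()
    g.parse("semanticPascalPart.owl", format="xml")

    # List a few (whole, part) pairs by filtering predicates whose IRI
    # looks like a hasPart property (hypothetical property name).
    query = """
    SELECT ?whole ?part WHERE {
      ?whole ?p ?part .
      FILTER(CONTAINS(LCASE(STR(?p)), "haspart"))
    }
    LIMIT 10
    """
    for whole, part in g.query(query):
        print(whole, "has part", part)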

    Citing semantic PASCAL-Part

    If you use semantic PASCAL-Part in your research, please use the following BibTeX entry

    @article{DBLP:journals/ia/DonadelloS16,
     author = {Ivan Donadello and Luciano Serafini},
     title = {Integration of numeric and symbolic information for semantic image interpretation},
     journal = {Intelligenza Artificiale},
     volume = {10},
     number = {1},
     pages = {33--47},
     year = {2016}
    }
  17. MS COCO Dataset

    • paperswithcode.com
    Updated Apr 15, 2024
    Cite
    Tsung-Yi Lin; Michael Maire; Serge Belongie; Lubomir Bourdev; Ross Girshick; James Hays; Pietro Perona; Deva Ramanan; C. Lawrence Zitnick; Piotr Dollár, MS COCO Dataset [Dataset]. https://paperswithcode.com/dataset/coco
    Dataset updated
    Apr 15, 2024
    Authors
    Tsung-Yi Lin; Michael Maire; Serge Belongie; Lubomir Bourdev; Ross Girshick; James Hays; Pietro Perona; Deva Ramanan; C. Lawrence Zitnick; Piotr Dollár
    Description

    The MS COCO (Microsoft Common Objects in Context) dataset is a large-scale object detection, segmentation, key-point detection, and captioning dataset. The dataset consists of 328K images.

    Splits: The first version of the MS COCO dataset was released in 2014. It contains 164K images split into training (83K), validation (41K) and test (41K) sets. In 2015 an additional test set of 81K images was released, including all the previous test images and 40K new images.

    Based on community feedback, in 2017 the training/validation split was changed from 83K/41K to 118K/5K. The new split uses the same images and annotations. The 2017 test set is a subset of 41K images of the 2015 test set. Additionally, the 2017 release contains a new unannotated dataset of 123K images.

    Annotations: The dataset has annotations for

    • object detection: bounding boxes and per-instance segmentation masks with 80 object categories;
    • captioning: natural language descriptions of the images (see MS COCO Captions);
    • keypoints detection: more than 200,000 images and 250,000 person instances labeled with keypoints (17 possible keypoints, such as left eye, nose, right hip, right ankle);
    • stuff image segmentation: per-pixel segmentation masks with 91 stuff categories, such as grass, wall, sky (see MS COCO Stuff);
    • panoptic: full scene segmentation, with 80 thing categories (such as person, bicycle, elephant) and a subset of 91 stuff categories (grass, sky, road);
    • dense pose: more than 39,000 images and 56,000 person instances labeled with DensePose annotations – each labeled person is annotated with an instance id and a mapping between image pixels that belong to that person's body and a template 3D model.

    The annotations are publicly available only for training and validation images.
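    A hedged sketch of reading the detection annotations with the official COCO API (pycocotools) follows, assuming the standard 2017 annotation file layout; the path is an assumption.

    from pycocotools.coco import COCO

    # Standard val2017 instances file (path assumed).
    coco = COCO("annotations/instances_val2017.json")

    # Find images containing people and read their bounding boxes.
    person_id = coco.getCatIds(catNms=["person"])[0]
    img_ids = coco.getImgIds(catIds=[person_id])

    anns = coco.loadAnns(coco.getAnnIds(imgIds=img_ids[0], catIds=[person_id]))
    for ann in anns:
        x, y, w, h = ann["bbox"]  # COCO boxes are [x, y, width, height]
        print(f"person at ({x:.0f}, {y:.0f}), {w:.0f}x{h:.0f} px")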

  18. Aerial Maritime Drone Dataset

    • universe.roboflow.com
    Updated Sep 28, 2022
    Cite
    Jacob Solawetz (2022). Aerial Maritime Drone Dataset [Dataset]. https://universe.roboflow.com/jacob-solawetz/aerial-maritime/model/4
    Explore at:
    Available download formats: zip
    Dataset updated
    Sep 28, 2022
    Dataset authored and provided by
    Jacob Solawetz
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Variables measured
    Movable Objects Bounding Boxes
    Description

    Overview


    This dataset contains 74 images of aerial maritime photographs taken via a Mavic Air 2 drone, with 1,151 bounding boxes covering docks, boats, lifts, jetskis, and cars. This is a multi-class problem. This is an aerial object detection dataset. This is a maritime object detection dataset.

    The drone was flown at 400 ft. No drones were harmed in the making of this dataset.

    This dataset was collected and annotated by the Roboflow team, released with MIT license.

    Image example: https://i.imgur.com/9ZYLQSO.jpg

    Use Cases

    • Identify number of boats on the water over a lake via quadcopter.
    • Boat object detection dataset
    • Aerial Object Detection proof of concept
    • Identify if boat lifts have been taken out via a drone
    • Identify cars with a UAV drone
    • Find which lakes are inhabited and to which degree.
    • Identify if visitors are visiting the lake house via quad copter.
    • Proof of concept for UAV imagery project
    • Proof of concept for maritime project
    • Etc.

    This dataset is a great starter dataset for building an aerial object detection model with your drone.

    Getting Started

    Fork or download this dataset and follow our tutorial, How to train state of the art object detector YOLOv4, for more. Stay tuned for tutorials on how to teach your UAV drone how to see, with comparable airplane imagery and airplane footage.

    Annotation Guide

    See here for how to use the CVAT annotation tool that was used to create this dataset.

    About Roboflow

    Roboflow makes managing, preprocessing, augmenting, and versioning datasets for computer vision seamless. Developers reduce 50% of their boilerplate code when using Roboflow's workflow, save training time, and increase model reproducibility.


  19. Image dataset for training of an insect detection model for the Insect Detect DIY camera trap

    • zenodo.org
    • data.niaid.nih.gov
    Updated Dec 10, 2023
    Cite
    Maximilian Sittinger; Maximilian Sittinger (2023). Image dataset for training of an insect detection model for the Insect Detect DIY camera trap [Dataset]. http://doi.org/10.5281/zenodo.7725941
    Explore at:
    Available download formats: zip
    Dataset updated
    Dec 10, 2023
    Dataset provided by
    Zenodo: http://zenodo.org/
    Authors
    Maximilian Sittinger; Maximilian Sittinger
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains images of an artificial flower platform with different insects sitting on it or flying above it. All images were automatically recorded with the Insect Detect DIY camera trap, a hardware combination of the Luxonis OAK-1, Raspberry Pi Zero 2 W and PiJuice Zero pHAT for automated insect monitoring (bioRxiv preprint).

    Classes

    The following object classes were annotated in this dataset:

    • wasp (mostly Vespula sp.)
    • hbee (Apis mellifera)
    • fly (mostly Brachycera)
    • hovfly (various Syrphidae, e.g. Episyrphus balteatus)
    • other (all arthropods with insufficient occurrences, e.g. various Hymenoptera, true bugs, beetles)
    • shadow (shadows of the recorded insects)

    View the Health Check for more info on class balance.

    Deployment

    You can use this dataset as starting point to train your own insect detection models. Check the model training instructions for more information.

    Open source Python scripts to deploy the trained models can be found at the insect-detect GitHub repo.

  20. 20K+ Parking Lots Images | AI Training Data | Machine Learning (ML) data | Object & Scene Detection | Global Coverage

    • datarade.ai
    Updated Nov 29, 2024
    Cite
    Image Datasets (2024). 20K+ Parking Lots Images | AI Training Data | Machine Learning (ML) data | Object & Scene Detection | Global Coverage [Dataset]. https://datarade.ai/data-products/20k-parking-lots-images-machine-learning-ml-data-full-image-datasets
    Explore at:
    Available download formats: .bin, .json, .xml, .csv, .xls, .sql, .txt
    Dataset updated
    Nov 29, 2024
    Dataset authored and provided by
    Image Datasets
    Area covered
    Saint Pierre and Miquelon, Slovakia, Malta, Liechtenstein, Saudi Arabia, El Salvador, Bolivia (Plurinational State of), Botswana, Saint Helena, Pitcairn
    Description

    This dataset features over 20,000 high-quality images of parking lots sourced from photographers worldwide. Designed to support AI and machine learning applications, it provides a diverse and richly annotated collection of imagery.

    Key Features:

    1. Comprehensive Metadata: the dataset includes full EXIF data, detailing camera settings such as aperture, ISO, shutter speed, and focal length. Additionally, each image is pre-annotated with object and scene detection metadata, making it ideal for tasks like classification, detection, and segmentation. Popularity metrics, derived from engagement on our proprietary platform, are also included.

    2. Unique Sourcing Capabilities: the images are collected through a proprietary gamified platform for photographers. Competitions focused on flower photography ensure fresh, relevant, and high-quality submissions. Custom datasets can be sourced on-demand within 72 hours, allowing for specific requirements such as particular flower species or geographic regions to be met efficiently.

    3. Global Diversity: photographs have been sourced from contributors in over 100 countries, ensuring a vast array of flower species, colors, and environmental settings. The images feature varied contexts, including natural habitats, gardens, bouquets, and urban landscapes, providing an unparalleled level of diversity.

    4. High-Quality Imagery: the dataset includes images with resolutions ranging from standard to high-definition to meet the needs of various projects. Both professional and amateur photography styles are represented, offering a mix of artistic and practical perspectives suitable for a variety of applications.

    5. Popularity Scores: each image is assigned a popularity score based on its performance in GuruShots competitions. This unique metric reflects how well the image resonates with a global audience, offering an additional layer of insight for AI models focused on user preferences or engagement trends.

    6. AI-Ready Design: this dataset is optimized for AI applications, making it ideal for training models in tasks such as image recognition, classification, and segmentation. It is compatible with a wide range of machine learning frameworks and workflows, ensuring seamless integration into your projects.

    7. Licensing & Compliance: the dataset complies fully with data privacy regulations and offers transparent licensing for both commercial and academic use.

    Use Cases:

    1. Training AI systems for plant recognition and classification.
    2. Enhancing agricultural AI models for plant health assessment and species identification.
    3. Building datasets for educational tools and augmented reality applications.
    4. Supporting biodiversity and conservation research through AI-powered analysis.

    This dataset offers a comprehensive, diverse, and high-quality resource for training AI and ML models, tailored to deliver exceptional performance for your projects. Customizations are available to suit specific project needs. Contact us to learn more!
