Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Tsinghua Dogs Dataset with ground truth labels for breeds in YOLOv5 format.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
An object detection dataset used for training, validating and evaluating sea urchin detection models. This dataset contains 9,872 images and over 44,000 annotations belonging to three urchin species from a variety of locations around New Zealand and Australia.
Complete_urchin_dataset.csv contains a full list of the images in the dataset with their corresponding bounding boxes and additional metadata, including image source, campaign/deployment names, latitude, longitude, depth, altitude, timestamps and more. High_conf_clipped_dataset.csv is a preprocessed version of the complete dataset in which annotations with low annotator confidence scores (< 0.7) have been removed, annotations flagged for review have been removed, and all bounding boxes have been clipped to fit within the bounds of their images.
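As a hedged sketch of that preprocessing step in pandas (the column names `confidence`, `flagged`, `x_min`/`y_min`/`x_max`/`y_max`, `image_width` and `image_height` are assumptions, not the actual CSV schema):

```python
import pandas as pd

# Load the full annotation list (column names below are assumptions).
df = pd.read_csv("Complete_urchin_dataset.csv")

# Keep only confident, unflagged annotations ("flagged" assumed boolean).
df = df[(df["confidence"] >= 0.7) & (~df["flagged"])]

# Clip bounding boxes to the image bounds.
df["x_min"] = df["x_min"].clip(lower=0)
df["y_min"] = df["y_min"].clip(lower=0)
df["x_max"] = df[["x_max", "image_width"]].min(axis=1)
df["y_max"] = df[["y_max", "image_height"]].min(axis=1)

df.to_csv("High_conf_clipped_dataset.csv", index=False)
```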
Run the download_images.py script to download all the images from the URLs in the CSV files.
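The script ships with the dataset; for orientation, a minimal sketch of what such a downloader does, assuming `url` and `image_name` columns (both names are hypothetical):

```python
import os

import pandas as pd
import requests

df = pd.read_csv("Complete_urchin_dataset.csv")
os.makedirs("images", exist_ok=True)

# Download each image once, keyed on an assumed image_name column.
for _, row in df.drop_duplicates(subset="image_name").iterrows():
    path = os.path.join("images", row["image_name"])
    if not os.path.exists(path):
        resp = requests.get(row["url"], timeout=30)
        resp.raise_for_status()
        with open(path, "wb") as f:
            f.write(resp.content)
```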
Labels.zip (YOLOv5 formatted txt bounding box label files), yolov5_dataset.yaml (YOLOv5 dataset configuration file) and train/val/test.txt (training, validation and test splits) can be used to train a YOLOv5 object detection model on this dataset.
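For reference, each line in a YOLOv5 label file is `class x_center y_center width height`, with coordinates normalized to [0, 1]. A small helper to convert those rows back to pixel-space boxes (a sketch, not part of the dataset tooling):

```python
def read_yolo_labels(label_path, img_w, img_h):
    """Parse a YOLOv5 .txt label file into pixel-space corner boxes."""
    boxes = []
    with open(label_path) as f:
        for line in f:
            cls, xc, yc, w, h = line.split()
            xc, yc, w, h = (float(v) for v in (xc, yc, w, h))
            # Convert normalized centre/size to pixel corner coordinates.
            x_min = (xc - w / 2) * img_w
            y_min = (yc - h / 2) * img_h
            x_max = (xc + w / 2) * img_w
            y_max = (yc + h / 2) * img_h
            boxes.append((int(cls), x_min, y_min, x_max, y_max))
    return boxes
```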
See https://github.com/kraw084/Urchin-Detector for code, models and more documentation relating to this dataset.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Further investigation is needed to improve the identification and classification of fish in underwater images using artificial intelligence, specifically deep learning. Questions that need to be explored include the importance of using diverse backgrounds, the effect of (not) labeling small fish on precision, the number of images needed for successful classification, and whether they should be randomly selected. To address these questions, a new labeled dataset was created with over 18,400 recorded Mediterranean fish from 20 species in over 1,600 underwater images with different backgrounds. Two state-of-the-art object detectors/classifiers, YOLOv5m and Faster R-CNN, were compared for the detection of the ‘fish’ category in different datasets. YOLOv5m performed better and was thus selected for classifying an increasing number of species in six combinations of labeled datasets varying in background types, balanced or unbalanced number of fishes per background, number of labeled fish, and quality of labeling. Results showed that i) it is cost-efficient to work with a reduced labeled set (a few hundred labeled objects per category) if images are carefully selected, ii) the usefulness of the trained model for classifying unseen datasets improves with the use of different backgrounds in the training dataset, and iii) avoiding training with low-quality labels (e.g., small relative size or incomplete silhouettes) yields better classification metrics. These results and the dataset will help select and label images in the most effective way to improve the use of deep learning in studying underwater organisms.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
**Road Sign Detection: Project Overview**
The Road Sign Detection project aims to develop a robust and accurate machine learning model for detecting and classifying road signs in real-time, using advanced computer vision techniques. This project serves as a critical component in the development of autonomous driving systems, intelligent transportation, and driver-assistance technologies, enhancing road safety by reliably identifying road signs under diverse conditions.
**Project Objectives**
- **Detection and Classification:** Detect the presence of road signs in images or video frames and classify them accurately according to specific sign categories.
- **Real-Time Performance:** Optimize the model to achieve real-time inference speeds suitable for deployment in systems where latency is critical, such as autonomous vehicles or traffic monitoring systems.
- **Generalization Across Environments:** Ensure high performance across varied lighting, weather, and geographical conditions by training on a diverse dataset of annotated road signs.

**Classes and Tags**

This project involves multiple classes of road signs, which may include, but are not limited to:

**Data Collection and Annotation**

- **Dataset Size:** 739 annotated images.
- **Data Annotation:** Each image has been manually annotated to include precise bounding boxes around each road sign, ensuring high-quality training data.
- **Data Diversity:** The dataset includes images taken from various perspectives, in different lighting conditions, and at varying levels of image clarity to improve the model's robustness.

**Current Status and Timeline**

- **Data Collection and Annotation:** Completed.
- **Model Training:** Ongoing, with initial results demonstrating promising accuracy in detecting and classifying road signs.
- **Deployment:** Plans are underway to deploy the model on edge devices, making it suitable for use in real-world applications where immediate response times are critical.
- **Project Timeline:** The project is set to complete the final stages of training and optimization within the next two months, with active testing and iterative improvements ongoing.

**External Resources**

- **Project on Roboflow Universe:** View Project on Roboflow Universe
- **Documentation and API Reference:** Detailed documentation on the dataset structure, model training parameters, and deployment options can be accessed within the Roboflow workspace.

**Contribution and Labeling Guidelines**

Contributors are welcome to expand the dataset by labeling additional road sign images and diversifying annotations. To maintain consistency:

- **Labeling Standards:** Use bounding boxes to tightly enclose each road sign, ensuring no extra space or missing parts.
- **Quality Control:** Annotated images should be reviewed for accuracy, clarity, and proper categorization according to the predefined class types.

This Road Sign Detection project is publicly listed on Roboflow Universe, where users and collaborators can download, contribute to, or learn more about the dataset and model performance.
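Because the project is hosted on Roboflow Universe, a hosted version of the trained model can typically be queried with the Roboflow Python SDK. A hedged sketch (the API key, project ID, version number, and image path are placeholders, not the project's actual identifiers):

```python
from roboflow import Roboflow

# API key, project ID, and version below are placeholders.
rf = Roboflow(api_key="YOUR_API_KEY")
project = rf.workspace().project("road-sign-detection")
model = project.version(1).model

# Run inference on a local image and print the predicted signs.
prediction = model.predict("road.jpg", confidence=40, overlap=30)
for box in prediction.json()["predictions"]:
    print(box["class"], box["confidence"])
```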
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
DA: Data augmentation, mAP: mean average precision, YOLO: you only look once.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Background: Urogenital schistosomiasis is considered a Neglected Tropical Disease (NTD) by the World Health Organization (WHO). It is estimated to affect 150 million people worldwide, with a high relevance in resource-poor settings of the African continent. The gold-standard diagnosis is still direct observation of Schistosoma haematobium eggs in urine samples by optical microscopy. Novel diagnostic techniques based on digital image analysis by Artificial Intelligence (AI) tools are a suitable alternative for schistosomiasis diagnosis.

Methodology: Digital images of 24 urine sediment samples were acquired in non-endemic settings. S. haematobium eggs were manually labeled in the digital images by laboratory professionals and used to train YOLOv5 and YOLOv8 models for automatic detection and localization of the eggs. Urine sediment images were also employed for binary image classification to detect erythrocytes/leukocytes with the MobileNetv3Large, EfficientNetv2, and NasNetLarge models. A robotized microscope system was employed to automatically move the slide along the X-Y axes and to auto-focus the sample.

Results: A total of 1,189 labels were annotated in 1,017 digital images from urine sediment samples. YOLOv5x training demonstrated 99.3% precision, 99.4% recall, 99.3% F-score, and 99.4% mAP@0.5 for S. haematobium detection. NasNetLarge achieved 85.6% accuracy for erythrocyte/leukocyte detection on the test dataset. Convolutional neural network training and comparison demonstrated that YOLOv5x for egg detection and NasNetLarge for binary image classification of erythrocytes/leukocytes were the best options for our digital image database.

Conclusions: The development of low-cost novel diagnostic techniques based on the detection and identification of S. haematobium eggs in urine by AI tools would be a suitable alternative to conventional microscopy in non-endemic settings. This technical proof-of-principle study lays the basis for improving the system and optimizing its implementation in laboratories.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Object Tracking on a Monopoly Game Board
Author: Nathan Hoebeke
Supervisors: Maxim Van de Wynckel, Prof. Dr. Beat Signer
About
The goal of this dataset was to track game pieces on the physical game board of Monopoly. We make use of object classification, where our training data consists of 100 pictures (taken at an angle) of the game board, in order to classify the individual (moving) pieces. The training dataset was recorded on the 9th of April 2023 and the test data on the 7th of May 2023, using an iPhone 13 mini and an iPhone 12.
Two participants played a game of Monopoly and each individually took pictures of the current game state after every move. These images were then processed by our application to determine the location of pawns and other game pieces such as the red and green houses.
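The processed pictures in this dataset (see the directory listing below) were produced by applying a perspective transformation to the angled photos. As an illustration of that step, a minimal OpenCV sketch; the corner coordinates and file names are hypothetical, and in the actual application the corners come from detecting the board:

```python
import cv2
import numpy as np

img = cv2.imread("GAME_2023-04-09_example.jpg")  # hypothetical file name

# Corners of the board in the angled photo (hypothetical values);
# in practice these come from detecting the board outline.
src = np.float32([[420, 310], [1580, 295], [1720, 1430], [280, 1450]])
# Target: a square, top-down 1000x1000 canvas.
dst = np.float32([[0, 0], [1000, 0], [1000, 1000], [0, 1000]])

matrix = cv2.getPerspectiveTransform(src, dst)
top_down = cv2.warpPerspective(img, matrix, (1000, 1000))
cv2.imwrite("canvas.jpg", top_down)
```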
Raw images are unprocessed but may have minor edits to ensure anonymisation of participants in the background. We used Roboflow to label the dataset and to train the model; the labelled dataset is included in this repository.
For more information about our processing and this dataset you can download the full Bachelor thesis here: https://wise.vub.ac.be/thesis/location-tracking-physical-game-board (download link available after embargo at the end of the academic year)
This dataset was published as part of the bachelor thesis *Location Tracking on a Physical Game Board* for obtaining the degree of Bachelor in Computer Sciences at the Vrije Universiteit Brussel.
Data
| Data | Pictures | Device |
|---|---|---|
| Training | 213 | iPhone 13 mini |
| Test #1 | 102 | iPhone 12 |
| Test #2 | 93 | iPhone 13 mini |
Dataset contents

- `model`: Trained YOLOv5 model with labels. This dataset can also be found here.
- `train`: Training data made by the author.
  - `raw`: Raw pictures of the game board at various states.
    - `GAME_2023-04-09_`: Images formatted based on the date and time when they were captured.
  - `processed`: Processed pictures with perspective transformation applied.
    - `canvas`: (Pre)-processed image.
- `test`: Test data made by independent participants.
  - `participant_1`: Participant 1 data
    - `raw`: Raw pictures of the game board taken by the participant after every move.
      - `GAME_2023-05-07_`: Images formatted based on the date and time when they were captured.
    - `processed`: Processed pictures with perspective transformation applied. Yellow rectangles are included when our own algorithm was able to determine the location.
      - `canvas`: Processed image.
  - `participant_2`: Participant 2 data
    - `raw`: Raw pictures of the game board taken by the participant after every move.
      - `GAME_2023-05-07_`: Images formatted based on the date and time when they were captured.
- `README.md`: Documentation and information about the dataset.

License
This license applies to the dataset for the game Monopoly. Any artwork or intellectual property from the game that is captured by this dataset is property of Hasbro, Inc.
Copyright 2022-2023 Nathan Hoebeke, Beat Signer, Maxim Van de Wynckel, Vrije Universiteit Brussel
Permission is hereby granted, free of charge, to any person obtaining a copy of this dataset and associated documentation files (the “Dataset”), to deal in the Dataset without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Dataset, and to permit persons to whom the Dataset is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions that make use of the Dataset.
THE DATASET IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE DATASET OR THE USE OR OTHER DEALINGS IN THE DATASET.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Supplementary Data Protocol
This supplementary dataset includes all files necessary to reproduce and evaluate the training and validation of YOLOv8 and CNN models for detecting GUS-stained and haustoria-containing cells with the BluVision Haustoria software.
1. gus_training_set_yolo/
- Contains the complete YOLOv8-compatible training dataset for GUS classification.
- Format: PyTorch YOLOv5/8 structure from Roboflow export.
- Subfolders:
- train/, test/, val/: Image sets and corresponding label files.
- data.yaml: Configuration file specifying dataset structure and classes.
2. haustoria_training_set_yolo/
- Contains the complete YOLOv8-compatible training dataset for haustoria detection.
- Format identical to gus_training_set_yolo/.
3. haustoria_training_set_cnn/
- Dataset formatted for CNN-based classification.
- Structure:
- gus/: Images of cells without haustoria.
- hau/: Images of cells with haustoria.
- Suitable for binary classification pipelines (e.g., Keras, PyTorch); see the loader sketch after this list.
4. yolo_models/
- Directory containing the final trained YOLOv8 model weights.
- Includes:
- gus.pt: YOLOv8 model trained on GUS data.
- haustoria.pt: YOLOv8 model trained on haustoria data.
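Two hedged usage sketches (file paths and parameters below are placeholders, not part of the supplementary files). First, the CNN dataset in 3. follows the class-per-folder layout that Keras's directory loader consumes directly, with gus/ and hau/ inferred as the two classes:

```python
import tensorflow as tf

# The two subfolders (gus/, hau/) are inferred as the binary labels.
train_ds = tf.keras.utils.image_dataset_from_directory(
    "haustoria_training_set_cnn",
    labels="inferred",
    label_mode="binary",
    image_size=(224, 224),  # arbitrary choice
    batch_size=32,
)
```

Second, the trained weights in 4. can be loaded with the ultralytics package that implements YOLOv8:

```python
from ultralytics import YOLO

# Load one of the supplied weight files and run detection on an image
# ("example.png" is a placeholder path).
model = YOLO("yolo_models/gus.pt")
results = model.predict("example.png", conf=0.25)  # threshold is arbitrary
for box in results[0].boxes:
    print(int(box.cls), float(box.conf))
```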