Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Enhanced Image Segmentation using Double Hybrid DEGA and PSO-SA
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
This dataset was captured for the purpose of segmenting and classifying terrain based on the movability constraints of three different mobile robots; see "Semantic Terrain Segmentation with an Original RGB Data Set, Targeting Elevation Differences."
The dataset aims to enable autonomous terrain segmentation and classification based on the height characteristics of the terrain.
The name of the dataset, Vale, is inspired by the capture location: Campus do Vale, Federal University of Rio Grande do Sul (UFRGS), Brazil.
The data is primarily intended for use with DeepLabv3+ but can be used for any semantic image segmentation purpose.
Environment: Semi-urban
Source: DJI Mavic Pro
Images: 600
Size: 1920x1080 (RGB)
Camera angle: 45 degrees towards the ground
Altitude: ~2 meters.
Area: Campus Do Vale UFRGS
Time of the day: Midday
Capture Date: November 20th, 2018 and May 6th, 2019
Naming: 5-digit name (ex. 03001.*); the first two digits (03) identify the source video the frame was extracted from, and the following three digits (001) give the image/frame number (see the parsing sketch after the class table below).
Classes | Height characteristics | Color | 8-bit code |
---|---|---|---|
Non-Traversable | > 200 mm | Red | 4 |
Legged | (50 -> 200] mm | Orange | 3 |
Belted/Tracked | (20 -> 50] mm | Yellow | 2 |
Wheeled | [0 -> 20] mm | Green | 1 |
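To make the naming scheme and the 8-bit class codes above concrete, here is a minimal Python sketch; the helper function and the mapping variable are illustrative only and not part of the dataset's tooling:

```python
# Illustrative helper for the Vale naming scheme and 8-bit class codes (not dataset tooling).
CLASS_NAMES = {1: "Wheeled", 2: "Belted/Tracked", 3: "Legged", 4: "Non-Traversable"}

def parse_frame_name(stem: str) -> tuple[int, int]:
    """Split a 5-digit name like '03001' into (source video, frame number)."""
    return int(stem[:2]), int(stem[2:])

video_id, frame_id = parse_frame_name("03001")
print(f"video {video_id:02d}, frame {frame_id:03d}")  # -> video 03, frame 001
print(CLASS_NAMES[4])                                 # -> Non-Traversable
```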
(Figures not reproduced here: Vale v2 class pixel distribution for the original and filled label sets, and per-class segment counts.)
Dataset captured by: Sadegh Hosseinpoor, Mathias Mantelli and Diego "kindin" Pittol.
@inproceedings{dutta2019vgg,
author = {Dutta, Abhishek and Zisserman, Andrew},
title = {The {VIA} Annotation Software for Images, Audio and Video},
booktitle = {Proceedings of the 27th ACM International Conference on Multimedia},
series = {MM '19},
year = {2019},
isbn = {978-1-4503-6889-6/19/10},
location = {Nice, France},
numpages = {4},
url = {https://doi.org/10.1145/3343031.3350535},
doi = {10.1145/3343031.3350535},
publisher = {ACM},
address = {New York, NY, USA},
}
@misc{dutta2016via,
author = "Dutta, A. and Gupta, A. and Zissermann, A.",
title = "{VGG} Image Annotator ({VIA})",
year = "2016",
howpublished = "http://www.robots.ox.ac.uk/~vgg/software/via/",
note = "Version: 1.0.6, Accessed: 18/02/2019"
}
@inproceedings{deeplabv3plus2018,
title={Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation},
author={Liang-Chieh Chen and Yukun Zhu and George Papandreou and Florian Schroff and Hartwig Adam},
booktitle={ECCV},
year={2018}
}
We hope this will be of use for the machine vision community and push for further development of the field!
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
1. Digitized specimens are an indispensable resource for rapidly acquiring big datasets and typically must be preprocessed prior to conducting analyses. One crucial image preprocessing step in any image analysis workflow is image segmentation, or the ability to clearly contrast the foreground target from the background noise in an image. This procedure is typically done manually, creating a potential bottleneck for efforts to quantify biodiversity from image databases. Image segmentation meta-algorithms using deep learning provide an opportunity to relax this bottleneck. However, the most accessible pre-trained convolutional neural networks (CNNs) have been trained on a small fraction of biodiversity, thus limiting their utility.
2. We trained a deep learning model to automatically segment target fish from images with both standardized and complex, noisy backgrounds. We then assessed the performance of our deep learning model using qualitative visual inspection and quantitative image segmentation metrics of pixel overlap between reference segmentation masks generated manually by experts and those automatically predicted by our model.
3. Visual inspection revealed that our model segmented fishes with high precision and relatively few artifacts. These results suggest that the meta-algorithm (Mask R-CNN) on which our current fish segmentation model relies is well-suited for generating high-fidelity segmented specimen images across a variety of background contexts at a rapid pace.
4. We present Sashimi, a user-friendly command line toolkit to facilitate rapid, automated high-throughput image segmentation of digitized organisms. Sashimi is accessible to non-programmers and does not require experience with deep learning to use. The flexibility of Mask R-CNN allows users to generate a segmentation model for use on diverse animal and plant images using transfer learning with training datasets as small as a few hundred images. To help grow the taxonomic scope of images that can be recognized, Sashimi also includes a central database for sharing and distributing custom-trained segmentation models of other unrepresented organisms. Lastly, Sashimi includes both auxiliary image preprocessing functions useful for some popular downstream color pattern analysis workflows, as well as a simple script to aid users in qualitatively and quantitatively assessing segmentation model performance for complementary sets of automatically and manually segmented images.
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Segment Anything Model (SAM) with Prioritized Memory

Overview: The Segment Anything Model (SAM) by Meta is a state-of-the-art image segmentation model leveraging vision transformers. However, it suffers from high memory usage and computational inefficiencies. Our research introduces a prioritized memory mechanism to enhance SAM’s performance while optimizing resource consumption.

Methodology: We propose a structured memory hierarchy to efficiently manage image embeddings and self-attention… See the full description on the dataset page: https://huggingface.co/datasets/vinit000/Enhancing-Segment-Anything-Model-with-Prioritized-Memory-For-Efficient-Image-Embeddings.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Introduction: Image segmentation is an important process for quantifying characteristics of malignant bone lesions, but this task is challenging and laborious for radiologists. Deep learning has shown promise in automating image segmentation in radiology, including for malignant bone lesions. The purpose of this review is to investigate deep learning-based image segmentation methods for malignant bone lesions on Computed Tomography (CT), Magnetic Resonance Imaging (MRI), and Positron-Emission Tomography/CT (PET/CT).

Method: The literature search of deep learning-based image segmentation of malignant bony lesions on CT and MRI was conducted in PubMed, Embase, Web of Science, and Scopus electronic databases following the guidelines of Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA). A total of 41 original articles published between February 2017 and March 2023 were included in the review.

Results: The majority of papers studied MRI, followed by CT, PET/CT, and PET/MRI. There was relatively even distribution of papers studying primary vs. secondary malignancies, as well as utilizing 3-dimensional vs. 2-dimensional data. Many papers utilize custom-built models as a modification or variation of U-Net. The most common metric for evaluation was the Dice similarity coefficient (DSC). Most models achieved a DSC above 0.6, with medians for all imaging modalities between 0.85–0.9.

Discussion: Deep learning methods show promising ability to segment malignant osseous lesions on CT, MRI, and PET/CT. Some strategies which are commonly applied to help improve performance include data augmentation, utilization of large public datasets, preprocessing including denoising and cropping, and U-Net architecture modification. Future directions include overcoming dataset and annotation homogeneity and generalizing for clinical applicability.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset contains key characteristics about the data described in the Data Descriptor An annotated fluorescence image dataset for training nuclear segmentation methods. Contents:
1. human readable metadata summary table in CSV format
2. machine readable metadata file in JSON format
A comprehensive dataset of 340K+ jewelry images sourced globally, featuring full EXIF data, including camera settings and photography details. Enriched with object and scene detection metadata, this dataset is ideal for AI model training in image recognition, classification, and segmentation.
Segmentation models perform a pixel-wise classification by classifying the pixels into different classes. The classified pixels correspond to different objects or regions in the image. These models have a wide variety of use cases across multiple domains. When used with satellite and aerial imagery, these models can help to identify features such as building footprints, roads, water bodies, crop fields, etc.

Generally, every segmentation model needs to be trained from scratch using a dataset labeled with the objects of interest. This can be an arduous and time-consuming task. Meta's Segment Anything Model (SAM) is aimed at creating a foundational model that can be used to segment (as the name suggests) anything using zero-shot learning and generalize across domains without additional training. SAM is trained on the Segment Anything 1-Billion mask dataset (SA-1B), which comprises a diverse set of 11 million images and over 1 billion masks. This makes the model highly robust in identifying object boundaries and differentiating between various objects across domains, even though it might have never seen them before. Use this model to extract masks of various objects in any image.

Using the model: Follow the guide to use the model. Before using this model, ensure that the supported deep learning libraries are installed. For more details, check Deep Learning Libraries Installer for ArcGIS.

Fine-tuning the model: This model can be fine-tuned using the SamLoRA architecture in ArcGIS. Follow the guide and refer to this sample notebook to fine-tune this model.

Input: 8-bit, 3-band imagery.

Output: Feature class containing masks of various objects in the image.

Applicable geographies: The model is expected to work globally.

Model architecture: This model is based on the open-source Segment Anything Model (SAM) by Meta.

Training data: This model has been trained on the Segment Anything 1-Billion mask dataset (SA-1B), which comprises a diverse set of 11 million images and over 1 billion masks.

Sample results: Here are a few results from the model.
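The entry above describes the ArcGIS workflow; outside ArcGIS, a minimal sketch using Meta's open-source segment-anything package might look roughly as follows (the checkpoint filename and model type are assumptions about what has been downloaded locally):

```python
import cv2
from segment_anything import sam_model_registry, SamAutomaticMaskGenerator

# Load an RGB image (SAM expects an HxWx3 uint8 array in RGB order).
image = cv2.cvtColor(cv2.imread("scene.jpg"), cv2.COLOR_BGR2RGB)

# Assumed: the ViT-H checkpoint has been downloaded locally.
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
mask_generator = SamAutomaticMaskGenerator(sam)

# Zero-shot, promptless mask generation: returns a list of dicts with
# 'segmentation', 'area', 'bbox', and quality scores for each mask.
masks = mask_generator.generate(image)
print(f"{len(masks)} masks; largest covers {max(m['area'] for m in masks)} pixels")
```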
This dataset contains scanning electron microscope (SEM) images and labels from our paper "Towards Unsupervised SEM Image Segmentation for IC Layout Extraction", which are licensed under a Creative Commons Attribution 4.0 International License (CC-BY 4.0). The SEM images cover the logic area of the metal-1 (M1) and metal-2 (M2) layers of a commercial IC produced on a 128 nm technology node. We used an electron energy of 15 keV with a backscattered electron detector and a dwell time of 3 μs for SEM capture. The images are 4096×3536 pixels in size, with a resolution of 14.65 nm per pixel and 10% overlap. We discarded images on the logic area boundaries and publish the remaining ones in random order.

We additionally provide labels for tracks and vias on the M2 layer, which are included as .svg files. For labeling, we employed automatic techniques, such as thresholding, edge detection, and size, position, and complexity filtering, before manually validating and correcting the generated labels. The labels may contain duplicates for detected vias. Tracks spanning multiple images may not be present in the label file of each image.

The implementation of our approach, as well as accompanying evaluation and utility routines, can be found in the following GitHub repository: https://github.com/emsec/unsupervised-ic-sem-segmentation

Please make sure to always cite our study when using any part of our data set or code for your own research publications!

@inproceedings{2023rothaug,
author = {Rothaug, Nils and Klix, Simon and Auth, Nicole and B{\"o}cker, Sinan and Puschner, Endres and Becker, Steffen and Paar, Christof},
title = {Towards Unsupervised SEM Image Segmentation for IC Layout Extraction},
booktitle = {Proceedings of the 2023 Workshop on Attacks and Solutions in Hardware Security},
series = {ASHES'23},
year = {2023},
month = {november},
keywords = {ic-layout-extraction; sem-image-segmentation; unsupervised-deep-learning; open-source-dataset},
url = {https://doi.org/10.1145/3605769.3624000},
doi = {10.1145/3605769.3624000},
isbn = {9798400702624},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA}
}
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Imaging Data Commons (IDC)(https://imaging.datacommons.cancer.gov/) [1] connects researchers with publicly available cancer imaging data, often linked with other types of cancer data. Many of the collections have limited annotations due to the expense and effort required to create these manually. The increased capabilities of AI analysis of radiology images provide an opportunity to augment existing IDC collections with new annotation data. To further this goal, we trained several nnUNet [2] based models for a variety of radiology segmentation tasks from public datasets and used them to generate segmentations for IDC collections.
To validate the model's performance, roughly 10% of the AI predictions were assigned to a validation set. For this set, a board-certified radiologist graded the quality of AI predictions on a Likert scale. If they did not 'strongly agree' with the AI output, the reviewer corrected the segmentation.
This record provides the AI segmentations, Manually corrected segmentations, and Manual scores for the inspected IDC Collection images.
Only 10% of the AI-derived annotations provided in this dataset are verified by expert radiologists. More details on model training and annotations are provided within the associated manuscript to ensure transparency and reproducibility.
This work was done in two stages. Versions 1.x of this record were from the first stage; Versions 2.x added additional records. In the Version 1.x collections, a medical student (non-expert) reviewed all the AI predictions and rated them on a 5-point Likert scale; for any AI predictions in the validation set that they did not 'strongly agree' with, the non-expert provided corrected segmentations. This non-expert was not utilized for the Version 2.x additional records.
Likert Score Definition:
Guidelines for reviewers to grade the quality of AI segmentations.
5 Strongly Agree - Use-as-is (i.e., clinically acceptable, and could be used for treatment without change)
4 Agree - Minor edits that are not necessary. Stylistic differences, but not clinically important. The current segmentation is acceptable
3 Neither agree nor disagree - Minor edits that are necessary. Minor edits are those that the reviewer judges can be made in less time than starting from scratch, or that are expected to have minimal effect on treatment outcome
2 Disagree - Major edits. This category indicates that the necessary edit is required to ensure correctness and is sufficiently significant that the user would prefer to start from scratch
1 Strongly disagree - Unusable. This category indicates that the quality of the automatic annotations is so bad that they are unusable.
Zip File Folder Structure
Each zip file in the collection correlates to a specific segmentation task. The common folder structure is
ai-segmentations-dcm This directory contains the AI model predictions in DICOM-SEG format for all analyzed IDC collection files
qa-segmentations-dcm This directory contains manually corrected segmentation files, based on the AI predictions, in DICOM-SEG format. Only a fraction, ~10%, of the AI predictions were corrected. Corrections were performed by radiologists (rad*) and non-experts (ne*)
qa-results.csv CSV file linking the study/series UIDs with the AI segmentation file, the radiologist-corrected segmentation file, and the radiologist ratings of AI performance.
qa-results.csv Columns
The qa-results.csv file contains metadata about the segmentations, their related IDC case image, as well as the Likert ratings and comments by the reviewers.
Column
Description
Collection
The name of the IDC collection for this case
PatientID
PatientID in DICOM metadata of scan. Also called Case ID in the IDC
StudyInstanceUID
StudyInstanceUID in the DICOM metadata of the scan
SeriesInstanceUID
SeriesInstanceUID in the DICOM metadata of the scan
Validation
true/false if this scan was manually reviewed
Reviewer
Coded ID of the reviewer. Radiologist IDs start with ‘rad’; non-expert IDs start with ‘ne’
AimiProjectYear
2023 or 2024. This work was split over two years. The main methodology difference between the two is that in 2023, a non-expert also reviewed the AI output, but a non-expert was not utilized in 2024.
AISegmentation
The filename of the AI prediction file in DICOM-seg format. This file is in the ai-segmentations-dcm folder.
CorrectedSegmentation
The filename of the reviewer-corrected prediction file in DICOM-seg format. This file is in the qa-segmentations-dcm folder. If the reviewer strongly agreed with the AI for all segments, they did not provide any correction file.
Was the AI predicted ROIs accurate?
This column appears for images from AimiProjectYear 2023. The reviewer rates the segmentation quality on a Likert scale. In tasks that have multiple labels in the output, there is only one rating to cover them all.
Was the AI predicted {SEGMENT_NAME} label accurate?
This column appears once for each segment in the task for images from AimiProjectYear 2024. The reviewer rates each segment's quality on a Likert scale.
Do you have any comments about the AI predicted ROIs?
Open ended question for the reviewer
Do you have any comments about the findings from the study scans?
Open ended question for the reviewer
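As an illustration of how the per-task qa-results.csv might be consumed, here is a small pandas sketch; the column names come from the table above, but the exact value formats (for example, how the Validation flag is encoded) are assumptions:

```python
import pandas as pd

qa = pd.read_csv("qa-results.csv")  # extracted from one of the task zip files

# Keep only the ~10% of series that were manually reviewed.
validated = qa[qa["Validation"].astype(str).str.lower() == "true"]

# Series where the reviewer supplied a corrected DICOM-SEG; an empty
# CorrectedSegmentation means they strongly agreed with the AI output.
corrected = validated[validated["CorrectedSegmentation"].notna()]

print(f"{len(validated)} reviewed series, {len(corrected)} with corrections")
print(validated.groupby(["AimiProjectYear", "Reviewer"]).size())
```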
File Overview
brain-mr.zip
Segment Description: brain tumor regions: necrosis, edema, enhancing
IDC Collection: UPENN-GBM
Links: model weights, github
breast-fdg-pet-ct.zip
Segment Description: FDG-avid lesions in breast from FDG PET/CT scans
IDC Collection: QIN-Breast
Links: model weights, github
breast-mr.zip
Segment Description: Breast, Fibroglandular tissue, structural tumor
IDC Collection: duke-breast-cancer-mri
Links: model weights, github
kidney-ct.zip
Segment Description: Kidney, Tumor, and Cysts from contrast enhanced CT scans
IDC Collection: TCGA-KIRC, TCGA-KIRP, TCGA-KICH, CPTAC-CCRCC
Links: model weights, github
liver-ct.zip
Segment Description: Liver from CT scans
IDC Collection: TCGA-LIHC
Links: model weights, github
liver2-ct.zip
Segment Description: Liver and Lesions from CT scans
IDC Collection: HCC-TACE-SEG, COLORECTAL-LIVER-METASTASES
Links: model weights, github
liver-mr.zip
Segment Description: Liver from T1 MRI scans
IDC Collection: TCGA-LIHC
Links: model weights, github
lung-ct.zip
Segment Description: Lung and Nodules (3mm-30mm) from CT scans
IDC Collections:
Anti-PD-1-Lung
LUNG-PET-CT-Dx
NSCLC Radiogenomics
RIDER Lung PET-CT
TCGA-LUAD
TCGA-LUSC
Links: model weights 1, model weights 2, github
lung2-ct.zip
Improved model version
Segment Description: Lung and Nodules (3mm-30mm) from CT scans
IDC Collections:
QIN-LUNG-CT, SPIE-AAPM Lung CT Challenge
Links: model weights, github
lung-fdg-pet-ct.zip
Segment Description: Lungs and FDG-avid lesions in the lung from FDG PET/CT scans
IDC Collections:
ACRIN-NSCLC-FDG-PET
Anti-PD-1-Lung
LUNG-PET-CT-Dx
NSCLC Radiogenomics
RIDER Lung PET-CT
TCGA-LUAD
TCGA-LUSC
Links: model weights, github
prostate-mr.zip
Segment Description: Prostate from T2 MRI scans
IDC Collection: ProstateX, Prostate-MRI-US-Biopsy
Links: model weights, github
Changelog
2.0.2 - Fix the brain-mr segmentations to be transformed correctly
2.0.1 - added AIMI 2024 radiologist comments to qa-results.csv
2.0.0 - added AIMI 2024 segmentations
1.X - AIMI 2023 segmentations and reviewer scores
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Brain tumors, characterized by the uncontrolled growth of abnormal cells, pose a significant threat to human health. Early detection is crucial for successful treatment and improved patient outcomes. Magnetic Resonance Imaging (MRI) is the primary diagnostic tool for brain tumors, providing detailed visualizations of the brain’s intricate structures. However, the complexity and variability of tumor shapes and locations often challenge physicians in achieving accurate tumor segmentation on MRI images. Precise tumor segmentation is essential for effective treatment planning and prognosis. To address this challenge, we propose a novel hybrid deep learning technique, Convolutional Neural Network and ResNeXt101 (ConvNet-ResNeXt101), for automated tumor segmentation and classification. Our approach commences with data acquisition from the BRATS 2020 dataset, a benchmark collection of MRI images with corresponding tumor segmentations. Next, we employ batch normalization to smooth and enhance the collected data, followed by feature extraction using the AlexNet model. This involves extracting features based on tumor shape, position, and surface characteristics. To select the most informative features for effective segmentation, we utilize an advanced meta-heuristics algorithm called Advanced Whale Optimization (AWO). AWO mimics the hunting behavior of humpback whales to iteratively search for the optimal feature subset. With the selected features, we perform image segmentation using the ConvNet-ResNeXt101 model. This deep learning architecture combines the strengths of ConvNet and ResNeXt101, a type of ConvNet with aggregated residual connections. Finally, we apply the same ConvNet-ResNeXt101 model for tumor classification, categorizing the segmented tumor into distinct types. Our experiments demonstrate the superior performance of our proposed ConvNet-ResNeXt101 model compared to existing approaches, achieving an accuracy of 99.27% for the tumor core class with a minimum learning elapsed time of 0.53 s.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Doodleverse/Segmentation Zoo Res-UNet models for Aerial/nadir/2-class (water, nowater) segmentation of RGB 1024x768 high-res. images
These Residual-UNet models have been created using Segmentation Gym* using the following dataset**:
Image size used by model: 1024 x 768 x 3 pixels
classes:
File descriptions
For each model, there are 5 files with the same root name:
1. '.json' config file: this is the file that was used by Segmentation Gym* to create the weights file. It contains instructions for how to make the model and the data it used, as well as instructions for how to use the model for prediction. It is a handy wee thing and mastering it means mastering the entire Doodleverse.
2. '.h5' weights file: this is the file that was created by the Segmentation Gym* function `train_model.py`. It contains the trained model's parameter weights. It can be called by the Segmentation Gym* function `seg_images_in_folder.py`. Models may be ensembled.
3. '_modelcard.json' model card file: this is a json file containing fields that collectively describe the model origins, training choices, and dataset that the model is based upon. There is some redundancy between this file and the `config` file (described above) that contains the instructions for the model training and implementation. The model card file is not used by the program, but it is important metadata, so it should be kept with the other files that collectively make up the model; as such, it is considered part of the model.
4. '_model_history.npz' model training history file: this numpy archive file contains numpy arrays describing the training and validation losses and metrics. It is created by the Segmentation Gym function `train_model.py`
5. '.png' model training loss and mean IoU plot: this png file contains plots of training and validation losses and mean IoU scores during model training. A subset of data inside the .npz file. It is created by the Segmentation Gym function `train_model.py`
Additionally, BEST_MODEL.txt contains the name of the model with the best validation loss and mean IoU
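For orientation, here is a short sketch of inspecting a '_model_history.npz' file without assuming its exact key names (the filename below is hypothetical):

```python
import numpy as np
import matplotlib.pyplot as plt

history = np.load("hatteras_model_history.npz")  # hypothetical filename

for key in history.files:
    arr = history[key]
    print(key, arr.shape)
    if arr.ndim == 1:          # per-epoch curves such as losses or mean IoU
        plt.plot(arr, label=key)

plt.xlabel("epoch")
plt.legend()
plt.savefig("training_history.png")
```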
References
*Segmentation Gym: Buscombe, D., & Goldstein, E. B. (2022). A reproducible and reusable pipeline for segmentation of geoscientific imagery. Earth and Space Science, 9, e2022EA002332. https://doi.org/10.1029/2022EA002332 See: https://github.com/Doodleverse/segmentation_gym
**
A comprehensive dataset of 25M+ images sourced globally, featuring full EXIF data, including camera settings and photography details. Enriched with object and scene detection metadata, this dataset is ideal for AI model training in image recognition, classification, and segmentation.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Download this file and unzip it to somewhere on your machine (although not inside the segmentation_gym folder), then see the relevant page on the Segmentation Gym wiki for further explanation.
This dataset and associated models were made by Dr Daniel Buscombe, Marda Science LLC, for the purposes of demonstrating the functionality of Segmentation Gym. The labels were created using Doodler.
Previous versions:
1.0. https://zenodo.org/record/5895128#.Y1G5s3bMIuU original release, Oct 2021, conforming to Segmentation Gym functionality on Oct 2021
2.0 https://zenodo.org/record/7036025#.Y1G57XbMIuU, Jan 23 2022, conforming to Segmentation Gym functionality on Jan 23 2022
This is version 4.0, created 2/25/23, and has been tested with Segmentation Gym using doodleverse-utils 0.0.26 https://pypi.org/project/doodleverse-utils/0.0.26/
/Users/Someone/my_segmentation_zoo_datasets
│ ├── config
│ | └── *.json
│ ├── capehatteras_data
| | ├── fromDoodler
| | | ├──images
│ | | └──labels
| | ├──npzForModel
│ | └──toPredict
│ └── modelOut
│ └── *.png
│ └── weights
│ └── *.h5
There are 4 config files:
1. /config/hatteras_l8_resunet.json
2. /config/hatteras_l8_vanilla_unet.json
3. /config/hatteras_l8_resunet_model2.json
4. /config/hatteras_l8_segformer.json
The first two are for res-unet and unet models respectively. The third one differs from the first only in the specification of kernel size. It is provided as an example of how to conduct model training experiments, modifying one hyperparameter at a time in an effort to create an optimal model. The last one is based on the new Segformer model architecture.
They all contain the same essential information and differ as indicated below
{
"TARGET_SIZE": [768,768], # the size of the imagery you wish the model to train on. This may not be the original size
"MODEL": "resunet", # model name. Otherwise, "unet" or "segformer"
"NCLASSES": 4, # number of classes
"KERNEL":9, # horizontal size of convolution kernel in pixels
"STRIDE":2, # stride in convolution kernel
"BATCH_SIZE": 7, # number of images/labels per batch
"FILTERS":6, # number of filters
"N_DATA_BANDS": 3, # number of image bands
"DROPOUT":0.1, # amount of dropout
"DROPOUT_CHANGE_PER_LAYER":0.0, # change in dropout per layer
"DROPOUT_TYPE":"standard", # type of dropout. Otherwise "spatial"
"USE_DROPOUT_ON_UPSAMPLING":false, # if true, dropout is used on upsampling as well as downsampling
"DO_TRAIN": false, # if false, the model will not train, but you will select this config file, data directory, and the program will load the model weights and test the model on the validation subset
if true, the model will train from scratch (warning! this will overwrite the existing weights file in h5 format)
"LOSS":"dice", # model training loss function, otherwise "cat" for categorical cross-entropy
"PATIENCE": 10, # number of epochs of no model improvement before training is aborted
"MAX_EPOCHS": 100, # maximum number of training epochs
"VALIDATION_SPLIT": 0.6, #proportion to use for validation
"RAMPUP_EPOCHS": 20, # [LR-scheduler] rampup to maximim
"SUSTAIN_EPOCHS": 0.0, # [LR-scheduler] sustain at maximum
"EXP_DECAY": 0.9, # [LR-scheduler] decay rate
"START_LR": 1e-7, # [LR-scheduler] start lr
"MIN_LR": 1e-7, # [LR-scheduler] min lr
"MAX_LR": 1e-4, # [LR-scheduler] max lr
"FILTER_VALUE": 0, #if >0, the size of a median filter to apply on outputs (not recommended unless you have noisy outputs)
"DOPLOT": true, #make plots
"ROOT_STRING": "hatteras_l8_aug_768", #data file (npz) prefix string
"USEMASK": false, # use the convention 'mask' in label image file names, instead of the preferred 'label'
"AUG_ROT": 5, # [augmentation] amount of rotation in degrees
"AUG_ZOOM": 0.05, # [augmentation] amount of zoom as a proportion
"AUG_WIDTHSHIFT": 0.05, # [augmentation] amount of random width shift as a proportion
"AUG_HEIGHTSHIFT": 0.05,# [augmentation] amount of random width shift as a proportion
"AUG_HFLIP": true, # [augmentation] if true, randomly apply horizontal flips
"AUG_VFLIP": false, # [augmentation] if true, randomly apply vertical flips
"AUG_LOOPS": 10, #[augmentation] number of portions to split the data into (recommended > 2 to save memory)
"AUG_COPIES": 5 #[augmentation] number iof augmented copies to make
"SET_GPU": "0" #which GPU to use. If multiple, list separated by a comma, e.g. '0,1,2'. If CPU is requested, use "-1"
"WRITE_MODELMETADATA": false, #if true, the prompts `seg_images_in_folder.py` to write detailed metadata for each sample file
"DO_CRF": true #if true, apply CRF post-processing to outputs
"LOSS_WEIGHTS": false, #if true, apply per-class weights to loss function
"MODE": "all", #'all' means use both non-augmented and augmented files, "noaug" means use non-augmented only, "aug" uses augmented only
"SET_PCI_BUS_ID": true, #if true, make keras aware of the PCI BUS ID (advanced or nonstandard GPU usage)
"TESTTIMEAUG": true, #if true, apply test-time augmentation when model in inference mode
"WRITE_MODELMETADATA": true,# if true, write model metadata per image when model in inference mode
"OTSU_THRESHOLD": true# if true, and NCLASSES=2 only, use per-image Otsu threshold rather than decision boundary of 0.5 on softmax scores
}
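The block above is annotated with comments for readability; the config files shipped with the dataset are plain JSON. A minimal sketch for reading one of them and deriving a single-hyperparameter experiment (the output filename is hypothetical):

```python
import json
from pathlib import Path

cfg = json.loads(Path("config/hatteras_l8_resunet.json").read_text())
print("model:", cfg["MODEL"], "| classes:", cfg["NCLASSES"], "| target size:", cfg["TARGET_SIZE"])

# Mirror the spirit of hatteras_l8_resunet_model2.json: change one hyperparameter at a time.
experiment = dict(cfg, KERNEL=7)
Path("config/hatteras_l8_resunet_kernel7.json").write_text(json.dumps(experiment, indent=2))
```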
Folder containing all the model input data
│ ├── capehatteras_data: folder containing all the model input data
| | ├── fromDoodler: folder containing images and labels exported from Doodler using [this program](https://github.com/dbuscombe-usgs/dash_doodler/blob/main/utils/gen_images_and_labels_4_zoo.py)
| | | ├──images: jpg format files, one per label image
│ | | └──labels: jpg format files, one per image
| | ├──npzForModel: npz format files for model training using [this program](https://github.com/dbuscombe-usgs/segmentation_zoo/blob/main/train_model.py) that have been created following the workflow [documented here](https://github.com/dbuscombe-usgs/segmentation_zoo/wiki/Create-a-model-ready-dataset) using [this program](https://github.com/dbuscombe-usgs/segmentation_zoo/blob/main/make_nd_dataset.py)
│ | └──toPredict: a folder of images to test model prediction using [this program](https://github.com/dbuscombe-usgs/segmentation_zoo/blob/main/seg_images_in_folder.py)
PNG format files containing example model outputs from the train ('_train_' in filename) and validation ('_val_' in filename) subsets, as well as an image showing training loss and accuracy curves with `trainhist` in the filename. There are two sets of these files: those associated with the residual UNet trained with Dice loss contain `resunet` in their name, and those from the UNet are named with `vanilla_unet`.
There are model weights files associated with each config file.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
MatSeg Dataset and benchmark for zero-shot material state segmentation.
The MatSeg benchmark, containing 1220 real-world images and their annotations, is available in MatSeg_Benchmark.zip; the file contains documentation and Python readers.
The MatSeg dataset, containing synthetic images infused with natural image patterns, is available in MatSeg3D_part_*.zip and MatSeg2D_part_*.zip (* stands for a number).
MatSeg3D_part_*.zip: contains synthetic 3D scenes
MatSeg2D_part_*.zip: contains synthetic 2D scenes
Readers and documentation for the synthetic data are available at: Dataset_Documentation_And_Readers.zip
Readers and documentation for the real-images benchmark are available at: MatSeg_Benchmark.zip
The Code used to generate the MatSeg Dataset is available at: https://zenodo.org/records/11401072
Additional permanent sources for downloading the dataset and metadata: 1, 2
Evaluation scripts for the Benchmark are now available at:
https://zenodo.org/records/13402003 and https://e.pcloud.link/publink/show?code=XZsP8PZbT7AJzG98tV1gnVoEsxKRbBl8awX
Materials and their states form a vast array of patterns and textures that define the physical and visual world. Minerals in rocks, sediment in soil, dust on surfaces, infection on leaves, stains on fruits, and foam in liquids are just a few of this almost infinite number of states and patterns.
Image segmentation of materials and their states is fundamental to the understanding of the world and is essential for a wide range of tasks, from cooking and cleaning to construction, agriculture, and chemistry laboratory work.
The MatSeg dataset focuses on zero-shot segmentation of materials and their states, meaning identifying the region of an image belonging to a specific material type of state, without previous knowledge or training of the material type, states, or environment.
The dataset contains a large set of (100k) synthetic images and benchmarks of 1220 real-world images for testing.
The benchmark contains 1220 real-world images with a wide range of material states and settings. For example: food states (cooked/burned), plants (infected/dry), rocks/soil (minerals/sediment), construction/metals (rusted, worn), liquids (foam/sediment), and many other states, without being limited to a set of classes or environments. The goal is to evaluate the segmentation of materials without knowledge of or pretraining on the material or setting. The focus is on materials with complex scattered boundaries and gradual transitions (like the level of wetness of a surface).
Evaluation scripts for the Benchmark are now available at: 1 and 2.
The synthetic dataset is composed of synthetic scenes rendered in 2D and 3D using Blender. The synthetic data is infused with patterns, materials, and textures automatically extracted from real images, allowing it to capture the complexity and diversity of the real world while maintaining the precision and scale of synthetic data. 100k images and their annotations are available to download.
License
This dataset, including all its components, is released under the CC0 1.0 Universal (CC0 1.0) Public Domain Dedication. To the extent possible under law, the authors have dedicated all copyright and related and neighboring rights to this dataset to the public domain worldwide. This dedication applies to the dataset and all derivative works.
The MatSeg 2D and 3D synthetic data were generated using the Open Images dataset, which is licensed under https://www.apache.org/licenses/LICENSE-2.0. For these components, you must comply with the terms of the Apache License. In addition, the MatSeg3D dataset uses ShapeNet 3D assets under a GNU license.
An example of training and evaluation code for a net trained on the dataset and evaluated on the benchmark is given at these URLs: 1, 2. This includes an evaluation script for the MatSeg benchmark, a training script using the MatSeg dataset, and the weights of a trained model.
Paper:
More detail on the work can be found in the paper "Infusing Synthetic Data with Real-World Patterns for Zero-Shot Material State Segmentation".
Croissant metadata and additional sources for downloading the dataset are available at 1, 2.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset contains key characteristics about the data described in the Data Descriptor Segmentation of vestibular schwannoma from MRI, an open annotated dataset and baseline algorithm. Contents:
1. human readable metadata summary table in CSV format
2. machine readable metadata file in JSON format
This dataset features over 80,000 high-quality texture images sourced from photographers and visual creators worldwide. Curated specifically for AI and machine learning applications, it offers an extensively annotated and diverse range of natural and man-made surface patterns ideal for generative models, segmentation tasks, and visual synthesis.
Key Features: 1. Comprehensive Metadata: each image includes full EXIF data—covering camera settings like aperture, ISO, and shutter speed—along with annotations for texture type (e.g., wood, metal, fabric), material properties (e.g., glossy, rough, porous), and pattern complexity. Lighting and angle metadata enhance use in 3D modeling and neural rendering.
2. Unique Sourcing Capabilities: images are obtained via a proprietary gamified photography platform, with specialized challenges in surface, pattern, and material photography. Custom datasets can be sourced within 72 hours, targeting specific texture families (e.g., stone, skin, rust, bark) or resolution/format preferences (tileable, seamless, 4K+).
3. Global Diversity: textures have been photographed in over 100 countries, capturing a vast range of environmental and cultural surfaces—natural, industrial, architectural, and organic. This supports generalization in AI models across geographies and use-cases.
4. High-Quality Imagery: images are captured with professional and enthusiast gear, producing ultra-detailed macro and wide-frame shots. Many textures are seamless or tileable by design, supporting use in gaming, 3D rendering, and AR/VR environments.
5. Popularity Scores: each image carries a popularity score from its performance in GuruShots competitions. These scores can guide dataset curation for aesthetic training, visual taste modeling, or generative art.
6. AI-Ready Design: the dataset is structured for use in training generative models (e.g., GANs), segmentation algorithms, material classification, and image style transfer. It integrates easily with common ML pipelines and 3D content creation tools.
7. Licensing & Compliance: all content is fully compliant with international IP and commercial use regulations. Licensing is clear and adaptable to use in visual effects, gaming, AR/VR, and academic research.
Use Cases: 1. Training AI models for texture synthesis, material recognition, and 3D surface recreation. 2. Powering generative design tools for visual art, games, and virtual environments. 3. Enhancing AR/VR realism with high-quality tileable textures. 4. Supporting style transfer, neural rendering, and vision-based inspection systems.
This dataset delivers a scalable, high-resolution resource for AI applications in visual effects, design, gaming, and synthetic data creation. Custom texture packs and formats are available. Contact us to learn more!
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset contains key characteristics about the data described in the Data Descriptor Serial scanning electron microscopy of anti-PKHD1L1 immuno-gold labeled mouse hair cell stereocilia bundles. Contents:
1. human readable metadata summary table in CSV format
2. machine readable metadata file in JSON format
Coast Train is a library of images of coastal environments, annotations, and corresponding thematic label masks (or ‘label images’) collated for the purposes of training and evaluating machine learning (ML), deep learning, and other models for image segmentation. It includes image sets from both geospatial satellite, aerial, and UAV imagery and orthomosaics, as well as non-geospatial oblique and nadir imagery. Images include a diverse range of coastal environments from the U.S. Pacific, Gulf of Mexico, Atlantic, and Great Lakes coastlines, consisting of time-series of high-resolution (≤1m) orthomosaics and satellite image tiles (10–30m). Each image, image annotation, and labelled image is available as a single NPZ zipped file. NPZ files follow the following naming convention: {datasource}{numberofclasses}{threedigitdatasetversion}.zip, where {datasource} is the source of the original images (for example, NAIP, Landsat 8, Sentinel 2), {numberofclasses} is the number of classes used to annotate the images, and {threedigitdatasetversion} is the three-digit code corresponding to the dataset version (in other words, 001 is version 1). Each zipped folder contains a collection of NPZ format files, each of which corresponds to an individual image. An individual NPZ file is named after the image that it represents and contains (1) a CSV file with detail information for every image in the zip folder and (2) a collection of the following NPY files: orig_image.npy (original input image unedited), image.npy (original input image after color balancing and normalization), classes.npy (list of classes annotated and present in the labelled image), doodles.npy (integer image of all image annotations), color_doodles.npy (color image of doodles.npy), label.npy (labelled image created from the classes present in the annotations), and settings.npy (annotation and machine learning settings used to generate the labelled image from annotations). All NPZ files can be extracted using the utilities available in Doodler (Buscombe, 2022). A merged CSV file containing detail information on the complete imagery collection is available at the top level of this data release, details of which are available in the Entity and Attribute section of this metadata file.
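As a quick illustration of how one of these per-image NPZ files might be read, here is a small numpy sketch; the key names are inferred from the NPY file names listed above and should be verified against data.files:

```python
import numpy as np

with np.load("example_image.npz", allow_pickle=True) as data:
    print(data.files)              # confirm which arrays this file actually contains
    image = data["image"]          # color-balanced, normalized input image (assumed key)
    label = data["label"]          # labelled image built from the annotations (assumed key)
    classes = data["classes"]      # classes present in this label image (assumed key)

print(image.shape, label.shape, classes)
```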
This repository hosts the results of processing example imaging mass cytometry (IMC) data hosted at 10.5281/zenodo.5949116 using the IMC Segmentation Pipeline available at https://github.com/BodenmillerGroup/ImcSegmentationPipeline (DOI: 10.5281/zenodo.6402666). Please refer to https://github.com/BodenmillerGroup/steinbock as an alternative processing framework and 10.5281/zenodo.6043600 for the data generated by steinbock.

The following folders are part of analysis.zip when running the IMC Segmentation Pipeline:
cpinp: contains input files for the segmentation pipeline
cpout: contains all final output files of the pipeline: cell.csv containing the single-cell features; Experiment.csv containing CellProfiler metadata; Image.csv containing acquisition metadata; Object relationships.csv containing an edge list indicating interacting cells; panel.csv containing channel information; var_cell.csv containing cell feature information; var_Image.csv containing acquisition feature information; images containing the hot-pixel-filtered multi-channel images and the channel order; masks containing the segmentation masks; probabilities containing the pixel probabilities
histocat: contains single-channel .tiff files per acquisition for upload to histoCAT (https://bodenmillergroup.github.io/histoCAT/)
crops: contains upscaled image crops in .h5 format for ilastik (https://www.ilastik.org/) training
ometiff: contains .ome.tiff files per acquisition, .png files per panorama, and additional metadata files per slide
ilastik: multi-channel images for ilastik pixel classification (_ilastik.full) and their channel order (_ilastik.csv); upscaled multi-channel images for ilastik pixel prediction (_ilastik_s2.h5); upscaled 3-channel images containing ilastik pixel probabilities (_ilastik_s2_Probabilities.tiff)

The remaining files are part of the root directory:
docs.zip: documentation of the pipeline in markdown format
IMCWorkflow.ilp: ilastik pixel classifier pre-trained on the example data
resources.zip: the CellProfiler pipelines and CellProfiler plugins used for the analysis
sample_metadata.xlsx: metadata per sample including the cancer type
scripts.zip: Python notebooks used for pre-processing and downloading the example data
src.zip: scripts for the imcsegpipe python package
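As a small usage sketch, the single-cell table in cpout can be joined to the acquisition metadata; the ImageNumber join key follows the usual CellProfiler convention and should be checked against var_cell.csv and var_Image.csv:

```python
import pandas as pd

cells = pd.read_csv("analysis/cpout/cell.csv")
images = pd.read_csv("analysis/cpout/Image.csv")

# Assumed join key: CellProfiler conventionally indexes objects by ImageNumber.
merged = cells.merge(images, on="ImageNumber", suffixes=("", "_image"))

print(f"{len(cells)} cells across {cells['ImageNumber'].nunique()} acquisitions")
print(merged.shape)
```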