Coast Train is a library of images of coastal environments, annotations, and corresponding thematic label masks (or ‘label images’) collated for the purposes of training and evaluating machine learning (ML), deep learning, and other models for image segmentation. It includes image sets from geospatial satellite, aerial, and UAV imagery and orthomosaics, as well as non-geospatial oblique and nadir imagery. Images include a diverse range of coastal environments from the U.S. Pacific, Gulf of Mexico, Atlantic, and Great Lakes coastlines, consisting of time-series of high-resolution (≤1 m) orthomosaics and satellite image tiles (10–30 m). Each image, image annotation, and labelled image is available as a single NPZ zipped file. NPZ files follow this naming convention: {datasource}{numberofclasses}{threedigitdatasetversion}.zip, where {datasource} is the source of the original images (for example, NAIP, Landsat 8, Sentinel 2), {numberofclasses} is the number of classes used to annotate the images, and {threedigitdatasetversion} is the three-digit code corresponding to the dataset version (in other words, 001 is version 1). Each zipped folder contains a collection of NPZ format files, each of which corresponds to an individual image. An individual NPZ file is named after the image that it represents and contains (1) a CSV file with detailed information for every image in the zip folder and (2) a collection of the following NPY files: orig_image.npy (original input image, unedited), image.npy (original input image after color balancing and normalization), classes.npy (list of classes annotated and present in the labelled image), doodles.npy (integer image of all image annotations), color_doodles.npy (color image of doodles.npy), label.npy (labelled image created from the classes present in the annotations), and settings.npy (annotation and machine learning settings used to generate the labelled image from annotations). All NPZ files can be extracted using the utilities available in Doodler (Buscombe, 2022). A merged CSV file containing detailed information on the complete imagery collection is available at the top level of this data release, details of which are available in the Entity and Attribute section of this metadata file.
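As a rough illustration (not the official Doodler workflow), a single extracted NPZ file can be inspected directly with numpy; the key names and filename below are assumptions based on the NPY names listed above:

```python
import numpy as np

# Minimal sketch, assuming the archive keys mirror the NPY names listed above;
# the filename is hypothetical. Doodler's own utilities are the supported route.
npz = np.load("example_image.npz", allow_pickle=True)
print(npz.files)  # e.g. orig_image, image, classes, doodles, color_doodles, label, settings

label = npz["label"]      # integer label image
classes = npz["classes"]  # classes annotated and present in the label image
print(label.shape, classes)
```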
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Download this file and unzip it somewhere on your machine (although not inside the segmentation_gym folder), then see the relevant page on the Segmentation Gym wiki for further explanation.
This dataset and associated models were made by Dr Daniel Buscombe, Marda Science LLC, for the purposes of demonstrating the functionality of Segmentation Gym. The labels were created using Doodler.
Previous versions:
1.0: https://zenodo.org/record/5895128#.Y1G5s3bMIuU, original release, Oct 2021, conforming to Segmentation Gym functionality as of Oct 2021
2.0: https://zenodo.org/record/7036025#.Y1G57XbMIuU, Jan 23 2022, conforming to Segmentation Gym functionality as of Jan 23 2022
This is version 4.0, created 2/25/23, and has been tested with Segmentation Gym using doodleverse-utils 0.0.26 https://pypi.org/project/doodleverse-utils/0.0.26/
/Users/Someone/my_segmentation_zoo_datasets
├── config
│   └── *.json
├── capehatteras_data
│   ├── fromDoodler
│   │   ├── images
│   │   └── labels
│   ├── npzForModel
│   └── toPredict
├── modelOut
│   └── *.png
└── weights
    └── *.h5
There are 4 config files:
1. /config/hatteras_l8_resunet.json
2. /config/hatteras_l8_vanilla_unet.json
3. /config/hatteras_l8_resunet_model2.json
4. /config/hatteras_l8_segformer.json
The first two are for the res-unet and unet models, respectively. The third differs from the first only in the kernel size it specifies; it is provided as an example of how to conduct model training experiments, modifying one hyperparameter at a time in an effort to create an optimal model. The last is based on the new Segformer model architecture.
They all contain the same essential information and differ only as indicated below:
{
"TARGET_SIZE": [768,768], # the size of the imagery you wish the model to train on. This may not be the original size
"MODEL": "resunet", # model name. Otherwise, "unet" or "segformer"
"NCLASSES": 4, # number of classes
"KERNEL":9, # horizontal size of convolution kernel in pixels
"STRIDE":2, # stride in convolution kernel
"BATCH_SIZE": 7, # number of images/labels per batch
"FILTERS":6, # number of filters
"N_DATA_BANDS": 3, # number of image bands
"DROPOUT":0.1, # amount of dropout
"DROPOUT_CHANGE_PER_LAYER":0.0, # change in dropout per layer
"DROPOUT_TYPE":"standard", # type of dropout. Otherwise "spatial"
"USE_DROPOUT_ON_UPSAMPLING":false, # if true, dropout is used on upsampling as well as downsampling
"DO_TRAIN": false, # if false, the model will not train, but you will select this config file, data directory, and the program will load the model weights and test the model on the validation subset
if true, the model will train from scratch (warning! this will overwrite the existing weights file in h5 format)
"LOSS":"dice", # model training loss function, otherwise "cat" for categorical cross-entropy
"PATIENCE": 10, # number of epochs of no model improvement before training is aborted
"MAX_EPOCHS": 100, # maximum number of training epochs
"VALIDATION_SPLIT": 0.6, #proportion to use for validation
"RAMPUP_EPOCHS": 20, # [LR-scheduler] rampup to maximim
"SUSTAIN_EPOCHS": 0.0, # [LR-scheduler] sustain at maximum
"EXP_DECAY": 0.9, # [LR-scheduler] decay rate
"START_LR": 1e-7, # [LR-scheduler] start lr
"MIN_LR": 1e-7, # [LR-scheduler] min lr
"MAX_LR": 1e-4, # [LR-scheduler] max lr
"FILTER_VALUE": 0, #if >0, the size of a median filter to apply on outputs (not recommended unless you have noisy outputs)
"DOPLOT": true, #make plots
"ROOT_STRING": "hatteras_l8_aug_768", #data file (npz) prefix string
"USEMASK": false, # use the convention 'mask' in label image file names, instead of the preferred 'label'
"AUG_ROT": 5, # [augmentation] amount of rotation in degrees
"AUG_ZOOM": 0.05, # [augmentation] amount of zoom as a proportion
"AUG_WIDTHSHIFT": 0.05, # [augmentation] amount of random width shift as a proportion
"AUG_HEIGHTSHIFT": 0.05,# [augmentation] amount of random width shift as a proportion
"AUG_HFLIP": true, # [augmentation] if true, randomly apply horizontal flips
"AUG_VFLIP": false, # [augmentation] if true, randomly apply vertical flips
"AUG_LOOPS": 10, #[augmentation] number of portions to split the data into (recommended > 2 to save memory)
"AUG_COPIES": 5 #[augmentation] number iof augmented copies to make
"SET_GPU": "0" #which GPU to use. If multiple, list separated by a comma, e.g. '0,1,2'. If CPU is requested, use "-1"
"WRITE_MODELMETADATA": false, #if true, the prompts `seg_images_in_folder.py` to write detailed metadata for each sample file
"DO_CRF": true #if true, apply CRF post-processing to outputs
"LOSS_WEIGHTS": false, #if true, apply per-class weights to loss function
"MODE": "all", #'all' means use both non-augmented and augmented files, "noaug" means use non-augmented only, "aug" uses augmented only
"SET_PCI_BUS_ID": true, #if true, make keras aware of the PCI BUS ID (advanced or nonstandard GPU usage)
"TESTTIMEAUG": true, #if true, apply test-time augmentation when model in inference mode
"WRITE_MODELMETADATA": true,# if true, write model metadata per image when model in inference mode
"OTSU_THRESHOLD": true# if true, and NCLASSES=2 only, use per-image Otsu threshold rather than decision boundary of 0.5 on softmax scores
}
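A minimal sketch for reading one of these config files, assuming the distributed .json files are plain JSON (the inline # comments above are explanatory only, since JSON itself does not allow comments):

```python
import json

# Read one of the config files listed above and inspect a few hyperparameters.
with open("config/hatteras_l8_resunet.json") as f:
    config = json.load(f)

print(config["MODEL"], config["TARGET_SIZE"], config["NCLASSES"], config["BATCH_SIZE"])
```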
├── capehatteras_data: folder containing all the model input data
│   ├── fromDoodler: folder containing images and labels exported from Doodler using [this program](https://github.com/dbuscombe-usgs/dash_doodler/blob/main/utils/gen_images_and_labels_4_zoo.py)
│   │   ├── images: jpg format files, one per label image
│   │   └── labels: jpg format files, one per image
│   ├── npzForModel: npz format files for model training using [this program](https://github.com/dbuscombe-usgs/segmentation_zoo/blob/main/train_model.py), created following the workflow [documented here](https://github.com/dbuscombe-usgs/segmentation_zoo/wiki/Create-a-model-ready-dataset) using [this program](https://github.com/dbuscombe-usgs/segmentation_zoo/blob/main/make_nd_dataset.py)
│   └── toPredict: a folder of images to test model prediction using [this program](https://github.com/dbuscombe-usgs/segmentation_zoo/blob/main/seg_images_in_folder.py)
PNG format files containing example model outputs from the train ('_train_' in filename) and validation ('_val_' in filename) subsets, as well as an image showing training loss and accuracy curves with trainhist in the filename. There are two sets of these files: those associated with the residual unet trained with dice loss contain resunet in their name, and those from the UNet are named with vanilla_unet.
There are model weights files (.h5) associated with each config file.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Overview
The Stereo Instances on Surfaces Dataset (STIOS) is created for evaluation of instance-based algorithms. It is a representative dataset to achieve uniform comparability for instance detection and segmentation with different input modalities (RGB, RGB-D, stereo RGB). STIOS is mainly intended for robotic applications (e.g. object manipulation), which is why the dataset refers to horizontal surfaces.
Sensors
STIOS contains recordings from two different sensors: an rc_visard 65 color and a Stereolabs ZED camera. Aside from stereo RGB (left and right RGB images), the internally generated depth maps are also saved for both sensors. In addition, the ZED sensor provides normal images and point cloud data, which are also provided in STIOS. Since some objects/surfaces have little texture, which would have a negative impact on the quality of the depth map, an additional LED projector with a random point pattern is used when recording the depth images (only used for the rc_visard 65 color). Consequently, for the rc_visard 65 color, STIOS includes RGB images and the resulting depth maps both with and without a projected pattern.
The large number of different input modalities should enable evaluation of a wide variety of methods. As you can see in the picture, the ZED sensor was mounted above the rc_visard 65 lenses to get a similar viewing angle. This enables an evaluation between the sensors, whereby comparisons can be made about the generalization of a method with regard to sensors or the quality of the input modality.
Objects
The dataset contains the following objects from the YCB video dataset and thus covers several application areas such as unknown instance segmentation, instance detection and segmentation (detection + classification):
003_cracker_box, 005_tomato_soup_can, 006_mustard_bottle, 007_tuna_fish_can, 008_pudding_box, 010_potted_meat_can, 011_banana, 019_pitcher_base, 021_bleach_cleanser, 024_bowl, 025_mug, 035_power_drill, 037_scissors, 052_extra_large_clamp, 061_foam_brick.
Due to the widespread use of these objects in robotic applications, there are 3D models for each of the objects, which can be used to generate synthetic training data for e.g. instance detection based on RGB-D. In order to guarantee an evenly distributed occurrence of the 15 objects, 4-6 objects are selected at random for each sample. The alignment of the objects is either easy (objects do not touch) or difficult (objects may touch or lie on top of each other).
Surroundings
The dataset contains 8 different environments in order to cover the variation of environmental parameters such as lighting, background or scene surfaces. Scenes for the dataset were recorded in the following environments: office carpet, workbench, white table, wooden table, conveyor belt, lab floor, wooden plank and tool cabinet.
The scenes were chosen carefully to ensure that they contain surfaces that are both friendly and challenging to stereo sensors. STIOS therefore contains low-texture surfaces (e.g. white table, conveyor belt) and texture-rich surfaces (e.g. lab floor, wooden plank). The above-mentioned variation of surfaces and environments makes it possible to evaluate methods in terms of robustness against, and generalization to, various environmental parameters.
For each scene surface, 3 easy and 3 difficult samples are generated from 4 manually set camera angles (approx. 0.3-1 m distance). As the illustration shows, even with the easy object alignment the objects can occlude each other from some camera angles. The 6 samples per camera setting result in 24 samples per environment for each sensor, which results in a total of 192 samples per sensor.
Annotations
For each of these samples (192x2) all object instances in the left camera image were annotated manually (instance mask + object class). The annotations are available in the form of 8-bit grayscale images, which represent the semantic classes in the image. Since each object appears only once in the image, object instance masks can also be obtained from this format at the same time.
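Since the ground truth is stored as 8-bit grayscale semantic images and each object occurs at most once per image, per-instance masks can be recovered with a few lines of numpy (a minimal sketch; the filename is hypothetical and 0 is assumed to encode background):

```python
import numpy as np
from PIL import Image

# Split an 8-bit grayscale ground-truth image into per-instance binary masks.
# Each non-zero grey value corresponds to one object class, and because every
# class appears at most once per image, each value is also one instance.
gt = np.array(Image.open("STIOS/rc_visard/conveyor_belt/gt/000000.png"))  # hypothetical path
instance_masks = {value: (gt == value) for value in np.unique(gt) if value != 0}
for class_id, mask in instance_masks.items():
    print(f"class {class_id}: {mask.sum()} pixels")
```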
The dataset is structured as follows:
STIOS
|--rc_visard
| |--conveyor_belt
| | |--left_rgb
| | |--right_rgb
| | |--gt
| | |--depth
| | |--left_rgb_pattern
| | |--right_rgb_pattern
| | |--depth_pattern
| |--lab_floor
| |-- ...
|--zed
| |-- conveyor_belt
| | |--left_rgb
| | |--right_rgb
| | |--gt
| | |--depth
| | |--normals
| | |--pcd
| |--lab_floor
| |--...
We also provide code utilities which allow visualization of images and annotations of STIOS and contain various utility functions to e.g. generate bounding box annotations from the semantic grayscale images. Please find them here: https://github.com/DLR-RM/stios-utils.
Citation
If STIOS is useful for your research please cite
@misc{durner2021unknown,
title={Unknown Object Segmentation from Stereo Images},
author={Maximilian Durner and Wout Boerdijk and Martin Sundermeyer and Werner Friedl and Zoltan-Csaba Marton and Rudolph Triebel},
year={2021},
eprint={2103.06796},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
STIOS in projects
Unknown Object Segmentation from Stereo Images
M. Durner, W. Boerdijk, M. Sundermeyer, W. Friedl, Z.-C. Marton, and R. Triebel. "Unknown Object Segmentation from Stereo Images", arXiv preprint arXiv:2103.06796 (2021).
This method enables the segmentation of unknown object instances that are located on horizontal surfaces (e.g. tables, floors, etc.). Due to the often incomplete depth data in robotic applications, stereo RGB images are used here. On the one hand, STIOS is employed to show the effectiveness of stereo images for unknown instance segmentation, and on the other hand, to make a comparison with existing work, most of which directly accesses depth data.
"What's This?" - Learning to Segment Unknown Objects from Manipulation Sequences
W. Boerdijk, M. Sundermeyer, M. Durner, and R. Triebel. "'What's This?' - Learning to Segment Unknown Objects from Manipulation Sequences", International Conference on Robotics and Automation (ICRA), 2021 (to appear).
This work deals with the segmentation of objects that have been grasped by a robotic arm. With the help of this method it is possible to generate object-specific image data in an automated process. This data can then be used for training object detectors or segmentation approaches. In order to show the usability of the generated data, STIOS is used as an evaluation data set for instance segmentation on RGB images.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Doodleverse/Segmentation Zoo Res-UNet models for Aerial/NOAA ERI/2-class (water, nowater) segmentation of RGB 1024x768 high-res. images
Residual-UNet models are trained on 1,179 pairs of human-generated segmentation labels and images from Emergency Response Imagery (ERI) collected by the US National Oceanic and Atmospheric Administration (NOAA) after Hurricanes Barry, Delta, Dorian, Florence, Ida, Laura, Michael, Sally, and Zeta, and Tropical Storm Gordon.
The dataset is available here**: https://doi.org/10.5281/zenodo.7268082
Models have been created using Segmentation Gym*:
Code - https://github.com/Doodleverse/segmentation_gym
Paper - https://doi.org/10.1029/2022EA002332
The model takes input images that are 512 x 512 x 3 pixels, and the output is 512 x 512 x 2, corresponding to 2 classes:
1. water
2. other
Included here are 6 files with the same root name:
'.json' config file: this is the file that was used by Segmentation Gym to create the weights file. It contains instructions for how to make the model and the data it used, as well as instructions for how to use the model for prediction.
'.h5' weights file: this is the file that was created by the Segmentation Gym function `train_model.py`. It contains the trained model's parameter weights. It can be called by the Segmentation Gym function `seg_images_in_folder.py`.
'_model_history.npz' model training history file: this numpy archive file contains numpy arrays describing the training and validation losses and metrics. It is created by the Segmentation Gym function `train_model.py`
'.png' model training loss and mean IoU plot: this png file contains plots of training and validation losses and mean IoU scores during model training (a subset of the data inside the .npz file). It is created by the Segmentation Gym function `train_model.py`
'.zip' of the model in the Tensorflow ‘saved model’ format. It is created by the Segmentation Gym function `utils/gen_saved_model.py`
'_modelcard.json' model card file: this is a json file containing fields that collectively describe the model origins, training choices, and the dataset that the model is based upon. There is some redundancy between this file and the `config` file (described above) that contains the instructions for model training and implementation. The model card file is not used by the program, but it is important metadata to keep with the other files that collectively make up the model, and as such it is considered part of the model.
Additionally, BEST_MODEL.txt contains the name of the model with the best validation loss and mean IoU
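For example, the '_model_history.npz' file described above can be inspected with numpy and matplotlib (a minimal sketch; the root name is hypothetical and the array names inside the archive are assumptions, so list them first):

```python
import numpy as np
import matplotlib.pyplot as plt

hist = np.load("noaa_eri_2class_model_history.npz")  # hypothetical root name
print(hist.files)  # names of the arrays actually stored in the archive

# Plot any loss-like curves found in the archive.
for key in hist.files:
    if "loss" in key:
        plt.plot(hist[key], label=key)
plt.xlabel("epoch")
plt.legend()
plt.savefig("history_check.png")
```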
References
*Segmentation Gym: Buscombe, D., & Goldstein, E. B. (2022). A reproducible and reusable pipeline for segmentation of geoscientific imagery. Earth and Space Science, 9, e2022EA002332. https://doi.org/10.1029/2022EA002332 See: https://github.com/Doodleverse/segmentation_gym
** Goldstein, Evan B., Buscombe, Daniel, Budavi, Priyanka, Favela, Jaycee, Fitzpatrick, Sharon, Gabbula, Sai Ram Ajay Krishna, Ku, Venus, Lazarus, Eli D., McCune, Ryan, Shah, Manish, Sigdel, Rajesh, & Tagner, Steven. (2022). Segmentation Labels for Emergency Response Imagery from Hurricane Barry, Delta, Dorian, Florence, Isaias, Laura, Michael, Sally, Zeta, and Tropical Storm Gordon (Version v1) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.7268083
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The PENGWIN segmentation challenge is designed to advance the development of automated pelvic fracture segmentation techniques in both 3D CT scans (Task 1) and 2D X-ray images (Task 2), aiming to enhance their accuracy and robustness. The full 3D dataset comprises CT scans from 150 patients scheduled for pelvic reduction surgery, collected from multiple institutions using a variety of scanning devices. This dataset represents a diverse range of patient cohorts and fracture types. Ground-truth segmentations for sacrum and hipbone fragments have been semi-automatically annotated and subsequently validated by medical experts, and are available here. From this 3D data, we have generated high-quality, realistic X-ray images and corresponding 2D labels from the CT data using DeepDRR, incorporating a range of virtual C-arm camera positions and surgical tools. This dataset contains the training set for fragment segmentation in synthetic X-ray (Task 2).
The training set is derived from 100 CTs, with 500 images each, for a total of 50,000 training images and segmentations. The C-arm geometry is randomly sampled for each CT within reasonable parameters for a full-size C-arm. The virtual patient is assumed to be in a head-first supine position. Imaging centers are randomly sampled within 50 mm of a fragment, ensuring good visibility. Viewing directions are sampled uniformly on the sphere within 45 degrees of vertical. Half of the images (IDs XXX_0250 - XXX_0500) contain up to 10 simulated K-wires and/or orthopaedic screws oriented randomly in the field of view.
The input images are raw intensity images without any windowing or normalization applied. It is standard practice to first apply the negative log transformation and then window each image appropriately before feeding it into a model; see the included augmentation pipeline in pengwin_utils.py for one approach. For viewing raw images, the FIJI image viewer is a viable option, but it is recommended to use the included visualization functions in pengwin_utils.py to first apply CLAHE normalization and save to a universally readable PNG (see example usage below).
Because the X-ray images feature overlapping segmentation masks, the segmentations have been encoded as multi-label uint32 images, where each pixel should be treated as a binary vector with bits 1-10 for SA fragments, 11-20 for LI, and 21-30 for RI. As a result, the raw segmentation files are not viewable with standard image viewing software. pengwin_utils.py includes functions for converting to and from this format and for visualizing masks overlaid onto the original image (see below).
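As a rough, standalone illustration of this bit layout (a minimal sketch only; pengwin_utils.load_masks is the supported helper), each fragment mask can be recovered by testing the corresponding bit:

```python
import numpy as np
from PIL import Image

# Decode the multi-label uint32 segmentation: bits 1-10 are SA fragments,
# 11-20 LI, 21-30 RI (bit positions are 1-indexed, as described above).
seg = np.array(Image.open("train/output/images/x-ray/001_0000.tif"), dtype=np.uint32)

masks, category_ids, fragment_ids = [], [], []
for category, offset in enumerate([0, 10, 20], start=1):  # 1=SA, 2=LI, 3=RI
    for fragment in range(1, 11):
        bit = offset + fragment
        mask = (seg >> (bit - 1)) & 1
        if mask.any():
            masks.append(mask.astype(bool))
            category_ids.append(category)
            fragment_ids.append(fragment)

print(f"Found {len(masks)} fragment masks")
```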
To use the utilities, first install dependencies with pip install -r requirement.txt. Then, to visualize an image with its segmentation, you can do the following (assuming the training set has been downloaded and unzipped in the same folder):
import pengwin_utils
from PIL import Image

image_path = "train/input/images/x-ray/001_0000.tif"
seg_path = "train/output/images/x-ray/001_0000.tif"

image = pengwin_utils.load_image(image_path)  # raw intensity image
masks, category_ids, fragment_ids = pengwin_utils.load_masks(seg_path)

vis_image = pengwin_utils.visualize_sample(image, masks, category_ids, fragment_ids)
vis_path = "vis_image.png"
Image.fromarray(vis_image).save(vis_path)
print(f"Wrote visualization to {vis_path}")

pred_masks, pred_category_ids, pred_fragment_ids = masks, category_ids, fragment_ids  # replace with your model
pred_seg = pengwin_utils.masks_to_seg(pred_masks, pred_category_ids, pred_fragment_ids)
pred_seg_path = "pred/train/output/images/x-ray/001_0000.tif"  # ensure dir exists!
Image.fromarray(pred_seg).save(pred_seg_path)
print(f"Wrote segmentation to {pred_seg_path}")
The pengwin_utils.Dataset class is provided as an example of a PyTorch dataset, with strong domain randomization included to facilitate sim-to-real performance, but it is recommended to write your own as needed.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
This record contains code and data for segmentation using a three-dimensional level-set method, written by Amy Tabb in C++. The record also contains two datasets of root systems in media imaged with X-ray CT, and the results of running the code on those datasets. The code will also perform a pre-processing task on three-dimensional image sets, and a dataset for that purpose is included as well. This work is a companion to the paper "Segmenting root systems in X-ray computed tomography images using level sets" (WACV 2018) by the authors of this record, and an open-access version of the paper is available here: https://arxiv.org/abs/1809.06398 . The code is also available from GitHub: https://github.com/amy-tabb/tabb-level-set-segmentation , with a DOI and stable releases at https://doi.org/10.5281/zenodo.3344906.
Format of the data:
Three input datasets are provided; two for the segmentation functionality of the code, and one to test the pre-processing functionality. The two segmentation sets are the same as were used in the paper, and are CassavaDataset, and SoybeanDataset. The pre-processing set is CassavaSlices. The output set for Soybean is SoybeanResultsJul11. The Cassava result set is large, so I broke it into three compressed folders, CassavaResultsJul12_A, _B, _C. _B is the largest, and only contains the results overwritten on the original X-Ray images. Unless your connection to Zenodo is extremely fast, it will be faster to compute the result than to download it.
The goal of this project is to provide all the materials to the community to resolve the problem of echocardiographic image segmentation and volume estimation from 2D ultrasound sequences (both two- and four-chamber views). To this aim, the following solution was set up: introduction of the largest publicly available and fully annotated dataset for 2D echocardiographic assessment (to our knowledge). The CAMUS dataset, containing 2D apical four-chamber and two-chamber view sequences acquired from 500 patients, is made available for download.
The overall CAMUS dataset consists of clinical exams from 500 patients, acquired at the University Hospital of St Etienne (France) and included in this study within the regulation set by the local ethical committee of the hospital after full anonymization. The acquisitions were optimized to perform left ventricle ejection fraction measurements. In order to enforce clinical realism, neither prerequisite nor data selection have been performed. Consequently,
some cases were difficult to trace;
the dataset involves a wide variability of acquisition settings;
for some patients, parts of the wall were not visible in the images;
for some cases, the probe orientation recommendation to acquire a rigorous four-chamber view was simply impossible to follow, and a five-chamber view was acquired instead. This produced a highly heterogeneous dataset, both in terms of image quality and pathological cases, which is typical of daily clinical practice data.
The dataset has been made available to the community HERE. The dataset comprises: i) a training set of 450 patients along with the corresponding manual references based on the analysis of one clinical expert; ii) a testing set composed of 50 new patients. The raw input images are provided in the raw/mhd file format.
Half of the dataset population has a left ventricle ejection fraction lower than 45%, thus being considered at pathological risk (beyond the uncertainty of the measurement). Also, 19% of the images have poor quality (based on the opinion of one expert), indicating that for this subgroup the localization of the left ventricle endocardium and left ventricle epicardium as well as the estimation of clinical indices are not considered clinically accurate and workable. In classical analysis, poor quality images are usually removed from the dataset because of their clinical uselessness. Therefore, those data were not involved in this project during the computation of the different metrics but were used to study their influence as part of the training and validation sets for deep learning techniques.
The full dataset was acquired from GE Vivid E95 ultrasound scanners (GE Vingmed Ultrasound, Horten, Norway), with a GE M5S probe (GE Healthcare, US). No additional protocol than the one used in clinical routine was put in place. For each patient, 2D apical four-chamber and two-chamber view sequences were exported from EchoPAC analysis software (GE Vingmed Ultrasound, Horten, Norway). These standard cardiac views were chosen for this study to enable the estimation of left ventricle ejection fraction values based on the Simpson’s biplane method of discs. Each exported sequence corresponds to a set of B-mode images expressed in polar coordinates. The same interpolation procedure was used to express all sequences in Cartesian coordinates with a unique grid resolution, i.e. λ/2 = 0.3 mm along the x-axis (axis parallel to the probe) and λ/4 = 0.15 mm along the z-axis (axis perpendicular to the probe), where λ corresponds to the wavelength of the ultrasound probe. At least one full cardiac cycle was acquired for each patient in each view, allowing manual annotation of cardiac structures at ED and ES.
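A minimal sketch for reading one of the raw/mhd files mentioned above, assuming SimpleITK is available (both the package and the filename below are assumptions, not part of this record):

```python
import SimpleITK as sitk

# Read a raw/mhd sequence and convert it to a numpy array.
img = sitk.ReadImage("patient0001/patient0001_4CH_ED.mhd")  # hypothetical filename
array = sitk.GetArrayFromImage(img)   # numpy array, typically (frames/slices, height, width)
print(array.shape, img.GetSpacing())  # pixel spacing in mm
```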
**This work has been published in the IEEE TMI journal. You must cite this paper for any use of the CAMUS database.**
- S. Leclerc, E. Smistad, J. Pedrosa, A. Ostvik, et al.
"Deep Learning for Segmentation using an Open Large-Scale Dataset in 2D Echocardiography" in IEEE Transactions on Medical Imaging, vol. 38, no. 9, pp. 2198-2210, Sept. 2019.
doi: 10.1109/TMI.2019.2900516
Please note: This is a large data product with 2.7 million polygon features (1.2 GB file in ESRI File Geodatabase format). It is not possible to download in Shapefile format; please access the data using the APIs or select another download format. This is the spatial framework around which the Living England Phase II habitat classification is based. The segmentation was created in the Trimble eCognition software using Sentinel-2 Analysis Ready Data (ARD) image mosaics for winter (February 2019) and summer (June 2019).
Sentinel-2 Analysis Ready Data (ARD) produced by the Earth Observation Data Service (JNCC / DEFRA) were used as the input for the segmentation. The Sentinel-2 ARD is available under an Open Government License (OGL). It is not intended that the 2019 segmentation will be revised; however, as Living England progresses and up-to-date image mosaics are created, new habitat segmentation datasets will be developed from the up-to-date imagery. Full metadata can be viewed on data.gov.uk.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Introduction
The Surgical Planning Laboratory (SPL) and the National Center for Image Guided Therapy (NCIGT) are making this dataset available as a resource to aid in the development of algorithms and tools for deformable registration, segmentation and analysis of prostate magnetic resonance imaging (MRI) and ultrasound (US) images.
Description
This dataset contains anonymized images of the human prostate (N=3 patients) collected during two sessions for each patient:
These are three-dimensional (multi-slice) scalar images.
Image files are stored using NRRD file format (files with .nrrd extension), see details at http://teem.sourceforge.net/nrrd/format.html. Each image file includes a code for the case number (internal numbering at the research site) and the modality (US or MR).
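For programmatic access (as an alternative to interactive viewing in 3D Slicer), the NRRD files can be read with the pynrrd package; this is an assumption, not a tool mentioned in this record, and the filename is hypothetical:

```python
import nrrd  # pip install pynrrd

# Load a volume and its header; the header carries spacing/orientation metadata.
data, header = nrrd.read("case01_MR.nrrd")  # hypothetical case/modality name
print(data.shape, header.get("space directions"))
```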
Image annotations were prepared by Dr. Fedorov (no professional training in radiology) and Dr. Tuncali (10+ years of experience in prostate imaging interpretation). Annotations include
Viewing the collection
We tested visualization of images, segmentations and fiducials in 3D Slicer software, and thus recommend 3D Slicer as the platform for visualization. 3D Slicer is a free open source platform (see http://slicer.org), with the pre-compiled binaries available for all major operating systems. You can download 3D Slicer at http://download.slicer.org.
Acknowledgments
Preparation of this data collection was made possible thanks to the funding from the National Institutes of Health (NIH) through grants R01 CA111288 and P41 RR019703.
If you use this dataset in a publication, please cite the following manuscript. You can also learn more about this dataset from the publication below.
Fedorov, A., Khallaghi, S., Antonio Sánchez, C., Lasso, A., Fels, S., Tuncali, K., Sugar, E. N., Kapur, T., Zhang, C., Wells, W., Nguyen, P. L., Abolmaesumi, P. & Tempany, C. Open-source image registration for MRI–TRUS fusion-guided prostate interventions. Int J CARS 10, 925–934 (2015). https://pubmed.ncbi.nlm.nih.gov/25847666/
Contact
Andrey Fedorov, fedorov@bwh.harvard.edu
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
The dataset aims to annotate various elements found in digitized historical documents to comprehend document structure and layout. The classes include text zones, graphic elements, numberings, decorative elements, and more, to encapsulate the layout.
Represents the overall layout structure of a document.
Annotate the visible edges of the document including all elements that contribute to the structural layout. Do not capture decorative or specific text areas under this class.
Zones indicating artefacts from the digitization process, such as shadows or scanner marks.
Mark any unintended markings or shadows that are a result of scanning. Avoid annotating intended text or graphic content.
Areas containing non-text graphics, like images or illustrations.
Outline the complete area of the image or illustration, including associated captions if directly attached. Do not separate text unless specifically part of the graphic.
Decorative elements that embellish text or page borders.
Enclose artwork such as decorative borders or flourishes that enhance the page layout. Exclude surrounding text.
Descriptions that accompany figures or graphics.
Label the area containing text that describes graphics. Ensure not to capture the graphic itself as part of this annotation.
Headings associated with graphics, usually found above or beside them.
Identify headings directly related to graphics, encircling the text without extending to main text areas.
Text that forms the primary content of the document.
Ensure clarity between different sub-zones, and focus on text alignment and indentation cues.
Notes or additional comments located in the margins.
Prioritize notable deviations in alignment or format from the main text.
Numerical identifiers, such as page numbers.
Spot all numerical indicators located in header or footer regions, ensuring no overlap with other annotations.
Marks showing quire or gathering information.
Encircle symbols or shorthand describing document assembly, ignoring main or marginal text.
Titles or headings that repeat on multiple pages.
Cover repeated headers or titles at the top edge of pages, avoiding interaction with the main text body.
Stamped markings on the documents.
Identify all institutional or approval stamps, along with philatelic elements. Ensure clear separation from text and images.
Zones containing tabular data.
Delineate the boundaries of grids or tables including headings if within the zone. Avoid extending beyond table borders.
Front page of the d
The goal of the Automated Cardiac Diagnosis Challenge (ACDC) is to:
compare the performance of automatic methods on the segmentation of the left ventricular endocardium and epicardium, as well as the right ventricular endocardium, for both end-diastolic and end-systolic phase instances; and compare the performance of automatic methods for the classification of the examinations into five classes (normal case, heart failure with infarction, dilated cardiomyopathy, hypertrophic cardiomyopathy, abnormal right ventricle).
The overall ACDC dataset was created from real clinical exams acquired at the University Hospital of Dijon. Acquired data were fully anonymized and handled within the regulations set by the local ethical committee of the Hospital of Dijon (France). Our dataset covers several well-defined pathologies with enough cases to (1) properly train machine learning methods and (2) clearly assess the variations of the main physiological parameters obtained from cine-MRI (in particular diastolic volume and ejection fraction). The dataset is composed of 150 exams (all from different patients) divided into 5 evenly distributed subgroups (4 pathological plus 1 healthy subject groups) as described below. Furthermore, each patient comes with the following additional information : weight, height, as well as the diastolic and systolic phase instants.
The database is made available to participants through two datasets from the dedicated online evaluation website after a personal registration: i) a training dataset of 100 patients along with the corresponding manual references based on the analysis of one clinical expert; ii) a testing dataset composed of 50 new patients, without manual annotations but with the patient information given above. The raw input images are provided through the Nifti format.
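A minimal sketch for loading one of the Nifti volumes mentioned above, assuming nibabel is installed (both the package and the filename are assumptions, not part of this record):

```python
import nibabel as nib

# Load a cine-MRI volume and convert it to a numpy array.
img = nib.load("patient001/patient001_frame01.nii.gz")  # hypothetical filename
volume = img.get_fdata()
print(volume.shape, img.header.get_zooms())  # voxel dimensions in mm
```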
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description of the folder content:
1) The macro in .ijm format.
Suited for analysis of 3-channel confocal fluorescence microscopy images of mammalian cells (~200×200 µm).
Requires ImageJ v1.4 with the Bio-render plugin.
Images should be in .nd2 format, but this can easily be changed: simply search & replace all occurrences of ".nd2" with your format in the macro code.
Images should be organized with all replicates of the same test condition in a single folder. The macro will analyze the whole folder at once and will create a folder in it to save results.
2) A folder named "example_data"; it contains 3 representative images that can be used to test the macro.
It also contains a results folder with representative data obtained by analyzing these representative images with the macro (see the description of the macro below for the results obtained).
_
Description of the macro:
input: 3-channel image with
C1 = nucleus labeling (e.g. DAPI, Hoechst, etc.)
C2 = signal of interest, the one you want to measure in whole cells & in the region of interest
C3 = region of interest (ROI) (e.g. an antibody directed against a particular organelle, in our case Golgi apparatus)
this macro will :
count the cells according to C1 (user input of threshold values for C1)
create ROI(s) according to C3 (user input of threshold values, or manual setting of each image for C3)
measure signal of C2 (mean, min, max grey values; integrated density; area) in whole cells (user input of threshold values for C2)
measure signal of C2 in ROI(s)
save results as a .csv file
it will also create several .png images for each analyzed image:
C1+nucleusROI (to assess correct cell counting)
C3+ROIC3 (to assess correct creation of ROI(s) from C3 signal)
C2 (glow LUT) + ROIC3 (to assess correct thresholding of C2 signal)
C2+ROIC3
merge C1+C2+C3
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is Part 2/2 of the ActiveHuman dataset! Part 1 can be found here.
Dataset Description
ActiveHuman was generated using Unity's Perception package.
It consists of 175,428 RGB images and their semantic segmentation counterparts taken in different environments, lighting conditions, camera distances and angles. In total, the dataset contains images for 8 environments, 33 humans, 4 lighting conditions, 7 camera distances (1m-4m) and 36 camera angles (0-360 degrees at 10-degree intervals).
The dataset does not include images at every single combination of available camera distances and angles, since for some values the camera would collide with another object or go outside the confines of an environment. As a result, some combinations of camera distances and angles do not exist in the dataset.
Alongside each image, 2D Bounding Box, 3D Bounding Box and Keypoint ground truth annotations are also generated via the use of Labelers and are stored as a JSON-based dataset. These Labelers are scripts that are responsible for capturing ground truth annotations for each captured image or frame. Keypoint annotations follow the COCO format defined by the COCO keypoint annotation template offered in the perception package.
Folder configuration
The dataset consists of 3 folders:
Essential Terminology
Dataset Data
The dataset includes 4 types of JSON annotation files:
Most Labelers generate different annotation specifications in the spec key-value pair:
Each Labeler generates different annotation specifications in the values key-value pair:
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This page only provides the drone-view image dataset.
The dataset contains drone-view RGB images, depth maps and instance segmentation labels collected from different scenes. Data from each scene is stored in a separate .7z file, along with a color_palette.xlsx file, which contains the RGB_id and corresponding RGB values.
All files follow the naming convention {central_tree_id}_{timestamp}, where {central_tree_id} represents the ID of the tree centered in the image, which is typically in a prominent position, and {timestamp} indicates the time when the data was collected.
Specifically, each 7z file includes the following folders:
rgb: This folder contains the RGB images (PNG) of the scenes and their metadata (TXT). The metadata describes the weather conditions and the world time when the image was captured. An example metadata entry is: Weather:Snow_Blizzard,Hour:10,Minute:56,Second:36.
depth_pfm: This folder contains absolute depth information of the scenes, which can be used to reconstruct the point cloud of the scene through reprojection.
instance_segmentation: This folder stores instance segmentation labels (PNG) for each tree in the scene, along with metadata (TXT) that maps tree_id to RGB_id. The tree_id can be used to look up detailed information about each tree in obj_info_final.xlsx, while the RGB_id can be matched to the corresponding RGB values in color_palette.xlsx. This mapping allows identifying which tree corresponds to a specific color in the segmentation image (a small lookup sketch is given after this list).
obj_info_final.xlsx: This file contains detailed information about each tree in the scene, such as position, scale, species, and various parameters, including trunk diameter (in cm), tree height (in cm), and canopy diameter (in cm).
landscape_info.txt: This file contains the ground location information within the scene, sampled every 0.5 meters.
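A minimal sketch of the color lookup described in the instance_segmentation entry above; the column names (RGB_id, R, G, B) and the segmentation filename are assumptions, so check color_palette.xlsx for the actual layout:

```python
import numpy as np
import pandas as pd
from PIL import Image

palette = pd.read_excel("color_palette.xlsx")  # assumed columns: RGB_id, R, G, B
seg = np.array(Image.open("instance_segmentation/12_2024-06-01-10-30-00.png"))[..., :3]  # hypothetical file

# Colors that actually occur in this segmentation image, matched against the palette.
colors_in_image = {tuple(int(v) for v in c) for c in np.unique(seg.reshape(-1, 3), axis=0)}
hits = palette[[(int(r["R"]), int(r["G"]), int(r["B"])) in colors_in_image for _, r in palette.iterrows()]]
print(hits[["RGB_id", "R", "G", "B"]])
```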
For birch_forest, broadleaf_forest, redwood_forest and rainforest, we also provided COCO-format annotation files (.json). Two such files can be found in these datasets:
⚠️: 7z files that begin with "!" indicate that the RGB values in the images within the instance_segmentation folder cannot be found in color_palette.xlsx. Consequently, this prevents matching the trees in the segmentation images to their corresponding tree information, which may hinder the application of the dataset to certain tasks. This issue is related to a bug in Colosseum/AirSim, which has been reported in link1 and link2.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Work in progress...
This dataset was developed in the context of my master's thesis titled "Physics-Guided Deep Learning for Sparse Data-Driven Brain Shift Registration", which investigates the integration of physics-based biomechanical modeling into deep learning frameworks for the task of brain shift registration. The core objective of this project is to improve the accuracy and reliability of intraoperative brain shift prediction by enabling deep neural networks to interpolate sparse intraoperative data under biomechanical constraints. Such capabilities are critical for enhancing image-guided neurosurgery systems, especially when full intraoperative imaging is unavailable or impractical.
The dataset integrates and extends data from two publicly available sources: ReMIND and UPENN-GBM. A total of 207 patient cases (45 cases from ReMIND and 162 cases from UPENN-GBM), each represented as a separate folder with all relevant data grouped per case, are included in this dataset. It contains preoperative imaging (unstripped), synthetic ground truth displacement fields, anatomical segmentations, and keypoints, structured to support machine learning and registration tasks.
For details on the image acquisition and other topics related to the original datasets, see their original links above.
Each patient folder contains the following subfolders:
images/: Preoperative MRI scans (T1ce, T2) in NIfTI format.
segmentations/: Brain and tumor segmentations in NRRD format.
simulations/: Biomechanically simulated displacement fields with initial and final point coordinates (LPS) in .npz and .txt formats, respectively (see the loading sketch below).
keypoints/: 3D SIFT-Rank keypoints and their descriptors in both voxel space and world coordinates (RAS?) as .key files.
The folder naming and organization are consistent across patients for ease of use and scripting.
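A minimal sketch for inspecting one of the simulated displacement archives in simulations/; the path is hypothetical and the array names stored in each .npz are not documented here, so list them before indexing:

```python
import numpy as np

sim = np.load("case_001/simulations/simulation_000.npz")  # hypothetical path
print(sim.files)  # names of the arrays stored in this archive
for name in sim.files:
    print(name, sim[name].shape, sim[name].dtype)
```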
ReMIND is a multimodal imaging dataset of 114 brain tumor patients that underwent image-guided surgical resection at Brigham and Women’s Hospital, containing preoperative MRI, intraoperative MRI, and 3D intraoperative ultrasound data. It includes over 300 imaging series and 350 expert-annotated segmentations such as tumors, resection cavities, cerebrum, and ventricles. Demographic and clinico-pathological information (e.g., tumor type, grade, eloquence) is also provided.
UPENN-GBM comprises multi-parametric MRI scans from de novo glioblastoma (GBM) patients treated at the University of Pennsylvania Health System. It includes co-registered and skull-stripped T1-weighted, T1-weighted contrast-enhanced, T2-weighted, and FLAIR images. The dataset features high-quality tumor and brain segmentation labels, initially produced by automated methods and subsequently corrected and approved by board-certified neuroradiologists. Alongside imaging data, the collection provides comprehensive clinical metadata including patient demographics, genomic profiles, survival outcomes, and tumor progression indicators.
This dataset is tailored for researchers and developers working on:
It is especially well-suited for evaluating learning-based registration methods that incorporate physical priors or aim to generalize under sparse supervision.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
# README: IRHMapNet Radargram and Mask Patches Dataset
## Dataset Overview
This dataset contains radargram patches and corresponding mask patches used for training and evaluating the **IRHMapNet** model. The dataset is designed for segmentation of internal reflection horizons (IRHs) from radio-echo sounding data. The data is organized into two directories: radargram patches (`grams_patches`) and mask patches (`masks_patches`), with each patch having dimensions of 512x512 pixels.
### Contents
- **grams_patches/**: Contains 600 `.csv` files representing radargram patches. Each file is a 512x512 matrix corresponding to a small section of the radargram image.
- **masks_patches/**: Contains 600 `.csv` files representing the ground-truth mask patches for segmentation. Each file is a 512x512 binary mask, where `1` indicates the presence of an internal reflection horizon (IRH), and `0` represents background or ice.
## Data Format
- The files in both directories are named consistently, with matching pairs of radargram and mask patches.
- Example: `grams_patches/patch_001.csv` corresponds to `masks_patches/patch_001.csv`.
- Each `.csv` file is a comma-separated values (CSV) file containing 512 rows and 512 columns.
## Directory Structure
```
DATA_IRHMapNet/
├── grams_patches/ # Radargram patches
│ ├── patch_001.csv
│ ├── patch_002.csv
│ └── ... (600 patches)
└── masks_patches/ # Mask patches (Ground truth)
├── patch_001.csv
├── patch_002.csv
└── ... (600 patches)
```
## Usage Instructions
1. **Loading the data**: Each `.csv` file can be loaded using standard CSV reading functions in Python, such as `numpy.loadtxt()` or `pandas.read_csv()`.
Example in Python using `numpy`:
```python
import numpy as np
radargram_patch = np.loadtxt('grams_patches/patch_001.csv', delimiter=',')
mask_patch = np.loadtxt('masks_patches/patch_001.csv', delimiter=',')
```
2. **Model training**: These patches are designed as input to a U-Net or similar convolutional neural network architecture for pixel-wise classification tasks. The radargram patches serve as input, and the mask patches provide the ground-truth labels for training.
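   For instance, a minimal sketch (assuming all 600 pairs fit in memory and a trailing channel axis is wanted) that stacks the patch pairs into arrays ready for such a model:

```python
import glob
import numpy as np

gram_files = sorted(glob.glob("DATA_IRHMapNet/grams_patches/patch_*.csv"))
mask_files = sorted(glob.glob("DATA_IRHMapNet/masks_patches/patch_*.csv"))

# Stack into (N, 512, 512, 1) arrays: radargrams as inputs, masks as labels.
X = np.stack([np.loadtxt(f, delimiter=",") for f in gram_files])[..., np.newaxis]
y = np.stack([np.loadtxt(f, delimiter=",") for f in mask_files])[..., np.newaxis]
print(X.shape, y.shape)
```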
## License
This dataset is made available under the Creative Commons Attribution 4.0 International (CC BY 4.0) license. You are free to:
Share — copy and redistribute the material in any medium or format.
Adapt — remix, transform, and build upon the material for any purpose, even commercially.
You must give appropriate credit by citing the following publication:
**Citation**: Moqadam, H., et al. (2024). Going deeper with deep learning: Automatically tracing internal reflection horizons in ice sheets. *Journal of Geophysical Research: Machine Learning and Computation*. DOI: [insert DOI]
## Contact
For questions or further information, please contact Hameed Moqadam at [hameed.moqadam@awi.de].
Data Curator: Hameed Moqadam
Annotator: Hameed Moqadam
Data Manager: Hameed Moqadam