U.S. Government Works: https://www.usa.gov/government-works
License information was derived automatically
Coast Train is a library of images of coastal environments, annotations, and corresponding thematic label masks (or ‘label images’) collated for the purposes of training and evaluating machine learning (ML), deep learning, and other models for image segmentation. It includes image sets from geospatial satellite, aerial, and UAV imagery and orthomosaics, as well as from non-geospatial oblique and nadir imagery. Images cover a diverse range of coastal environments from the U.S. Pacific, Gulf of Mexico, Atlantic, and Great Lakes coastlines, consisting of time-series of high-resolution (≤1m) orthomosaics and satellite image tiles (10–30m). Each image, image annotation, and labelled image is available as a single NPZ zipped file. NPZ files follow this naming convention: {datasource}_{numberofclasses}_{threedigitdatasetversion}.zip, where {datasource} is the source of the original images (for example, NAIP, Landsat 8, Sentinel 2), {numberofclasses} is the number of classes us ...
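A minimal sketch of inspecting a single NPZ file from one of these archives with NumPy. The file name and array keys below are hypothetical; list `data.files` to see what each file actually contains.

```python
# Minimal sketch: inspect one Coast Train NPZ file with NumPy.
# The file name and array keys are hypothetical assumptions.
import numpy as np

data = np.load("example_coasttrain_file.npz", allow_pickle=True)
print(data.files)  # names of the stored arrays (image, annotation, label mask, ...)
for key in data.files:
    arr = data[key]
    print(key, arr.shape, arr.dtype)
```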
CC0 1.0: https://spdx.org/licenses/CC0-1.0.html
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is composed of 81 pairs of correlated images. Each pair contains one image of an iron ore sample acquired through reflected light microscopy (RGB, 24-bit), and the corresponding binary reference image (8-bit), in which the pixels are labeled as belonging to one of two classes: ore (0) or embedding resin (255).
The sample came from an itabiritic iron ore concentrate from Quadrilátero Ferrífero (Brazil) mainly composed of hematite and quartz, with little magnetite and goethite. It was classified by size and concentrated with a dense liquid. Then, the fraction -149+105 μm with density greater than 3.2 was cold mounted with epoxy resin and subsequently ground and polished.
Correlative microscopy was employed for image acquisition. Thus, 81 fields were imaged on a reflected light microscope with a 10× (NA 0.20) objective lens and on a scanning electron microscope (SEM). In sequence, they were registered, resulting in images of 999×756 pixels with a resolution of 1.05 µm/pixel. Finally, the images from SEM were thresholded to generate the reference images.
Further description of this sample and its imaging procedure can be found in the work by Gomes and Paciornik (2012).
This dataset was created for developing and testing deep learning models on semantic segmentation tasks. The paper by Filippo et al. (2021) presented a variant of the DeepLabv3+ model that reached mean values of 91.43% overall accuracy and 93.13% F1 score over 5 rounds of experiments (training and testing), each with a different random initialization of network weights.
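A minimal sketch (with hypothetical file names) of how one image/reference pair could be loaded, the reference values {0: ore, 255: resin} mapped to {0, 1}, and a predicted mask scored with the overall accuracy and F1 metrics cited above; scikit-learn is used here for the metrics and is not part of the dataset.

```python
# Minimal sketch: load one RGB/reference pair and score a prediction.
# File names are hypothetical; y_pred stands in for a real model output.
import numpy as np
from PIL import Image
from sklearn.metrics import accuracy_score, f1_score

rgb = np.array(Image.open("field_01_rgb.png"))        # 24-bit reflected-light image
reference = np.array(Image.open("field_01_ref.png"))  # 8-bit binary reference (0 or 255)
y_true = (reference == 255).astype(np.uint8).ravel()  # 1 = embedding resin, 0 = ore

y_pred = y_true.copy()  # placeholder prediction, flattened the same way
print("overall accuracy:", accuracy_score(y_true, y_pred))
print("F1 score:", f1_score(y_true, y_pred))
```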
For further questions and suggestions, please do not hesitate to contact us.
Contact email: ogomes@gmail.com
If you use this dataset in your own work, please cite this DOI: 10.5281/zenodo.5014700
Please also cite this paper, which provides additional details about the dataset:
Michel Pedro Filippo, Otávio da Fonseca Martins Gomes, Gilson Alexandre Ostwald Pedro da Costa, Guilherme Lucio Abelha Mota. Deep learning semantic segmentation of opaque and non-opaque minerals from epoxy resin in reflected light microscopy images. Minerals Engineering, Volume 170, 2021, 107007, https://doi.org/10.1016/j.mineng.2021.107007.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Target Image Segmentation Data is a dataset for instance segmentation tasks - it contains Targets annotations for 293 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
Dataset Card for "brain-tumor-image-dataset-semantic-segmentation"
Dataset Description
The Brain Tumor Image Dataset (BTID) for Semantic Segmentation contains MRI images and annotations aimed at training and evaluating segmentation models. This dataset was sourced from Kaggle and includes detailed segmentation masks indicating the presence and boundaries of brain tumors. This dataset can be used for developing and benchmarking algorithms for medical image segmentation… See the full description on the dataset page: https://huggingface.co/datasets/dwb2023/brain-tumor-image-dataset-semantic-segmentation.
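Since the card is hosted on the Hugging Face Hub, it can presumably be loaded with the `datasets` library; the split and column names are assumptions, so inspect the returned object before using specific fields.

```python
# Minimal sketch: load the dataset from the Hugging Face Hub and inspect it.
from datasets import load_dataset

ds = load_dataset("dwb2023/brain-tumor-image-dataset-semantic-segmentation")
print(ds)                        # shows the available splits and columns
first_split = next(iter(ds))     # name of the first split (assumed to exist)
print(ds[first_split][0].keys()) # fields of the first record
```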
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Images and 2-class labels for semantic segmentation of Sentinel-2 and Landsat RGB satellite images of coasts (water, other)
Description
4088 images and 4088 associated labels for semantic segmentation of Sentinel-2 and Landsat RGB satellite images of coasts. The 2 classes are 1=water, 0=other. The imagery is a mixture of 10-m Sentinel-2 and 15-m pansharpened Landsat 7, 8, and 9 visible-band imagery of various sizes (red, green, and blue bands only).
These images and labels could be used within numerous Machine Learning frameworks for image segmentation, but have specifically been made for use with the Doodleverse software package, Segmentation Gym**.
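A minimal sanity-check sketch for one image/label pair, assuming the images and 2-class labels are stored as matched raster files (the file names and storage format here are assumptions, not part of the dataset description).

```python
# Minimal sketch: overlay a 2-class label (1=water, 0=other) on its image.
# File names are hypothetical.
import numpy as np
from PIL import Image
import matplotlib.pyplot as plt

image = np.array(Image.open("example_image.jpg"))  # RGB satellite tile
label = np.array(Image.open("example_label.png"))  # 1 = water, 0 = other

plt.imshow(image)
plt.imshow(label == 1, alpha=0.4, cmap="Blues")    # highlight the water class
plt.title("water mask overlay")
plt.axis("off")
plt.show()
```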
Two data sources have been combined
Dataset 1
Dataset 2
File descriptions
References
*Doodler: Buscombe, D., Goldstein, E.B., Sherwood, C.R., Bodine, C., Brown, J.A., Favela, J., Fitzpatrick, S., Kranenburg, C.J., Over, J.R., Ritchie, A.C. and Warrick, J.A., 2021. Human‐in‐the‐Loop Segmentation of Earth Surface Imagery. Earth and Space Science, e2021EA002085. https://doi.org/10.1029/2021EA002085. See https://github.com/Doodleverse/dash_doodler.
**Segmentation Gym: Buscombe, D., & Goldstein, E. B. (2022). A reproducible and reusable pipeline for segmentation of geoscientific imagery. Earth and Space Science, 9, e2022EA002332. https://doi.org/10.1029/2022EA002332 See: https://github.com/Doodleverse/segmentation_gym
***Coast Train data release: Wernette, P.A., Buscombe, D.D., Favela, J., Fitzpatrick, S., and Goldstein E., 2022, Coast Train--Labeled imagery for training and evaluation of data-driven models for image segmentation: U.S. Geological Survey data release, https://doi.org/10.5066/P91NP87I. See https://coasttrain.github.io/CoastTrain/ for more information
****Buscombe, Daniel, Goldstein, Evan, Bernier, Julie, Bosse, Stephen, Colacicco, Rosa, Corak, Nick, Fitzpatrick, Sharon, del Jesús González Guillén, Anais, Ku, Venus, Paprocki, Julie, Platt, Lindsay, Steele, Bethel, Wright, Kyle, & Yasin, Brandon. (2022). Images and 4-class labels for semantic segmentation of Sentinel-2 and Landsat RGB satellite images of coasts (water, whitewater, sediment, other) (v1.0) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.7335647
*****Seale, C., Redfern, T., Chatfield, P. 2022. Sentinel-2 Water Edges Dataset (SWED) https://openmldata.ukho.gov.uk/
******Seale, C., Redfern, T., Chatfield, P., Luo, C. and Dempsey, K., 2022. Coastline detection in satellite imagery: A deep learning approach on new benchmark data. Remote Sensing of Environment, 278, p.113044.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Doodleverse/Segmentation Zoo/Seg2Map Res-UNet models for FloodNet/10-class segmentation of RGB 768x512 UAV images
These Residual-UNet model data are based on [FloodNet](https://github.com/BinaLab/FloodNet-Challenge-EARTHVISION2021) images and associated labels.
Models were created with Segmentation Gym* using the following dataset**: https://github.com/BinaLab/FloodNet-Challenge-EARTHVISION2021
Image size used by model: 768 x 512 x 3 pixels
classes:
1. Background
2. Building-flooded
3. Building-non-flooded
4. Road-flooded
5. Road-non-flooded
6. Water
7. Tree
8. Vehicle
9. Pool
10. Grass
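The class indices above can be kept in a simple mapping for converting model output indices to names. Whether the label rasters on disk use the 1–10 numbering shown here or a 0-based variant is an assumption to verify against the dataset itself.

```python
# Class-index mapping taken from the list above (index base is an assumption).
FLOODNET_CLASSES = {
    1: "Background",
    2: "Building-flooded",
    3: "Building-non-flooded",
    4: "Road-flooded",
    5: "Road-non-flooded",
    6: "Water",
    7: "Tree",
    8: "Vehicle",
    9: "Pool",
    10: "Grass",
}

def class_name(index: int) -> str:
    """Return the human-readable FloodNet class name for a label index."""
    return FLOODNET_CLASSES.get(index, "unknown")
```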
File descriptions
For each model, there are 5 files with the same root name:
1. '.json' config file: this is the file that was used by Segmentation Gym* to create the weights file. It contains instructions for how to make the model and the data it used, as well as instructions for how to use the model for prediction. It is a handy wee thing and mastering it means mastering the entire Doodleverse.
2. '.h5' weights file: this is the file that was created by the Segmentation Gym* function `train_model.py`. It contains the trained model's parameter weights. It can be called by the Segmentation Gym* function `seg_images_in_folder.py`. Models may be ensembled.
3. '_modelcard.json' model card file: this is a json file containing fields that collectively describe the model origins, training choices, and dataset that the model is based upon. There is some redundancy between this file and the `config` file (described above), which contains the instructions for model training and implementation. The model card file is not used by the program, but it is important metadata, so it should be kept with the other files that collectively make up the model; as such, it is considered part of the model.
4. '_model_history.npz' model training history file: this numpy archive file contains numpy arrays describing the training and validation losses and metrics. It is created by the Segmentation Gym function `train_model.py`
5. '.png' model training loss and mean IoU plot: this png file contains plots of training and validation losses and mean IoU scores during model training (a subset of the data inside the .npz file). It is created by the Segmentation Gym function `train_model.py`.
Additionally, BEST_MODEL.txt contains the name of the model with the best validation loss and mean IoU.
images.zip and labels.zip contain the images and labels, respectively, used to train the model.
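A minimal sketch of plotting training curves from a '_model_history.npz' file. The file name and the array key names ('loss', 'val_loss') are assumptions; print `history.files` to see which arrays the archive actually contains.

```python
# Minimal sketch: plot training curves from a '_model_history.npz' file.
# File name and key names are assumptions.
import numpy as np
import matplotlib.pyplot as plt

history = np.load("example_model_history.npz")
print(history.files)

for key in ("loss", "val_loss"):  # assumed keys
    if key in history.files:
        plt.plot(history[key], label=key)
plt.xlabel("epoch")
plt.ylabel("loss")
plt.legend()
plt.show()
```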
References
*Segmentation Gym: Buscombe, D., & Goldstein, E. B. (2022). A reproducible and reusable pipeline for segmentation of geoscientific imagery. Earth and Space Science, 9, e2022EA002332. https://doi.org/10.1029/2022EA002332 See: https://github.com/Doodleverse/segmentation_gym
** Rahnemoonfar, M., Chowdhury, T., Sarkar, A., Varshney, D., Yari, M. and Murphy, R.R., 2021. Floodnet: A high resolution aerial imagery dataset for post flood scene understanding. IEEE Access, 9, pp.89644-89654.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The collection includes beach coastlines from Southeastern Australia, specifically Victoria and New South Wales, used to train an image segmentation model based on the U-Net deep learning architecture for mapping sandy beaches. The dataset contains polygons that represent the outline or extent of the raster images, and polygons drawn by citizen scientists. Additionally, we provide the trained model itself, which can be used for further evaluation or refined through fine-tuning. The resulting predictions are also available in Shapefile format and can be loaded into NationalMap.
This collection supplements the publication: Regional-Scale Image Segmentation of Sandy Beaches: Comparison of Training and Prediction Across Two Extensive Coastlines in Southeastern Australia (Yong et al.). Lineage: The training dataset of citizen-science-drawn beach outlines and polygons was sourced from OpenStreetMap (OSM; https://www.openstreetmap.org/). Tiled images along the coast were sourced from Microsoft Bing imagery to process new beach outlines, as it is also one of the main sources of imagery used for drawing features in OSM. Note that the original OSM data is licensed under the ODbL, which should be considered when using the processed dataset; a Creative Commons licence was required to publish it in this portal, and CC BY was identified as the most suitable licence in the portal to align with the ODbL.
The saved deep learning model was trained on the dataset using a U-Net architecture, which is used to generate the predicted maps.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Semantic segmentation results using a training dataset of real underwater sonar images and synthetic underwater sonar images.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Target: Left Atrium
Modality: Mono-modal MRI
Size: 30 3D volumes (20 Training + 10 Testing)
Source: King’s College London
Challenge: Small training dataset with large variability
Powered by the ImageNet dataset, unsupervised learning on large-scale data has made significant advances for classification tasks. There are two major challenges to allowing such an attractive learning modality for segmentation tasks: i) a large-scale benchmark for assessing algorithms is missing; ii) unsupervised shape representation learning is difficult. We propose a new problem of large-scale unsupervised semantic segmentation (LUSS) with a newly created benchmark dataset to track the research progress. Based on the ImageNet dataset, we propose the ImageNet-S dataset with 1.2 million training images and 50k high-quality semantic segmentation annotations for evaluation. Our benchmark has a high data diversity and a clear task objective. We also present a simple yet effective baseline method that works surprisingly well for LUSS. In addition, we benchmark related un/weakly/fully supervised methods accordingly, identifying the challenges and possible directions of LUSS.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Segmentation data of the training images.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This folder contains 48 images of grains and the corresponding segmentation masks that form the majority of the images that were used to train the U-Net model that the 'Segmenteverygrain' Python package is based on. The images have filenames that terminate in '_image.png'; the mask filenames terminate in '_mask.png'. The mask rasters only contain three values: 0 for background, 1 for the grain itself, and 2 for the grain boundary.
These files can be used to train a new U-Net model, either using 'Segmenteverygrain' functions, or using any machine learning framework that has functionality for training image segmentation models.
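A minimal sketch of pairing the '_image.png' and '_mask.png' files and confirming that each mask contains only the three documented values: 0 (background), 1 (grain), 2 (grain boundary). The folder name is hypothetical.

```python
# Minimal sketch: pair image/mask files and validate mask values {0, 1, 2}.
from pathlib import Path
import numpy as np
from PIL import Image

folder = Path("grain_training_data")  # hypothetical folder name
for image_path in sorted(folder.glob("*_image.png")):
    mask_path = image_path.with_name(image_path.name.replace("_image.png", "_mask.png"))
    mask = np.array(Image.open(mask_path))
    assert set(np.unique(mask)) <= {0, 1, 2}, mask_path
    print(image_path.name, mask.shape, np.unique(mask))
```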
Some of these images come from the SediNet project (Buscombe, 2019). A few images of fluvial gravel were collected by Mair et al. (2022), using UAVs; see this repository. The remaining images were taken either with a handheld digital camera or using a microscope.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The image dataset was prepared for training deep learning image segmentation models to identify karst sinkholes. Information about the work can be found at (https://github.com/mvrl/sink-seg/). The dataset consists of a DEM image, an aerial image, and a binary sinkhole label image in an area in central Kentucky, USA. It also includes four images derived from the DEM image. The image dataset is sourced from publicly available data from Kentucky's Elevation Data & Aerial Photography Program (https://kyfromabove.ky.gov/) and Kentucky LiDAR-derived sinkholes (https://kgs.uky.edu/geomap).
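One plausible way to assemble the DEM, DEM-derived layers, aerial image, and binary sinkhole label into model-ready arrays is sketched below with rasterio. The file names, formats, and band layout are assumptions; the dataset description does not specify them.

```python
# Minimal sketch, assuming the DEM, aerial, and label rasters are GeoTIFFs
# on a shared grid (file names and band layout are assumptions).
import numpy as np
import rasterio

with rasterio.open("dem.tif") as src:
    dem = src.read(1).astype(np.float32)       # single-band elevation
with rasterio.open("aerial.tif") as src:
    aerial = src.read().astype(np.float32)     # (bands, H, W) aerial image
with rasterio.open("sinkhole_labels.tif") as src:
    labels = src.read(1).astype(np.uint8)      # binary sinkhole mask

# Stack DEM and aerial bands into one (channels, H, W) input for a model.
inputs = np.concatenate([dem[np.newaxis, ...], aerial], axis=0)
print(inputs.shape, labels.shape)
```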
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
"Image.zip" contains 955 corrrosion images, 1480 crack images, 1269 free lime images, 873 water leakage images, and 1244 spalling images. These images are labeled with numbers from 0 to 6 including the background. The "Label.zip" file contains the labeled images, and the "Image.json" file contains the label information.
Segment Anything 1 Billion (SA-1B) is a dataset designed for training general-purpose object segmentation models from open world images. The dataset was introduced in the paper "Segment Anything".
The SA-1B dataset consists of 11M diverse, high-resolution, licensed, and privacy-protecting images and 1.1B mask annotations. Masks are given in the COCO run-length encoding (RLE) format, and do not have classes.
The license is custom. Please read the full terms and conditions at https://ai.facebook.com/datasets/segment-anything-downloads.
All the features are in the original dataset except `image.content` (the content of the image).
You can decode segmentation masks with:
import tensorflow_datasets as tfds

pycocotools = tfds.core.lazy_imports.pycocotools

ds = tfds.load('segment_anything', split='train')
for example in tfds.as_numpy(ds):
    segmentation = example['annotations']['segmentation']
    for counts, size in zip(segmentation['counts'], segmentation['size']):
        encoded_mask = {'size': size, 'counts': counts}
        mask = pycocotools.decode(encoded_mask)  # np.array(dtype=uint8) mask
        ...
To use this dataset:
import tensorflow_datasets as tfds

ds = tfds.load('segment_anything', split='train')
for ex in ds.take(4):
    print(ex)
See the guide for more information on tensorflow_datasets.
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Road Segmentation Dataset
This dataset comprises a collection of images captured through DVRs (Digital Video Recorders) showcasing roads. Each image is accompanied by segmentation masks demarcating different entities (road surface, cars, road signs, markings, and background) within the scene.
💴 For commercial usage: to discuss your requirements, learn about pricing, and buy the dataset, leave a request on TrainingData.
The dataset can be utilized… See the full description on the dataset page: https://huggingface.co/datasets/TrainingDataPro/roads-segmentation-dataset.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset visualizes Grad-CAM-based heatmaps applied to membrane segmentation results obtained with U-Net. The training data are in the "train" folder, which contains:
- "checkpoint" folder: stores checkpoint files for 3 epochs: 100, 500, and 5,000
- "image" folder: holds training images
- "label" folder: stores labelled membrane images
The testing results are stored in "test_xxx" folders for 3 epochs: 100, 500, and 5,000.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This image dataset contains synthetic structure images used for training the deep-learning-based nanowire segmentation model presented in our work "A deep learned nanowire segmentation model using synthetic data augmentation", to be published in npj Computational Materials. Detailed information can be found in the corresponding article.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Today, deep neural networks are widely used in many computer vision problems, including for geographic information systems (GIS) data. This type of data is commonly used for urban analyses and spatial planning. We used orthophotographic images of two residential districts from Kielce, Poland for research including automatic urban sprawl analysis with a Transformer-based neural network. Orthophotomaps were obtained from the Kielce GIS portal. The map was then manually masked into building and building-surroundings classes. Finally, the orthophotomap and the corresponding classification mask were simultaneously divided into small tiles. This approach is common in image data preprocessing for the learning phase of machine learning algorithms. The data contain two original orthophotomaps from the Wietrznia and Pod Telegrafem residential districts with corresponding masks, and also their tiled versions, ready to provide as training data for machine learning models.
A Transformer-based neural network was trained on the Wietrznia dataset for semantic segmentation of the tiles into building and surroundings classes. After that, model inference was used to test the model's generalization ability on the Pod Telegrafem dataset. The efficiency of the model was satisfactory, so it can be used for automatic semantic building segmentation. The process of dividing the images can then be reversed and the complete classification mask retrieved. This mask can be used for building-area calculations and urban sprawl monitoring, if the research were repeated for GIS data from a wider time horizon.
Since the dataset was collected from the Kielce GIS portal, as part of the Polish Main Office of Geodesy and Cartography data resource, it may be used only for non-profit and non-commercial purposes, in private or scientific applications, under the law "Ustawa z dnia 4 lutego 1994 r. o prawie autorskim i prawach pokrewnych (Dz.U. z 2006 r. nr 90 poz 631 z późn. zm.)". There are no other legal or ethical considerations in reuse potential.
Data information is presented below.
- wietrznia_2019.jpg - orthophotomap of the Wietrznia district - used for the model's training, as an explanatory image
- wietrznia_2019.png - classification mask of the Wietrznia district - used for the model's training, as a target image
- wietrznia_2019_validation.jpg - one image from the Wietrznia district - used for the model's validation during the training phase
- pod_telegrafem_2019.jpg - orthophotomap of the Pod Telegrafem district - used for the model's evaluation after the training phase
- wietrznia_2019 - folder with the wietrznia_2019.jpg (image) and wietrznia_2019.png (annotation) images, divided into 810 tiles (512 x 512 pixels each); tiles with no information were manually removed, so the training data contain only informative tiles - tiles presented to the model during training (images and annotations for fitting the model to the data)
- wietrznia_2019_validation - folder with the wietrznia_2019_validation.jpg image divided into 16 tiles (256 x 256 pixels each) - tiles presented to the model during training (images for validating the model's efficiency); it was not part of the training data
- pod_telegrafem_2019 - folder with the pod_telegrafem.jpg image divided into 196 tiles (256 x 265 pixels each) - tiles presented to the model during inference (images for evaluating the model's robustness)
The dataset was created as described below. Firstly, the orthophotomaps were collected from the Kielce Geoportal (https://gis.kielce.eu).
Kielce Geoportal offers a recent .pst map from April 2019. It is an orthophotomap with a resolution of 5 x 5 pixels, constructed from a plane flight at 700 meters above ground level, taken with a camera for vertical photos. Downloading was done via WMS in the open-source QGIS software (https://www.qgis.org), as a 1:500 scale map, which was then converted to a 1200 dpi PNG image. Secondly, the map of the Wietrznia residential district was manually labelled, also in QGIS, in the same scope as the orthophotomap. Annotation was based on land cover map information, also obtained from the Kielce Geoportal. There are two classes: residential building and surroundings. The second map, of the Pod Telegrafem district, was not annotated, since it was used in the testing phase and imitates a situation where there is no annotation for new data presented to the model. Next, the images were converted to RGB JPG images, and the annotation map was converted to an 8-bit grayscale PNG image. Finally, the Wietrznia data files were tiled into 512 x 512 pixel tiles using the Python PIL library. Tiles with no information, or a relatively small amount of information (only white background or mostly white background), were manually removed. So, from the 29113 x 15938 pixel orthophotomap, only 810 tiles with corresponding annotations were left, ready to train the machine learning model for the semantic segmentation task. The Pod Telegrafem orthophotomap was tiled without manual removal, so the 7168 x 7168 pixel orthophotomap yielded 197 tiles at 256 x 256 pixel resolution. There was also an image of one residential building, used for the model's validation during the training phase; it was not part of the training data but was part of the Wietrznia residential area. It was a 2048 x 2048 pixel orthophotomap, tiled into 16 tiles of 256 x 265 pixels each.
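A minimal sketch of the tiling step described above, cutting an orthophotomap and its mask into 512 x 512 tiles with PIL. The file names match those listed for the Wietrznia district, but the output layout is hypothetical, and the original removal of uninformative (mostly white) tiles was done manually rather than in code.

```python
# Minimal sketch: tile an orthophotomap and its mask into 512 x 512 tiles with PIL.
from pathlib import Path
from PIL import Image

Image.MAX_IMAGE_PIXELS = None  # large orthophotomaps exceed PIL's default safety limit
TILE = 512
Path("tiles").mkdir(exist_ok=True)

ortho = Image.open("wietrznia_2019.jpg")
mask = Image.open("wietrznia_2019.png")

for top in range(0, ortho.height - TILE + 1, TILE):
    for left in range(0, ortho.width - TILE + 1, TILE):
        box = (left, top, left + TILE, top + TILE)
        ortho.crop(box).save(f"tiles/image_{top}_{left}.jpg")
        mask.crop(box).save(f"tiles/mask_{top}_{left}.png")
```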