6 datasets found
  1. Doodleverse/Segmentation Zoo/Seg2Map SegFormer models for CoastTrain/8-class...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jul 12, 2024
    Cite
    Buscombe, Daniel (2024). Doodleverse/Segmentation Zoo/Seg2Map SegFormer models for CoastTrain/8-class segmentation of RGB 768x768 NAIP images [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7641723
    Dataset updated
    Jul 12, 2024
    Dataset authored and provided by
    Buscombe, Daniel
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Doodleverse/Segmentation Zoo/Seg2Map SegFormer models for CoastTrain/8-class segmentation of RGB 768x768 NAIP images

    These Segformer model data are based on Coast Train images and associated labels. https://coasttrain.github.io/CoastTrain/docs/Version%201:%20March%202022/data

    Models have been created using Segmentation Gym* using the following dataset**: https://doi.org/10.1038/s41597-023-01929-2

    Image size used by model: 768 x 768 x 3 pixels

    classes:

    water, whitewater, sediment, other_bare_natural_terrain, marsh_vegetation, terrestrial_vegetation, agricultural, development

    File descriptions

    For each model, there are 5 files with the same root name:

    1. '.json' config file: this is the file that was used by Segmentation Gym* to create the weights file. It contains instructions for how to make the model and the data it used, as well as instructions for how to use the model for prediction. It is a handy wee thing and mastering it means mastering the entire Doodleverse.

    2. '.h5' weights file: this is the file that was created by the Segmentation Gym* function train_model.py. It contains the trained model's parameter weights. It can be called by the Segmentation Gym* function seg_images_in_folder.py. Models may be ensembled.

    3. '_modelcard.json' model card file: this is a json file containing fields that collectively describe the model origins, training choices, and dataset that the model is based upon. There is some redundancy between this file and the config file (described above) that contains the instructions for the model training and implementation. The model card file is not used by the program, but it is important metadata, so it should be kept with the other files that collectively make the model, and as such it is considered part of the model.

    4. '_model_history.npz' model training history file: this numpy archive file contains numpy arrays describing the training and validation losses and metrics. It is created by the Segmentation Gym function train_model.py.

    5. '.png' model training loss and mean IoU plot: this png file contains plots of training and validation losses and mean IoU scores during model training, a subset of the data inside the .npz file. It is created by the Segmentation Gym function train_model.py.

    Additionally, BEST_MODEL.txt contains the name of the model with the best validation loss and mean IoU
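
    The five-file convention described above lends itself to a quick completeness check on a downloaded folder. The following is a minimal sketch and not part of Segmentation Gym: the helpers `group_by_root` and `missing_files` and the example file names are hypothetical, and only the five suffixes are taken from the list above.

```python
from collections import defaultdict

# Suffixes taken from the file descriptions; longest first so that
# '_modelcard.json' is not mistaken for a plain '.json' config file.
SUFFIXES = ["_model_history.npz", "_modelcard.json", ".json", ".h5", ".png"]


def group_by_root(filenames):
    """Map each model root name to the set of suffixes found for it."""
    groups = defaultdict(set)
    for name in filenames:
        for suffix in SUFFIXES:
            if name.endswith(suffix):
                groups[name[: -len(suffix)]].add(suffix)
                break  # first (longest) matching suffix wins
    return dict(groups)


def missing_files(filenames):
    """Return {root: sorted missing suffixes} for incomplete models."""
    expected = set(SUFFIXES)
    return {
        root: sorted(expected - found)
        for root, found in group_by_root(filenames).items()
        if found != expected
    }


# Hypothetical listing: 'model_a' is complete, 'model_b' lacks its plot.
listing = [
    "model_a.json", "model_a.h5", "model_a_modelcard.json",
    "model_a_model_history.npz", "model_a.png",
    "model_b.json", "model_b.h5", "model_b_modelcard.json",
    "model_b_model_history.npz",
    "BEST_MODEL.txt",  # matches no model suffix, so it is ignored
]
print(missing_files(listing))
```

    Files that match none of the five suffixes, such as BEST_MODEL.txt, are simply skipped by this sketch.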

    References

    *Segmentation Gym: Buscombe, D., & Goldstein, E. B. (2022). A reproducible and reusable pipeline for segmentation of geoscientific imagery. Earth and Space Science, 9, e2022EA002332. https://doi.org/10.1029/2022EA002332 See: https://github.com/Doodleverse/segmentation_gym

    **Buscombe, D., Wernette, P., Fitzpatrick, S. et al. A 1.2 Billion Pixel Human-Labeled Dataset for Data-Driven Classification of Coastal Environments. Sci Data 10, 46 (2023). https://doi.org/10.1038/s41597-023-01929-2

  2. Doodleverse/CoastSeg Segformer models for 4-class (water, whitewater,...

    • data.niaid.nih.gov
    Updated Jul 11, 2024
    Cite
    Buscombe, Daniel (2024). Doodleverse/CoastSeg Segformer models for 4-class (water, whitewater, sediment and other) segmentation of Sentinel-2 and Landsat-7/8 NDWI images of coasts. [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8190741
    Dataset updated
    Jul 11, 2024
    Dataset authored and provided by
    Buscombe, Daniel
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Doodleverse/CoastSeg Segformer models for 4-class (water, whitewater, sediment and other) segmentation of Sentinel-2 and Landsat-7/8 NDWI images of coasts.

    Models have been created using Segmentation Gym* using the following datasets: ** https://zenodo.org/record/7384263 and *** https://doi.org/10.5281/zenodo.7335647. Those datasets have been combined, and the training and validation images and labels are provided here.

    Classes: {0=water, 1=whitewater, 2=sediment, 3=other}

    Model validation accuracy statistics

    model name: overall accuracy, mean frequency weighted IoU, mean IoU, Matthews correlation. Bold indicates best overall

    [model name missing]: 0.896016693115234, 0.832759195999637, 0.565748652153519, 0.806139409136944
    [model name missing]: 0.906008201175266, 0.847625161837392, 0.593821675882991, 0.819790462192222
    [model name missing]: 0.903999212053087, 0.844255821722932, 0.577444030164045, 0.813646408575
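
    The listing does not say how the four statistics named above were computed. As a hedged illustration only, the sketch below uses their standard textbook definitions on a confusion matrix; Segmentation Gym's own implementation may differ (for example by averaging per image), and the function name `segmentation_scores` is hypothetical.

```python
import numpy as np


def segmentation_scores(cm):
    """Compute four summary scores from a square confusion matrix.

    cm[i, j] counts pixels whose true class is i and predicted class is j.
    """
    cm = np.asarray(cm, dtype=float)
    total = cm.sum()
    true_counts = cm.sum(axis=1)  # pixels per true class
    pred_counts = cm.sum(axis=0)  # pixels per predicted class
    correct = np.diag(cm)         # correctly labeled pixels per class

    overall_accuracy = correct.sum() / total

    # Per-class IoU = TP / (TP + FP + FN); classes absent from both
    # truth and prediction are skipped.
    union = true_counts + pred_counts - correct
    present = union > 0
    iou = correct[present] / union[present]
    mean_iou = iou.mean()
    # Frequency weighted IoU: each class IoU weighted by its true pixel share.
    fw_iou = (true_counts[present] / total * iou).sum()

    # Multi-class generalization of the Matthews correlation coefficient.
    numerator = correct.sum() * total - (true_counts * pred_counts).sum()
    denominator = np.sqrt(
        (total**2 - (pred_counts**2).sum()) * (total**2 - (true_counts**2).sum())
    )
    mcc = numerator / denominator if denominator > 0 else 0.0

    return {
        "overall_accuracy": float(overall_accuracy),
        "fw_iou": float(fw_iou),
        "mean_iou": float(mean_iou),
        "mcc": float(mcc),
    }


# Toy two-class example (invented counts, not from the dataset).
print(segmentation_scores([[3, 1], [1, 3]]))
```

    Mean IoU weights every class equally, which is one reason it can sit well below overall accuracy when rare classes are segmented poorly.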
    

    File descriptions

    For each model, there are 5 files with the same root name:

    1. '.json' config file: this is the file that was used by Segmentation Gym* to create the weights file. It contains instructions for how to make the model and the data it used, as well as instructions for how to use the model for prediction. It is a handy wee thing and mastering it means mastering the entire Doodleverse.

    2. '.h5' weights file: this is the file that was created by the Segmentation Gym* function train_model.py. It contains the trained model's parameter weights. It can be called by the Segmentation Gym* function seg_images_in_folder.py. Models may be ensembled.

    3. '_modelcard.json' model card file: this is a json file containing fields that collectively describe the model origins, training choices, and dataset that the model is based upon. There is some redundancy between this file and the config file (described above) that contains the instructions for the model training and implementation. The model card file is not used by the program, but it is important metadata, so it should be kept with the other files that collectively make the model, and as such it is considered part of the model.

    4. '_model_history.npz' model training history file: this numpy archive file contains numpy arrays describing the training and validation losses and metrics. It is created by the Segmentation Gym function train_model.py.

    5. '.png' model training loss and mean IoU plot: this png file contains plots of training and validation losses and mean IoU scores during model training, a subset of the data inside the .npz file. It is created by the Segmentation Gym function train_model.py.

    Additionally, BEST_MODEL.txt contains the name of the model with the best validation loss and mean IoU
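
    The '_model_history.npz' file described in item 4 is a standard numpy archive, so its contents can be listed without any Doodleverse software. A minimal sketch follows; the array names `loss` and `val_loss` and their values are invented stand-ins, because the listing does not document the actual keys, and `summarize_history` is a hypothetical helper.

```python
import io

import numpy as np

# Stand-in for a real '<root>_model_history.npz'; the array names and
# values are invented for illustration.
buffer = io.BytesIO()
np.savez(
    buffer,
    loss=np.array([0.9, 0.6, 0.5]),
    val_loss=np.array([1.0, 0.7, 0.65]),
)
buffer.seek(0)


def summarize_history(file):
    """Return {array name: (shape, last value)} for a numpy .npz archive."""
    with np.load(file) as archive:
        return {
            name: (archive[name].shape, float(archive[name].ravel()[-1]))
            for name in archive.files
        }


summary = summarize_history(buffer)
for name, (shape, last) in summary.items():
    print(f"{name}: shape={shape}, last value={last}")
```

    With a real download, `summarize_history` can be given the path to the .npz file instead of the in-memory buffer.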

    This is a sister model to these sets of Residual UNets:

    https://zenodo.org/record/7557072
    https://zenodo.org/record/7352859

    References

    *Segmentation Gym: Buscombe, D., & Goldstein, E. B. (2022). A reproducible and reusable pipeline for segmentation of geoscientific imagery. Earth and Space Science, 9, e2022EA002332. https://doi.org/10.1029/2022EA002332 See: https://github.com/Doodleverse/segmentation_gym

    ** https://zenodo.org/record/7384263

    ***Buscombe, Daniel. (2023). June 2023 Supplement Images and 4-class labels for semantic segmentation of Sentinel-2 and Landsat RGB, NIR, and SWIR satellite images of coasts (water, whitewater, sediment, other) (v1.0) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.8011926

  3. Doodleverse/Segmentation Zoo/Seg2Map SegFormer models for segmentation of...

    • zenodo.org
    • data.niaid.nih.gov
    bin, json, png, txt
    Updated Jul 12, 2024
    Cite
    Daniel Buscombe (2024). Doodleverse/Segmentation Zoo/Seg2Map SegFormer models for segmentation of xBD/buildings in RGB 768x768 high-res. images [Dataset]. http://doi.org/10.5281/zenodo.7613212
    Available download formats: txt, json, bin, png
    Dataset updated
    Jul 12, 2024
    Dataset provided by
    Zenodo: http://zenodo.org/
    Authors
    Daniel Buscombe
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Doodleverse/Segmentation Zoo/Seg2Map SegFormer models for segmentation of xBD/buildings in RGB 768x768 high-res. images

    Models have been created using Segmentation Gym* using the following dataset**: https://arxiv.org/abs/1911.09296

    These SegFormer model data are based on 1m spatial footprint images and associated labels of buildings.

    Image size used by model: 768 x 768 x 3 pixels

    classes:
    other
    building

    File descriptions

    For each model, there are 5 files with the same root name:

    1. '.json' config file: this is the file that was used by Segmentation Gym* to create the weights file. It contains instructions for how to make the model and the data it used, as well as instructions for how to use the model for prediction. It is a handy wee thing and mastering it means mastering the entire Doodleverse.

    2. '.h5' weights file: this is the file that was created by the Segmentation Gym* function `train_model.py`. It contains the trained model's parameter weights. It can be called by the Segmentation Gym* function `seg_images_in_folder.py`. Models may be ensembled.

    3. '_modelcard.json' model card file: this is a json file containing fields that collectively describe the model origins, training choices, and dataset that the model is based upon. There is some redundancy between this file and the `config` file (described above) that contains the instructions for the model training and implementation. The model card file is not used by the program, but it is important metadata, so it should be kept with the other files that collectively make the model, and as such it is considered part of the model.

    4. '_model_history.npz' model training history file: this numpy archive file contains numpy arrays describing the training and validation losses and metrics. It is created by the Segmentation Gym function `train_model.py`.

    5. '.png' model training loss and mean IoU plot: this png file contains plots of training and validation losses and mean IoU scores during model training, a subset of the data inside the .npz file. It is created by the Segmentation Gym function `train_model.py`.

    Additionally, BEST_MODEL.txt contains the name of the model with the best validation loss and mean IoU
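
    The redundancy between the config file and the model card noted in item 3 can be inspected by comparing the two JSON objects. This is a minimal sketch assuming both files parse to flat JSON objects; the key names and values below are invented for illustration and are not taken from any real Doodleverse file, and `compare_json` is a hypothetical helper.

```python
import json

# Hypothetical file contents; real config and model card keys may differ.
config_text = '{"TARGET_SIZE": [768, 768], "NCLASSES": 2, "BATCH_SIZE": 4}'
modelcard_text = '{"NCLASSES": 2, "TARGET_SIZE": [768, 768], "DATASET": "xBD"}'


def compare_json(config, modelcard):
    """Split the keys shared by two dicts into agreeing and conflicting."""
    shared = sorted(set(config) & set(modelcard))
    agreeing = [key for key in shared if config[key] == modelcard[key]]
    conflicting = [key for key in shared if config[key] != modelcard[key]]
    return agreeing, conflicting


agreeing, conflicting = compare_json(
    json.loads(config_text), json.loads(modelcard_text)
)
print("same value in both files:", agreeing)
print("different values:", conflicting)
```

    Any key reported as conflicting would be worth checking by hand, since the model card is described as metadata only and is not read by the program.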

    References
    *Segmentation Gym: Buscombe, D., & Goldstein, E. B. (2022). A reproducible and reusable pipeline for segmentation of geoscientific imagery. Earth and Space Science, 9, e2022EA002332. https://doi.org/10.1029/2022EA002332 See: https://github.com/Doodleverse/segmentation_gym

    **Ritwik Gupta, Bryce Goodman, Nirav Patel, Ricky Hosfelt, Sandra Sajeev, Eric Heim, Jigar Doshi, Keane Lucas, Howie Choset, and Matthew Gaston. Creating xbd: A dataset for assessing building damage from satellite imagery. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, June 2019. https://arxiv.org/abs/1911.09296

  4. Doodleverse/CoastSeg Segformer models for 4-class (water, whitewater,...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jul 11, 2024
    Cite
    Buscombe, Daniel (2024). Doodleverse/CoastSeg Segformer models for 4-class (water, whitewater, sediment and other) segmentation of Sentinel-2 and Landsat-7/8 MNDWI images of coasts. [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8190852
    Dataset updated
    Jul 11, 2024
    Dataset authored and provided by
    Buscombe, Daniel
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Doodleverse/CoastSeg Segformer models for 4-class (water, whitewater, sediment and other) segmentation of Sentinel-2 and Landsat-7/8 MNDWI images of coasts.

    Models have been created using Segmentation Gym* using the following datasets: ** https://zenodo.org/record/7384263 and *** https://doi.org/10.5281/zenodo.7335647. Those datasets have been combined, and the training and validation images and labels are provided here.

    Classes: {0=water, 1=whitewater, 2=sediment, 3=other}
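
    The MNDWI inputs named in this entry's title, and the NDWI inputs of the sister entry, are both normalized-difference water indices. The listing does not give the formulas, so the sketch below assumes the commonly used definitions, NDWI = (green - NIR) / (green + NIR) and MNDWI = (green - SWIR) / (green + SWIR); the band values are invented, and the exact band choices and scaling used for these models are not confirmed by the source.

```python
import numpy as np


def normalized_difference(a, b):
    """Return (a - b) / (a + b) elementwise, with 0 where a + b == 0."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    total = a + b
    out = np.zeros_like(total)
    # Only divide where the denominator is non-zero; other cells stay 0.
    np.divide(a - b, total, out=out, where=total != 0)
    return out


def ndwi(green, nir):
    """NDWI under the assumed definition: green versus near-infrared."""
    return normalized_difference(green, nir)


def mndwi(green, swir):
    """MNDWI under the assumed definition: green versus shortwave infrared."""
    return normalized_difference(green, swir)


# Invented reflectance values for two pixels.
green = np.array([0.30, 0.20])
nir = np.array([0.10, 0.60])
swir = np.array([0.10, 0.60])
print(ndwi(green, nir))
print(mndwi(green, swir))
```

    Under these definitions, positive values indicate that the pixel reflects more in the green band than in the infrared band.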

    Model validation accuracy statistics

    model name: overall accuracy, mean frequency weighted IoU, mean IoU, Matthews correlation. Bold indicates best overall

    v2: 0.808, 0.7309, 0.47864, 0.656
    v3: 0.809, 0.7302, 0.4982, 0.664
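
    Because the bold formatting mentioned above did not survive in this listing, the best model per statistic can be recovered from the numbers themselves. The sketch below uses only the values quoted for v2 and v3; the helper name `best_per_metric` is hypothetical.

```python
# Column order follows the header line of the statistics table.
METRICS = [
    "overall accuracy",
    "mean frequency weighted IoU",
    "mean IoU",
    "Matthews correlation",
]

# Values copied from the v2 and v3 rows of the listing.
SCORES = {
    "v2": [0.808, 0.7309, 0.47864, 0.656],
    "v3": [0.809, 0.7302, 0.4982, 0.664],
}


def best_per_metric(scores, metrics):
    """Return {metric name: model with the highest value for that metric}."""
    return {
        metric: max(scores, key=lambda model: scores[model][index])
        for index, metric in enumerate(metrics)
    }


for metric, model in best_per_metric(SCORES, METRICS).items():
    print(f"{metric}: {model}")
```

    On these figures v3 leads on three of the four statistics, while v2 is marginally ahead on mean frequency weighted IoU.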
    

    File descriptions

    For each model, there are 5 files with the same root name:

    1. '.json' config file: this is the file that was used by Segmentation Gym* to create the weights file. It contains instructions for how to make the model and the data it used, as well as instructions for how to use the model for prediction. It is a handy wee thing and mastering it means mastering the entire Doodleverse.

    2. '.h5' weights file: this is the file that was created by the Segmentation Gym* function train_model.py. It contains the trained model's parameter weights. It can be called by the Segmentation Gym* function seg_images_in_folder.py. Models may be ensembled.

    3. '_modelcard.json' model card file: this is a json file containing fields that collectively describe the model origins, training choices, and dataset that the model is based upon. There is some redundancy between this file and the config file (described above) that contains the instructions for the model training and implementation. The model card file is not used by the program, but it is important metadata, so it should be kept with the other files that collectively make the model, and as such it is considered part of the model.

    4. '_model_history.npz' model training history file: this numpy archive file contains numpy arrays describing the training and validation losses and metrics. It is created by the Segmentation Gym function train_model.py.

    5. '.png' model training loss and mean IoU plot: this png file contains plots of training and validation losses and mean IoU scores during model training, a subset of the data inside the .npz file. It is created by the Segmentation Gym function train_model.py.

    Additionally, BEST_MODEL.txt contains the name of the model with the best validation loss and mean IoU

    This is a sister model to these sets of Residual UNets:

    https://zenodo.org/record/7352850
    https://zenodo.org/record/7557080
    

    References

    *Segmentation Gym: Buscombe, D., & Goldstein, E. B. (2022). A reproducible and reusable pipeline for segmentation of geoscientific imagery. Earth and Space Science, 9, e2022EA002332. https://doi.org/10.1029/2022EA002332 See: https://github.com/Doodleverse/segmentation_gym

    ** Buscombe, Daniel. (2022). Images and 2-class labels for semantic segmentation of Sentinel-2 and Landsat RGB, NIR, and SWIR satellite images of coasts (water, other) (v1.0) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.7384263

    ***Coast Train data release: Wernette, P.A., Buscombe, D.D., Favela, J., Fitzpatrick, S., and Goldstein E., 2022, Coast Train--Labeled imagery for training and evaluation of data-driven models for image segmentation: U.S. Geological Survey data release, https://doi.org/10.5066/P91NP87I. See https://coasttrain.github.io/CoastTrain/ for more information

    ***Buscombe, Daniel. (2023). June 2023 Supplement Images and 4-class labels for semantic segmentation of Sentinel-2 and Landsat RGB, NIR, and SWIR satellite images of coasts (water, whitewater, sediment, other) (v1.0) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.8011926

  5. sidewalk-semantic

    • huggingface.co
    Updated Jun 12, 2022
    Cite
    Segments.ai (2022). sidewalk-semantic [Dataset]. https://huggingface.co/datasets/segments/sidewalk-semantic
    Dataset updated
    Jun 12, 2022
    Dataset provided by
    Segments.ai
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Dataset Card for sidewalk-semantic

    Dataset Summary

    A dataset of sidewalk images gathered in Belgium in the summer of 2021. Label your own semantic segmentation datasets on segments.ai

    Supported Tasks and Leaderboards

    semantic-segmentation: The dataset can be used to train a semantic segmentation model, where each pixel is classified. Model performance is measured by its mean IoU (intersection over union) against the reference labels.

      Dataset… See the full description on the dataset page: https://huggingface.co/datasets/segments/sidewalk-semantic.
    
  6. Labeled high-resolution orthoimagery time-series of an alluvial river...

    • data.niaid.nih.gov
    Updated Nov 20, 2023
    Cite
    Buscombe, Daniel (2023). Labeled high-resolution orthoimagery time-series of an alluvial river corridor; Elwha River, Washington, USA. [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10155782
    Dataset updated
    Nov 20, 2023
    Dataset authored and provided by
    Buscombe, Daniel
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    United States, Washington, Elwha River
    Description

    Labeled high-resolution orthoimagery time-series of an alluvial river corridor; Elwha River, Washington, USA.

    Daniel Buscombe, Marda Science LLC

    There are two datasets in this data release:

    1. Model training dataset. A manually (or semi-manually) labeled image dataset that was used to train and evaluate a machine (deep) learning model designed to identify subaerial accumulations of large wood, alluvial sediment, water, and vegetation in orthoimagery of alluvial river corridors in forested catchments.

    2. Model output dataset. A labeled image dataset that uses the aforementioned model to estimate subaerial accumulations of large wood, alluvial sediment, water, and vegetation in a larger orthoimagery dataset of alluvial river corridors in forested catchments.

    All of these label data are derived from raw gridded data that originate from the U.S. Geological Survey (Ritchie et al., 2018). That dataset consists of 14 orthoimages of the Middle Reach (MR, in between the former Aldwell and Mills reservoirs) and 14 corresponding orthoimages of the Lower Reach (LR, downstream of the former Mills reservoir) of the Elwha River, Washington, collected between 2012-04-07 and 2017-09-22. That orthoimagery was generated using SfM photogrammetry (following Over et al., 2021) using a photographic camera mounted to an aircraft wing. The imagery captures channel change as it evolved under a ~20 Mt sediment pulse initiated by the removal of the two dams. The two reaches are the ~8 km long Middle Reach (MR) and the lower-gradient ~7 km long Lower Reach (LR).

    The orthoimagery has been labeled (pixelwise, either manually or by an automated process) according to the following classes (integer class in the label data in parentheses):

    1. vegetation / other (0)
    2. water (1)
    3. sediment (2)
    4. large wood (3)

    1. Model training dataset.

    Imagery was labeled using a combination of the open-source software Doodler (Buscombe et al., 2021; https://github.com/Doodleverse/dash_doodler) and hand-digitization using QGIS at 1:300 scale, rasterizing the polygons, and gridding and clipping them in the same way as all other gridded data. Doodler facilitates relatively labor-free dense multiclass labeling of natural imagery, enabling relatively rapid training dataset creation. The final training dataset consists of 4382 images and corresponding labels, each 1024 x 1024 pixels and representing just over 5% of the total data set. The training data are sampled approximately equally in time and in space among both reaches. All training and validation samples purposefully included all four label classes, to avoid model training and evaluation problems associated with class imbalance (Buscombe and Goldstein, 2022).

    Data are provided in geoTIFF format. The imagery and label grids (imagery) are reprojected to be co-located in the NAD83(2011) / UTM zone 10N projection, and to consist of 0.125 x 0.125m pixels.

    Pixel-wise label measurements such as these facilitate development and evaluation of image segmentation, image classification, object-based image-analysis (OBIA), and object-in-image detection models, and numerous potential other machine learning models for the general purposes of river corridor classification, description, enumeration, inventory, and process or state quantification. For example, this dataset may serve in transfer learning contexts for application in different river or coastal environments or for different tasks or class ontologies.

    Files:

    1. Labels_used_for_model_training_Buscombe_Labeled_high_resolution_orthoimagery_time_series_of_an_alluvial_river_corridor_Elwha_River_Washington_USA.zip, 63 MB, label tiffs
    2. Model_training_ images1of4.zip, 1.5 GB, imagery tiffs
    3. Model_training_ images2of4.zip, 1.5 GB, imagery tiffs
    4. Model_training_ images3of4.zip, 1.7 GB, imagery tiffs
    5. Model_training_ images4of4.zip, 1.6 GB, imagery tiffs

    2. Model output dataset.

    Imagery was labeled using a deep-learning based semantic segmentation model (Buscombe, 2023) trained specifically for the task using the Segmentation Gym (Buscombe and Goldstein, 2022) modeling suite. We use the software package Segmentation Gym (Buscombe and Goldstein, 2022) to fine-tune a Segformer (Xie et al., 2021) deep learning model for semantic image segmentation. We take the instance (i.e. model architecture and trained weights) of the model of Xie et al. (2021), itself fine-tuned on the ADE20k dataset (Zhou et al., 2019) at resolution 512x512 pixels, and fine-tune it on our 1024x1024 pixel training data consisting of 4-class label images.

    The spatial extent of the imagery in the MR is 455157.2494695878122002,5316532.9804129302501678 : 457076.1244695878122002,5323771.7304129302501678. Imagery width is 15351 pixels and imagery height is 57910 pixels. The spatial extent of the imagery in the LR is 457704.9227139975992031,5326631.3750646486878395 : 459241.6727139975992031,5333311.0000646486878395. Imagery width is 12294 pixels and imagery height is 53437 pixels.

    Data are provided in Cloud-Optimized geoTIFF (COG) format. The imagery and label grids (imagery) are reprojected to be co-located in the NAD83(2011) / UTM zone 10N projection, and to consist of 0.125 x 0.125m pixels. All grids have been clipped to the union of extents of active channel margins during the period of interest.

    Reach-wide pixel-wise measurements such as these facilitate comparison of wood and sediment storage at any scale or location. These data may be useful for studying the morphodynamics of wood-sediment interactions in other geomorphically complex channels, wood storage in channels, the role of wood in ecosystems and conservation or restoration efforts.

    Files:

    1. Elwha_MR_labels_Buscombe_Labeled_high_resolution_orthoimagery_time_series_of_an_alluvial_river_corridor_Elwha_River_Washington_USA.zip, 9.67 MB, label COGs from Elwha River Middle Reach (MR)
    2. ElwhaMR_ imagery_ part1_ of_ 2.zip, 566 MB, imagery COGs from Elwha River Middle Reach (MR)
    3. ElwhaMR_ imagery_ part2_ of_ 2.zip, 618 MB, imagery COGs from Elwha River Middle Reach (MR)
    4. Elwha_LR_labels_Buscombe_Labeled_high_resolution_orthoimagery_time_series_of_an_alluvial_river_corridor_Elwha_River_Washington_USA.zip, 10.96 MB, label COGs from Elwha River Lower Reach (LR)
    5. ElwhaLR_ imagery_ part1_ of_ 2.zip, 622 MB, imagery COGs from Elwha River Lower Reach (LR)
    6. ElwhaLR_ imagery_ part2_ of_ 2.zip, 617 MB, imagery COGs from Elwha River Lower Reach (LR)

    This dataset was created using open-source tools of the Doodleverse, a software ecosystem for geoscientific image segmentation, by Daniel Buscombe (https://github.com/dbuscombe-usgs) and Evan Goldstein (https://github.com/ebgoldstein). Thanks to the contributors of the Doodleverse! Thanks especially to Sharon Fitzpatrick (https://github.com/2320sharon) and Jaycee Favela for contributing labels.

    References

    • Buscombe, D. (2023). Doodleverse/Segmentation Gym SegFormer models for 4-class (other, water, sediment, wood) segmentation of RGB aerial orthomosaic imagery (v1.0) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.8172858
    • Buscombe, D., Goldstein, E. B., Sherwood, C. R., Bodine, C., Brown, J. A., Favela, J., et al. (2021). Human-in-the-loop segmentation of Earth surface imagery. Earth and Space Science, 9, e2021EA002085. https://doi.org/10.1029/2021EA002085
    • Buscombe, D., & Goldstein, E. B. (2022). A reproducible and reusable pipeline for segmentation of geoscientific imagery. Earth and Space Science, 9, e2022EA002332. https://doi.org/10.1029/2022EA002332 See: https://github.com/Doodleverse/segmentation_gym
    • Over, J.R., Ritchie, A.C., Kranenburg, C.J., Brown, J.A., Buscombe, D., Noble, T., Sherwood, C.R., Warrick, J.A., and Wernette, P.A., 2021, Processing coastal imagery with Agisoft Metashape Professional Edition, version 1.6--Structure from motion workflow documentation: U.S. Geological Survey Open-File Report 2021–1039, 46 p., https://doi.org/10.3133/ofr20211039.
    • Ritchie, A.C., Curran, C.A., Magirl, C.S., Bountry, J.A., Hilldale, R.C., Randle, T.J., and Duda, J.J., 2018, Data in support of 5-year sediment budget and morphodynamic analysis of Elwha River following dam removals: U.S. Geological Survey data release, https://doi.org/10.5066/F7PG1QWC.
    • Xie, E., Wang, W., Yu, Z., Anandkumar, A., Alvarez, J.M. and Luo, P., 2021. SegFormer: Simple and efficient design for semantic segmentation with transformers. Advances in Neural Information Processing Systems, 34, pp.12077-12090.
    • Zhou, B., Zhao, H., Puig, X., Xiao, T., Fidler, S., Barriuso, A. and Torralba, A., 2019. Semantic understanding of scenes through the ade20k dataset. International Journal of Computer Vision, 127, pp.302-321.
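
    The spatial extents, pixel size, and image dimensions quoted in the description for the Middle and Lower Reach imagery can be cross-checked against each other with simple arithmetic. This is a minimal sketch; it assumes each extent is written as x_min,y_min : x_max,y_max in metres, which the description does not state explicitly, and `grid_shape` is a hypothetical helper.

```python
PIXEL_SIZE = 0.125  # metres, as stated for both reaches

# Extents copied from the description, assumed order:
# (x_min, y_min, x_max, y_max).
MR_EXTENT = (
    455157.2494695878122002, 5316532.9804129302501678,
    457076.1244695878122002, 5323771.7304129302501678,
)
LR_EXTENT = (
    457704.9227139975992031, 5326631.3750646486878395,
    459241.6727139975992031, 5333311.0000646486878395,
)


def grid_shape(x_min, y_min, x_max, y_max, pixel_size):
    """Return (width, height) in pixels for an extent and square pixel size."""
    width = round((x_max - x_min) / pixel_size)
    height = round((y_max - y_min) / pixel_size)
    return width, height


# Dimensions quoted in the description: MR 15351 x 57910, LR 12294 x 53437.
print("MR:", grid_shape(*MR_EXTENT, PIXEL_SIZE))
print("LR:", grid_shape(*LR_EXTENT, PIXEL_SIZE))
```

    The same division gives the ground footprint of each reach: the MR grid spans 1918.875 m by 7238.75 m, and the LR grid spans 1536.75 m by 6679.625 m.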


