Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Images and 2-class labels for semantic segmentation of Sentinel-2 and Landsat RGB satellite images of coasts (water, other)
Description
4088 images and 4088 associated labels for semantic segmentation of Sentinel-2 and Landsat RGB satellite images of coasts. The 2 classes are 1=water, 0=other. Imagery is a mixture of 10-m Sentinel-2 and 15-m pansharpened Landsat 7, 8, and 9 visible-band imagery of various sizes. Red, Green, and Blue bands only.
These images and labels could be used within numerous Machine Learning frameworks for image segmentation, but have specifically been made for use with the Doodleverse software package, Segmentation Gym**.
Two data sources have been combined:
Dataset 1
Dataset 2
File descriptions
References
*Doodler: Buscombe, D., Goldstein, E.B., Sherwood, C.R., Bodine, C., Brown, J.A., Favela, J., Fitzpatrick, S., Kranenburg, C.J., Over, J.R., Ritchie, A.C. and Warrick, J.A., 2021. Human-in-the-Loop Segmentation of Earth Surface Imagery. Earth and Space Science, e2021EA002085. https://doi.org/10.1029/2021EA002085. See https://github.com/Doodleverse/dash_doodler.
**Segmentation Gym: Buscombe, D., & Goldstein, E. B. (2022). A reproducible and reusable pipeline for segmentation of geoscientific imagery. Earth and Space Science, 9, e2022EA002332. https://doi.org/10.1029/2022EA002332. See https://github.com/Doodleverse/segmentation_gym
***Coast Train data release: Wernette, P.A., Buscombe, D.D., Favela, J., Fitzpatrick, S., and Goldstein E., 2022, Coast Train--Labeled imagery for training and evaluation of data-driven models for image segmentation: U.S. Geological Survey data release, https://doi.org/10.5066/P91NP87I. See https://coasttrain.github.io/CoastTrain/ for more information
****Buscombe, Daniel, Goldstein, Evan, Bernier, Julie, Bosse, Stephen, Colacicco, Rosa, Corak, Nick, Fitzpatrick, Sharon, del Jesús González Guillén, Anais, Ku, Venus, Paprocki, Julie, Platt, Lindsay, Steele, Bethel, Wright, Kyle, & Yasin, Brandon. (2022). Images and 4-class labels for semantic segmentation of Sentinel-2 and Landsat RGB satellite images of coasts (water, whitewater, sediment, other) (v1.0) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.7335647
*****Seale, C., Redfern, T., Chatfield, P. 2022. Sentinel-2 Water Edges Dataset (SWED) https://openmldata.ukho.gov.uk/
******Seale, C., Redfern, T., Chatfield, P., Luo, C. and Dempsey, K., 2022. Coastline detection in satellite imagery: A deep learning approach on new benchmark data. Remote Sensing of Environment, 278, p.113044.
U.S. Government Works https://www.usa.gov/government-works
License information was derived automatically
Coast Train is a library of images of coastal environments, annotations, and corresponding thematic label masks (or ‘label images’) collated for the purposes of training and evaluating machine learning (ML), deep learning, and other models for image segmentation. It includes image sets from geospatial satellite, aerial, and UAV imagery and orthomosaics, as well as non-geospatial oblique and nadir imagery. Images include a diverse range of coastal environments from the U.S. Pacific, Gulf of Mexico, Atlantic, and Great Lakes coastlines, consisting of time-series of high-resolution (≤1 m) orthomosaics and satellite image tiles (10–30 m). Each image, image annotation, and labelled image is available as a single zipped NPZ file. NPZ files follow this naming convention: {datasource}_{numberofclasses}_{threedigitdatasetversion}.zip, where {datasource} is the source of the original images (for example, NAIP, Landsat 8, Sentinel 2), {numberofclasses} is the number of classes us ...
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
[Please use version 1.0.1]
The CloudTracks dataset consists of 1,780 MODIS satellite images hand-labeled for the presence of more than 12,000 ship tracks. More information about how the dataset was constructed may be found at github.com/stanfordmlgroup/CloudTracks. The file structure of the dataset is as follows:
CloudTracks/
    full/
        images/
            mod2002121.1920D.png   (sample image name)
        jsons/
            mod2002121.1920D.json  (sample json name)
The naming convention is as follows:
mod2002121.1920D: the first 3 letters specify which of the sensors on the two MODIS satellites captured the image, mod for Terra and myd for Aqua. This is followed by a 4-digit year (2002) and a 3-digit day of the year (121). The next 4 digits specify the time of day (1920; 24-hour format, UTC), followed by D or N for Day or Night.
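For convenience, a minimal Python sketch of parsing this naming convention (the function name is our own and not part of the dataset tooling):

import re

def parse_cloudtracks_name(name):
    # Parse a CloudTracks image name such as "mod2002121.1920D".
    m = re.match(r"^(mod|myd)(\d{4})(\d{3})\.(\d{4})([DN])$", name)
    if m is None:
        raise ValueError(f"unrecognised name: {name}")
    sensor, year, doy, hhmm, flag = m.groups()
    return {
        "satellite": "Terra" if sensor == "mod" else "Aqua",
        "year": int(year),
        "day_of_year": int(doy),
        "time_utc": f"{hhmm[:2]}:{hhmm[2:]}",
        "day_or_night": "Day" if flag == "D" else "Night",
    }

print(parse_cloudtracks_name("mod2002121.1920D"))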
The 1,780 MODIS Terra and Aqua images were collected between 2002 and 2021 inclusive over various stratocumulus cloud regions (such as the East Pacific and East Atlantic) where ship tracks have commonly been observed. Each image has dimensions of 1354 x 2030 pixels and a spatial resolution of 1 km. Of the 36 bands collected by the instruments, we selected channels 1, 20, and 32 to capture useful physical properties of cloud formations.
The labels are found in the corresponding JSON files for each image. The following keys in the json are particularly important:
imagePath: the filename of the image.
shapes: the list of annotations corresponding to the image, where each element of the list is a dictionary corresponding to a single instance annotation. Each dictionary carries the annotation label, "shiptrack" or "uncertain", together with a linestrip of points detailing the ship track path.
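A minimal sketch for reading one annotation file follows; only imagePath and shapes are confirmed by the description above, and the per-shape keys "label" and "points" are assumed (labelme-style) rather than documented here:

import json

with open("CloudTracks/full/jsons/mod2002121.1920D.json") as f:
    ann = json.load(f)

print("image:", ann["imagePath"])
for shape in ann["shapes"]:
    label = shape.get("label")          # "shiptrack" or "uncertain" (assumed key name)
    points = shape.get("points", [])    # linestrip tracing the ship track path (assumed key name)
    print(label, "with", len(points), "vertices")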
Further pre-processing details may be found at the GitHub link above. If you have any questions about the dataset, contact us at:
mahmedch@stanford.edu, lynakim@stanford.edu, jirvin16@cs.stanford.edu
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Images and 2-class labels for semantic segmentation of Sentinel-2 and Landsat RGB, NIR, and SWIR satellite images of coasts (water, other)
Images and 2-class labels for semantic segmentation of Sentinel-2 and Landsat 5-band (R+G+B+NIR+SWIR) satellite images of coasts (water, other)
Description
3649 images and 3649 associated labels for semantic segmentation of Sentinel-2 and Landsat 5-band (R+G+B+NIR+SWIR) satellite images of coasts. The 2 classes are 1=water, 0=other. Imagery is a mixture of 10-m Sentinel-2 and 15-m pansharpened Landsat 7, 8, and 9 imagery of various sizes. Red, Green, Blue, near-infrared, and short-wave infrared bands only.
These images and labels could be used within numerous Machine Learning frameworks for image segmentation, but have specifically been made for use with the Doodleverse software package, Segmentation Gym**.
Two data sources have been combined:
Dataset 1
* 579 image-label pairs from the following data release**** https://doi.org/10.5281/zenodo.7344571
* Labels have been reclassified from 4 classes to 2 classes.
* Some (422) of these images and labels were originally included in the Coast Train*** data release, and have been modified from their originals by reclassifying the original classes into the present 2 classes.
* These images and labels have been made using the Doodleverse software package, Doodler*.
Dataset 2
File descriptions
References
*Doodler: Buscombe, D., Goldstein, E.B., Sherwood, C.R., Bodine, C., Brown, J.A., Favela, J., Fitzpatrick, S., Kranenburg, C.J., Over, J.R., Ritchie, A.C. and Warrick, J.A., 2021. Human-in-the-Loop Segmentation of Earth Surface Imagery. Earth and Space Science, e2021EA002085. https://doi.org/10.1029/2021EA002085. See https://github.com/Doodleverse/dash_doodler.
**Segmentation Gym: Buscombe, D., & Goldstein, E. B. (2022). A reproducible and reusable pipeline for segmentation of geoscientific imagery. Earth and Space Science, 9, e2022EA002332. https://doi.org/10.1029/2022EA002332. See https://github.com/Doodleverse/segmentation_gym
***Coast Train data release: Wernette, P.A., Buscombe, D.D., Favela, J., Fitzpatrick, S., and Goldstein E., 2022, Coast Train--Labeled imagery for training and evaluation of data-driven models for image segmentation: U.S. Geological Survey data release, https://doi.org/10.5066/P91NP87I. See https://coasttrain.github.io/CoastTrain/ for more information
****Buscombe, Daniel. (2022). Images and 4-class labels for semantic segmentation of Sentinel-2 and Landsat RGB, NIR, and SWIR satellite images of coasts (water, whitewater, sediment, other) (v1.0) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.7344571
*****Seale, C., Redfern, T., Chatfield, P. 2022. Sentinel-2 Water Edges Dataset (SWED) https://openmldata.ukho.gov.uk/
******Seale, C., Redfern, T., Chatfield, P., Luo, C. and Dempsey, K., 2022. Coastline detection in satellite imagery: A deep learning approach on new benchmark data. Remote Sensing of Environment, 278, p.113044.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The folders in labels.zip contain labels for solar panel objects as part of the Solar Panels in Satellite Imagery dataset. The labels are partitioned based on corresponding image type: 31 cm native and 15.5 cm HD resolution imagery. In total, there are 2,542 object labels for each image type, following the same naming convention as the corresponding image chips. The corresponding image chips may be accessed at:
https://resources.maxar.com/product-samples/15-cm-hd-and-30-cm-view-ready-solar-panels-germany
The naming convention for all labels includes the name of the dataset, image type, tile identification number, minimum x bound, minimum y bound, and window size. The minimum bounds correspond to the origin of the chip in the full tile.
Labels are provided in .txt format compatible with the YOLTv4 architecture, where a single row in a label file contains the following information for one solar panel object: category, x-center, y-center, x-width, and y-width. Center and width values are normalized by chip sizes (416 by 416 pixels for native chips and 832 by 832 pixels for HD chips).
The geocoordinates for each solar panel object may be determined using the native resolution labels (found in the labels_native directory). The center and width values for each object, along with the relative location information provided by the naming convention for each label, may be used to determine the pixel coordinates for each object in the full, corresponding native resolution tile. The pixel coordinates may be translated to geocoordinates using the EPSG:32633 coordinate system and the following geotransform for each tile:
Tile 1: (307670.04, 0.31, 0.0, 5434427.100000001, 0.0, -0.31)
Tile 2: (312749.07999999996, 0.31, 0.0, 5403952.860000001, 0.0, -0.31)
Tile 3: (312749.07999999996, 0.31, 0.0, 5363320.540000001, 0.0, -0.31)
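As a rough illustration of the conversion described above, the sketch below maps one normalized native-resolution label to a geocoordinate. The chip origin and object position are hypothetical, and the geotransform is applied in the usual GDAL order (X = GT0 + col*GT1 + row*GT2, Y = GT3 + col*GT4 + row*GT5):

def label_to_geocoords(x_center, y_center, chip_min_x, chip_min_y, geotransform, chip_size=416):
    # Normalized chip coordinates -> pixel coordinates in the full native-resolution tile.
    col = x_center * chip_size + chip_min_x
    row = y_center * chip_size + chip_min_y
    # GDAL-style affine geotransform (EPSG:32633).
    gt0, gt1, gt2, gt3, gt4, gt5 = geotransform
    easting = gt0 + col * gt1 + row * gt2
    northing = gt3 + col * gt4 + row * gt5
    return easting, northing

tile1 = (307670.04, 0.31, 0.0, 5434427.100000001, 0.0, -0.31)
# Hypothetical object at the center of a chip whose origin in tile 1 is pixel (2080, 1664).
print(label_to_geocoords(0.5, 0.5, chip_min_x=2080, chip_min_y=1664, geotransform=tile1))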
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
What this collection is: a curated, binary-classified dataset of grayscale (1-band), 400 x 400-pixel image chips in JPEG format, extracted from processed Sentinel-1 Synthetic Aperture Radar (SAR) satellite scenes acquired over various regions of the world, and featuring chips of clear open ocean, look-alikes (wind or biogenic features), and oil slicks.
This binary dataset contains chips labelled as:
- "0" for chips not containing any oil features (look-alikes or clean seas)
- "1" for those containing oil features
This binary dataset is imbalanced, and biased towards "0" labelled chips (i.e., no oil features), which correspond to 66% of the dataset. Chips containing oil features, labelled "1", correspond to 34% of the dataset.
Why: This dataset can be used for training, validation and/or testing of machine learning, including deep learning, algorithms for the detection of oil features in SAR imagery. Directly applicable for algorithm development for the European Space Agency Sentinel-1 SAR mission (https://sentinel.esa.int/web/sentinel/missions/sentinel-1), it may be suitable for the development of detection algorithms for other SAR satellite sensors.
Overview of this dataset: the total number of chips (both classes) is N=5,630.
Class 0 (no oil features): 3,725 chips
Class 1 (oil features): 1,905 chips
Further information and description is found in the ReadMe file provided (ReadMe_Sentinel1_SAR_OilNoOil_20221215.txt)
EuroSAT
EuroSAT is a benchmark dataset for land use and land cover classification based on Sentinel-2 satellite imagery. It contains 27,000 labeled images covering 10 classes (e.g., agricultural, residential, industrial, and forest areas). The dataset features multi-spectral bands with a spatial resolution of 10 meters per pixel and an image resolution of 64 × 64 pixels.
How to Use This Dataset
from datasets import load_dataset
dataset = load_dataset("GFM-Bench/EuroSAT")  # repository ID taken from the dataset page URL below; a config or split may also need to be specified

See the full description on the dataset page: https://huggingface.co/datasets/GFM-Bench/EuroSAT.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Dynamic World Training Data is a dataset of over 5 billion pixels of human-labeled ESA Sentinel-2 satellite imagery, distributed over 24,000 tiles collected from all over the world. The dataset is designed to train and validate automated land use and land cover mapping algorithms. The 10-m resolution, 5.1 km-by-5.1 km tiles are densely labeled using a ten-category classification schema indicating general land use and land cover categories. The dataset was created between 2019-08-01 and 2020-02-28, using satellite imagery observations from 2019, with approximately 10% of observations extending back to 2017 in very cloudy regions of the world. This dataset is a component of the National Geographic Society - Google - World Resources Institute Dynamic World project. […]
AID is a large-scale aerial image dataset built by collecting sample images from Google Earth imagery. Note that although the Google Earth images are post-processed using RGB renderings from the original optical aerial images, it has been shown that there is no significant difference between Google Earth images and real optical aerial images, even for pixel-level land use/cover mapping. Thus, the Google Earth images can also be used as aerial images for evaluating scene classification algorithms.
The dataset is made up of the following 30 aerial scene types: airport, bare land, baseball field, beach, bridge, center, church, commercial, dense residential, desert, farmland, forest, industrial, meadow, medium residential, mountain, park, parking, playground, pond, port, railway station, resort, river, school, sparse residential, square, stadium, storage tanks and viaduct. All the images are labelled by specialists in the field of remote sensing image interpretation, and some samples of each class are shown in Fig.1. In all, the AID dataset contains 10,000 images across 30 classes.
The images in AID are actually multi-source, as Google Earth images come from different remote imaging sensors. This brings more challenges for scene classification than single-source datasets such as UC-Merced. Moreover, the sample images for each class in AID are carefully chosen from different countries and regions around the world, mainly in China, the United States, England, France, Italy, Japan, Germany, etc., and they are extracted at different times and seasons under different imaging conditions, which increases the intra-class diversity of the data.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
MLRSNet provides different perspectives of the world captured from satellites; that is, it is composed of high spatial resolution optical satellite images. MLRSNet contains 109,161 remote sensing images annotated into 46 categories, and the number of sample images in a category varies from 1,500 to 3,000. The images have a fixed size of 256×256 pixels with various pixel resolutions (~10 m to 0.1 m). Moreover, each image in the dataset is tagged with several of the 60 predefined class labels, and the number of labels associated with each image varies from 1 to 13. The dataset can be used for multi-label image classification, multi-label image retrieval, and image segmentation.
The dataset includes: 1. Images folder: 46 categories, 109,161 high spatial resolution remote sensing images. 2. Labels folder: each category has a .csv file. 3. Categories_names.xlsx: Sheet1 lists the names of the 46 categories, and Sheet2 shows the multi-labels associated with each category.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset provides hand-labeled points for four different land cover classes in coastal areas (sand, seawater, grass, trees).
It was created based on photointerpretation of high-resolution imagery in Google Earth Pro and QGIS, referring to the year 2019.
This dataset was used for the random forest classification of satellite imagery in the following manuscript:
"Satellite image processing for the coarse-scale investigation of sandy coastal areas".
If you use any part of this dataset, please cite as follows:
This layer of the map-based index (GeoIndex) shows satellite data at different resolutions depending on the current map scale. At small scales, it is shown in generalised form with each pixel covering 300 metres, and at larger scales it is shown at its actual resolution of 30 metres. The satellite imagery in GeoIndex was acquired by the Landsat Thematic Mapper sensor between 1984 and 1990. The imagery has been processed by the BGS Remote Sensing Section to increase contrast and thus enhance natural boundaries. Winter imagery was chosen due to the low sun angle, which enables geomorphic features on the landscape to be distinguished and interpreted. The colours in the image are not what one would normally expect to see, because infrared wavelengths were used to extract more geological information than would be possible using only visible bands. To create a single image of the whole country, many smaller images covering different rectangular areas and taken at different dates have been patched together. This will in some cases produce marked changes where the smaller images meet, due to the different conditions when the images were taken.
https://cubig.ai/store/terms-of-service
1) Data Introduction • The Satellite Images of Hurricane Damage Dataset is a binary image classification computer vision dataset based on satellite images taken in Texas, USA, after Hurricane Harvey in 2017. Each image is labeled as either ‘damage’ (indicating structural damage) or ‘no_damage’ (indicating no damage), allowing for automatic identification of building damage in disaster scenarios.
2) Data Utilization (1) Characteristics of the Satellite Images of Hurricane Damage Dataset: • The dataset is composed of real satellite images taken immediately after a natural disaster, providing a realistic and reliable training environment for the development of automated disaster response and recovery systems.
(2) Applications of the Satellite Images of Hurricane Damage Dataset: • Development of disaster damage recognition models: This dataset can be used to train deep learning-based AI models that automatically classify whether buildings have been damaged based on satellite imagery. These models can contribute to decision-making in rescue prioritization and damage extent analysis. • Geospatial risk prediction systems: By integrating with GIS systems, the dataset can help visualize damage-prone areas on maps, supporting real-time decisions and resource allocation optimization during future disasters.
Remotely sensed imagery is increasingly used by emergency managers to monitor and map the impact of flood events to support preparedness, response, and critical decision making throughout the flood event lifecycle. To reduce latency in delivery of imagery-derived information, ensure consistent and reliably derived map products, and facilitate processing of an increasing volume of remote sensing data-streams, automated flood mapping workflows are needed. The U.S. Geological Survey is facilitating the development and integration of machine-learning algorithms in collaboration with NASA, National Geospatial Intelligence Agency (NGA), University of Alabama, and University of Illinois to create a workflow for rapidly generating improved flood-map products. A major bottleneck to the training of robust, generalizable machine learning algorithms for pattern recognition is a lack of training data that is representative across the landscape. To overcome this limitation for the training of algorithms capable of detection of surface inundation in diverse contexts, this publication includes the data developed from MAXAR Worldview sensors that is input as training data for machine learning. This data release consists of 100 thematic rasters, in GeoTiff format, with image labels representing five discrete categories: water, not water, maybe water, clouds and background/no data. Specifically, these training data were created by labeling 8-band, multispectral scenes from the MAXAR-Digital Globe, Worldview-2 and 3 satellite-based sensors. Scenes were selected to be spatially and spectrally diverse and geographically representative of different water features within the continental U.S. The labeling procedures used a hybrid approach of unsupervised classification for the initial spectral clustering, followed by expert-level manual interpretation and QA/QC peer review to finalize each labeled image. Updated versions of the data may be issued along with version update documentation. The 100 raster files that make up the training data are available to download here (https://doi.org/10.5066/P9C7HYRV).
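A minimal sketch for inspecting one of these thematic rasters with rasterio; the local filename is hypothetical, and the integer value assigned to each of the five categories is defined in the data release documentation rather than here:

import numpy as np
import rasterio

# Hypothetical local copy of one of the 100 GeoTiff training rasters.
with rasterio.open("worldview_flood_training_001.tif") as src:
    labels = src.read(1)              # single-band thematic raster
    print(src.crs, src.res)

# Pixel count per label value (water, not water, maybe water, clouds, background/no data).
values, counts = np.unique(labels, return_counts=True)
print(dict(zip(values.tolist(), counts.tolist())))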
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Background: The term rockfall describes the rapid displacement of a large, usually meter-sized block of rock down-slope, triggered by, for example, endogenic or exogenic events like impacts, quakes or rainfall. In a remote sensing context, the term rockfall is also used to describe the characteristic geomorphic deposit of a rockfall event that can be identified from an air- or space-borne perspective, i.e., the combination of a displaced boulder and the track it carved into the slope substrate while bouncing, rolling, and sliding over the surface (also called 'boulder with track' or 'rolling boulder'). In planetary science, the spatial distribution and frequency of rockfalls provide insights into the global erosional state and activity of a planetary body, while their tracks act as tools that allow for the remote estimation of the surface strength properties of yet unexplored regions in preparation of future ground exploration missions, such as the lunar pyroclastic, polar sunlit and permanently shadowed regions of the Moon. Due to their small physical size (meters), the identification and mapping of rockfalls in planetary satellite imagery is challenging and very time-consuming, however. For this reason, Bickel et al. (2018) and Bickel et al. (2020) trained convolutional neural networks to automate rockfall mapping in lunar and martian satellite imagery. Parts of the unpublished datasets used for earlier work have now been complemented with newly labeled data to create a well-balanced dataset of 2,822 lunar and martian rockfall labels (which we call 'RMaM-2020', for [R]ockfall [Ma]rs [M]oon [2020]; 416 MB in total, available here) that can be used for deep learning and other data science applications. Here, balanced means that the labels have been derived from imagery with a wide and continuous range of properties like spatial resolution, solar illumination, and others. So far, this dataset has been used to analyze the benefits of multi-domain learning on rockfall detector performance (Mars & Moon vs. Moon-only or Mars-only), but there are numerous other (non-planetary science) applications such as featurization, feature or target recognition (aircraft/spacecraft autonomy), and data augmentation experiments.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Example observation image: https://vision.eng.au.dk/wp-content/uploads/2020/07/example_obs-1024x206-1024x206.jpg
The CloudCast dataset contains 70,080 cloud-labeled satellite images with 10 different cloud types corresponding to multiple layers of the atmosphere. The raw satellite images come from a satellite constellation in geostationary orbit centred at zero degrees longitude and arrive in 15-minute intervals from the European Organisation for the Exploitation of Meteorological Satellites (EUMETSAT). The resolution of these images is 3712 x 3712 pixels for the full disk of Earth, which implies that every pixel corresponds to an area of 3×3 km. This is the highest possible resolution from European geostationary satellites when including infrared channels. Some pre- and post-processing of the raw satellite images is also done by EUMETSAT before they are exposed to the public, such as removing airplanes. We collect all the raw multispectral satellite images and annotate them individually on a pixel level using a segmentation algorithm. The full dataset then has a spatial resolution of 928 x 1530 pixels recorded at 15-minute intervals for the period 2017-2018, where each pixel represents an area of 3×3 km. To enable standardized datasets for benchmarking computer vision methods, the release also includes a gray-scaled dataset centered and projected over Europe (128×128).
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
If you use this dataset in your research or elsewhere, please cite/reference the following paper: CloudCast: A Satellite-Based Dataset and Baseline for Forecasting Clouds
There are 24 folders in the dataset containing the following information:
| File | Definition | Note |
| --- | --- | --- |
| X.npy | Numpy encoded array containing the actual 128x128 image with pixel values as labels, see below. | |
| GEO.npz | Numpy array containing geo coordinates where the image was taken (latitude and longitude). | |
| TIMESTAMPS.npy | Numpy array containing timestamps for each captured image. | Images are captured in 15-minute intervals. |
0 = No clouds or missing data
1 = Very low clouds
2 = Low clouds
3 = Mid-level clouds
4 = High opaque clouds
5 = Very high opaque clouds
6 = Fractional clouds
7 = High semitransparent thin clouds
8 = High semitransparent moderately thick clouds
9 = High semitransparent thick clouds
10 = High semitransparent above low or medium clouds
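A minimal sketch for loading one folder of the dataset, assuming the file layout listed above; the folder name is hypothetical and the array orientation may differ:

import numpy as np

frames = np.load("2017M01/X.npy")                         # 128 x 128 label images (hypothetical folder name)
timestamps = np.load("2017M01/TIMESTAMPS.npy", allow_pickle=True)
geo = np.load("2017M01/GEO.npz")                          # latitude/longitude arrays

# Pixel count per cloud category (labels 0-10) across the loaded array.
values, counts = np.unique(frames, return_counts=True)
print(dict(zip(values.tolist(), counts.tolist())))
print("first timestamp:", timestamps[0], "| GEO arrays:", list(geo.files))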
Example visualisations: https://i.ibb.co/NFv55QW/cloudcast4.png, https://i.ibb.co/3FhHzMT/cloudcast3.png, https://i.ibb.co/9wCsJhR/cloudcast2.png, https://i.ibb.co/9T5dbSH/cloudcast1.png
The xBD dataset contains over 45,000 km² of polygon-labeled pre- and post-disaster imagery. The dataset provides the post-disaster imagery with building polygons transposed from the pre-disaster imagery, together with damage classification labels.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Effective land use management is crucial for balancing development against environmental sustainability, preservation of biodiversity, and resilience to climate change impacts. Despite this, there is a notable scarcity of comprehensive aerial imagery datasets for refining and improving machine learning frameworks to better inform policy making. In this paper, we introduce a substantial aerial imagery dataset from New Zealand curated for the Waikato region, spanning 25,000 km², specifically to address this gap and empower global research efforts. The dataset comprises a main set of more than 140,000 images and three supplementary sets. Each image in the main dataset is annotated with 33 fine-grained, multi-labeled classes and approximate segmentation masks of those classes, and the three supplementary sets cover spatially coincident satellite imagery, as well as aerial imagery acquired five years before and five years after the main dataset.
This dataset was created by ishiryish
Released under Data files © Original Authors
The HRPlanesv2 dataset contains 2,120 VHR Google Earth images. To further improve experiment results, images of airports from many different regions with various uses (civil/military/joint) were selected and labelled. A total of 14,335 aircraft have been labelled. Each image is stored as a ".jpg" file of 4800 x 2703 pixels, and each label is stored in YOLO ".txt" format. The dataset has been split into three parts: 70% training, 20% validation, and the remaining 10% for testing. In the training and validation sets, each labelled aircraft is at least 80% contained within the image. Link: https://github.com/dilsadunsal/HRPlanesv2-Data-Set
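A minimal sketch, assuming the standard YOLO label layout (class x_center y_center width height, all normalized) and the stated 4800 x 2703 pixel image size; the label filename is hypothetical:

IMG_W, IMG_H = 4800, 2703

def yolo_to_pixel_box(line, img_w=IMG_W, img_h=IMG_H):
    # Convert one normalized YOLO label line to pixel-space corner coordinates.
    cls, xc, yc, w, h = line.split()
    xc, yc, w, h = (float(v) for v in (xc, yc, w, h))
    x_min = (xc - w / 2) * img_w
    y_min = (yc - h / 2) * img_h
    x_max = (xc + w / 2) * img_w
    y_max = (yc + h / 2) * img_h
    return int(cls), (x_min, y_min, x_max, y_max)

with open("airport_0001.txt") as f:       # hypothetical HRPlanesv2 label file
    for line in f:
        print(yolo_to_pixel_box(line))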