License: CC0 1.0 Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
Detection of clouds is an important step in many remote sensing applications based on optical imagery. The 95-Cloud dataset is an extensive dataset for this task, intended to help researchers evaluate their deep learning-based cloud segmentation models.
The 95-Cloud dataset is an extension of our previous 38-Cloud dataset. 95-Cloud adds 57 more Landsat 8 scenes for training, which are uploaded here. The rest of the training scenes and the test scenes can be downloaded from here.
More information about the dataset can be found at:
https://github.com/SorourMo/95-Cloud-An-Extension-to-38-Cloud-Dataset
https://github.com/SorourMo/38-Cloud-A-Cloud-Segmentation-Dataset
https://github.com/SorourMo/Cloud-Net-A-semantic-segmentation-CNN-for-cloud-detection
This dataset has been prepared by the Laboratory for Robotics Vision (LRV) at the School of Engineering Science, Simon Fraser University, Vancouver, Canada.
License: Attribution 4.0 International (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/ (license information was derived automatically)
Overview
This dataset comprises cloud masks for 513 subscenes of 1022-by-1022 pixels, at 20m resolution, sampled at random from the 2018 Level-1C Sentinel-2 archive. The design of this dataset follows from some observations about cloud masking: (i) performance over an entire product is highly correlated, so subscenes provide more value per pixel than full scenes; (ii) current cloud masking datasets often focus on specific regions, or hand-select the products used, which introduces a bias into the dataset that is not representative of real-world data; (iii) cloud mask performance appears to be highly correlated with surface type and cloud structure, so testing should include analysis of failure modes in relation to these variables.
The data were annotated semi-automatically using the IRIS toolkit, which allows users to dynamically train a Random Forest (implemented using LightGBM), speeding up annotation by iteratively improving its predictions while preserving the annotator's ability to make final manual changes when needed. This hybrid approach allowed us to process many more masks than would have been possible manually, which we felt was vital in creating a dataset large enough to approximate the statistics of the whole Sentinel-2 archive.
In addition to the pixel-wise, 3-class (CLEAR, CLOUD, CLOUD_SHADOW) segmentation masks, we also provide users with binary classification "tags" for each subscene that can be used in testing to determine performance in specific circumstances. These include:
Wherever practical, cloud shadows were also annotated; however, this was sometimes not possible due to high-relief terrain or large ambiguities. In total, 424 subscenes were marked with shadows (if present), and 89 have shadows that were not annotatable due to very ambiguous shadow boundaries or terrain that casts significant shadows. If users wish to train an algorithm specifically for cloud shadow masks, we advise them to remove those 89 images for which shadow annotation was not possible; however, bear in mind that this will systematically reduce the difficulty of the shadow class compared to real-world use, as these contain the most difficult shadow examples.
In addition to the 20m sampled subscenes and masks, we also provide users with shapefiles that define the boundary of the mask on the original Sentinel-2 scene. If users wish to retrieve the L1C bands at their original resolutions, they can use these to do so.
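As a rough illustration of how the 3-class masks might be consumed, the Python sketch below computes per-class pixel fractions for a single subscene. The integer class encoding and the way the mask is loaded are assumptions for illustration only; consult the README for the actual file layout and encoding.

import numpy as np

# Assumed class encoding, for illustration only -- check the README for the
# dataset's actual mask format and class values.
CLASSES = {0: "CLEAR", 1: "CLOUD", 2: "CLOUD_SHADOW"}

def class_fractions(mask: np.ndarray) -> dict:
    """Fraction of pixels in each class for one 1022-by-1022 mask."""
    total = mask.size
    return {name: float(np.sum(mask == value)) / total
            for value, name in CLASSES.items()}

# Synthetic stand-in for a real mask array loaded from the dataset.
mask = np.random.randint(0, 3, size=(1022, 1022))
print(class_fractions(mask))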
Please see the README for further details on the dataset structure and more.
Contributions & Acknowledgements
The data were collected, annotated, checked, formatted and published by Alistair Francis and John Mrziglod.
Support and advice were provided by Prof. Jan-Peter Muller and Dr. Panagiotis Sidiropoulos, for which we are grateful.
We would like to extend our thanks to Dr. Pierre-Philippe Mathieu and the rest of the team at ESA PhiLab, who provided the environment in which this project was conceived, and continued to give technical support throughout.
Finally, we thank the ESA Network of Resources for sponsoring this project by providing ICT resources.
License: Attribution 4.0 International (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/ (license information was derived automatically)
The MARIne Debris Archive (MARIDA) is a marine debris-oriented dataset built on Sentinel-2 satellite images. It also includes various co-existing sea features. MARIDA is primarily focused on the weakly supervised pixel-level semantic segmentation task.
Citation: Kikaki K, Kakogeorgiou I, Mikeli P, Raitsos DE, Karantzalos K (2022) MARIDA: A benchmark for Marine Debris detection from Sentinel-2 remote sensing data. PLoS ONE 17(1): e0262247. https://doi.org/10.1371/journal.pone.0262247
For the quick start guide visit marine-debris.github.io
The dataset contains:
i. 1381 patches (256 x 256 pixels) structured by unique dates and S2 tiles. Each patch is provided along with the corresponding masks of pixel-level annotated classes (*_cl) and confidence levels (*_conf). Patches are given in GeoTiff format.
ii. Shapefile data in WGS'84 / UTM projection, with a file naming convention following the scheme s2_dd-mm-yy_ttt, where s2 denotes the S2 sensor, dd the day, mm the month, yy the year, and ttt the S2 tile. Shapefiles include the class of each annotation along with the confidence level and the marine debris report description.
iii. Train, Validation and Test split for evaluating machine learning algorithms.
iv. The assigned multi-labels for each patch (labels_mapping.txt).
The mapping between Digital Numbers and Classes is:
1: Marine Debris
2: Dense Sargassum
3: Sparse Sargassum
4: Natural Organic Material
5: Ship
6: Clouds
7: Marine Water
8: Sediment-Laden Water
9: Foam
10: Turbid Water
11: Shallow Water
12: Waves
13: Cloud Shadows
14: Wakes
15: Mixed Water
The mapping between Digital Numbers and Confidence level is:
1: High
2: Moderate
3: Low
The mapping between Digital Numbers and marine debris Report existence is:
1: Very close
2: Away
3: No
The final uncompressed dataset requires 4.38 GB of storage.
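As a convenience, the digital-number mappings above can be held directly in code. The sketch below reproduces the class mapping and parses the documented s2_dd-mm-yy_ttt shapefile naming convention; the example file name is hypothetical.

from datetime import datetime

# Class mapping reproduced from the list above.
MARIDA_CLASSES = {
    1: "Marine Debris", 2: "Dense Sargassum", 3: "Sparse Sargassum",
    4: "Natural Organic Material", 5: "Ship", 6: "Clouds", 7: "Marine Water",
    8: "Sediment-Laden Water", 9: "Foam", 10: "Turbid Water",
    11: "Shallow Water", 12: "Waves", 13: "Cloud Shadows", 14: "Wakes",
    15: "Mixed Water",
}

def parse_shapefile_name(name: str) -> dict:
    # Naming convention: s2_dd-mm-yy_ttt (sensor, date, S2 tile).
    sensor, date_str, tile = name.split("_")
    return {"sensor": sensor,
            "date": datetime.strptime(date_str, "%d-%m-%y").date(),
            "tile": tile}

# Hypothetical example name following the documented convention.
print(parse_shapefile_name("s2_14-9-18_16PCC"))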
License: CC0 1.0 Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
The Geostationary Operational Environmental Satellite-R Series (GOES-R) is the next generation of geostationary weather satellites. The GOES-R series will significantly improve the detection and observation of environmental phenomena that directly affect public safety, protection of property and our nation’s economic health and prosperity.
The GOES-16 satellite, known as GOES-R prior to launch, is the first satellite in the series. It will provide images of weather patterns and severe storms as frequently as every 30 seconds, which will contribute to more accurate and reliable weather forecasts and severe weather outlooks.
The raw dataset includes a feed of the Advanced Baseline Imager (ABI) radiance data (Level 1b) and Cloud and Moisture Imager (CMI) products (Level 2) which are freely available through the NOAA Big Data Project.
You can use the BigQuery Python client library to query tables in this dataset in Kernels. Note that methods available in Kernels are limited to querying data. Tables are at bigquery-public-data.noaa_goes16.[TABLENAME]. Fork this kernel to get started and learn how to safely manage analyzing large BigQuery datasets.
The NOAA Big Data Project (BDP) is an experimental collaboration between NOAA and infrastructure-as-a-service (IaaS) providers to explore methods of expanding the accessibility of NOAA's data in order to facilitate innovation and collaboration. The goal of this approach is to help form new lines of business and economic growth while making NOAA's data more discoverable for the American public.
Sample images: https://storage.googleapis.com/public-dataset-images/noaa-goes-16-sample.png
Key metadata for this dataset has been extracted into convenient BigQuery tables (one each for L1b radiance, L2 CMIP, and L2 MCMIP). These tables can be used to query metadata in order to filter the data down to only a subset of raw netcdf4 files available in Google Cloud Storage.
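For example, a minimal sketch of querying one of these metadata tables with the BigQuery Python client might look like the following; the table name bigquery-public-data.noaa_goes16.abi_l1b_radiance is an assumption based on the description above, so verify the exact dataset and table names in BigQuery before running.

from google.cloud import bigquery  # requires google-cloud-bigquery and GCP credentials

client = bigquery.Client()

# Assumed metadata table for the L1b radiance products; confirm the actual
# dataset/table names in the BigQuery console before running.
query = """
    SELECT *
    FROM `bigquery-public-data.noaa_goes16.abi_l1b_radiance`
    LIMIT 5
"""
for row in client.query(query).result():
    print(dict(row))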
NEW GOES-19 Data!! On April 4, 2025 at 1500 UTC, the GOES-19 satellite will be declared the Operational GOES-East satellite. All products and services, including NODD, for GOES-East will transition to GOES-19 data at that time. GOES-19 will operate out of the GOES-East location of 75.2°W starting on April 1, 2025 and through the operational transition. Until the transition time and during the final stretch of Post Launch Product Testing (PLPT), GOES-19 products are considered non-operational regardless of their validation maturity level. Shortly following the transition of GOES-19 to GOES-East, all data distribution from GOES-16 will be turned off. GOES-16 will drift to the storage location at 104.7°W. GOES-19 data should begin flowing again on April 4th once this maneuver is complete.
NEW GOES 16 Reprocess Data!! The reprocessed GOES-16 ABI L1b data mitigates systematic data issues (including data gaps and image artifacts) seen in the Operational products, and improves the stability of both the radiometric and geometric calibration over the course of the entire mission life. These data were produced by recomputing the L1b radiance products from input raw L0 data using improved calibration algorithms and look-up tables, derived from data analysis of the NIST-traceable, on-board sources. In addition, the reprocessed data products contain enhancements to the L1b file format, including limb pixels and pixel timestamps, while maintaining compatibility with the operational products. The datasets currently available span the operational life of GOES-16 ABI, from early 2018 through the end of 2024. The Reprocessed L1b dataset shows improvement over the Operational L1b products but may still contain data gaps or discrepancies. Please provide feedback to Dan Lindsey (dan.lindsey@noaa.gov) and Gary Lin (guoqing.lin-1@nasa.gov). More information can be found in the GOES-R ABI Reprocess User Guide.
NOTICE: As of January 10th, 2023, GOES-18 assumed the GOES-West position and all data files are deemed both operational and provisional, so no 'preliminary, non-operational' caveat is needed. GOES-17 is now offline and has drifted to approximately 105 degrees West, where it will be in on-orbit storage. GOES-17 data will no longer flow into the GOES-17 bucket. Operational GOES-West products can be found in the GOES-18 bucket.
GOES satellites (GOES-16, GOES-17, GOES-18 & GOES-19) provide continuous weather imagery and monitoring of meteorological and space environment data across North America. GOES satellites provide the kind of continuous monitoring necessary for intensive data analysis. They hover continuously over one position on the surface. The satellites orbit high enough to allow for a full-disc view of the Earth. Because they stay above a fixed spot on the surface, they provide a constant vigil for the atmospheric "triggers" for severe weather conditions such as tornadoes, flash floods, hailstorms, and hurricanes. When these conditions develop, the GOES satellites are able to monitor storm development and track storm movement. SUVI products are available in both NetCDF and FITS formats.