5 datasets found

95-Cloud: Cloud Segmentation on Satellite Images
kaggle.com
Updated Apr 12, 2021
Cite
Sorour (2021). 95-Cloud: Cloud Segmentation on Satellite Images [Dataset]. https://www.kaggle.com/sorour/95cloud-cloud-segmentation-on-satellite-images/notebooks
Dataset updated
Apr 12, 2021
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Sorour
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Context

Detection of clouds is an important step in many remote sensing applications that are based on optical imagery. The 95-Cloud dataset is an extensive dataset for this task, intended to help researchers evaluate their deep learning-based cloud segmentation models.
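
Evaluating a cloud segmentation model usually means comparing a predicted binary mask against a ground-truth mask. The sketch below computes the Jaccard index (IoU), precision and recall in plain Python. The choice of metrics and the function name `cloud_mask_scores` are assumptions made for illustration; the dataset page does not prescribe an evaluation protocol.

```python
def cloud_mask_scores(pred, truth):
    """Return (iou, precision, recall) for two flat binary masks (1 = cloud).

    Both arguments are flat sequences of 0/1 values of equal length;
    a 2-D array can be flattened first (e.g. mask.ravel() with NumPy).
    """
    pred, truth = list(pred), list(truth)
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")

    tp = sum(1 for p, t in zip(pred, truth) if p and t)        # cloud in both
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)    # false alarm
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)    # missed cloud

    def ratio(num, den):
        # Convention (an assumption): an empty denominator scores 1.0,
        # e.g. a cloud-free scene correctly predicted as cloud-free.
        return num / den if den else 1.0

    return ratio(tp, tp + fp + fn), ratio(tp, tp + fp), ratio(tp, tp + fn)


iou, precision, recall = cloud_mask_scores([1, 1, 0, 0], [1, 0, 1, 0])
```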

Content

The 95-Cloud dataset is an extension of our previous 38-Cloud dataset. 95-Cloud has 57 more Landsat 8 scenes for "training", which are uploaded here. The rest of the training scenes and the test scenes can be downloaded from here.

More information about the dataset can be found at:
https://github.com/SorourMo/95-Cloud-An-Extension-to-38-Cloud-Dataset
https://github.com/SorourMo/38-Cloud-A-Cloud-Segmentation-Dataset
https://github.com/SorourMo/Cloud-Net-A-semantic-segmentation-CNN-for-cloud-detection

Acknowledgements

This dataset has been prepared by the Laboratory for Robotics Vision (LRV) at the School of Engineering Science, Simon Fraser University, Vancouver, Canada.
Sentinel-2 Cloud Mask Catalogue
zenodo.org
data.niaid.nih.gov
csv, pdf, zip
Updated Jul 19, 2024
Cite
Alistair Francis; John Mrziglod; Panagiotis Sidiropoulos; Jan-Peter Muller (2024). Sentinel-2 Cloud Mask Catalogue [Dataset]. http://doi.org/10.5281/zenodo.4172871
Unique identifier
https://doi.org/10.5281/zenodo.4172871
Dataset updated
Jul 19, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Alistair Francis; John Mrziglod; Panagiotis Sidiropoulos; Jan-Peter Muller
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Overview

This dataset comprises cloud masks for 513 subscenes of 1022-by-1022 pixels, at 20 m resolution, sampled randomly from the 2018 Level-1C Sentinel-2 archive. The design of this dataset follows from some observations about cloud masking: (i) performance over an entire product is highly correlated, thus subscenes provide more value per pixel than full scenes, (ii) current cloud masking datasets often focus on specific regions, or hand-select the products used, which introduces a bias into the dataset that is not representative of real-world data, (iii) cloud mask performance appears to be highly correlated with surface type and cloud structure, so testing should include analysis of failure modes in relation to these variables.
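
To put those figures in perspective, the following sketch works out the ground footprint and pixel counts implied by 513 subscenes of 1022-by-1022 pixels at 20 m resolution. The function name `subscene_stats` is hypothetical; the inputs come straight from the description above.

```python
def subscene_stats(n_subscenes=513, side_pixels=1022, resolution_m=20):
    """Derive footprint and pixel counts from the stated dataset dimensions."""
    side_km = side_pixels * resolution_m / 1000      # 20.44 km per side
    pixels_per_mask = side_pixels ** 2               # 1,044,484 pixels
    return {
        "side_km": side_km,
        "area_km2": side_km ** 2,                    # about 417.8 km^2 per subscene
        "pixels_per_mask": pixels_per_mask,
        "total_pixels": n_subscenes * pixels_per_mask,   # 535,820,292 pixels
        "total_area_km2": n_subscenes * side_km ** 2,    # about 214,328 km^2
    }


stats = subscene_stats()
```

Each mask therefore covers a square of roughly 20 km on a side, and the whole catalogue holds a little over half a billion labelled pixels.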

The data was annotated semi-automatically, using the IRIS toolkit, which allows users to dynamically train a Random Forest (implemented using LightGBM), speeding up annotation by iteratively improving its predictions while preserving the annotator's ability to make final manual changes when needed. This hybrid approach allowed us to process many more masks than would have been possible manually, which we felt was vital in creating a large enough dataset to approximate the statistics of the whole Sentinel-2 archive.

In addition to the pixel-wise, 3 class (CLEAR, CLOUD, CLOUD_SHADOW) segmentation masks, we also provide users with binary classification "tags" for each subscene that can be used in testing to determine performance in specific circumstances. These include:

SURFACE TYPE: 11 categories

CLOUD TYPE: 7 categories

CLOUD HEIGHT: low, high

CLOUD THICKNESS: thin, thick

CLOUD EXTENT: isolated, extended

Wherever practical, cloud shadows were also annotated; however, this was sometimes not possible due to high-relief terrain or large ambiguities. In total, 424 subscenes were marked with shadows (if present), and 89 have shadows that could not be annotated because of very ambiguous shadow boundaries or terrain that cast significant shadows. Users who wish to train an algorithm specifically for cloud shadow masks are advised to remove the 89 subscenes for which shadow annotation was not possible. Bear in mind, however, that this will systematically reduce the difficulty of the shadow class compared to real-world use, as these subscenes contain the most difficult shadow examples.
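
Separating the shadow-annotated subscenes from the rest could look like the sketch below. It assumes the per-subscene tags have already been loaded as a list of dictionaries and that a boolean field marks whether shadows were annotated; the field name `shadows_marked` and the record layout are hypothetical, so check the README for the actual file and column names.

```python
def split_by_shadow_annotation(records, flag_key="shadows_marked"):
    """Split tag records into (annotated, unannotated) lists.

    `records` is an iterable of dicts, one per subscene.
    `flag_key` is a hypothetical field that is truthy when cloud
    shadows were annotated for that subscene.
    """
    annotated, unannotated = [], []
    for record in records:
        target = annotated if record[flag_key] else unannotated
        target.append(record)
    return annotated, unannotated


# Illustrative records only; the scene names are made up.
records = [
    {"scene": "subscene_a", "shadows_marked": True},
    {"scene": "subscene_b", "shadows_marked": False},
    {"scene": "subscene_c", "shadows_marked": True},
]
train_shadow, held_out = split_by_shadow_annotation(records)
```

Keeping the unannotated subscenes in a separate list, rather than discarding them, makes it easy to still use them for the CLEAR and CLOUD classes.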

In addition to the 20m sampled subscenes and masks, we also provide users with shapefiles that define the boundary of the mask on the original Sentinel-2 scene. If users wish to retrieve the L1C bands at their original resolutions, they can use these to do so.

Please see the README for further details on the dataset structure and more.

Contributions & Acknowledgements

The data were collected, annotated, checked, formatted and published by Alistair Francis and John Mrziglod.

Support and advice were provided by Prof. Jan-Peter Muller and Dr. Panagiotis Sidiropoulos, for which we are grateful.

We would like to extend our thanks to Dr. Pierre-Philippe Mathieu and the rest of the team at ESA PhiLab, who provided the environment in which this project was conceived, and continued to give technical support throughout.

Finally, we thank the ESA Network of Resources for sponsoring this project by providing ICT resources.
MARIDA: Marine Debris Archive
zenodo.org
data.niaid.nih.gov
zip
Updated Jan 23, 2022
Cite
Katerina Kikaki; Ioannis Kakogeorgiou; Paraskevi Mikeli; Dionysios E. Raitsos; Konstantinos Karantzalos (2022). MARIDA: Marine Debris Archive [Dataset]. http://doi.org/10.5281/zenodo.5151941
Unique identifier
https://doi.org/10.5281/zenodo.5151941
Dataset updated
Jan 23, 2022
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Katerina Kikaki; Ioannis Kakogeorgiou; Paraskevi Mikeli; Dionysios E. Raitsos; Konstantinos Karantzalos
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
MARIne Debris Archive (MARIDA) is a marine debris-oriented dataset on Sentinel-2 satellite images. It also includes various sea features that co-exist. MARIDA is primarily focused on the weakly supervised pixel-level semantic segmentation task.

Citation: Kikaki K, Kakogeorgiou I, Mikeli P, Raitsos DE, Karantzalos K (2022) MARIDA: A benchmark for Marine Debris detection from Sentinel-2 remote sensing data. PLoS ONE 17(1): e0262247. https://doi.org/10.1371/journal.pone.0262247

For the quick start guide visit marine-debris.github.io

The dataset contains:

i. 1381 patches (256 x 256) structured by Unique Dates and S2 Tiles. Each patch is provided along with the corresponding masks of pixel-level annotated classes (*_cl) and confidence levels (*_conf). Patches are given in GeoTiff format.

ii. Shapefile data in WGS 84 / UTM projection, with the file naming convention s2_dd-mm-yy_ttt, where s2 denotes the S2 sensor, dd the day, mm the month, yy the year and ttt the S2 tile. Shapefiles include the class of each annotation along with the confidence level and the marine debris report description.

iii. Train, Validation and Test split for evaluating machine learning algorithms.

iv. The assigned multi-labels for each patch (labels_mapping.txt).
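
The s2_dd-mm-yy_ttt naming scheme described in item ii can be parsed with a small helper, sketched below. It assumes that two-digit years belong to the 2000s (Sentinel-2 launched in 2015), that day and month may appear with one or two digits, and that the tile is an alphanumeric code; the function name and the example stem are hypothetical.

```python
import re
from datetime import date

# s2_<day>-<month>-<two-digit year>_<tile>, matched case-insensitively.
_STEM_PATTERN = re.compile(
    r"^s2_(\d{1,2})-(\d{1,2})-(\d{2})_([A-Za-z0-9]+)$", re.IGNORECASE
)


def parse_shapefile_stem(stem):
    """Return (acquisition_date, tile) parsed from a stem like 's2_dd-mm-yy_ttt'."""
    match = _STEM_PATTERN.match(stem)
    if match is None:
        raise ValueError(f"unexpected file name: {stem!r}")
    day, month, year, tile = match.groups()
    # Assumption: two-digit years refer to 20yy.
    return date(2000 + int(year), int(month), int(day)), tile.upper()


acquired, tile = parse_shapefile_stem("s2_12-01-19_16PCC")  # made-up example stem
```

Grouping patches by the returned date and tile mirrors the "Unique Dates and S2 Tiles" structure mentioned in item i.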

The mapping between Digital Numbers and Classes is:

1: Marine Debris
2: Dense Sargassum
3: Sparse Sargassum
4: Natural Organic Material
5: Ship
6: Clouds
7: Marine Water
8: Sediment-Laden Water
9: Foam
10: Turbid Water
11: Shallow Water
12: Waves
13: Cloud Shadows
14: Wakes
15: Mixed Water

The mapping between Digital Numbers and Confidence level is:

1: High
2: Moderate
3: Low

The mapping between Digital Numbers and marine debris Report existence is:

1: Very close
2: Away
3: No
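
The three lookup tables above translate directly into dictionaries. The sketch below also counts how often each class occurs in a flattened mask. It treats any Digital Number that is not in the class table (for example 0) as unlabelled and skips it, which is an assumption based on the weakly supervised nature of the annotations rather than something stated above; the names are hypothetical.

```python
from collections import Counter

CLASS_NAMES = {
    1: "Marine Debris", 2: "Dense Sargassum", 3: "Sparse Sargassum",
    4: "Natural Organic Material", 5: "Ship", 6: "Clouds",
    7: "Marine Water", 8: "Sediment-Laden Water", 9: "Foam",
    10: "Turbid Water", 11: "Shallow Water", 12: "Waves",
    13: "Cloud Shadows", 14: "Wakes", 15: "Mixed Water",
}
CONFIDENCE_LEVELS = {1: "High", 2: "Moderate", 3: "Low"}
REPORT_EXISTENCE = {1: "Very close", 2: "Away", 3: "No"}


def class_histogram(mask_values):
    """Count pixels per class name in a flat sequence of Digital Numbers.

    Values missing from CLASS_NAMES (assumed to be unlabelled pixels)
    are ignored.
    """
    counts = Counter(int(value) for value in mask_values)
    return {
        CLASS_NAMES[dn]: n for dn, n in sorted(counts.items()) if dn in CLASS_NAMES
    }


histogram = class_histogram([0, 1, 1, 7, 7, 7, 13])
# histogram == {"Marine Debris": 2, "Marine Water": 3, "Cloud Shadows": 1}
```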

The final uncompressed dataset requires 4.38 GB of storage.
NOAA GOES-16
kaggle.com
zip
Updated Aug 30, 2019
Cite
NOAA (2019). NOAA GOES-16 [Dataset]. https://www.kaggle.com/noaa/goes16
Dataset updated
Aug 30, 2019
Dataset provided by
National Oceanic and Atmospheric Administration (http://www.noaa.gov/)
Authors
NOAA
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Overview

The Geostationary Operational Environmental Satellite-R Series (GOES-R) is the next generation of geostationary weather satellites. The GOES-R series will significantly improve the detection and observation of environmental phenomena that directly affect public safety, protection of property and our nation’s economic health and prosperity.

The GOES-16 satellite, known as GOES-R prior to launch, is the first satellite in the series. It will provide images of weather patterns and severe storms as frequently as every 30 seconds, which will contribute to more accurate and reliable weather forecasts and severe weather outlooks.

Content

The raw dataset includes a feed of the Advanced Baseline Imager (ABI) radiance data (Level 1b) and Cloud and Moisture Imager (CMI) products (Level 2) which are freely available through the NOAA Big Data Project.

Querying BigQuery tables

You can use the BigQuery Python client library to query tables in this dataset in Kernels. Note that methods available in Kernels are limited to querying data. Tables are at bigquery-public-data.noaa_goes16.[TABLENAME]. Fork this kernel to get started and learn how to safely manage the analysis of large BigQuery datasets.
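
A minimal sketch of such a query with the `google-cloud-bigquery` client follows. The dataset path `bigquery-public-data.noaa_goes16` and the table name `abi_l1b_radiance` are assumptions, so substitute the names shown in the BigQuery console. Because `LIMIT` does not reduce the bytes BigQuery scans, the sketch sets `maximum_bytes_billed` as a safety cap, which is one way to handle large tables safely. Running it requires the client library, pandas and valid credentials.

```python
import re


def build_metadata_query(table, limit=10, dataset="bigquery-public-data.noaa_goes16"):
    """Build a small preview query; the dataset and table names are assumptions."""
    if not re.fullmatch(r"[A-Za-z0-9_]+", table):
        raise ValueError(f"invalid table name: {table!r}")
    if limit <= 0:
        raise ValueError("limit must be positive")
    return f"SELECT * FROM `{dataset}.{table}` LIMIT {int(limit)}"


def run_query(sql, max_gb=1.0):
    """Run the query with a cap on billed bytes and return a DataFrame."""
    # Imported lazily so the query builder works without the client installed.
    from google.cloud import bigquery

    client = bigquery.Client()
    job_config = bigquery.QueryJobConfig(maximum_bytes_billed=int(max_gb * 1e9))
    # The job fails instead of running if it would bill more than the cap.
    return client.query(sql, job_config=job_config).to_dataframe()


sql = build_metadata_query("abi_l1b_radiance", limit=5)
# df = run_query(sql)  # uncomment in an authenticated environment
```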

Acknowledgments

The NOAA Big Data Project (BDP) is an experimental collaboration between NOAA and infrastructure-as-a-service (IaaS) providers to explore methods of expanding the accessibility of NOAA’s data in order to facilitate innovation and collaboration. The goal of this approach is to help form new lines of business and economic growth while making NOAA's data more discoverable for the American public.

Key metadata for this dataset has been extracted into convenient BigQuery tables (one each for L1b radiance, L2 CMIP, and L2 MCMIP). These tables can be used to query metadata in order to filter the data down to only a subset of raw netcdf4 files available in Google Cloud Storage.
NOAA Geostationary Operational Environmental Satellites (GOES) 16, 17, 18 &...
registry.opendata.aws
Updated Apr 4, 2025
Cite
NOAA (2025). NOAA Geostationary Operational Environmental Satellites (GOES) 16, 17, 18 & 19 [Dataset]. https://registry.opendata.aws/noaa-goes/
Dataset updated
Apr 4, 2025
Dataset provided by
National Oceanic and Atmospheric Administration (http://www.noaa.gov/)
Description

NEW GOES-19 Data!! On April 4, 2025 at 1500 UTC, the GOES-19 satellite will be declared the Operational GOES-East satellite. All products and services, including NODD, for GOES-East will transition to GOES-19 data at that time. GOES-19 will operate out of the GOES-East location of 75.2°W starting on April 1, 2025 and through the operational transition. Until the transition time and during the final stretch of Post Launch Product Testing (PLPT), GOES-19 products are considered non-operational regardless of their validation maturity level. Shortly following the transition of GOES-19 to GOES-East, all data distribution from GOES-16 will be turned off. GOES-16 will drift to the storage location at 104.7°W. GOES-19 data should begin flowing again on April 4th once this maneuver is complete.

NEW GOES 16 Reprocess Data!! The reprocessed GOES-16 ABI L1b data mitigates systematic data issues (including data gaps and image artifacts) seen in the Operational products, and improves the stability of both the radiometric and geometric calibration over the course of the entire mission life. These data were produced by recomputing the L1b radiance products from input raw L0 data using improved calibration algorithms and look-up tables, derived from data analysis of the NIST-traceable, on-board sources. In addition, the reprocessed data products contain enhancements to the L1b file format, including limb pixels and pixel timestamps, while maintaining compatibility with the operational products. The datasets currently available span the operational life of GOES-16 ABI, from early 2018 through the end of 2024. The Reprocessed L1b dataset shows improvement over the Operational L1b products but may still contain data gaps or discrepancies. Please provide feedback to Dan Lindsey (dan.lindsey@noaa.gov) and Gary Lin (guoqing.lin-1@nasa.gov). More information can be found in the GOES-R ABI Reprocess User Guide.

NOTICE: As of January 10th 2023, GOES-18 assumed the GOES-West position and all data files are deemed both operational and provisional, so no ‘preliminary, non-operational’ caveat is needed. GOES-17 is now offline and has been shifted to approximately 105 degrees West, where it will remain in on-orbit storage. GOES-17 data will no longer flow into the GOES-17 bucket. Operational GOES-West products can be found in the GOES-18 bucket.

GOES satellites (GOES-16, GOES-17, GOES-18 & GOES-19) provide continuous weather imagery and monitoring of meteorological and space environment data across North America. GOES satellites provide the kind of continuous monitoring necessary for intensive data analysis. They hover continuously over one position on the surface. The satellites orbit high enough to allow for a full-disc view of the Earth. Because they stay above a fixed spot on the surface, they provide a constant vigil for the atmospheric "triggers" for severe weather conditions such as tornadoes, flash floods, hailstorms, and hurricanes. When these conditions develop, the GOES satellites are able to monitor storm development and track storm movements. SUVI products are available in both NetCDF and FITS formats.
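
When browsing the per-satellite buckets, it helps to build the object key prefix for a given product and hour. The sketch below assumes bucket names of the form `noaa-goes<NN>` and a key layout of `<product>/<year>/<day of year>/<hour>/`; neither is stated in the description above, so verify both against the registry entry. The product code `ABI-L1b-RadC` is used only as an example.

```python
from datetime import datetime, timezone


def goes_bucket(satellite_number):
    """Return the assumed bucket name for a satellite, e.g. 19 -> 'noaa-goes19'."""
    return f"noaa-goes{int(satellite_number)}"


def goes_key_prefix(product, when):
    """Return '<product>/<YYYY>/<DDD>/<HH>/' for the given moment in UTC.

    Naive datetimes are assumed to already be in UTC.
    """
    if when.tzinfo is not None:
        when = when.astimezone(timezone.utc)
    # %j is the zero-padded day of year, %H the zero-padded hour.
    return f"{product}/{when:%Y}/{when:%j}/{when:%H}/"


# Hour of the GOES-19 operational transition mentioned above.
transition = datetime(2025, 4, 4, 15, tzinfo=timezone.utc)
prefix = goes_key_prefix("ABI-L1b-RadC", transition)
# prefix == "ABI-L1b-RadC/2025/094/15/"
```

The resulting bucket and prefix can be passed to any S3 client to list the files for that hour.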