5 datasets found
  1. 95-Cloud: Cloud Segmentation on Satellite Images

    • kaggle.com
    Updated Apr 12, 2021
    Cite
    Sorour (2021). 95-Cloud: Cloud Segmentation on Satellite Images [Dataset]. https://www.kaggle.com/sorour/95cloud-cloud-segmentation-on-satellite-images/notebooks
    Dataset updated
    Apr 12, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Sorour
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    Cloud detection is an important step in many remote sensing applications based on optical imagery. The 95-Cloud dataset is an extensive dataset for this task, intended to help researchers evaluate their deep learning-based cloud segmentation models.

    Content

    The 95-Cloud dataset is an extension of our previous 38-Cloud dataset. 95-Cloud has 57 more Landsat 8 scenes for "training" which are uploaded here. The rest of the training scenes and the test scenes can be downloaded from here.

    More information about the dataset can be found at:

    • https://github.com/SorourMo/95-Cloud-An-Extension-to-38-Cloud-Dataset
    • https://github.com/SorourMo/38-Cloud-A-Cloud-Segmentation-Dataset
    • https://github.com/SorourMo/Cloud-Net-A-semantic-segmentation-CNN-for-cloud-detection

    Acknowledgements

    This dataset has been prepared by the Laboratory for Robotics Vision (LRV) at the School of Engineering Science, Simon Fraser University, Vancouver, Canada.

  2. Sentinel-2 Cloud Mask Catalogue

    • zenodo.org
    • data.niaid.nih.gov
    csv, pdf, zip
    Updated Jul 19, 2024
    Cite
    Alistair Francis; John Mrziglod; Panagiotis Sidiropoulos; Jan-Peter Muller (2024). Sentinel-2 Cloud Mask Catalogue [Dataset]. http://doi.org/10.5281/zenodo.4172871
    Available download formats: pdf, zip, csv
    Dataset updated
    Jul 19, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Alistair Francis; John Mrziglod; Panagiotis Sidiropoulos; Jan-Peter Muller
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Overview

    This dataset comprises cloud masks for 513 subscenes of 1022-by-1022 pixels, at 20m resolution, sampled randomly from the 2018 Level-1C Sentinel-2 archive. The design of this dataset follows from some observations about cloud masking: (i) performance over an entire product is highly correlated, thus subscenes provide more value per-pixel than full scenes, (ii) current cloud masking datasets often focus on specific regions, or hand-select the products used, which introduces a bias into the dataset that is not representative of real-world data, (iii) cloud mask performance appears to be highly correlated to surface type and cloud structure, so testing should include analysis of failure modes in relation to these variables.

    The data was annotated semi-automatically, using the IRIS toolkit, which allows users to dynamically train a Random Forest (implemented using LightGBM), speeding up annotation by iteratively improving its predictions while preserving the annotator's ability to make final manual changes when needed. This hybrid approach allowed us to process many more masks than would have been possible manually, which we felt was vital in creating a large enough dataset to approximate the statistics of the whole Sentinel-2 archive.

    In addition to the pixel-wise, 3-class (CLEAR, CLOUD, CLOUD_SHADOW) segmentation masks, we also provide users with binary classification "tags" for each subscene that can be used in testing to determine performance in specific circumstances. These include:

    • SURFACE TYPE: 11 categories
    • CLOUD TYPE: 7 categories
    • CLOUD HEIGHT: low, high
    • CLOUD THICKNESS: thin, thick
    • CLOUD EXTENT: isolated, extended

    Wherever practical, cloud shadows were also annotated. However, this was sometimes not possible due to high-relief terrain or large ambiguities. In total, 424 subscenes were marked with shadows (if present), and 89 have shadows that could not be annotated because of very ambiguous shadow boundaries or terrain that cast significant shadows. If users wish to train an algorithm specifically for cloud shadow masks, we advise them to remove those 89 images for which shadow annotation was not possible. Bear in mind, however, that this will systematically reduce the difficulty of the shadow class compared to real-world use, as these images contain the most difficult shadow examples.
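
    The filtering advice above can be sketched in Python as follows. This is only an illustration: the record layout, the identifiers, and the field name `shadows_marked` are hypothetical, and the real column names are defined in the dataset's README and CSV files.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subscene:
    scene_id: str         # hypothetical identifier, not a real product name
    shadows_marked: bool  # hypothetical flag: True if shadows were annotated where present


def shadow_training_subset(subscenes):
    """Keep only the subscenes whose cloud shadows were annotated."""
    return [s for s in subscenes if s.shadows_marked]


# Toy catalogue standing in for the 513 real subscenes.
catalogue = [
    Subscene("subscene_a", True),
    Subscene("subscene_b", False),
    Subscene("subscene_c", True),
]
subset = shadow_training_subset(catalogue)
print([s.scene_id for s in subset])
```

    Applied to the real catalogue, a filter of this kind would be expected to keep 424 subscenes and drop 89, at the cost of removing the hardest shadow cases from training.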

    In addition to the 20m sampled subscenes and masks, we also provide users with shapefiles that define the boundary of the mask on the original Sentinel-2 scene. If users wish to retrieve the L1C bands at their original resolutions, they can use these to do so.

    Please see the README for further details on the dataset structure and more.

    Contributions & Acknowledgements

    The data were collected, annotated, checked, formatted and published by Alistair Francis and John Mrziglod.

    Support and advice were provided by Prof. Jan-Peter Muller and Dr. Panagiotis Sidiropoulos, for which we are grateful.

    We would like to extend our thanks to Dr. Pierre-Philippe Mathieu and the rest of the team at ESA PhiLab, who provided the environment in which this project was conceived, and continued to give technical support throughout.

    Finally, we thank the ESA Network of Resources for sponsoring this project by providing ICT resources.

  3. MARIDA: Marine Debris Archive

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Jan 23, 2022
    Cite
    Katerina Kikaki; Ioannis Kakogeorgiou; Paraskevi Mikeli; Dionysios E. Raitsos; Konstantinos Karantzalos (2022). MARIDA: Marine Debris Archive [Dataset]. http://doi.org/10.5281/zenodo.5151941
    Available download formats: zip
    Dataset updated
    Jan 23, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Katerina Kikaki; Ioannis Kakogeorgiou; Paraskevi Mikeli; Dionysios E. Raitsos; Konstantinos Karantzalos
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    MARIne Debris Archive (MARIDA) is a marine debris-oriented dataset on Sentinel-2 satellite images. It also includes various sea features that co-exist. MARIDA is primarily focused on the weakly supervised pixel-level semantic segmentation task.

    Citation: Kikaki K, Kakogeorgiou I, Mikeli P, Raitsos DE, Karantzalos K (2022) MARIDA: A benchmark for Marine Debris detection from Sentinel-2 remote sensing data. PLoS ONE 17(1): e0262247. https://doi.org/10.1371/journal.pone.0262247

    For the quick start guide visit marine-debris.github.io

    The dataset contains:

    i. 1381 patches (256 x 256) structured by Unique Dates and S2 Tiles. Each patch is provided along with the corresponding masks of pixel-level annotated classes (*_cl) and confidence levels (*_conf). Patches are given in GeoTiff format.

    ii. Shapefile data in WGS 84 / UTM projection, with file naming convention following the scheme: s2_dd-mm-yy_ttt, where s2 denotes the S2 sensor, dd denotes the day, mm the month, yy the year and ttt denotes the S2 tile. Shapefiles include the class of each annotation along with the confidence level and the marine debris report description.

    iii. Train, Validation and Test split for evaluating machine learning algorithms.

    iv. The assigned multi-labels for each patch (labels_mapping.txt).
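
    The s2_dd-mm-yy_ttt naming scheme in item ii can be parsed with a short Python sketch like the one below. The example name is made up for illustration, the two-digit year is assumed to fall in the 2000s, and the pattern accepts one- or two-digit days and months as a precaution. None of these choices are confirmed by the dataset description.

```python
import re
from datetime import date
from typing import NamedTuple


class ShapefileName(NamedTuple):
    acquired: date  # acquisition date decoded from dd-mm-yy
    tile: str       # Sentinel-2 tile identifier


# Pattern for the scheme s2_dd-mm-yy_ttt described in the text.
_PATTERN = re.compile(
    r"^s2_(\d{1,2})-(\d{1,2})-(\d{2})_([0-9A-Za-z]+)$", re.IGNORECASE
)


def parse_marida_name(stem: str) -> ShapefileName:
    """Split a file stem (no extension) into acquisition date and tile."""
    match = _PATTERN.match(stem)
    if match is None:
        raise ValueError(f"not a s2_dd-mm-yy_ttt name: {stem!r}")
    day, month, year, tile = match.groups()
    # Assumption: two-digit years belong to the 2000s.
    return ShapefileName(date(2000 + int(year), int(month), int(day)), tile.upper())


# Hypothetical example name, not taken from the archive.
parsed = parse_marida_name("s2_12-12-20_16PCC")
print(parsed.acquired.isoformat(), parsed.tile)
```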

    The mapping between Digital Numbers and Classes is:

    1: Marine Debris
    2: Dense Sargassum
    3: Sparse Sargassum
    4: Natural Organic Material
    5: Ship
    6: Clouds
    7: Marine Water
    8: Sediment-Laden Water
    9: Foam
    10: Turbid Water
    11: Shallow Water
    12: Waves
    13: Cloud Shadows
    14: Wakes
    15: Mixed Water

    The mapping between Digital Numbers and Confidence level is:

    1: High
    2: Moderate
    3: Low

    The mapping between Digital Numbers and marine debris Report existence is:

    1: Very close
    2: Away
    3: No
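
    The Digital Number mappings above translate directly into lookup tables. The Python sketch below counts pixels per class in a small made-up mask. Real masks are GeoTiff files and would need a raster reader, and treating values outside 1 to 15 (such as 0) as "Unlabeled" is an assumption, not something stated in the description.

```python
from collections import Counter

# Digital Number -> class name, copied from the listing above.
MARIDA_CLASSES = {
    1: "Marine Debris", 2: "Dense Sargassum", 3: "Sparse Sargassum",
    4: "Natural Organic Material", 5: "Ship", 6: "Clouds",
    7: "Marine Water", 8: "Sediment-Laden Water", 9: "Foam",
    10: "Turbid Water", 11: "Shallow Water", 12: "Waves",
    13: "Cloud Shadows", 14: "Wakes", 15: "Mixed Water",
}

# Digital Number -> confidence level, copied from the listing above.
CONFIDENCE_LEVELS = {1: "High", 2: "Moderate", 3: "Low"}


def class_histogram(mask):
    """Count pixels per class name in a 2-D mask given as nested lists."""
    histogram = Counter()
    for row in mask:
        for dn in row:
            # Assumption: any value without a mapping is an unlabeled pixel.
            histogram[MARIDA_CLASSES.get(dn, "Unlabeled")] += 1
    return dict(histogram)


# Toy 3x3 mask, not real data.
toy_mask = [
    [0, 7, 7],
    [1, 7, 6],
    [0, 1, 13],
]
print(class_histogram(toy_mask))
```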

    The final uncompressed dataset requires 4.38 GB of storage.

  4. NOAA GOES-16

    • kaggle.com
    zip
    Updated Aug 30, 2019
    Cite
    NOAA (2019). NOAA GOES-16 [Dataset]. https://www.kaggle.com/noaa/goes16
    Available download formats: zip (0 bytes)
    Dataset updated
    Aug 30, 2019
    Dataset provided by
    National Oceanic and Atmospheric Administration (http://www.noaa.gov/)
    Authors
    NOAA
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Overview

    The Geostationary Operational Environmental Satellite-R Series (GOES-R) is the next generation of geostationary weather satellites. The GOES-R series will significantly improve the detection and observation of environmental phenomena that directly affect public safety, protection of property and our nation’s economic health and prosperity.

    The GOES-16 satellite, known as GOES-R prior to launch, is the first satellite in the series. It will provide images of weather patterns and severe storms as frequently as every 30 seconds, which will contribute to more accurate and reliable weather forecasts and severe weather outlooks.

    Content

    The raw dataset includes a feed of the Advanced Baseline Imager (ABI) radiance data (Level 1b) and Cloud and Moisture Imager (CMI) products (Level 2) which are freely available through the NOAA Big Data Project.

    Querying BigQuery tables

    You can use the BigQuery Python client library to query tables in this dataset in Kernels. Note that methods available in Kernels are limited to querying data. Tables are at bigquery-public-data.noaa_goes16.[TABLENAME]. Fork this kernel to get started and learn how to safely manage the analysis of large BigQuery datasets.

    Acknowledgments

    The NOAA Big Data Project (BDP) is an experimental collaboration between NOAA and infrastructure-as-a-service (IaaS) providers to explore methods of expanding the accessibility of NOAA’s data in order to facilitate innovation and collaboration. The goal of this approach is to help form new lines of business and economic growth while making NOAA's data more discoverable for the American public.

    Sample images: https://storage.googleapis.com/public-dataset-images/noaa-goes-16-sample.png

    Key metadata for this dataset has been extracted into convenient BigQuery tables (one each for L1b radiance, L2 CMIP, and L2 MCMIP). These tables can be used to query metadata in order to filter the data down to only a subset of raw netcdf4 files available in Google Cloud Storage.
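
    A minimal Python sketch of previewing one of these metadata tables with the BigQuery client library follows. The table ID is an assumption that should be confirmed in the BigQuery console, and no column names are used because the description does not list them. Running `preview_metadata` requires the `google-cloud-bigquery` package and valid credentials; the query builder itself has no dependencies.

```python
def build_metadata_query(table: str, limit: int = 10) -> str:
    """Return a SQL string that previews rows from one metadata table."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return f"SELECT * FROM `{table}` LIMIT {limit}"


def preview_metadata(table: str, limit: int = 10):
    """Run the preview query and return the rows as a list."""
    # Imported lazily so the query builder works without the client installed.
    from google.cloud import bigquery

    client = bigquery.Client()
    return list(client.query(build_metadata_query(table, limit)).result())


# Assumed table ID for the L1b radiance metadata; verify before use.
ASSUMED_TABLE = "bigquery-public-data.noaa_goes16.abi_l1b_radiance"
print(build_metadata_query(ASSUMED_TABLE, limit=5))
```

    Keeping a LIMIT on exploratory queries is one simple way to keep results small, although it does not by itself reduce the amount of data BigQuery scans.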

  5. NOAA Geostationary Operational Environmental Satellites (GOES) 16, 17, 18 &...

    • registry.opendata.aws
    Updated Apr 4, 2025
    Cite
    NOAA (2025). NOAA Geostationary Operational Environmental Satellites (GOES) 16, 17, 18 & 19 [Dataset]. https://registry.opendata.aws/noaa-goes/
    Dataset updated
    Apr 4, 2025
    Dataset provided by
    National Oceanic and Atmospheric Administration (http://www.noaa.gov/)
    Description



    NEW GOES-19 Data!! On April 4, 2025 at 1500 UTC, the GOES-19 satellite will be declared the Operational GOES-East satellite. All products and services, including NODD, for GOES-East will transition to GOES-19 data at that time. GOES-19 will operate out of the GOES-East location of 75.2°W starting on April 1, 2025 and through the operational transition. Until the transition time and during the final stretch of Post Launch Product Testing (PLPT), GOES-19 products are considered non-operational regardless of their validation maturity level. Shortly following the transition of GOES-19 to GOES-East, all data distribution from GOES-16 will be turned off. GOES-16 will drift to the storage location at 104.7°W. GOES-19 data should begin flowing again on April 4th once this maneuver is complete.

    NEW GOES 16 Reprocess Data!! The reprocessed GOES-16 ABI L1b data mitigates systematic data issues (including data gaps and image artifacts) seen in the Operational products, and improves the stability of both the radiometric and geometric calibration over the course of the entire mission life. These data were produced by recomputing the L1b radiance products from input raw L0 data using improved calibration algorithms and look-up tables, derived from data analysis of the NIST-traceable, on-board sources. In addition, the reprocessed data products contain enhancements to the L1b file format, including limb pixels and pixel timestamps, while maintaining compatibility with the operational products. The datasets currently available span the operational life of GOES-16 ABI, from early 2018 through the end of 2024. The Reprocessed L1b dataset shows improvement over the Operational L1b products but may still contain data gaps or discrepancies. Please provide feedback to Dan Lindsey (dan.lindsey@noaa.gov) and Gary Lin (guoqing.lin-1@nasa.gov). More information can be found in the GOES-R ABI Reprocess User Guide.


    NOTICE: As of January 10th 2023, GOES-18 assumed the GOES-West position and all data files are deemed both operational and provisional, so no ‘preliminary, non-operational’ caveat is needed. GOES-17 is now offline and has been shifted to approximately 105 degrees West, where it will remain in on-orbit storage. GOES-17 data will no longer flow into the GOES-17 bucket. Operational GOES-West products can be found in the GOES-18 bucket.

    GOES satellites (GOES-16, GOES-17, GOES-18 & GOES-19) provide continuous weather imagery and monitoring of meteorological and space environment data across North America. GOES satellites provide the kind of continuous monitoring necessary for intensive data analysis. They hover continuously over one position on the surface. The satellites orbit high enough to allow for a full-disc view of the Earth. Because they stay above a fixed spot on the surface, they provide a constant vigil for the atmospheric "triggers" for severe weather conditions such as tornadoes, flash floods, hailstorms, and hurricanes. When these conditions develop, the GOES satellites are able to monitor storm development and track its movement. SUVI products are available in both NetCDF and FITS formats.
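
    As a rough Python sketch of locating files in the per-satellite buckets mentioned above, the helper below builds an S3 prefix for one hour of data. The bucket naming (`noaa-goes16` through `noaa-goes19`), the key layout Product/Year/Day-of-year/Hour, and the product code `ABI-L1b-RadC` are assumptions not stated in this description and should be checked against the registry page before use.

```python
from datetime import datetime, timezone

SATELLITES = (16, 17, 18, 19)


def goes_prefix(satellite: int, product: str, when: datetime) -> str:
    """Build the assumed S3 prefix for one hour of one product."""
    if satellite not in SATELLITES:
        raise ValueError(f"unknown GOES satellite: {satellite}")
    if when.tzinfo is None:
        raise ValueError("pass a timezone-aware datetime")
    when = when.astimezone(timezone.utc)
    day_of_year = when.timetuple().tm_yday
    # Assumed layout: <bucket>/<product>/<year>/<day of year>/<hour>/
    return (
        f"s3://noaa-goes{satellite}/{product}/"
        f"{when.year}/{day_of_year:03d}/{when.hour:02d}/"
    )


# GOES-19 at the announced GOES-East transition time (April 4, 2025, 1500 UTC).
prefix = goes_prefix(19, "ABI-L1b-RadC", datetime(2025, 4, 4, 15, tzinfo=timezone.utc))
print(prefix)
```

    A prefix of this kind could then be passed to any S3 listing tool to enumerate the NetCDF files for that hour.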


95-Cloud: Cloud Segmentation on Satellite Images

A dataset for detection of clouds in optical satellite (Landsat 8) imagery
