https://www.apache.org/licenses/LICENSE-2.0.html
These data accompany the 2018 manuscript published in PLOS One titled "Mapping the yearly extent of surface coal mining in Central Appalachia using Landsat and Google Earth Engine". In this manuscript, researchers used the Google Earth Engine platform and freely accessible Landsat imagery to create a yearly dataset (1985 through 2015) of surface coal mining in the Appalachian region of the United States of America. This specific dataset is a GeoTIFF file depicting when an area was first mined, from the period 1985 through 2015. The raster values depict the year that mining was first detected by the paper's processing model. A year of "1984" indicates mining that likely started at some point prior to 1985. These pre-1985 mining data are derived from a prior study; see https://skytruth.org/wp/wp-content/uploads/2017/03/SkyTruth-MTR-methodology.pdf for more information. This dataset does not indicate for how long an area was a mine or when mining ceased in a given area.
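For illustration, once this GeoTIFF has been uploaded as an Earth Engine asset, the year values can be visualized and queried with a few lines of Earth Engine JavaScript. This is only a sketch; the asset path below is a placeholder, not an ID published with the paper.

// Placeholder asset ID -- substitute the path of your own ingested copy of the GeoTIFF.
var firstMined = ee.Image('users/your_account/appalachia_first_mined_year');
// Pixel values are years (1984-2015); 1984 flags mining that likely began before 1985.
Map.addLayer(firstMined.selfMask(), {min: 1984, max: 2015, palette: ['440154', 'fde725']}, 'Year first mined');
// Example query: areas first mined in or after 2000.
var minedSince2000 = firstMined.gte(2000).selfMask();
Map.addLayer(minedSince2000, {palette: ['red']}, 'First mined 2000 or later');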
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
GEE-TED: A tsetse ecological distribution model for Google Earth Engine

Please refer to the associated publication: Fox, L., Peter, B.G., Frake, A.N. and Messina, J.P., 2023. A Bayesian maximum entropy model for predicting tsetse ecological distributions. International Journal of Health Geographics, 22(1), p.31. https://link.springer.com/article/10.1186/s12942-023-00349-0

Description
GEE-TED is a Google Earth Engine (GEE; Gorelick et al. 2017) adaptation of a tsetse ecological distribution (TED) model developed by DeVisser et al. (2010), which was designed for use in ESRI's ArcGIS. TED uses time-series climate and land-use/land-cover (LULC) data to predict the probability of tsetse presence across space based on species habitat preferences (in this case Glossina morsitans). Model parameterization includes (1) day and night temperatures (MODIS Land Surface Temperature; MOD11A2), (2) available moisture/humidity using a vegetation index as a proxy (MODIS NDVI; MOD13Q1), (3) LULC (MODIS Land Cover Type 1; MCD12Q1), (4) year selections, and (5) fly movement rate (meters/16-days). TED has also been used as a basis for the development of an agent-based model by Lin et al. (2015) and in a cost-benefit analysis of tsetse control in Tanzania by Yang et al. (2017).

Parameterization in Fox et al. (2023)
Suitable LULC types and climate thresholds used here are specific to Glossina morsitans in Kenya and are based on the parameterization selections in DeVisser et al. (2010) and DeVisser and Messina (2009). Suitable temperatures range from 17–40°C during the day and 10–40°C at night, and available moisture is characterized as NDVI > 0.39. Suitable LULC comprises predominantly woody vegetation; a complete list of suitable categories is available in DeVisser and Messina (2009). In the Fox et al. (2023) publication, two versions of MCD12Q1 were used to assess suitable LULC types: Versions 051 and 006. The GeoTIFF supplied in this dataset entry (GEE-TED_Kenya_2016-2017.tif) uses the aforementioned parameters to show the probable tsetse distribution across Kenya for the years 2016–2017. A static graphic of this GEE-TED output is shown below and an interactive version can be viewed at: https://cartoscience.users.earthengine.app/view/gee-ted.

Figure associated with Fox et al. (2023)

GEE code
The code supplied below is generalizable across geographies and species; however, it is highly recommended that parameterization is given considerable attention to produce reliable results. Note that on-the-fly output visualization will take some time, and it is recommended that results be exported as an asset within GEE or exported as a GeoTIFF. Note: Since completing the Fox et al. (2023) manuscript, GEE has removed Version 051 per NASA's deprecation of the product. The current release of GEE-TED now uses only MCD12Q1 Version 006; however, alternative LULC data selections can be used with minimal modification to the code.
// Input options
var tempMin = 10             // Temperature thresholds in degrees Celsius
var tempMax = 40
var ndviMin = 0.39           // NDVI thresholds; proxy for available moisture/humidity
var ndviMax = 1
var movement = 500           // Fly movement rate in meters/16-days
var startYear = 2008         // The first 2 years will be used for model initialization
var endYear = 2019           // Computed probability is based on startYear+2 to endYear
var country = 'KE'           // Country codes - https://en.wikipedia.org/wiki/List_of_FIPS_country_codes
var crs = 'EPSG:32737'       // See https://epsg.io/ for appropriate country UTM zone
var rescale = 250            // Output spatial resolution
var labelSuffix = '02052020' // For file export labeling only

//                   [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17] MODIS/006/MCD12Q1
var lulcOptions006 = [1,1,1,1,1,1,1,1,1, 0, 1, 0, 0, 0, 0, 0, 0] // 1 = suitable, 0 = unsuitable

// No more input required ------------------------------ //

var region = ee.FeatureCollection("USDOS/LSIB_SIMPLE/2017")
  .filterMetadata('country_co', 'equals', country)

// Input parameter modifications
var tempMinMod = (tempMin+273.15)/0.02
var tempMaxMod = (tempMax+273.15)/0.02
var ndviMinMod = ndviMin*10000
var ndviMaxMod = ndviMax*10000
var ndviResolution = 250
var movementRate = movement+(ndviResolution/2)

// Loading image collections
var lst = ee.ImageCollection('MODIS/006/MOD11A2').select('LST_Day_1km', 'LST_Night_1km')
  .filter(ee.Filter.calendarRange(startYear,endYear,'year'))
var ndvi = ee.ImageCollection('MODIS/006/MOD13Q1').select('NDVI')
  .filter(ee.Filter.calendarRange(startYear,endYear,'year'))
var lulc006 = ee.ImageCollection('MODIS/006/MCD12Q1').select('LC_Type1')

// LULC mode and boolean reclassification
var lulcMask = lulc006.mode().remap([1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17],lulcOptions006)
  .eq(1).rename('remapped').clip(region)

// Merge NDVI and LST image collections
var combined = ndvi.combine(lst, true)
var combinedList = combined.toList(10000)

// Boolean reclassifications (suitable/unsuitable) for day/night temperatures and NDVI
var con = ...
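As noted above, on-the-fly visualization is slow and exporting is recommended. A minimal export sketch follows; 'tedProbability' is a placeholder name for the final output image produced by the full script (the script above is truncated before that variable is defined), not a variable from the original code.

// Export the final GEE-TED probability surface; 'tedProbability' is a placeholder variable name.
Export.image.toDrive({
  image: tedProbability,
  description: 'GEE-TED_' + country + '_' + labelSuffix,
  region: region.geometry(),
  scale: rescale,
  crs: crs,
  maxPixels: 1e13
});
// Alternatively, Export.image.toAsset keeps the result inside GEE for faster display.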
https://www.apache.org/licenses/LICENSE-2.0.html
These data accompany the 2018 manuscript published in PLOS One titled "Mapping the yearly extent of surface coal mining in Central Appalachia using Landsat and Google Earth Engine". In this manuscript, researchers used the Google Earth Engine platform and freely accessible Landsat imagery to create a yearly dataset (1985 through 2015) of surface coal mining in the Appalachian region of the United States of America. This specific dataset is a GeoTIFF file depicting when an area was most recently mined, from the period 1985 through 2015. The raster values depict the year that mining was most recently detected by the paper's processing model. A year of "1984" indicates an area that was likely last mined at some point prior to 1985. These pre-1985 mining data are derived from a prior study; see https://skytruth.org/wp/wp-content/uploads/2017/03/SkyTruth-MTR-methodology.pdf for more information. This dataset does not indicate for how long an area was a mine or when mining began in a given area.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A continuous dataset of Land Surface Temperature (LST) is vital for climatological and environmental studies. LST can be regarded as a combination of seasonal mean temperature (climatology) and daily anomaly, which is attributed mainly to the synoptic-scale atmospheric circulation (weather). To reproduce LST in cloudy pixels, time series (2002-2019) of cloud-free 1km MODIS Aqua LST images were generated and the pixel-based seasonality (climatology) was calculated using temporal Fourier analysis. To add the anomaly, we used the NCEP Climate Forecast System Version 2 (CFSv2) model, which provides air surface temperature under both cloudy and clear sky conditions. The combination of the two sources of data enables the estimation of LST in cloudy pixels.
Data structure
The dataset consists of geo-located continuous LST (Day, Night and Daily) that includes LST values for cloudy pixels. The spatial domain of the data is the Eastern Mediterranean, at the resolution of the MYD11A1 product (~1 km). Data are stored in GeoTIFF format as signed 16-bit integers using a scale factor of 0.02, with one file per day, each containing four bands (Night LST Cont., Day LST Cont., Daily Average LST Cont., QA). The QA band stores information about the presence of cloud in the original pixel: a QA value of 0 indicates NoData (due to clouds) in both the original Day LST and Night LST, 1 indicates NoData in the original Day LST, 2 indicates NoData in the original Night LST, and 3 indicates valid data for both day and night. File names follow this naming convention: LST_
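As an illustration of the band order, scale factor, and QA convention described above, the following Earth Engine sketch assumes one daily file has been ingested as an asset (the asset ID is a placeholder):

// Placeholder asset ID for one ingested daily LSTcont file.
var lstCont = ee.Image('users/your_account/LSTcont_example_day');
// Band order per file: Night LST Cont., Day LST Cont., Daily Average LST Cont., QA.
var dayLst = lstCont.select(1).multiply(0.02);   // apply the 0.02 scale factor (Kelvin, per the MODIS LST convention)
var qa = lstCont.select(3);
// QA == 3 means both the original Day and Night LST pixels were cloud-free;
// values 0-2 mark pixels that were reconstructed for day, night, or both.
var reconstructedOnly = dayLst.updateMask(qa.lt(3));
Map.addLayer(dayLst, {min: 270, max: 330}, 'Day LST Cont. (K)');
Map.addLayer(reconstructedOnly, {min: 270, max: 330}, 'Reconstructed (originally cloudy) Day LST');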
The file LSTcont_validation.tif contains the validation dataset, providing the MAE, RMSE, and Pearson correlation (r) of the comparison with true LST. Data are stored in GeoTIFF format as 32-bit floating point values, with the same spatial extent and resolution as the LSTcont dataset. The data are stored in one file containing three bands (MAE, RMSE, and Pearson_r). The same data with the same structure are also provided in NetCDF format.
How to use
The data can be read in various programming languages such as Python, IDL and MATLAB, and can be visualized in a GIS program such as ArcGIS or QGIS. A short animation demonstrating how to visualize the data using the open-source program QGIS is available in the project's GitHub code repository.
Web application
The LSTcont web application (https://shilosh.users.earthengine.app/view/continuous-lst) is an Earth Engine app. The interface includes a map and a date picker. The user can select a date (July 2002 – present) and visualize LSTcont for that day anywhere on the globe. The web app calculates LSTcont on the fly based on ready-made global climatological files. The LSTcont can be downloaded as a GeoTIFF with 5 bands in the following order: Mean daily LSTcont, Night original LST, Night LSTcont, Day original LST, Day LSTcont.
Code availability
Datasets for other regions can be easily produced on the GEE platform with the code provided in the project's GitHub code repository.
Monthly Aggregated NEX-GDDP Ensemble Climate Projections: Historical (1985–2005) and RCP 4.5 and RCP 8.5 (2006–2080)

This dataset is a monthly-scale aggregation of the NEX-GDDP: NASA Earth Exchange Global Daily Downscaled Climate Projections, processed using Google Earth Engine (Gorelick et al., 2017). The native delivery on Google Earth Engine is at the daily timescale for each individual CMIP5 GCM model. This dataset was created to facilitate use of NEX-GDDP and reduce processing times for projects that seek an ensemble model with a coarser temporal resolution. The aggregated data have been made available in Google Earth Engine via 'users/cartoscience/GCM_NASA-NEX-GDDP/NEX-GDDP-PRODUCT-ID_Ensemble-Monthly_YEAR' (see code below on how to access), and all 171 GeoTIFFs have been uploaded to this dataverse entry.

Relevant links:
https://www.nasa.gov/nex
https://www.nccs.nasa.gov/services/data-collections/land-based-products/nex-gddp
https://esgf.nccs.nasa.gov/esgdoc/NEX-GDDP_Tech_Note_v0.pdf
https://developers.google.com/earth-engine/datasets/catalog/NASA_NEX-GDDP
https://journals.ametsoc.org/view/journals/bams/93/4/bams-d-11-00094.1.xml
https://rd.springer.com/article/10.1007/s10584-011-0156-z#page-1

The dataset can be accessed within Google Earth Engine using the following code:

var histYears = ee.List.sequence(1985,2005).getInfo()
var rcpYears = ee.List.sequence(2006,2080).getInfo()
var path1 = 'users/cartoscience/GCM_NASA-NEX-GDDP/NEX-GDDP-'
var path2 = '_Ensemble-Monthly_'
var product

product = 'Hist'
var hist = ee.ImageCollection(
  histYears.map(function(y) { return ee.Image(path1+product+path2+y) })
)
product = 'RCP45'
var rcp45 = ee.ImageCollection(
  rcpYears.map(function(y) { return ee.Image(path1+product+path2+y) })
)
product = 'RCP85'
var rcp85 = ee.ImageCollection(
  rcpYears.map(function(y) { return ee.Image(path1+product+path2+y) })
)

print(
  'Hist (1985–2005)', hist,
  'RCP45 (2006–2080)', rcp45,
  'RCP85 (2006–2080)', rcp85
)

var first = hist.first()
var tMin = first.select('tasmin_1')  // corrected so tMin/tMax match the tasmin/tasmax bands
var tMax = first.select('tasmax_1')
var tMean = first.select('tmean_1')
var pSum = first.select('pr_1')

Map.addLayer(tMin, {min: -10, max: 40}, 'Average min temperature Jan 1985 (Hist)', false)
Map.addLayer(tMax, {min: 10, max: 40}, 'Average max temperature Jan 1985 (Hist)', false)
Map.addLayer(tMean, {min: 10, max: 40}, 'Average temperature Jan 1985 (Hist)', false)
Map.addLayer(pSum, {min: 10, max: 500}, 'Accumulated rainfall Jan 1985 (Hist)', true)

https://code.earthengine.google.com/5bfd9741274679dded7a95d1b57ca51d

Ensemble average based on the following models:
ACCESS1-0, BNU-ESM, CCSM4, CESM1-BGC, CNRM-CM5, CSIRO-Mk3-6-0, CanESM2, GFDL-CM3, GFDL-ESM2G, GFDL-ESM2M, IPSL-CM5A-LR, IPSL-CM5A-MR, MIROC-ESM-CHEM, MIROC-ESM, MIROC5, MPI-ESM-LR, MPI-ESM-MR, MRI-CGCM3, NorESM1-M, bcc-csm1-1, inmcm4

Each annual GeoTIFF contains 48 bands (4 variables across 12 months):
Temperature: Monthly mean (tasmin, tasmax, tmean)
Precipitation: Monthly sum (pr)

Bands 1–48 correspond with: tasmin_1, tasmax_1, tmean_1, pr_1, tasmin_2, tasmax_2, tmean_2, pr_2, tasmin_3, tasmax_3, tmean_3, pr_3, tasmin_4, tasmax_4, tmean_4, pr_4, tasmin_5, tasmax_5, tmean_5, pr_5, tasmin_6, tasmax_6, tmean_6, pr_6, tasmin_7, tasmax_7, tmean_7, pr_7, tasmin_8, tasmax_8, tmean_8, pr_8, tasmin_9, tasmax_9, tmean_9, pr_9, tasmin_10, tasmax_10, tmean_10, pr_10, tasmin_11, tasmax_11, tmean_11, pr_11, tasmin_12, tasmax_12, tmean_12, pr_12

*Gorelick, N., Hancher, M., Dixon, M., Ilyushchenko, S., Thau, D. and Moore, R., 2017.
Google Earth Engine: Planetary-scale geospatial analysis for everyone. Remote Sensing of Environment, 202, pp.18–27.

Project information:
SEAGUL: Southeast Asia Globalization, Urbanization, Land and Environment Changes
http://seagul.info/
https://lcluc.umd.edu/projects/divergent-local-responses-globalization-urbanization-land-transition-and-environmental
This project was made possible by the NASA Land-Cover/Land-Use Change Program (Grant #: 80NSSC20K0740)
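As a usage illustration that continues from the access code above (this snippet is not part of the original script), individual months can be pulled out of the ensemble by band name and compared across scenarios:

// Mean July temperature over the historical period versus the RCP 8.5 period.
var julyHist = hist.select('tmean_7').mean();
var julyRcp85 = rcp85.select('tmean_7').mean();
var julyChange = julyRcp85.subtract(julyHist);
Map.addLayer(julyChange, {min: 0, max: 6, palette: ['white', 'orange', 'red']}, 'July tmean change (RCP 8.5 minus Hist)');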
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset and the validation are fully described in a Nature Scientific Data Descriptor https://www.nature.com/articles/s41597-019-0265-5
If you want to use this dataset in an interactive environment, then use this link https://mybinder.org/v2/gh/GeographerAtLarge/TravelTime/HEAD
The following text is a summary of the information in the above Data Descriptor.
The dataset is a suite of global travel-time accessibility indicators for the year 2015, at approximately one-kilometre spatial resolution for the entire globe. The indicators show an estimated (and validated), land-based travel time to the nearest city and nearest port for a range of city and port sizes.
The datasets are in GeoTIFF format and are suitable for use in Geographic Information Systems and statistical packages for mapping access to cities and ports and for spatial and statistical analysis of the inequalities in access by different segments of the population.
These maps represent a unique global representation of physical access to essential services offered by cities and ports.
The datasets
travel_time_to_cities_x.tif (where x has values from 1 to 12)
The value of each pixel is the estimated travel time in minutes to the nearest urban area in 2015. There are 12 data layers based on different sets of urban areas, defined by their population in year 2015 (see PDF report).
travel_time_to_ports_x (x ranges from 1 to 5)
The value of each pixel is the estimated travel time to the nearest port in 2015. There are 5 data layers based on different port sizes.
Format Raster Dataset, GeoTIFF, LZW compressed
Unit Minutes
Data type 16-bit unsigned integer (UInt16)
No data value 65535
Flags None
Spatial resolution 30 arc seconds
Spatial extent
Upper left -180, 85
Lower left -180, -60
Upper right 180, 85
Lower right 180, -60
Spatial Reference System (SRS) EPSG:4326 - WGS84 - Geographic Coordinate System (lat/long)
Temporal resolution 2015
Temporal extent 2015. Updates may follow for future years, but these are dependent on the availability of updated inputs on travel times and city locations and populations.
Methodology Travel time to the nearest city or port was estimated using an accumulated cost function (accCost) in the gdistance R package (van Etten, 2018). This function requires two input datasets: (i) a set of locations to estimate travel time to and (ii) a transition matrix that represents the cost or time to travel across a surface.
The set of locations were based on populated urban areas in the 2016 version of the Joint Research Centre’s Global Human Settlement Layers (GHSL) datasets (Pesaresi and Freire, 2016) that represent low density (LDC) urban clusters and high density (HDC) urban areas (https://ghsl.jrc.ec.europa.eu/datasets.php). These urban areas were represented by points, spaced at 1km distance around the perimeter of each urban area.
Marine ports were extracted from the 26th edition of the World Port Index (NGA, 2017), which contains the location and physical characteristics of approximately 3,700 major ports and terminals. Ports are represented as single points.
The transition matrix was based on the friction surface (https://map.ox.ac.uk/research-project/accessibility_to_cities) from the 2015 global accessibility map (Weiss et al, 2018).
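The published workflow was run in R with the gdistance package; it is not an Earth Engine script. For readers who want to experiment with the same idea in Earth Engine, an analogous (but not identical) computation can be sketched with ee.Image.cumulativeCost using the same 2015 friction surface, which is available in the Earth Engine data catalog. Here, 'cityPoints' is a hypothetical FeatureCollection of target locations.

// Friction surface from Weiss et al. (2018): minutes required to cross one metre.
var friction = ee.Image('Oxford/MAP/friction_surface_2015_v1_0').select('friction');
// Rasterize the hypothetical target locations; non-zero pixels act as cost sources.
var sources = ee.Image(0).byte().paint(cityPoints, 1);
// Accumulated travel time (minutes) out to a 1,000 km search radius.
var travelTime = friction.cumulativeCost({source: sources, maxDistance: 1000000});
Map.addLayer(travelTime, {min: 0, max: 720}, 'Travel time to nearest target (minutes)');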
Code The R code used to generate the 12 travel time maps is included in the zip file that can be downloaded with these data layers. The processing zones are also available.
Validation The underlying friction surface was validated by comparing travel times between 47,893 pairs of locations against journey times from a Google API. Our estimated journey times were generally shorter than those from the Google API. Across the tiles, the median journey time from our estimates was 88 minutes within an interquartile range of 48 to 143 minutes while the median journey time estimated by the Google API was 106 minutes within an interquartile range of 61 to 167 minutes. Across all tiles, the differences were skewed to the left and our travel time estimates were shorter than those reported by the Google API in 72% of the tiles. The median difference was −13.7 minutes within an interquartile range of −35.5 to 2.0 minutes while the absolute difference was 30 minutes or less for 60% of the tiles and 60 minutes or less for 80% of the tiles. The median percentage difference was −16.9% within an interquartile range of −30.6% to 2.7% while the absolute percentage difference was 20% or less in 43% of the tiles and 40% or less in 80% of the tiles.
This process and results are included in the validation zip file.
Usage Notes The accessibility layers can be visualised and analysed in many Geographic Information Systems or remote sensing software such as QGIS, GRASS, ENVI, ERDAS or ArcMap, and also by statistical and modelling packages such as R or MATLAB. They can also be used in cloud-based tools for geospatial analysis such as Google Earth Engine.
The nine layers represent travel times to human settlements of different population ranges. Two or more layers can be combined into one layer by recording the minimum pixel value across the layers. For example, a map of travel time to the nearest settlement of 5,000 to 50,000 people could be generated by taking the minimum of the three layers that represent the travel time to settlements with populations between 5,000 and 10,000, 10,000 and 20,000, and 20,000 and 50,000 people.
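In Earth Engine, for example, that combination could be written as below; the asset IDs and layer indices are placeholders (consult the PDF report for which x corresponds to which population range).

// Three travel-time layers covering the 5,000-50,000 population ranges (placeholder asset IDs and indices).
var a = ee.Image('users/your_account/travel_time_to_cities_7');
var b = ee.Image('users/your_account/travel_time_to_cities_8');
var c = ee.Image('users/your_account/travel_time_to_cities_9');
// Mask the 65535 NoData value, then keep the minimum travel time per pixel.
var combined = ee.ImageCollection([a, b, c])
  .map(function (img) { return img.updateMask(img.neq(65535)); })
  .min();
Map.addLayer(combined, {min: 0, max: 600}, 'Travel time to nearest settlement of 5,000-50,000 people');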
The accessibility layers also permit user-defined hierarchies that go beyond computing the minimum pixel value across layers. A user-defined complete hierarchy can be generated when the union of all categories adds up to the global population, and the intersection of any two categories is empty. Everything else is up to the user in terms of logical consistency with the problem at hand.
The accessibility layers are relative measures of the ease of access from a given location to the nearest target. While the validation demonstrates that they do correspond to typical journey times, they cannot be taken to represent actual travel times. Errors in the friction surface will be accumulated by the accumulated cost function, and it is likely that locations that are further away from targets will have a greater divergence from a plausible travel time than those that are closer to the targets. Care should be taken when referring to travel time to the larger cities when the locations of interest are extremely remote, although they will still be plausible representations of relative accessibility. Furthermore, a key assumption of the model is that all journeys will use the fastest mode of transport and take the shortest path.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
We developed datasets on the human modification of global terrestrial ecosystems from 1990 to 2020. The methods and data sources associated with these data are fully described in:
Theobald, D.M., Oakleaf, J.R., Moncrieff, G., Voigt, M., Kiesecker, J., and Kennedy, C.M.
For each 5-year step from 1990 to 2020, 9 raster datasets are provided in cloud-optimized GeoTIFF format (300 m resolution, EPSG:4326). The naming convention is as follows: HMv2024080101_
These data are available as Google Earth Engine assets via this script (including 90 m): https://code.earthengine.google.com/1b7b5976fdd6189c6533ca00a46386d1
The Google Earth Engine script to calculate human modification is here: https://code.earthengine.google.com/59c0f7da25579422ce4d459abeae1b7d
The Google Earth Engine script to clip out custom extents and export to GeoTIFF is here: https://code.earthengine.google.com/44c9f092472edb9bac3c45096aa5091d
Please see companion repo here for datasets for 2022: https://zenodo.org/uploads/14502573
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The hurricane heatmap was generated using the NOAA/IBTrACS/v4 dataset, which was filtered to focus on the North Atlantic Basin from January 1950 to October 2024. This dataset, sourced from The International Best Track Archive for Climate Stewardship (IBTrACS), offers detailed information on tropical cyclone locations and intensity, providing critical insight into storm behavior over the decades. The map visually represents the highest concentration of hurricane locations, with the intensity of storm occurrences depicted through point data derived from IBTrACS. The data utilized for this heatmap was exported from the Google Earth Engine JavaScript code editor as a GeoTIFF file, with a resolution of 75 km² per pixel, ensuring a balance between visual clarity and the preservation of spatial details. By leveraging the power of Google Earth Engine, this visualization provides an effective way to analyze and explore the frequency and distribution of hurricanes across the North Atlantic, helping to highlight regions most prone to hurricane activity and offering valuable information for climate research and disaster preparedness.
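The exact script is not reproduced here, but a minimal Earth Engine sketch of this kind of heatmap could look like the following; the property names ('BASIN', 'SEASON'), the basin code 'NA', and the Gaussian smoothing are assumptions rather than the original code.

// IBTrACS v4 storm-track points: North Atlantic basin, 1950 onward (property names assumed).
var points = ee.FeatureCollection('NOAA/IBTrACS/v4')
  .filter(ee.Filter.eq('BASIN', 'NA'))
  .filter(ee.Filter.gte('SEASON', 1950));
// Count track points per ~75 km pixel, then smooth to form a heatmap.
var counts = points
  .map(function (f) { return f.set('one', 1); })
  .reduceToImage(['one'], ee.Reducer.sum())
  .reproject({crs: 'EPSG:4326', scale: 75000});
var heatmap = counts.convolve(ee.Kernel.gaussian({radius: 3, units: 'pixels'}));
Map.addLayer(heatmap, {min: 0, max: 50, palette: ['white', 'yellow', 'red']}, 'Hurricane point density');
// Export.image.toDrive({...}) would then write the result out as a GeoTIFF at the chosen scale.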
Attribution 3.0 (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
This dataset contains composite satellite images for the Coral Sea region based on 10 m resolution Sentinel 2 imagery from 2015 – 2021. This image collection is intended to allow mapping of the reef and island features of the Coral Sea. This is a draft version of the dataset prepared from approximately 60% of the available Sentinel 2 imagery. An improved version of this dataset has been released at https://doi.org/10.26274/NH77-ZW79.
This collection contains composite imagery for 31 Sentinel 2 tiles in the Coral Sea. For each tile there are 5 different colour and contrast enhancement styles intended to highlight different features. These include:
- DeepFalse
- Bands: B1 (ultraviolet), B2 (blue), B3 (green): False colour image that shows deep marine features to 50 - 60 m depth. This imagery exploits the clear waters of the Coral Sea to allow the ultraviolet band to provide a much deeper view of coral reefs than is typically achievable with true colour imagery. This technique doesn't work where the water is less clear, as the ultraviolet band is easily scattered.
- DeepMarine
- Bands: B2 (blue), B3 (green), B4 (red): This is a contrast enhanced version of the true colour imagery, focusing on making the deeper features easier to see. Shallow features are overexposed due to the increased contrast.
- ReefTop
- Bands: B4 (red): This imagery is contrast enhanced to create a mask (black and white) of reef tops, delineating areas that are shallower or deeper than approximately 4 - 5 m. This mask is intended to assist in the creation of a GIS layer equivalent to the 'GBR Dry Reefs' dataset. The depth mapping exploits the limited water penetration of the red channel. In clear water the red channel can only see features to approximately 6 m regardless of the substrate type.
- Shallow
- Bands: B5 (red edge), B8 (near infrared), B11 (shortwave infrared): This false colour imagery focuses on identifying very shallow and dry regions in the imagery. It exploits the property that longer wavelength bands penetrate the water progressively less. B5 penetrates the water approximately 3 - 5 m, B8 approximately 0.5 m and B11 < 0.1 m. Features less than a couple of metres deep appear dark blue, while dry areas are white.
- TrueColour
- Bands: B2 (blue), B3 (green), B4 (red): True colour imagery. This is useful for interpreting what shallow features are, for mapping the vegetation on cays, and for identifying beach rock.
For most Sentinel tiles there are two versions of the DeepFalse and DeepMarine imagery based on different collections (dates). The R1 imagery is a composite made up from the best available imagery, while the R2 imagery uses the next best set of imagery. This splitting of the imagery allows two composites to be created from the pool of available imagery so that mapped features can be checked against two images. Typically the R2 imagery will have more artefacts from clouds.
The satellite imagery was processed in tiles (approximately 100 x 100 km) to keep each final image small enough to manage. The dataset only covers the portion of the Coral Sea where there are shallow coral reefs.
# Methods:
The satellite image composites were created by combining multiple Sentinel 2 images using the Google Earth Engine. The core algorithm was:
1. For each Sentinel 2 tile, the set of Sentinel images from 2015 – 2021 was reviewed manually. In some tiles the cloud cover threshold was raised to gather more images, particularly if fewer than 20 images were available. The Google Earth Engine image IDs of the best images were recorded. These were the images with the clearest water, lowest waves, lowest cloud, and lowest sun glint.
2. A composite image was created from the best images by taking the statistical median of the stack of images selected in the previous stage, after masking out clouds and their shadows (described in detail later).
3. The contrast of the images was enhanced to create a series of products for different uses. The true colour image retained the full range of tones visible, so that bright sand cays still retained some detail. The marine enhanced version stretched the blue, green and red channels so that they focused on the deeper, darker marine features. This stretching was done to ensure that, when converted to 8-bit colour imagery, all the dark detail in the deeper areas was visible. This contrast enhancement resulted in bright areas of the imagery clipping, leading to loss of detail in shallow reef areas and land colours looking off. A reef top estimate was produced from the red channel (B4), where the contrast was stretched so that the imagery contains an almost binary mask. The threshold was chosen to approximate the 5 m depth contour for the clear waters of the Coral Sea. Lastly, a false colour image was produced to allow mapping of shallow water features such as cays and islands. This image was produced from B5 (red edge), B8 (NIR) and B11 (SWIR), where blue represents depths from approximately 0.5 – 5 m, green represents areas with 0 – 0.5 m depth, and brown and white correspond to dry land.
4. The various contrast enhanced composite images were exported from Google Earth Engine (default of 32-bit GeoTIFF) and reprocessed into smaller LZW-compressed 8-bit GeoTIFF images using GDAL.
## Cloud Masking
Prior to combining the best images, each image was processed to mask out clouds and their shadows.
The cloud masking uses the COPERNICUS/S2_CLOUD_PROBABILITY dataset developed by SentinelHub (Google, n.d.; Zupanc, 2017). The mask includes the cloud areas, plus a mask to remove cloud shadows. The cloud shadows were estimated by projecting the cloud mask in the direction opposite the angle to the sun. The shadow distance was estimated in two parts.
A low cloud mask was created based on the assumption that small clouds have a small shadow distance. These were detected using a 40% cloud probability threshold. These were projected over 400 m, followed by a 150 m buffer to expand the final mask.
A high cloud mask was created to cover longer shadows created by taller, larger clouds. These clouds were detected based on an 80% cloud probability threshold, followed by an erosion and dilation of 300 m to remove small clouds. These were then projected over a 1.5 km distance followed by a 300 m buffer.
The parameters for the cloud masking (probability threshold, projection distance and buffer radius) were determined through trial and error on a small number of scenes. As such there are probably significant potential improvements that could be made to this algorithm.
Erosion, dilation and buffer operations were performed at a lower image resolution than the native satellite image resolution to improve the computational speed. The resolution of these operations was adjusted so that they were performed at approximately a 4 pixel resolution. This made the cloud mask significantly more spatially coarse than the 10 m Sentinel imagery. This resolution was chosen as a trade-off between the coarseness of the mask versus the processing time for these operations. Even with 4-pixel filter resolutions, these operations still accounted for over 90% of the total processing time, resulting in each image taking approximately 10 min to compute on the Google Earth Engine.
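A simplified Earth Engine sketch of the low-cloud part of this masking, followed by the median composite, is given below. The helper structure, the example image ID, and the exact projection handling are illustrative rather than the production code.

// Sentinel 2 cloud-probability collection from SentinelHub (s2cloudless).
var s2Clouds = ee.ImageCollection('COPERNICUS/S2_CLOUD_PROBABILITY');

function maskLowClouds(img) {
  var prob = ee.Image(img.get('cloud_prob')).select('probability');
  var isCloud = prob.gt(40);                                         // 40% cloud probability threshold
  // Project the cloud mask away from the sun (~400 m at a 100 m working scale) to catch shadows.
  var azimuth = ee.Number(90).subtract(ee.Number(img.get('MEAN_SOLAR_AZIMUTH_ANGLE')));
  var shadow = isCloud.directionalDistanceTransform(azimuth, 4)
    .reproject({crs: img.select('B2').projection(), scale: 100})
    .select('distance').mask();
  // Buffer the combined cloud-and-shadow mask by ~150 m.
  var mask = isCloud.or(shadow).focalMax({radius: 150, units: 'meters'});
  return img.updateMask(mask.not());
}

// Composite a manually curated list of image IDs (the ID below is a placeholder).
var ids = ['COPERNICUS/S2/20170101T000000_20170101T000000_T55KDV'];
var selected = ee.ImageCollection(ids.map(function (id) { return ee.Image(id); }));
// Attach each image's cloud-probability image as the 'cloud_prob' property.
var joined = ee.Join.saveFirst('cloud_prob').apply({
  primary: selected,
  secondary: s2Clouds,
  condition: ee.Filter.equals({leftField: 'system:index', rightField: 'system:index'})
});
var composite = ee.ImageCollection(joined).map(maskLowClouds).median();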
## Sun glint removal and atmospheric correction.
Sun glint was removed from the images using the infrared B8 band to estimate the reflection off the water from the sun glint. B8 penetrates water less than 0.5 m and so in water areas it only detects reflections off the surface of the water. The sun glint detected by B8 correlates very highly with the sun glint experienced by the ultraviolet and visible channels (B1, B2, B3 and B4), and so the sun glint in these channels can be removed by subtracting B8 from these channels.
This simple sun glint correction fails in very shallow and land areas. On land areas B8 is very bright and thus subtracting it from the other channels results in black land. In shallow areas (< 0.5 m) the B8 channel detects the substrate, resulting in too much sun glint correction. To resolve these issues the sun glint correction was adjusted by transitioning to B11 for shallow areas as it penetrates the water even less than B8. We don't use B11 everywhere because it is half the resolution of B8.
Land areas need their tonal levels to be adjusted to match the water areas after sun glint correction. Ideally this would be achieved using an atmospheric correction that compensates for the contrast loss due to haze in the atmosphere. Complex models for atmospheric correction involve considering the elevation of the surface (higher areas have less atmosphere to pass through) and the weather conditions. Since this dataset is focused on coral reef areas, elevation compensation is unnecessary due to the very low and flat land features being imaged. Additionally, the focus of the dataset is on marine features, and so only a basic atmospheric correction is needed. Land areas (as determined by very bright B8 areas) were assigned a fixed, smaller correction factor to approximate atmospheric correction. This fixed atmospheric correction was determined iteratively so that land areas matched the tonal value of shallow and water areas.
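A condensed sketch of the basic glint subtraction described above is given below; the blending to B11 in very shallow water and the fixed land offset are deliberately omitted, so this is illustrative rather than the full production algorithm.

// B8 sees only the water surface in areas deeper than ~0.5 m, so it approximates the glint signal.
function removeSunGlint(img) {
  var glint = img.select('B8');
  var corrected = img.select(['B1', 'B2', 'B3', 'B4']).subtract(glint);
  // Overwrite the original ultraviolet/visible bands with the glint-corrected versions.
  return img.addBands(corrected, null, true);
}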
## Image selection
Available Sentinel 2 images with a cloud cover of less than 0.5% were manually reviewed using a Google Earth Engine app (01-select-sentinel2-images.js). Where there were few images available (fewer than 30 images), the cloud cover threshold was raised to increase the set of images available for review.
Images were excluded from the composites primarily due to two main factors: sun
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
We developed datasets on the human modification of global terrestrial ecosystems for 2022. The methods and data sources associated with these data are fully described in:
Theobald, D.M., Oakleaf, J.R., Moncrieff, G., Voigt, M., Kiesecker, J., and Kennedy, C.M.
For 2022, raster datasets are provided in cloud-optimized GeoTIFF format at 300 m resolution (EPSG:4326). The naming convention is as follows: HMv2024080101_
Note that these data are available as Google Earth Engine assets via this script (including 90 m): https://code.earthengine.google.com/1b7b5976fdd6189c6533ca00a46386d1
The Google Earth Engine script to clip out custom extents and export to GeoTIFF is here: https://code.earthengine.google.com/44c9f092472edb9bac3c45096aa5091d
Please see companion repo here for datasets for 1990-2020: https://zenodo.org/uploads/14449495.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This repository contains a 10-m global oil palm extent layer for 2021 and a 30-m oil palm planting year layer from 1990 to 2021. The oil palm extent layer was produced using a convolutional neural network that identified industrial and smallholder plantations in Sentinel-1 data. The oil palm planting year was developed using a methodology specifically designed to detect the early stages of oil palm development in the Landsat time series.
The repository contains the following data:
- Grid_OilPalm2016-2021.shp: shapefile that delineates the 609 grid cells of 100 x 100 km where oil palm was found.
- GlobalOilPalm_OP-extent.zip: 609 raster tiles of 100x100 km in GeoTIFF format. The raster files show the results of the deep learning classification at a spatial resolution of 10 meters. The classes are the following:
[0] Other land covers that are not oil palm.
[1] Industrial oil palm plantations
[2] Smallholder oil palm plantations.
- GlobalOilPalm_YoP.zip: 609 raster tiles of 100x100 km in GeoTIFF format. The raster files depict the year of oil palm planting. The raster files have a spatial resolution of 30 meters.
- Validation_points_GlobalOP2016-2021.shp: shapefile that contains the 17,812 points used to validate the global oil palm extent 2016–2021 and the oil palm age layer. Each point includes the attribute ‘Class’, which is the class assigned by visual interpretation of sub-meter resolution images, and the attributes ‘OP2016-2021’ and ‘OP2019’, which show the mapped classes in the oil palm extent 2016–2021 (this dataset) and the global oil palm layer 2019 (Descals et al., 2021), respectively. These attributes contain the following class values:
[0] Other land covers that are not oil palm.
[1] Industrial oil palm plantations.
[2] Smallholder oil palm plantations.
The oil palm extent and the planting year can be visualized at: https://ee-globaloilpalm.projects.earthengine.app/view/global-oil-palm-planting-year-1990-2021. This web map allows for the inspection of Landsat time series and the visualization of historical satellite images for a given oil palm plantation.