The product data are six statistics that were estimated for the chemical concentration of lanthanum in the soil C horizon of the conterminous United States (Smith and others, 2013). The estimates are made at 9998 locations that are uniformly distributed across the conterminous United States. The six statistics are the mean for the isometric log-ratio transform of the concentrations, the equivalent mean for the concentrations, the standard deviation for the isometric log-ratio transform of the concentrations, the probability of exceeding a concentration of 48.8 milligrams per kilogram, the 0.95 quantile for the isometric log-ratio transform of the concentrations, and the equivalent 0.95 quantile for the concentrations. Each statistic may be used to generate a statistical map that shows an attribute of the distribution of lanthanum concentration.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Related article: Bergroth, C., Järv, O., Tenkanen, H., Manninen, M., Toivonen, T., 2022. A 24-hour population distribution dataset based on mobile phone data from Helsinki Metropolitan Area, Finland. Scientific Data 9, 39.
In this dataset:
We present temporally dynamic population distribution data from the Helsinki Metropolitan Area, Finland, at the level of 250 m by 250 m statistical grid cells. Three hourly population distribution datasets are provided for regular workdays (Mon – Thu), Saturdays and Sundays. The data are based on aggregated mobile phone data collected by the biggest mobile network operator in Finland. Mobile phone data are assigned to statistical grid cells using an advanced dasymetric interpolation method based on ancillary data about land cover, buildings and a time use survey. The data were validated by comparing population register data from Statistics Finland for night-time hours and a daytime workplace registry. The resulting 24-hour population data can be used to reveal the temporal dynamics of the city and examine population variations relevant to for instance spatial accessibility analyses, crisis management and planning.
Please cite this dataset as:
Bergroth, C., Järv, O., Tenkanen, H., Manninen, M., Toivonen, T., 2022. A 24-hour population distribution dataset based on mobile phone data from Helsinki Metropolitan Area, Finland. Scientific Data 9, 39. https://doi.org/10.1038/s41597-021-01113-4
Organization of data
The dataset is packaged into a single zip file, Helsinki_dynpop_matrix.zip, which contains the following files:
HMA_Dynamic_population_24H_workdays.csv represents the dynamic population for an average workday in the study area.
HMA_Dynamic_population_24H_sat.csv represents the dynamic population for an average Saturday in the study area.
HMA_Dynamic_population_24H_sun.csv represents the dynamic population for an average Sunday in the study area.
target_zones_grid250m_EPSG3067.geojson represents the statistical grid in ETRS89/ETRS-TM35FIN projection that can be used to visualize the data on a map using e.g. QGIS.
Column names
YKR_ID : a unique identifier for each statistical grid cell (n=13,231). The identifier is compatible with the statistical YKR grid cell data by Statistics Finland and Finnish Environment Institute.
H0, H1 ... H23 : Each field represents the proportional distribution of the total population in the study area between grid cells during a one-hour period. In total, 24 fields are formatted as “Hx”, where x stands for the hour of the day (values ranging from 0 to 23). For example, H0 stands for the first hour of the day: 00:00 - 00:59. The sum of all cell values for each field equals 100 (i.e. 100% of the total population for each one-hour period).
In order to visualize the data on a map, the result tables can be joined with the target_zones_grid250m_EPSG3067.geojson data. The data can be joined by using the field YKR_ID as a common key between the datasets.
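The join described above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, assuming the CSV files are comma-delimited. The two-cell sample values, the YKR_ID numbers, and the helper names `load_shares`, `join_to_grid` and `hour_total` are hypothetical and not part of the dataset.

```python
import copy
import csv
import io

# Hypothetical two-cell excerpt; the real tables hold 13,231 cells and fields H0..H23.
SAMPLE_CSV = "YKR_ID,H0,H1\n1001,60.0,55.5\n1002,40.0,44.5\n"

# Hypothetical stand-in for target_zones_grid250m_EPSG3067.geojson (geometry omitted).
SAMPLE_GRID = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"YKR_ID": 1001}, "geometry": None},
        {"type": "Feature", "properties": {"YKR_ID": 1002}, "geometry": None},
    ],
}


def load_shares(csv_text, delimiter=","):
    """Read the hourly shares into a dict keyed by YKR_ID (delimiter is an assumption)."""
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=delimiter)
    return {
        int(row["YKR_ID"]): {k: float(v) for k, v in row.items() if k != "YKR_ID"}
        for row in reader
    }


def join_to_grid(grid, shares):
    """Return a copy of the grid with the hourly shares attached to each cell."""
    joined = copy.deepcopy(grid)
    for feature in joined["features"]:
        key = int(feature["properties"]["YKR_ID"])
        feature["properties"].update(shares.get(key, {}))
    return joined


def hour_total(shares, hour):
    """Sum one hourly field over all cells; the description says this equals 100."""
    return sum(row[hour] for row in shares.values())


shares = load_shares(SAMPLE_CSV)
joined = join_to_grid(SAMPLE_GRID, shares)
print(hour_total(shares, "H0"), joined["features"][0]["properties"])
```

Checking that each hourly field sums to 100 before mapping is a cheap way to catch a wrong delimiter or a truncated file.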
License Creative Commons Attribution 4.0 International.
Related datasets
Järv, Olle; Tenkanen, Henrikki & Toivonen, Tuuli. (2017). Multi-temporal function-based dasymetric interpolation tool for mobile phone data. Zenodo. https://doi.org/10.5281/zenodo.252612
Tenkanen, Henrikki, & Toivonen, Tuuli. (2019). Helsinki Region Travel Time Matrix [Data set]. Zenodo. http://doi.org/10.5281/zenodo.3247564
The product data are six statistics that were estimated for the chemical concentration of cobalt in the soil C horizon of the conterminous United States (Smith and others, 2013). The estimates are made at 9998 locations that are uniformly distributed across the conterminous United States. The six statistics are the mean for the isometric log-ratio transform of the concentrations, the equivalent mean for the concentrations, the standard deviation for the isometric log-ratio transform of the concentrations, the probability of exceeding a concentration of 24.4 milligrams per kilogram, the 0.95 quantile for the isometric log-ratio transform of the concentrations, and the equivalent 0.95 quantile for the concentrations. Each statistic may be used to generate a statistical map that shows an attribute of the distribution of cobalt concentration.
License: https://spdx.org/licenses/CC0-1.0.html
Aim: Effective management decisions depend on knowledge of species distribution and habitat use. Maps generated from species distribution models are important in predicting previously unknown occurrences of protected species. However, if populations are seasonally dynamic or locally adapted, failing to consider population-level differences could lead to erroneous determinations of occurrence probability and ineffective management. The study goal was to model the distribution of a species of special concern, Townsend’s big-eared bat (Corynorhinus townsendii), in California. We incorporate seasonal and spatial differences to estimate the distribution under current and future climate conditions.
Methods: We built species distribution models using all records from statewide roost surveys and by subsetting data to seasonal colonies, representing different phenological stages, and to Environmental Protection Agency Level III Ecoregions to understand how environmental needs vary based on these factors. We projected the species’ distribution for 2061-2080 in response to low and high emissions scenarios and calculated the expected range shifts.
Results: The estimated distribution differed between the combined (full dataset) and phenologically-explicit models, while ecoregion-specific models were largely congruent with the combined model. Across the majority of models, precipitation was the most important variable predicting the presence of C. townsendii roosts. Under future climate scenarios, the distribution of C. townsendii is expected to contract throughout the state; however, suitable areas will expand within some ecoregions.
Main conclusion: Comparison of phenologically-explicit models with combined models indicates that the combined models better predict the extent of the known range of C. townsendii in California. However, life history-explicit models aid in understanding the different environmental needs and distribution of the species’ major phenological stages.
Differences between ecoregion-specific and statewide predictions of habitat contractions highlight the need to consider regional variation when forecasting species’ responses to climate change. These models can aid in directing seasonally explicit surveys and predicting regions most vulnerable under future climate conditions.
Methods
Study area and survey data
The study area covers the U.S. state of California, which has steep environmental gradients that support an array of species (Dobrowski et al. 2011). Because California is ecologically diverse, with regions ranging from forested mountain ranges to deserts, we examined local environmental needs by modeling at both the state-wide and ecoregion scale, using U.S. Environmental Protection Agency (EPA) Level III ecoregion designations. There are thirteen Level III ecoregions in California (Table S1.1) (Griffith et al. 2016). Species occurrence data used in this study were from a statewide survey of C. townsendii in California conducted by Harris et al. (2019). Briefly, methods included field surveys from 2014-2017 following a modified bat survey protocol to create a stratified random sampling scheme. Corynorhinus townsendii presence at roost sites was based on visual bat sightings. From these survey efforts, we have visual occurrence data for 65 maternity roosts, 82 hibernation roosts (hibernacula), and 91 active-season non-maternity roosts (transition roosts) for a total of 238 occurrence records (Figure 1, Table S1.1).
Ecogeographical factors
We downloaded bioclimatic variables from WorldClim 2.0 (Fick & Hijmans, 2017) at a resolution of 5 arcmin for the broad-scale analysis and 30 arcsec for our ecoregion-specific analyses. To calculate elevation and slope, we used a digital elevation model (USGS 2022) in ArcGIS 10.8.1 (ESRI, 2006).
The chosen set of environmental variables reflects knowledge of climatic conditions and habitat relevant to bat physiology, phenology, and life history (Rebelo et al. 2010, Razgour et al. 2011, Loeb and Winters 2013, Razgour 2015, Ancillotto et al. 2016). To trim the global environmental variables to the same extent (the state of California), we used the R package “raster” (Hijmans et al. 2022). We performed a correlation analysis on the raster layers using the “layerStats” function and removed variables with a Pearson’s coefficient > 0.7 (see Table 1 for final model variables). For future climate conditions, we selected three general circulation models (GCMs) based on previous species distribution models of temperate bat species (Razgour et al. 2019): the Hadley Centre Global Environment Model version 2 Earth Systems model (HadGEM3-GC31_LL; Webb, 2019), the Institut Pierre-Simon Laplace Coupled Model 6th Assessment Low Resolution (IPSL-CM6A-LR; Boucher et al., 2018), and the Max Planck Institute for Meteorology Earth System Model Low Resolution (MPI-ESM1-2-LR; Brovkin et al., 2019). We also selected two contrasting greenhouse concentration trajectories (Shared Socio-economic Pathways, SSPs): a steady decline pathway with CO2 concentrations of 360 ppmv (SSP1-2.6) and an increasing pathway with CO2 reaching around 2,000 ppmv (SSP5-8.5) (IPCC6). We modeled distribution for present conditions and for a future (2061-2080) time period. Because one aim of our study was to determine the consequences of changing climate, we changed only the climatic data when projecting future distributions, while keeping the other variables (elevation, slope) constant over time.
Species distribution modeling
We generated distribution maps for total occurrences (maternity + hibernacula + transition, hereafter defined as “combined models”), maternity colonies, hibernacula, and transition roosts. To estimate the present and future habitat suitability for C. townsendii in California, we used the maximum entropy (MaxEnt) algorithm in the “dismo” R package (Hijmans et al. 2021) through the advanced computing resources provided by Texas A&M High Performance Research Computing. We chose MaxEnt to aid in the comparisons of state-wide and ecoregion-specific models, as MaxEnt outperforms other approaches when using small datasets (as is the case in our ecoregion-specific models). We created 1,000 background points from random points in the environmental layers and performed a 5-fold cross validation, which divided the occurrence records into training (80%) and testing (20%) datasets. We assessed the performance of our models by measuring the area under the receiver operating characteristic curve (AUC; Hanley & McNeil, 1982), where values >0.5 indicate that the model is performing better than random, values of 0.5-0.7 indicate poor performance, 0.7-0.9 moderate performance, and 0.9-1 excellent performance (BCCVL, Hallgren et al., 2016). We also measured the maximum true skill statistic (TSS; Allouche, Tsoar, & Kadmon, 2006) to assess model performance. The maxTSS ranges from -1 to +1: values <0.4 indicate a model that performs no better than random, 0.4-0.55 indicates poor performance, 0.55-0.7 moderate performance, 0.7-0.85 good performance, and values >0.80 indicate excellent performance (Samadi et al. 2022). Final distribution maps were generated using all occurrence records for each region (rather than the training/testing subset), and the models were projected onto present and future climate conditions. Additionally, because the climatic conditions of the different ecoregions of California vary widely, we generated separate models for each ecoregion in an attempt to capture potential local effects of climate change.
A general rule in species distribution modeling is that the number of occurrence points should be 10 times the number of predictors included in the model, meaning that we would need 50 occurrences in each ecoregion. One common way to overcome this limitation is through the ensemble of small models (ESMs) (Breiner et al. 2015, 2018; Virtanen et al. 2018; Scherrer et al. 2019; Song et al. 2019) included in the “ecospat” R package (references). For our ESMs we implemented MaxEnt modeling, and the final ensemble model was created by averaging individual bivariate models weighted by performance (AUC > 0.5). We also used null model significance testing to evaluate the performance of our ESMs (Raes and Ter Steege 2007). To perform null model testing, we compared AUC scores from 100 null models using randomly generated presence locations equal in number to those used in the developed distribution model. All ecoregion models outperformed the null expectation (p<0.002).
Estimating range shifts
For each of the three GCMs and each SSP scenario, we converted the probability distribution map into a binary map (0=unsuitable, 1=suitable) using the threshold that maximizes sensitivity and specificity (Liu et al. 2016). To create the final maps for each SSP scenario, we summed the three binary GCM layers and took a consensus approach, meaning climatically suitable areas were pixels where at least two of the three models predicted species presence (Araújo and New 2007, Piccioli Cappelli et al. 2021). We combined the future binary maps (fmap) and the present binary maps (pmap) following the formula fmap × 2 + pmap (from Huang et al., 2017) to produce maps with values of 0 (areas not suitable), 1 (areas that are suitable in the present but not the future), 2 (areas that are not suitable in the present but suitable in the future), and 3 (areas currently suitable that will remain suitable) using the raster calculator function in QGIS.
We then calculated the total area of suitability, area of maintenance, area of expansion, and area of contraction for each binary model using the “BIOMOD_RangeSize” function in R package “biomod2” (Thuiller et al. 2021).
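The consensus and range-shift steps described above can be illustrated with a small sketch. It is not the authors' workflow (they used the QGIS raster calculator and “biomod2”); it only reproduces the arithmetic on hypothetical 2 x 2 binary grids held as nested lists, and the function names `consensus`, `range_change` and `tally` are invented for illustration.

```python
from collections import Counter

# Meaning of the codes produced by fmap * 2 + pmap, as listed in the text.
LABELS = {0: "unsuitable", 1: "contraction", 2: "expansion", 3: "maintenance"}


def consensus(binary_maps, min_votes=2):
    """Mark a pixel suitable when at least `min_votes` GCM maps predict presence."""
    return [
        [1 if sum(cells) >= min_votes else 0 for cells in zip(*rows)]
        for rows in zip(*binary_maps)
    ]


def range_change(pmap, fmap):
    """Combine present and future binary maps with the formula fmap * 2 + pmap."""
    return [
        [2 * f + p for p, f in zip(prow, frow)]
        for prow, frow in zip(pmap, fmap)
    ]


def tally(change):
    """Count pixels per change class (a pixel count, not an area in km2)."""
    counts = Counter(value for row in change for value in row)
    return {label: counts.get(code, 0) for code, label in LABELS.items()}


# Hypothetical future binary maps from three GCMs and one present-day map.
gcm_a = [[1, 0], [1, 0]]
gcm_b = [[1, 1], [0, 0]]
gcm_c = [[0, 1], [1, 0]]
pmap = [[1, 0], [1, 1]]

fmap = consensus([gcm_a, gcm_b, gcm_c])
change = range_change(pmap, fmap)
print(change, tally(change))
```

Multiplying the future map by two keeps the four present/future combinations distinct in a single layer, which is why the codes 0 to 3 can be read directly as unsuitable, contraction, expansion and maintenance.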
U.S. Government Works: https://www.usa.gov/government-works
License information was derived automatically
The product data are six statistics that were estimated for the chemical concentration of lithium in the soil C horizon of the conterminous United States. The estimates are made at 9998 locations that are uniformly distributed across the conterminous United States. The six statistics are the mean for the isometric log-ratio transform of the concentrations, the equivalent mean for the concentrations, the standard deviation for the isometric log-ratio transform of the concentrations, the probability of exceeding a concentration of 55 milligrams per kilogram, the 0.95 quantile for the isometric log-ratio transform of the concentrations, and the equivalent 0.95 quantile for the concentrations. Each statistic may be used to generate a statistical map that shows an attribute of the distribution of lithium concentration.
License: https://research.csiro.au/dap/licences/csiro-data-licence/
This dataset is a series of digital map-posters accompanying the AdaptNRM Guide: Helping Biodiversity Adapt: supporting climate adaptation planning using a community-level modelling approach.
These represent supporting materials and information about the community-level biodiversity models applied to climate change. Map posters are organised by four biological groups (vascular plants, mammals, reptiles and amphibians), two climate change scenarios (1990-2050 MIROC5 and CanESM2 for RCP8.5), and five measures of change in biodiversity.
The map-posters present the nationally consistent data at locally relevant resolutions in eight parts – representing broad groupings of NRM regions based on the cluster boundaries used for climate adaptation planning (http://www.environment.gov.au/climate-change/adaptation) – and also nationally.
Map-posters are provided in PNG image format at moderate resolution (300dpi) to suit A0 printing. The posters were designed to meet A0 print size and digital viewing resolution of map detail. An additional set in PDF image format has been created for ease of download for initial exploration and printing on A3 paper. Some text elements and map features may be fuzzy at this resolution.
Each map-poster contains four dataset images coloured using standard legends encompassing the potential range of the measure, even if that range is not represented in the dataset itself or across the map extent.
Most map series are provided in two parts: part 1 shows the two climate scenarios for vascular plants and mammals and part 2 shows reptiles and amphibians. Eight cluster maps for each series have a different colour theme and map extent. A national series is also provided. Annotation briefly outlines the topics presented in the Guide so that each poster stands alone for quick reference.
An additional 77 National maps presenting the probability distributions of each of 77 vegetation types – NVIS 4.1 major vegetation subgroups (NVIS subgroups) - are currently in preparation.
Example citations:
Williams KJ, Raisbeck-Brown N, Prober S, Harwood T (2015) Generalised projected distribution of vegetation types – NVIS 4.1 major vegetation subgroups (1990 and 2050), A0 map-poster 8.1 - East Coast NRM regions. CSIRO Land and Water Flagship, Canberra. Available online at www.AdaptNRM.org and https://data.csiro.au/dap/.
Williams KJ, Raisbeck-Brown N, Harwood T, Prober S (2015) Revegetation benefit (cleared natural areas) for vascular plants and mammals (1990-2050), A0 map-poster 9.1 - East Coast NRM regions. CSIRO Land and Water Flagship, Canberra. Available online at www.AdaptNRM.org and https://data.csiro.au/dap/.
This dataset has been delivered incrementally. Please check that you are accessing the latest version of the dataset.
Lineage: The map posters showcase the scientific data. The data layers have been developed at approximately 250 m resolution (9 second) across the Australian continent to incorporate the interaction between climate and topography, and are best viewed using a geographic information system (GIS). Each data layer is 1 GB and is inaccessible to non-GIS users. The map posters provide easy access to the scientific data, enabling the outputs to be viewed at high resolution with geographical context information provided.
Maps were generated using layout and drawing tools in ArcGIS 10.2.2
A check list of map posters and datasets is provided with the collection.
Map Series: 7.(1-77) National probability distribution of vegetation type – NVIS 4.1 major vegetation subgroup pre-1750 #0x
8.1 Generalised projected distribution of vegetation types (NVIS subgroups) (1990 and 2050)
9.1 Revegetation benefit (cleared natural areas) for plants and mammals (1990-2050)
9.2 Revegetation benefit (cleared natural areas) for reptiles and amphibians (1990-2050)
10.1 Need for assisted dispersal for vascular plants and mammals (1990-2050)
10.2 Need for assisted dispersal for reptiles and amphibians (1990-2050)
11.1 Refugial potential for vascular plants and mammals (1990-2050)
11.2 Refugial potential for reptiles and amphibians (1990-2050)
12.1 Climate-driven future revegetation benefit for vascular plants and mammals (1990-2050)
12.2 Climate-driven future revegetation benefit for reptiles and amphibians (1990-2050)
License: https://spdx.org/licenses/CC0-1.0.html
Aim
To understand the representativeness and accuracy of expert range maps, and explore alternate methods for accurately mapping species distributions.
Location
Global
Time period
Contemporary
Major taxa studied
Terrestrial vertebrates, and Odonata
Methods
We analyzed the biases in 50,768 animal IUCN, GARD and BirdLife species maps and assessed the links between these maps and existing political and various non-ecological boundaries to evaluate their accuracy for certain types of analysis. We cross-referenced each species map with data from GBIF to assess whether maps captured the whole range of a species, and what percentage of occurrence points fall within the species’ assessed ranges. In addition, we use a number of alternate methods to map diversity patterns and compare these to high resolution models of distribution patterns.
Results
On average, 20-30% of species’ non-coastal range boundaries overlapped with administrative national boundaries. In total, 60% of areas with the highest spatial turnover in species (high densities of species range boundaries marking high levels of shift in the community of species present) occurred at political boundaries, especially commonly in Southeast Asia. Different biases existed for different taxa, with gridded analysis in reptiles, river basins in Odonata (except the Americas) and county boundaries for amphibians in the US. On average, up to half (25-46%) of species’ recorded range points fell outside their mapped distributions. Filtered minimum-convex polygons performed better than expert range maps in reproducing modeled diversity patterns.
Main conclusions
Expert range maps showed high bias at administrative borders in all taxa, but this was highest at the transition from tropical to subtropical regions. Methods used were inconsistent across space, time and taxa, and ranges mapped did not match species distribution data. Alternate approaches can better reconstruct patterns of distribution than expert maps, and data driven approaches are needed to provide reliable alternatives to better understand species distributions.
Methods
Materials and methods
We use a combination of approaches to explore the relationship between species range maps and geopolitical boundaries and a subset of geographic features. In some cases we used the density of species range boundaries to explore the relationship between these and various features (i.e. administrative boundaries, river basin boundaries etc.). Additionally, species richness and spatial turnover are used to explore changes in richness over short geographic distances. Analyses were conducted in R statistical software unless noted otherwise. All code scripts are available at https://github.com/qiaohj/iucn_fix. Workflows are shown in Figure S1a-c with associated scripts listed.
Species ranges and boundary density maps
Expert range maps (ERMs) were downloaded from the IUCN RedList website for mammals (5,709 species), odonates (2,239 species) and amphibians (6,684 species; https://www.iucnredlist.org/resources/grid/spatial-data). Shapefile maps for birds were downloaded from BirdLife (10,423 species, http://datazone.birdlife.org/species/requestdis), and for reptiles from the Global Assessment of Reptile Distributions (GARD) (10,064 species; Roll et al., 2017). Each species’ polygon boundaries were converted to polylines to show the boundary of each species range (Figure S1a-II; code is in lines 7 – 18 of line2raster_xxxx.r; xxxx varies based on the taxon). The associated shapefile was then split to produce independent polyline files for each species within each taxon (see Figure S1a-I; code is in lines 29 to 83 of the same file).
To generate species boundary density maps, species range boundaries were rasterized at 1km spatial resolution with an equal area projection (Eckert-IV), and stacked to form a single raster for each taxon (at the level of amphibians, odonates, etc.). This represented the number of species in each group and their overlapping range boundaries (Figure S1b-II, codes are in line2raster_all.r). Each cell value indicated the number of species whose distribution boundaries overlapped with each cell, enabling us to overlay this rasterized information with other features (i.e. administrative boundaries) so that the overlaps between them can be calculated in R. These species boundary density maps underlie most subsequent analyses. R code and caveats are given in the supplements, links are provided in text and Figure S1.
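As a rough illustration of the stacking and overlay idea described above, the sketch below sums per-species boundary rasters into a density grid and measures how much of it falls on a boundary mask. It is a toy in plain Python, not the authors' R code; the tiny grids are hypothetical, and counting overlap per boundary cell (rather than weighting by species count) is an assumption.

```python
def boundary_density(layers):
    """Sum per-species 0/1 boundary rasters cell by cell."""
    return [[sum(cells) for cells in zip(*rows)] for rows in zip(*layers)]


def percent_on_mask(density, mask):
    """Percentage of cells holding at least one range boundary that lie on the mask."""
    boundary_cells = 0
    overlapping = 0
    for drow, mrow in zip(density, mask):
        for d, m in zip(drow, mrow):
            if d > 0:
                boundary_cells += 1
                if m:
                    overlapping += 1
    if boundary_cells == 0:
        return 0.0
    return 100.0 * overlapping / boundary_cells


# Hypothetical rasterized range boundaries for two species (1 = boundary crosses cell).
species_1 = [[1, 1, 0], [0, 0, 0]]
species_2 = [[1, 0, 0], [0, 1, 0]]
# Hypothetical rasterized (buffered) administrative boundary.
admin_mask = [[1, 0, 0], [0, 1, 1]]

density = boundary_density([species_1, species_2])
print(density, percent_on_mask(density, admin_mask))
```

In practice both inputs would need to share the same projection, extent and cell size before any cell-by-cell comparison is meaningful.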
Geographic boundaries
Spatial exploration of species range boundaries in ArcGIS suggested that numerous geographic datasets (i.e. political and in few cases geographic features such as river basins) were used to delineate the species ranges for different regions and taxa (this is sometimes part of the methodology in developing ERMs as detailed by Ficetola et al., 2014). Thus in addition to analyzing the administrative bias and the percentage of occurrence records within each species’ ERM for all taxa, additional analyses were conducted when other biases were evident in any given taxa or region (detailed later in methods on a case-by-case basis).
For all taxa, we assessed the percentage of overlap between species range boundaries and national and provincial boundaries by digitizing each to 1km (equivalent to buffering the polyline by 500m), both with and without coastal boundaries. An international map (https://gadm.org/) was used because international (Western) assessors use such maps; it does not necessarily denote agreed country boundaries. Different buffers (500m, 1000m, 2500m, 5000m) were added to these administrative boundaries in ArcMap to account for potential, insignificant deviations from political boundaries (Figure S1b). An R script for the same function is provided in “country_line_buffer.r”.
To establish where multiple species shared range boundaries, we reclassified the species range boundary density rasters for each taxon into richness classes using the ArcMap quartile function (Figure S1). From these ten classes, the percentage of the top-two and top-three quartiles of range densities within different buffers (500m, 1000m, 2500m, 5000m) was calculated per country to determine what percentage of the highest range boundary density approximately followed administrative borders. This was done because people drawing ERMs may use detailed administrative maps or generalize near political borders, or may use political shapefiles that deviate slightly. It is consequently useful to include varying distances from administrative features to assess how range boundary densities vary in relation to administrative boundaries. Analyses of relationships between individual species range boundaries and administrative boundaries (coastal, non-coastal) were made in R, and scripts are provided (quantile_country_buffer_overlap.r).
Spatial turnover and administrative boundaries
Heatmaps of species richness were generated by summing entire sets of compiled species ranges for each taxon in polygonal form (Figure 1; Figure S1b-I). To assess abrupt diversity changes, standard deviations for 10km blocks were calculated using the block statistics function in ArcMap. Abrupt changes in diversity were signified by high standard deviations based on the cell statistics function in ArcGIS, which represented rapid changes in the number of species present. Maps were then classified into ten categories using the quartile function. Given the high variation in maximum diversity and taxonomic representation, only the top two to three richness categories were retained per taxon. These were then extracted using 1km buffers of national administrative boundaries to assess what proportion of political boundaries was covered by these turnover hotspots.
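The block-wise standard deviation used as a turnover signal can be sketched as follows. This is a plain-Python illustration rather than the ArcMap block statistics tool; the richness grid is hypothetical, the function name `block_std` is invented, and the use of the population standard deviation is an assumption.

```python
import statistics


def block_std(grid, size):
    """Standard deviation of richness within non-overlapping size x size blocks."""
    rows, cols = len(grid), len(grid[0])
    result = []
    for r0 in range(0, rows, size):
        out_row = []
        for c0 in range(0, cols, size):
            # Collect every cell of the block; edge blocks may be smaller.
            values = [
                grid[r][c]
                for r in range(r0, min(r0 + size, rows))
                for c in range(c0, min(c0 + size, cols))
            ]
            out_row.append(statistics.pstdev(values))
        result.append(out_row)
    return result


# Hypothetical richness grid: a flat area (left) next to an abrupt change (right).
richness = [
    [5, 5, 1, 9],
    [5, 5, 1, 9],
]
print(block_std(richness, 2))
```

A block with uniform richness yields zero, while a block straddling a sharp change in species number yields a high value, which is the pattern the text interprets as a turnover hotspot.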
Taxon-specific analyses
Data exploration and mapping exposed taxon- and region-specific biases requiring additional analysis. Where other biases and irregularities were clear from visual inspection of the range boundary density maps for each taxon, the possible causes of biases were assessed by comparing range boundary density maps to high-resolution imagery and administrative maps via the ArcGIS server (AGOL). Standardized overlay of the taxon boundary sets with administrative or geophysical features from the image server revealed three types of bias, each either spatially or taxonomically limited: 1) amphibians with county borders in the United States, 2) dragonflies and river basins globally and 3) gridding of distributions of reptiles. In these cases, species boundary density maps were used as a basis to identify potential biases, which were then explored empirically using appropriate methods.
For amphibians, counties in the United States (US) were digitized using a county map of the US (https://gadm.org/), then buffered by 2.5km on either side. Amphibian species range boundary density maps were reclassified to show where species range boundaries existed (with other non-range boundary areas reclassified as “no data”) and all species boundaries numerically indicated (i.e. a value of 1 indicates one species range boundary, a value of 10 indicates ten species range boundaries). Percentages of species boundary areas falling on county boundaries and in the buffers, in addition to species range boundaries which did not overlap with county boundaries, were calculated to give measures of what percentage of the species boundaries fell within 2.5km of county boundaries.
For Odonata, many species were mapped to river basin borders. We used river basins of levels 6-8 (sub-basin to basin) in the river hierarchy (https://hydrosheds.org) to assess the relationship between Odonata boundaries and river boundaries. Two IUCN datasets exist for Odonata; the IUCN Odonata specialist group spatial dataset
License: https://eidc.ceh.ac.uk/licences/open-government-licence-ceh-ons/plain
This dataset contains gridded population with a spatial resolution of 1 km x 1 km for the UK based on Census 2011 and Land Cover Map 2007 input data. Data on population distribution for the United Kingdom is available from statistical offices in England, Wales, Northern Ireland and Scotland and provided to the public e.g. via the Office for National Statistics (ONS). Population data is typically provided in tabular form or, based on a range of different geographical units, in file types for geographical information systems (GIS), for instance as ESRI Shapefiles. The geographical units reflect administrative boundaries at different levels of detail, from Devolved Administration to Output Areas (OA), wards or intermediate geographies. While the presentation of data at the level of these geographical units is useful for statistical purposes, accounting for spatial variability, for instance of environmental determinants of public health, requires a more spatially homogeneous population distribution. For this purpose, the dataset presented here combines 2011 UK Census population data at Output Area level with the Land Cover Map 2007 land-use classes 'urban' and 'suburban' to create a consistent and comprehensive gridded population data product at 1 km x 1 km spatial resolution. The mapping product is based on British National Grid (OSGB36 datum).
Overview: Actual Natural Vegetation (ANV): probability of occurrence for the Olive tree in its realized environment for the period 2000 - 2024.
Traceability (lineage): This is an original dataset produced with a machine learning framework which used a combination of point datasets and raster datasets as inputs. The point dataset is a harmonized collection of tree occurrence data, comprising observations from National Forest Inventories (EU-Forest), GBIF and LUCAS. The complete dataset is available on Zenodo. Raster datasets used as input are: harmonized and gapfilled time series of seasonal aggregates of the Landsat GLAD ARD dataset (bands and spectral indices); monthly time series of air and surface temperature and precipitation from a reprocessed version of the Copernicus ERA5 dataset; long term averages of bioclimatic variables from CHELSA; tree species distribution maps from the European Atlas of Forest Tree Species; elevation, slope and other elevation-derived metrics; long term monthly averages of snow probability and long term monthly averages of cloud fraction from MODIS. For a more comprehensive list refer to Bonannella et al. (2022) (in review, preprint available at: https://doi.org/10.21203/rs.3.rs-1252972/v1).
Scientific methodology: Probability and uncertainty maps were the output of a spatiotemporal ensemble machine learning framework based on stacked regularization. Three base models (random forest, gradient boosted trees and generalized linear models) were first trained on the input dataset and their predictions were used to train an additional model (logistic regression) which provided the final predictions. More details on the whole workflow are available in the listed publication.
Usability: Probability maps can be used to detect potential forest degradation and compositional change across the time period analyzed. Some possible applications for these topics are explained in the listed publication.
Uncertainty quantification: Uncertainty is quantified by taking the standard deviation of the probabilities predicted by the three components of the spatiotemporal ensemble model. Data validation approaches: Distribution maps were validated using a spatial 5-fold cross validation following the workflow detailed in the listed publication. Completeness: The raster files perfectly cover the entire Geo-harmonizer region as defined by the landmask raster dataset available here. Consistency: Areas which are outside of the calibration area of the point dataset (Iceland, Norway) usually have high uncertainty values. This is not only a problem of extrapolation but also of poor representation in the feature space available to the model of the conditions that are present in this countries. Positional accuracy: The rasters have a spatial resolution of 30m. Temporal accuracy: The maps cover the period 2000 - 2020, each map covers a certain number of years according to the following scheme: (1) 2000--2002, (2) 2002--2006, (3) 2006--2010, (4) 2010--2014, (5) 2014--2018 and (6) 2018--2020 Thematic accuracy: Both probability and uncertainty maps contain values from 0 to 100: in the case of probability maps, they indicate the probability of occurrence of a single individual of the target species, while uncertainty maps indicate the standard deviation of the ensemble model.
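The uncertainty calculation described for this dataset can be sketched in a few lines. This is a minimal illustration rather than the authors' code: the function name and the three input lists are hypothetical, and the use of the population standard deviation is an assumption, since the description does not say which variant was applied.

```python
from statistics import pstdev


def ensemble_uncertainty(p_rf, p_gbt, p_glm):
    """Per-pixel standard deviation of three base-model probabilities.

    Each argument is a sequence of probabilities on the 0-100 scale,
    one value per pixel, in the same pixel order (hypothetical inputs).
    Assumption: population standard deviation (divide by 3, not 2).
    """
    if not (len(p_rf) == len(p_gbt) == len(p_glm)):
        raise ValueError("all three models must cover the same pixels")
    return [pstdev(triple) for triple in zip(p_rf, p_gbt, p_glm)]


# Two pixels: the models agree on the first and disagree on the second.
print(ensemble_uncertainty([50, 0], [50, 50], [50, 100]))
```

A pixel where all three base models agree gets an uncertainty of zero, while strong disagreement pushes the value up, which matches the high uncertainty reported outside the calibration area.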
Distribution map (raster format: geotiff) of Larix decidua, computed using the NFIs - EFDAC EForest European dataset of species presence/absence. The distribution is estimated by means of statistical interpolation (constrained spatial multi-frequency analysis, C-SMFA). Available years: 2000. The maps are available in the European Forest Data Center (EFDAC). The specific goal of EFDAC is to become a focal point for policy-relevant forest data and information by hosting and pointing to relevant forest information as well as providing web-based tools for accessing information located in EFDAC.
This dataset contains all the spatial distributions predicted for the paper on monitoring programs of the Gulf of Mexico and the Gulf of Mexico Data Atlas, using statistical habitat models fitted to the survey data contained in the dataset whose UDI is "FL.x703.000:0002". The data provided in this dataset are PNG files showing the spatial patterns of probability of encounter of 61 fish and invertebrate functional groups/species/life stages of the Gulf of Mexico. The spatial distributions provided here are not for specific years, but rather long-term, average spatial distributions for the period 2000-present.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
The dataset is a raster file, derived from the Sentinel-2 series of imagery, showing the probability of kelp (various species) distribution around the Falkland Islands.
Distribution map (raster format: geotiff) of Fagus sylvatica, computed using the NFIs - EFDAC EForest European dataset of species presence/absence. The distribution is estimated by means of statistical interpolation (constrained spatial multi-frequency analysis, C-SMFA). Available years: 2000. The maps are available in the European Forest Data Center (EFDAC). The specific goal of EFDAC is to become a focal point for policy-relevant forest data and information by hosting and pointing to relevant forest information as well as providing web-based tools for accessing information located in EFDAC.
This dataset contains maps of probability of presence for the zooplankton and phytoplankton groups represented in the ecosystem model OSMOSE-WFS, for the different months of the year. Maps of probability of presence for the zooplankton group were produced from SEAPODYM data for the period 2000-present (which we requested from Patrick Lehodey and Beatriz Calmettes, Collecte Localisation Satellites, France, in October 2016). The maps of probability of presence for the phytoplankton group were produced from SeaWiFS chlorophyll-a concentration data for the period 2005-2009 (downloaded from: http://oceancolor.gsfc.nasa.gov/SeaWiFS/). The maps of probability of presence provided here are not for specific years, but rather long-term distribution maps for the period 2005-2009.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description in English: This dataset is a collection of distribution maps of single-season rice in China, covering 21 provincial administrative regions from 2017 to 2022. The file format is GeoTIFF and the spatial reference is WGS84 (EPSG:4326). The resolution of the distribution maps is 20 m in Heilongjiang, Jilin, Liaoning, Inner Mongolia, and Ningxia, and 10 m in the other provinces. The distribution maps are produced using the time-weighted dynamic time warping (TWDTW) method based on Sentinel-1 and Sentinel-2 images. The overall accuracy over the 21 provincial administrative regions was 85.23% on average based on 108195 samples, and the average R² was 0.83 over three years compared with county-level statistical planting areas.
Classification system:
0: non-single-season rice
1: single-season rice
Updates:
v1.1: Compared with the distribution map of double-season rice from Pan et al.'s dataset (2021) (https://doi.org/10.3390/rs13224609), the two datasets overlapped in provinces where both single- and double-season rice were cultivated, including Anhui, Hubei, Hunan, Jiangxi, Zhejiang, Fujian, and Guangxi. Version 1.1 of this dataset was updated by reclassifying the overlapped pixels to ensure that the two datasets no longer overlap.
v1.2: Fix error caused by the caliber of late rice statistics in Fujian Province.
2024-02-23: Update distribution maps for the year 2023.
2025-04-01: Update distribution maps for the year 2024.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Per- and polyfluoroalkyl substances (PFAS) are synthetic chemicals that are increasingly being detected in groundwater. The negative health consequences associated with human exposure to PFAS make it essential to quantify the distribution of PFAS in groundwater systems. Mapping PFAS distributions is particularly challenging because a national patchwork of testing and reporting requirements has resulted in sparse and spatially biased data. In this analysis, an inhomogeneous Poisson process (IPP) modeling approach is adopted from ecological statistics to continuously map PFAS distributions in groundwater across the contiguous United States. The model is trained on a unique data set of 8910 PFAS groundwater measurements, using combined concentrations of two PFAS analytes. The IPP model predictions are compared with results from random forest models to highlight the robustness of this statistical modeling approach on sparse data sets. This analysis provides a new approach to not only map PFAS contamination in groundwater but also prioritize future sampling efforts.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Basic summary statistics for all the maps, including the grid size, total population, maximum population density at 1 km², the percentage of pixels that were empty or urban (with >1000 people per km²), and the Pareto Number, defined as the percentage x that holds a percentage 1 − x of the population.
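One way to read the Pareto Number definition is sketched below. This is an illustration under stated assumptions, not the method used to build the dataset: it assumes that x refers to the share of most-populated pixels, that pixels are ranked in descending order of population, and that the first rank where the two shares sum to at least 100% is an acceptable discrete approximation. The function name and the sample grids are hypothetical.

```python
def pareto_number(populations):
    """Approximate Pareto Number of a gridded population, in percent.

    Returns the smallest share x of the most-populated pixels such that
    those pixels hold at least (100 - x) percent of the total population.
    Assumption: pixels are ranked by population, largest first.
    """
    cells = sorted(populations, reverse=True)
    total = sum(cells)
    if total <= 0:
        raise ValueError("total population must be positive")
    n = len(cells)
    running = 0.0
    for k, value in enumerate(cells, start=1):
        running += value
        # share of pixels used so far + share of population they hold
        if k / n + running / total >= 1.0:
            return 100.0 * k / n
    return 100.0


# A concentrated grid: one of four pixels holds 75% of the people.
print(pareto_number([6, 1, 1, 0]))   # 25.0
# A perfectly even grid: half of the pixels hold half of the people.
print(pareto_number([1, 1, 1, 1]))   # 50.0
```

Under this reading, a lower value indicates a more concentrated population, and 50 corresponds to a perfectly even spread.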
Node of the Institute of Statistics and Cartography of Andalusia. Regional Government of Andalusia. WMS Population Mesh Service. Integrated in the Spatial Data Infrastructure of Andalusia following the guidelines of the Statistical and Cartographic System of Andalusia. WMS map service of the spatial distribution of the population of Andalusia in cells of 250 m x 250 m. The information represented in these maps has been georeferenced from the location of the postal address where each of the inhabitants of Andalusia resides. To facilitate the representation of the information and to preserve statistical confidentiality, a regular mesh has been drawn with cells of 250 meters on each side, into which all the corresponding information has been aggregated. Information that could not be georeferenced has been estimated using spatial analysis techniques. The demographic statistical information of the population data, corresponding to January 1, 2018, was published on December 23, 2019. The website of the Institute of Statistics and Cartography of Andalusia offers a visualization service, "Spatial distribution of the population of Andalusia", for interactive consultation: https://www.juntadeandalucia.es/institutodeestadisticaycartografia/distributionpob/index.htm
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Probability and uncertainty maps showing the potential and realized distribution of the common hazel (Corylus avellana, L.) for Europe, from the dataset prepared by Bonannella et al. (2022) and predicted using Ensemble Machine Learning (EML). The potential distribution maps cover the period 2018 - 2020; the realized distribution maps cover the period 2000 - 2020, split into the following time periods:
2000 - 2002,
2002 - 2006,
2006 - 2010,
2010 - 2014,
2014 - 2018,
2018 - 2020.
Files are named according to the following naming convention, e.g.:
veg_corylus.avellana_anv.eml_md_30m_0..0cm_2000..2002_eumap_epsg3035_v0.3
with the following fields:
theme: e.g. veg,
species code: e.g. corylus.avellana,
species distribution type: e.g. anv (= actual natural vegetation),
species estimation method: e.g. eml,
species estimation type: e.g. md ( = model deviation),
resolution in meters: e.g. 30m,
reference depths (vertical dimension): e.g. 0..0cm,
reference period begin end: e.g. 2000..2002,
reference area: e.g. eumap,
coordinate system: e.g. epsg3035,
data set version: e.g. v0.3.
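The naming convention above can be unpacked programmatically. The following is a minimal sketch, assuming the name is passed without a file extension and always has the ten underscore-separated parts of the example; the function name and the dictionary keys are hypothetical and are not part of the dataset's tooling.

```python
def parse_layer_name(name):
    """Split a layer name into the fields of the naming convention.

    Assumptions: no file extension, exactly ten underscore-separated
    parts, and a 'begin..end' reference period as in the example.
    """
    parts = name.split("_")
    if len(parts) != 10:
        raise ValueError(f"expected 10 underscore-separated parts, got {len(parts)}")
    (theme, species, type_method, estimation_type, resolution,
     depths, period, area, crs, version) = parts
    # 'anv.eml' carries two fields: distribution type and estimation method
    distribution_type, method = type_method.split(".", 1)
    begin, end = period.split("..")
    return {
        "theme": theme,
        "species_code": species,
        "distribution_type": distribution_type,
        "estimation_method": method,
        "estimation_type": estimation_type,
        "resolution": resolution,
        "reference_depths": depths,
        "period_begin": int(begin),
        "period_end": int(end),
        "reference_area": area,
        "coordinate_system": crs,
        "version": version,
    }


fields = parse_layer_name(
    "veg_corylus.avellana_anv.eml_md_30m_0..0cm_2000..2002_eumap_epsg3035_v0.3"
)
print(fields["species_code"], fields["estimation_type"], fields["period_begin"])
```

Splitting on underscores first keeps the dots inside the species code, the depth range and the version string intact.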
For each species it is then easy to identify the probability and uncertainty distribution maps:
veg_corylus.avellana_anv.eml_md: model uncertainty for realized distribution
veg_corylus.avellana_anv.eml_p: probability for realized distribution
veg_corylus.avellana_pnv.eml_md: model uncertainty for potential distribution
veg_corylus.avellana_pnv.eml_p: probability for potential distribution
Files are provided as Cloud Optimized GeoTIFFs and projected in the Coordinate Reference System ETRS89 / LAEA Europe (= EPSG code 3035). Styling files are provided in both SLD and QML format.
If you would like to know more about the creation of the maps and the modeling:
watch the talk at Open Data Science Workshop 2021 (TIB AV-PORTAL)
access the repository with our R/Python scripts and follow the instructions (GitLab)
access the repository with the training dataset (Zenodo)
read the tutorial with executable code on our GitBook
A publication describing, in detail, all processing steps, accuracy assessment and general analysis of species distribution maps is available on PeerJ. To suggest any improvement/fix use https://gitlab.com/geoharmonizer_inea/spatial-layers/-/issues.
This layer shows total population count by sex and age group. This is shown by tract, county, and state boundaries. This service is updated annually to contain the most currently released American Community Survey (ACS) 5-year data, and contains estimates and margins of error. There are also additional calculated attributes related to this topic, which can be mapped or used within analysis. This layer is symbolized to show the percentage of the population that are considered dependent (ages 65+ and <18). To see the full list of attributes available in this service, go to the "Data" tab, and choose "Fields" at the top right.
Current Vintage: 2019-2023
ACS Table(s): B01001
Data downloaded from: Census Bureau's API for American Community Survey
Date of API call: December 12, 2024
National Figures: data.census.gov
The United States Census Bureau's American Community Survey (ACS):
About the Survey
Geography & ACS
Technical Documentation
News & Updates
This ready-to-use layer can be used within ArcGIS Pro, ArcGIS Online, its configurable apps, dashboards, Story Maps, custom apps, and mobile apps. Data can also be exported for offline workflows. For more information about ACS layers, visit the FAQ. Please cite the Census and ACS when using this data.
Data Note from the Census:
Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see Accuracy of the Data). The effect of nonsampling error is not represented in these tables.
Data Processing Notes:
This layer is updated automatically when the most current vintage of ACS data is released each year, usually in December. The layer always contains the latest available ACS 5-year estimates. It is updated annually within days of the Census Bureau's release schedule. Click here to learn more about ACS data releases.
Boundaries come from the US Census TIGER geodatabases, specifically, the National Sub-State Geography Database (named tlgdb_(year)_a_us_substategeo.gdb). Boundaries are updated at the same time as the data updates (annually), and the boundary vintage appropriately matches the data vintage as specified by the Census. These are Census boundaries with water and/or coastlines erased for cartographic and mapping purposes. For census tracts, the water cutouts are derived from a subset of the 2020 Areal Hydrography boundaries offered by TIGER. Water bodies and rivers which are 50 million square meters or larger (mid to large sized water bodies) are erased from the tract level boundaries, as well as additional important features. For state and county boundaries, the water and coastlines are derived from the coastlines of the 2023 500k TIGER Cartographic Boundary Shapefiles. These are erased to more accurately portray the coastlines and Great Lakes. The original AWATER and ALAND fields are still available as attributes within the data table (units are square meters).
The States layer contains 52 records - all US states, Washington D.C., and Puerto Rico.
Census tracts with no population that occur in areas of water, such as oceans, are removed from this data service (Census Tracts beginning with 99).
Percentages and derived counts, and associated margins of error, are calculated values (that can be identified by the "_calc_" stub in the field name), and abide by the specifications defined by the American Community Survey.
Field alias names were created based on the Table Shells file available from the American Community Survey Summary File Documentation page.
Negative values (e.g., -4444...) have been set to null, with the exception of -5555... which has been set to zero. These negative values exist in the raw API data to indicate the following situations:
The margin of error column indicates that either no sample observations or too few sample observations were available to compute a standard error and thus the margin of error. A statistical test is not appropriate.
Either no sample observations or too few sample observations were available to compute an estimate, or a ratio of medians cannot be calculated because one or both of the median estimates falls in the lowest interval or upper interval of an open-ended distribution.
The median falls in the lowest interval of an open-ended distribution, or in the upper interval of an open-ended distribution. A statistical test is not appropriate.
The estimate is controlled. A statistical test for sampling variability is not appropriate.
The data for this geographic area cannot be displayed because the number of sample cases is too small.
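The margin-of-error interpretation in the Census data note can be illustrated with a short sketch. The function names and the sample figures are hypothetical. The divisor 1.645 is the z-score conventionally associated with a 90 percent margin of error, and is used here to recover an approximate standard error, a step the note itself does not describe.

```python
Z_90 = 1.645  # z-score for a 90 percent confidence level (assumption stated above)


def confidence_bounds(estimate, moe):
    """Lower and upper 90 percent confidence bounds: estimate -/+ margin of error."""
    return estimate - moe, estimate + moe


def standard_error(moe):
    """Approximate standard error implied by a 90 percent margin of error."""
    return moe / Z_90


# Hypothetical tract: 1000 people in an age group, margin of error of 120.
lower, upper = confidence_bounds(1000, 120)
print(lower, upper)
print(round(standard_error(120), 2))
```

Null margins of error (the negative sentinel values described above) should be filtered out before calling either function, since no interval can be formed for them.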