Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Travel regions are not necessarily defined by political or administrative boundaries. For example, in the Schengen region of Europe, tourists can travel freely across national borders. Identifying transboundary travel regions is an interesting problem, which we aim to solve using mobility analysis of Twitter users. Our proposed solution comprises collecting geotagged tweets, combining them into trajectories and, thus, mining thousands of trips undertaken by Twitter users. After aggregating these trips into a mobility graph, we apply a community detection algorithm to find coherent regions throughout the world. The discovered regions provide insights into international travel and can reveal both domestic and transnational travel regions.
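The aggregation step described above can be illustrated with a minimal sketch. It assumes trips have already been reduced to (origin, destination) pairs; the function names, the sample place names, and the `min_trips` threshold are hypothetical. The grouping step is a simple thresholded connected-components pass, used here only as a stand-in for a community detection algorithm, and is not the method used by the authors.

```python
from collections import Counter, defaultdict

def build_mobility_graph(trips):
    """Aggregate (origin, destination) trips into undirected weighted edges."""
    edges = Counter()
    for origin, destination in trips:
        if origin != destination:  # ignore trips that stay in one place
            edges[tuple(sorted((origin, destination)))] += 1
    return edges

def threshold_regions(edges, min_trips):
    """Group places connected by edges with at least min_trips trips.

    This is plain connected components on a thresholded graph, a crude
    stand-in for community detection (assumption, not the source's method).
    """
    adjacency = defaultdict(set)
    for (a, b), weight in edges.items():
        _ = adjacency[a], adjacency[b]  # register both nodes, even if isolated
        if weight >= min_trips:
            adjacency[a].add(b)
            adjacency[b].add(a)
    seen, regions = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        regions.append(sorted(component))
    return regions

# Hypothetical trips, for illustration only
trips = [
    ("Munich", "Salzburg"), ("Salzburg", "Munich"), ("Munich", "Salzburg"),
    ("Lyon", "Geneva"), ("Geneva", "Lyon"), ("Munich", "Lyon"),
]
graph = build_mobility_graph(trips)
regions = threshold_regions(graph, min_trips=2)
```

In this toy input, both resulting regions cross a national border, which is the kind of transboundary grouping the description refers to.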
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is a collection of point-of-interest (POI) datasets for the cities of Shenzhen, Guangzhou, Beijing, and Shanghai, China.
From the site: "Coal Pillar Locations are pillars of coal that must remain in place to provide support for a coal mine."
Spatially continuous data of environmental variables is often required for marine conservation and management. However, information for environmental variables is usually collected by point sampling, particularly for the deep ocean. Thus, methods that generate such spatially continuous data by using point samples to estimate values for unknown locations become essential tools. Such methods are, however, often data- or even variable-specific, and it is difficult to select an appropriate method for any given dataset. In this study, 14 methods (37 sub-methods) are compared using samples of mud content with five levels of sample density across the southwest Australian margin. Bathymetry, distance to coast, and slope were used as secondary variables. Ten-fold cross validation with relative mean absolute error (RMAE) and visual examination were used to assess the performance of these methods. A total of 1,850 prediction datasets were produced and used to assess the performance of the methods. Considering both the accuracy and the visual examination, we found that a combined method, random forest and ordinary kriging (RKrf), is the most robust method. No threshold in sample density was detected in relation to prediction accuracy. No consistent patterns were observed between the performance of the methods and data variation. The RMAE of the three most accurate methods is about 30% lower than that of the best methods in previous publications, highlighting the robustness of the methods selected in this study. The limitations of this study are discussed and a number of suggestions are provided for further studies.
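The evaluation procedure mentioned above (k-fold cross validation scored by RMAE) can be sketched as follows. This assumes RMAE is the mean absolute error expressed as a percentage of the mean observed value, which is a common definition but is not spelled out in the description. Inverse distance weighting is used as a simple placeholder predictor; it is not the RKrf method, and the interleaved fold assignment and synthetic samples are illustrative assumptions.

```python
def rmae(observed, predicted):
    """Relative mean absolute error in percent (assumed definition: MAE / mean observed * 100)."""
    mae = sum(abs(p - o) for o, p in zip(observed, predicted)) / len(observed)
    return 100.0 * mae / (sum(observed) / len(observed))

def idw_predict(train, x, y, power=2.0):
    """Inverse distance weighted estimate at (x, y) from (x, y, value) samples."""
    numerator = denominator = 0.0
    for tx, ty, value in train:
        squared_distance = (tx - x) ** 2 + (ty - y) ** 2
        if squared_distance == 0:
            return value  # exact hit on a sample location
        weight = 1.0 / squared_distance ** (power / 2)
        numerator += weight * value
        denominator += weight
    return numerator / denominator

def cross_validated_rmae(samples, k=10):
    """k-fold cross validation with deterministic, interleaved folds."""
    observed, predicted = [], []
    for fold in range(k):
        test = samples[fold::k]
        train = [s for i, s in enumerate(samples) if i % k != fold]
        for x, y, value in test:
            observed.append(value)
            predicted.append(idw_predict(train, x, y))
    return rmae(observed, predicted)

# Synthetic samples (x, y, mud content in %), for illustration only
samples = [(float(i), float(i % 3), 40.0 + 2.0 * (i % 5)) for i in range(30)]
score = cross_validated_rmae(samples, k=10)
```

A real comparison would shuffle samples before assigning folds and would swap `idw_predict` for each candidate method while keeping the folds fixed, so that all methods are scored on the same held-out points.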
Data from the article "Unraveling spatial, structural, and social country-level conditions for the emergence of the foreign fighter phenomenon: an exploratory data mining approach to the case of ISIS", by Agustin Pájaro, Ignacio J. Duran and Pablo Rodrigo, published in Revista DADOS, v. 65, n. 3, 2022.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The increasing popularity of social networks has made them a viable data source for many data mining applications, and event detection is no exception. Researchers aim not only to find events that happen in networks but, more importantly, to identify and locate events occurring in the real world. In this paper, we propose an enhanced version of the quadtree, the convolutional quadtree (ConvTree), and demonstrate its advantage over the standard quadtree. We also introduce an algorithm for searching for events of different scales using geospatial data obtained from social networks. The algorithm is based on statistical analysis of historical data, generation of ConvTrees representing the normal state of the city, and evaluation of anomalies for event detection. An experimental study conducted on a dataset of 60 million geotagged Instagram posts in the New York City area demonstrates that the proposed approach is able to find a wide range of events, from very local (indie band concert or wedding party) to city-scale (baseball game or holiday march) and even country-scale (political protest or Christmas) events. This opens up the prospect of building a simple and fast yet powerful system for real-time multiscale event monitoring.
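For readers unfamiliar with the baseline data structure, the following is a minimal sketch of a standard point quadtree that subdivides a cell once it holds more than a fixed number of points. It is not the ConvTree proposed in the paper; the class name, the `capacity` and `max_depth` parameters, and the sample coordinates are assumptions made for illustration.

```python
class QuadTree:
    """Standard point quadtree over the half-open box [x0, x1) x [y0, y1)."""

    def __init__(self, x0, y0, x1, y1, capacity=4, depth=0, max_depth=8):
        self.bounds = (x0, y0, x1, y1)
        self.capacity = capacity
        self.depth = depth
        self.max_depth = max_depth
        self.points = []      # points stored while this node is a leaf
        self.children = None  # four sub-cells after a split
        self.count = 0        # number of points inside this cell

    def insert(self, x, y):
        x0, y0, x1, y1 = self.bounds
        if not (x0 <= x < x1 and y0 <= y < y1):
            return False  # point lies outside this cell
        self.count += 1
        if self.children is None:
            self.points.append((x, y))
            if len(self.points) > self.capacity and self.depth < self.max_depth:
                self._split()
            return True
        return any(child.insert(x, y) for child in self.children)

    def _split(self):
        x0, y0, x1, y1 = self.bounds
        mx, my = (x0 + x1) / 2, (y0 + y1) / 2
        quadrants = ((x0, y0, mx, my), (mx, y0, x1, my),
                     (x0, my, mx, y1), (mx, my, x1, y1))
        self.children = [
            QuadTree(*q, self.capacity, self.depth + 1, self.max_depth)
            for q in quadrants
        ]
        for px, py in self.points:  # push stored points down one level
            any(child.insert(px, py) for child in self.children)
        self.points = []

    def leaves(self):
        if self.children is None:
            yield self
        else:
            for child in self.children:
                yield from child.leaves()

# Hypothetical post locations: a dense cluster plus one isolated point
tree = QuadTree(0, 0, 16, 16, capacity=2)
for point in [(1.0, 1.0), (1.5, 1.2), (1.2, 1.8), (12.0, 12.0)]:
    tree.insert(*point)
leaf_depths = sorted(leaf.depth for leaf in tree.leaves() if leaf.count > 0)
```

Dense areas end up in small, deep cells while sparse areas stay in large, shallow ones, which is what makes a quadtree suitable for comparing activity counts at several spatial scales.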
Chytridiomycosis, caused by the fungal pathogen Batrachochytrium dendrobatidis (Bd), is a major driver of amphibian decline worldwide. The global presence of Bd is driven by a synergy of factors, such as climate, species life history, and amphibian host susceptibility. Here, using a Bayesian data-mining approach, we modeled the epidemiological landscape of Bd to evaluate how infection varies across several spatial, ecological, and phylogenetic scales. We compiled global information on Bd occurrence, climate, species ranges, and phylogenetic diversity to infer the potential distribution and prevalence of Bd. By calculating the degree of co-distribution between Bd and our set of environmental and biological variables (e.g. climate and species), we identified the factors that could potentially be related to Bd presence and prevalence using a geographic correlation metric, epsilon (ε). We fitted five ecological models based on 1) amphibian species identity, 2) phylogenetic species varia..., Usage notes
These datasets include the geographic data used to build ecological and geographical models for Batrachochytrium dendrobatidis, as well as supplementary results of the following paper: Basanta et al. Epidemiological landscape of Batrachochytrium dendrobatidis and its impact on amphibian diversity at the global scale. Missing values are denoted by NA. Details for each dataset are provided in the README file. Datasets included:
Information on Bd records. Table S1.xls contains Bd occurrence records and prevalence of infection from the Bd-Maps online database (http://www.bd-maps.net; Olson et al. 2013), accessed in 2013, and from a Google Scholar search for recent papers with Bd infection reports using the keyword 'Batrachochytrium dendrobatidis'. We excluded records from studies of captive individuals and those without coordinates, keeping only records in which coordinates reflected site-specific sample locations. Supplementary figures: Supplementary information S1.docx cont...
Title of Dataset: Epidemiological landscape of Batrachochytrium dendrobatidis and its impact on amphibian diversity at global scale
M. Delia Basanta Department of Biology, University of Nevada Reno. Reno, Nevada, USA.
Julián A. Velasco Instituto de Ciencias de la Atmósfera y Cambio Climático, Universidad Nacional Autónoma de México. Ciudad de México, México.
Constantino González-Salazar. Instituto de Ciencias de la Atmósfera y Cambio Climático, Universidad Nacional Autónoma de México. Ciudad de México, México.
The Project Approval Boundary spatial data set provides information on the location of the project approvals granted for each mine in NSW by an approval authority (either NSW Department of Planning or local Council). This information may not align with the mine authorisation (i.e. mine title etc.) granted under the Mining Act 1992. This information is created and submitted by each large mine operator to fulfil the Final Landuse and Rehabilitation Plan data submission requirements under Schedule 8A of the Mining Regulation 2016.
The collection of this spatial data is administered by the Resources Regulator in NSW, who conducts reviews of the data submitted for assessment purposes. In some cases, information provided may contain inaccuracies that require adjustment following the assessment process by the Regulator. The Regulator will request data resubmission if issues are identified.
Further information on the reporting requirements associated with mine rehabilitation can be found at https://www.resourcesregulator.nsw.gov.au/rehabilitation/mine-rehabilitation.
Find more information about the data at https://www.seed.nsw.gov.au/project-approvals-boundary-layer
Any data related questions should be directed to nswresourcesregulator@service-now.com
From the site: "Coverages containing industrial mineral mining data by quadrangle for the state of Pennsylvania. Digitized from the Harrisburg Bureau of Mining and Reclamation mylar map system each quadrangle contains multiple coverages identifying seams in that quad. Also includes coverages indicating coal mining refuse disposal sites, permitted sites, point coverages of deep mine entry and other surface features of deep mines and Small Operators Assistance Program (SOAP) areas."
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Land cover is the visible, biophysical cover on the Earth’s surface including trees, shrubs, grasses, soils, exposed rocks and water bodies, as well as anthropogenic elements such as plantations, crops and built environments. Land cover changes for many reasons, including seasonal weather, severe weather events such as cyclones, floods and fires, and human activities such as mining, agriculture and urbanisation. Remote sensing data recorded over a period of time allows the observation of land cover dynamics. Classifying these responses provides a robust and repeatable way of characterising land cover types. These classifications complement on-ground surveys where available.
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0) https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
With the increase of mining activities in China, the ecological environment around mines is facing unprecedented pressure, and a series of resource, ecological and environmental problems such as air pollution, soil erosion, solid waste pollution, landslides and debris flows have occurred. In this paper, high-resolution optical remote sensing imagery (GF-6, 8 m resolution) was used to derive an open-pit mining dataset for the Bohai Rim through rule-based feature extraction, time-series image discrimination and visual interpretation. This dataset can be used to analyze the temporal and spatial pattern of open-pit mining in the Bohai Rim region, as well as the correlation between open-pit mining distribution and influencing factors such as economic development.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains the results for a journal paper titled "Spatial Parameters for Circular Construction Hubs: Location Criteria for a Circular Built Environment". For this research, we reviewed policy documents and interviewed experts to identify the spatial parameters (or location requirements) for circular construction hubs, which are facilities that collect, store, and redistribute construction waste as secondary resources. The included files document the research process and results:
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Statistically significant hotspots of fishing activities in the Mediterranean and Atlantic Seas were identified by applying the Getis-Ord Gi statistic (Getis and Ord 2010) in the statistical software R, using the globalG.test function (spdep package). The function computes a global test for spatial autocorrelation using a Monte Carlo simulation approach. It tests the null hypothesis of no autocorrelation against the alternative hypothesis of positive spatial autocorrelation. The local spatial autocorrelation was then tested by calculating the Gi statistic with the local_g_perm function (sfdep package), which indicates the strength of the clustering.
Categorization of hotspots was performed, according to the Gi value and the p-value of a folded permutation test obtained for each grid cell, as follows:
Grid cells with a p-value > 0.1 were categorized as Insignificant.
The analyses were performed on cumulative fishing activity data at 0.5° resolution of seven different gears separately for the two macroareas.
For each area, the dataset includes maps of the hotspots of each gear and spatial layers of the gear hotspots (.shp; .csv).
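To make the local statistic used in the hotspot analysis above more concrete, the following is a minimal sketch of the Getis-Ord Gi* z-score on a small grid. It assumes binary rook-contiguity weights that include the focal cell, which may differ from the weights used for the published layers. It does not reproduce the permutation-based p-values returned by `local_g_perm`, and the grid values are invented for illustration.

```python
import math

def rook_neighbors(rows, cols):
    """Indices of the up/down/left/right neighbours of every cell in a grid."""
    result = []
    for r in range(rows):
        for c in range(cols):
            cell = []
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    cell.append(rr * cols + cc)
            result.append(cell)
    return result

def getis_ord_gi_star(values, neighbors):
    """Gi* z-scores with binary weights (assumption); the focal cell is included."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum(v * v for v in values) / n - mean ** 2)
    scores = []
    for i in range(n):
        members = set(neighbors[i]) | {i}
        w_sum = len(members)  # with binary weights, sum(w) equals sum(w^2)
        local_sum = sum(values[j] for j in members)
        denominator = std * math.sqrt((n * w_sum - w_sum ** 2) / (n - 1))
        scores.append((local_sum - mean * w_sum) / denominator)
    return scores

# Hypothetical cumulative fishing hours on a 3 x 3 grid (row-major order)
hours = [9, 8, 1,
         7, 6, 1,
         1, 1, 1]
scores = getis_ord_gi_star(hours, rook_neighbors(3, 3))
```

Large positive scores mark cells surrounded by high values (candidate hotspots) and large negative scores mark clusters of low values; in the described workflow, significance is then judged from the permutation test rather than from the score alone.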
This geodatabase reflects the U.S. Geological Survey’s (USGS) ongoing commitment to its mission of understanding the nature and distribution of global mineral commodity supply chains by updating and publishing the georeferenced locations of mineral commodity production and processing facilities, mineral exploration and development sites, and mineral commodity exporting ports in Africa. The geodatabase and geospatial data layers serve to create a new geographic information product in the form of a geospatial portable document format (PDF) map. The geodatabase contains data layers from USGS, foreign governmental, and open-source sources as follows: (1) mineral production and processing facilities, (2) mineral exploration and development sites, (3) mineral occurrence sites and deposits, (4) undiscovered mineral resource tracts for Gabon and Mauritania, (5) undiscovered mineral resource tracts for potash, platinum-group elements, and copper, (6) coal occurrence areas, (7) electric power generating facilities, (8) electric power transmission lines, (9) liquefied natural gas terminals, (10) oil and gas pipelines, (11) undiscovered, technically recoverable conventional and continuous hydrocarbon resources (by USGS geologic/petroleum province), (12) cumulative production, and recoverable conventional resources (by oil- and gas-producing nation), (13) major mineral exporting maritime ports, (14) railroads, (15) major roads, (16) major cities, (17) major lakes, (18) major river systems, (19) first-level administrative division (ADM1) boundaries for all countries in Africa, and (20) international boundaries for all countries in Africa.
Spatial interpolation methods for generating spatially continuous data from point samples of environmental variables are essential for environmental management and conservation. They may fall into three groups: non-geostatistical methods (e.g., inverse distance weighting), geostatistical methods (e.g., ordinary kriging) and combined/hybrid methods (e.g., regression kriging); and their performance is often data-specific (Li and Heap, 2008). Because of the robustness of machine learning methods, such as random forest and support vector machine, in data mining fields, we introduced them into spatial statistics by applying them to the spatial predictions of seabed mud content in combination with existing spatial interpolation methods (Li et al., 2011). This development can be viewed as an extension of the combined methods from statistical methods to the machine learning field. These applications have significantly improved the prediction accuracy and opened an alternative source of methods for spatial interpolation. Given that they have only been applied to one variable, several questions remain, namely: are they dataset-specific? How reliable are their predictions for different datasets and variables? Could other machine learning methods (such as boosted regression trees) improve the spatial interpolations? To address these questions, we experimentally compared the predictions of several methods for sand content on the southwest Australian marine margin. We tested a variety of existing spatial interpolation methods, machine learning methods and their combinations. In this study, we discuss the experimental results and the value of this advancement in spatial interpolation, visually examine the spatial predictions, and compare the results with the findings in previous publications. The outcomes of this study can be applied to the spatial prediction of marine and terrestrial environmental variables.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
These datasets provide the data underlying the publication "Lines in the sand: quantifying the cumulative development footprint in the world’s largest remaining temperate woodland" (https://link.springer.com/article/10.1007/s10980-017-0558-z). The datasets are: (A) data in csv format: [1] development footprint by sample area: Information on the 24 sample areas (~490 km^2 each) assessed in the study, including the different infrastructure types (roads, railways, mapped tracks, un-mapped tracks, which were manually digitized in the study using aerial imagery, and hub infrastructure such as mine pits and waste rock dumps, also manually digitized in the study). Also contains some key co-variables assessed as potential explanatory variables for development footprint. The region-wide modelling of development footprint found strong positive effects of mining project density and pastoralism, as well as a highly significant negative interaction between the two. At low mining project densities, development footprints are more extensive in pastoral areas, but at high mining project densities, pastoral areas are relatively less developed than non-pastoral areas, on average. [2] Great Western Woodlands (GWW) 20 km grid: This dataset provides data for the 20x20 km grid placed over the whole GWW and used for the regional estimation of development footprint, linear infrastructure density, and linear infrastructure type based on the region-wide analysis. Data is provided for each cell in the grid and includes the total length of roads in that grid cell, MINEDEX mining projects, pastoral status, etc. This dataset was used to project the data from the 24 study areas across the whole of the Great Western Woodlands and calculate region-wide estimates of development footprint and linear infrastructure lengths.
[3] disturbance by patch: This dataset provides the data for each patch for the analysis of patch-level drivers of development footprint, which was performed to gain further insights into the effects of other landscape variables than what could be gleaned from the region-wide analysis. For this analysis, we divided sample areas into polygonal patch types, each with a unique combination of the following categorical co-variables: pastoral tenure, greenstone lithology, conservation tenure, ironstone formation, schedule-1 area clearing restrictions, environmentally sensitive area designation, vegetation formation, and sample area. For each patch type (n=261), we calculated the following attributes: a) number of mining projects, b) number of dead mineral tenements, c) sum of duration of all live and dead tenements, d) type of tenements (exploration/prospecting tenement, mining and related activities tenement, none), e) primary target commodity (gold, nickel, iron-ore, other), f) distance to wheatbelt, and g) distance to the nearest town. [4] mapped versus digitized tracks: This dataset provides mapped and un-mapped track widths, measured using high-resolution aerial imagery at a minimum of 20 randomly generated locations within each of the 24 sample areas. Pastoral tenure and mining intensity for each sample area are included for analysis purposes. [5] edge effect scenarios: Hypothetical edge effect zones were created, based on effect zones gleaned from the literature and arranged under three scenarios, to reflect potential risks of offsite impacts in areas adjacent to the development footprints observed (see appendix 3 of the article). The calculated proportion of the entire GWW within edge effect zones varied from ~3% under the conservative scenario to ~35% under the maximal scenario.
Within the range of development footprints observed in this study, the proportion of a landscape that lies within edge effect zones increases hyperbolically with the number of mining projects, and approaches 100% in the maximal scenario, 60% in the moderate scenario, and ~20% under the conservative scenario. (B) shapefiles: [6] Great Western Woodlands boundary, [7] sample areas (layer file shows sample areas by category).
U.S. Government Works https://www.usa.gov/government-works
License information was derived automatically
The U.S. Geological Survey (USGS) has compiled a geodatabase containing mineral-related geospatial data for 10 countries of interest in Southwest Asia (area of study): Afghanistan, Cambodia, Laos, India, Indonesia, Iran, Nepal, North Korea, Pakistan, and Thailand. The data can be used in analyses of the extractive fuel and nonfuel mineral industries and related economic and physical infrastructure integral for the successful operation of the mineral industries within the area of study as well as the movement of mineral products across domestic and global markets. This geodatabase reflects the USGS ongoing commitment to its mission of understanding the nature and distribution of global mineral commodity supply chains by updating and publishing the georeferenced locations of mineral commodity production and processing facilities, mineral exploration and development sites, and mineral commodity exporting ports for the countries in the area of study. The geodatabase contains data feat ...
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The GRASS GIS database containing the input raster layers needed to reproduce the results from the manuscript entitled:
"Mapping forests with different levels of naturalness using machine learning and landscape data mining" (under review)
Abstract:
To conserve biodiversity, it is imperative to maintain and restore sufficient amounts of functional habitat networks. Hence, locating remaining forests with natural structures and processes over landscapes and large regions is a key task. We integrated machine learning (Random Forest) and wall-to-wall open landscape data to scan all forest landscapes in Sweden at a 1 ha spatial resolution with respect to the relative likelihood of hosting High Conservation Value Forests (HCVF). Using independent spatial stand- and plot-level validation data, we confirmed that our predictions (ROC AUC in the range of 0.89 - 0.90) correctly represent forests with different levels of naturalness, from deteriorated forests to those with high naturalness and associated biodiversity conservation values. Given ambitious national and international conservation objectives, and increasingly intensive forestry, our model and the resulting wall-to-wall mapping fill an urgent gap for assessing fulfilment of evidence-based conservation targets, spatial planning, and designing forest landscape restoration.
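As a reminder of what the ROC AUC figure quoted in the abstract measures, here is a minimal sketch that computes it as the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case, with ties counted as one half. The labels and scores are invented for illustration and are unrelated to the study's validation data.

```python
def roc_auc(labels, scores):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in positives
        for q in negatives
    )
    return wins / (len(positives) * len(negatives))

# Hypothetical validation plots: 1 = high conservation value forest, 0 = other forest
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
auc = roc_auc(labels, scores)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, so values near 0.9 indicate that the predicted likelihood ranks most validated high-value forests above other forests.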
This database was compiled from the following sources:
1. HCVF. A database of High Conservation Value Forests in Sweden. Swedish Environmental Protection Agency.
source: https://geodata.naturvardsverket.se/nedladdning/skogliga_vardekarnor_2016.zip
2. NMD. National Land Cover Data. Swedish Environmental Protection Agency.
3. DEM. Terrain Model Download, grid 50+. Lantmateriet, Swedish Ministry of Finance.
source: https://www.lantmateriet.se/en/geodata/geodata-products/product-list/terrain-model-download-grid-50/
4. GFC. Global Forest Change. Global Land Analysis and Discovery, University of Maryland.
source: https://glad.earthengine.app
5. LIGHTS. A harmonized global nighttime light dataset 1992–2018. Land pollution with night-time lights expressed as calibrated digital numbers (DN).
source: https://doi.org/10.6084/m9.figshare.9828827.v2
6. POPULATION. Total Population in Sweden. Statistics Sweden.
source: https://www.scb.se/en/services/open-data-api/open-geodata/grid-statistics/
To learn more about the GRASS GIS database structure, see:
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The mine rehabilitation dataset provides information on where mining operations in NSW have conducted, and are forecast to conduct, ground disturbance and rehabilitation activities, as well as the final landuse and landform following the completion of mining and rehabilitation activities. This information is created and submitted by each large mine operator to fulfil the spatial data submission requirements under Schedule 8A of the Mining Regulation 2016. The collection of this spatial data is administered by the Resources Regulator in NSW, who conducts reviews of the data submitted for assessment purposes. In some cases, information provided may contain inaccuracies that require adjustment following the assessment process by the Regulator. The Regulator will request data resubmission if issues are identified. Further information on the reporting requirements associated with mine rehabilitation can be found at https://www.resourcesregulator.nsw.gov.au/rehabilitation/mine-rehabilitation.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Fishing effort data publicly accessible through the Global Fishing Watch (GFW) site (GFW, 2022) have been used to investigate the spatiotemporal distribution of fishing activities in a wide area (Latitude: 30.7°N - 66°N; Longitude: 14.4°W - 41.9°E).
Daily fishing effort data by flag state and vessel class at 0.01° resolution, from 2015 to 2020, have been filtered and aggregated to obtain fishing effort information of six main gear categories. Categories were created by aggregation of the following vessel classes: fixed gears (pots and traps, set longlines, set gillnets and fixed gear), purse seines (purse seines, tuna purse seines and other purse seines), trawlers (trawlers), drifting longlines (drifting longlines), dredge (dredge fishing) and other (pole and line, fishing, trollers, seiners, other seines and squid jigger). Data were aggregated to obtain cumulative (fahs) and average (mfahs) fishing hours by fishing category at 0.1° and 0.5° resolution.
Maps of fishing activity for each gear have been created for eight main areas at 0.1° (Adriatic Sea, Aegean Sea, Balearic Sea, Baltic Sea, Bay of Biscay, Black Sea, Levantine Sea and North Sea) and at Mediterranean and Atlantic level at 0.5°.
For each gear, the dataset includes spatial layers of the cumulative and average fishing hours at 0.1° and 0.5° resolution (.shp; .csv) and maps of each case study area.
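The aggregation from the 0.01° source grid to coarser cells can be sketched as follows. The record layout, the function name, and the sample values are hypothetical. The sketch also assumes that the average is taken across the years in which a cell has recorded effort; the description does not state how the average fishing hours (mfahs) were actually computed.

```python
from collections import defaultdict

def aggregate_effort(records, factor=10):
    """Aggregate (year, lat, lon, hours) records from a 0.01 degree grid.

    factor=10 gives 0.1 degree cells, factor=50 gives 0.5 degree cells.
    Cells are keyed by integer (row, col) indices to avoid floating-point
    drift when binning coordinates.
    """
    per_year = defaultdict(float)
    for year, lat, lon, hours in records:
        row = round(lat * 100) // factor  # hundredths of a degree, then coarsen
        col = round(lon * 100) // factor
        per_year[(row, col, year)] += hours

    cumulative = defaultdict(float)
    active_years = defaultdict(set)
    for (row, col, year), hours in per_year.items():
        cumulative[(row, col)] += hours
        active_years[(row, col)].add(year)

    # Assumption: average over years with recorded effort in the cell
    average = {cell: cumulative[cell] / len(active_years[cell]) for cell in cumulative}
    return dict(cumulative), average

# Hypothetical records: (year, latitude, longitude, fishing hours)
records = [
    (2015, 43.27, 14.51, 2.0),
    (2015, 43.21, 14.59, 1.0),
    (2016, 43.25, 14.55, 3.0),
    (2016, 43.31, 14.55, 4.0),
]
cumulative, average = aggregate_effort(records, factor=10)
```

Multiplying a cell index by the cell size (for example, row 432 at 0.1° is latitude 43.2°) recovers the lower-left corner of the aggregated cell.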