U.S. Government Works: https://www.usa.gov/government-works
License information was derived automatically
This dataset provides shapefile outlines of the 68 lakes where temperature was modeled as part of this study. The format is a shapefile for all lakes combined (.shp, .shx, .dbf, and .prj files). This dataset is part of a larger data release of lake temperature model inputs and outputs for 68 lakes in the U.S. states of Minnesota and Wisconsin (http://dx.doi.org/10.5066/P9AQPIVD).
This dataset provides shapefile outlines of the 7,150 lakes that had temperature modeled as part of this study. The format is a shapefile for all lakes combined (.shp, .shx, .dbf, and .prj files). A csv file of lake metadata is also included. This dataset is part of a larger data release of lake temperature model inputs and outputs for 7,150 lakes in the U.S. states of Minnesota and Wisconsin (http://dx.doi.org/10.5066/P9CA6XP8).
https://creativecommons.org/publicdomain/zero/1.0/
When I started exploring how to create interactive maps (using the leaflet package in R), I came across this free dataset (shapefile format) that contains the geographical coordinates (polygons) of all the countries in the world. I thought it would be nice to share this with the Kaggle community.
The .zip folder contains all the necessary files needed for the shapefile data to work properly on your computer. If you are new to using the shapefile format, please see the information provided below:
From https://en.wikipedia.org/wiki/Shapefile: "The shapefile format stores the data as primitive geometric shapes like points, lines, and polygons. These shapes, together with data attributes that are linked to each shape, create the representation of the geographic data. The term 'shapefile' is quite common, but the format consists of a collection of files with a common filename prefix, stored in the same directory. The three mandatory files have filename extensions .shp, .shx, and .dbf. The actual shapefile relates specifically to the .shp file, but alone is incomplete for distribution as the other supporting files are required."
Made with Natural Earth. Free vector and raster map data @ naturalearthdata.com.
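If you work in Python rather than R, here is a minimal sketch of loading the unzipped shapefile with the third-party geopandas package (the path below is hypothetical; point it at the .shp file and the sibling .shx/.dbf/.prj files are picked up automatically):

import geopandas as gpd

world = gpd.read_file("world_shape_file/world_countries.shp")  # hypothetical path
print(world.crs)     # coordinate reference system, read from the .prj file
print(world.head())  # attribute table rows, read from the .dbf file
world.plot()         # draw the country polygons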
The Ontario government generates and maintains thousands of datasets. Since 2012, we have shared data with Ontarians via a data catalogue. Open data is data that is shared with the public. Click here to learn more about open data and why Ontario releases it.

Ontario’s Open Data Directive states that all data must be open, unless there is good reason for it to remain confidential. Ontario’s Chief Digital and Data Officer also has the authority to make certain datasets available publicly. Datasets listed in the catalogue that are not open will have one of the following labels:

If you want to use data you find in the catalogue, that data must have a licence – a set of rules that describes how you can use it. A licence: Most of the data available in the catalogue is released under Ontario’s Open Government Licence. However, each dataset may be shared with the public under other kinds of licences or no licence at all. If a dataset doesn’t have a licence, you don’t have the right to use the data. If you have questions about how you can use a specific dataset, please contact us.

The Ontario Data Catalogue endeavors to publish open data in a machine readable format. For machine readable datasets, you can simply retrieve the file you need using the file URL.

The Ontario Data Catalogue is built on CKAN, which means the catalogue has the following features you can use when building applications. APIs (application programming interfaces) let software applications communicate directly with each other. If you are using the catalogue in a software application, you might want to extract data from the catalogue through the catalogue API; a brief example is sketched at the end of this entry. Note: all Datastore API requests to the Ontario Data Catalogue must be made server-side. The catalogue's collection of dataset metadata (and dataset files) is searchable through the CKAN API. The Ontario Data Catalogue has more than just CKAN's documented search fields; you can also search these custom fields. You can also use the CKAN API to retrieve metadata about a particular dataset and check for updated files. Read the complete documentation for CKAN's API. Some of the open data in the Ontario Data Catalogue is available through the Datastore API, where you can search and access the machine-readable open data that is available in the catalogue. How to use the API feature: read the complete documentation for CKAN's Datastore API.

The Ontario Data Catalogue contains a record for each dataset that the Government of Ontario possesses. Some of these datasets will be available to you as open data; others will not, because the Government of Ontario is unable to share data that would break the law or put someone's safety at risk.

You can search for a dataset with a word that might describe a dataset or topic. Use words like “taxes” or “hospital locations” to discover what datasets the catalogue contains. You can search for a dataset from 3 spots on the catalogue: the homepage, the dataset search page, or the menu bar available across the catalogue. On the dataset search page, you can also filter your search results. You can select filters on the left-hand side of the page to limit your search to datasets with your favourite file format, datasets that are updated weekly, datasets released by a particular organization, or datasets that are released under a specific licence. Go to the dataset search page to see the filters that are available to make your search easier.
You can also do a quick search by selecting one of the catalogue’s categories on the homepage. These categories can help you see the types of data we have on key topic areas. When you find the dataset you are looking for, click on it to go to the dataset record. Each dataset record will tell you whether the data is available and, if so, tell you about the data available. An open dataset might contain several data files. These files might represent different periods of time, different subsets of the dataset, different regions, language translations, or other breakdowns. You can select a file and either download it or preview it. Make sure to read the licence agreement to make sure you have permission to use it the way you want. Read more about previewing data.

A non-open dataset may be unavailable for many reasons. Read more about non-open data. Read more about restricted data. Data that is non-open may still be subject to freedom of information requests.

The catalogue has tools that enable all users to visualize the data in the catalogue without leaving the catalogue – no additional software needed. Have a look at our walk-through of how to make a chart in the catalogue.

Get automatic notifications when datasets are updated. You can choose to get notifications for individual datasets, an organization’s datasets, or the full catalogue. You don’t have to provide any personal information – just subscribe to our feeds using any feed reader you like, using the corresponding notification web addresses. Copy those addresses and paste them into your reader. Your feed reader will let you know when the catalogue has been updated.

The catalogue provides open data in several file formats (e.g., spreadsheets, geospatial data, etc.). Learn about each format and how you can access and use the data each file contains.

A file that has a list of items and values separated by commas without formatting (e.g. colours, italics, etc.) or extra visual features. This format provides just the data that you would display in a table. XLSX (Excel) files may be converted to CSV so they can be opened in a text editor. How to access the data: Open with any spreadsheet software application (e.g., Open Office Calc, Microsoft Excel) or text editor. Note: This format is considered machine-readable; it can be easily processed and used by a computer. Files that have visual formatting (e.g. bolded headers and colour-coded rows) can be hard for machines to understand; these elements make a file more human-readable and less machine-readable.

A file that provides information without formatted text or extra visual features that may not follow a pattern of separated values like a CSV. How to access the data: Open with any word processor or text editor available on your device (e.g., Microsoft Word, Notepad).

A spreadsheet file that may also include charts, graphs, and formatting. How to access the data: Open with a spreadsheet software application that supports this format (e.g., Open Office Calc, Microsoft Excel). Data can be converted to a CSV for a non-proprietary format of the same data without formatted text or extra visual features.

A shapefile provides geographic information that can be used to create a map or perform geospatial analysis based on location, points/lines and other data about the shape and features of the area. It includes required files (.shp, .shx, .dbf) and might include corresponding files (e.g., .prj). How to access the data: Open with a geographic information system (GIS) software program (e.g., QGIS).
A package of files and folders. The package can contain any number of different file types. How to access the data: Open with an unzipping software application (e.g., WinZIP, 7Zip). Note: If a ZIP file contains .shp, .shx, and .dbf file types, it is an ArcGIS ZIP: a package of shapefiles which provide information to create maps or perform geospatial analysis that can be opened with ArcGIS (a geographic information system software program).

A file that provides information related to a geographic area (e.g., phone number, address, average rainfall, number of owl sightings in 2011, etc.) and its geospatial location (i.e., points/lines). How to access the data: Open using a GIS software application to create a map or do geospatial analysis. It can also be opened with a text editor to view raw information. Note: This format is machine-readable; it can be easily processed and used by a computer. Human-readable data (including visual formatting) is easy for users to read and understand.

A text-based format for sharing data in a machine-readable way that can store data with more unconventional structures such as complex lists. How to access the data: Open with any text editor (e.g., Notepad) or access through a browser. Note: This format is machine-readable; it can be easily processed and used by a computer. Human-readable data (including visual formatting) is easy for users to read and understand.

A text-based format to store and organize data in a machine-readable way that can store data with more unconventional structures (not just data organized in tables). How to access the data: Open with any text editor (e.g., Notepad). Note: This format is machine-readable; it can be easily processed and used by a computer. Human-readable data (including visual formatting) is easy for users to read and understand.

A file that provides information related to an area (e.g., phone number, address, average rainfall, number of owl sightings in 2011, etc.) and its geospatial location (i.e., points/lines). How to access the data: Open with a geospatial software application that supports the KML format (e.g., Google Earth). Note: This format is machine-readable; it can be easily processed and used by a computer. Human-readable data (including visual formatting) is easy for users to read and understand.

This format contains files with data from tables used for statistical analysis and data visualization of Statistics Canada census data. How to access the data: Open with the Beyond 20/20 application.

A database which links and combines data from different files or applications (including HTML, XML, Excel, etc.). The database file can be converted to a CSV/TXT to make the data machine-readable, but human-readable formatting will be lost. How to access the data: Open with Microsoft Office Access (a database management system used to develop application software).

A file that keeps the original layout and
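To illustrate the CKAN API mentioned above: package_search and package_show are standard CKAN Action API endpoints, but the base URL below is an assumption (verify it against the catalogue's API documentation) and the dataset id is hypothetical. A minimal Python sketch:

import requests

BASE = "https://data.ontario.ca/api/3/action"  # assumed CKAN base URL for the catalogue

# Full-text search over dataset metadata (standard CKAN endpoint).
resp = requests.get(f"{BASE}/package_search", params={"q": "hospital locations", "rows": 5})
for ds in resp.json()["result"]["results"]:
    print(ds["name"], "|", ds.get("license_title"))

# Retrieve full metadata, including resource file URLs, for one dataset.
resp = requests.get(f"{BASE}/package_show", params={"id": "some-dataset-name"})  # hypothetical id
for res in resp.json()["result"]["resources"]:
    print(res["format"], res["url"])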
This dataset contains documentation on the 146 global regions used to organize responses to the ArchaeoGLOBE land use questionnaire between May 18 and July 31, 2018. The regions were formed from modern administrative regions (Natural Earth 1:50m Admin1 - states and provinces, https://www.naturalearthdata.com/downloads/50m-cultural-vectors/50m-admin-1-states-provinces/). The boundaries of the polygons represent rough geographic areas that serve as analytical units useful in two respects - for the history of land use over the past 10,000 years (a moving target) and for the history of archaeological research. Some consideration was also given to creating regions that were relatively equal in size. The regionalization process went through several rounds of feedback and redrawing before arriving at the 146 regions used in the survey. No bounded regional system could ever truly reflect the complex spatial distribution of archaeological knowledge on past human land use, but operating at a regional scale was necessary to facilitate timely collaboration while achieving global coverage.

Map in Google Earth Format: ArchaeGLOBE_Regions_kml.kmz
Map in ArcGIS Shapefile Format: ArchaeGLOBE_Regions.zip (multiple files in zip file)

The shapefile format is a digital vector file that stores geographic location and associated attribute information. It is actually a collection of several different file types:
.shp — shape format: the feature geometry
.shx — shape index format: a positional index of the feature geometry
.dbf — attribute format: columnar attributes for each shape
.prj — projection format: the coordinate system and projection information
.sbn and .sbx — a spatial index of the features
.shp.xml — geospatial metadata in XML format
.cpg — specifies the code page for identifying character encoding

Attributes:
FID - a unique identifier for every object in a shapefile table (0-145)
Shape - the type of object (polygon)
World_ID - coded value assigned to each feature according to its division into one of seventeen ‘World Regions’ based on the geographic regions used by the Statistics Division of the United Nations (https://unstats.un.org/unsd/methodology/m49/), with small changes to better reflect archaeological scholarly communities. These large regions provide organizational structure, but are not analytical units for the study.
World_RG - text description of each ‘World Region’
Archaeo_ID - unique identifier (1-146) corresponding to the region code used in the ArchaeoGLOBE land use questionnaire and all ArchaeoGLOBE datasets
Archaeo_RG - text description of each region
Total_Area - the total area, in square kilometers, of each region
Land-Area - the total area minus the area of all lakes and reservoirs found within each region (source: https://www.naturalearthdata.com/downloads/10m-physical-vectors/10m-lakes/)

PDF of Region Attribute Table: ArchaeoGLOBE Regions Attributes.pdf
Excel file of Region Attribute Table: ArchaeoGLOBE Regions Attributes.xls
Printed Maps in PDF Format: ArchaeoGLOBE Regions.pdf
Documentation of the ArchaeoGLOBE Regional Map: ArchaeoGLOBE Regions README.doc
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
In order to use the standard color legend for Romanian soil type maps in the ESRI ArcMap-10 electronic format, a dataset consisting of a shapefile set (.dbf, .shp, .shx, .sbn, and .sbx files), four different .lyr files, and three different .style files has been prepared (ESRI, 2016). The shapefile set is not a “real” georeferenced layer/coverage; it is designed only to handle all the instances of soil types from the standard legend. This legend contains 67 standard items: 63 proper colors (different color hues, each of them having, generally, 2 - 4 degrees of lightness and/or chroma, four shades of grey, and white color), and four hatching patterns on white background (ESRI, 2016). The “color difference DE*ab” between any two legend colors, calculated with the color perceptually-uniform model CIELAB, is greater than 10 units, thus ensuring acceptably-distinguishable colors in the legend. The 67 standard items are assigned to 60 main soils existing in Romania, four main nonsoils, and three special cases of unsurveyed land. The soils are specified in terms of the current Romanian system of soil taxonomy, SRTS-2012+, and of the international soil classification system WRB-2014. The four different .lyr files presented here are: legend_soilcode_srts_wrb.lyr, legend_soilcode_wrb.lyr, legend_colourcode_srts_wrb.lyr, and legend_colourcode_wrb.lyr. The first two of them are built using as value field the ‘Soil_codes’ field, and as labels (explanation texts) the ‘Soil_name’ field (storing the soil types according to SRTS/WRB classification), respectively, the ‘WRB’ field (the soil type according to WRB classification), while the last two .lyr files are built using as value field the ‘colour_code’ field (storing the color codes) and as labels the soil name in SRTS and WRB, respectively, in WRB classification. In order to exemplify how the legend is displayed, two .jpg files are also presented: legend_soil_srts_wrb.jpg and legend_colour_wrb.jpg. The first displays the legend (symbols and labels) according to the SRTS classification order, the second according to the WRB classification. The three different .style files presented here are: soil_symbols.style, wrb_codes.style, and colour_codes.style. They use as name the soil acronym in SRTS classification, soil acronym in WRB classification, and, respectively, the color code.
Mule deer populations continue to decline across much of the western United States due to loss of habitat, starvation, and severe climate patterns, such as drought. In order to track the home range size and ecological preferences of mule deer, an important species for culture, economy, and ecosystems, the New Mexico Bureau of Land Management Taos Field Office captured mule deer, attached collars to them, and released them into Rio Grande del Norte National Monument. Collected from 2015-2017, each unique entry is one deer during one year, for a total of 23 entries. The point data was then intersected with vegetation data in the area, and the density of points was determined through Kernel Density Estimation (KDE); an illustrative sketch follows below. Reclassified BLM Vegetation Treatment data was used for zonal statistics on the KDE data and offered insights into mule deer response to treatments. This project was conducted as a joint project between the NMBLM TFO, Fort Collins USGS Science Center, and Kent State University’s Biogeography & Landscape Dynamics lab. This dataset includes all spatial data files (CPG, DBF, XLSX, PRJ, SBN, SBX, SHP, and SHX) for the comprehensive location fix shapefile, the convex hulls, the reclassified LANDFIRE EVT raster, the analysis area, the reclassified BLM Vegetation Treatment groups, the Kernel Density Estimation result, and the hillshade and state boundary data.
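As an illustrative sketch only (the study's KDE was produced with GIS tooling, and nothing here reproduces its data or parameters), a kernel density estimate over point locations can be computed in Python as follows:

import numpy as np
from scipy.stats import gaussian_kde

# Placeholder coordinates standing in for GPS collar fixes; real input would be
# the x/y values of the point shapefile's features.
rng = np.random.default_rng(0)
xy = rng.normal(size=(2, 200))  # shape (2, n_points): rows are x and y

kde = gaussian_kde(xy)  # bandwidth defaults to Scott's rule
density = kde(xy)       # estimated density at each fix
print(density[:5])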
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
http://spdx.org/licenses/CC0-1.0
The dataset contains simulations of freshwater runoff from ocean outlets around Svalbard as well as drainage basin outlines. The runoff is given as daily time series of total (m^3/day) and specific (m/day) runoff from a given outlet point.
The runoff is simulated by the CryoGrid community model, forced by meteorological variables from the CARRA reanalysis, at a 2.5x2.5 km resolution. These simulations have previously been described by Schmidt et al. (2023), and the dataset is available here. This runoff is then divided into drainage basins. Drainage basins and ocean outlets are determined using TopoToolbox.
The drainage basins are calculated assuming both surface and subsurface routing. To determine the subglacial water routing, we calculate the hydraulic head for different flotation fractions k. k=1.1 is approximately surface flow, while k=0.8, 0.9 and 1.0 considers different degrees of subglacial routing.
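For orientation, a common way to write the subglacial hydraulic potential with a flotation fraction k (in the spirit of Shreve's formulation; the authors' exact definition is in the README.pdf) is:

\phi = \rho_w g z_b + k \rho_i g H

where z_b is the bed elevation, H the ice thickness, \rho_i and \rho_w the densities of ice and water, and g the gravitational acceleration. Water is routed down the gradient of \phi, so a large k (here 1.1) lets the surface topography dominate the routing, while k = 0.8, 0.9 and 1.0 represent different degrees of subglacial water pressure relative to flotation.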
The method is described in detail in the README.pdf file attached with the dataset.
specific_runoff: contains files of the specific runoff (m w.e./day) into ocean outlets around Svalbard for k=0.8-1.1
total_runoff: contains files of the total runoff (m^3/day) into ocean outlets around Svalbard for k=0.8-1.1
shapefiles: contains shapefiles with all calculated runoff outlets on Svalbard for k=0.8-1.1
This folder contains four subfolders for different values of k (0.8, 0.9, 1.0, 1.1).
The naming convention of files in the subfolders is: specific_runoff_Svalbard_{YEAR}_k_{KVALUE}.csv where
YEAR is a year between 1991 and 2023
KVALUE is the value of k used for drainage basin delineation (0.8, 0.9, 1.0, 1.1)
Variables: Specific Runoff (m w.e./day)
Time resolution: daily
Each file contains:
- Longitude and latitude coordinates
- Outlet depth (m a.s.l.)
- The daily specific runoff value (m w.e.)
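A minimal Python sketch of locating and loading one of these files with pandas, under the naming convention above (the subfolder naming and column names are assumptions; inspect the files before relying on them):

import pandas as pd

year, kvalue = 2000, "1.0"
# Subfolder layout is assumed; adjust to the actual directory names.
path = f"specific_runoff/k_{kvalue}/specific_runoff_Svalbard_{year}_k_{kvalue}.csv"
df = pd.read_csv(path)
print(df.columns.tolist())  # check the actual column names (coordinates, outlet depth, runoff)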
This folder contains four subfolders for different values of k (0.8, 0.9, 1.0, 1.1).
The naming convention of files in the subfolders is: total_runoff_Svalbard_{YEAR}_k_{KVALUE}.csv where
YEAR is a year between 1991 and 2023
KVALUE is the value of k used for drainage basin delineation (0.8, 0.9, 1.0, 1.1)
Variables: total_runoff (m^3/day)
Time resolution: daily
Each file contains:
- Longitude and latitude coordinates
- Outlet depth (m a.s.l.)
- The daily total runoff value (m^3)
The naming convention of files in the folder is: drainage_basins_land_Svalbard_k_{KVALUE}.shp/shx where
KVALUE is the value of k used for drainage basin delineation (0.8, 0.9, 1.0, 1.1)
Variables: drainage basin outlines
Time resolution: none
Each file contains:
- Geometry
- Bounding box
- x-coordinate (longitude)
- y-coordinate (latitude)
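Likewise, a basin shapefile can be read with the third-party geopandas package; a sketch under the same naming convention:

import geopandas as gpd

basins = gpd.read_file("shapefiles/drainage_basins_land_Svalbard_k_1.0.shp")
print(len(basins))          # number of basin polygons
print(basins.total_bounds)  # overall bounding box [minx, miny, maxx, maxy]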
Global Administrative Areas of Spain
URL with data for any country in the world: http://www.gadm.org/country
File format: zip
Inside the zip files: shp, shx, csv, cpg, dbf, prj
Fields: OBJECTID ID_0 ISO NAME_0 ID_1 NAME_1 ID_2 NAME_2 HASC_2 CCN_2 CCA_2 TYPE_2 ENGTYPE_2 NL_NAME_2 VARNAME_2
URL zip http://biogeo.ucdavis.edu/data/gadm2.8/shp/ESP_adm_shp.zip
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Overview
3DHD CityScenes is the most comprehensive, large-scale high-definition (HD) map dataset to date, annotated in the three spatial dimensions of globally referenced, high-density LiDAR point clouds collected in urban domains. Our HD map covers 127 km of road sections of the inner city of Hamburg, Germany, including 467 km of individual lanes. In total, our map comprises 266,762 individual items.
Our corresponding paper (published at ITSC 2022) is available here.
Further, we have applied 3DHD CityScenes to map deviation detection here.
Moreover, we release code to facilitate the application of our dataset and the reproducibility of our research. Specifically, our 3DHD_DevKit comprises:
The DevKit is available here:
https://github.com/volkswagen/3DHD_devkit.
The dataset and DevKit have been created by Christopher Plachetka as project lead during his PhD period at Volkswagen Group, Germany.
When using our dataset, you are welcome to cite:
@INPROCEEDINGS{9921866,
  author={Plachetka, Christopher and Sertolli, Benjamin and Fricke, Jenny and Klingner, Marvin and Fingscheidt, Tim},
  booktitle={2022 IEEE 25th International Conference on Intelligent Transportation Systems (ITSC)},
  title={3DHD CityScenes: High-Definition Maps in High-Density Point Clouds},
  year={2022},
  pages={627-634}}
Acknowledgements
We thank the following interns for their exceptional contributions to our work.
The European large-scale project Hi-Drive (www.Hi-Drive.eu) supports the publication of 3DHD CityScenes and encourages the general publication of information and databases facilitating the development of automated driving technologies.
The Dataset
After downloading, the 3DHD_CityScenes folder provides five subdirectories, which are explained briefly in the following.
1. Dataset
This directory contains the training, validation, and test set definition (train.json, val.json, test.json) used in our publications. Respective files contain samples that define a geolocation and the orientation of the ego vehicle in global coordinates on the map.
During dataset generation (done by our DevKit), samples are used to take crops from the larger point cloud. Also, map elements within reach of a sample are collected. Both modalities can then be used, e.g., as input to a neural network such as our 3DHDNet.
To read any JSON-encoded data provided by 3DHD CityScenes in Python, you can use the following code snippet as an example.
import json

json_path = r"E:\3DHD_CityScenes\Dataset\train.json"
with open(json_path) as jf:
    data = json.load(jf)
print(data)
2. HD_Map
Map items are stored as lists of items in JSON format. In particular, we provide:
3. HD_Map_MetaData
Our high-density point cloud used as the basis for annotating the HD map is split into 648 tiles. This directory contains the geolocation for each tile as a polygon on the map. You can view the respective tile definition using QGIS. Alternatively, we also provide the respective polygons as lists of UTM coordinates in JSON.
Files with the ending .dbf, .prj, .qpj, .shp, and .shx belong to the tile definition as “shape file” (commonly used in geodesy) that can be viewed using QGIS. The JSON file contains the same information provided in a different format used in our Python API.
4. HD_PointCloud_Tiles
The high-density point cloud tiles are provided in global UTM32N coordinates and are encoded in a proprietary binary format. The first 4 bytes (integer) encode the number of points contained in that file. Subsequently, all point cloud values are provided as arrays. First all x-values, then all y-values, and so on. Specifically, the arrays are encoded as follows.
After reading, respective values have to be unnormalized. As an example, you can use the following code snippet to read the point cloud data. For visualization, you can use the pptk package, for instance.
import numpy as np
import pptk

file_path = r"E:\3DHD_CityScenes\HD_PointCloud_Tiles\HH_001.bin"
pc_dict = {}
key_list = ['x', 'y', 'z', 'intensity', 'is_ground']
# The dtype list was truncated in this description; the entries below are
# placeholders (assumptions), check the DevKit documentation for the actual encodings.
type_list = ['uint32', 'uint32', 'uint32', 'uint8', 'bool']
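Continuing the snippet, a minimal sketch of the array-wise read described above (the first 4 bytes hold the point count, then one full array per attribute; dtypes follow the assumed type_list and must be verified):

with open(file_path, "rb") as f:
    num_points = int(np.fromfile(f, dtype=np.int32, count=1)[0])  # first 4 bytes: number of points
    for key, dtype in zip(key_list, type_list):
        pc_dict[key] = np.fromfile(f, dtype=dtype, count=num_points)  # one array per attribute

# After unnormalizing the values as required (scale factors per the documentation):
points = np.stack([pc_dict['x'], pc_dict['y'], pc_dict['z']], axis=1)
v = pptk.viewer(points)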
5. Trajectories
We provide 15 real-world trajectories recorded during a measurement campaign covering the whole HD map. Trajectory samples are provided at approximately 30 Hz and are encoded in JSON.
These trajectories were used to provide the samples in train.json, val.json, and test.json with realistic geolocations and orientations of the ego vehicle.
- OP1 – OP5 cover the majority of the map with 5 trajectories.
- RH1 – RH10 cover the majority of the map with 10 trajectories.
Note that OP5 is split into three separate parts, a-c. RH9 is split into two parts, a-b. Moreover, OP4 mostly equals OP1 (thus, we speak of 14 trajectories in our paper). For completeness, however, we provide all recorded trajectories here.
Mule deer home range data collected by GPS collars. Collected from 2015-2017 for the New Mexico Bureau of Land Management Taos Field Office. Each unique entry is one deer during one year, for a total of 23 entries. The point data was used to create minimum convex hulls depicting one deer-year's home range extent. This dataset includes all spatial data - CPG, DBF, XLSX, PRJ, SBN, SBX, SHP, and SHX files.
GeoJunxion's ZIP+4 is a complete dataset based on US postal data, comprising more than 35 million polygons. The dataset is NOT JUST a table of point data that can be downloaded as a CSV or other text file, as is the case with other suppliers. The data can be delivered as a shapefile through a single RAW data delivery or through an API.
The January 2021 USPS data source has significantly changed since the previous delivery. Some States have sizably lower ZIP+4 totals across all counties when compared with previous deliveries due to USPS parcelpoint cleanup, while other States have a significant increase in ZIP+4 totals across all counties due to cleanup and other rezoning. California and North Carolina in particular have several new ZIP5s, contributing to the increase in distinct ZIPs and ZIP+4s.
GeoJunxion's ZIP+4 data can be used as an additional layer on an existing map to run customer or other analyses, e.g., who is (and is not) my customer, or what is the density of my customer base in a certain ZIP+4.
Because the data are polygons, information can be put into visual context, which helps with complex overviews and management decisions. CRM data can be enriched with the ZIP+4 to provide more detailed customer information.
Key specifications:
- Topologized ZIP polygons
- GeoJunxion ZIP+4 polygons follow USPS postal codes
- ZIP+4 code polygons with ZIP5 attributes and state codes
- Overlapping ZIP+4 boundaries for multiple ZIP+4 addresses in one area
- Updated USPS source (January 2021)
- Distinct ZIP5 codes: 34,731
- Distinct ZIP+4 codes: 35,146,957
The ZIP+4 polygons are delivered in Esri shapefile format. This format allows the storage of geometry and attribute information for each of the features.
The four components of the shapefile data are:
.shp – This file stores the geometry of the features
.shx – This file stores an index of the feature geometry
.dbf – This file stores attribute information relating to individual features
.prj – This file stores projection information associated with the features
Current release version 2021. Earlier versions from previous years available on request.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Global prevalence of non-perennial rivers and streams
June 2021
Prepared by Mathis L. Messager (mathis.messager@mail.mcgill.ca) and Bernhard Lehner (bernhard.lehner@mcgill.ca)

Contents
1. Overview and background
2. Repository content
3. Data format and projection
4. License and citations
4.1 License agreement
4.2 Citations and acknowledgements

1. Overview and background
This documentation describes the data produced for the research article: Messager, M. L., Lehner, B., Cockburn, C., Lamouroux, N., Pella, H., Snelder, T., Tockner, K., Trautmann, T., Watt, C. & Datry, T. (2021). Global prevalence of non-perennial rivers and streams. Nature. https://doi.org/10.1038/s41586-021-03565-5

In this study, we developed a statistical Random Forest model to produce the first reach-scale estimate of the global distribution of non-perennial rivers and streams. For this purpose, we linked quality-checked observed streamflow data from 5,615 gauging stations (on 4,428 perennial and 1,187 non-perennial reaches) with 113 candidate environmental predictors available globally. Predictors included variables describing climate, physiography, land cover, soil, geology, and groundwater as well as estimates of long-term naturalised (i.e., without anthropogenic water use in the form of abstractions or impoundments) mean monthly and mean annual flow (MAF), derived from a global hydrological model (WaterGAP 2.2; Müller Schmied et al. 2014). Following model training and validation, we predicted the probability of flow intermittence for all river reaches in the RiverATLAS database (Linke et al. 2019), a digital representation of the global river network at high spatial resolution.

The data repository includes two datasets resulting from this study:
1. a geometric network of the global river system where each river segment is associated with: (i) 113 hydro-environmental predictors used in model development and predictions, and (ii) the probability and class of flow intermittence predicted by the model.
2. point locations of the 5,615 gauging stations used in model training/testing, where each station is associated with a line segment representing a reach in the river network, and a set of metadata.

These datasets have been generated with source code located at messamat.github.io/globalirmap/.

Note that, although several attributes initially included in RiverATLAS version 1.0 have been updated for this study, the dataset provided here is not an established new version of RiverATLAS.

2. Repository content
The data repository has the following structure (for usage, see section 3. Data format and projection; GIRES stands for Global Intermittent Rivers and Ephemeral Streams):
— GIRES_v10_gdb.zip/ : file geodatabase in ESRI® geodatabase format containing two feature classes (zipped)
|——— GIRES_v10_rivers : river network lines
|——— GIRES_v10_stations : points with streamflow summary statistics and metadata
— GIRES_v10_shp.zip/ : directory containing ten shapefiles (zipped); same content as GIRES_v10_gdb.zip for users that cannot read ESRI geodatabases (tiled by region due to size limitations)
|——— GIRES_v10_rivers_af.shp : Africa
|——— GIRES_v10_rivers_ar.shp : North American Arctic
|——— GIRES_v10_rivers_as.shp : Asia
|——— GIRES_v10_rivers_au.shp : Australasia
|——— GIRES_v10_rivers_eu.shp : Europe
|——— GIRES_v10_rivers_gr.shp : Greenland
|——— GIRES_v10_rivers_na.shp : North America
|——— GIRES_v10_rivers_sa.shp : South America
|——— GIRES_v10_rivers_si.shp : Siberia
|——— GIRES_v10_stations.shp : points with streamflow summary statistics and metadata
— Other_technical_documentations.zip/ : directory containing three documentation files (zipped)
|——— HydroATLAS_TechDoc_v10.pdf : documentation for river network framework
|——— RiverATLAS_Catalog_v10.pdf : documentation for river network hydro-environmental attributes
|——— Readme_GSIM_part1.txt : documentation for gauging stations from the Global Streamflow Indices and Metadata (GSIM) archive
— README_Technical_documentation_GIRES_v10.pdf : full documentation for this repository

3. Data format and projection
The geometric network (lines) and gauging stations (points) datasets are distributed both in ESRI® file geodatabase and shapefile formats. The file geodatabase contains all data and is the prime, recommended format. Shapefiles are provided as a copy for users that cannot read the geodatabase. Each shapefile consists of five main files (.dbf, .sbn, .sbx, .shp, .shx), and projection information is provided in an ASCII text file (.prj). The attribute table can be accessed as a stand-alone file in dBASE format (.dbf), which is included in the shapefile format. These datasets are available electronically in compressed zip file format; to use the data files, the zip files must first be decompressed.

All data layers are provided in geographic (latitude/longitude) projection, referenced to datum WGS84. In ESRI® software this projection is defined by the geographic coordinate system GCS_WGS_1984 and datum D_WGS_1984 (EPSG: 4326).

4. License and citations
4.1 License agreement
This documentation and these datasets are licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0). For all regulations regarding license grants, copyright, redistribution restrictions, required attributions, disclaimer of warranty, indemnification, liability, waiver of damages, and a precise definition of licensed materials, please refer to the License Agreement (https://creativecommons.org/licenses/by/4.0/legalcode). For a human-readable summary of the license, please see https://creativecommons.org/licenses/by/4.0/.

4.2 Citations and acknowledgements
Citations and acknowledgements of this dataset should be made as follows: Messager, M. L., Lehner, B., Cockburn, C., Lamouroux, N., Pella, H., Snelder, T., Tockner, K., Trautmann, T., Watt, C. & Datry, T. (2021). Global prevalence of non-perennial rivers and streams. Nature. https://doi.org/10.1038/s41586-021-03565-5

We kindly ask users to cite this study in any published material produced using it. If possible, online links to this repository (https://doi.org/10.6084/m9.figshare.14633022) should also be provided.
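As a quick-start sketch (Python with the third-party geopandas package; the field names are documented in README_Technical_documentation_GIRES_v10.pdf, so inspect them rather than guessing):

import geopandas as gpd

# After decompressing GIRES_v10_shp.zip:
stations = gpd.read_file("GIRES_v10_shp/GIRES_v10_stations.shp")
print(stations.crs)               # expect geographic WGS84 (EPSG:4326), per section 3
print(stations.columns.tolist())  # streamflow summary statistics and metadata fields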
The Randolph Glacier Inventory (RGI) is a globally complete inventory of glacier outlines. It is supplemental to the database compiled by the Global Land Ice Measurements from Space initiative (GLIMS). While GLIMS is a multi-temporal database with an extensive set of attributes, the RGI is intended to be a snapshot of the world’s glaciers as they were near the beginning of the 21st century (although in fact its range of dates is still substantial). Production of the RGI was motivated by the preparation of the Fifth Assessment Report of the Intergovernmental Panel on Climate Change (IPCC AR5).
Version 1.0 of the RGI was released in February 2012. It included a considerable number of unsubdivided ice bodies, which we refer to as glacier complexes, and a considerable number of nominal glaciers, which are glaciers for which only a location and an area are known; they are represented by circles of the appropriate area at the given location. Version 6.0, released in July 2017, has improved coverage of the conterminous US (regions 02-05 and 02-06), Scandinavia (region 08) and Iran (region 12-2). In Scandinavia several hundred smaller glaciers have been added and most glaciers now have exact dates. The flag attributes RGIFlag and GlacType were reorganized. Surging codes have been added from Sevestre and Benn (2015).
For version 1.0, we visualized the data in a geographic information system by overlaying outlines on modern satellite imagery, and assessed their quality relative to other available products. In several regions the outlines already in GLIMS were used for the RGI. Data from the World Glacier Inventory (WGI, http://nsidc.org/data/docs/noaa/g01130_glacier_inventory/; WGI, 1989) and the related WGI-XF (http://people.trentu.ca/~gcogley/glaciology; Cogley, 2009) were used for some nominal glaciers, mainly in the Pyrenees and in northern Asia. Where no other data were available we relied on data from the Digital Chart of the World (Danko, 1992).
The RGI is provided as shapefiles containing the outlines of glaciers in geographic coordinates (longitude and latitude, in degrees) which are referenced to the WGS84 datum. Data are organized by first-order region. For each region there is one shapefile (.SHP with accompanying .DBF, .PRJ and .SHX files) containing all glaciers and one ancillary .CSV file containing all hypsometric data. The attribute (.DBF) and hypsometric files contain one record per glacier. Each object in the RGI conforms to the data-model conventions of ESRI ArcGIS shapefiles. That is, each object consists of an outline encompassing the glacier, followed immediately by outlines representing all of its nunataks (ice-free areas enclosed by the glacier). In each object successive vertices are ordered such that glacier ice is on the right. This data model is not the same as the current GLIMS data model, in which nunataks are independent objects. The outlines of the RGI regions are provided as two shapefiles, one for first-order and one for second-order regions. A summary file containing glacier counts, glacierized area and a hypsometric list for each first-order and each second-order region is also provided. The 0.5°×0.5° grid is provided as a plain-text .DAT file in which zonal records of blank-separated glacierized areas in km2 are ordered from north to south. Information about RGI glaciers that are present in the mass balance tables of the WGMS database Fluctuations of Glaciers is provided as an ancillary .CSV file. The 19 regional attribute (.DBF) files are also provided in .CSV format.

References
RGI Consortium, 2017, Randolph Glacier Inventory (RGI) – A Dataset of Global Glacier Outlines: Version 6.0. Technical Report, Global Land Ice Measurements from Space, Boulder, Colorado, USA. Digital Media. DOI: https://doi.org/10.7265/N5-RGI-60
Pfeffer, W. T., Arendt, A. A., Bliss, A., Bolch, T., Cogley, J. G., Gardner, A. S., Hagen, J-O., Hock, R., Kaser, G., Kienholz, C., Miles, E. S., Moholdt, G., Molg, N., Paul, F., Radic, V., Rastner, P., Raup, B. H., Rich, J., Sharp, M. J., Glasser, N. (2014). The Randolph Glacier Inventory: A globally complete inventory of glaciers. Journal of Glaciology, 60(221), 537-552. https://www.cambridge.org/core/services/aop-cambridge-core/content/view/730D4CC76E0E3EC1832FA3F4D90691CE/S002214300020600Xa.pdf/randolph_glacier_inventory_a_globally_complete_inventory_of_glaciers.pdf
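As a hedged sketch, one regional shapefile can be read and summarized in Python with geopandas (the file name is illustrative, and the area attribute name should be verified against the RGI technical report before use):

import geopandas as gpd

rgi = gpd.read_file("11_rgi60_CentralEurope.shp")  # illustrative regional file name
print(len(rgi))  # glacier count for the region
# 'Area' (km2) is assumed to be the attribute name; check the regional .DBF first.
print(rgi["Area"].sum())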
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
These are supporting datasets for the paper titled, "Ice Content of Mantling Materials in Deuteronilus Mensae, Mars," Baker and Carter, 2023, Journal of Geophysical Research: Planets. Data Set S1 (ds01.txt) is a text file table of SHARAD radargrams analyzed in the paper. Data Sets S2 and S3 (ds02.zip and ds03.zip) are provided as ESRI shapefiles (.shp) and compressed as separate zip files. The shapefiles can be opened in most geographical information systems (GIS) software. Shapefiles consist of three individual files: main file (.shp), index file (.shx), and dBASE table (.dbf). Descriptions of each column or attribute in the text file or shapefile are provided in the readme documentation.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
In order to use the Romanian color standard for soil type map legends, a dataset of ESRI ArcMap-10 files, consisting of a shapefile set (.dbf, .shp, .shx, .sbn, and .sbx files), four different .lyr files, and three different .style files (https://desktop.arcgis.com/en/arcmap/10.3/map/ : saving-layers-and-layer-packages, about-creating-new-symbols, what-are-symbols-and-styles-), have been prepared. The shapefile set is not a “real” georeferenced layer/coverage; it is designed only to handle all the instances of soil types from the standard legend.
This legend contains 67 standard items: 63 proper colors (different color hues, each of them having, generally, 2 - 4 degrees of lightness and/or chroma, four shades of grey, and white color), and four hatching patterns on white background. The “color difference DE*ab” between any two legend colors, calculated with the color perceptually-uniform model CIELAB, is greater than 10 units, thus ensuring acceptably-distinguishable colors in the legend. The 67 standard items are assigned to 60 main soils existing in Romania, four main nonsoils, and three special cases of unsurveyed land. The soils are specified in terms of the current Romanian system of soil taxonomy, SRTS-2012+, and of the international system WRB-2014.
The four different .lyr files presented here are: legend_soilcode_srts_wrb.lyr, legend_soilcode_wrb.lyr, legend_colorcode_srts_wrb.lyr, and legend_colorcode_wrb.lyr. The first two of them are built using as value field the “Soil_codes” field, and as labels (explanation texts) the “Soil_name” field (storing the soil types according to SRTS/WRB classification), respectively, the “WRB” field (the soil type according to WRB classification), while the last two .lyr files are built using as value field the “color_code” field (storing the color codes) and as labels the soil name in SRTS and WRB, respectively, in WRB classification.
In order to exemplify how the legend is displayed, two .jpg files are also presented: legend_soil_srts_wrb.jpg and legend_color_wrb.jpg. The first displays the legend (symbols and labels) according to the SRTS classification order, the second according to the WRB classification.
The three different .style files presented here are: soil_symbols.style, wrb_codes.style, and color_codes.style. They use as name the soil acronym in SRTS classification, soil acronym in WRB classification, and, respectively, the color code.
The presented file set may be used to directly implement the Romanian color standard in digital soil type map legends, or may be adjusted/modified to other specific requirements.
U.S. Government Works: https://www.usa.gov/government-works
License information was derived automatically
This dataset provides shapefile outlines of the 881 lakes that had temperature modeled as part of this study. The format is a shapefile for all lakes combined (.shp, .shx, .dbf, and .prj files). A csv file of lake metadata is also included. This dataset is part of a larger data release of lake temperature model inputs and outputs for 881 lakes in the U.S. state of Minnesota (https://doi.org/10.5066/P9PPHJE2).
If you use the SWORD Database in your work, please cite: Altenau et al., (2021) The Surface Water and Ocean Topography (SWOT) Mission River Database (SWORD): A Global River Network for Satellite Data Products. Water Resources Research. https://doi.org/10.1029/2021WR030054 1. Summary: The upcoming Surface Water and Ocean Topography (SWOT) satellite mission, planned to launch in 2022, will vastly expand observations of river water surface elevation (WSE), width, and slope. In order to facilitate a wide range of new analyses with flexibility, the SWOT mission will provide a range of relevant data products. One product the SWOT mission will provide is river vector products stored in shapefile format for each SWOT overpass (JPL Internal Document, 2020b). The SWOT vector data products will be most broadly useful if they allow multitemporal analysis of river nodes and reaches covering the same river areas. Doing so requires defining SWOT reaches and nodes a priori, so that SWOT data can be assigned to them. The SWOT River Database (SWORD) combines multiple global river- and satellite-related datasets to define the nodes and reaches that will constitute SWOT river vector data products. SWORD provides high-resolution river nodes (200 m) and reaches (~10 km) in shapefile and netCDF formats with attached hydrologic variables (WSE, width, slope, etc.) as well as a consistent topological system for global rivers 30 m wide and greater. 2. Data Formats: The SWORD database is provided in netCDF and shapefile formats. All files start with a two-digit continent identifier (“af” – Africa, “as” – Asia / Siberia, “eu” – Europe / Middle East, “na” – North America, “oc” – Oceania, “sa” – South America). File syntax denotes the regional information for each file and varies slightly between netCDF and shapefile formats. NetCDF files are structured in 3 groups: centerlines, nodes, and reaches. The centerline group contains location information and associated reach and node ids along the original GRWL 30 m centerlines (Allen and Pavelsky, 2018). Node and reach groups contain hydrologic attributes at the ~200 m node and ~10 km reach locations (see description of attributes below). NetCDFs are distributed at continental scales with a filename convention as follows: [continent]_sword_v2.nc (e.g., na_sword_v2.nc). SWORD shapefiles consist of four main files (.dbf, .prj, .shp, .shx). There are separate shapefiles for nodes and reaches, where nodes are represented as ~200 m spaced points and reaches are represented as polylines. All shapefiles are in geographic (latitude/longitude) projection, referenced to datum WGS84. Shapefiles are split into HydroBASINS (Lehner and Grill, 2013) Pfafstetter level 2 basins (hbXX) for each continent with a naming convention as follows: [continent]_sword_[nodes/reaches]_hb[XX]_v2.shp (e.g., na_sword_nodes_hb74_v2.shp; na_sword_reaches_hb74_v2.shp). 3. Attribute Description: This list contains the primary attributes contained in the SWORD netCDFs and shapefiles. x: Longitude of the node or reach ranging from 180°E to 180°W (units: decimal degrees). y: Latitude of the node or reach ranging from 90°S to 90°N (units: decimal degrees). node_id: ID of each node.
The format of the id is as follows: CBBBBBRRRRNNNT where C = Continent (the first number of the Pfafstetter basin code), B = Remaining Pfafstetter basin code up to level 6, R = Reach number (assigned sequentially within a level 6 basin starting at the downstream end working upstream), N = Node number (assigned sequentially within a reach starting at the downstream end working upstream), T = Type (1 – river, 3 – lake on river, 4 – dam or waterfall, 5 – unreliable topology, 6 – ghost node). node_length (node files only): Node length measured along the GRWL centerline points (units: meters). reach_id: ID of each reach. The format of the id is as follows: CBBBBBRRRRT where C = Continent (the first number of the Pfafstetter basin code), B = Remaining Pfafstetter basin codes up to level 6, R = Reach number (assigned sequentially within a level 6 basin starting at the downstream end working upstream), T = Type (1 – river, 3 – lake on river, 4 – dam or waterfall, 5 – unreliable topology, 6 – ghost reach). reach_length (reach files only): Reach length measured along the GRWL centerline points (units: meters). wse: Average water surface elevation (WSE) value for a node or reach. WSEs are extracted from the MERIT Hydro dataset (Yamazaki et al., 2019) and referenced to the EGM96 geoid (units: meters). wse_var: WSE variance along the GRWL centerline points used to calculate the average WSE for each node or reach (units: square meters). width: Average width for a node or reach (units: meters). width_var: Width variance along the GRWL centerline points used to calculate the average width for each node or reach (units: square meters). max_width: Maximum width value across the channel for each node or reach that includes island and bar areas (units: meters). facc: Maxim...
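Because the id layouts above are fixed-width, they can be decoded by simple string slicing; a minimal Python sketch for node ids (the example id is made up):

def parse_node_id(node_id):
    # CBBBBBRRRRNNNT: continent digit, Pfafstetter basin to level 6,
    # reach number, node number, and type code, per the description above.
    s = str(node_id)
    return {
        "continent": s[0],
        "basin_level6": s[:6],
        "reach_number": s[6:10],
        "node_number": s[10:13],
        "type": s[13],  # 1 river, 3 lake on river, 4 dam/waterfall, 5 unreliable topology, 6 ghost
    }

print(parse_node_id("74265000120341"))  # hypothetical 14-digit id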
This dataset contains all data resources, either directly downloadable via this platform or as links to external databases, required to execute the generic modeling tool as described in D5.4.