This file contains the data set used to develop a random forest model to predict background specific conductivity (SC) for stream segments in the contiguous United States. This Excel-readable file contains 56 columns of parameters evaluated during development. The data dictionary provides the definitions of the abbreviations and the measurement units. Each row is a unique sample identified as R**, which encodes the NHD Hydrologic Unit, an underscore, a COMID of up to 7 digits, an underscore, and the sequential sample month.
To develop models that make stream-specific predictions across the contiguous United States, we used the StreamCat data set and process (Hill et al. 2016; https://github.com/USEPA/StreamCat). The StreamCat data set is based on a network of stream segments from NHD+ (McKay et al. 2012). These stream segments drain an average area of 3.1 km2 and thus define the spatial grain size of this data set. The data set consists of minimally disturbed sites representing the natural variation in environmental conditions that occur in the contiguous 48 United States.
More than 2.4 million SC observations were obtained from STORET (USEPA 2016b), state natural resource agencies, the U.S. Geological Survey (USGS) National Water Information System (NWIS) (USGS 2016), and data used in Olson and Hawkins (2012) (Table S1). Data include observations made between 1 January 2001 and 31 December 2015 and are thus coincident with Moderate Resolution Imaging Spectroradiometer (MODIS) satellite data (https://modis.gsfc.nasa.gov/data/). Each observation was related to the nearest stream segment in the NHD+. Data were limited to one observation per stream segment per month. SC observations with ambiguous locations and repeat measurements along a stream segment in the same month were discarded.
Using estimates of anthropogenic stress derived from the StreamCat database (Hill et al. 2016), segments were selected with minimal amounts of human activity (Stoddard et al. 2006) using criteria developed for each Level II Ecoregion (Omernik and Griffith 2014). Segments were considered as potentially minimally stressed where watersheds had 0–0.5% impervious surface, 0–5% urban, 0–10% agriculture, and population densities from 0.8–30 people/km2 (Table S3). Watersheds with observations with large residuals in initial models were identified and inspected for evidence of other human activities not represented in StreamCat (e.g., mining, logging, grazing, or oil/gas extraction). Observations were removed from watersheds that were disturbed, had a tidal influence, or had unusual geologic conditions such as hot springs. About 5% of SC observations in each National Rivers and Streams Assessment (NRSA) region were then randomly selected as independent validation data. The remaining observations became the large training data set for model calibration.
This dataset is associated with the following publication: Olson, J., and S. Cormier. Modeling spatial and temporal variation in natural background specific conductivity. ENVIRONMENTAL SCIENCE & TECHNOLOGY. American Chemical Society, Washington, DC, USA, 53(8): 4316-4325, (2019).
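The screening and hold-out steps described above can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the field names (`pct_impervious`, `pct_urban`, `pct_agriculture`, `pop_density`, `region`) are hypothetical rather than taken from the data dictionary, and a single set of cut-offs (the upper end of each stated range) is assumed, whereas the actual criteria were developed separately for each Level II Ecoregion (Table S3).

```python
import random

# Hypothetical field names and a single, simplified set of upper limits.
# The published criteria vary by Level II Ecoregion (Table S3).
UPPER_LIMITS = {
    "pct_impervious": 0.5,    # % impervious surface
    "pct_urban": 5.0,         # % urban land cover
    "pct_agriculture": 10.0,  # % agricultural land cover
    "pop_density": 30.0,      # people per km2
}


def is_potentially_minimally_stressed(watershed):
    """Return True if every stressor is at or below its assumed upper limit."""
    return all(watershed[key] <= limit for key, limit in UPPER_LIMITS.items())


def split_validation(rows, frac=0.05, seed=0):
    """Randomly hold out about `frac` of the rows in each region for validation."""
    rng = random.Random(seed)
    by_region = {}
    for row in rows:
        by_region.setdefault(row["region"], []).append(row)

    train, validation = [], []
    for region_rows in by_region.values():
        shuffled = region_rows[:]
        rng.shuffle(shuffled)
        n_val = round(frac * len(shuffled))
        validation.extend(shuffled[:n_val])
        train.extend(shuffled[n_val:])
    return train, validation
```

Sampling within each region, rather than across the pooled data, keeps the validation share close to 5% everywhere, which matches the per-region selection the description mentions.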
SafeGraph Places provides baseline information for every record in the SafeGraph product suite via the Places schema and, when applicable, polygon information via the Geometry schema. The current scope of a place is defined as any location humans can visit, with the exception of single-family homes. This definition encompasses a diverse set of places ranging from restaurants, grocery stores, and malls to parks, hospitals, museums, offices, and industrial parks. Premium sets of Places include apartment buildings, Parking Lots, and Point POIs (such as ATMs or transit stations).
SafeGraph Places is a point of interest (POI) data offering with varying coverage depending on the country. Note that address conventions and formatting vary across countries. SafeGraph has coalesced these fields into the Places schema.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
This dataset is built for time-series Sentinel-2 cloud detection and is stored in TensorFlow TFRecord format (refer to https://www.tensorflow.org/tutorials/load_data/tfrecord).
Each file is compressed in 7z format and can be decompressed using Bandizip or 7-Zip.
Dataset Structure:
Each filename can be split into three parts using underscores. The first part indicates whether it is designated for training or validation ('train' or 'val'); the second part indicates the Sentinel-2 tile name, and the last part indicates the number of samples in this file.
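As a small illustration of this naming scheme, the sketch below splits a filename into its three parts in plain Python. The example name `train_T50TMK_120` is hypothetical, and dropping a file extension (if one is present after decompression) is an assumption rather than something the dataset description states.

```python
import os


def parse_record_filename(path):
    """Split '<train|val>_<tile name>_<number of samples>' into its parts."""
    name = os.path.basename(path)
    # Assumption: drop an extension if the decompressed file has one.
    stem = os.path.splitext(name)[0]
    split, tile, count = stem.split("_")
    if split not in ("train", "val"):
        raise ValueError(f"unexpected split name: {split!r}")
    return split, tile, int(count)
```

The sample count taken from the name is useful for sizing progress bars or shuffle buffers without reading the file first.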
For each sample, it includes:
Sample ID;
Array of time-series 4-band image patches at 10 m resolution, shaped as (n_timestamps, 4, 42, 42);
Label list indicating cloud cover status for the center 6 × 6 pixels of each timestamp;
Ordinal list for each timestamp;
Sample weight list (reserved);
Here is a demonstration function for parsing the TFRecord file:
import tensorflow as tf
# Assumption: `decoder` is the TensorFlow Model Garden module (tf-models-official);
# the original snippet does not show where it is imported from.
from official.vision.dataloaders import decoder


def parseRecordDirect(fname):
    """Pair every record in a TFRecord file with the tile name from its filename."""
    sep = '/'
    parts = tf.strings.split(fname, sep)
    # Filename pattern: <train|val>_<tile name>_<number of samples>
    tn = tf.strings.split(parts[-1], sep='_')[-2]
    nn = tf.strings.to_number(tf.strings.split(parts[-1], sep='_')[-1], tf.dtypes.int64)
    t = tf.data.Dataset.from_tensors(tn).repeat().take(nn)
    t1 = tf.data.TFRecordDataset(fname)
    ds = tf.data.Dataset.zip((t, t1))
    return ds


# Every field except 'localid' is stored as a raw byte string.
keys_to_features_direct = {
    'localid': tf.io.FixedLenFeature([], tf.int64, -1),
    'image_raw_ldseries': tf.io.FixedLenFeature((), tf.string, ''),
    'labels': tf.io.FixedLenFeature((), tf.string, ''),
    'dates': tf.io.FixedLenFeature((), tf.string, ''),
    'weights': tf.io.FixedLenFeature((), tf.string, '')
}


class SeriesClassificationDirectDecorder(decoder.Decoder):
    """A tf.Example decoder for tfds classification datasets."""

    def __init__(self) -> None:
        super().__init__()

    def decode(self, tid, ds):
        parsed = tf.io.parse_single_example(ds, keys_to_features_direct)
        encoded = parsed['image_raw_ldseries']
        labels_encoded = parsed['labels']
        decoded = tf.io.decode_raw(encoded, tf.uint16)
        label = tf.io.decode_raw(labels_encoded, tf.int8)
        dates = tf.io.decode_raw(parsed['dates'], tf.int64)
        weight = tf.io.decode_raw(parsed['weights'], tf.float32)
        # Restore the (n_timestamps, 4, 42, 42) patch shape.
        decoded = tf.reshape(decoded, [-1, 4, 42, 42])
        sample_dict = {
            'tid': tid,                    # tile ID
            'dates': dates,                # Date list
            'localid': parsed['localid'],  # sample ID
            'imgs': decoded,               # image array
            'labels': label,               # label list
            'weights': weight
        }
        return sample_dict


def preprocessDirect(tid, record):
    """Same decoding as the class above, returned as a tuple."""
    parsed = tf.io.parse_single_example(record, keys_to_features_direct)
    encoded = parsed['image_raw_ldseries']
    labels_encoded = parsed['labels']
    decoded = tf.io.decode_raw(encoded, tf.uint16)
    label = tf.io.decode_raw(labels_encoded, tf.int8)
    dates = tf.io.decode_raw(parsed['dates'], tf.int64)
    weight = tf.io.decode_raw(parsed['weights'], tf.float32)
    decoded = tf.reshape(decoded, [-1, 4, 42, 42])
    return tid, dates, parsed['localid'], decoded, label, weight


t1 = parseRecordDirect('filename here')
dataset = t1.map(preprocessDirect, num_parallel_calls=tf.data.experimental.AUTOTUNE)
Class Definition:
0: clear
1: opaque cloud
2: thin cloud
3: haze
4: cloud shadow
5: snow
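For convenience, the class codes above can be kept in a small lookup table. The sketch below assumes one integer label per timestamp, as the sample description suggests; the helper names and the "unknown" fallback for any other value are additions for illustration only.

```python
# Class codes as listed in the dataset description.
CLASS_NAMES = {
    0: "clear",
    1: "opaque cloud",
    2: "thin cloud",
    3: "haze",
    4: "cloud shadow",
    5: "snow",
}


def label_names(labels):
    """Map integer class codes (one per timestamp) to readable names."""
    return [CLASS_NAMES.get(int(code), "unknown") for code in labels]


def clear_timestamps(labels):
    """Return the indices of timestamps labelled 'clear' (class 0)."""
    return [i for i, code in enumerate(labels) if int(code) == 0]
```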
Dataset Construction:
First, we randomly generate 500 points for each tile, and all these points are aligned to the pixel grid centers of the 60 m resolution subdatasets (e.g., B10) for consistency when comparing with other products. This is because other cloud detection methods may use the cirrus band, which has 60 m resolution, as a feature.
Then, time-series image patches of two shapes are cropped with each point as the center. The patches of shape 42 × 42 are cropped from the 10 m resolution bands (B2, B3, B4, B8) and are used to construct this dataset. The patches of shape 348 × 348 are cropped from the True Colour Image (TCI; for details see the Sentinel-2 user guide) file and are used for interpreting class labels.
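The point alignment and patch geometry can be sketched as below. This is only an illustration under stated assumptions: coordinates are projected (metres), `(x_origin, y_origin)` is the upper-left corner of the raster, and the function names are hypothetical; the authors' actual sampling code is not shown in the description.

```python
import math

GRID_RES = 60.0   # m, resolution of the 60 m bands (e.g. B10)
PATCH_RES = 10.0  # m, resolution of B2, B3, B4, B8
PATCH_SIZE = 42   # pixels per side


def snap_to_grid_center(x, y, x_origin, y_origin, res=GRID_RES):
    """Snap (x, y) to the centre of the grid cell that contains it.

    (x_origin, y_origin) is the upper-left raster corner, so columns grow
    with x and rows grow as y decreases.
    """
    col = math.floor((x - x_origin) / res)
    row = math.floor((y_origin - y) / res)
    return x_origin + (col + 0.5) * res, y_origin - (row + 0.5) * res


def patch_bounds(cx, cy, size=PATCH_SIZE, res=PATCH_RES):
    """Return (xmin, ymin, xmax, ymax) of a square patch centred on (cx, cy)."""
    half = size * res / 2.0
    return cx - half, cy - half, cx + half, cy + half
```

A 42 × 42 patch at 10 m spans 420 m, which is 7 cells of the 60 m grid, and 6 pixels × 10 m = 60 m, so the labelled central 6 × 6 window has the footprint of a single 60 m pixel, provided the 10 m and 60 m grids share an origin.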
Samples with a large number of timestamps can be time-consuming in the I/O stage, so the time-series patches are divided into groups of no more than 100 timestamps each.
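One simple way to form such groups is sequential chunking, sketched below. Whether the dataset authors split sequentially or balanced the group sizes is not stated, so this is an assumption, and the function name is hypothetical.

```python
def split_timestamps(n_timestamps, max_per_group=100):
    """Return (start, stop) index pairs that cover range(n_timestamps)."""
    if n_timestamps < 0 or max_per_group <= 0:
        raise ValueError("n_timestamps must be >= 0 and max_per_group > 0")
    return [
        (start, min(start + max_per_group, n_timestamps))
        for start in range(0, n_timestamps, max_per_group)
    ]
```

Each pair can be used to slice the image array, the label list, and the date list consistently, for example `imgs[start:stop]`.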
Jurisdictional Unit, 2022-05-21. For use with WFDSS, IFTDSS, IRWIN, and InFORM.
This is a feature service which provides Identify and Copy Feature capabilities. If fast-drawing at coarse zoom levels is a requirement, consider using the tile (map) service layer located at https://nifc.maps.arcgis.com/home/item.html?id=3b2c5daad00742cd9f9b676c09d03d13.
Overview
The Jurisdictional Agencies dataset is developed as a national land management geospatial layer, focused on representing wildland fire jurisdictional responsibility, for interagency wildland fire applications, including WFDSS (Wildland Fire Decision Support System), IFTDSS (Interagency Fuels Treatment Decision Support System), IRWIN (Interagency Reporting of Wildland Fire Information), and InFORM (Interagency Fire Occurrence Reporting Modules). It is intended to provide federal wildland fire jurisdictional boundaries on a national scale. The agency and unit names are an indication of the primary manager name and unit name, respectively, recognizing that:
There may be multiple owner names.
Jurisdiction may be held jointly by agencies at different levels of government (i.e., State and Local), especially on private lands.
Some owner names may be blocked for security reasons.
Some jurisdictions may not allow the distribution of owner names.
Private ownerships are shown in this layer with JurisdictionalUnitIdentifier=null, JurisdictionalUnitAgency=null, JurisdictionalUnitKind=null, and LandownerKind="Private", LandownerCategory="Private".
All land inside the US country boundary is covered by a polygon.
Jurisdiction for privately owned land varies widely depending on state, county, or local laws and ordinances, fire workload, and other factors, and is not available in a national dataset in most cases.
For publicly held lands the agency name is the surface managing agency, such as Bureau of Land Management, United States Forest Service, etc. The unit name refers to the descriptive name of the polygon (i.e.
Northern California District, Boise National Forest, etc.).
These data are used to automatically populate fields on the WFDSS Incident Information page.
This data layer implements the NWCG Jurisdictional Unit Polygon Geospatial Data Layer Standard.
Relevant NWCG Definitions and Standards
Unit
2. A generic term that represents an organizational entity that only has meaning when it is contextualized by a descriptor, e.g. jurisdictional.
Definition Extension: When referring to an organizational entity, a unit refers to the smallest area or lowest level. Higher levels of an organization (region, agency, department, etc.) can be derived from a unit based on organization hierarchy.
Unit, Jurisdictional
The governmental entity having overall land and resource management responsibility for a specific geographical area as provided by law.
Definition Extension: 1) Ultimately responsible for the fire report to account for statistical fire occurrence; 2) Responsible for setting fire management objectives; 3) Jurisdiction cannot be re-assigned by agreement; 4) The nature and extent of the incident determines jurisdiction (for example, Wildfire vs. All Hazard); 5) Responsible for signing a Delegation of Authority to the Incident Commander.
See also: Unit, Protecting; Landowner
Unit Identifier
This data standard specifies the standard format and rules for Unit Identifier, a code used within the wildland fire community to uniquely identify a particular government organizational unit.
Landowner Kind & Category
This data standard provides a two-tier classification (kind and category) of landownership.
Attribute Fields
JurisdictionalAgencyKind: Describes the type of unit Jurisdiction using the NWCG Landowner Kind data standard. There are two valid values: Federal and Other. A value may not be populated for all polygons.
JurisdictionalAgencyCategory: Describes the type of unit Jurisdiction using the NWCG Landowner Category data standard.
Valid values include: ANCSA, BIA, BLM, BOR, DOD, DOE, NPS, USFS, USFWS, Foreign, Tribal, City, County, OtherLoc (other local, not in the standard), State. A value may not be populated for all polygons.
JurisdictionalUnitName: The name of the Jurisdictional Unit. Where an NWCG Unit ID exists for a polygon, this is the name used in the Name field from the NWCG Unit ID database. Where no NWCG Unit ID exists, this is the “Unit Name” or other specific, descriptive unit name field from the source dataset. A value is populated for all polygons.
JurisdictionalUnitID: Where it could be determined, this is the NWCG Standard Unit Identifier (Unit ID). Where it is unknown, the value is ‘Null’. Null Unit IDs can occur because a unit may not have a Unit ID, or because one could not be reliably determined from the source data. Not every land ownership has an NWCG Unit ID. Unit ID assignment rules are available from the Unit ID standard, linked above.
LandownerKind: The landowner kind value associated with the polygon. May be inferred from jurisdictional agency, or by lack of a jurisdictional agency. A value is populated for all polygons. There are three valid values: Federal, Private, or Other.
LandownerCategory: The landowner category value associated with the polygon. May be inferred from jurisdictional agency, or by lack of a jurisdictional agency. A value is populated for all polygons. Valid values include: ANCSA, BIA, BLM, BOR, DOD, DOE, NPS, USFS, USFWS, Foreign, Tribal, City, County, OtherLoc (other local, not in the standard), State, Private.
DataSource: The database from which the polygon originated. Be as specific as possible; identify the geodatabase name and feature class in which the polygon originated.
SecondaryDataSource: If the Data Source is an aggregation from other sources, use this field to specify the source that supplied data to the aggregation.
For example, if Data Source is "PAD-US 2.1", then for a USDA Forest Service polygon, the Secondary Data Source would be "USDA FS Automated Lands Program (ALP)". For a BLM polygon in the same dataset, Secondary Source would be "Surface Management Agency (SMA)."
SourceUniqueID: Identifier (GUID or ObjectID) in the data source. Used to trace the polygon back to its authoritative source.
MapMethod: Controlled vocabulary to define how the geospatial feature was derived. Map method may help define data quality. MapMethod will be Mixed Methods by default for this layer as the data are from mixed sources. Valid values include: GPS-Driven; GPS-Flight; GPS-Walked; GPS-Walked/Driven; GPS-Unknown Travel Method; Hand Sketch; Digitized-Image; Digitized-Topo; Digitized-Other; Image Interpretation; Infrared Image; Modeled; Mixed Methods; Remote Sensing Derived; Survey/GCDB/Cadastral; Vector; Phone/Tablet; Other.
DateCurrent: The date of the last edit or update of this GIS record. Date should follow the assigned NWCG Date Time data standard, using 24 hour clock, YYYY-MM-DDhh.mm.ssZ, ISO8601 Standard.
Comments: Additional information describing the feature.
GeometryID: Primary key for linking geospatial objects with other database systems. Required for every feature. This field may be renamed for each standard to fit the feature.
JurisdictionalUnitID_sansUS: NWCG Unit ID with the "US" characters removed from the beginning.
Provided for backwards compatibility.
JoinMethod: Additional information on how the polygon was matched to information in the NWCG Unit ID database.
LocalName: Local name for the polygon provided from PAD-US or other source.
LegendJurisdictionalAgency: Jurisdictional Agency, but smaller landholding agencies, or agencies of indeterminate status, are grouped for more intuitive use in a map legend or summary table.
LegendLandownerAgency: Landowner Agency, but smaller landholding agencies, or agencies of indeterminate status, are grouped for more intuitive use in a map legend or summary table.
DataSourceYear: Year that the source data for the polygon were acquired.
Data Input
This dataset is based on an aggregation of 4 spatial data sources: Protected Areas Database US (PAD-US 2.1), data from Bureau of Indian Affairs regional offices, the BLM Alaska Fire Service/State of Alaska, and Census Block-Group Geometry. NWCG Unit ID and Agency Kind/Category data are tabular and sourced from UnitIDActive.txt, in the WFMI Unit ID application (https://wfmi.nifc.gov/unit_id/Publish.html). Areas with unknown Landowner Kind/Category and Jurisdictional Agency Kind/Category are assigned LandownerKind and LandownerCategory values of "Private" by use of the non-water polygons from the Census Block-Group geometry.
PAD-US 2.1:
This dataset is based in large part on the USGS Protected Areas Database of the United States - PAD-US 2.1. PAD-US is a compilation of authoritative protected areas data between agencies and organizations that ultimately results in a comprehensive and accurate inventory of protected areas for the United States to meet a variety of needs (e.g. conservation, recreation, public health, transportation, energy siting, ecological, or watershed assessments and planning). Extensive documentation on PAD-US processes and data sources is available.
How these data were aggregated:
Boundaries, and their descriptors, available in spatial databases (i.e.
shapefiles or geodatabase feature classes) from land management agencies are the desired and primary data sources in PAD-US. If these authoritative sources are unavailable, or the agency recommends another source, data may be incorporated by other aggregators such as non-governmental organizations. Data sources are tracked for each record in the PAD-US geodatabase (see below).
BIA and Tribal Data:
BIA and Tribal land management data are not available in PAD-US. As such, data were aggregated from BIA regional offices. These data date from 2012 and were substantially updated in 2022.
Indian Trust Land affiliated with Tribes, Reservations, or BIA Agencies: These data are not considered the system of record and are not intended to be used as such. The Bureau of Indian Affairs (BIA), Branch of Wildland Fire Management (BWFM) is not the originator of these data. The
The USGS Protected Areas Database of the United States (PAD-US) is the nation's inventory of protected areas, including public open space and voluntarily provided, private protected areas, identified as an A-16 National Geospatial Data Asset in the Cadastral Theme (http://www.fgdc.gov/ngda-reports/NGDA_Datasets.html). PAD-US is an ongoing project with several published versions of a spatial database of areas dedicated to the preservation of biological diversity, and other natural, recreational or cultural uses, managed for these purposes through legal or other effective means. The geodatabase maps and describes public open space and other protected areas. Most areas are public lands owned in fee; however, long-term easements, leases, and agreements or administrative designations documented in agency management plans may be included. The PAD-US database strives to be a complete “best available” inventory of protected areas (lands and waters) including data provided by managing agencies and organizations. The dataset is built in collaboration with several partners and data providers (http://gapanalysis.usgs.gov/padus/stewards/). See the Supplemental Information Section of this metadata record for more information on partnerships and links to major partner organizations. Because this dataset is a compilation of many data sets, data completeness, accuracy, and scale may vary. Federal and state data are generally complete, while local government and private protected area coverage is about 50% complete and depends on data management capacity in the state. For completeness estimates by state, see http://www.protectedlands.net/partners. Because the federal and state data are reasonably complete, focus is shifting to completing the inventory of local government and voluntarily provided, private protected areas.
The PAD-US geodatabase contains over twenty-five attributes and four feature classes to support data management, queries, web mapping services and analyses: Marine Protected Areas (MPA), Fee, Easements and Combined. The data contained in the MPA feature class are provided directly by the National Oceanic and Atmospheric Administration (NOAA) Marine Protected Areas Center (MPA, http://marineprotectedareas.noaa.gov), which tracks the National Marine Protected Areas System. The Easements feature class contains data provided directly from the National Conservation Easement Database (NCED, http://conservationeasement.us). The MPA and Easement feature classes contain some attributes unique to the sole source databases tracking them (e.g. Easement Holder Name from NCED, Protection Level from NOAA MPA Inventory). The "Combined" feature class integrates all fee, easement and MPA features as the best available national inventory of protected areas in the standard PAD-US framework. In addition to geographic boundaries, PAD-US describes the protection mechanism category (e.g. fee, easement, designation, other), owner and managing agency, designation type, unit name, area, public access and state name in a suite of standardized fields. An informative set of references (i.e. Aggregator Source, GIS Source, GIS Source Date) and "local" or source data fields provide a transparent link between standardized PAD-US fields and information from authoritative data sources. The areas in PAD-US are also assigned conservation measures that assess management intent to permanently protect biological diversity: the nationally relevant "GAP Status Code" and global "IUCN Category" standard. A wealth of attributes facilitates a wide variety of data analyses and creates a context for data to be used at local, regional, state, national and international scales.
More information about specific updates and changes to this PAD-US version can be found in the Data Quality Information section of this metadata record as well as on the PAD-US website, http://gapanalysis.usgs.gov/padus/data/history/. Due to the completeness and complexity of these data, it is highly recommended to review the Supplemental Information Section of the metadata record as well as the Data Use Constraints, to better understand data partnerships and to see tips on appropriate uses of the data and how to parse out the data that you are looking for. For more information regarding the PAD-US dataset, please visit http://gapanalysis.usgs.gov/padus/. To find more data resources and to view example analyses performed using PAD-US data, visit http://gapanalysis.usgs.gov/padus/resources/. The PAD-US dataset and data standard are compiled and maintained by the USGS Gap Analysis Program, http://gapanalysis.usgs.gov/. For more information about data standards and how the data are aggregated, please review the “Standards and Methods Manual for PAD-US,” http://gapanalysis.usgs.gov/padus/data/standards/.
The National Marine Fisheries Service (NMFS) developed this geodatabase to standardize its Endangered Species Act (ESA) critical habitat spatial data. The spatial data represent critical habitat locations; however, the complete description and official boundaries of critical habitat proposed or designated by NMFS are provided in proposed rules, final rules, and the Code of Federal Regulations (50 CFR 226). Official critical habitat boundaries may include regulatory text that modifies or clarifies maps and spatial data. Proposed rules, final rules, and the CFR also describe any areas that are excluded from critical habitat or otherwise not part of critical habitat (e.g., ineligible areas), some of which have not been clipped out of the spatial data.
Geodatabase feature classes are organized by ESA listed entities. A listed entity can be a species, subspecies, distinct population segment (DPS), or evolutionarily significant unit (ESU). NMFS and the U.S. Fish and Wildlife Service share jurisdiction of some listed entities; this geodatabase only contains spatial data for NMFS critical habitat. Critical habitat has not been designated for all listed entities.
Generally, each listed entity has one feature class. However, a listed entity may have critical habitat locations represented by both lines and polygons. In these instances, "_poly" and "_line" are appended to the feature class names to differentiate between the spatial data types. Lines represent rivers, streams, or beaches and polygons represent waterbodies, marine areas, estuaries, marshes, or watersheds. The 8-digit date (YYYYMMDD) in each feature class name is the publication date of the proposed or final rule in the Federal Register. Both proposed and designated critical habitat are included in this geodatabase. To differentiate between these categories, all proposed critical habitat feature classes begin with "Proposed_".
Proposed critical habitat will be replaced by final designations soon after a final rule is published in the Federal Register. This geodatabase version may not include spatial data for recently proposed, modified, or designated critical habitat. Additionally, spatial data are not available for the designated critical habitat of the Southern Oregon/Northern California Coast coho salmon ESU and the Snake River spring/summer-run Chinook salmon ESU. NMFS will add these spatial data when they become available. In the meantime, please consult the final rules or CFR. NMFS may periodically update existing lines or polygons if better information becomes available, such as higher resolution bathymetric surveys. The "All_critical_habitat" feature dataset includes merged line and polygon feature classes that contain all available spatial data for critical habitat proposed or designated by NMFS; therefore, these feature classes contain overlapping features. The "All_critical_habitat_line_YYYYMMDD" and "All_critical_habitat_poly_YYYYMMDD" feature classes should be used together to represent all available spatial data. The date appended to the feature class names is the date the geoprocessing (merge) occurred. Features in this geodatabase were compiled from previously developed spatial data. The methods and sources used to create these spatial data are NOT standardized. Coastlines, bathymetric contours, and river lines, for example, were all derived from a variety of sources, using many different geoprocessing techniques, over the span of decades. If information was available on source data and/or processing steps, it was documented in the metadata lineage. Metadata descriptions and the "Notes" field describe line and boundary definitions. Line and boundary definitions are specific to each proposed or designated critical habitat dataset.
For example, depending on the listed entity, a coastline could represent the Mean Higher High Water (MHHW) line in one designation and the Mean Lower Low Water (MLLW) line in another designation. Metadata for each feature class is a combination of standardized and unique content. Standardized content includes the field and value definitions, spatial reference (WGS 84 geographic coordinate system), and metadata style (ISO 19139). All other metadata content is unique to each feature class.
eCFR official ESA list
eCFR official NMFS critical habitat designations
NMFS critical habitat website
NMFS maps and GIS data directory
NMFS ESA threatened and endangered species directory
NMFS ESA regulations and actions directory
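The feature class naming conventions described above ("Proposed_" prefix, "_poly" or "_line" marker, 8-digit date) lend themselves to a small parser. The sketch below is illustrative only: the function name is hypothetical, the example names in the usage note are made up, and the exact position of the geometry marker relative to the date is assumed to vary, so both orders are accepted.

```python
import re
from datetime import datetime


def describe_feature_class(name):
    """Extract proposal status, geometry marker, and date from a class name."""
    geometry = re.search(r"_(poly|line)(?:_|$)", name)
    date = re.search(r"(?<!\d)\d{8}(?!\d)", name)
    return {
        # Proposed critical habitat classes begin with "Proposed_".
        "proposed": name.startswith("Proposed_"),
        # None when the entity has a single, unmarked feature class.
        "geometry": geometry.group(1) if geometry else None,
        # Rule publication date, or the merge date for "All_critical_habitat" classes.
        "date": datetime.strptime(date.group(0), "%Y%m%d").date() if date else None,
    }
```

For instance, a made-up name such as `Proposed_ExampleSpecies_20230719_poly` would be reported as proposed, with a polygon marker and a date of 19 July 2023.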
Layers are organized by ESA listed entities. A listed entity can be a species, subspecies, distinct population segment (DPS), or evolutionarily significant unit (ESU). NMFS and the U.S. Fish and Wildlife Service share jurisdiction of some listed entities; this service only contains spatial data for NMFS critical habitat. Critical habitat has not been designated for all listed entities.
Generally, each listed entity has one layer. However, a listed entity may have critical habitat locations represented by both lines and polygons. In these instances, "_poly" and "_line" are appended to the layer names to differentiate between the spatial data types. Lines represent rivers, streams, or beaches and polygons represent waterbodies, marine areas, estuaries, marshes, or watersheds. The 8-digit date (YYYYMMDD) in each layer name is the publication date of the proposed or final rule in the Federal Register.
Both proposed and designated critical habitat are included in this service. To differentiate between these categories, all proposed critical habitat layers begin with "Proposed_". Proposed critical habitat will be replaced by final designations soon after a final rule is published in the Federal Register. This service version may not include spatial data for recently proposed, modified, or designated critical habitat. Additionally, spatial data are not available for the designated critical habitat of the Southern Oregon/Northern California Coast coho salmon ESU and the Snake River spring/summer-run Chinook salmon ESU. NMFS will add these spatial data when they become available. In the meantime, please consult the final rules or CFR.
NMFS may periodically update existing lines or polygons if better information becomes available, such as higher resolution bathymetric surveys.
The "All_critical_habitat" layer group includes merged line and polygon feature classes that contain all available spatial data for critical habitat proposed or designated by NMFS; therefore, these layers contain overlapping features. The "All_critical_habitat_line_YYYYMMDD" and "All_critical_habitat_poly_YYYYMMDD" layers should be used together to represent all available spatial data. The date appended to the layer names is the date the geoprocessing (merge) occurred.
Features in this service were compiled from previously developed spatial data. The methods and sources used to create these spatial data are NOT standardized. Coastlines, bathymetric contours, and river lines, for example, were all derived from a variety of sources, using many different geoprocessing techniques, over the span of decades. If information was available on source data and/or processing steps, it was documented in the metadata lineage. Metadata descriptions and the "Notes" field describe line and boundary definitions. Line and boundary definitions are specific to each proposed or designated critical habitat dataset. For example, depending on the listed entity, a coastline could represent the Mean Higher High Water (MHHW) line in one designation and the Mean Lower Low Water (MLLW) line in another designation.
Metadata for each layer is a combination of standardized and unique content and can be viewed at https://www.fisheries.noaa.gov/inport/item/65207. Standardized content includes the field and value definitions, spatial reference, and metadata style (ISO 19139). All other metadata content is unique to each layer.
These data have been made publicly available from an authoritative source other than this Atlas and data should be obtained directly from that source for any re-use.
See the original metadata from the authoritative source for more information about these data and use limitations. The authoritative source of these data can be found at the following location: NMFS Critical Habitat
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
MIT License: https://opensource.org/licenses/MIT
http://inspire.ec.europa.eu/metadata-codelist/LimitationsOnPublicAccess/noLimitations
The dataset contains information on the European river basin districts, the river basin district sub-units, the surface water bodies and the groundwater bodies delineated for the 1st River Basin Management Plans (RBMP) under the Water Framework Directive (WFD) as well as the European monitoring sites used for the assessment of the status of the abovementioned surface water bodies and groundwater bodies.
The information was reported to the European Commission under the Water Framework Directive (WFD) reporting obligations.
The dataset compiles the available spatial data related to the 1st RBMPs which were due in 2010 (hereafter WFD2010). See http://rod.eionet.europa.eu/obligations/521 for further information on the WFD2010 reporting.
It was prepared to support the reporting of the 2nd RBMPs due in 2016 (hereafter WFD2016). See http://rod.eionet.europa.eu/obligations/715 for further information on the WFD2016 reporting.
The data reported in WFD2010 were updated using data reported in WFD2016, whenever the spatial objects are identical in 2010 and 2016. For WFD2010 objects, some information may be missing, if the objects no longer exist in the 2nd River Basin Management Plans, and were not reported in WFD2016.
Relevant concepts:
River basin district (RBD): The area of land and sea, made up of one or more neighbouring river basins together with their associated groundwaters and coastal waters, which is the main unit for management of river basins.
River basin: The area of land from which all surface run-off flows through a sequence of streams, rivers and, possibly, lakes into the sea at a single river mouth, estuary or delta.
Sub-basin: The area of land from which all surface run-off flows through a series of streams, rivers and, possibly, lakes to a particular point in a water course (normally a lake or a river confluence).
Sub-unit [Operational definition. Not in the WFD]: Reporting unit. River basin districts larger than 50000 square kilometre should be divided into comparable sub-units with an area between 5000 and 50000 square kilometre. The sub-units should be created using river basins (if more than one river basin exists in the RBD), set of contiguous river basins, or sub-basins, for example. If the RBD area is less than 50000 square kilometre, the RBD itself should be used as a sub-unit.
Surface water body: Body of surface water means a discrete and significant element of surface water such as a lake, a reservoir, a stream, river or canal, part of a stream, river or canal, a transitional water or a stretch of coastal water.
Surface water: Inland waters, except groundwater; transitional waters and coastal waters, except in respect of chemical status for which it shall also include territorial waters.
Inland water: All standing or flowing water on the surface of the land, and all groundwater on the landward side of the baseline from which the breadth of territorial waters is measured.
River: Body of inland water flowing for the most part on the surface of the land but which may flow underground for part of its course.
Lake: Body of standing inland surface water.
Transitional waters: Bodies of surface water in the vicinity of river mouths which are partly saline in character as a result of their proximity to coastal waters but which are substantially influenced by freshwater flows.
Coastal water: Surface water on the landward side of a line, every point of which is at a distance of one nautical mile on the seaward side from the nearest point of the baseline from which the breadth of territorial waters is measured, extending where appropriate up to the outer limit of transitional waters.
Territorial sea: The territorial waters, or territorial sea as defined by the 1982 United Nations Convention on the Law of the Sea, extend up to a limit not exceeding 12 nautical miles (22.2 km), measured from the baseline. The normal baseline is the low-water line along the coast.
Territorial waters [Operational definition. Not in WFD.]: Reporting unit. The zone between the limit of the coastal water bodies and the limit of the territorial sea, geometrically subdivided in Thiessen polygons according to the adjacent coastal sub-unit (or using any alternative delineation provided by the national competent authorities), and assigned to an adjacent sub-unit for the purpose of reporting the chemical status of the territorial waters under the Water Framework Directive.
Groundwater body: 'Body of groundwater' means a distinct volume of groundwater within an aquifer or aquifers.
Groundwater: All water which is below the surface of the ground in the saturation zone and in direct contact with the ground or subsoil.
Aquifer: Subsurface layer or layers of rock or other geological strata of sufficient porosity and permeability to allow either a significant flow of groundwater or the abstraction of significant quantities of groundwater.
Monitoring site: [Operational definition. Not in the WFD] Monitoring point included in a WFD surveillance, operational or investigative monitoring programme.
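The sub-unit sizing guidance in the definitions above can be expressed as a simple check. This is a minimal sketch under stated assumptions: the function names are hypothetical, areas are given in square kilometres, and an RBD of exactly 50000 square kilometres (a case the definition does not address) is treated as not needing subdivision.

```python
MIN_SUBUNIT_KM2 = 5_000
MAX_SUBUNIT_KM2 = 50_000


def needs_subdivision(rbd_area_km2: float) -> bool:
    """An RBD larger than 50000 km2 should be divided into sub-units."""
    # Assumption: exactly 50000 km2 is treated as not needing subdivision.
    return rbd_area_km2 > MAX_SUBUNIT_KM2


def check_subunits(rbd_area_km2, subunit_areas_km2):
    """Return a list of problems; an empty list means the guidance is met."""
    problems = []
    if not needs_subdivision(rbd_area_km2):
        # A small RBD should itself be used as the single sub-unit.
        if len(subunit_areas_km2) != 1:
            problems.append("RBD at or under 50000 km2 should be one sub-unit")
        return problems
    for i, area in enumerate(subunit_areas_km2):
        if not (MIN_SUBUNIT_KM2 <= area <= MAX_SUBUNIT_KM2):
            problems.append(f"sub-unit {i}: {area} km2 is outside 5000-50000")
    return problems


# Illustrative areas only, not taken from any reported RBD.
ok_large = check_subunits(120_000, [40_000, 45_000, 35_000])
bad_large = check_subunits(120_000, [60_000, 60_000])
```

Because the definition says sub-units "should" fall in this range, the result is better read as a list of warnings than as hard validation errors.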
description: The Geospatial Fabric provides a consistent, documented, and topologically connected set of spatial features that create an abstracted stream/basin network of features useful for hydrologic modeling. The GIS vector features contained in this Geospatial Fabric (GF) data set cover the lower 48 U.S. states, Hawaii, and Puerto Rico. Four GIS feature classes are provided for each Region: 1) the Region outline ("one"), 2) Points of Interest ("POIs"), 3) a routing network ("nsegment"), and 4) Hydrologic Response Units ("nhru"). A graphic showing the boundaries for all Regions is provided at http://dx.doi.org/doi:10.5066/F7542KMD. These Regions are identical to those used to organize the NHDPlus v.1 dataset (US EPA and US Geological Survey, 2005). Although the GF Feature data set has been derived from NHDPlus v.1, it is an entirely new data set that has been designed to generically support regional and national scale applications of hydrologic models. Definition of each type of feature class and its derivation is provided within the
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
Overview
3DHD CityScenes is the most comprehensive, large-scale high-definition (HD) map dataset to date, annotated in the three spatial dimensions of globally referenced, high-density LiDAR point clouds collected in urban domains. Our HD map covers 127 km of road sections of the inner city of Hamburg, Germany including 467 km of individual lanes. In total, our map comprises 266,762 individual items.
Our corresponding paper (published at ITSC 2022) is available here.
Further, we have applied 3DHD CityScenes to map deviation detection here.
Moreover, we release code to facilitate the application of our dataset and the reproducibility of our research. Specifically, our 3DHD_DevKit comprises:
The DevKit is available here:
https://github.com/volkswagen/3DHD_devkit.
The dataset and DevKit have been created by Christopher Plachetka as project lead during his PhD period at Volkswagen Group, Germany.
When using our dataset, you are welcome to cite:
@INPROCEEDINGS{9921866,
author={Plachetka, Christopher and Sertolli, Benjamin and Fricke, Jenny and Klingner, Marvin and
Fingscheidt, Tim},
booktitle={2022 IEEE 25th International Conference on Intelligent Transportation Systems (ITSC)},
title={3DHD CityScenes: High-Definition Maps in High-Density Point Clouds},
year={2022},
pages={627-634}}
Acknowledgements
We thank the following interns for their exceptional contributions to our work.
The European large-scale project Hi-Drive (www.Hi-Drive.eu) supports the publication of 3DHD CityScenes and encourages the general publication of information and databases facilitating the development of automated driving technologies.
The Dataset
After downloading, the 3DHD_CityScenes folder provides five subdirectories, which are explained briefly in the following.
1. Dataset
This directory contains the training, validation, and test set definition (train.json, val.json, test.json) used in our publications. Respective files contain samples that define a geolocation and the orientation of the ego vehicle in global coordinates on the map.
During dataset generation (done by our DevKit), samples are used to take crops from the larger point cloud. Also, map elements in reach of a sample are collected. Both modalities can then be used, e.g., as input to a neural network such as our 3DHDNet.
To read any JSON-encoded data provided by 3DHD CityScenes in Python, you can use the following code snippet as an example.
import json
# Adjust the path to wherever the dataset was downloaded.
json_path = r"E:\3DHD_CityScenes\Dataset\train.json"
with open(json_path) as jf:
    data = json.load(jf)
print(data)
2. HD_Map
Map items are stored as lists of items in JSON format. In particular, we provide:
3. HD_Map_MetaData
Our high-density point cloud, used as the basis for annotating the HD map, is split into 648 tiles. This directory contains the geolocation of each tile as a polygon on the map. You can view the respective tile definition using QGIS. Alternatively, we also provide the respective polygons as lists of UTM coordinates in JSON.
Files with the ending .dbf, .prj, .qpj, .shp, and .shx belong to the tile definition as “shape file” (commonly used in geodesy) that can be viewed using QGIS. The JSON file contains the same information provided in a different format used in our Python API.
4. HD_PointCloud_Tiles
The high-density point cloud tiles are provided in global UTM32N coordinates and are encoded in a proprietary binary format. The first 4 bytes (integer) encode the number of points contained in that file. Subsequently, all point cloud values are provided as arrays. First all x-values, then all y-values, and so on. Specifically, the arrays are encoded as follows.
After reading, the respective values have to be unnormalized. As an example, you can use the following code snippet to read the point cloud data. For visualization, you can use the pptk package, for instance.
import numpy as np
import pptk
file_path = r"E:\3DHD_CityScenes\HD_PointCloud_Tiles\HH_001.bin"
pc_dict = {}
key_list = ['x', 'y', 'z', 'intensity', 'is_ground']
type_list = ['
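The layout described in this section (a 4-byte point count followed by one array per attribute, all x-values first, then all y-values, and so on) can be sketched with the Python standard library as follows. This is only an illustration under loud assumptions: the per-attribute encodings in ASSUMED_CODES, the little-endian byte order, and the helper names write_tile and read_tile are hypothetical placeholders, not the dataset's documented format, and the unnormalization step is omitted because its parameters are not given here.

```python
import os
import struct
import tempfile

# Hypothetical struct codes per attribute: the real encodings are not stated
# in this text, so these are placeholders that only make the sketch runnable.
ASSUMED_CODES = {"x": "i", "y": "i", "z": "i", "intensity": "B", "is_ground": "?"}


def write_tile(path, columns, codes=ASSUMED_CODES):
    """Write a 4-byte point count followed by one array per attribute."""
    num_points = len(next(iter(columns.values())))
    with open(path, "wb") as f:
        f.write(struct.pack("<i", num_points))  # assumed little-endian int32
        for key, code in codes.items():
            f.write(struct.pack(f"<{num_points}{code}", *columns[key]))


def read_tile(path, codes=ASSUMED_CODES):
    """Read the layout back: count first, then all x, all y, and so on."""
    pc_dict = {}
    with open(path, "rb") as f:
        (num_points,) = struct.unpack("<i", f.read(4))
        for key, code in codes.items():
            fmt = f"<{num_points}{code}"
            values = struct.unpack(fmt, f.read(struct.calcsize(fmt)))
            pc_dict[key] = list(values)
    return pc_dict


# Round trip on synthetic values (not real dataset content).
synthetic = {
    "x": [1, 2, 3],
    "y": [10, 20, 30],
    "z": [-1, 0, 1],
    "intensity": [0, 128, 255],
    "is_ground": [True, False, True],
}
with tempfile.TemporaryDirectory() as tmp:
    tile_path = os.path.join(tmp, "synthetic_tile.bin")
    write_tile(tile_path, synthetic)
    round_trip = read_tile(tile_path)
```

For real tiles, the attribute encodings should be taken from the DevKit, and a NumPy-based reader will be considerably faster than struct for files with many points.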
5. Trajectories
We provide 15 real-world trajectories recorded during a measurement campaign covering the whole HD map. Trajectory samples are provided at approximately 30 Hz and are encoded in JSON.
These trajectories were used to provide the samples in train.json, val.json, and test.json with realistic geolocations and orientations of the ego vehicle.
- OP1 – OP5 cover the majority of the map with 5 trajectories.
- RH1 – RH10 cover the majority of the map with 10 trajectories.
Note that OP5 is split into three separate parts, a-c. RH9 is split into two parts, a-b. Moreover, OP4 mostly equals OP1 (thus, we speak of 14 trajectories in our paper). For completeness, however, we provide all recorded trajectories here.
These data represent the GIS version of the Public Land Survey System (PLSS), including both rectangular and non-rectangular survey data. The rectangular survey data are a reference system for land tenure based upon meridian, township/range, section, section subdivision and government lots. The non-rectangular survey data represent surveys that were largely performed to protect and/or convey title on specific parcels of land such as mineral surveys and tracts. The data are largely complete in reference to the rectangular survey data at the level of first division. However, the data vary in the granularity of their spatial representation as well as in their content below the first division. Therefore, depending upon the data source and steward, accurate subdivision of the rectangular data may not be available below the first division and the non-rectangular mineral surveys may not be present. At times, the complexity of surveys rendered the collection of data cost prohibitive, such as in areas characterized by numerous, overlapping mineral surveys. In these situations, the data were often not abstracted or were only partially abstracted and incorporated into the data set. These PLSS data were compiled from a broad spectrum of sources including federal, county, and private survey records such as field notes and plats as well as map sources such as USGS 7 ½ minute quadrangles. The metadata in each data set describes the production methods for the data content. This data set is optimized for data publication and sharing rather than for specific "production" or operation and maintenance. A complete PLSS data set includes the following: PLSS Townships, First Divisions and Second Divisions (the hierarchical breakdown of the PLSS rectangular surveys); PLSS Special Surveys (non-rectangular components of the PLSS); Meandered Water; Corners; Metadata at a Glance (which identifies the last revised date and data steward); and Conflicted Areas (known areas of gaps, overlaps or inconsistencies).
The Entity-Attribute section of this metadata describes these components in greater detail. The second division of the PLSS is the quarter, quarter-quarter, sixteenth or government lot division of the PLSS. The second and third divisions are combined into this feature class as an intentional de-normalization of the PLSS hierarchical data. The polygons in this feature class represent the smallest division to the sixteenth that has been defined for the first division. For example, in some cases sections have only been divided to the quarter. Divisions below the sixteenth are in the Special Survey or Parcel Feature Class.
Description
This is a vector tile layer built from the same data as the Jurisdictional Units Public feature service located here: https://nifc.maps.arcgis.com/home/item.html?id=4107b5d1debf4305ba00e929b7e5971a. This service can be used alone as a fast-drawing background layer, or used in combination with the feature service when Identify and Copy Feature capabilities are needed. At fine zoom levels, the feature service will be needed.
Overview
The Jurisdictional Units dataset outlines wildland fire jurisdictional boundaries for federal, state, and local government entities on a national scale and is used within multiple wildland fire systems including the Wildland Fire Decision Support System (WFDSS), the Interior Fuels and Post-Fire Reporting System (IFPRS), the Interagency Fuels Treatment Decision Support System (IFTDSS), the Interagency Fire Occurrence Reporting Modules (InFORM), the Interagency Reporting of Wildland Fire Information System (IRWIN), and the Wildland Computer-Aided Dispatch Enterprise System (WildCAD-E).
In this dataset, agency and unit names are an indication of the primary manager’s name and unit name, respectively, recognizing that:
- There may be multiple owner names.
- Jurisdiction may be held jointly by agencies at different levels of government (i.e., State and Local), especially on private lands.
- Some owner names may be blocked for security reasons.
- Some jurisdictions may not allow the distribution of owner names.
Private ownerships are shown in this layer with JurisdictionalUnitID=null, JurisdictionalKind=null, and LandownerKind="Private", LandownerCategory="Private". All land inside the US country boundary is covered by a polygon.
Jurisdiction for privately owned land varies widely depending on state, county, or local laws and ordinances, fire workload, and other factors, and is not available in a national dataset in most cases.
For publicly held lands the agency name is the surface managing agency, such as Bureau of Land Management, United States Forest Service, etc. The unit name refers to the descriptive name of the polygon (e.g., Northern California District, Boise National Forest).
Attributes
Field Name: Definition
GeometryID: Primary key for linking geospatial objects with other database systems. Required for every feature. Not populated for Census Block Groups.
JurisdictionalUnitID: Where it could be determined, this is the NWCG Unit Identifier (Unit ID). Where it is unknown, the value is ‘Null’. Null Unit IDs can occur because a unit may not have a Unit ID, or because one could not be reliably determined from the source data. Not every land ownership has an NWCG Unit ID. Unit ID assignment rules are available in the Unit ID standard.
JurisdictionalUnitID_sansUS: NWCG Unit ID with the "US" characters removed from the beginning. Provided for backwards compatibility.
JurisdictionalUnitName: The name of the Jurisdictional Unit. Where an NWCG Unit ID exists for a polygon, this is the name used in the Name field from the NWCG Unit ID database. Where no NWCG Unit ID exists, this is the “Unit Name” or other specific, descriptive unit name field from the source dataset. A value is populated for all polygons except for Census Block Groups and for PAD-US polygons that did not have an associated name.
LocalName: Local name for the polygon provided from agency authoritative data, PAD-US, or other source.
JurisdictionalKind: Describes the type of unit jurisdiction using the NWCG Landowner Kind data standard. There are three valid values: Federal, Other, and Private. A value is not populated for Census Block Groups.
JurisdictionalCategory: Describes the type of unit jurisdiction using the NWCG Landowner Category data standard. Valid values include: BIA, BLM, BOR, DOD, DOE, NPS, USFS, USFWS, Foreign, Tribal, City, County, State, OtherLoc (other local, not in the standard), Private, and ANCSA. A value is not populated for Census Block Groups.
LandownerKind: The landowner kind value associated with the polygon. May be inferred from jurisdictional agency, or by lack of a jurisdictional agency. Legal values align with the NWCG Landowner Kind data standard. A value is populated for all polygons.
LandownerCategory: The landowner category value associated with the polygon. May be inferred from jurisdictional agency, or by lack of a jurisdictional agency. Legal values align with the NWCG Landowner Category data standard. A value is populated for all polygons.
LandownerDepartment: Federal department information that aligns with a unit’s landownerCategory information. Legal values include: Department of Agriculture, Department of Interior, Department of Defense, and Department of Energy. A value is not populated for all polygons.
DataSource: The database from which the polygon originated. An effort is made to be as specific as possible (i.e., identify the geodatabase name and feature class in which the polygon originated).
SecondaryDataSource: If the DataSource field is an aggregation from other sources, use this field to specify the source that supplied data to the aggregation. For example, if DataSource is "PAD-US 4.0", then for a TNC polygon, the SecondaryDataSource would be "TNC_PADUS2_0_SA2015_Public_gdb".
SourceUniqueID: Identifier (GUID or ObjectID) in the data source. Used to trace the polygon back to its authoritative source.
DataSourceYear: Year that the source data for the polygon were acquired.
MapMethod: Controlled vocabulary to define how the geospatial feature was derived. MapMethod will be Mixed Methods by default for this layer as the data are from mixed sources. Valid values include: GPS-Driven; GPS-Flight; GPS-Walked; GPS-Walked/Driven; GPS-Unknown Travel Method; Hand Sketch; Digitized-Image; Digitized-Topo; Digitized-Other; Image Interpretation; Infrared Image; Modeled; Mixed Methods; Remote Sensing Derived; Survey/GCDB/Cadastral; Vector; Phone/Tablet; Other.
DateCurrent: The date of the last edit or update of this GIS record. Date should follow the assigned NWCG Date Time data standard, using the 24-hour clock, YYYY-MM-DDhh.mm.ssZ, ISO8601 Standard.
Comments: Additional information describing the feature.
JoinMethod: Additional information on how the polygon was matched to information in the NWCG Unit ID database.
LegendJurisdictionalCategory: JurisdictionalCategory values grouped for more intuitive use in a map legend or summary table. Census Block Groups are classified as “No Unit”.
LegendLandownerCategory: LandownerCategory values grouped for more intuitive use in a map legend or summary table.
Other Relevant NWCG Definition Standards
Unit: A generic term that represents an organizational entity that only has meaning when it is contextualized by a descriptor, e.g. jurisdictional.
Definition Extension: When referring to an organizational entity, a unit refers to the smallest area or lowest level. Higher levels of an organization (region, agency, department, etc.) can be derived from a unit based on organization hierarchy.
Unit, Jurisdictional: The governmental entity having overall land and resource management responsibility for a specific geographical area as provided by law.
Definition Extension: 1) Ultimately responsible for the fire report to account for statistical fire occurrence; 2) Responsible for setting fire management objectives; 3) Jurisdiction cannot be re-assigned by agreement; 4) The nature and extent of the incident determines jurisdiction (for example, Wildfire vs. All Hazard); 5) Responsible for signing a Delegation of Authority to the Incident Commander.
See also: Protecting Unit; Landowner
Data Sources
This dataset is an aggregation of multiple spatial data sources:
- Authoritative land ownership records from BIA, BLM, NPS, USFS, USFWS, and the Alaska Fire Service/State of Alaska
- The Protected Areas Database US (PAD-US 4.0)
- Census Block-Group Geometry
BIA and Tribal Data:
BIA and Tribal land management data were aggregated from BIA regional offices. These data date from 2012 and were reviewed/updated in 2024. Indian Trust Land affiliated with Tribes, Reservations, or BIA Agencies: These data are not considered the system of record and are not intended to be used as such. The Bureau of Indian Affairs (BIA), Branch of Wildland Fire Management (BWFM) is not the originator of these data. The spatial data coverage is a consolidation of the best available records/data received from each of the 12 BIA Regional Offices. The data are no better than the original sources from which they were derived. Care was taken when consolidating these files. However, BWFM cannot accept any responsibility for errors, omissions, or positional accuracy in the original digital data. The information contained in these data is dynamic and is continually changing. Updates to these data will be made whenever such data are received from a Regional Office. The BWFM gives no guarantee, expressed, written, or implied, regarding the accuracy, reliability, or completeness of these data.
Alaska:
The state of Alaska and Alaska Fire Service (BLM) co-manage a process to aggregate authoritative land ownership, management, and jurisdictional boundary data, based on Master Title Plats.
Data Processing
To compile this dataset, the authoritative land ownership records and the PAD-US data mentioned above were crosswalked into the Jurisdictional Unit Polygon schema and aggregated through a series of Python scripts and FME models. Once aggregated, steps were taken to reduce overlaps within the data. All overlap areas larger than 300 acres were manually examined and removed with the assistance of fire management SMEs. Once overlaps were removed, Census Block Group geometries were crosswalked to the Jurisdictional Unit Polygon schema and appended in areas in which no jurisdictional boundaries were recorded within the authoritative land ownership records and the PAD-US data. Census Block Group geometries represent areas of unknown Landowner Kind/Category and Jurisdictional Kind/Category and were assigned LandownerKind and LandownerCategory values of "Private".
Update
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
Australia's Land Borders is a product within the Foundation Spatial Data Framework (FSDF) suite of datasets. It is endorsed by ANZLIC - the Spatial Information Council and the Intergovernmental Committee on Surveying and Mapping (ICSM) as a nationally consistent and topologically correct representation of the land borders published by the Australian states and territories. The purpose of this product is to provide: (i) a building block which enables development of other national datasets; (ii) integration with other geospatial frameworks in support of data analysis; and (iii) visualisation of these borders as cartographic depiction on a map. Although this dataset depicts land borders, it is not, nor does it purport to be, a legal definition of these borders. Therefore it cannot and must not be used for use-cases pertaining to legal context. This product is constructed by Geoscience Australia (GA), on behalf of the ICSM, from authoritative open data published by the land mapping agencies in their respective Australian state and territory jurisdictions. Construction of a nationally consistent dataset required harmonisation and mediation of data issues at abutting land borders. In order to make informed and consistent determinations, other datasets were used as a visual aid in determining which elements of published jurisdictional data to promote into the national product. These datasets include, but are not restricted to: (i) PSMA Australia's commercial products such as the cadastral (property) boundaries (CadLite) and Geocoded National Address File (GNAF); (ii) Esri's World Imagery and Imagery with Labels base maps; and (iii) Geoscience Australia's GEODATA TOPO 250K Series 3.
Where practical, Land Borders do not cross cadastral boundaries and are logically consistent with addressing data in GNAF. It is important to reaffirm that although third-party commercial datasets are used for validation, which is within the remit of the licence agreement between PSMA and GA, no commercially licensed data has been promoted into the product. Australian Land Borders are constructed exclusively from published open data originating from state, territory and federal agencies. This foundation dataset consists of edges (polylines) representing mediated segments of state and/or territory borders, connected at the nodes and terminated at the coastline defined as the Mean High Water Mark (MHWM) tidal boundary. These polylines are attributed to convey information about the provenance of the source. It is envisaged that land borders will be topologically interoperable with the future national coastline dataset/s, currently being built through the ICSM coastline capture collaboration program. Topological interoperability will enable closure of the land mass polygon, permitting spatial analysis operations such as vector overlay, intersect, or raster map algebra. In addition to polylines, the product incorporates a number of well-known survey-monumented corners which have historical and cultural significance associated with the place name. This foundation dataset is constructed from the best-available data, as published by the relevant custodian in each state and territory jurisdiction. It should be noted that some custodians - in particular the Northern Territory and New South Wales - have opted out or chosen to rely on data from an abutting jurisdiction as an agreed portrayal of their border. Accuracy and precision of land borders as depicted by spatial objects (features) may vary according to custodian specifications, although there is topological coherence across all the objects within this integrated product.
The guaranteed minimum nominal scale for all use-cases, applying to the complete spatial coverage of this product, is 1:25 000. In some areas the accuracy is much better and may approach cadastral survey specification; however, this is an artefact of data assembly from disparate sources, rather than of the product design. As a principle, no data was generalised or spatially degraded in the process of constructing this product. Some use-cases for this product are: general digital and web map-making applications; a reference dataset to use for cartographic generalisation for smaller-scale map applications; constraining geometric objects for revision and updates to the Mesh Blocks, the building blocks for the larger regions of the Australian Statistical Geography Standard (ASGS) framework; rapid resolution of cross-border data issues to enable construction and visual display of a common operating picture, etc. This foundation dataset will be maintained at irregular intervals, for example if a state or territory jurisdiction decides to publish or republish its land borders. If there is a new version of this dataset, the past version will be archived and information about the changes will be made available in the change log.
The spatial data coverage is a consolidation of the best available records/data received from each of the 12 BIA Regional Offices. The data are no better than the original sources from which they were derived. Care was taken when consolidating these files. However, BWFM cannot accept any responsibility for errors, omissions, or positional accuracy in the original digital data. The information contained in these data is dynamic and is continually changing. Updates to these data will be made whenever such data are received from a Regional Office. The BWFM gives no guarantee, expressed, written, or implied, regarding the accuracy, reliability, or completeness of these data.Alaska:The state of Alaska and Alaska Fire Service (BLM) co-manage a process to aggregate authoritative land ownership, management, and jurisdictional boundary data, based on Master Title Plats. Data ProcessingTo compile this dataset, the authoritative land ownership records and the PAD-US data mentioned above were crosswalked into the Jurisdictional Unit Polygon schema and aggregated through a series of python scripts and FME models. Once aggregated, steps were taken to reduce overlaps within the data. All overlap areas larger than 300 acres were manually examined and removed with the assistance of fire management SMEs. Once overlaps were removed, Census Block Group geometry were crosswalked to the Jurisdictional Unit Polygon schema and appended in areas in which no jurisdictional boundaries were recorded within the authoritative land ownership records and the PAD-US data. Census Block Group geometries represent areas of unknown Landowner Kind/Category and Jurisdictional Kind/Category and were assigned LandownerKind and LandownerCategory values of "Private".Update FrequencyThe Authoritative land ownership records and PAD-US data used to compile this dataset are dynamic and are continually changing. 
Major updates to this dataset will be made once a year, and minor updates will be incorporated throughout the year as needed. New to the Latest Release (1/15/25)Now pulling from agency authoritative sources for BLM, NPS, USFS, and USFWS (instead of getting this data from PADUS).
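The attribute definitions above describe two simple conventions: JurisdictionalUnitID_sansUS is the Unit ID with a leading "US" removed, and private ownership is flagged by a null JurisdictionalUnitID and JurisdictionalKind together with "Private" landowner values. The following is a minimal Python sketch of both, assuming each polygon's attributes have been read into a plain dictionary keyed by field name. The helper names and the sample Unit ID are hypothetical illustrations and are not part of the dataset or of any NWCG tooling.

```python
from typing import Optional


def unit_id_sans_us(unit_id: Optional[str]) -> Optional[str]:
    """Return the Unit ID without a leading 'US'; nulls pass through unchanged."""
    if unit_id is None:
        return None
    return unit_id[2:] if unit_id.startswith("US") else unit_id


def is_private(record: dict) -> bool:
    """True when a record matches the private-ownership pattern described above."""
    return (
        record.get("JurisdictionalUnitID") is None
        and record.get("JurisdictionalKind") is None
        and record.get("LandownerKind") == "Private"
        and record.get("LandownerCategory") == "Private"
    )


# Hypothetical sample records, for illustration only.
federal = {
    "JurisdictionalUnitID": "USIDBOF",
    "JurisdictionalKind": "Federal",
    "LandownerKind": "Federal",
    "LandownerCategory": "USFS",
}
private = {
    "JurisdictionalUnitID": None,
    "JurisdictionalKind": None,
    "LandownerKind": "Private",
    "LandownerCategory": "Private",
}
```

Because Census Block Group fill polygons are also assigned LandownerKind and LandownerCategory values of "Private", a test like `is_private` cannot by itself separate them from other private ownerships.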
Field Name Changes
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is the data and code related to the paper "Enhancing Spatial Count Data Modeling: A new method for Poisson Means of Stratified Nonhomogeneity".
Abstract: Spatial count data is a prevalent data type in the natural and social sciences. Because such data present the complicated spatial autocorrelation and heterogeneity inherent in geographical analysis, current methods lack a theoretical approach to model and predict count data, especially with limited spatial samples. To address this gap, this study develops a new method named Poisson Means of Stratified Nonhomogeneity (PoiMSN). It theoretically considers both autocorrelation and heterogeneity and, without any covariate, incorporates local samples and out-stratum neighbors that traditional methods neglect, to accurately model and predict the latent process for Poisson-distributed data. PoiMSN was validated by simulation against Poisson geostatistics and traditional MSN. It demonstrated superior performance, achieving the lowest mean absolute error and root-mean-squared error, with at least a 5% improvement in accuracy for autocorrelated and stratified Poisson data. The application to hand, foot, and mouth disease data showed that PoiMSN could precisely map the disease risks with lower uncertainty. PoiMSN can accommodate autocorrelated and heterogeneous statistical populations and leverage extensive sample information, substantiating its theoretical and empirical superiority for spatially non-stationary count data.
https://spdx.org/licenses/CC0-1.0.html
This data package corresponds to a research paper by Giesbrecht et al. (2025) in the journal Ecosystems with the title "Mapping the spatial heterogeneity of watershed ecosystems and water quality in rainforest fjordlands". https://doi.org/10.1007/s10021-025-00964-x

The data package contains:
- A shapefile representing sampled watershed boundaries in .shp format ("G2025_Wts_boundaries.shp")
- Table of watershed characteristics in .csv format ("G2025_Wts_data.csv")
- Table of quality controlled water quality data in .csv format ("G2025_WQ_data.csv")
- A README file with variable definitions for water quality data ("G2025_WQ_README.csv")
- A README file with variable definitions for watershed characteristics ("G2025_Wts_README.csv")
Methods

In this study, we examined spatial controls on the quality of freshwater exported from diverse watersheds in fjordlands of a coastal temperate rainforest. Samples were collected about once per month for a year from the outlets of 56 watersheds spanning from high mountains with icefields to low islands with extensive wetlands. Watershed size ranged from < 1.5 km2 to 5,782 km2 (Homathko River), yet in the regional and global context, all are considered “small” coastal watersheds (< 10,000 km2 following Milliman and Syvitski, 1992). The study watersheds were spatially distributed along two fjordland transects on the south-central coast of British Columbia, Canada (51°57' N to 50°07' N and 128°09' W to 123°44' W).

Watershed characterization and classification

This study takes advantage of a previous watershed classification effort, which used four widely available (open access) datasets and 12 easily computed watershed characteristics to define 12 types of small coastal watersheds with cluster analysis (Giesbrecht and others, 2022). These clusters separated watersheds by characteristic water source (glacial, snowmelt, rain), topography (mountains, hills, lowland), climate, and geographic location within the NPCTR (north, central, south). For the present study, we assigned every sampled watershed to one of the 12 types of coastal watershed defined in Giesbrecht and others (2022). This assignment required a modelling step because 34 of our 56 sampled watersheds were smaller than the minimum size (20 km2) of well delineated watersheds (DW) used in the regional scale classification (Giesbrecht and others, 2022). We used a random forest (Breiman, 2001) classifier (randomForest package (version 4.6-14) in R (R Core Team, 2020)) to assign class membership to newly delineated (very small) watersheds. The predictor variables were the 12 watershed characteristics originally used to define the regional watershed types via cluster analysis (Giesbrecht and others, 2022).
The response variable was watershed type. The present analysis revised the previous regional watershed classification by better resolving the locations and extent of watersheds in the ~ 1 to 10 km2 size range, particularly those with very low relief and extensive wetland cover.

Stream chemistry data

Water samples were collected from the watershed outlets roughly once every month for a year (March 2018 to March 2019), for a total of 405 observation site-days after quality control. Most watersheds were sampled eight to ten times. Each transect was surveyed over two to three consecutive days in order to sample under relatively similar weather and flow conditions. The two transects were surveyed as close together in time as feasible, but were often more than a week apart, thus not always capturing the same weather system. From each water sample, we measured 22 aspects of riverine water quality, including DOC, alkalinity, cations, organic and inorganic nutrients, 𝛿18O-H2O, 𝛿2H-H2O, and handheld sensor (YSI ProDSS) readings of temperature, specific conductance, pH, and turbidity. Water samples and sensor readings were taken from the main flow, avoiding eddies, shallow water, loose substrates, or woody debris. Samples for dissolved constituents were field-filtered with a 0.45 µm Millipore® Millex-HP hydrophilic polyethersulfone (PES) syringe filter. All samples were kept cool and dark during the field work. Samples were then preserved by freezing or acidification, as appropriate, within 24 hours of field collection. The field and laboratory procedures for this study follow those of St. Pierre and others (2021) and Tank and others (2020). Laboratory results below the detection limit were replaced by ½ the detection limit, following common convention (e.g., EPA, 2000).
In addition to direct measurements, we calculated several variables from the analytical laboratory results: the total concentration of dissolved inorganic nitrogen (DIN), dissolved organic nitrogen (DON), particulate nitrogen (PN), dissolved organic phosphorus (DOP), and particulate phosphorus (PP). Finally, we computed the mass ratio of sodium to calcium ions (Na:Ca) as a simple index of cation origin. High Na:Ca ratios can be caused by high inputs of cyclic marine salts (via precipitation) relative to cation inputs from rock weathering (Gibbs, 1970; Schlesinger, 1997) and by high inputs from silicate weathering relative to carbonate weathering (Gaillardet and others, 1999; Tank and others, 2012a).

Several quality control (QC) and data cleaning procedures were implemented prior to the analysis, using a combination of visual inspection and data-based criteria. For visual inspection, tables and plots of the water quality measurements were examined while cross-referencing metadata from field notes and laboratory notes. We omitted any suspiciously high or low values that could be readily explained by a procedural anomaly such as a cracked sample vial. For data-based QC, outlier values of sensitive species (DIN species, TN, and SRP) were identified (mean ± 4SD) and omitted unless supported by independent measurements (e.g., high DIN supported by high TDN and high TN). Additional quality control procedures were applied to calculated values to avoid use of illogical results. For example, where DIN > TDN, the resulting negative DON value was replaced with ½ the detection limit of TDN to indicate a small non-zero quantity. We also omitted samples where specific conductance exceeded 200 µS cm-1, a value that is suspiciously high for the geological conditions we sampled. These samples also had high concentrations of Na+, K+, Cl-/SO42-, or Sr2+ (where available), likely due to tidal mixing of brackish water. We identified seven such cases, representing five site-dates.
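The data-based rules described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions and not the authors' code: it assumes DON is computed as TDN minus DIN (implied by the statement that DIN > TDN yields a negative DON), that the 4SD screen uses the sample standard deviation, and that all concentrations share the same mass units. All function and parameter names are hypothetical.

```python
from statistics import mean, stdev
from typing import List


def censor_below_dl(value: float, detection_limit: float) -> float:
    """Replace a result below the detection limit with half the detection limit."""
    return detection_limit / 2 if value < detection_limit else value


def don_from_tdn(tdn: float, din: float, tdn_detection_limit: float) -> float:
    """DON as TDN - DIN (assumed); a negative result becomes half the TDN detection limit."""
    don = tdn - din
    return tdn_detection_limit / 2 if don < 0 else don


def na_ca_mass_ratio(na: float, ca: float) -> float:
    """Mass ratio of sodium to calcium, both in the same mass concentration units."""
    return na / ca


def flag_outliers(values: List[float], k: float = 4.0) -> List[bool]:
    """Flag values farther than k sample standard deviations from the mean."""
    m, s = mean(values), stdev(values)
    return [abs(v - m) > k * s for v in values]


def exceeds_sc_limit(sc_us_cm: float, limit_us_cm: float = 200.0) -> bool:
    """True when specific conductance exceeds the 200 uS/cm screening value."""
    return sc_us_cm > limit_us_cm
```

A flag from `flag_outliers` marks a candidate only. In the procedure described above, a flagged value was still retained when independent measurements supported it, and that judgment step is not automated here.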
Please refer to the corresponding research paper for a more complete description of methods: https://doi.org/10.1007/s10021-025-00964-x
NSW Administrative Boundaries Theme
Please Note: WGS 84 service aligned to GDA94
This dataset has spatial reference [WGS 84 ≈ GDA94] which may result in misalignments when viewed in GDA2020 environments. A similar service with a ‘multiCRS’ suffix is available which can support GDA2020, GDA94 and WGS 84 ≈ GDA2020 environments. In due course, and allowing time for user feedback and testing, it is intended that the original service name will adopt the new 'multiCRS' functionality.

Metadata Portal

Metadata Information

Content Title: NSW Administrative Boundaries Theme
Content Type: Hosted Feature Layer
Description: The Administrative Boundaries theme from NSW Foundation Spatial Data Framework (FSDF) is a collection of legislative, regulatory, political, maritime and general administrative boundaries sourced from local and state boundary datasets.
They include:
- Parish
- County
- Suburb
- Local Government Area
- State Electoral District
- Federal Electoral Division
- Mines Subsidence District
- NSW Parks and Wildlife Service Reserve
- State Forest
- State Border
- Domestic Water Front Precinct
The Administrative Boundaries theme is used to show administrative areas that represent:
- Voting Districts
- Redistributions
- Zoning
- Socio-Economic analysis
- Regional Planning
- Service Distribution
- Local and State Government Boundaries
In addition, Administrative Boundaries can also be used for analysis and to look at trends over time. Administrative boundary data in combination with geo-coded address data, demographic information and agency specific business information underpins the ability to perform high quality spatial analysis.
The use of this data in combination with other data includes:
- Evidence-based development and assessment of government policy
- Providing the ability to undertake spatial accounting
- Regional analysis for government, health, education, business and a range of other purposes
- Support for emergency management
- Market catchment analysis, micromarketing, customer analysis and market segmentation
Update frequencies vary for each dataset. Individual current status can be found under each spatial data profile.
Initial Publication Date: 04/02/2020
Data Currency: 01/01/3000
Data Update Frequency: Other
Content Source: Data provider files
File Type: ESRI File Geodatabase (*.gdb)
Attribution: © State of New South Wales (Spatial Services, a business unit of the Department of Customer Service NSW). For current information go to spatial.nsw.gov.au
Data Theme, Classification or Relationship to other Datasets: NSW Administrative Boundaries Theme of the Foundation Spatial Data Framework (FSDF)
Accuracy: The dataset maintains a positional relationship to, and alignment with, the Lot and Property digital datasets. This dataset was captured by digitising the best available cadastral mapping at a variety of scales and accuracies, ranging from 1:500 to 1:250 000 according to the National Mapping Council of Australia, Standards of Map Accuracy (1975). Therefore, the position of the feature instance will be within 0.5mm at map scale for 90% of the well-defined points. That is, 1:500 = 0.25m, 1:2000 = 1m, 1:4000 = 2m, 1:25000 = 12.5m, 1:50000 = 25m and 1:100000 = 50m. A program to upgrade the spatial location and accuracy of data is ongoing.
Spatial Reference System (dataset): GDA94
Spatial Reference System (web service): EPSG:3857
WGS84 Equivalent To: GDA94
Spatial Extent: Full State
Content Lineage: For additional information, please contact us via the Spatial Services Customer Hub
Data Classification: Unclassified
Data Access Policy: Open
Data Quality: For additional information, please contact us via the Spatial Services Customer Hub
Terms and Conditions: Creative Commons
Standard and Specification: Open Geospatial Consortium (OGC) implemented and compatible for consumption by common GIS platforms. Available as either cache or non-cache, depending on client use or requirement.
Information about the Feature Class and Domain Name descriptions for the NSW Administrative Boundaries Theme can be found in the NSW Cadastral Data Dictionary.
Some Spatial Services datasets are designed to work together, for example NSW Address Point and NSW Address String (table), NSW Property (Polygon) and NSW Property Lot (table), and NSW Lot (polygons). To do this you need to add a Spatial Join.
A Spatial Join is a GIS operation that affixes data from one feature layer’s attribute table to another from a spatial perspective.
To see how NSW Address, Property, Lot Geometry data and tables can be spatially joined, download the Data Model Document.
Data Custodian: DCS Spatial Services, 346 Panorama Ave, Bathurst NSW 2795
Point of Contact: Please contact us via the Spatial Services Customer Hub
Data Aggregator: DCS Spatial Services, 346 Panorama Ave, Bathurst NSW 2795
Data Distributor: DCS Spatial Services, 346 Panorama Ave, Bathurst NSW 2795
Additional Supporting Information: Data Dictionaries, Data Model Document.
TRIM Number
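The accuracy statement in the metadata above converts a 0.5 mm tolerance at map scale into a ground distance for each capture scale. The arithmetic is tolerance (mm) × scale denominator ÷ 1000. The short Python sketch below reproduces the listed values; the function name is a hypothetical helper and not part of any Spatial Services tooling.

```python
def ground_tolerance_m(scale_denominator: int, map_tolerance_mm: float = 0.5) -> float:
    """Ground distance in metres that corresponds to a tolerance in mm at map scale."""
    # Multiply first, then convert millimetres to metres.
    return map_tolerance_mm * scale_denominator / 1000.0


# Scales listed in the accuracy statement.
for denominator in (500, 2000, 4000, 25000, 50000, 100000):
    print(f"1:{denominator} -> {ground_tolerance_m(denominator)} m")
```

By the same formula, the coarsest capture scale mentioned, 1:250 000, would correspond to 125 m, although the metadata does not list that figure explicitly.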
http://inspire.ec.europa.eu/metadata-codelist/LimitationsOnPublicAccess/noLimitations
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset contains information on European groundwater bodies, monitoring sites, river basin districts, river basin district sub-units and surface water bodies reported to the European Environment Agency between 29-11-2001 and 01-03-2021.
The information was reported to the European Environment Agency under the State of Environment reporting obligations. For the EU27 countries, Iceland, Norway and the United Kingdom, the EIONET spatial data was consolidated with the spatial data reported under the Water Framework Directive reporting obligations. For these countries, the reference spatial data set is the "WISE WFD Reference Spatial Datasets reported under Water Framework Directive".
Relevant concepts:
Groundwater body: 'Body of groundwater' means a distinct volume of groundwater within an aquifer or aquifers.
Groundwater: All water which is below the surface of the ground in the saturation zone and in direct contact with the ground or subsoil.
Aquifer: Subsurface layer or layers of rock or other geological strata of sufficient porosity and permeability to allow either a significant flow of groundwater or the abstraction of significant quantities of groundwater.
Surface water body: 'Body of surface water' means a discrete and significant element of surface water such as a lake, a reservoir, a stream, river or canal, part of a stream, river or canal, a transitional water or a stretch of coastal water.
Surface water: Inland waters, except groundwater; transitional waters and coastal waters, except in respect of chemical status for which it shall also include territorial waters.
Inland water: All standing or flowing water on the surface of the land, and all groundwater on the landward side of the baseline from which the breadth of territorial waters is measured.
River: Body of inland water flowing for the most part on the surface of the land but which may flow underground for part of its course.
Lake: Body of standing inland surface water.
River basin district: The area of land and sea, made up of one or more neighbouring river basins together with their associated groundwaters and coastal waters, which is the main unit for management of river basins.
River basin: The area of land from which all surface run-off flows through a sequence of streams, rivers and, possibly, lakes into the sea at a single river mouth, estuary or delta.
Sub-basin: The area of land from which all surface run-off flows through a series of streams, rivers and, possibly, lakes to a particular point in a water course (normally a lake or a river confluence).
Sub-unit [Operational definition. Not in the WFD]: Reporting unit. River basin districts larger than 50000 square kilometres should be divided into comparable sub-units with an area between 5000 and 50000 square kilometres. The sub-units should be created using river basins (if more than one river basin exists in the RBD), sets of contiguous river basins, or sub-basins, for example. If the RBD area is less than 50000 square kilometres, the RBD itself should be used as a sub-unit.
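The sub-unit sizing guidance above can be expressed as a small consistency check. The Python sketch below is only an illustration of the stated thresholds, not an official validation rule. The function names are hypothetical, and treating an RBD of exactly 50000 square kilometres as not needing division is an assumption, since the text does not cover that boundary case.

```python
from typing import List

LOWER_KM2 = 5_000.0   # minimum sub-unit area from the guidance
UPPER_KM2 = 50_000.0  # maximum sub-unit area, and the RBD division threshold


def needs_division(rbd_area_km2: float) -> bool:
    """True when the RBD is larger than 50000 km2 and should be split into sub-units."""
    return rbd_area_km2 > UPPER_KM2


def subunit_issues(rbd_area_km2: float, subunit_areas_km2: List[float]) -> List[str]:
    """Return human-readable deviations from the sub-unit sizing guidance."""
    issues = []
    if not needs_division(rbd_area_km2):
        # A small RBD should be reported as its own single sub-unit.
        if len(subunit_areas_km2) != 1:
            issues.append("RBD not above the division threshold should be its own single sub-unit")
        return issues
    for index, area in enumerate(subunit_areas_km2):
        if not (LOWER_KM2 <= area <= UPPER_KM2):
            issues.append(f"sub-unit {index} has area {area} km2 outside the 5000 to 50000 km2 range")
    return issues
```

The guidance uses "should", so deviations reported by such a check are points to review rather than errors in the data.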
This dataset, termed "GAGES II", an acronym for Geospatial Attributes of Gages for Evaluating Streamflow, version II, provides geospatial data and classifications for 9,322 stream gages maintained by the U.S. Geological Survey (USGS). It is an update to the original GAGES, which was published as a Data Paper on the journal Ecology's website (Falcone and others, 2010b) in 2010. The GAGES II dataset consists of gages which have had either 20+ complete years (not necessarily continuous) of discharge record since 1950, or are currently active, as of water year 2009, and whose watersheds lie within the United States, including Alaska, Hawaii, and Puerto Rico. Reference gages were identified based on indicators that they were the least-disturbed watersheds within the framework of broad regions, based on 12 major ecoregions across the United States. Of the 9,322 total sites, 2,057 are classified as reference, and 7,265 as non-reference. Of the 2,057 reference sites, 1,633 have (through 2009) 20+ years of record since 1950. Some sites have very long flow records: a number of gages have been in continuous service since 1900 (at least), and have 110 years of complete record (1900-2009) to date. The geospatial data include several hundred watershed characteristics compiled from national data sources, including environmental features (e.g. climate – including historical precipitation, geology, soils, topography) and anthropogenic influences (e.g. land use, road density, presence of dams, canals, or power plants). The dataset also includes comments from local USGS Water Science Centers, based on Annual Data Reports, pertinent to hydrologic modifications and influences. The data posted also include watershed boundaries in GIS format. This overall dataset is different in nature to the USGS Hydro-Climatic Data Network (HCDN; Slack and Landwehr 1992), whose data evaluation ended with water year 1988. 
The HCDN identifies stream gages which at some point in their history had periods which represented natural flow, and the years in which those natural flows occurred were identified (i.e., not all HCDN sites were in reference condition even in 1988; for example, 02353500). The HCDN remains a valuable indication of historic natural streamflow data. However, the goal of this dataset was to identify watersheds which currently have near-natural flow conditions, and the 2,057 reference sites identified here were derived independently of the HCDN. A subset, however, noted in the BasinID worksheet as “HCDN-2009”, has been identified as an updated list of 743 sites for potential hydro-climatic study. The HCDN-2009 sites fulfill all of the following criteria: (a) have 20 years of complete and continuous flow record in the last 20 years (water years 1990-2009), and were thus also currently active as of 2009, (b) are identified as being in current reference condition according to the GAGES-II classification, (c) have less than 5 percent imperviousness as measured from the NLCD 2006, and (d) were not eliminated by a review from participating state Water Science Center evaluators.

The data posted here consist of the following items:
- This point shapefile, with summary data for the 9,322 gages.
- A zip file containing basin characteristics, variable definitions, and a more detailed report.
- A zip file containing shapefiles of basin boundaries, organized by classification and aggregated ecoregion.
- A zip file containing mainstem stream lines (Arc line coverages) for each gage.
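The four HCDN-2009 criteria (a) to (d) described above act as a conjunction: a gage qualifies only if every one holds. The Python sketch below shows that logic on a hypothetical record type. The field names and sample gage identifiers are illustrative assumptions and do not correspond to actual column names or sites in the GAGES II files.

```python
from dataclasses import dataclass


@dataclass
class Gage:
    gage_id: str
    complete_years_wy1990_2009: int   # complete, continuous water years in 1990-2009
    is_reference: bool                # GAGES-II reference classification
    impervious_pct_nlcd2006: float    # percent imperviousness from NLCD 2006
    eliminated_by_state_review: bool  # removed by Water Science Center evaluators


def is_hcdn_2009(gage: Gage) -> bool:
    """Apply criteria (a) through (d); all four must be satisfied."""
    return (
        gage.complete_years_wy1990_2009 >= 20      # (a)
        and gage.is_reference                      # (b)
        and gage.impervious_pct_nlcd2006 < 5.0     # (c)
        and not gage.eliminated_by_state_review    # (d)
    )


# Hypothetical gages, for illustration only.
gages = [
    Gage("A", 20, True, 1.2, False),   # meets all criteria
    Gage("B", 18, True, 0.4, False),   # fails (a)
    Gage("C", 20, False, 0.9, False),  # fails (b)
    Gage("D", 20, True, 6.5, False),   # fails (c)
    Gage("E", 20, True, 2.0, True),    # fails (d)
]
selected = [g.gage_id for g in gages if is_hcdn_2009(g)]
```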