The product is a new design of the State Map at a scale of 1:5,000 (SM 5) in vector form, whose advantages are recency and colour processing. The map contains planimetry based on the cadastral map, altimetry adopted from the altimetry part of ZABAGED, and map lettering based on the Geonames database of geographic names and on feature-type abbreviations derived from attributes of selected ZABAGED features. This new design of the SM 5 is regenerated once a year for the part of the Czech territory where the vector form of the cadastral map is available. Therefore, some export units (map sheets of SM 5) do not have full coverage (the price of such an export unit is then proportionally reduced).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
LScDC Word-Category RIG Matrix
April 2020 by Neslihan Suzen, PhD student at the University of Leicester (ns433@leicester.ac.uk / suzenneslihan@hotmail.com)
Supervised by Prof Alexander Gorban and Dr Evgeny Mirkes
Getting Started
This file describes the Word-Category RIG Matrix for the Leicester Scientific Corpus (LSC) [1] and the procedure used to build it, and introduces the Leicester Scientific Thesaurus (LScT) and its construction process. The Word-Category RIG Matrix is a 103,998 by 252 matrix, where rows correspond to words of the Leicester Scientific Dictionary-Core (LScDC) [2] and columns correspond to 252 Web of Science (WoS) categories [3, 4, 5]. Each entry in the matrix corresponds to a pair (category, word). Its value shows the Relative Information Gain (RIG) on the belonging of a text from the LSC to the category from observing the word in this text. The CSV file of the Word-Category RIG Matrix in the published archive includes two additional columns: the sum of RIGs in categories and the maximum of RIGs over categories (the last two columns of the matrix). So, the file ‘Word-Category RIG Matrix.csv’ contains a total of 254 columns.
This matrix was created for use in future research on quantifying meaning in scientific texts, under the assumption that words have scientifically specific meanings in subject categories and that the meaning can be estimated by information gains from word to categories.
LScT (Leicester Scientific Thesaurus) is a scientific thesaurus of English. The thesaurus includes a list of 5,000 words from the LScDC. We order the words of the LScDC by the sum of their RIGs in categories. That is, words are arranged by their informativeness in the scientific corpus LSC. The meaningfulness of words is therefore evaluated by the words’ average informativeness in the categories. We have decided to include the most informative 5,000 words in the scientific thesaurus.
Words as a Vector of Frequencies in WoS Categories
Each word of the LScDC is represented as a vector of frequencies in WoS categories. Given the collection of the LSC texts, each entry of the vector is the number of texts containing the word in the corresponding category.
It is noteworthy that texts in a corpus do not necessarily belong to a single category, as they are likely to correspond to multidisciplinary studies, specifically in a corpus of scientific texts. In other words, categories may not be exclusive. There are 252 WoS categories and a text can be assigned to at least 1 and at most 6 categories in the LSC. Using the binary calculation of frequencies, we introduce the presence of a word in a category. We create a vector of frequencies for each word, where dimensions are categories in the corpus.
The collection of vectors, with all words and categories in the entire corpus, can be shown in a table, where each entry corresponds to a pair (word, category). This table is built for the LScDC with 252 WoS categories and presented in the published archive with this file. The value of each entry in the table shows how many times a word of the LScDC appears in a WoS category. The occurrence of a word in a category is determined by counting the number of the LSC texts containing the word in a category.
Words as a Vector of Relative Information Gains Extracted for Categories
In this section, we introduce our approach to representing a word as a vector of relative information gains for categories, under the assumption that the meaning of a word can be quantified by the information it provides about categories.
For each category, a function is defined on texts that takes the value 1 if the text belongs to the category, and 0 otherwise. For each word, a function is defined on texts that takes the value 1 if the word belongs to the text, and 0 otherwise. Consider the LSC as a probabilistic sample space (the space of equally probable elementary outcomes). For the Boolean random variables, the joint probability distribution, the entropy and information gains are defined.
The information gain about the category from the word is the amount of information on the belonging of a text from the LSC to the category from observing the word in the text [6]. We used the Relative Information Gain (RIG), which provides a normalised measure of the Information Gain. This makes it possible to compare information gains for different categories. The calculations of entropy, Information Gains and Relative Information Gains can be found in the README file in the published archive.
Given a word, we created a vector where each component of the vector corresponds to a category. Therefore, each word is represented as a vector of relative information gains, and the dimension of the vector for each word is the number of categories. The set of vectors is used to form the Word-Category RIG Matrix, in which each column corresponds to a category, each row corresponds to a word and each component is the relative information gain from the word to the category. In the Word-Category RIG Matrix, a row vector represents the corresponding word as a vector of RIGs in categories. We note that in the matrix, a column vector represents RIGs of all words in an individual category. If we choose an arbitrary category, words can be ordered by their RIGs from the most informative to the least informative for the category. As well as ordering words in each category, words can be ordered by two criteria: sum and maximum of RIGs in categories. The top n words in this list can be considered as the most informative words in the scientific texts. For a given word, the sum and maximum of RIGs are calculated from the Word-Category RIG Matrix.
RIGs for each word of the LScDC in 252 categories are calculated and vectors of words are formed. We then form the Word-Category RIG Matrix for the LSC.
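The RIG computation described above can be sketched from document counts alone. This is a minimal illustration, assuming the standard definitions IG = H(C) - H(C|W) and RIG = IG / H(C) for Boolean variables; the authoritative formulas are in the README of the published archive, and the function and parameter names here are hypothetical.

```python
import math


def entropy(p):
    """Shannon entropy (bits) of a Boolean variable with P(value = 1) = p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def relative_information_gain(n_texts, n_cat, n_word, n_both):
    """RIG about a category from a word, computed from text counts.

    n_texts: number of texts in the corpus
    n_cat:   number of texts assigned to the category
    n_word:  number of texts containing the word
    n_both:  number of texts in the category that contain the word
    Assumes RIG = (H(C) - H(C|W)) / H(C); see the README for the authors' version.
    """
    h_cat = entropy(n_cat / n_texts)
    if h_cat == 0.0:
        return 0.0  # the category carries no uncertainty to reduce
    n_no_word = n_texts - n_word
    # Conditional entropy H(C | W): weighted over "word present" and "word absent".
    h_present = entropy(n_both / n_word) if n_word else 0.0
    h_absent = entropy((n_cat - n_both) / n_no_word) if n_no_word else 0.0
    h_cond = (n_word / n_texts) * h_present + (n_no_word / n_texts) * h_absent
    return (h_cat - h_cond) / h_cat


def row_summary(rigs):
    """Sum (S) and maximum (M) of one word's RIGs over categories."""
    return sum(rigs), max(rigs)
```

A word that appears in exactly the texts of a category gives a RIG of 1 for that category, while a word distributed independently of the category gives a RIG of 0.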
For each word, the sum (S) and maximum (M) of RIGs in categories are calculated and added at the end of the matrix (last two columns of the matrix). The Word-Category RIG Matrix for the LScDC with 252 categories, the sum of RIGs in categories and the maximum of RIGs over categories can be found in the database.
Leicester Scientific Thesaurus (LScT)
The Leicester Scientific Thesaurus (LScT) is a list of 5,000 words from the LScDC [2]. Words of the LScDC are sorted in descending order by the sum (S) of RIGs in categories and the top 5,000 words are selected for inclusion in the LScT. We consider these 5,000 words to be the most meaningful words in the scientific corpus. In other words, the meaningfulness of words is evaluated by the words’ average informativeness in the categories, and the list of these words is considered a ‘thesaurus’ for science. The LScT with the values of the sum can be found as a CSV file in the published archive.
The published archive contains the following files:
1) Word_Category_RIG_Matrix.csv: A 103,998 by 254 matrix where columns are 252 WoS categories, the sum (S) and the maximum (M) of RIGs in categories (last two columns of the matrix), and rows are words of the LScDC. Each entry in the first 252 columns is the RIG from the word to the category. Words are ordered as in the LScDC.
2) Word_Category_Frequency_Matrix.csv: A 103,998 by 252 matrix where columns are 252 WoS categories and rows are words of the LScDC. Each entry of the matrix is the number of texts containing the word in the corresponding category. Words are ordered as in the LScDC.
3) LScT.csv: List of words of the LScT with sum (S) values.
4) Text_No_in_Cat.csv: The number of texts in categories.
5) Categories_in_Documents.csv: List of WoS categories for each document of the LSC.
6) README.txt: Description of the Word-Category RIG Matrix, the Word-Category Frequency Matrix and the LScT, and the procedures used to form them.
7) README.pdf (same as 6 in PDF format)
References
[1] Suzen, Neslihan (2019): LSC (Leicester Scientific Corpus). figshare. Dataset. https://doi.org/10.25392/leicester.data.9449639.v2
[2] Suzen, Neslihan (2019): LScDC (Leicester Scientific Dictionary-Core). figshare. Dataset. https://doi.org/10.25392/leicester.data.9896579.v3
[3] Web of Science. (15 July). Available: https://apps.webofknowledge.com/
[4] WoS Subject Categories. Available: https://images.webofknowledge.com/WOKRS56B5/help/WOS/hp_subject_category_terms_tasca.html
[5] Suzen, N., Mirkes, E. M., & Gorban, A. N. (2019). LScDC-new large scientific dictionary. arXiv preprint arXiv:1912.06858.
[6] Shannon, C. E. (1948). A mathematical theory of communication. Bell system technical journal, 27(3), 379-423.
Spatial analysis and statistical summaries of the Protected Areas Database of the United States (PAD-US) provide land managers and decision makers with a general assessment of management intent for biodiversity protection, natural resource management, and recreation access across the nation. The PAD-US 3.0 Combined Fee, Designation, Easement feature class (with Military Lands and Tribal Areas from the Proclamation and Other Planning Boundaries feature class) was modified to remove overlaps, both to avoid overestimation in protected area statistics and to support user needs. A Python scripted process ("PADUS3_0_CreateVectorAnalysisFileScript.zip") associated with this data release prioritized overlapping designations (e.g. Wilderness within a National Forest) based upon their relative biodiversity conservation status (e.g. GAP Status Code 1 over 2), public access values (in the order of Closed, Restricted, Open, Unknown), and geodatabase load order (records are deliberately organized in the PAD-US full inventory with fee owned lands loaded before overlapping management designations, and easements). The Vector Analysis File ("PADUS3_0VectorAnalysisFile_ClipCensus.zip") associated item of PAD-US 3.0 Spatial Analysis and Statistics ( https://doi.org/10.5066/P9KLBB5D ) was clipped to the Census state boundary file to define the extent and serve as a common denominator for statistical summaries.
Boundaries of interest to stakeholders (State, Department of the Interior Region, Congressional District, County, EcoRegions I-IV, Urban Areas, Landscape Conservation Cooperative) were incorporated into separate geodatabase feature classes to support various data summaries ("PADUS3_0VectorAnalysisFileOtherExtents_Clip_Census.zip"). Comma-separated Value (CSV) tables ("PADUS3_0SummaryStatistics_TabularData_CSV.zip") summarizing "PADUS3_0VectorAnalysisFileOtherExtents_Clip_Census.zip" are provided as an alternative format and enable users to explore and download summary statistics of interest (Comma-separated Table [CSV], Microsoft Excel Workbook [.XLSX], Portable Document Format [.PDF] Report) from the PAD-US Lands and Inland Water Statistics Dashboard ( https://www.usgs.gov/programs/gap-analysis-project/science/pad-us-statistics ). In addition, a "flattened" version of the PAD-US 3.0 combined file without other extent boundaries ("PADUS3_0VectorAnalysisFile_ClipCensus.zip") allows for other applications that require a representation of overall protection status without overlapping designation boundaries. The "PADUS3_0VectorAnalysis_State_Clip_CENSUS2020" feature class ("PADUS3_0VectorAnalysisFileOtherExtents_Clip_Census.gdb") is the source of the PAD-US 3.0 raster files (associated item of PAD-US 3.0 Spatial Analysis and Statistics, https://doi.org/10.5066/P9KLBB5D ). Note that the PAD-US inventory is now considered functionally complete, with the vast majority of land protection types represented in some manner, while work continues to maintain updates and improve data quality (see inventory completeness estimates at: http://www.protectedlands.net/data-stewards/ ). In addition, changes in protected area status between versions of the PAD-US may be attributed to improving the completeness and accuracy of the spatial data more than actual management actions or new acquisitions. USGS provides no legal warranty for the use of this data.
While PAD-US is the official aggregation of protected areas ( https://www.fgdc.gov/ngda-reports/NGDA_Datasets.html ), agencies are the best source of their lands data.
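The prioritization of overlapping designations described in this entry (GAP Status Code, then public access, then geodatabase load order) can be illustrated with a sort key. This is only a sketch of the ordering idea, not the released script: the record layout, the field names `gap_status`, `public_access` and `load_order`, and the reading that a lower GAP code and an earlier access value win are assumptions made for the example.

```python
# Public access values in the priority order given in the description.
ACCESS_RANK = {"Closed": 0, "Restricted": 1, "Open": 2, "Unknown": 3}


def priority_key(record):
    """Smaller tuples win: GAP status first, then access, then load order.

    `record` is a hypothetical dict; the real PAD-US attribute names differ.
    """
    return (
        record["gap_status"],                  # e.g. GAP Status Code 1 over 2
        ACCESS_RANK[record["public_access"]],  # Closed, Restricted, Open, Unknown
        record["load_order"],                  # earlier-loaded records first
    )


def pick_winner(overlapping_records):
    """Return the record that keeps an overlapped area under the assumed ordering."""
    return min(overlapping_records, key=priority_key)
```

For a Wilderness area (GAP Status Code 1) inside a National Forest with a higher GAP code, `pick_winner` returns the Wilderness record, matching the example given in the description.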
https://creativecommons.org/publicdomain/zero/1.0/
Microsoft released a U.S.-wide vector building dataset in 2018. Although the vector building layers provide relatively accurate geometries, their use in large-extent geospatial analysis comes at a high computational cost. We used High-Performance Computing (HPC) to develop an algorithm that calculates six summary values for each cell in a raster representation of each U.S. state, excluding Alaska and Hawaii: (1) total footprint coverage, (2) number of unique buildings intersecting each cell, (3) number of building centroids falling inside each cell, and the (4) average, (5) smallest, and (6) largest area of buildings that intersect each cell. These values are represented as raster layers with 30 m cell size covering the 48 conterminous states. We also identify errors in the original building dataset. We evaluate precision and recall in the data for three large U.S. urban areas. Precision is high and comparable to results reported by Microsoft, while recall is high for buildings with footprints larger than 200 m² but lower for progressively smaller buildings.
Building footprints are a critical environmental descriptor. Microsoft produced a U.S.-wide vector building dataset in 2018 [1] that was generated from aerial images available to Bing Maps using deep learning methods for object classification [2]. The main goal of this product has been to increase the coverage of building footprints available for OpenStreetMap. Microsoft identified building footprints in two phases: first, using semantic segmentation with Deep Neural Networks to identify building pixels in aerial imagery, and second, converting building pixel blobs into polygons. The final dataset includes 125,192,184 building footprint polygon geometries in GeoJSON vector format, covering all 50 U.S. states, with data for each state distributed separately. These data have 99.3% precision and 93.5% pixel recall accuracy [2]. The temporal resolution of the data (i.e., the years of the aerial imagery used to derive the data) is not provided by Microsoft in the metadata.
Using vector layers for large-extent (i.e., national or state-level) spatial analysis and modelling (e.g., mapping the Wildland-Urban Interface, flood and coastal hazards, or large-extent urban typology modelling) is challenging in practice. Although vector data provide accurate geometries, incorporating them in large-extent spatial analysis comes at a high computational cost. We used High Performance Computing (HPC) to develop an algorithm that calculates six summary statistics (described below) for buildings at 30-m cell size in the 48 conterminous U.S. states, to better support national-scale and multi-state modelling that requires building footprint data. To develop these six derived products from the Microsoft buildings dataset, we created an algorithm that took each building, built a small meshgrid (a 2D array) for the bounding box of the building, and calculated unique values for each cell of the meshgrid. This grid structure is aligned with National Land Cover Database (NLCD) products (projected using the Albers Equal Area Conic system), enabling researchers to combine or compare our products with standard national-scale datasets such as land cover, tree canopy cover, and urban imperviousness [3].
Locations, shapes, and distribution patterns of structures in urban and rural areas are the subject of many studies. Buildings represent the density of built-up areas as an indicator of urban morphology or spatial structures of cities and metropolitan areas [4,5]. In local studies, the use of vector data types is easier [6,7]. However, in regional and national studies a raster dataset would be preferable. For example, in measuring the spatial structure of metropolitan areas, a rasterized building layer would be more useful than the original vector datasets [8].
Our output raster products are: (1) total building footprint coverage per cell (m² of building footprint per 900 m² cell); (2) number of buildings that intersect each cell; (3) number of building centroids falling within each cell; (4) area of the largest building intersecting each cell (m²); (5) area of the smallest building intersecting each cell (m²); and (6) average area of all buildings intersecting each cell (m²). The last three area metrics include building area that falls outside the cell but where part of the building intersects the cell (Fig. 1). These values can be used to describe the intensity and typology of the built environment.
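The six per-cell products can be illustrated with a small sketch that walks the cells spanned by each building's bounding box, in the spirit of the meshgrid approach described above. It is not the authors' HPC implementation: buildings are simplified to axis-aligned rectangles with positive area in projected metres (real footprints are polygons), and all names are hypothetical.

```python
import math
from collections import defaultdict

CELL = 30.0  # cell size in metres, as in the published rasters


def summarize(buildings, cell=CELL):
    """Per-cell summaries for rectangles given as (xmin, ymin, xmax, ymax).

    Returns {(col, row): {...six values...}}. Overlapping rectangles are
    counted independently, so coverage is a plain sum of intersections.
    """
    cells = defaultdict(lambda: {"coverage": 0.0, "count": 0, "centroids": 0, "areas": []})
    for xmin, ymin, xmax, ymax in buildings:
        area = (xmax - xmin) * (ymax - ymin)  # full building area
        # Visit every cell touched by the building's bounding box.
        for col in range(math.floor(xmin / cell), math.floor(xmax / cell) + 1):
            for row in range(math.floor(ymin / cell), math.floor(ymax / cell) + 1):
                w = min(xmax, (col + 1) * cell) - max(xmin, col * cell)
                h = min(ymax, (row + 1) * cell) - max(ymin, row * cell)
                if w <= 0 or h <= 0:
                    continue  # building only touches the cell edge
                entry = cells[(col, row)]
                entry["coverage"] += w * h   # (1) footprint inside this cell
                entry["count"] += 1          # (2) buildings intersecting the cell
                entry["areas"].append(area)  # feeds (4), (5) and (6)
        cx, cy = (xmin + xmax) / 2, (ymin + ymax) / 2
        cells[(math.floor(cx / cell), math.floor(cy / cell))]["centroids"] += 1  # (3)
    return {
        key: {
            "coverage": e["coverage"],
            "count": e["count"],
            "centroids": e["centroids"],
            "largest": max(e["areas"]),
            "smallest": min(e["areas"]),
            "average": sum(e["areas"]) / len(e["areas"]),
        }
        for key, e in cells.items()
    }
```

Note that the largest, smallest and average values use the full building area, including the part outside the cell, which matches the description of the last three metrics.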
Our software is available through U.S. Geological Survey code r...
This vector data set represents the delineated turf evapotranspiration (ET) unit polygon of the golf course near Hawthorne, Nevada.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset includes Mapillary street view images and corresponding location data from Africa. The street view images are saved in JPG format, with each image assigned a unique ID number, totaling 200,000 images. The location data is represented as point vector data in Esri Shapefile format, where each point includes a unique street view ID, longitude, latitude, and road surface type, for a total of 200,000 points. All data are referenced to the World Geodetic System (WGS) 84 and projected using the pseudo-Mercator coordinate system (EPSG:3857).
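Because each point carries longitude and latitude attributes while the layer uses pseudo-Mercator, it can help to see how the two relate. The sketch below applies the standard spherical Web Mercator forward formulas with the WGS 84 semi-major axis; the function name is hypothetical, and whether the attribute values in this dataset are stored in degrees is an assumption.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS 84 semi-major axis, the sphere radius used by EPSG:3857


def lonlat_to_web_mercator(lon_deg, lat_deg):
    """Convert WGS 84 longitude/latitude (degrees) to EPSG:3857 x/y (metres)."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y
```

The projection is only defined away from the poles (roughly within 85.05 degrees of the equator), which is not a practical limitation for points in Africa.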
"These contours are represented as vector data (lines). Each contour line is associated with an elevation (number of feet above sea level) in the attribute database. Contours were generated from the TIN (see description under DTM on page 6 of the supplemental METADATA FILE). Note that these contours are suitable for general landscape planning and educational purposes only. They are not survey-grade quality, and are not intended to support applications that require survey-quality data. These contours should be used for general reference and educational purposes only. The contour interval is 20 feet for counties/localities that lie west of the fall line (I-95 corridor) and 10 feet for localities that lie east of the fall line (I-95). The supplemental METADATA file included with the data contains an illustration below the Contour section that shows contours (left) and contours draped over aerial photography (right). Contours are used as a visual tool to understand general topographic landscape characteristics. In addition, contours can be “queried” so that users can quickly identify areas above or below certain elevations. However, these contours have been interpolated, and are only approximate. Applications requiring precise elevation measurements will require the assistance of a professional surveyor. The Virginia Geospatial Extension Program developed the contour layer using the TIN (described under DTM on page 6 of the supplemental METADATA file). For more information on this data refer to the supplemental metadata pdf found at: https://secure-archive.gis.vt.edu/gisdata/public/UnitedStates/Virginia/VCE_2002_metadata/METADATA.pdf This data has been curated by the Virginia Cooperative Extension at Virginia Tech and Virginia Tech University Libraries. This data is meant for general use only. Virginia Tech’s University Library is acting as a steward for this data and any questions about its use should refer to our Terms of Use Page."
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
** IMPORTANT UPDATE: **
Until now, the project and public versions of SWORD have been kept separate while algorithms were being developed in preparation for SWOT launch. Now that the SWOT mission is here, we have decided to publish the project version of SWORD, which is why the version numbers jump after v2. The primary difference between the project and public versions of SWORD is the extra "filler" variables in the NetCDF format that will be used for calculating discharge. Everything else (reach definitions, attribute values, etc.) is the same between the two versions. For details on the filler variables please reference the Product Description Document provided with the downloads.
If you use the SWORD Database in your work, please cite: Altenau et al., (2021) The Surface Water and Ocean Topography (SWOT) Mission River Database (SWORD): A Global River Network for Satellite Data Products. Water Resources Research. https://doi.org/10.1029/2021WR030054
You can also visit www.swordexplorer.com to explore the current version of SWORD before downloading.
The upcoming Surface Water and Ocean Topography (SWOT) satellite mission, planned to launch in 2022, will vastly expand observations of river water surface elevation (WSE), width, and slope. To facilitate a wide range of new analyses, the SWOT mission will provide a range of relevant data products. One of these is a set of river vector products stored in shapefile format for each SWOT overpass (JPL Internal Document, 2020b). The SWOT vector data products will be most broadly useful if they allow multitemporal analysis of river nodes and reaches covering the same river areas. Doing so requires defining SWOT reaches and nodes a priori, so that SWOT data can be assigned to them. The SWOT River Database (SWORD) combines multiple global river- and satellite-related datasets to define the nodes and reaches that will constitute SWOT river vector data products. SWORD provides high-resolution river nodes (200 m) and reaches (~10 km) in shapefile and netCDF formats with attached hydrologic variables (WSE, width, slope, etc.) as well as a consistent topological system for global rivers 30 m wide and greater.
The SWORD database is provided in netCDF, geopackage, and shapefile formats. All files start with a two-letter continent identifier ("af" – Africa, "as" – Asia / Siberia, "eu" – Europe / Middle East, "na" – North America, "oc" – Oceania, "sa" – South America). File syntax denotes the regional information for each file and varies slightly between netCDF and shapefile formats.
NetCDF files are structured in 3 groups: centerlines, nodes, and reaches. The centerline group contains location information and associated reach and node ids along the original GRWL 30 m centerlines (Allen and Pavelsky, 2018). Node and reach groups contain hydrologic attributes at the ~200 m node and ~10 km reach locations (see description of attributes below). NetCDFs are distributed at continental scales with a filename convention as follows: [continent]_sword_v17.nc (e.g. na_sword_v17.nc).
SWORD shapefiles consist of four main files (.dbf, .prj, .shp, .shx). There are separate shapefiles for nodes and reaches, where nodes are represented as ~200 m spaced points and reaches are represented as polylines. All shapefiles are in geographic (latitude/longitude) projection, referenced to datum WGS84. Shapefiles are split into HydroBASINS (Lehner and Grill, 2013) Pfafstetter level 2 basins (hbXX) for each continent with a naming convention as follows: [continent]_sword_[nodes/reaches]_hb[XX]_v17.shp (e.g. na_sword_nodes_hb74_v17.shp; na_sword_reaches_hb74_v17.shp).
SWORD geopackage files are split into two files for nodes and reaches per continental region, where nodes are represented as ~200 m spaced points and reaches are represented as polylines. All geopackage files are in geographic (latitude/longitude) projection, referenced to datum WGS84. Geopackage files are distributed at continental scales and their names are defined by a two-letter continent identifier (Table 2): [continent]_sword_[nodes/reaches]_v17.gpkg (e.g. na_sword_nodes_v17.gpkg; na_sword_reaches_v17.gpkg).
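The three naming conventions above can be collected into one helper, which may be convenient when looping over continents or basins. This is a sketch built only from the patterns quoted in this description; the function name, its parameters and the `fmt` keywords are hypothetical.

```python
CONTINENTS = {"af", "as", "eu", "na", "oc", "sa"}


def sword_filename(continent, fmt, kind=None, basin=None, version="v17"):
    """Build a SWORD file name from the conventions in the description.

    fmt:   "netcdf", "shp" or "gpkg" (keywords chosen for this sketch)
    kind:  "nodes" or "reaches" (shapefile and geopackage only)
    basin: two-digit Pfafstetter level 2 basin code as a string (shapefile only)
    """
    if continent not in CONTINENTS:
        raise ValueError(f"unknown continent identifier: {continent!r}")
    if fmt == "netcdf":
        return f"{continent}_sword_{version}.nc"
    if kind not in ("nodes", "reaches"):
        raise ValueError("kind must be 'nodes' or 'reaches'")
    if fmt == "gpkg":
        return f"{continent}_sword_{kind}_{version}.gpkg"
    if fmt == "shp":
        if basin is None:
            raise ValueError("shapefiles need a level 2 basin code")
        return f"{continent}_sword_{kind}_hb{basin}_{version}.shp"
    raise ValueError(f"unknown format: {fmt!r}")
```

For example, `sword_filename("na", "shp", kind="nodes", basin="74")` reproduces the shapefile name used as an example in the text.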
This list contains the primary attributes contained in the SWORD database.
x: Longitude of the node or reach ranging from 180°E to 180°W (units: decimal degrees).
y: Latitude of the node or reach ranging from 90°S to 90°N (units: decimal degrees).
node_id: ID of each node. The format of the id is as follows: CBBBBBRRRRNNNT where C = Continent (the first number of the Pfafstetter basin code), B = Remaining Pfafstetter basin code up to level 6, R = Reach number (assigned sequentially within a level 6 basin starting at the downstream end working upstream), N = Node number (assigned sequentially within a reach starting at the downstream end working upstream), T = Type (1 – river, 3 – lake on river, 4 – dam or waterfall, 5 – unreliable topology, 6 – ghost node).
node_length (node files only): Node length measured along the GRWL centerline points (units: meters).
reach_id: ID of each reach. The format of the id is as follows: CBBBBBRRRRT where C = Continent (the first number of the Pfafstetter basin code), B = Remaining Pfafstetter basin codes up to level 6, R = Reach number (assigned sequentially within a level 6 basin starting at the downstream end working upstream), T = Type (1 – river, 3 – lake on river, 4 – dam or waterfall, 5 – unreliable topology, 6 – ghost reach).
reach_length (reach files only): Reach length measured along the GRWL centerline points (units: meters).
wse: Average water surface elevation (WSE) value for a node or reach. WSEs are extracted from the MERIT Hydro dataset (Yamazaki et al., 2019) and referenced to the EGM96 geoid (units: meters).
wse_var: WSE variance along the GRWL centerline points used to calculate the average WSE for each node or reach (units: square meters).
width: Average width for a node or reach (units: meters).
width_var: Width variance along the GRWL centerline points used to calculate the average width for each node or reach (units: square meters).
max_width: Maximum width value across the channel for each node or reach that includes island and bar areas (units: meters).
facc: Maximum flow accumulation value for a node or reach. Flow accumulation values are extracted from the MERIT Hydro dataset (Yamazaki et al., 2019) (units: square kilometers).
n_chan_max: Maximum number of channels for each node or reach.
n_chan_mod: Mode of the number of channels for each node or reach.
obstr_type: Type of obstruction for each node or reach based on the Global River Obstruction Database (GROD, Whittemore et al., 2020) and HydroFALLS data (http://wp.geog.mcgill.ca/hydrolab/hydrofalls). Obstr_type values: 0 - No Dam, 1 - Dam, 2 - Channel Dam, 3 - Lock, 4 - Low Permeable Dam, 5 - Waterfall.
grod_id: The unique GROD ID for each node or reach with obstr_type values 1-4.
hfalls_id: The unique HydroFALLS ID for each node or reach with obstr_type value 5.
dist_out: Distance from the river outlet for each node or reach (units: meters).
type: Type identifier for a node or reach: 1 – river, 2 – lake off river, 3 – lake on river, 4 – dam or waterfall, 5 – unreliable topology, 6 – ghost reach/node.
lakeflag: GRWL water body identifier for each reach: 0 – river, 1 – lake/reservoir, 2 – canal, 3 – tidally influenced river.
manual_add (node files only): Binary flag indicating whether the node was manually added to the public GRWL centerlines (Allen and Pavelsky, 2018). These nodes were originally given a width = 1, but have since been updated to have the reach width values.
meand_len (node files only): Length of the meander that a node belongs to, measured from beginning of the meander to its end in meters. For nodes longer than one meander, the meander length will represent the average length of all meanders belonging to the node (units: meters).
sinuosity (node files only): The total reach length the node belongs to divided by the Euclidean distance between the reach end points.
slope (reach files only): Reach average slope calculated along the GRWL centerline points. Slopes are calculated using a linear regression (units: meters/kilometer).
n_nodes (reach files only): Number of nodes associated with each reach.
n_rch_up (reach files only): Number of upstream reaches for each reach.
n_rch_down (reach files only): Number of downstream reaches for each reach.
rch_id_up (reach files only): Reach IDs of the upstream neighboring reaches.
rch_id_dn (reach files only): Reach IDs of the downstream neighboring reaches.
swot_obs (reach files only): The maximum number of SWOT passes to intersect each reach during the 21 day orbit cycle.
swot_orbits (reach files only): A list of the SWOT orbit tracks that intersect each reach during the 21 day orbit cycle.
river_name: All river names associated with a node or reach. If there are multiple names for a node or reach they are listed in alphabetical order and separated by a semicolon.
edit_flag: Numerical flag indicating the type of update applied to SWORD nodes or reaches from the previous version. Flag descriptions are listed in the Product Description Documentation included with the file downloads.
trib_flag: Binary flag indicating if a large tributary not represented in SWORD is entering a node or reach. 0 - no tributary, 1 - tributary.
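The node_id and reach_id formats listed above (CBBBBBRRRRNNNT and CBBBBBRRRRT) are fixed-width, so they can be split by position. The sketch below assumes IDs are handled as 14-digit or 11-digit integers or strings; the function name and the returned field names are hypothetical, and the IDs used to exercise it should be treated as made-up examples rather than real SWORD records.

```python
TYPE_NAMES = {
    1: "river",
    2: "lake off river",
    3: "lake on river",
    4: "dam or waterfall",
    5: "unreliable topology",
    6: "ghost reach/node",
}


def parse_sword_id(sword_id):
    """Split a SWORD node_id (14 digits) or reach_id (11 digits) into parts."""
    s = str(sword_id)
    if not s.isdigit() or len(s) not in (11, 14):
        raise ValueError(f"expected an 11- or 14-digit id, got {sword_id!r}")
    type_code = int(s[-1])
    return {
        "kind": "node" if len(s) == 14 else "reach",
        "continent": int(s[0]),          # C: first digit of the Pfafstetter code
        "basin_level6": s[:6],           # C + BBBBB: level 6 Pfafstetter basin code
        "reach_number": int(s[6:10]),    # RRRR: counted upstream from the outlet
        "node_number": int(s[10:13]) if len(s) == 14 else None,  # NNN (nodes only)
        "type": type_code,               # T
        "type_name": TYPE_NAMES.get(type_code, "unknown"),
    }
```

Keeping the level 6 basin code as a string avoids losing any zeros inside the code when it is compared with HydroBASINS identifiers.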
Allen, G. H., & Pavelsky, T. M. (2018). Global extent of rivers and streams. Science, 361(6402), 585-588.
Altenau, E. H., Pavelsky, T. M., Durand, M. T., Yang, X., Frasson, R. P. d. M., & Bendezu, L. (2021). The Surface Water and Ocean Topography (SWOT) Mission River Database (SWORD): A global river network for satellite data products. Water Resources Research.
Biancamaria, S., Lettenmaier, D. P., & Pavelsky, T. M. (2016). The SWOT mission and its
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Contains example data for RVGP paper published at ICLR 2024
Open Government Licence - Canada 2.0: https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
These data were created under DFO’s Strategic Program for Ecosystem-based Research and Advice - Aquatic Invasive Species Program: “Evaluation of the movement of marine infrastructure as a pathway for aquatic invasive species spread”. This geodatabase contains floating dock locations in coastal waters of the Pacific Northwest, from Puget Sound, Washington to Southeast Alaska. These data were assembled by Josephine Iacarella and used in an analysis to understand the role of floating infrastructure as a vector in the spread of marine nonindigenous species (Iacarella et al., 2019). The data are represented as point vectors, though docks have associated size estimates. Data were collected with the aim of representing the coastal coverage of structures in 2017 as accurately as possible. The most recent images from Google Earth were used, though in some areas these date back a few years. Floating docks included those that extended into the subtidal and were not fixed on pilings. Dock locations were binned into size categories, with small docks and associated marina structures grouped together as ‘marina areas’ based on spatial clustering and a visual estimate of size (haphazard measurement selection, n=35 per category; small: 57.2 m² ± 6.7, medium: 379.1 m² ± 42.8, marina area: 4,453.5 m² ± 744.4). A total of 7,809 floating dock sites were recorded, covering an estimated area of 2.3 km².
Attribution 3.0 (CC BY 3.0) https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
This dataset and its metadata statement were supplied to the Bioregional Assessment Programme by a third party and are presented here as originally supplied.
This dataset contains four Geoscience Australia GEODATA TOPO series topographic datasets at different scales (1M, 2.5M, 5M and 10M), dating from 2001 to 2004.
1:1 Million - Global Map Australia 1M 2001 is a digital dataset covering the Australian landmass and island territories, at a 1:1 million scale.
Product Specifications
-Themes: It consists of eight layers of information: vector layers (administrative boundaries, drainage, transportation and population centres) and raster layers (elevation, vegetation, land use and land cover)
-Coverage: Australia
-Currency: Variable, based on GEODATA TOPO 250K Series 1
-Coordinates: Geographical
-Datum: GDA94, AHD
-Medium: Free online
-Format: Vector: ArcInfo Export, ESRI Shapefile, MapInfo mid/mif and Vector Product Format (VPF); Raster: Band Interleaved by Line (BIL)
1:2.5 Million - GEODATA TOPO 2.5M 2003 is a national seamless data product aimed at regional or national applications. It is a vector representation of the Australian landscape as represented on the Geoscience Australia 1:2.5 million scale general reference map and is suitable for GIS applications. This data supersedes the TOPO 2.5M 1998 product through the following characteristics: it was developed according to GEODATA specifications, and it was derived from GEODATA TOPO 250K Series 2 data where available.
Product Specifications
-Themes: GEODATA TOPO 2.5M 2003 consists of eleven layers: built-up areas; contours; drainage; framework; localities; offshore; rail transport; road transport; sand ridges; spot heights; and waterbodies
-Coverage: Australia
-Currency: 2003
-Coordinates: Geographical
-Datum: GDA94, AHD
-Format: ArcInfo Export, ArcView Shapefile and MapInfo mid/mif
-Medium: Free online
1:5 Million - GEODATA TOPO 5M 2004 is a national seamless data product aimed at regional or national applications. It is a vector representation of the Australian landscape as represented on the Geoscience Australia 1:5 million general reference map and is suitable for GIS applications. Offshore and sand ridge layers were sourced from scanning of the original 1:5 million map production material. The remaining nine layers were derived from the GEODATA TOPO 2.5M 2003 dataset.
Product Specifications
-Themes: consists of eleven layers: built-up areas; contours; drainage; framework; localities; offshore; rail transport; road transport; sand ridges; spot heights; and waterbodies
-Coverage: Australia
-Currency: 2004
-Coordinates: Geographical
-Datum: GDA94, AHD
-Format: ArcInfo Export, ArcView Shapefile and MapInfo mid/mif
-Medium: Free online
1:10 Million - The GEODATA TOPO 10M 2002 version of this product has been completely revised, including the source information. The data is derived primarily from GEODATA TOPO 250K Series 1 data. In October 2003, the data was released in double precision coordinates. It provides a fundamental base layer of geographic information on which you can build a wide range of applications and is particularly suited to State-wide and national applications.
Product Specifications
-Themes: The data consists of ten layers: built-up areas; contours; drainage; spot heights; framework; localities; offshore; rail transport; road transport; and waterbodies
-Coverage: Australia
-Currency: 2002
-Coordinates: Geographical
-Datum: GDA94, AHD
-Format: ArcInfo Export, ArcView Shapefile and MapInfo mid/mif
-Medium: Free online
1:1Million - Vector data was produced by generalising Geoscience Australia's GEODATA TOPO 250K Series 1 data and updated using Series 2 data where available in January 2001. Raster data was sourced from USGS and updated using GEODATA 9 Second DEM Series 2, 1:5 million, Vegetation - Present (1988) and National Land and Water Resources data. However, updates have not been subjected to thorough vetting. A more detailed land use classification for Australia is available at www.nlwra.gov.au.
Full Metadata - http://www.ga.gov.au/metadata-gateway/metadata/record/gcat_48006
1:2.5Million - Data for the Contours, Offshore, and Sand ridge layers was captured from 1:2.5 million scale mapping by scanning stable base photographic film positives of the original map production material. The key source material for the Built-up areas, Drainage, Spot heights, Framework, Localities, Rail transport, Road transport and Waterbodies layers was GEODATA TOPO 2.5M 2003.
Full Metadata - http://www.ga.gov.au/metadata-gateway/metadata/record/gcat_60804
1:5Million - Offshore and Sand Ridge layers have been derived from 1:5M scale mapping by scanning stable base photographic film positives of the various layers of the original map production material. The remaining layers were sourced from the GEODATA TOPO 2.5M 2003 product.
Full Metadata - http://www.ga.gov.au/metadata-gateway/metadata/record/gcat_61114
1:10Million - The key source for production of the Builtup Areas, Drainage, Framework, Localities, Rail Transport, Road Transport and Waterbodies layers was the GEODATA TOPO 250K Series 1 product. Some revision of the Builtup Areas, Road Transport, Rail Transport and Waterbodies layers was carried out using the latest available satellite imagery. The primary source for the Spot Heights, Contours and Offshore layers was the GEODATA TOPO 10M Version 1 product. A further element in the production of GEODATA TOPO 10M 2002 was the datum shift from the Australian Geodetic Datum 1966 (AGD66) to the Geocentric Datum of Australia 1994 (GDA94).
Full Metadata - http://www.ga.gov.au/metadata-gateway/metadata/record/gcat_60803
Geoscience Australia (2001) Geoscience Australia GEODATA TOPO series - 1:1 Million to 1:10 Million scale. Bioregional Assessment Source Dataset. Viewed 13 March 2019, http://data.bioregionalassessments.gov.au/dataset/310c5d07-5a56-4cf7-a5c8-63bdb001cd1a.
These data were created as part of the National Oceanic and Atmospheric Administration Office for Coastal Management's efforts to create an online mapping viewer depicting potential sea level rise and its associated impacts on the nation's coastal areas. The purpose of the mapping viewer is to provide coastal managers and scientists with a preliminary look at sea level rise (slr) and coastal flooding impacts. The viewer is a screening-level tool that uses nationally consistent data sets and analyses. Data and maps provided can be used at several scales to help gauge trends and prioritize actions for different scenarios. The Sea Level Rise and Coastal Flooding Impacts Viewer may be accessed at: https://www.coast.noaa.gov/slr
These data depict the potential inundation of coastal areas resulting from current Mean Higher High Water (MHHW) conditions. The process used to produce the data can be described as a modified bathtub approach that attempts to account for both local/regional tidal variability as well as hydrological connectivity. The process uses two source datasets to derive the final inundation rasters and polygons and accompanying low-lying polygons: the Digital Elevation Model (DEM) of the area and a tidal surface model that represents spatial tidal variability. The tidal model is created using the NOAA National Geodetic Survey's VDATUM datum transformation software (http://vdatum.noaa.gov) in conjunction with spatial interpolation/extrapolation methods and represents the MHHW tidal datum in orthometric values (North American Vertical Datum of 1988). The model used to produce these data does not account for erosion, subsidence, or any future changes in an area's hydrodynamics. It is simply a method to derive data in order to visualize the potential scale, not exact location, of inundation from sea level rise.
Both raster and vector data are provided. The raster data represent both the horizontal extent of inundation and depth above ground, in meters.
The vector data represent the horizontal extent of both hydrologically connected and unconnected inundation. The vector "slr" data represent inundation that is hydrologically connected to the ocean. The vector "low" data represent areas that are hydrologically unconnected to the ocean, but are below MHHW and may also flood. For more information, contact coastal.info@noaa.gov.
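The modified bathtub idea described above can be illustrated with a small grid example. This is a minimal sketch under stated assumptions, not the procedure NOAA actually runs: the function name, the use of plain Python lists, the 4-neighbour connectivity rule and the hand-picked ocean seed cells are all hypothetical choices made for illustration. Depth is taken as the tidal surface minus the DEM where that difference is positive, and wet cells are then split into hydrologically connected ("slr") and unconnected ("low") groups by a flood fill from the ocean.

```python
from collections import deque


def classify_inundation(dem, tidal_surface, ocean_seeds):
    """Return (depth, label) grids for a bathtub-style inundation sketch.

    dem, tidal_surface: 2D lists of elevations in meters on the same datum.
    ocean_seeds: (row, col) cells assumed to be open ocean (hypothetical input).
    label values: "slr" (connected to the ocean), "low" (below the tidal
    surface but unconnected), or None (dry).
    """
    rows, cols = len(dem), len(dem[0])
    # Depth above ground: positive where the tidal surface exceeds the DEM.
    depth = [[max(tidal_surface[r][c] - dem[r][c], 0.0) for c in range(cols)]
             for r in range(rows)]
    # Start by treating every wet cell as unconnected.
    label = [["low" if depth[r][c] > 0 else None for c in range(cols)]
             for r in range(rows)]

    # Flood fill from the ocean seeds through 4-neighbour wet cells.
    queue = deque((r, c) for r, c in ocean_seeds if depth[r][c] > 0)
    for r, c in queue:
        label[r][c] = "slr"
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and label[nr][nc] == "low":
                label[nr][nc] = "slr"
                queue.append((nr, nc))
    return depth, label


# Made-up elevations: a ridge in column 2 separates a low area from the ocean.
dem = [[-1.0, 0.5, 2.0, 0.2],
       [-0.5, 0.8, 2.5, 0.4]]
tidal_surface = [[1.0] * 4 for _ in range(2)]  # uniform here for simplicity
depth, label = classify_inundation(dem, tidal_surface, ocean_seeds=[(0, 0)])
```

In this toy grid the cells behind the ridge end up labelled "low": they lie below the tidal surface but the flood fill cannot reach them. A spatially varying tidal surface can be supplied in place of the uniform one used here.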
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
The 3D Hydrography Program (3DHP) data is an integrated, national, 3D-enabled hydrologic dataset derived from the USGS 3D Elevation Program (3DEP) data. For areas where Elevation-derived Hydrography (EDH) has not yet been collected, 3DHP data is supplemented by hydrologic vector data from the National Hydrography Dataset (NHD). As further EDH data is collected, it will replace the NHD data in those areas. 3DHP data ingested from EDH sources includes ‘value added’ catchments and flowline network derivative attributes. All the data is open and non-proprietary. However, users should be aware that temporal changes may have occurred since this dataset was collected and that some parts of this data may no longer represent actual surface conditions. Users should not use this data for critical applications without a full awareness of its limitations. This dataset is not intended to be used for site-specific regulatory determinations. 3DHP datasets include a three-dimensional (3D) hydrography network generated from, and integrated with, elevation data from the 3DEP to better represent stream gradients and channel conditions, along with waterbodies, hydrologic units, hydrologically enhanced elevation and other surfaces, and more consistent and accurate attributes. This product is new in federal fiscal year 2025 (FY25) and consists only of vector data in a series of feature classes. The product represents the 3DHP dataset and the schema in which it is contained as of September 30, 2024. Future Annual Staged Product releases will reflect the schema at the time the product is generated and include more EDH-sourced data holdings.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Overview
This dataset contains binary representations of letters (A-Z) in different fonts. It is designed for use in pattern recognition and neural network projects, particularly those involving Hamming networks. The dataset includes multiple fonts for each letter, with each font represented as a 64-bit binary vector.
The data consist of a single combined file of fonts, sorted alphabetically by letter and then by font number.
Columns
Letter: The letter represented (A-Z).
64-Bit Vector: The binary vector representation of the letter.
Font Number: The font number (each number from 1 to 5 represents a different font).
This dataset was created as part of a final project for the Neuroscience class.
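As a rough illustration of how 64-bit vectors like these might be used, the sketch below classifies a noisy pattern by finding the stored prototype with the smallest Hamming distance. It is a simplification: a full Hamming network computes these similarities in one layer and picks the winner with a competitive second layer, whereas this example just takes the minimum directly. The bit patterns, the tuple layout and the function names are hypothetical and do not come from the dataset.

```python
def hamming_distance(a: str, b: str) -> int:
    """Count positions at which two equal-length bit strings differ."""
    if len(a) != len(b):
        raise ValueError("bit strings must have the same length")
    return sum(x != y for x, y in zip(a, b))


def classify(vector, prototypes):
    """Return the (letter, font_number, bits) prototype closest to vector."""
    return min(prototypes, key=lambda p: hamming_distance(vector, p[2]))


# Made-up 64-bit patterns standing in for rows of the dataset.
proto_a = "01" * 32
proto_b = "0011" * 16
prototypes = [("A", 1, proto_a), ("B", 1, proto_b)]

# Flip the first three bits of the "A" pattern to simulate noise.
noisy = "".join("1" if ch == "0" else "0" for ch in proto_a[:3]) + proto_a[3:]

best = classify(noisy, prototypes)
print(best[0], hamming_distance(noisy, best[2]))
```

With several fonts per letter, every font's vector can be added to the prototype list so that the closest match across all fonts decides the letter.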
This service provides data implemented from the IACS procedure for the INSPIRE topic land cover vector.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Many traditional quantitative structure–activity relationship (QSAR) models are based on correlation with high-dimensional, highly variable molecular features in their raw form, limiting their generalizing capabilities despite the use of large training sets. They also lack elements of causality and reasoning. With these issues in mind, we developed a method for learning higher-level abstract representations of the effects of the interactions between molecular features and biology. We call these representations reason vectors. They are composed of a series of computed activities of substructures obtained from stepwise reconstruction of the molecule. This representation is very different from fingerprints, which are composed of molecular features directly. These vectors capture reasons for the bioactivity of chemicals (or the absence thereof) in an abstract form, uncover causality in interactions between chemical features, and generalize beyond specific chemical classes or bioactivity. Reason vectors contain only a few key attributes and are much smaller than molecular fingerprints. They allow vague and conceptual similarity searches that are less susceptible to failure on novel combinations of query molecule features and more likely to identify reasons for activity in chemical classes that are absent from the training data. Reason vectors can be compared with each other, and their activity can be computed by matching with vectors from molecules with known bioactivity. A single molecule produces as many reason vectors as it has heavy atoms, and a simple count of these vectors in a series of activity ranges is all that is needed to predict its bioactivity. Thus, the prediction method is devoid of gradient optimization or statistical fitting.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Riparian corridors are important areas that maintain connectivity throughout the state of California. The riparian corridors complement the northern Sierra Nevada foothills wildlife connectivity project linkages to further achieve connectivity in the study area. We identified 280 riparian corridors represented by 232 named creeks, 43 named rivers, and 5 sloughs, forks or runs. The major corridors are the Sacramento, San Joaquin, Pit, Tuolumne, Merced, Feather and Stanislaus rivers. The 280 riparian corridors connect 201 landscape blocks. The riparian corridors complement the focal species linkages by providing many east-west corridors while the majority of linkages have a north-south orientation. Also, by following the entire passage of the riparian area, these corridors run through many of the landscape blocks across the study area, helping to provide connectivity outside of habitat patch areas.
We identified riparian corridors by selecting streams, rivers and creeks from the NHD (National Hydrography Dataset) for the state of California. From the NHD dataset, features named ‘StreamRiver’ were extracted from the ‘NHDFlowline’ vector dataset. The code 46006 was then used to extract perennial rivers and streams from the ‘StreamRiver’ dataset. However, this step resulted in a stream and river layer with many small segments. In order to reduce the number of segments and identify complete stream/river lines, we intersected the perennial rivers and streams layer with the CDFW statewide streams layer (‘CA_Streams_Statewide’) using the ‘Select by Location’ tool in ArcMap (‘CA_Streams_Statewide’ layer as the target layer and the streams and rivers layer we extracted from NHD as the source layer). Second, we extracted features named ‘ArtificialPath’ from the ‘NHDFlowline’ vector dataset.
Artificial paths represent the flow of water into, through, and out of features delineated using area; for example, rivers wide enough to be delineated as a polygon are represented by an artificial path flowline at their center line. Therefore, large rivers are often coded as “artificial path” in the NHD dataset. We then selected only those artificial paths with Geographic Names Information System (GNIS) names, with the assumption that artificial path features without names are “very minor streams, only of use to hydrologist” (http://nhd.usgs.gov). Next, we used the same method we implemented for streams and rivers in order to remove small segments and have complete lines. Unlike stream and river features, artificial paths are not coded to distinguish perennial from intermittent flow. As a result, artificial paths that intersected with perennial streams and rivers were selected to represent permanent waterways. Then, the perennial stream and river layer and the artificial paths layer were merged into one dataset. After the merge we added a 500 m buffer to each side of the riparian area.
We compared this merged stream/river layer with riparian vegetation classification data as a cross check. The riparian vegetation classification data are from the 2011 Northern Sierra Nevada Foothills and 2013 Eastern Central Valley fine-scale vegetation maps developed by the Vegetation Classification and Mapping Program (VegCamp) at the California Department of Fish and Wildlife. For areas outside the foothills and eastern central valley we used land cover data compiled by the California Department of Forestry and Fire Protection (CDF) Fire and Resource Assessment Program (FRAP) in 2006, representing data for the period between 1997 and 2002. The resulting perennial dataset was then merged with the wetland and riparian datasets to represent perennial water sources in California.
For more information see the project report at [https://nrm.dfg.ca.gov/FileHandler.ashx?DocumentID=85358].
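The attribute-based part of the selection described above (perennial ‘StreamRiver’ features, and named ‘ArtificialPath’ features) can be sketched in plain Python. This is only an illustration under stated assumptions: the record layout, the key names ftype, fcode and gnis_name, and the sample rows are hypothetical, and real NHD attribute tables may encode these fields differently. The geometric steps (Select by Location, the intersection test, the merge and the 500 m buffer) are not shown.

```python
PERENNIAL_STREAM_CODE = 46006  # code quoted in the description above


def select_perennial_streams(flowlines):
    """Keep StreamRiver features that carry the perennial code."""
    return [f for f in flowlines
            if f["ftype"] == "StreamRiver"
            and f["fcode"] == PERENNIAL_STREAM_CODE]


def select_named_artificial_paths(flowlines):
    """Keep ArtificialPath features that have a non-empty GNIS name."""
    return [f for f in flowlines
            if f["ftype"] == "ArtificialPath"
            and (f.get("gnis_name") or "").strip()]


# Made-up records; codes other than 46006 are placeholders for illustration.
flowlines = [
    {"id": 1, "ftype": "StreamRiver", "fcode": 46006, "gnis_name": "Alpha Creek"},
    {"id": 2, "ftype": "StreamRiver", "fcode": 46003, "gnis_name": "Beta Creek"},
    {"id": 3, "ftype": "ArtificialPath", "fcode": 55800, "gnis_name": "Gamma River"},
    {"id": 4, "ftype": "ArtificialPath", "fcode": 55800, "gnis_name": ""},
]

streams = select_perennial_streams(flowlines)
paths = select_named_artificial_paths(flowlines)
print([f["id"] for f in streams], [f["id"] for f in paths])
```

In the workflow described above, the named artificial paths would then be reduced further to those that intersect the perennial streams before the two layers are merged.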