Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Overview
3DHD CityScenes is the most comprehensive, large-scale high-definition (HD) map dataset to date, annotated in the three spatial dimensions of globally referenced, high-density LiDAR point clouds collected in urban domains. Our HD map covers 127 km of road sections in the inner city of Hamburg, Germany, including 467 km of individual lanes. In total, our map comprises 266,762 individual items.
Our corresponding paper (published at ITSC 2022) is available here. Further, we have applied 3DHD CityScenes to map deviation detection here.
Moreover, we release code to facilitate the application of our dataset and the reproducibility of our research. Specifically, our 3DHD_DevKit comprises:
Python tools to read, generate, and visualize the dataset,
3DHDNet deep learning pipeline (training, inference, evaluation) for map deviation detection and 3D object detection.
The DevKit is available here:
https://github.com/volkswagen/3DHD_devkit.
The dataset and DevKit have been created by Christopher Plachetka as project lead during his PhD period at Volkswagen Group, Germany.
When using our dataset, you are welcome to cite:
@INPROCEEDINGS{9921866,
  author={Plachetka, Christopher and Sertolli, Benjamin and Fricke, Jenny and Klingner, Marvin and Fingscheidt, Tim},
  booktitle={2022 IEEE 25th International Conference on Intelligent Transportation Systems (ITSC)},
  title={3DHD CityScenes: High-Definition Maps in High-Density Point Clouds},
  year={2022},
  pages={627-634}}
Acknowledgements
We thank the following interns for their exceptional contributions to our work.
Benjamin Sertolli: Major contributions to our DevKit during his master's thesis
Niels Maier: Measurement campaign for data collection and data preparation
The European large-scale project Hi-Drive (www.Hi-Drive.eu) supports the publication of 3DHD CityScenes and encourages the general publication of information and databases facilitating the development of automated driving technologies.
The Dataset
After downloading, the 3DHD_CityScenes folder provides five subdirectories, which are explained briefly in the following.
The Dataset directory contains the training, validation, and test set definitions (train.json, val.json, test.json) used in our publications. The respective files contain samples that define a geolocation and the orientation of the ego vehicle in global coordinates on the map.
During dataset generation (done by our DevKit), samples are used to take crops from the larger point cloud. Also, map elements within reach of a sample are collected. Both modalities can then be used, e.g., as input to a neural network such as our 3DHDNet.
To read any JSON-encoded data provided by 3DHD CityScenes in Python, you can use the following code snippet as an example.
import json

json_path = r"E:\3DHD_CityScenes\Dataset\train.json"
with open(json_path) as jf:
    data = json.load(jf)
print(data)
Map items are stored as lists of items in JSON format. In particular, we provide:
traffic signs,
traffic lights,
pole-like objects,
construction site locations,
construction site obstacles (point-like such as cones, and line-like such as fences),
line-shaped markings (solid, dashed, etc.),
polygon-shaped markings (arrows, stop lines, symbols, etc.),
lanes (ordinary and temporary),
relations between elements (only for construction sites, e.g., sign to lane association).
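As a quick illustration of this structure, the snippet below opens one of the map element files and inspects its entries. The file name and path are hypothetical placeholders; only the list-of-items JSON layout is taken from the description above.

import json

# Hypothetical example path; substitute one of the map element files from your download.
map_path = r"E:\3DHD_CityScenes\traffic_signs.json"

with open(map_path) as jf:
    items = json.load(jf)   # a list of map items, as described above

print(f"{len(items)} items loaded")
print(items[0])              # inspect the attributes of the first item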
Our high-density point cloud used as the basis for annotating the HD map is split into 648 tiles. This directory contains the geolocation of each tile as a polygon on the map. You can view the respective tile definition using QGIS. Alternatively, we also provide the respective polygons as lists of UTM coordinates in JSON.
Files with the extensions .dbf, .prj, .qpj, .shp, and .shx belong to the tile definition as a “shape file” (commonly used in geodesy) that can be viewed using QGIS. The JSON file contains the same information in a different format, which is used by our Python API.
The high-density point cloud tiles are provided in global UTM32N coordinates and are encoded in a proprietary binary format. The first 4 bytes (integer) encode the number of points contained in that file. Subsequently, all point cloud values are provided as arrays. First all x-values, then all y-values, and so on. Specifically, the arrays are encoded as follows.
x-coordinates: 4 byte integer
y-coordinates: 4 byte integer
z-coordinates: 4 byte integer
intensity of reflected beams: 2 byte unsigned integer
ground classification flag: 1 byte unsigned integer
After reading, the respective values have to be unnormalized. As an example, you can use the following code snippet to read the point cloud data. For visualization, you can use the pptk package, for instance.
import numpy as np
import pptk

file_path = r"E:\3DHD_CityScenes\HD_PointCloud_Tiles\HH_001.bin"
pc_dict = {}
key_list = ['x', 'y', 'z', 'intensity', 'is_ground']
type_list = ['
This layer shows Veteran Counts by Sex and Age Group by Census Tract for 2012-2016. This tile layer is best viewed atop a darker basemap such as the Dark Blue Canvas. Click here to view the feature layer that includes margin of error fields and calculated percentages. There are currently over 19.6 million veterans in the United States. Data came from American Community Survey 5-year estimates and were retrieved from the Census Bureau's API on Sept. 27th, 2017 by Diana Lavery.
Overview: Actual Natural Vegetation (ANV): probability of occurrence for the Stone pine in its realized environment for the period 2000 - 2028
Traceability (lineage): This is an original dataset produced with a machine learning framework which used a combination of point datasets and raster datasets as inputs. The point dataset is a harmonized collection of tree occurrence data, comprising observations from National Forest Inventories (EU-Forest), GBIF and LUCAS. The complete dataset is available on Zenodo. Raster datasets used as input are: harmonized and gapfilled time series of seasonal aggregates of the Landsat GLAD ARD dataset (bands and spectral indices); monthly time series of air and surface temperature and precipitation from a reprocessed version of the Copernicus ERA5 dataset; long-term averages of bioclimatic variables from CHELSA; tree species distribution maps from the European Atlas of Forest Tree Species; elevation, slope and other elevation-derived metrics; long-term monthly averages of snow probability and long-term monthly averages of cloud fraction from MODIS. For a more comprehensive list refer to Bonannella et al. (2022) (in review, preprint available at: https://doi.org/10.21203/rs.3.rs-1252972/v1).
Scientific methodology: Probability and uncertainty maps were the output of a spatiotemporal ensemble machine learning framework based on stacked regularization. Three base models (random forest, gradient boosted trees and generalized linear models) were first trained on the input dataset and their predictions were used to train an additional model (logistic regression) which provided the final predictions. More details on the whole workflow are available in the listed publication.
Usability: Probability maps can be used to detect potential forest degradation and compositional change across the time period analyzed. Some possible applications for these topics are explained in the listed publication.
Uncertainty quantification: Uncertainty is quantified by taking the standard deviation of the probabilities predicted by the three components of the spatiotemporal ensemble model.
Data validation approaches: Distribution maps were validated using a spatial 5-fold cross-validation following the workflow detailed in the listed publication.
Completeness: The raster files perfectly cover the entire Geo-harmonizer region as defined by the landmask raster dataset available here.
Consistency: Areas which are outside of the calibration area of the point dataset (Iceland, Norway) usually have high uncertainty values. This is not only a problem of extrapolation but also of poor representation, in the feature space available to the model, of the conditions present in these countries.
Positional accuracy: The rasters have a spatial resolution of 30 m.
Temporal accuracy: The maps cover the period 2000 - 2020; each map covers a certain number of years according to the following scheme: (1) 2000--2002, (2) 2002--2006, (3) 2006--2010, (4) 2010--2014, (5) 2014--2018 and (6) 2018--2020.
Thematic accuracy: Both probability and uncertainty maps contain values from 0 to 100: in the case of probability maps, they indicate the probability of occurrence of a single individual of the target species, while uncertainty maps indicate the standard deviation of the ensemble model.
Overview: Actual Natural Vegetation (ANV): probability of occurrence for the Aleppo pine in its realized environment for the period 2000 - 2026
Traceability (lineage): This is an original dataset produced with a machine learning framework which used a combination of point datasets and raster datasets as inputs. The point dataset is a harmonized collection of tree occurrence data, comprising observations from National Forest Inventories (EU-Forest), GBIF and LUCAS. The complete dataset is available on Zenodo. Raster datasets used as input are: harmonized and gapfilled time series of seasonal aggregates of the Landsat GLAD ARD dataset (bands and spectral indices); monthly time series of air and surface temperature and precipitation from a reprocessed version of the Copernicus ERA5 dataset; long-term averages of bioclimatic variables from CHELSA; tree species distribution maps from the European Atlas of Forest Tree Species; elevation, slope and other elevation-derived metrics; long-term monthly averages of snow probability and long-term monthly averages of cloud fraction from MODIS. For a more comprehensive list refer to Bonannella et al. (2022) (in review, preprint available at: https://doi.org/10.21203/rs.3.rs-1252972/v1).
Scientific methodology: Probability and uncertainty maps were the output of a spatiotemporal ensemble machine learning framework based on stacked regularization. Three base models (random forest, gradient boosted trees and generalized linear models) were first trained on the input dataset and their predictions were used to train an additional model (logistic regression) which provided the final predictions. More details on the whole workflow are available in the listed publication.
Usability: Probability maps can be used to detect potential forest degradation and compositional change across the time period analyzed. Some possible applications for these topics are explained in the listed publication.
Uncertainty quantification: Uncertainty is quantified by taking the standard deviation of the probabilities predicted by the three components of the spatiotemporal ensemble model.
Data validation approaches: Distribution maps were validated using a spatial 5-fold cross-validation following the workflow detailed in the listed publication.
Completeness: The raster files perfectly cover the entire Geo-harmonizer region as defined by the landmask raster dataset available here.
Consistency: Areas which are outside of the calibration area of the point dataset (Iceland, Norway) usually have high uncertainty values. This is not only a problem of extrapolation but also of poor representation, in the feature space available to the model, of the conditions present in these countries.
Positional accuracy: The rasters have a spatial resolution of 30 m.
Temporal accuracy: The maps cover the period 2000 - 2020; each map covers a certain number of years according to the following scheme: (1) 2000--2002, (2) 2002--2006, (3) 2006--2010, (4) 2010--2014, (5) 2014--2018 and (6) 2018--2020.
Thematic accuracy: Both probability and uncertainty maps contain values from 0 to 100: in the case of probability maps, they indicate the probability of occurrence of a single individual of the target species, while uncertainty maps indicate the standard deviation of the ensemble model.
Overview: Actual Natural Vegetation (ANV): probability of occurrence for the Pedunculate oak in its realized environment for the period 2000 - 2033
Traceability (lineage): This is an original dataset produced with a machine learning framework which used a combination of point datasets and raster datasets as inputs. The point dataset is a harmonized collection of tree occurrence data, comprising observations from National Forest Inventories (EU-Forest), GBIF and LUCAS. The complete dataset is available on Zenodo. Raster datasets used as input are: harmonized and gapfilled time series of seasonal aggregates of the Landsat GLAD ARD dataset (bands and spectral indices); monthly time series of air and surface temperature and precipitation from a reprocessed version of the Copernicus ERA5 dataset; long-term averages of bioclimatic variables from CHELSA; tree species distribution maps from the European Atlas of Forest Tree Species; elevation, slope and other elevation-derived metrics; long-term monthly averages of snow probability and long-term monthly averages of cloud fraction from MODIS. For a more comprehensive list refer to Bonannella et al. (2022) (in review, preprint available at: https://doi.org/10.21203/rs.3.rs-1252972/v1).
Scientific methodology: Probability and uncertainty maps were the output of a spatiotemporal ensemble machine learning framework based on stacked regularization. Three base models (random forest, gradient boosted trees and generalized linear models) were first trained on the input dataset and their predictions were used to train an additional model (logistic regression) which provided the final predictions. More details on the whole workflow are available in the listed publication.
Usability: Probability maps can be used to detect potential forest degradation and compositional change across the time period analyzed. Some possible applications for these topics are explained in the listed publication.
Uncertainty quantification: Uncertainty is quantified by taking the standard deviation of the probabilities predicted by the three components of the spatiotemporal ensemble model.
Data validation approaches: Distribution maps were validated using a spatial 5-fold cross-validation following the workflow detailed in the listed publication.
Completeness: The raster files perfectly cover the entire Geo-harmonizer region as defined by the landmask raster dataset available here.
Consistency: Areas which are outside of the calibration area of the point dataset (Iceland, Norway) usually have high uncertainty values. This is not only a problem of extrapolation but also of poor representation, in the feature space available to the model, of the conditions present in these countries.
Positional accuracy: The rasters have a spatial resolution of 30 m.
Temporal accuracy: The maps cover the period 2000 - 2020; each map covers a certain number of years according to the following scheme: (1) 2000--2002, (2) 2002--2006, (3) 2006--2010, (4) 2010--2014, (5) 2014--2018 and (6) 2018--2020.
Thematic accuracy: Both probability and uncertainty maps contain values from 0 to 100: in the case of probability maps, they indicate the probability of occurrence of a single individual of the target species, while uncertainty maps indicate the standard deviation of the ensemble model.
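The stacking and uncertainty scheme described in these entries (three base learners feeding a logistic-regression meta-learner, with uncertainty taken as the standard deviation of the base learners' probabilities) can be illustrated with a generic scikit-learn sketch. This is an illustration of the general idea only, not the authors' pipeline; the data and all model settings shown are placeholder assumptions.

import numpy as np
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import make_classification

# Placeholder data standing in for the occurrence/covariate matrix.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

base_learners = [
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ("gbt", GradientBoostingClassifier(random_state=0)),
    ("glm", LogisticRegression(max_iter=1000)),
]

# Meta-learner (logistic regression) trained on the base learners' probabilities.
ensemble = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(max_iter=1000),
    stack_method="predict_proba",
)
ensemble.fit(X, y)

# Final probability of occurrence from the stacked model.
p_final = ensemble.predict_proba(X)[:, 1]

# Uncertainty as the standard deviation of the base learners' probabilities.
p_base = np.column_stack(
    [est.predict_proba(X)[:, 1] for est in ensemble.named_estimators_.values()]
)
uncertainty = p_base.std(axis=1)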
This layer shows Population. This is shown by state and county boundaries. This service contains the 2018-2022 release of data from the American Community Survey (ACS) 5-year data, and contains estimates and margins of error. There are also additional calculated attributes related to this topic, which can be mapped or used within analysis. This layer is symbolized to show the point by Population Density and size of the point by Total Population. The size of the symbol represents the total count of housing units. Population Density was calculated based on the total population and area of land fields, which both came from the U.S. Census Bureau. Formula used for calculating the Pop Density: (B01001_001E/GEO_LAND_AREA_SQ_KM). To see the full list of attributes available in this service, go to the "Data" tab, and choose "Fields" at the top right.
Current Vintage: 2018-2022
ACS Table(s): B01001, B09020
Data downloaded from: Census Bureau's API for American Community Survey
Date of API call: January 18, 2024
National Figures: data.census.gov
The United States Census Bureau's American Community Survey (ACS): About the Survey; Geography & ACS; Technical Documentation; News & Updates
This ready-to-use layer can be used within ArcGIS Pro, ArcGIS Online, its configurable apps, dashboards, Story Maps, custom apps, and mobile apps. Data can also be exported for offline workflows. Please cite the Census and ACS when using this data.
Data Note from the Census: Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see Accuracy of the Data). The effect of nonsampling error is not represented in these tables.
Data Processing Notes: Boundaries come from the Cartographic Boundaries via US Census TIGER geodatabases. Boundaries are updated at the same time as the data updates, and the boundary vintage appropriately matches the data vintage as specified by the Census. These are Census boundaries with water and/or coastlines clipped for cartographic purposes. For state and county boundaries, the water and coastlines are derived from the coastlines of the 500k TIGER Cartographic Boundary Shapefiles. The original AWATER and ALAND fields are still available as attributes within the data table (units are square meters). The States layer contains 52 records - all US states, Washington D.C., and Puerto Rico. The Counties (and equivalent) layer contains 3221 records - all counties and equivalent, Washington D.C., and Puerto Rico municipios. See Areas Published. Percentages and derived counts, and associated margins of error, are calculated values (that can be identified by the "_calc_" stub in the field name), and abide by the specifications defined by the American Community Survey. Field alias names were created based on the Table Shells. Margin of error (MOE) values of -555555555 in the API (or "*****" (five asterisks) on data.census.gov) are displayed as 0 in this dataset.
The estimates associated with these MOEs have been controlled to independent counts in the ACS weighting and have zero sampling error. So, the MOEs are effectively zeroes, and are treated as zeroes in MOE calculations. Other negative values on the API, such as -222222222, -666666666, -888888888, and -999999999, all represent estimates or MOEs that can't be calculated or can't be published, usually due to small sample sizes. All of these are rendered in this dataset as null (blank) values.
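The sentinel-value handling described here can be reproduced with a small cleaning step. The snippet below is a generic illustration in pandas, with a hypothetical margin-of-error column name, not code taken from the service itself.

import pandas as pd
import numpy as np

# Hypothetical frame with a margin-of-error column as returned by the Census API.
df = pd.DataFrame({"B01001_001M": [120, -555555555, -666666666, 85, -888888888]})

# MOEs of -555555555 correspond to controlled estimates and are treated as zero.
df["B01001_001M"] = df["B01001_001M"].replace(-555555555, 0)

# Other negative sentinels represent values that cannot be calculated or published; set them to null.
sentinels = [-222222222, -666666666, -888888888, -999999999]
df["B01001_001M"] = df["B01001_001M"].replace(sentinels, np.nan)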
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains a list of 186 Digital Humanities projects leveraging information visualisation methods. Each project has been classified according to visualisation and interaction techniques, narrativity and narrative solutions, domain, methods for the representation of uncertainty and interpretation, and the employment of critical and custom approaches to visually represent humanities data.
The project_id column contains unique internal identifiers assigned to each project. Meanwhile, the last_access column records the most recent date (in DD/MM/YYYY format) on which each project was reviewed, based on the web address specified in the url column.
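As a quick way to work with this table, the snippet below loads it with pandas and parses the last_access dates. The file name dh_projects.csv is a placeholder; substitute whichever file name the repository actually uses.

import pandas as pd

# Placeholder file name; substitute the CSV shipped with this dataset.
df = pd.read_csv("dh_projects.csv")

# last_access is recorded as DD/MM/YYYY.
df["last_access"] = pd.to_datetime(df["last_access"], format="%d/%m/%Y")

print(len(df), "projects")   # the dataset lists 186 projects
print(df[["project_id", "url", "last_access"]].head())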
The remaining columns can be grouped into descriptive categories aimed at characterising projects according to different aspects:
Narrativity. It reports the presence of information visualisation techniques employed within narrative structures. Here, the term narrative encompasses both author-driven linear data stories and more user-directed experiences where the narrative sequence is determined by user exploration [1]. We define two columns to identify projects using visualisation techniques in narrative or non-narrative sections. Both conditions can be true for projects employing visualisations in both contexts. Columns:
non_narrative (boolean)
narrative (boolean)
Domain. The humanities domain to which the project is related. We rely on [2] and the chapters of the first part of [3] to abstract a set of general domains. Column:
domain (categorical):
History and archaeology
Art and art history
Language and literature
Music and musicology
Multimedia and performing arts
Philosophy and religion
Other: both extra-list domains and cases of collections without a unique or specific thematic focus.
Visualisation of uncertainty and interpretation. Building upon the frameworks proposed by [4] and [5], a set of categories was identified, highlighting a distinction between precise and impressional communication of uncertainty. Precise methods explicitly represent quantifiable uncertainty such as missing, unknown, or uncertain data, precisely locating and categorising it using visual variables and positioning. Two sub-categories are interactive distinction, when uncertain data is not visually distinguishable from the rest of the data but can be dynamically isolated or included/excluded categorically through interaction techniques (usually filters); and visual distinction, when uncertainty visually “emerges” from the representation by means of dedicated glyphs and spatial or visual cues and variables. On the other hand, impressional methods communicate the constructed and situated nature of data [6], exposing the interpretative layer of the visualisation and indicating more abstract and unquantifiable uncertainty using graphical aids or interpretative metrics. Two sub-categories are: ambiguation, when the use of graphical expedients—like permeable glyph boundaries or broken lines—visually conveys the ambiguity of a phenomenon; and interpretative metrics, when expressive, non-scientific, or non-punctual metrics are used to build a visualisation. Column:
uncertainty_interpretation (categorical):
Interactive distinction
Visual distinction
Ambiguation
Interpretative metrics
Critical adaptation. We identify projects in which, with regard to at least one visualisation, the following criteria are fulfilled: 1) avoid repurposing of prepackaged, generic-use, or ready-made solutions; 2) being tailored and unique to reflect the peculiarities of the phenomena at hand; 3) avoid simplifications to embrace and depict complexity, promoting time-consuming visualisation-based inquiry. Column:
critical_adaptation (boolean)
Non-temporal visualisation techniques. We adopt and partially adapt the terminology and definitions from [7]. A column is defined for each type of visualisation and accounts for its presence within a project, also including stacked layouts and more complex variations. Columns and inclusion criteria:
plot (boolean): visual representations that map data points onto a two-dimensional coordinate system.
cluster_or_set (boolean): sets or cluster-based visualisations used to unveil possible inter-object similarities.
map (boolean): geographical maps used to show spatial insights. While we do not specify the variants of maps (e.g., pin maps, dot density maps, flow maps, etc.), we make an exception for maps where each data point is represented by another visualisation (e.g., a map where each data point is a pie chart) by accounting for the presence of both in their respective columns.
network (boolean): visual representations highlighting relational aspects through nodes connected by links or edges.
hierarchical_diagram (boolean): tree-like structures such as tree diagrams, radial trees, and dendrograms. They differ from networks in their strictly hierarchical structure and the absence of closed connection loops.
treemap (boolean): still hierarchical, but highlighting quantities expressed by means of area size. It also includes circle-packing variants.
word_cloud (boolean): clouds of words, where each instance’s size is proportional to its frequency in a related context.
bars (boolean): includes bar charts, histograms, and variants. It coincides with “bar charts” in [7] but uses a more generic term to refer to all bar-based visualisations.
line_chart (boolean): the display of information as sequential data points connected by straight-line segments.
area_chart (boolean): similar to a line chart but with a filled area below the segments. It also includes density plots.
pie_chart (boolean): circular graphs divided into slices, which can also use multi-level solutions.
plot_3d (boolean): plots that use a third dimension to encode an additional variable.
proportional_area (boolean): representations used to compare values through area size, typically using circle- or square-like shapes.
other (boolean): includes all other types of non-temporal visualisations that do not fall into the aforementioned categories.
Temporal visualisations and encodings. In addition to non-temporal visualisations, a group of techniques to encode temporality is considered in order to enable comparisons with [7]. Columns:
timeline (boolean): the display of a list of data points or spans in chronological order. They include timelines working either with a scale or simply displaying events in sequence. As in [7], we also include structured solutions resembling Gantt chart layouts.
temporal_dimension (boolean): to report when time is mapped to any dimension of a visualisation, with the exclusion of timelines. We use the term “dimension” rather than “axis” as in [7], as it is more appropriate for radial layouts or more complex representational choices.
animation (boolean): temporality is perceived through an animation changing the visualisation according to time flow.
visual_variable (boolean): another visual encoding strategy is used to represent any temporality-related variable (e.g., colour).
Interaction techniques. A set of categories to assess affordable interaction techniques based on the concept of user intent [8] and user-allowed data actions [9]. The following categories roughly match the “processing”, “mapping”, and “presentation” actions from [9] and the manipulative subset of methods of the “how” an interaction is performed in the conception of [10]. Only interactions that affect the visual representation or the aspect of data points, symbols, and glyphs are taken into consideration. Columns:
basic_selection (boolean): the demarcation of an element either for the duration of the interaction or more permanently until the occurrence of another selection.
advanced_selection (boolean): the demarcation involves both the selected element and connected elements within the visualisation, or leads to brush-and-link effects across views. Basic selection is tacitly implied.
navigation (boolean): interactions that allow moving, zooming, panning, rotating, and scrolling the view, but only when applied to the visualisation and not to the web page. It also includes “drill” interactions (to navigate through different levels or portions of data detail, often generating a new view that replaces or accompanies the original) and “expand” interactions generating new perspectives on data by expanding and collapsing nodes.
arrangement (boolean): methods to organise visualisation elements (symbols, glyphs, etc.) or
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This web map provides the data and maps used in the story map Population density and diversity in New Zealand, created by Stats NZ. It uses Statistical Area 1 (SA1) data collected and published as part of the 2018 Census. The web map uses a mapping technique called multi-variate dot density mapping. The data used in the map can be found at this web service - 2018 Census Individual part 1 data by SA1. For questions or comments on the data or maps, please contact info@stats.govt.nz.
Census Data Quality Notes: We combined data from the census forms with administrative data to create the 2018 Census dataset, which meets Stats NZ’s quality criteria for population structure information. We added real data about real people to the dataset where we were confident the people should be counted but hadn’t completed a census form. We also used data from the 2013 Census and administrative sources and statistical imputation methods to fill in some missing characteristics of people and dwellings. Data quality for 2018 Census provides more information on the quality of the 2018 Census data. An independent panel of experts has assessed the quality of the 2018 Census dataset. The panel has endorsed Stats NZ’s overall methods and concluded that the use of government administrative records has improved the coverage of key variables such as age, sex, ethnicity, and place. The panel’s Initial Report of the 2018 Census External Data Quality Panel (September 2019) assessed the methodologies used by Stats NZ to produce the final dataset, as well as the quality of some of the key variables. Its second report, 2018 Census External Data Quality Panel: Assessment of variables (December 2019), assessed an additional 31 variables. In its third report, Final report of the 2018 Census External Data Quality Panel (February 2020), the panel made 24 recommendations, several relating to preparations for the 2023 Census. Along with this report, the panel, supported by Stats NZ, produced a series of graphs summarising the sources of data for key 2018 Census individual variables, 2018 Census External Data Quality Panel: Data sources for key 2018 Census individual variables. The Quick guide to the 2018 Census outlines the key changes we introduced as we prepared for the 2018 Census, and the changes we made once collection was complete. The geographic boundaries are as at 1 January 2018. See Statistical standard for geographic areas 2018. 2018 Census – DataInfo+ provides information about methods, and related metadata. Data quality ratings for 2018 Census variables provides information on data quality ratings.
The position of the groundwater surface, or of the groundwater pressure surface in the case of confined groundwater, is usually represented by groundwater contour lines (isohypses). The map theme shows the groundwater surface of the first (uppermost) large-scale aquifer for all unconsolidated-rock areas in Lower Saxony. Density differences were not taken into account. In the solid-rock areas of southern Lower Saxony, this type of representation is not practicable, since a spatially contiguous groundwater body usually does not exist there. The groundwater moves in the solid rock in fissure and fault systems or karst cavities. Although the groundwater resources in the solid rock, e.g. in karst areas, can be quite considerable, they cannot be meaningfully represented with groundwater contours at this scale. These areas are marked as solid rock on the map. In Lower Saxony’s coastal area, groundwater levels are strongly influenced by the tide-related changes in the seawater level and by measures of artificial drainage (pumping stations, tidal sluices). In areas with polders and pumping stations, the groundwater flow direction can run away from the coast towards inland. In the immediate vicinity of the coastline, the groundwater flow direction changes depending on the state of the tide. The groundwater level contours are shown here only in strongly generalised form. For the construction of the groundwater contours, groundwater level measurements taken at all measuring points at the same time are generally used. This illustration is based on the January 1993 reference date and represents a mean groundwater level of the 1990-2000 series. The reference date measurements of the map series are based on groundwater level data from the national water services, which were used with the permission of the Lower Saxony State Agency for Water Management, Coastal and Nature Conservation. In addition, some data from water supply companies were provided. Since the grid from reference date measurements does not have sufficient data density, the database was supplemented, as far as it seemed technically reasonable, with groundwater level measurements from other periods. These data come from the drilling database or from archive documents of the LBEG. In areas with high fluctuations in the groundwater level, this supplement was not made. In moraine areas, depths to groundwater are highly variable due to the very heterogeneous geological structure of these areas. Here, the groundwater contours can only represent the large-scale flow direction. In areas with a very low density of supporting measurements, the actual water levels on site may differ from the map representation. In order to make the line representation of the groundwater contours more vivid, the surfaces enclosed by them are coloured. The colour surfaces indicate the position of the groundwater surface or the groundwater pressure surface at intervals of 2.5 m in metres above sea level (NN). The groundwater contour plan is suitable for clarifying the flow directions and the potential gradients of groundwater in the unconsolidated-rock areas. Detailed statements may require maps with a higher density of reference-date measurements.
description: Region(s) of distribution of Glacier Lanternfish (Benthosema glaciale) (Reinhardt, 1837) in the Arctic as digitized for U.S. Geological Survey Scientific Investigations Report 2016-5038. For details on the project and purpose, see the report at https://doi.org/10.3133/sir20165038. Complete metadata for the collection of species datasets is in the metadata document "Dataset_for_Alaska_Marine_Fish_Ecology_Catalog.xml" at https://doi.org/10.5066/F7M61HD7. Source(s) for this digitized data layer are listed in the metadata Process Steps section. Note that the original source may show an extended area; some datasets were limited to the published map boundary. Distributions of marine fishes are shown in adjacent Arctic seas where reliable data are available. The data were clipped to show only the marine distribution areas although some species also may have an inland presence.; abstract: Region(s) of distribution of Glacier Lanternfish (Benthosema glaciale) (Reinhardt, 1837) in the Arctic as digitized for U.S. Geological Survey Scientific Investigations Report 2016-5038. For details on the project and purpose, see the report at https://doi.org/10.3133/sir20165038. Complete metadata for the collection of species datasets is in the metadata document "Dataset_for_Alaska_Marine_Fish_Ecology_Catalog.xml" at https://doi.org/10.5066/F7M61HD7. Source(s) for this digitized data layer are listed in the metadata Process Steps section. Note that the original source may show an extended area; some datasets were limited to the published map boundary. Distributions of marine fishes are shown in adjacent Arctic seas where reliable data are available. The data were clipped to show only the marine distribution areas although some species also may have an inland presence.
This is non-editable fire map information for use with the Idaho BLM Fire Management Plan (FMP). This Feature Service contains data that are not intended for active editing and will require regular yearly updates. The fire origin points record fire occurrences in Idaho from 1980-2018, with locations recorded in longitude and latitude using degrees, minutes, and seconds. The Orchard Training Area fire starts were removed, as they skewed the data to such an extent that it made sense to exclude these fire start locations. Within ArcMap, the Point Density geoprocessing tool (with Circle and Neighborhood as parameters) was used to create a simple raster dataset displaying the concentration of fire start locations across Idaho.
description: Region(s) of distribution of Scalebelly Eelpout (Lycodes squamiventer) Jensen, 1904 in the Arctic as digitized for U.S. Geological Survey Scientific Investigations Report 2016-5038. For details on the project and purpose, see the report at https://doi.org/10.3133/sir20165038. Complete metadata for the collection of species datasets is in the metadata document "Dataset_for_Alaska_Marine_Fish_Ecology_Catalog.xml" at https://doi.org/10.5066/F7M61HD7. Source(s) for this digitized data layer are listed in the metadata Process Steps section. Note that the original source may show an extended area; some datasets were limited to the published map boundary. Distributions of marine fishes are shown in adjacent Arctic seas where reliable data are available. The data were clipped to show only the marine distribution areas although some species also may have an inland presence.; abstract: Region(s) of distribution of Scalebelly Eelpout (Lycodes squamiventer) Jensen, 1904 in the Arctic as digitized for U.S. Geological Survey Scientific Investigations Report 2016-5038. For details on the project and purpose, see the report at https://doi.org/10.3133/sir20165038. Complete metadata for the collection of species datasets is in the metadata document "Dataset_for_Alaska_Marine_Fish_Ecology_Catalog.xml" at https://doi.org/10.5066/F7M61HD7. Source(s) for this digitized data layer are listed in the metadata Process Steps section. Note that the original source may show an extended area; some datasets were limited to the published map boundary. Distributions of marine fishes are shown in adjacent Arctic seas where reliable data are available. The data were clipped to show only the marine distribution areas although some species also may have an inland presence.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Summary: This repository contains spatial data files representing the density of vegetation cover within a 200 meter radius of points on a grid across the land area of New York City (NYC), New York, USA based on 2017 six-inch resolution land cover data, as well as SQL code used to carry out the analysis. The 200 meter radius was selected based on a study led by researchers at the NYC Department of Health and Mental Hygiene, which found that for a given point in the city, cooling benefits of vegetation only begin to accrue once the vegetation cover within a 200 meter radius is at least 32% (Johnson et al. 2020). The grid spacing of 100 feet in north/south and east/west directions was intended to provide granular enough detail to offer useful insights at a local scale (e.g., within a neighborhood) while keeping the amount of data needed to be processed for this manageable. The contained files were developed by the NY Cities Program of The Nature Conservancy and the NYC Environmental Justice Alliance through the Just Nature NYC Partnership. Additional context and interpretation of this work is available in a blog post.
References: Johnson, S., Z. Ross, I. Kheirbek, and K. Ito. 2020. Characterization of intra-urban spatial variation in observed summer ambient temperature from the New York City Community Air Survey. Urban Climate 31:100583. https://doi.org/10.1016/j.uclim.2020.100583
Files in this Repository:
Spatial Data (all data are in the New York State Plane Coordinate System - Long Island Zone, North American Datum 1983, EPSG 2263):
Points with unique identifiers (fid) and data on proportion tree canopy cover (prop_canopy), proportion grass/shrub cover (prop_grassshrub), and proportion total vegetation cover (prop_veg) within a 200 meter radius (same data made available in two commonly used formats, Esri File Geodatabase and GeoPackage):
nyc_propveg2017_200mbuffer_100ftgrid_nowater.gdb.zip
nyc_propveg2017_200mbuffer_100ftgrid_nowater.gpkg
Raster data with the proportion of total vegetation within a 200 meter radius of the center of each cell (pixel centers align with the spatial point data):
nyc_propveg2017_200mbuffer_100ftgrid_nowater.tif
Computer Code:
Code for generating the point data in PostgreSQL/PostGIS, assuming the data sources listed below are already in a PostGIS database:
nyc_point_buffer_vegetation_overlay.sql
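For readers who want to open these files directly, a minimal sketch using geopandas and rasterio (common choices, not tools prescribed by the repository) could look like the following; the assumption is that the GeoPackage's default layer holds the point data.

import geopandas as gpd
import rasterio

# Point data in the GeoPackage (EPSG:2263), with the vegetation proportion columns listed above.
points = gpd.read_file("nyc_propveg2017_200mbuffer_100ftgrid_nowater.gpkg")
print(points[["prop_canopy", "prop_grassshrub", "prop_veg"]].head())

# Raster with the proportion of total vegetation within 200 m of each cell center.
with rasterio.open("nyc_propveg2017_200mbuffer_100ftgrid_nowater.tif") as src:
    prop_veg = src.read(1)   # first band
    print(src.crs, prop_veg.shape)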
Data Sources and Methods: We used two openly available datasets from the City of New York for this analysis:
Borough Boundaries (Clipped to Shoreline) for NYC, from the NYC Department of City Planning, available at https://www.nyc.gov/site/planning/data-maps/open-data/districts-download-metadata.page
Six-inch resolution land cover data for New York City as of 2017, available at https://data.cityofnewyork.us/Environment/Land-Cover-Raster-Data-2017-6in-Resolution/he6d-2qns
All data were used in the New York State Plane Coordinate System, Long Island Zone (EPSG 2263). Land cover data were used in a polygonized form for these analyses. The general steps for developing the data available in this repository were as follows:
Create a grid of points across the city, based on the full extent of the Borough Boundaries dataset, with points 100 feet from one another in east/west and north/south directions.
Delete any points that do not overlap the areas in the Borough Boundaries dataset.
Create circles centered at each point, with a radius of 200 meters (656.168 feet), in line with the aforementioned paper (Johnson et al. 2020).
Overlay the circles with the land cover data, and calculate the proportion of the land cover that was grass/shrub and tree canopy land cover types. Note, because the land cover data consistently ended at the boundaries of NYC, for points within 200 meters of Nassau and Westchester Counties, the area with land cover data was smaller than the area of the circles.
Relate the results from the overlay analysis back to the associated points.
Create a raster data layer from the point data, with 100 foot by 100 foot resolution, where the center of each pixel is at the location of the respective points. Areas between the Borough Boundary polygons (open water of NY Harbor) are coded as "no data."
All steps except for the creation of the raster dataset were conducted in PostgreSQL/PostGIS, as documented in nyc_point_buffer_vegetation_overlay.sql. The conversion of the results to a raster dataset was done in QGIS (version 3.28), ultimately using the gdal_rasterize function.
The Communities at Sea maps use Vessel Trip Report location point data as input to create density polygons representing visitation frequency ("fisherdays"). The data show total labor including crew time and the time spent in transit to and from fishing locations. They do not show other variables such as vessel value or number of pounds landed. The results can be interpreted as maps of "community presence." This layer shows data for the gillnet fishing gear group for Point Judith, RI from 2011-2015.
Region(s) of distribution of Okhotsk Hookear Sculpin (Artediellus ochotensis) Gilbert & Burke, 1912 in the Arctic as digitized for U.S. Geological Survey Scientific Investigations Report 2016-5038. For details on the project and purpose, see the report at https://doi.org/10.3133/sir20165038. Complete metadata for the collection of species datasets is in the metadata document "Dataset_for_Alaska_Marine_Fish_Ecology_Catalog.xml" at https://doi.org/10.5066/F7M61HD7. Source(s) for this digitized data layer are listed in the metadata Process Steps section. Note that the original source may show an extended area; some datasets were limited to the published map boundary. Distributions of marine fishes are shown in adjacent Arctic seas where reliable data are available. The data were clipped to show only the marine distribution areas although some species also may have an inland presence.
This layer shows Population. This is shown by state and county boundaries. This service contains the 2017-2021 release of data from the American Community Survey (ACS) 5-year data, and contains estimates and margins of error. There are also additional calculated attributes related to this topic, which can be mapped or used within analysis. This layer is symbolized to show the point by Population Density and size of the point by Total Population. The size of the symbol represents the total count of housing units. Population Density was calculated based on the total population and area of land fields, which both came from the U.S. Census Bureau. Formula used for calculating the Pop Density: (B01001_001E/GEO_LAND_AREA_SQ_KM). To see the full list of attributes available in this service, go to the "Data" tab, and choose "Fields" at the top right.
Current Vintage: 2017-2021
ACS Table(s): B01001, B09020
Data downloaded from: Census Bureau's API for American Community Survey
Date of API call: February 16, 2023
National Figures: data.census.gov
The United States Census Bureau's American Community Survey (ACS): About the Survey; Geography & ACS; Technical Documentation; News & Updates
This ready-to-use layer can be used within ArcGIS Pro, ArcGIS Online, its configurable apps, dashboards, Story Maps, custom apps, and mobile apps. Data can also be exported for offline workflows. Please cite the Census and ACS when using this data.
Data Note from the Census: Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see Accuracy of the Data). The effect of nonsampling error is not represented in these tables.
Data Processing Notes: Boundaries come from the Cartographic Boundaries via US Census TIGER geodatabases. Boundaries are updated at the same time as the data updates, and the boundary vintage appropriately matches the data vintage as specified by the Census. These are Census boundaries with water and/or coastlines clipped for cartographic purposes. For state and county boundaries, the water and coastlines are derived from the coastlines of the 500k TIGER Cartographic Boundary Shapefiles. The original AWATER and ALAND fields are still available as attributes within the data table (units are square meters). The States layer contains 52 records - all US states, Washington D.C., and Puerto Rico. The Counties (and equivalent) layer contains 3221 records - all counties and equivalent, Washington D.C., and Puerto Rico municipios. See Areas Published. Percentages and derived counts, and associated margins of error, are calculated values (that can be identified by the "_calc_" stub in the field name), and abide by the specifications defined by the American Community Survey. Field alias names were created based on the Table Shells. Margin of error (MOE) values of -555555555 in the API (or "*****" (five asterisks) on data.census.gov) are displayed as 0 in this dataset.
The estimates associated with these MOEs have been controlled to independent counts in the ACS weighting and have zero sampling error. So, the MOEs are effectively zeroes, and are treated as zeroes in MOE calculations. Other negative values on the API, such as -222222222, -666666666, -888888888, and -999999999, all represent estimates or MOEs that can't be calculated or can't be published, usually due to small sample sizes. All of these are rendered in this dataset as null (blank) values.