Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Today, deep neural networks are widely used in many computer vision problems, including the analysis of geographic information systems (GIS) data. This type of data is commonly used for urban analyses and spatial planning. We used orthophotographic images of two residential districts in Kielce, Poland, for research on automatic urban sprawl analysis with a Transformer-based neural network.

Orthophotomaps were obtained from the Kielce GIS portal. Then, the map was manually masked into building and building surroundings classes. Finally, the orthophotomap and the corresponding classification mask were simultaneously divided into small tiles. This approach is common in image data preprocessing for the training phase of machine learning algorithms. The data contains two original orthophotomaps from the Wietrznia and Pod Telegrafem residential districts with corresponding masks, and also their tiled versions, ready to be used as training data for machine learning models.

A Transformer-based neural network was trained on the Wietrznia dataset for semantic segmentation of the tiles into building and surroundings classes. After that, model inference was used to test the model's generalization ability on the Pod Telegrafem dataset. The efficiency of the model was satisfactory, so it can be used for automatic semantic building segmentation. The process of dividing the images can then be reversed and the complete classification mask retrieved. This mask can be used for calculating building areas and for urban sprawl monitoring, if the research is repeated for GIS data from a wider time horizon.

Since the dataset was collected from the Kielce GIS portal, as part of the Polish Main Office of Geodesy and Cartography data resource, it may be used only for non-profit and non-commercial purposes, in private or scientific applications, under the law "Ustawa z dnia 4 lutego 1994 r. o prawie autorskim i prawach pokrewnych (Dz.U. z 2006 r. nr 90 poz 631 z późn. zm.)" (Act of 4 February 1994 on Copyright and Related Rights). There are no other legal or ethical considerations in reuse potential.

Data information is presented below.
- wietrznia_2019.jpg - orthophotomap of Wietrznia district - used for model's training, as an explanatory image
- wietrznia_2019.png - classification mask of Wietrznia district - used for model's training, as a target image
- wietrznia_2019_validation.jpg - one image from Wietrznia district - used for model's validation during training phase
- pod_telegrafem_2019.jpg - orthophotomap of Pod Telegrafem district - used for model's evaluation after training phase
- wietrznia_2019 - folder with wietrznia_2019.jpg (image) and wietrznia_2019.png (annotation) images, divided into 810 tiles (512 x 512 pixels each); tiles with no information were manually removed, so the training data would contain only informative tiles - tiles were presented to the model during training (images and annotations for fitting the model to the data)
- wietrznia_2019_vaidation - folder with wietrznia_2019_validation.jpg image divided into 16 tiles (256 x 256 pixels each) - tiles were presented to the model during training (images for validating the model's efficiency); it was not part of the training data
- pod_telegrafem_2019 - folder with pod_telegrafem.jpg image divided into 196 tiles (256 x 256 pixels each) - tiles were presented to the model during inference (images for evaluating the model's robustness)

Dataset was created as described below.

Firstly, the orthophotomaps were collected from Kielce Geoportal (https://gis.kielce.eu). Kielce Geoportal offers a .pst recent map from April 2019. It is an orthophotomap with a resolution of 5 x 5 pixels, constructed from a plane flight at 700 meters over ground height, taken with a camera for vertical photos. Downloading was done by WMS in open-source QGIS software (https://www.qgis.org), as a 1:500 scale map, then converted to a 1200 dpi PNG image.

Secondly, the map of the Wietrznia residential district was manually labelled, also in QGIS, in the same scope as the orthophotomap. The annotation was based on land cover map information, also obtained from Kielce Geoportal. There are two classes: residential building and surrounding. The second map, from the Pod Telegrafem district, was not annotated, since it was used in the testing phase and imitates a situation where there is no annotation for the new data presented to the model.

Next, the images were converted to RGB JPG images, and the annotation map was converted to an 8-bit grayscale PNG image.

Finally, the Wietrznia data files were tiled into 512 x 512 pixel tiles using the Python PIL library. Tiles with no information or a relatively small amount of information (only white background or mostly white background) were manually removed. So, from the 29113 x 15938 pixel orthophotomap, only 810 tiles with corresponding annotations were left, ready to train the machine learning model for the semantic segmentation task. The Pod Telegrafem orthophotomap was tiled with no manual removal, so 196 tiles with a resolution of 256 x 256 pixels were created from the 7168 x 7168 pixel orthophotomap. There was also an image of one residential building, used for the model's validation during the training phase; it was not part of the training data, but was a part of the Wietrznia residential area. It was a 2048 x 2048 pixel orthophotomap, tiled into 16 tiles of 256 x 256 pixels each.
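The tiling step described above can be sketched with PIL as follows. This is a minimal illustration, not the authors' script: the function names, the choice to drop incomplete edge tiles, and the automatic "mostly white" check (the description says such tiles were removed manually) are all assumptions made for the example.

```python
from PIL import Image


def tile_image(image, tile_size):
    """Split a PIL image into non-overlapping square tiles, row by row.

    Assumption: incomplete tiles at the right and bottom edges are dropped.
    """
    width, height = image.size
    tiles = []
    for top in range(0, height - tile_size + 1, tile_size):
        for left in range(0, width - tile_size + 1, tile_size):
            box = (left, top, left + tile_size, top + tile_size)
            tiles.append(image.crop(box))
    return tiles


def is_mostly_white(tile, threshold=0.9, white_level=250):
    """Hypothetical helper: flag tiles whose pixels are mostly white background.

    The original tiles were filtered by hand; this only approximates that step.
    """
    histogram = tile.convert("L").histogram()  # 256 bins of gray levels
    white_pixels = sum(histogram[white_level:])
    return white_pixels / (tile.width * tile.height) >= threshold


if __name__ == "__main__":
    # Synthetic stand-ins for an orthophotomap and its mask.
    photo = Image.new("RGB", (1024, 512), "white")
    mask = Image.new("L", (1024, 512), 0)
    # Tiling image and mask with the same function keeps the pairs aligned.
    photo_tiles = tile_image(photo, 512)
    mask_tiles = tile_image(mask, 512)
    kept = [(p, m) for p, m in zip(photo_tiles, mask_tiles) if not is_mostly_white(p)]
    print(len(photo_tiles), len(kept))
```

Because the image and its mask are cut with identical boxes, each image tile keeps its matching annotation tile at the same list index.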
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
The SkySeaLand Dataset is a high-resolution satellite imagery collection developed for object detection, classification, and aerial analysis tasks. It focuses on transportation-related objects observed from diverse geospatial contexts, offering precise YOLO-formatted annotations for four categories: airplane, boat, car, and ship.
This dataset bridges terrestrial, maritime, and aerial domains, providing a unified resource for developing and benchmarking computer vision models in complex real-world environments.
Annotations are provided as one .txt file per image. The SkySeaLand Dataset is divided into subsets for training, validation, and testing.
This split ensures a balanced distribution for training, validating, and testing models, facilitating robust model evaluation and performance analysis.
| Class Name | Object Count |
|---|---|
| Airplane | 4,847 |
| Boat | 3,697 |
| Car | 6,932 |
| Ship | 3,627 |
The dataset maintains a moderately balanced distribution among categories, ensuring stable model performance during multi-class training and evaluation.
Each label file contains normalized bounding box annotations in YOLO format.
The format for each line is:
class_id x_center y_center width height
Where:
- class_id: The class of the object (refer to the table below).
- x_center, y_center: The center coordinates of the bounding box, normalized between 0 and 1 relative to the image width and height.
- width, height: The width and height of the bounding box, also normalized between 0 and 1.
| Class ID | Category |
|---|---|
| 0 | Airplane |
| 1 | Boat |
| 2 | Car |
| 3 | Ship |
All coordinates are normalized between 0 and 1 relative to the image width and height.
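The label format above can be read back into pixel coordinates with a few lines of Python. This is a minimal sketch; the function name and the returned corner-box layout are illustrative choices, while the class IDs follow the table above.

```python
CLASS_NAMES = {0: "Airplane", 1: "Boat", 2: "Car", 3: "Ship"}


def parse_yolo_line(line, image_width, image_height):
    """Convert one YOLO label line into (class name, pixel box).

    The box is returned as (left, top, right, bottom) in pixels.
    """
    parts = line.split()
    class_id = int(parts[0])
    x_center, y_center, width, height = (float(value) for value in parts[1:5])
    left = (x_center - width / 2) * image_width
    top = (y_center - height / 2) * image_height
    right = (x_center + width / 2) * image_width
    bottom = (y_center + height / 2) * image_height
    return CLASS_NAMES[class_id], (left, top, right, bottom)


if __name__ == "__main__":
    # A car centered in a 1000 x 500 image, covering 20% of width and 40% of height.
    print(parse_yolo_line("2 0.5 0.5 0.2 0.4", 1000, 500))
```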
Data Source:
- Satellite imagery was obtained from Google Earth Pro under fair-use and research guidelines.
- The dataset was prepared solely for academic and educational computer vision research.
Annotation Tools:
- Manual annotations were performed and verified using:
- CVAT (Computer Vision Annotation Tool)
- Roboflow
These tools were used to ensure consistent annotation quality and accurate bounding box placement across all object classes.
https://market.us/privacy-policy/
By 2034, the AI Annotation Market is expected to reach a valuation of USD 28.5 billion, expanding at a healthy CAGR of 28.6%.
https://www.wiseguyreports.com/pages/privacy-policy
| Report attribute | Details |
|---|---|
| BASE YEAR | 2024 |
| HISTORICAL DATA | 2019 - 2023 |
| REGIONS COVERED | North America, Europe, APAC, South America, MEA |
| REPORT COVERAGE | Revenue Forecast, Competitive Landscape, Growth Factors, and Trends |
| MARKET SIZE 2024 | 1127.4 (USD Million) |
| MARKET SIZE 2025 | 1240.1 (USD Million) |
| MARKET SIZE 2035 | 3200.0 (USD Million) |
| SEGMENTS COVERED | Application, End Use, Service Type, Deployment Mode, Regional |
| COUNTRIES COVERED | US, Canada, Germany, UK, France, Russia, Italy, Spain, Rest of Europe, China, India, Japan, South Korea, Malaysia, Thailand, Indonesia, Rest of APAC, Brazil, Mexico, Argentina, Rest of South America, GCC, South Africa, Rest of MEA |
| KEY MARKET DYNAMICS | Increasing demand for AI technologies, Growth of autonomous vehicles, Advancements in LiDAR technology, Rising need for geospatial data, Expansion in 3D modeling applications |
| MARKET FORECAST UNITS | USD Million |
| KEY COMPANIES PROFILED | TechniMeasure, Amazon Web Services, Pointivo, Landmark Solutions, Autodesk, NVIDIA, Pix4D, Hexagon, Intel Corporation, Microsoft Azure, Faro Technologies, Google Cloud, Siemens, 3D Systems, Matterport, CGG |
| MARKET FORECAST PERIOD | 2025 - 2035 |
| KEY MARKET OPPORTUNITIES | Increasing demand for autonomous vehicles, Growth in AI and machine learning, Expansion of smart city projects, Rise in 3D modeling applications, Development of augmented and virtual reality |
| COMPOUND ANNUAL GROWTH RATE (CAGR) | 10.0% (2025 - 2035) |
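
The growth rate in the table can be cross-checked against the 2025 and 2035 market sizes. The sketch below assumes the standard compound annual growth rate formula applied over the ten years of the 2025 - 2035 forecast period; the function name is illustrative and is not taken from the report.

```python
def compound_annual_growth_rate(start_value, end_value, years):
    """Standard CAGR formula: (end / start) ** (1 / years) - 1."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


if __name__ == "__main__":
    # Market sizes in USD million, taken from the table above.
    cagr = compound_annual_growth_rate(1240.1, 3200.0, 10)
    print(f"CAGR 2025-2035: {cagr:.1%}")
```

With the table's values the result comes out close to the stated 10.0%.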
Annotation for the Assessor's GIS data. This service is used in the OpenWeb and Opendoor applications.
Spatial prepositions have been studied in some detail from multiple disciplinary perspectives. However, neither the semantic similarity of these prepositions, nor the relationships between the multiple senses of different spatial prepositions, are well understood. In an empirical study of 24 spatial prepositions, we identify the degree and nature of semantic similarity and extract senses for three semantically similar groups of prepositions using t-SNE, DBSCAN clustering, and Venn diagrams. We validate the work by manual annotation with another data set. We find nuances in meaning among proximity and adjacency prepositions, such as the use of "close to" instead of "near" for pairs of lines, and the importance of proximity over contact for the "next to" preposition, in contrast to other adjacency prepositions.
Annotation created from Indian Lands and Native Entities.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains annotations (i.e. polygons) for solar photovoltaic (PV) objects in the previously published dataset "Classification Training Dataset for Crop Types in Rwanda" published by RTI International (DOI: 10.34911/rdnt.r4p1fr [1]). These polygons are intended to enable the use of this dataset as a machine learning training dataset for solar PV identification in drone imagery. Note that this dataset contains ONLY the solar panel polygon labels and needs to be used with the original RGB UAV imagery “Drone Imagery Classification Training Dataset for Crop Types in Rwanda” (https://mlhub.earth/data/rti_rwanda_crop_type). The original dataset contains UAV imagery (RGB) in .tiff format from six provinces in Rwanda, each imaged in three phases, and our solar PV annotation dataset follows the same data structure, with province and phase labels in each subfolder.

Data processing:
Please refer to this GitHub repository for further details: https://github.com/BensonRen/Drone_based_solar_PV_detection. The original dataset is divided into 8000x8000 pixel image tiles and manually labeled with polygons (mainly rectangles) to indicate the presence of solar PV. These polygons are converted into pixel-wise, binary class annotations.

Other information:
1. The six provinces that the UAV imagery came from are: (1) Cyampirita (2) Kabarama (3) Kaberege (4) Kinyaga (5) Ngarama (6) Rwakigarati. The original data collection was staged across 18 phases, each of which collected a set of imagery from a given province (each province had 3 phases of collection). We have annotated 15 out of 18 phases; the missing ones are Kabarama-Phase2, Kaberege-Phase3, and Kinyaga-Phase3, due to data compatibility issues.
2. The annotated polygons are transformed into binary maps the size of the image tiles, in which each pixel is either 0 or 1: 0 represents background and 1 represents solar PV pixels. These binary maps are in .png format, and each province/phase set has between 9 and 49 annotation patches. Using the code provided in the above repository, the same image patches can be cropped from the original RGB imagery.
3. Solar PV densities vary across the image patches. In total, 214 solar PV instances were labeled in the 15 phases.

Associated publications:
“Utilizing geospatial data for assessing energy security: Mapping small solar home systems using unmanned aerial vehicles and deep learning” [https://arxiv.org/abs/2201.05548]

This dataset is published under the CC-BY-NC-SA-4.0 license (https://creativecommons.org/licenses/by-nc-sa/4.0/).
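The polygon-to-binary-map conversion described in item 2 can be illustrated with PIL and NumPy. This is a sketch under stated assumptions, not the code from the linked repository: the function name and the input layout (a list of polygons, each a list of (x, y) pixel vertices) are hypothetical.

```python
import numpy as np
from PIL import Image, ImageDraw


def polygons_to_binary_mask(polygons, width, height):
    """Rasterize polygons into a mask where 1 marks solar PV and 0 background.

    Each polygon is assumed to be a list of (x, y) vertices in pixel coordinates.
    """
    mask_image = Image.new("L", (width, height), 0)  # start with all background
    draw = ImageDraw.Draw(mask_image)
    for polygon in polygons:
        draw.polygon(polygon, fill=1)
    return np.array(mask_image, dtype=np.uint8)


if __name__ == "__main__":
    # One rectangular panel inside a small 10 x 10 tile.
    panel = [(2, 2), (7, 2), (7, 6), (2, 6)]
    mask = polygons_to_binary_mask([panel], 10, 10)
    print(mask)
```

Saving such an array with `Image.fromarray(mask).save("mask.png")` would give a .png binary map like those described above; the pixels look almost black in a viewer because the values are 0 and 1 rather than 0 and 255.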
Pinal County Control Network Township Range Annotation
Coast Train is a library of images of coastal environments, annotations, and corresponding thematic label masks (or ‘label images’) collated for the purposes of training and evaluating machine learning (ML), deep learning, and other models for image segmentation. It includes image sets from geospatial satellite, aerial, and UAV imagery and orthomosaics, as well as non-geospatial oblique and nadir imagery. Images include a diverse range of coastal environments from the U.S. Pacific, Gulf of Mexico, Atlantic, and Great Lakes coastlines, consisting of time-series of high-resolution (≤1m) orthomosaics and satellite image tiles (10–30m). Each image, image annotation, and labelled image is available as a single NPZ zipped file. NPZ files use the following naming convention: {datasource}{numberofclasses}{threedigitdatasetversion}.zip, where {datasource} is the source of the original images (for example, NAIP, Landsat 8, Sentinel 2), {numberofclasses} is the number of classes used to annotate the images, and {threedigitdatasetversion} is the three-digit code corresponding to the dataset version (in other words, 001 is version 1). Each zipped folder contains a collection of NPZ format files, each of which corresponds to an individual image. An individual NPZ file is named after the image that it represents and contains (1) a CSV file with detail information for every image in the zip folder and (2) a collection of the following NPY files: orig_image.npy (original input image unedited), image.npy (original input image after color balancing and normalization), classes.npy (list of classes annotated and present in the labelled image), doodles.npy (integer image of all image annotations), color_doodles.npy (color image of doodles.npy), label.npy (labelled image created from the classes present in the annotations), and settings.npy (annotation and machine learning settings used to generate the labelled image from annotations).
All NPZ files can be extracted using the utilities available in Doodler (Buscombe, 2022). A merged CSV file containing detail information on the complete imagery collection is available at the top level of this data release, details of which are available in the Entity and Attribute section of this metadata file.
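Besides Doodler, NPZ archives can also be opened directly with NumPy. The sketch below is an assumption-laden illustration rather than the documented workflow: it assumes the arrays are reachable under the keys image, label, and classes (NumPy drops the .npy suffix from member names), the function name is hypothetical, and the demo writes a small synthetic file instead of using a real Coast Train archive.

```python
import os
import tempfile

import numpy as np


def load_npz_arrays(path, keys=("image", "label", "classes")):
    """Return the requested arrays from an NPZ file, skipping keys that are absent."""
    with np.load(path) as archive:
        return {key: archive[key] for key in keys if key in archive.files}


if __name__ == "__main__":
    # Synthetic stand-in for a single-image NPZ file.
    demo_path = os.path.join(tempfile.mkdtemp(), "demo_image.npz")
    np.savez(
        demo_path,
        image=np.zeros((4, 4, 3), dtype=np.uint8),
        label=np.ones((4, 4), dtype=np.uint8),
        classes=np.array(["water", "sand"]),
    )
    arrays = load_npz_arrays(demo_path)
    print({name: array.shape for name, array in arrays.items()})
```

Members stored as Python objects (settings.npy may be one) would need `allow_pickle=True`, which should only be used with trusted files.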
The USGS Open-File Report 99-362 dataset consists of the digital files used to create the published paper map, USGS OFR 99-362. The 1:63,360 scale map shows the bedrock geology of a special study area within the Chugach National Forest, Alaska. Digital files include ARC/Info coverages in export format of geology, structural data, and annotation, and a PDF file of the Open-File Report.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Our strategy is to reuse images from existing benchmark datasets as much as possible and manually annotate new land cover labels. We selected xBD, Inria, Open Cities AI, SpaceNet, Landcover.ai, AIRS, GeoNRW, and HTCD datasets. For countries and regions not covered by the existing datasets, aerial images publicly available in such countries or regions were collected to mitigate the regional gap, which is an issue in most of the existing benchmark datasets. The open data were downloaded from OpenAerialMap and geospatial agencies in Peru and Japan. The attribution of source data is summarized here.
We provide annotations with eight classes: bareland, rangeland, developed space, road, tree, water, agriculture land, and building. Their color and proportion of pixels are summarized below. All the labeling was done manually, and it took 2.5 hours per image on average.
| Color (HEX) | Class | % |
|---|---|---|
| #800000 | Bareland | 1.5 |
| #00FF24 | Rangeland | 22.9 |
| #949494 | Developed space | 16.1 |
| #FFFFFF | Road | 6.7 |
| #226126 | Tree | 20.2 |
| #0045FF | Water | 3.3 |
| #4BB549 | Agriculture land | 13.7 |
| #DE1F07 | Building | 15.6 |
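
If the labels are stored as color images using the HEX values in the table above, they can be mapped to integer class IDs for training. The sketch below rests on two assumptions that the description does not confirm: that labels are RGB-encoded, and that class IDs follow the row order of the table. Pixels with any other color are marked -1.

```python
import numpy as np

# (class name, HEX color) pairs in the row order of the table above.
CLASS_COLORS = [
    ("Bareland", "#800000"),
    ("Rangeland", "#00FF24"),
    ("Developed space", "#949494"),
    ("Road", "#FFFFFF"),
    ("Tree", "#226126"),
    ("Water", "#0045FF"),
    ("Agriculture land", "#4BB549"),
    ("Building", "#DE1F07"),
]


def hex_to_rgb(hex_color):
    """Convert '#RRGGBB' into an (R, G, B) tuple of integers."""
    digits = hex_color.lstrip("#")
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))


def color_mask_to_class_ids(rgb_mask):
    """Map an (H, W, 3) uint8 color mask to an (H, W) array of class IDs."""
    class_ids = np.full(rgb_mask.shape[:2], -1, dtype=np.int64)
    for class_id, (_, hex_color) in enumerate(CLASS_COLORS):
        color = np.array(hex_to_rgb(hex_color), dtype=rgb_mask.dtype)
        class_ids[np.all(rgb_mask == color, axis=-1)] = class_id
    return class_ids


if __name__ == "__main__":
    demo = np.array(
        [[[128, 0, 0], [255, 255, 255]], [[0, 69, 255], [1, 2, 3]]],
        dtype=np.uint8,
    )
    print(color_mask_to_class_ids(demo))
```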
Label data of OpenEarthMap are provided under the same license as the original RGB images, which varies with each source dataset. For more details, please see the attribution of source data here. Label data for regions where the original RGB images are in the public domain or where the license is not explicitly stated are licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
@inproceedings{xia_2023_openearthmap,
title = {OpenEarthMap: A Benchmark Dataset for Global High-Resolution Land Cover Mapping},
author = {Junshi Xia and Naoto Yokoya and Bruno Adriano and Clifford Broni-Bediako},
booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
month = {January},
year = {2023},
pages = {6254-6264}
}
https://www.usa.gov/government-works/
This dataset contains detailed information about the locations and operational status of grocery stores in Washington, spanning multiple years. It includes both spatial and temporal data, offering a comprehensive view of how grocery stores are distributed and have evolved over time. Below is a breakdown of the columns included in the dataset:
X, Y: Geographic coordinates (longitude and latitude) representing the store's location in the dataset.
STORENAME: The name of the grocery store.
ADDRESS: The physical address of the grocery store.
ZIPCODE: The ZIP code of the store’s location.
PHONE: The contact phone number for the store.
WARD: The local government ward in which the store is located.
SSL: A unique identifier or code related to the store, possibly referring to specific data collection attributes.
NOTES: Additional comments or information about the store.
PRESENT: Temporal indicators showing the presence (likely open or closed) of each store across various years. These columns provide insights into the longevity and temporal trends of grocery store operations.
GIS_ID: A unique identifier for geographic information system (GIS) data.
XCOORD, YCOORD: Coordinates (likely more specific) used for spatial data analysis, providing the exact location of the store.
MAR_ID: A unique identifier for marketing or regional analysis purposes.
GLOBALID: A global unique identifier for the store data.
CREATOR: The individual or system that created the data entry.
CREATED: Timestamp showing when the data entry was created.
EDITOR: The individual or system that edited the data entry.
EDITED: Timestamp showing when the data entry was last edited.
SE_ANNO_CAD_DATA: Specific annotation or data related to CAD (computer-aided design), possibly linked to store location details.
OBJECTID: A unique identifier for the object or record within the dataset.
This dataset is invaluable for urban planners, policymakers, and business stakeholders looking to improve food access and urban infrastructure.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The RoadSens-4M Dataset provides a multimodal collection of sensor, video, weather, and GIS data designed to support research in intelligent transportation systems, road condition monitoring, and machine-learning-based anomaly detection. This dataset integrates synchronized smartphone sensor data (accelerometer, gyroscope, magnetometer, GPS) with video annotations, weather, and geospatial information to accurately identify and classify road surface anomalies, including bumps, potholes, and normal road segments.

The dataset comprises 103 data sessions organized in a hierarchical structure to facilitate flexible access and multi-level analysis. It is divided into four main components: Raw Data, Combined CSV with GIS and Weather Data, Isolated Data, and GIS Data. Each session folder contains all corresponding sensor CSV files, including both calibrated and uncalibrated readings from the accelerometer, gyroscope, magnetometer, barometer, compass, gravity, and GPS sensors, along with annotation and metadata files. Within every session, a dedicated camera subfolder holds annotation data and a text file linking to the corresponding video stored on Google Drive, allowing researchers to access complete recordings without manual segmentation.

The merged CSV files combine synchronized sensor, GIS, and weather information (temperature, humidity, wind speed, and atmospheric pressure) with a sampling interval of 0.01 seconds, ensuring high temporal resolution. The Isolated Data folder further separates normal and anomaly samples to enable focused comparative analysis, while the GIS Data folder contains QGIS and elevation files for spatial and topographical visualization.

This organization keeps sensor, video, geographic, and environmental data together, supporting efficient navigation and in-depth multimodal research.

The raw data are hosted separately on Google Drive and can be accessed via the following link: https://drive.google.com/drive/folders/16tRSgXy6bjgIcJZzdw3U5unw7jpsKAHB?usp=drive_link
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
SpaceNet LLC is a nonprofit organization dedicated to accelerating open source, artificial intelligence applied research for geospatial applications, specifically foundational mapping (i.e. building footprint & road network detection).
I have been experimenting with SAR image segmentation for the past few months and would like to share this high-quality dataset with the Kaggle community. It is the data from the SpaceNet 6 challenge and is freely available in AWS Open Data. This dataset contains only the training split; if you are interested in the testing split (SAR only) or the expanded SAR and optical dataset, you should follow the steps and download it from AWS S3. I share the dataset here to cut out the download steps and to make use of Kaggle's powerful cloud computing.
This openly-licensed dataset features a unique combination of half-meter Synthetic Aperture Radar (SAR) imagery from Capella Space and half-meter electro-optical (EO) imagery from Maxar.
Image (sar image1): https://miro.medium.com/max/267/1*rqZ_qb_gN2voJC7YEqOFuQ.png
Image (rgb1): https://miro.medium.com/max/267/1*lM3Oj6wqfjhqI2o4SpngOQ.png
Image (sar image2): https://miro.medium.com/max/334/1*lVzH0w8_GVIyZHSFUczbHw.png
Image (rgb2): https://miro.medium.com/max/334/1*OYmAog0U9OGrScHFoHYqAQ.png
SAR data are provided by Capella Space via an aerial mounted sensor collecting 204 individual image strips from both north and south facing look-angles. Each of the image strips features four polarizations (HH, HV, VH, and VV) of data and is preprocessed to display the intensity of backscatter in decibel units at half-meter spatial resolution.
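For readers unfamiliar with decibel scaling, the relationship between linear backscatter intensity and decibel units can be sketched as below. The SpaceNet 6 strips are already delivered in decibels, so this is only an illustration of the usual 10 * log10 convention for power-like quantities; the function names and the small floor value used to avoid log(0) are assumptions.

```python
import numpy as np


def intensity_to_db(intensity, floor=1e-10):
    """Convert linear intensity to decibels: 10 * log10(intensity).

    Values are clipped to a small positive floor (an assumption) to avoid log(0).
    """
    intensity = np.asarray(intensity, dtype=np.float64)
    return 10.0 * np.log10(np.maximum(intensity, floor))


def db_to_intensity(decibels):
    """Inverse conversion from decibels back to linear intensity."""
    return 10.0 ** (np.asarray(decibels, dtype=np.float64) / 10.0)


if __name__ == "__main__":
    linear = np.array([1.0, 10.0, 100.0])
    print(intensity_to_db(linear))
```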
The 48k building footprint annotations are provided by the 3D Basisregistratie Adressen en Gebouwen (3DBAG) dataset with some additional quality control. The annotations also include statistics of building heights derived from a digital elevation model.
Image (building footprints): https://miro.medium.com/max/500/1*x5VCNbYLjUmxjiLiT9jrYA.png
Shermeyer, J., Hogan, D., Brown, J., Etten, A.V., Weir, N., Pacifici, F., Hänsch, R., Bastidas, A., Soenen, S., Bacastow, T.M., & Lewis, R. (2020). SpaceNet 6: Multi-Sensor All Weather Mapping Dataset. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 768-777. Arxiv paper
SAR imagery can be an answer to disaster analysis or frequent earth monitoring thanks to its active sensor, which images day and night and through any cloud coverage. But SAR images have their own challenges and, unlike optical images, require a trained eye to interpret. Moreover, the launch of new high-resolution SAR satellites will yield massive quantities of earth observation data. Just like with any modern computer vision problem, this looks like a job for a deep learning model.
The Hydrology Feature Dataset contains photogrammetrically compiled water drainage features and structures including rivers, streams, drainage canals, locks, dams, lakes, ponds, reservoirs and mooring cells. Rivers, Lakes, Ponds, Reservoirs, Hidden Lakes, Reservoirs or Ponds: A feature greater than 25 feet and less than 30 feet wide is captured as a double line stream. A feature greater than 30 feet wide is captured as a river. Lakes are large standing bodies of water greater than 5 acres in size. Ponds are large standing bodies of water greater than 1 acre and less than 5 acres in size. Polygons are created from Stream edges and River Edges. The Ohio River, Monongahela River and Allegheny River are coded as Major Rivers. All other River and Stream polygons are coded as River. If a stream is less than 25 feet wide it is placed as a single line and coded as a Stream. Both sides of the stream are digitized and coded as a Stream for Streams whose width is greater than 25 feet. River edges are digitized and coded as River.
A Drainage Canal is a manmade or channelized hydrographic feature. Drainage Canals are differentiated from streams in that drainage canals have had the sides and/or bottom stabilized to prevent erosion for the predominant length of the feature. Streams may have had some stabilization done, but are primarily in a natural state. Lakes are large standing bodies of water greater than five acres in size. Ponds are large standing bodies of water greater than one acre in size and less than five acres in size. Reservoirs are manmade embankments of water. Included in this definition are both covered and uncovered water tanks. Reservoirs that are greater than one acre in size are digitized. Hidden Streams, Hidden Rivers and Hidden Drainage Canal or Culverts are those areas of drainage where the water flows through a manmade facility such as a culvert. Hydrology Annotation is not being updated but will be preserved. If a drainage feature has been removed, as apparent on the aerial photography, the associated drainage name annotation will be removed. A Mooring Cell is a structure to which tows can tie off while awaiting lockage. They are normally constructed of concrete and steel and are anchored to the river bottom by means of gravity or sheet piling.
Mooring Cells do not currently exist in the Allegheny County dataset but will be added. Locks are devices that are used to control flow or access to a hydrologic feature. The edges of the Lock are captured. Dams are devices that are used to hold or delay the natural flow of water. The edges of the Dam are shown.
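The width and area thresholds above can be summarized as simple rules. The sketch below is illustrative only: the function names and return labels are hypothetical, and the description does not say how features exactly at 25 feet, 30 feet, or 5 acres are treated, so the boundary handling here is an assumption.

```python
def classify_watercourse(width_ft):
    """Classify a flowing-water feature by its width in feet."""
    if width_ft < 25:
        return "Stream (single line)"
    if width_ft <= 30:  # assumption: exactly 25 or 30 feet falls in this band
        return "Stream (double line)"
    return "River"


def classify_standing_water(area_acres):
    """Classify a standing body of water by its area in acres."""
    if area_acres > 5:
        return "Lake"
    if area_acres > 1:  # assumption: exactly 5 acres counts as a pond
        return "Pond"
    return None  # one acre or less is not covered by the definitions above


if __name__ == "__main__":
    for width in (10, 27, 45):
        print(width, classify_watercourse(width))
    for area in (0.5, 3, 12):
        print(area, classify_standing_water(area))
```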
This dataset is harvested on a weekly basis from Allegheny County’s GIS data portal. The full metadata record for this dataset can also be found on Allegheny County's GIS portal. You can access the metadata record and other resources on the GIS portal by clicking on the “Explore” button (and choosing the "Go to resource" option) to the right of the "ArcGIS Open Dataset" text below.
Category: Environment
Department: Geographic Information Systems Group; Department of Administrative Services
Data Notes: Coordinate System: Pennsylvania State Plane South Zone 3702; U.S. Survey Foot
Development Notes: Original Lakes and Drainage datasets combined to create this layer. Data was updated as a result of a flyover in the spring of 2004. A database field named "Update Year" has been defined for all map features. This database field will define which dataset provided each map feature. Map features from the current map will be set to "2004". Map features from the earlier dataset, used to supplement the area near the county boundary, will be set to "1993". All new or modified map data will have the value for "Update Year" set to "2004".
Data Dictionary: https://docs.google.com/spreadsheets/d/16BWrRkoPtq2ANRkrbG7CrfQk2dUsWRiaS2Ee1mTn7l0/edit?usp=sharing
https://www.statsndata.org/how-to-order
The 3D Point Cloud Annotation Services market has emerged as a pivotal segment within the realms of computer vision, artificial intelligence, and geospatial technologies, addressing the increasing demand for accurate data interpretation across various industries. As enterprises strive to leverage 3D data for enhance
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
“Mobile mapping data” or “geospatial videos”, as a technology that combines GPS data with videos, were collected from the windshield of vehicles with Android Smartphones. Nearly 7,000 videos with an average length of 70 seconds were recorded in 2019. The smartphones collected sensor data (longitude and latitude, accuracy, speed and bearing) approximately every second during the video recording.
Based on the geospatial videos, we manually identified and labeled about 10,000 parking violations in the data with the help of an annotation tool. For this purpose, we defined six categorical variables (see PDF). Besides parking violations, we included street features like street category, type of bicycle infrastructure, and direction of parking spaces. An example of a street category is the collector street, which is an access street with primary residential use as well as individual shops and community facilities. The labeling is a step that can (partly) be done automatically with image recognition in the future if the labeled data is used as a training dataset for a machine learning model.
https://www.bmvi.de/SharedDocs/DE/Artikel/DG/mfund-projekte/parkright.html
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
GIs present (+) and absent (−). *Functional classification data obtained from the Burkholderia pseudomallei K96243 genome annotation from Pathema Bioinformatics Resource Center [28].
Facebook
TwitterAttribution-NonCommercial 4.0 (CC BY-NC 4.0)https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Today, deep neural networks are widely used in many computer vision problems, also for geographic information systems (GIS) data. This type of data is commonly used for urban analyzes and spatial planning. We used orthophotographic images of two residential districts from Kielce, Poland for research including urban sprawl automatic analysis with Transformer-based neural network application.Orthophotomaps were obtained from Kielce GIS portal. Then, the map was manually masked into building and building surroundings classes. Finally, the ortophotomap and corresponding classification mask were simultaneously divided into small tiles. This approach is common in image data preprocessing for machine learning algorithms learning phase. Data contains two original orthophotomaps from Wietrznia and Pod Telegrafem residential districts with corresponding masks and also their tiled version, ready to provide as a training data for machine learning models.Transformed-based neural network has undergone a training process on the Wietrznia dataset, targeted for semantic segmentation of the tiles into buildings and surroundings classes. After that, inference of the models was used to test model's generalization ability on the Pod Telegrafem dataset. The efficiency of the model was satisfying, so it can be used in automatic semantic building segmentation. Then, the process of dividing the images can be reversed and complete classification mask retrieved. This mask can be used for area of the buildings calculations and urban sprawl monitoring, if the research would be repeated for GIS data from wider time horizon.Since the dataset was collected from Kielce GIS portal, as the part of the Polish Main Office of Geodesy and Cartography data resource, it may be used only for non-profit and non-commertial purposes, in private or scientific applications, under the law "Ustawa z dnia 4 lutego 1994 r. o prawie autorskim i prawach pokrewnych (Dz.U. z 2006 r. nr 90 poz 631 z późn. zm.)". 
There are no other legal or ethical considerations regarding reuse potential.

Data information is presented below.
wietrznia_2019.jpg - orthophotomap of Wietrznia district - used for model's training, as an explanatory image
wietrznia_2019.png - classification mask of Wietrznia district - used for model's training, as a target image
wietrznia_2019_validation.jpg - one image from Wietrznia district - used for model's validation during training phase
pod_telegrafem_2019.jpg - orthophotomap of Pod Telegrafem district - used for model's evaluation after training phase
wietrznia_2019 - folder with wietrznia_2019.jpg (image) and wietrznia_2019.png (annotation) images, divided into 810 tiles (512 x 512 pixels each); tiles with no information were manually removed, so the training data would contain only informative tiles - tiles were presented to the model during training (images and annotations for fitting the model to the data)
wietrznia_2019_vaidation - folder with wietrznia_2019_validation.jpg image divided into 16 tiles (256 x 256 pixels each) - tiles were presented to the model during training (images for validating the model's efficiency); it was not part of the training data
pod_telegrafem_2019 - folder with pod_telegrafem.jpg image divided into 196 tiles (256 x 256 pixels each) - tiles were presented to the model during inference (images for evaluating the model's robustness)

Dataset was created as described below.
Firstly, the orthophotomaps were collected from Kielce Geoportal (https://gis.kielce.eu). Kielce Geoportal offers a .pst recent map from April 2019. It is an orthophotomap with a resolution of 5 x 5 pixels, constructed from photos taken during a plane flight at 700 meters above ground level, with a camera for vertical photos.
Downloading was done via WMS in the open-source QGIS software (https://www.qgis.org), as a 1:500 scale map, which was then converted to a 1200 dpi PNG image.

Secondly, the map of the Wietrznia residential district was manually labelled, also in QGIS, over the same extent as the orthophotomap. The annotation was based on land cover map information, also obtained from Kielce Geoportal. There are two classes: residential building and surrounding. The second map, of the Pod Telegrafem district, was not annotated, since it was used in the testing phase and imitates a situation where no annotation exists for new data presented to the model.

Next, the images were converted to RGB JPG images, and the annotation map was converted to an 8-bit GRAY PNG image.

Finally, the Wietrznia data files were tiled into 512 x 512 pixel tiles using the Python PIL library. Tiles with no information or a relatively small amount of information (only white background or mostly white background) were manually removed. As a result, from the 29113 x 15938 pixel orthophotomap, only 810 tiles with corresponding annotations were left, ready to train the machine learning model for the semantic segmentation task. The Pod Telegrafem orthophotomap was tiled with no manual removal, so the 7168 x 7168 pixel orthophotomap yielded 196 tiles with 256 x 256 pixel resolution. There was also an image of one residential building, used for the model's validation during the training phase; it was not part of the training data, but was part of the Wietrznia residential area. It was a 2048 x 2048 pixel orthophotomap, tiled into 16 tiles of 256 x 256 pixels each.
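
The tiling step and its reversal can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' script. The function names (tile_boxes, split_image, stitch_masks, building_area) are hypothetical. Partial tiles at the right and bottom edges are simply skipped, since the description does not say how edges were handled. The mask value that marks buildings and the ground area covered by one pixel are left as parameters, because neither is stated here. Only split_image needs Pillow, the maintained fork of PIL.

```python
from typing import Iterator, List, Tuple

# (left, upper, right, lower), the box convention used by Pillow's Image.crop
Box = Tuple[int, int, int, int]


def tile_boxes(width: int, height: int, tile: int) -> Iterator[Box]:
    """Yield crop boxes for full square tiles in row-major order.

    Assumption: partial tiles at the right and bottom edges are skipped.
    """
    for top in range(0, height - tile + 1, tile):
        for left in range(0, width - tile + 1, tile):
            yield (left, top, left + tile, top + tile)


def split_image(path: str, tile: int) -> list:
    """Cut an image file into tiles with Pillow (imported only when called)."""
    from PIL import Image

    # Large orthophotomaps can exceed Pillow's default size guard.
    Image.MAX_IMAGE_PIXELS = None
    with Image.open(path) as img:
        return [img.crop(box) for box in tile_boxes(img.width, img.height, tile)]


def stitch_masks(tiles: List[List[List[int]]], columns: int) -> List[List[int]]:
    """Reverse the tiling: join equally sized 2D mask tiles (lists of rows),
    given in row-major order, into one full mask."""
    rows: List[List[int]] = []
    for start in range(0, len(tiles), columns):
        band = tiles[start:start + columns]  # one horizontal band of tiles
        for y in range(len(band[0])):
            rows.append([value for t in band for value in t[y]])
    return rows


def building_area(mask: List[List[int]], building_value: int,
                  pixel_area_m2: float) -> float:
    """Building area = number of building pixels times ground area per pixel.

    Both building_value and pixel_area_m2 are assumptions supplied by the
    caller; they are not specified in the dataset description.
    """
    count = sum(row.count(building_value) for row in mask)
    return count * pixel_area_m2
```

For example, building_area(stitch_masks(tiles, columns), building_value, pixel_area_m2) gives the building area of a reassembled mask, where tiles holds the predicted mask tiles in the same row-major order in which tile_boxes produced them.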