U.S. Government Works: https://www.usa.gov/government-works
License information was derived automatically
Extracting useful and accurate information from scanned geologic and other earth science maps is a time-consuming and laborious process involving manual human effort. To address this limitation, the USGS partnered with the Defense Advanced Research Projects Agency (DARPA) to run the AI for Critical Mineral Assessment Competition, soliciting innovative solutions for automatically georeferencing and extracting features from maps. The competition opened for registration in August 2022 and concluded in December 2022. Training and validation data from the map georeferencing challenge are provided here, as well as competition details and a baseline solution. The data were derived from published sources and are provided to the public to support continued development of automated georeferencing and feature extraction tools. References for all maps are included with the data.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The data provided here are part of a Galaxy Training Network tutorial that demonstrates mapping-by-sequencing analysis and represent a subsample of the data used in Sun & Schneeberger, 2015 (DOI:10.1007/978-1-4939-2444-8_19).
This training, developed by UNEP, covers the basics of Google Earth Pro, including how to search for locations and create data. Google Earth Pro is a useful tool for participatory mapping processes.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
LAWRENCE — Fall break is barely behind us, but a group of University of Kansas students has just finished an innovative eight-week course in using drones to develop aerial maps. Over the past two months, they’ve visited sites in KU's West District and at the Baker Wetlands, taking still images and videos over those areas. “The drone mapping course has been excellent in providing a hands-on experience with the drones,” said Siddharth Shankar, graduate student from Lucknow, India. “The course has focused not just on drones and how to fly them but also has made us aware of the FAA rules and regulations about drone flying and safety precautions. “My research has been in glaciology, with the study of icebergs in Greenland. The drone mapping course has provided new insights into incorporating it with my research in the near future.” The course, offered annually during the fall semester, is designed to teach students about the rapidly growing technology of small unmanned aerial systems, referred to as drones, and its wide-ranging applications — which include search-and-rescue, real estate and environmental monitoring. Students in the course come from a variety of disciplines including geography & atmospheric science, geology, ecology & evolutionary biology and civil engineering. Enthusiasm for the course has been very high, and it has filled rapidly each time it has been offered.
U.S. Government Works: https://www.usa.gov/government-works
License information was derived automatically
These data were compiled for the use of training natural feature machine learning (GeoAI) detection and delineation. The natural feature classes include the Geographic Names Information System (GNIS) feature types Basins, Bays, Bends, Craters, Gaps, Guts, Islands, Lakes, Ridges and Valleys, and are an areal representation of those GNIS point features. Features were produced using heads-up digitizing from 2018 to 2019 by Dr. Sam Arundel's team at the U.S. Geological Survey, Center of Excellence for Geospatial Information Science, Rolla, Missouri, USA, and Dr. Wenwen Li's team in the School of Geographical Sciences at Arizona State University, Tempe, Arizona, USA.
This dataset presents the data underlying the interactive map of all training courses accessible via Parcoursup in 2020, 2021, 2022 and 2023 (‘https://dossier.parcoursup.fr/Candidat/carte’). The 2024 data will be added progressively until 17 January. This dataset is updated daily.
https://spdx.org/licenses/CC0-1.0.html
For the purposes of training AI-based models to identify (map) road features in rural/remote tropical regions on the basis of true-colour satellite imagery, and subsequently testing the accuracy of these AI-derived road maps, we produced a dataset of 8904 satellite image ‘tiles’ and their corresponding known road features across Equatorial Asia (Indonesia, Malaysia, Papua New Guinea).
METHODS
The main dataset shared here was derived from a set of 200 input satellite images, also provided here. These 200 images are effectively ‘screenshots’ (i.e., reduced-resolution copies) of high-resolution true-colour satellite imagery (~0.5-1m pixel resolution) observed using the Elvis Elevation and Depth spatial data portal (https://elevation.fsdf.org.au/), which here is functionally equivalent to the more familiar Google Earth. Each of these original images was initially acquired at a resolution of 1920x886 pixels. Actual image resolution was coarser than the native high-resolution imagery. Visual inspection of these 200 images suggests a pixel resolution of ~5 meters, given the number of pixels required to span features of familiar scale, such as roads and roofs, as well as the ready discrimination of specific land uses, vegetation types, etc. These 200 images generally spanned either forest-agricultural mosaics or intact forest landscapes with limited human intervention. Sloan et al. (2023) present a map indicating the various areas of Equatorial Asia from which these images were sourced.
IMAGE NAMING CONVENTION
A common naming convention applies to satellite images’ file names:
XX##.png
where:
XX – denotes the geographical region / major island of Equatorial Asia of the image, as follows: ‘bo’ (Borneo), ‘su’ (Sumatra), ‘sl’ (Sulawesi), ‘pn’ (Papua New Guinea), ‘jv’ (Java), ‘ng’ (New Guinea [i.e., Papua and West Papua provinces of Indonesia])
INTERPRETING ROAD FEATURES IN THE IMAGES
Road features in each of the 200 input satellite images were visually interpreted and manually digitized to create a reference image dataset by which to train, validate, and test AI road-mapping models, as detailed in Sloan et al. (2023). The reference dataset of road features was digitized using the ‘pen tool’ in Adobe Photoshop. The pen’s ‘width’ was held constant over varying scales of observation (i.e., image ‘zoom’) during digitization. Consequently, at relatively small scales at least, digitized road features likely incorporate vegetation immediately bordering roads. The resultant binary (Road / Not Road) reference images were saved as PNG images with the same image dimensions as the original 200 images.
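As a rough illustration of how such a reference image can be consumed downstream, the short Python sketch below loads one of these PNG files and recovers the binary Road / Not Road mask; the file name and the greyscale threshold are illustrative assumptions, not part of the dataset documentation.

    # Minimal sketch: load a road-reference PNG and recover the binary Road / Not Road mask.
    # The file name and the 128 threshold are illustrative assumptions.
    import numpy as np
    from PIL import Image

    ref = Image.open("bo12_reference.png").convert("L")   # greyscale copy of the reference image
    mask = np.asarray(ref) > 128                          # True = Road, False = Not Road
    print(mask.shape, mask.mean())                        # image dimensions and fraction of road pixels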
IMAGE TILES AND REFERENCE DATA FOR MODEL DEVELOPMENT
The 200 satellite images and the corresponding 200 road-reference images were both subdivided (aka ‘sliced’) into thousands of smaller image ‘tiles’ of 256x256 pixels each. Subsequent to image subdivision, subdivided images were also rotated by 90, 180, or 270 degrees to create additional, complementary image tiles for model development. In total, 8904 image tiles resulted from image subdivision and rotation. These 8904 image tiles are the main data of interest disseminated here. Each image tile entails the true-colour satellite image (256x256 pixels) and a corresponding binary road reference image (Road / Not Road).
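The subdivision and rotation steps could be sketched roughly as follows; the tile naming, coordinate order, and rotation choices shown here are assumptions for illustration rather than the exact procedure of Sloan et al. (2023).

    # Minimal sketch: slice a full-size satellite image into 256x256 tiles and add rotated copies.
    # The file name, the coordinate order in tile names, and the rotation set are assumptions.
    from PIL import Image

    img = Image.open("bo12.png")
    tiles = []
    for x in range(0, img.width - 255, 256):
        for y in range(0, img.height - 255, 256):
            tile = img.crop((x, y, x + 256, y + 256))
            tiles.append((f"bo12_{x}_{y}_{x + 256}_{y + 256}", tile))
            for deg in (90, 180, 270):                      # complementary rotated tiles
                tiles.append((f"bo12_{x}_{y}_{x + 256}_{y + 256}rot{deg}", tile.rotate(deg)))
    print(len(tiles))                                       # tiles produced from one input image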
Of these 8904 image tiles, Sloan et al. (2023) randomly selected 80% for model training (during which a model ‘learns’ to recognize road features in the input imagery), 10% for model validation (during which model parameters are iteratively refined), and 10% for final model testing (during which the final accuracy of the output road map is assessed). Here we present these data in two folders accordingly:
‘Training’ – contains 7124 image tiles used for model training in Sloan et al. (2023), i.e., 80% of the original pool of 8904 image tiles.
‘Testing’ – contains 1780 image tiles used for model validation and model testing in Sloan et al. (2023), i.e., 20% of the original pool of 8904 image tiles, being the combined set of image tiles for model validation and testing in Sloan et al. (2023).
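A random 80/10/10 split of this kind might look like the following sketch; the placeholder tile identifiers and random seed are assumptions, and the exact counts reported above (7124/1780) depend on how the split was rounded.

    # Minimal sketch: randomly split 8904 tile identifiers into 80% / 10% / 10% subsets.
    # Placeholder identifiers and the seed are illustrative assumptions.
    import random

    tile_ids = [f"tile_{i:04d}" for i in range(8904)]
    random.Random(42).shuffle(tile_ids)

    n_train = int(0.8 * len(tile_ids))
    n_val = int(0.1 * len(tile_ids))
    train = tile_ids[:n_train]
    val = tile_ids[n_train:n_train + n_val]
    test = tile_ids[n_train + n_val:]
    print(len(train), len(val), len(test))   # approximately 7123 / 890 / 891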
IMAGE TILE NAMING CONVENTION
A common naming convention applies to image tiles’ directories and file names, in both the ‘training’ and ‘testing’ folders:
XX##_A_B_C_DrotDDD
where:
XX – denotes the geographical region / major island of Equatorial Asia of the original input 1920x886 pixel image, as follows: ‘bo’ (Borneo), ‘su’ (Sumatra), ‘sl’ (Sulawesi), ‘pn’ (Papua New Guinea), ‘jv’ (Java), ‘ng’ (New Guinea [i.e., Papua and West Papua provinces of Indonesia])
A, B, C and D – can all be ignored. These values, each one of 0, 256, 512, 768, 1024, 1280, 1536, and 1792, are effectively ‘pixel coordinates’ in the corresponding original 1920x886-pixel input image. They were recorded within image tiles’ sub-directory and file names merely to ensure that those names were unique.
rot – implies an image rotation. Not all image tiles are rotated, so ‘rot’ will appear only occasionally.
DDD – denotes the degree of image-tile rotation, e.g., 90, 180, 270. Not all image tiles are rotated, so ‘DDD’ will appear only occasionally.
Note that the designator ‘XX##’ is directly equivalent to the filenames of the corresponding 1920x886-pixel input satellite images, detailed above. Therefore, each image tile can be ‘matched’ with its parent full-scale satellite image. For example, in the ‘training’ folder, the subdirectory ‘Bo12_0_0_256_256’ indicates that the image tile therein (also named ‘Bo12_0_0_256_256’) would have been sourced from the full-scale image ‘Bo12.png’.
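Programmatically, the parent image can be recovered from a tile name with a small parser such as the sketch below; the regular expression is inferred from the convention described above and is an assumption, not part of the dataset.

    # Minimal sketch: parse a tile name such as 'Bo12_0_0_256_256' or 'Bo12_0_0_256_256rot90'
    # and recover the parent image file name plus any rotation. The pattern is inferred
    # from the naming convention above and may not cover every edge case.
    import re

    def parent_image(tile_name):
        m = re.match(r"([A-Za-z]{2}\d+)_\d+_\d+_\d+_\d+(?:rot(\d+))?$", tile_name)
        if m is None:
            raise ValueError(f"unexpected tile name: {tile_name}")
        return m.group(1) + ".png", m.group(2)   # parent file name, rotation in degrees (or None)

    print(parent_image("Bo12_0_0_256_256"))        # ('Bo12.png', None)
    print(parent_image("Bo12_0_0_256_256rot90"))   # ('Bo12.png', '90')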
Quivira National Wildlife Refuge was established in 1955, but a detailed vegetation map was not available for management purposes. With the present development of a biological program and Comprehensive Conservation Plan (CCP), a baseline vegetation map of the refuge was identified as a necessity. Development of the vegetation map and associated report was a multi-step process. Aerial photography (NAIP, 2008) was used with eCognition to create polygons of different plant communities based on the likeness of surrounding pixels in the area. Prior to ground-truthing, the following activities were accomplished: training on vegetation mapping using GIS (previous experience and National Conservation Training Center course), creation of a vegetation association and alliance dichotomous key, development of a refuge plant key and identification skills, and preparation of maps for ground truthing. Once out in the field, dominant plants were identified for appropriate vegetation alliance and association classification, plant specimens were collected for the refuge herbarium as necessary, and additional observations and photos were gathered for the report. Over the course of the project, classification data were entered into a GIS and polygons were appropriately modified to create the final map. At Quivira, the results identified a total of 42 alliances and 43 associations. The most dominant plants throughout the refuge in 2008, based on canopy cover, were saltgrass, plum, little bluestem and cottonwood. The number of alliances and associations found on the refuge indicates high species diversity.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This upload contains data required to replicate a tutorial that applies regression-based unmixing of spectral-temporal metrics for sub-pixel land cover mapping with synthetically created training data. The tutorial uses the Framework for Operational Radiometric Correction for Environmental monitoring.
This dataset contains intermediate and final results of the workflow described in that tutorial as well as auxiliary data such as parameter files.
Please refer to the above mentioned tutorial for more information.
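For readers unfamiliar with the general idea of regression-based unmixing with synthetic training data, the sketch below illustrates it in plain Python with scikit-learn; it is an assumed, simplified stand-in and not the FORCE workflow used in the tutorial.

    # Minimal sketch of regression-based unmixing with synthetically generated training data.
    # The spectra, linear mixing model, noise level, and regressor are illustrative assumptions,
    # not the FORCE implementation referenced above.
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    rng = np.random.default_rng(0)
    n_bands, n_samples = 10, 2000
    pure = rng.uniform(0, 1, size=(3, n_bands))              # three made-up class spectra

    fractions = rng.dirichlet(np.ones(3), size=n_samples)    # synthetic sub-pixel fractions
    mixtures = fractions @ pure + rng.normal(0, 0.01, size=(n_samples, n_bands))

    model = RandomForestRegressor(n_estimators=100).fit(mixtures, fractions[:, 0])
    print(model.predict(mixtures[:5]))                        # predicted fraction of class 0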
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Sensitivity and specificity of global native T1 value and pattern-derived texture features to identify subjects in the three cohorts (i.e. control, HCM, and DCM) using multiclass linear SVM.
Accurate, high-resolution maps of bedrock outcrops are extremely valuable. The increasing availability of high-resolution imagery can be coupled with machine learning techniques to improve regional bedrock maps. This data release contains training data created for developing a machine learning model capable of identifying exposed bedrock across the entire Sierra Nevada Mountains (California, USA). The training data consist of 20 thematic rasters in GeoTIFF format, where image labels represent three categories: rock, not rock, and no data. These training data labels were created using 0.6-m imagery from the National Agriculture Imagery Program (NAIP) acquired in 2016. Eight existing labeled sites were available from Petliak et al. (2019), an earlier effort. We further revised those labels for improved accuracy and created an additional 12 reference sites following the same protocol of semi-manual mapping as in Petliak et al. (2019). A machine learning model (https://github.com/nasa/delta) was trained and tested based on these image labels as detailed in Shastry et al. (in review). The trained model was then used to map exposed bedrock across the entire Sierra Nevada region using 2016 NAIP imagery, and this data release also includes these model outputs. The model output gives the likelihood (from 0 to 255) that each pixel is bedrock, rather than a direct binary classification. The associated publication used a threshold of 50%, or pixel value 127: all pixels with values of 127 or higher are classified as rock, and all lower values as not rock.
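For example, converting the 0-255 likelihood raster to the binary rock / not-rock classification described above amounts to a simple threshold; the file name below is an assumption, and any GeoTIFF reader would work equally well.

    # Minimal sketch: threshold the 0-255 bedrock likelihood raster at 127 (roughly 50%).
    # The input file name is an assumption for illustration.
    import numpy as np
    import rasterio

    with rasterio.open("bedrock_likelihood.tif") as src:
        likelihood = src.read(1)

    is_rock = likelihood >= 127     # True where the pixel is classified as exposed bedrock
    print(np.mean(is_rock))         # fraction of pixels mapped as rock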
The dataset depicts the authoritative locations of the most commonly known Department of Defense (DoD) sites, installations, ranges, and training areas world-wide. These sites encompass land which is federally owned or otherwise managed. This dataset was created from source data provided by the four Military Service Component headquarters and was compiled by the Defense Installation Spatial Data Infrastructure (DISDI) Program within the Office of the Assistant Secretary of Defense for Energy, Installations, and Environment. Only sites reported in the BSR or released in a map supplementing the Foreign Investment Risk Review Modernization Act of 2018 (FIRRMA) Real Estate Regulation (31 CFR Part 802) were considered for inclusion. This list does not necessarily represent a comprehensive collection of all Department of Defense facilities. For inventory purposes, installations are composed of sites, where a site is defined as a specific geographic location of federally owned or managed land and is assigned to a military installation. DoD installations are commonly referred to as a base, camp, post, station, yard, center, homeport facility for any ship, or other activity under the jurisdiction, custody, or control of the DoD. While every attempt has been made to provide the best available data quality, this data set is intended for use at mapping scales between 1:50,000 and 1:3,000,000. For this reason, boundaries in this data set may not perfectly align with DoD site boundaries depicted in other federal data sources. Maps produced at a scale of 1:50,000 or smaller that otherwise comply with National Map Accuracy Standards will remain compliant when this data is incorporated. Boundary data is most suitable for larger scale maps; point locations are better suited for mapping scales between 1:250,000 and 1:3,000,000. If a site is part of a Joint Base (effective/designated on 1 October 2010) as established under the 2005 Base Realignment and Closure process, it is attributed with the name of the Joint Base. All sites comprising a Joint Base are also attributed to the responsible DoD Component, which is not necessarily the pre-2005 Component responsible for the site.
This dataset offers a comprehensive collection of Telegram users' geolocation data, including IP addresses, with full user consent, covering 50,000 records. This data is specifically tailored for use in AI, ML, DL, and LLM models, as well as applications requiring Geographic Data and Social Media Data. The dataset provides critical geospatial information, making it a valuable resource for developing location-based services, targeted marketing strategies, and more.
What Makes This Data Unique? This dataset is unique due to its focus on geolocation data tied to Telegram users, a platform with a global user base. It includes IP to Geolocation Data, offering precise geospatial insights that are essential for accurate geographic analysis. The inclusion of user consent ensures that the data is ethically sourced and legally compliant. The dataset's broad coverage across various regions makes it particularly valuable for AI and machine learning models that require diverse, real-world data inputs.
Data Sourcing: The data is collected through a network of in-app tasks across different mini-apps within Telegram. Users participate in these tasks voluntarily, providing explicit consent to share their geolocation and IP information. The data is collected in real-time, capturing accurate geospatial details as users interact with various Telegram mini-apps. This method of data collection ensures that the information is both relevant and up-to-date, making it highly valuable for applications that require current location data.
Primary Use-Cases: This dataset is highly versatile and can be applied across multiple categories, including:
IP to Geolocation Data: The dataset provides precise mapping of IP addresses to geographical locations, making it ideal for applications that require accurate geolocation services.
Geographic Data: The geospatial information contained in the dataset supports a wide range of geographic analysis, including regional behavior studies and location-based service optimization.
Social Media Data: The dataset's integration with Telegram users' activities provides insights into social media behaviors across different regions, enhancing social media analytics and targeted marketing.
Large Language Model (LLM) Data: The geolocation data can be used to train LLMs to better understand and generate content that is contextually relevant to specific regions.
Deep Learning (DL) Data: The dataset is ideal for training deep learning models that require accurate and diverse geospatial inputs, such as those used in autonomous systems and advanced geographic analytics.
Integration with Broader Data Offering: This geolocation dataset is a valuable addition to the broader data offerings from FileMarket. It can be combined with other datasets, such as web browsing behavior or social media activity data, to create comprehensive AI models that provide deep insights into user behaviors across different contexts. Whether used independently or as part of a larger data strategy, this dataset offers unique value for developers and data scientists focused on enhancing their models with precise, consented geospatial data.
A very high spatial resolution Land Use and Land Cover map was produced for the greater Marino watershed (Peru) using the MORINGA processing chain. The methods involved multisource satellite imagery and a random forest model, as well as manual post-treatment. The final map provides important information for environmental management and monitoring and contributes to developing standardized methodologies for accurate LULC mapping.
Training Dataset
The Sonoma County fine scale vegetation and habitat map is an 83-class vegetation map of Sonoma County with 212,391 polygons. The fine scale vegetation and habitat map represents the state of the landscape in 2013 and adheres to the National Vegetation Classification System (NVC). The map was designed to be used at scales of 1:5,000 and smaller. This layer file is just to be used for symbology - no spatial data is included. For the spatial data, download the veg map layer package, file geodatabase, or shapefile. The full datasheet for this product is available here: https://sonomaopenspace.egnyte.com/dl/qOm3JEb3tD. Class definitions, as well as a dichotomous key for the map classes, can be found in the Sonoma Vegetation and Habitat Map Key (https://sonomaopenspace.egnyte.com/dl/xObbaG6lF8). The fine scale vegetation and habitat map was created using semi-automated methods that include field work, computer-based machine learning, and manual aerial photo interpretation. The vegetation and habitat map was developed by first creating a lifeform map, an 18-class map that served as a foundation for the fine-scale map. The lifeform map was created using “expert systems” rulesets in Trimble Ecognition. These rulesets combine automated image segmentation (stand delineation) with object based image classification techniques. In contrast with machine learning approaches, expert systems rulesets are developed heuristically based on the knowledge of experienced image analysts. Key data sets used in the expert systems rulesets for lifeform included: orthophotography (’11 and ’13), the LiDAR derived Canopy Height Model (CHM), and other LiDAR derived landscape metrics. After it was produced using Ecognition, the preliminary lifeform map product was manually edited by photo interpreters. Manual editing corrected errors where the automated methods produced incorrect results. Edits were made to correct two types of errors: 1) unsatisfactory polygon (stand) delineations and 2) incorrect polygon labels. The mapping team used the lifeform map as the foundation for the finer scale and more floristically detailed Fine Scale Vegetation and Habitat map. For example, a single polygon mapped in the lifeform map as forest might be divided into four polygons in the fine scale map, including redwood forest, Douglas-fir forest, Oregon white oak forest, and bay forest. The fine scale vegetation and habitat map was developed using a semi-automated approach. The approach combines Ecognition segmentation, extensive field data collection, machine learning, manual editing, and expert review. Ecognition segmentation results in a refinement of the lifeform polygons. Field data collection results in a large number of training polygons labeled with their field-validated map class. Machine learning relies on the field collected data as training data and a stack of GIS datasets as predictor variables. The resulting model is used to create automated fine-scale labels countywide. Machine learning algorithms for this project included both Random Forests and Support Vector Machines (SVMs).
Machine learning is followed by extensive manual editing, which is used to 1) edit segment (polygon) labels when they are incorrect and 2) edit segment (polygon) shape when necessary. The map classes in the fine scale vegetation and habitat map generally correspond to the alliance level of the National Vegetation Classification, but some map classes - especially riparian vegetation and herbaceous types - correspond to higher levels of the hierarchy (such as group or macrogroup).
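The general structure of the machine-learning step (field-validated polygons supplying labels, a stack of GIS layers supplying predictors) can be sketched as below; the arrays, class count, and model settings are placeholders and assumptions, not the Sonoma County data or workflow.

    # Minimal sketch of the machine-learning step: labels from field-validated training
    # polygons, predictors from a stack of GIS layers. All values are placeholders.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    n_segments, n_predictors = 5000, 12                   # assumed sizes
    X = np.random.rand(n_segments, n_predictors)          # e.g., CHM, spectral, terrain metrics
    y = np.random.randint(0, 83, size=n_segments)         # one of the (assumed) 83 map classes

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
    clf = RandomForestClassifier(n_estimators=200).fit(X_train, y_train)
    print(clf.score(X_test, y_test))                      # accuracy prior to manual editing and review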
Open Government Licence - Canada 2.0: https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
Have you ever wanted to create your own maps, or integrate and visualize spatial datasets to examine changes in trends between locations and over time? Follow along with these training tutorials on QGIS, an open source geographic information system (GIS), and learn key concepts, procedures and skills for performing common GIS tasks – such as creating maps, as well as joining, overlaying and visualizing spatial datasets. These tutorials are geared towards new GIS users. We’ll start with foundational concepts, and build towards more advanced topics throughout – demonstrating how with a few relatively easy steps you can get quite a lot out of GIS. You can then extend these skills to datasets of thematic relevance to you in addressing tasks faced in your day-to-day work.
This is a vector tile service with labels for the fine scale vegetation and habitat map, to be used in web maps and GIS software packages. Labels appear at scales greater than 1:5,000 and show the full Latin name or vegetation group name. At scales smaller than 1:5,000 the abbreviated vegetation class name is displayed. This service is meant to be used in conjunction with the vector tile services of the veg map polygons (either the solid symbology service or the hollow symbology service). The key to map class abbreviations can be found here. The Sonoma County fine scale vegetation and habitat map is an 82-class vegetation map of Sonoma County with 212,391 polygons. The fine scale vegetation and habitat map represents the state of the landscape in 2013 and adheres to the National Vegetation Classification System (NVC). The map was designed to be used at scales of 1:5,000 and smaller. The full datasheet for this product is available here: https://sonomaopenspace.egnyte.com/dl/qOm3JEb3tD. The final report for the fine scale vegetation map, containing methods and an accuracy assessment, is available here: https://sonomaopenspace.egnyte.com/dl/1SWyCSirE9. Class definitions, as well as a dichotomous key for the map classes, can be found in the Sonoma Vegetation and Habitat Map Key (https://sonomaopenspace.egnyte.com/dl/xObbaG6lF8). The fine scale vegetation and habitat map was created using semi-automated methods that include field work, computer-based machine learning, and manual aerial photo interpretation. The vegetation and habitat map was developed by first creating a lifeform map, an 18-class map that served as a foundation for the fine-scale map. The lifeform map was created using “expert systems” rulesets in Trimble Ecognition. These rulesets combine automated image segmentation (stand delineation) with object based image classification techniques. In contrast with machine learning approaches, expert systems rulesets are developed heuristically based on the knowledge of experienced image analysts. Key data sets used in the expert systems rulesets for lifeform included: orthophotography (’11 and ’13), the LiDAR derived Canopy Height Model (CHM), and other LiDAR derived landscape metrics. After it was produced using Ecognition, the preliminary lifeform map product was manually edited by photo interpreters. Manual editing corrected errors where the automated methods produced incorrect results. Edits were made to correct two types of errors: 1) unsatisfactory polygon (stand) delineations and 2) incorrect polygon labels. The mapping team used the lifeform map as the foundation for the finer scale and more floristically detailed Fine Scale Vegetation and Habitat map. For example, a single polygon mapped in the lifeform map as forest might be divided into four polygons in the fine scale map, including redwood forest, Douglas-fir forest, Oregon white oak forest, and bay forest. The fine scale vegetation and habitat map was developed using a semi-automated approach. The approach combines Ecognition segmentation, extensive field data collection, machine learning, manual editing, and expert review. Ecognition segmentation results in a refinement of the lifeform polygons. Field data collection results in a large number of training polygons labeled with their field-validated map class. Machine learning relies on the field collected data as training data and a stack of GIS datasets as predictor variables. The resulting model is used to create automated fine-scale labels countywide.
Machine learning algorithms for this project included both Random Forests and Support Vector Machines (SVMs). Machine learning is followed by extensive manual editing, which is used to 1) edit segment (polygon) labels when they are incorrect and 2) edit segment (polygon) shape when necessary. The map classes in the fine scale vegetation and habitat map generally correspond to the alliance level of the National Vegetation Classification, but some map classes - especially riparian vegetation and herbaceous types - correspond to higher levels of the hierarchy (such as group or macrogroup).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Tentative forest plantation mapping training data set collected by the author team, September 2016, using Google Earth / Collect Earth software
Remotely sensed imagery is increasingly used by emergency managers to monitor and map the impact of flood events to support preparedness, response, and critical decision making throughout the flood event lifecycle. To reduce latency in delivery of imagery-derived information, ensure consistent and reliably derived map products, and facilitate processing of an increasing volume of remote sensing data-streams, automated flood mapping workflows are needed. The U.S. Geological Survey is facilitating the development and integration of machine-learning algorithms in collaboration with NASA, National Geospatial Intelligence Agency (NGA), University of Alabama, and University of Illinois to create a workflow for rapidly generating improved flood-map products. A major bottleneck to the training of robust, generalizable machine learning algorithms for pattern recognition is a lack of training data that is representative across the landscape. To overcome this limitation for the training of algorithms capable of detection of surface inundation in diverse contexts, this publication includes the data developed from MAXAR Worldview sensors that is input as training data for machine learning. This data release consists of 100 thematic rasters, in GeoTiff format, with image labels representing five discrete categories: water, not water, maybe water, clouds and background/no data. Specifically, these training data were created by labeling 8-band, multispectral scenes from the MAXAR-Digital Globe, Worldview-2 and 3 satellite-based sensors. Scenes were selected to be spatially and spectrally diverse and geographically representative of different water features within the continental U.S. The labeling procedures used a hybrid approach of unsupervised classification for the initial spectral clustering, followed by expert-level manual interpretation and QA/QC peer review to finalize each labeled image. Updated versions of the data may be issued along with version update documentation. The 100 raster files that make up the training data are available to download here (https://doi.org/10.5066/P9C7HYRV).
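As a hedged illustration of the initial unsupervised-classification step (before the expert manual interpretation described above), the sketch below clusters the pixels of an 8-band scene; the file name, cluster count, and use of k-means are assumptions rather than the published labeling procedure.

    # Minimal sketch: unsupervised clustering of an 8-band scene as a starting point for labeling.
    # The file name, number of clusters, and k-means itself are illustrative assumptions.
    import numpy as np
    import rasterio
    from sklearn.cluster import KMeans

    with rasterio.open("worldview_scene.tif") as src:
        bands = src.read()                                # shape: (8, rows, cols)

    pixels = bands.reshape(bands.shape[0], -1).T          # one row per pixel, eight features
    clusters = KMeans(n_clusters=10, n_init=10).fit_predict(pixels)
    cluster_map = clusters.reshape(bands.shape[1], bands.shape[2])
    print(np.unique(cluster_map))                         # cluster ids awaiting manual class assignment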
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset was compiled as part of the TIME4CS project, WP4, and lists identified citizen science training resources, as of July 2022.
The EU-citizen.science platform provided the basis for mapping CS training in Europe, as the team behind the platform has put considerable effort into compiling CS training resources and encouraging the CS community to contribute them. Additionally, training courses were identified based on the case studies in WP1, as most universities do not list their courses on the EU-citizen.science platform.