This data release contains the analytical results and evaluated source data files of geospatial analyses for identifying areas in Alaska that may be prospective for different types of lode gold deposits, including orogenic, reduced-intrusion-related, epithermal, and gold-bearing porphyry. The spatial analysis is based on queries of statewide source datasets of aeromagnetic surveys, the Alaska Geochemical Database (AGDB3), the Alaska Resource Data File (ARDF), and the Alaska Geologic Map (SIM3340) within areas defined by 12-digit HUCs (subwatersheds) from the National Watershed Boundary dataset. The packages of files available for download are:
1. LodeGold_Results_gdb.zip - The analytical results in geodatabase polygon feature classes, which contain the scores for each source dataset layer query, the cumulative score, and a designation of high, medium, or low potential and high, medium, or low certainty for a deposit type within the HUC. The data are described by FGDC metadata. An mxd file and cartographic feature classes are provided for display of the results in ArcMap. An included README file describes the complete contents of the zip file.
2. LodeGold_Results_shape.zip - Copies of the results from the geodatabase, provided in shapefile and CSV formats. The included README file describes the complete contents of the zip file.
3. LodeGold_SourceData_gdb.zip - The source datasets in geodatabase and GeoTIFF format. Data layers include aeromagnetic surveys, AGDB3, ARDF, lithology from SIM3340, and HUC subwatersheds. The data are described by FGDC metadata. An mxd file and cartographic feature classes are provided for display of the source data in ArcMap. Also included are the Python scripts used to perform the analyses; users may modify the scripts to design their own analyses. The included README files describe the complete contents of the zip file and explain the usage of the scripts.
4. LodeGold_SourceData_shape.zip - Copies of the geodatabase source dataset derivatives from ARDF and lithology from SIM3340 created for this analysis, provided in shapefile and CSV formats. The included README file describes the complete contents of the zip file.
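The release's actual scripts are in the SourceData package; as a rough, hypothetical sketch of the scoring scheme it describes (per-layer query scores accumulated within each HUC polygon, then binned into potential classes), the following geopandas snippet may help readers orient themselves. The file names, field names, query, and class breaks are illustrative assumptions, not the USGS values.

```python
# Hypothetical sketch of per-HUC scoring; file names, fields, and class
# breaks are placeholders, not the values used in this data release.
import pandas as pd
import geopandas as gpd

hucs = gpd.read_file("huc12_subwatersheds.shp")   # HUC polygons (assumed name)
ardf = gpd.read_file("ardf_sites.shp")            # mineral-site points (assumed name)

# Example layer query: flag sites whose deposit description suggests orogenic gold.
ardf["query_score"] = (
    ardf["DEP_TYPE"].str.contains("orogenic", case=False, na=False).astype(int)
)

# Sum layer scores within each HUC via a spatial join, then accumulate.
joined = gpd.sjoin(ardf, hucs, how="inner", predicate="within")
scores = joined.groupby("index_right")["query_score"].sum()
hucs["cumulative_score"] = hucs.index.map(scores).fillna(0)

# Bin cumulative scores into low / medium / high potential (illustrative breaks).
hucs["potential"] = pd.cut(
    hucs["cumulative_score"],
    bins=[-1, 0, 2, float("inf")],
    labels=["low", "medium", "high"],
)
```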
Open Government Licence - Canada 2.0: https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
Have you ever wanted to create your own maps, or integrate and visualize spatial datasets to examine changes in trends between locations and over time? Follow along with these training tutorials on QGIS, an open-source geographic information system (GIS), and learn key concepts, procedures, and skills for performing common GIS tasks, such as creating maps and joining, overlaying, and visualizing spatial datasets. These tutorials are geared towards new GIS users. We'll start with foundational concepts and build towards more advanced topics throughout, demonstrating how, with a few relatively easy steps, you can get quite a lot out of GIS. You can then extend these skills to datasets of thematic relevance to you in addressing tasks faced in your day-to-day work.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
In this course, you will learn to work within the free and open-source R environment with a specific focus on working with and analyzing geospatial data. We will cover a wide variety of data and spatial data analytics topics, and you will learn how to code in R along the way. The Introduction module provides more background about the course and course setup. This course is designed for someone with some prior GIS knowledge. For example, you should know the basics of working with maps, map projections, and vector and raster data. You should be able to perform common spatial analysis tasks and make map layouts. If you do not have a GIS background, we would recommend checking out the West Virginia View GIScience class. We do not assume that you have any prior experience with R or with coding, so don't worry if you haven't developed these skill sets yet; that is a major goal of this course. Background material will be provided using code examples, videos, and presentations. We have provided assignments to offer hands-on learning opportunities. Data links for the lecture modules are provided within each module, while data for the assignments are linked to the assignment buttons below. Please see the sequencing document for our suggested order in which to work through the material. After completing this course you will be able to:
* prepare, manipulate, query, and generally work with data in R
* perform data summarization, comparisons, and statistical tests
* create quality graphs, map layouts, and interactive web maps to visualize data and findings
* present your research, methods, results, and code as web pages to foster reproducible research
* work with spatial data in R
* analyze vector and raster geospatial data to answer a question with a spatial component
* make spatial models and predictions using regression and machine learning
* code in the R language at an intermediate level
This dataset contains model-based county-level estimates in GIS-friendly format. PLACES covers the entire United States—50 states and the District of Columbia—at county, place, census tract, and ZIP Code Tabulation Area levels. It provides information uniformly on this large scale for local areas at four geographic levels. Estimates were provided by the Centers for Disease Control and Prevention (CDC), Division of Population Health, Epidemiology and Surveillance Branch. Project was funded by the Robert Wood Johnson Foundation in conjunction with the CDC Foundation. Data sources used to generate these model-based estimates are Behavioral Risk Factor Surveillance System (BRFSS) 2022 or 2021 data, Census Bureau 2022 county population estimates, and American Community Survey (ACS) 2018–2022 estimates. The 2024 release uses 2022 BRFSS data for 36 measures and 2021 BRFSS data for 4 measures (high blood pressure, high cholesterol, cholesterol screening, and taking medicine for high blood pressure control among those with high blood pressure) that the survey collects data on every other year. These data can be joined with the census 2022 county boundary file in a GIS system to produce maps for 40 measures at the county level. An ArcGIS Online feature service is also available for users to make maps online or to add data to desktop GIS software. https://cdcarcgis.maps.arcgis.com/home/item.html?id=3b7221d4e47740cab9235b839fa55cd7
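As a rough illustration of the GIS join described above, here is a minimal geopandas sketch; the file names, the 'CountyFIPS'/'GEOID' join keys, and the measure column are assumptions about the released files, not documented specifics.

```python
# Minimal sketch (assumed file and column names) of joining the PLACES
# county CSV to a census county boundary file and mapping one measure.
import pandas as pd
import geopandas as gpd
import matplotlib.pyplot as plt

places = pd.read_csv("places_county_2024.csv", dtype={"CountyFIPS": str})
counties = gpd.read_file("cb_2022_us_county_500k.shp")  # census 2022 boundaries

joined = counties.merge(places, left_on="GEOID", right_on="CountyFIPS")
joined.plot(column="OBESITY_CrudePrev", legend=True)  # choropleth of one measure
plt.show()
```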
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Today, deep neural networks are widely used in many computer vision problems, including for geographic information system (GIS) data. This type of data is commonly used for urban analyses and spatial planning. We used orthophotographic images of two residential districts from Kielce, Poland for research including automatic urban sprawl analysis with a Transformer-based neural network.

Orthophotomaps were obtained from the Kielce GIS portal. Then, the map was manually masked into building and building-surroundings classes. Finally, the orthophotomap and corresponding classification mask were simultaneously divided into small tiles. This approach is common in image data preprocessing for the learning phase of machine learning algorithms. The data contain two original orthophotomaps from the Wietrznia and Pod Telegrafem residential districts with corresponding masks, and also their tiled versions, ready to provide as training data for machine learning models.

The Transformer-based neural network was trained on the Wietrznia dataset, targeted at semantic segmentation of the tiles into building and surroundings classes. After that, model inference was used to test the model's generalization ability on the Pod Telegrafem dataset. The efficiency of the model was satisfying, so it can be used for automatic semantic building segmentation. The process of dividing the images can then be reversed and the complete classification mask retrieved. This mask can be used for building-area calculations and urban sprawl monitoring, if the research were repeated for GIS data from a wider time horizon.

Since the dataset was collected from the Kielce GIS portal, as part of the Polish Main Office of Geodesy and Cartography data resource, it may be used only for non-profit and non-commercial purposes, in private or scientific applications, under the law "Ustawa z dnia 4 lutego 1994 r. o prawie autorskim i prawach pokrewnych (Dz.U. z 2006 r. nr 90 poz 631 z późn. zm.)". There are no other legal or ethical considerations regarding reuse potential.

Data information is presented below.
* wietrznia_2019.jpg - orthophotomap of Wietrznia district, used for model training as an explanatory (input) image
* wietrznia_2019.png - classification mask of Wietrznia district, used for model training as a target image
* wietrznia_2019_validation.jpg - one image from Wietrznia district, used for model validation during the training phase
* pod_telegrafem_2019.jpg - orthophotomap of Pod Telegrafem district, used for model evaluation after the training phase
* wietrznia_2019 - folder with wietrznia_2019.jpg (image) and wietrznia_2019.png (annotation), divided into 810 tiles (512 x 512 pixels each); tiles with no information were manually removed so the training data would contain only informative tiles; tiles were presented to the model during training (images and annotations for fitting the model to the data)
* wietrznia_2019_validation - folder with wietrznia_2019_validation.jpg divided into 16 tiles (256 x 256 pixels each); tiles were presented to the model during training (images for validating the model's efficiency); not part of the training data
* pod_telegrafem_2019 - folder with pod_telegrafem.jpg divided into 196 tiles (256 x 256 pixels each); tiles were presented to the model during inference (images for evaluating the model's robustness)

The dataset was created as described below. Firstly, the orthophotomaps were collected from the Kielce Geoportal (https://gis.kielce.eu). The Kielce Geoportal offers a recent map from April 2019. It is an orthophotomap with a resolution of 5 x 5 cm per pixel, constructed from a plane flight at 700 meters above ground height, taken with a camera for vertical photos. Downloading was done by WMS in the open-source QGIS software (https://www.qgis.org), as a 1:500 scale map, then converted to a 1200 dpi PNG image. Secondly, the map of the Wietrznia residential district was manually labelled, also in QGIS, in the same scope as the orthophotomap. Annotation was based on land cover map information, also obtained from the Kielce Geoportal. There are two classes - residential building and surroundings. The second map, of the Pod Telegrafem district, was not annotated, since it was used in the testing phase and imitates the situation where there is no annotation for new data presented to the model. Next, the images were converted to RGB JPG images, and the annotation map was converted to an 8-bit grayscale PNG image. Finally, the Wietrznia data files were tiled into 512 x 512 pixel tiles using the Python PIL library. Tiles with no information or a relatively small amount of information (only white background or mostly white background) were manually removed, so from the 29113 x 15938 pixel orthophotomap, only 810 tiles with corresponding annotations were left, ready to train the machine learning model for the semantic segmentation task. The Pod Telegrafem orthophotomap was tiled with no manual removal, so from the 7168 x 7168 pixel orthophotomap, 196 tiles of 256 x 256 pixel resolution were created. There was also an image of one residential building, used for model validation during the training phase; it was not part of the training data but was part of the Wietrznia residential area. It was a 2048 x 2048 pixel orthophotomap, tiled into 16 tiles of 256 x 256 pixels each.
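The tiling step described above can be reproduced with a short PIL script. This is a minimal sketch under assumed file names, using the 512 x 512 tile size given in the description; it is not the authors' actual code and does not perform their manual removal of uninformative tiles.

```python
# Minimal sketch of the tiling step described above (assumed file names).
import os
from PIL import Image

Image.MAX_IMAGE_PIXELS = None  # allow very large orthophotomaps
TILE = 512

os.makedirs("tiles", exist_ok=True)
image = Image.open("wietrznia_2019.jpg")
width, height = image.size

tile_id = 0
for top in range(0, height - TILE + 1, TILE):
    for left in range(0, width - TILE + 1, TILE):
        tile = image.crop((left, top, left + TILE, top + TILE))
        tile.save(f"tiles/wietrznia_2019_{tile_id:04d}.jpg")
        tile_id += 1
```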
This map shows the free and open data status of county public geospatial (GIS) data across Minnesota. The accompanying data set can be used to make similar maps using GIS software.
Counties shown in this dataset as having free and open public geospatial data (with or without a policy) are: Aitkin, Anoka, Becker, Beltrami, Benton, Big Stone, Carlton, Carver, Cass, Chippewa, Chisago, Clay, Clearwater, Cook, Crow Wing, Dakota, Douglas, Grant, Hennepin, Hubbard, Isanti, Itasca, Kittson, Koochiching, Lac qui Parle, Lake, Lyon, Marshall, McLeod, Meeker, Mille Lacs, Morrison, Mower, Norman, Olmsted, Otter Tail, Pipestone, Polk, Pope, Ramsey, Renville, Rice, Scott, Sherburne, St. Louis, Stearns, Steele, Stevens, Traverse, Wabasha, Waseca, Washington, Wilkin, Winona, Wright, and Yellow Medicine.
To see if a county's data is distributed via the Minnesota Geospatial Commons, check the Commons organizations page: https://gisdata.mn.gov/organization
To see if a county distributes data via its website, check the link(s) on the Minnesota County GIS Contacts webpage: https://www.mngeo.state.mn.us/county_contacts.html
Our dataset provides detailed and precise insights into the business, commercial, and industrial aspects of any given area in the USA (including Point of Interest (POI) data and foot traffic). The dataset is divided into 150 m x 150 m areas (geohash 7) and has over 50 variables.

- Use it for different applications: Our combined dataset, which includes POI and foot traffic data, can be employed for various purposes. Different data teams use it to guide retailers and FMCG brands in site selection, fuel marketing intelligence, analyze trade areas, and assess company risk. Our dataset has also proven to be useful for real estate investment.
- Get reliable data: Our datasets have been processed, enriched, and tested so your data team can use them more quickly and accurately.
- Ideal for training ML models: The high quality of our geographic information layers results from more than seven years of work dedicated to the deep understanding and modeling of geospatial Big Data. Among the features that distinguish this dataset is the use of anonymized and user-compliant mobile device GPS location data, enriched with other alternative and public data.
- Easy to use: Our dataset is user-friendly and can be easily integrated into your current models. Also, we can deliver your data in different formats, like .csv, according to your analysis requirements.
- Get personalized guidance: In addition to providing reliable datasets, we advise your analysts on their correct implementation. Our data scientists can guide your internal team on the optimal algorithms and models to get the most out of the information we provide (without compromising the security of your internal data).

Answer questions like:
- What places does my target user visit in a particular area? Which are the best areas to place a new POS?
- What is the average yearly income of users in a particular area?
- What is the influx of visits that my competition receives?
- What is the volume of traffic surrounding my current POS?

This dataset is useful for getting insights from industries like:
- Retail & FMCG
- Banking, Finance, and Investment
- Car Dealerships
- Real Estate
- Convenience Stores
- Pharma and medical laboratories
- Restaurant chains and franchises
- Clothing chains and franchises

Our dataset includes more than 50 variables, such as:
- Number of pedestrians seen in the area.
- Number of vehicles seen in the area.
- Average speed of movement of the vehicles seen in the area.
- Points of Interest (POIs) (in number and type) seen in the area (supermarkets, pharmacies, recreational locations, restaurants, offices, hotels, parking lots, wholesalers, financial services, pet services, shopping malls, among others).
- Average yearly income range (anonymized and aggregated) of the devices seen in the area.

Notes to better understand this dataset:
- POI confidence means the average confidence of POIs in the area. In this case, POIs are any kind of location, such as a restaurant, a hotel, or a library.
- Category confidences, for example "food_drinks_tobacco_retail_confidence", indicate how confident we are in the existence of food/drink/tobacco retail locations in the area.
- We added predictions for The Home Depot and Lowe's Home Improvement stores in the dataset sample. These predictions were the result of a machine-learning model that was trained with the data.
Knowing where the current stores are, we can find the most similar areas for new stores to open.

How efficient is a geohash? A geohash is a faster, cost-effective geofencing option that reduces input data load and provides actionable information. Its benefits include faster querying, reduced cost, minimal configuration, and ease of use. A geohash ranges from 1 to 12 characters. The dataset can be split into variable-size geohashes, with the default being geohash 7 (150 m x 150 m).
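For readers unfamiliar with geohashes, a short Python example shows how a coordinate maps to the 7-character cells this dataset uses. The pygeohash library here is an assumption; any geohash library works the same way.

```python
# Encode a coordinate to a 7-character geohash cell (~150 m x 150 m),
# the cell size used by this dataset; pygeohash is one option among several.
import pygeohash as pgh

cell = pgh.encode(40.7580, -73.9855, precision=7)  # Times Square, NYC
print(cell)                                        # e.g. 'dr5ru6j'

lat, lon = pgh.decode(cell)                        # approximate cell center
print(lat, lon)
```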
The establishment of a BES Multi-User Geodatabase (BES-MUG) allows for the storage, management, and distribution of geospatial data associated with the Baltimore Ecosystem Study. At present, BES data is distributed over the internet via the BES website. While having geospatial data available for download is a vast improvement over having the data housed at individual research institutions, it still suffers from some limitations. BES-MUG overcomes these limitations, improving the quality of the geospatial data available to BES researchers and thereby leading to more informed decision-making.

BES-MUG builds on Environmental Systems Research Institute's (ESRI) ArcGIS and ArcSDE technology. ESRI was selected because its geospatial software offers robust capabilities. ArcGIS is implemented agency-wide within the USDA and is the predominant geospatial software package used by collaborating institutions. Commercially available enterprise database packages (DB2, Oracle, SQL) provide an efficient means to store, manage, and share large datasets. However, standard database capabilities are limited with respect to geographic datasets because they lack the ability to deal with complex spatial relationships. By using ESRI's ArcSDE (Spatial Database Engine) in conjunction with database software, geospatial data can be handled much more effectively through the implementation of the Geodatabase model. Through ArcSDE and the Geodatabase model, the database's capabilities are expanded, allowing for multi-user editing, intelligent feature types, and the establishment of rules and relationships. ArcSDE also allows users to connect to the database using ArcGIS software without being burdened by the intricacies of the database itself.

For an example of how BES-MUG will help improve the quality and timeliness of BES geospatial data, consider a census block group layer that is in need of updating. Rather than the researcher downloading the dataset, editing it, and resubmitting it through ORS, access rules will allow the authorized user to edit the dataset over the network. Established rules will ensure that attribute and topological integrity is maintained, so that key fields are not left blank and block group boundaries stay within tract boundaries. Metadata will automatically be updated, showing who edited the dataset and when, in the event any questions arise.

Currently, a functioning prototype multi-user database has been developed for BES at the University of Vermont Spatial Analysis Lab, using ArcSDE and IBM's DB2 Enterprise Database as a back-end architecture. This database, which is currently only accessible to those on the UVM campus network, will shortly be migrated to a Linux server where it will be accessible for database connections over the Internet. Passwords can then be handed out to all interested researchers on the project, who will be able to make a database connection through the geographic information systems software interface on their desktop computers. This database will include a very large number of thematic layers, currently divided into biophysical, socio-economic, and imagery categories. Biophysical includes data on topography, soils, forest cover, habitat areas, hydrology, and toxics. Socio-economics includes political and administrative boundaries, transportation and infrastructure networks, property data, census data, household survey data, parks, protected areas, land use/land cover, zoning, public health, and historic land use change.
Imagery includes a variety of aerial and satellite imagery. See the readme: http://96.56.36.108/geodatabase_SAL/readme.txt See the file listing: http://96.56.36.108/geodatabase_SAL/diroutput.txt
CC0 1.0 Universal (Public Domain Dedication): https://creativecommons.org/publicdomain/zero/1.0/
This synthetic dataset simulates 300 global cities across 6 major geographic regions, designed specifically for unsupervised machine learning and clustering analysis. It explores how economic status, environmental quality, infrastructure, and digital access shape urban lifestyles worldwide.
| Property | Description | Notes |
|---|---|---|
| 10 Features | Economic, environmental & social indicators | Realistically scaled |
| 300 Cities | Europe, Asia, Americas, Africa, Oceania | Diverse distributions |
| Strong Correlations | Income ↔ Rent (+0.8), Density ↔ Pollution (+0.6) | ML-ready |
| No Missing Values | Clean, preprocessed data | Ready for analysis |
| 4-5 Natural Clusters | Metropolitan hubs, eco-towns, developing centers | Pre-validated |
✅ Realistic Correlations: Income strongly predicts rent (+0.8), internet access (+0.7), and happiness (+0.6)
✅ Regional Diversity: Each region has distinct economic and environmental characteristics
✅ Clustering-Ready: Naturally separable into 4-5 lifestyle archetypes
✅ Beginner-Friendly: No data cleaning required, includes example code
✅ Documented: Comprehensive README with methodology and use cases
```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Load and prepare: drop identifier columns and keep numeric features only
df = pd.read_csv('city_lifestyle_dataset.csv')
X = df.drop(['city_name', 'country'], axis=1).select_dtypes('number')
X_scaled = StandardScaler().fit_transform(X)

# Cluster into the 4-5 natural groups the dataset was engineered for
kmeans = KMeans(n_clusters=5, random_state=42, n_init=10)
df['cluster'] = kmeans.fit_predict(X_scaled)

# Analyze (numeric_only avoids errors on the text columns)
print(df.groupby('cluster').mean(numeric_only=True))
```
After working with this dataset, you will be able to:
1. Apply K-Means, DBSCAN, and Hierarchical Clustering
2. Use PCA for dimensionality reduction and visualization
3. Interpret correlation matrices and feature relationships
4. Create geographic visualizations with cluster assignments
5. Profile and name discovered clusters based on characteristics
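Building on the starter code above (this sketch reuses its `X_scaled` and `df['cluster']` variables), here is a minimal example of outcome 2, PCA for 2-D visualization of the clusters:

```python
# Project the scaled features to 2-D with PCA and color by cluster label.
# Assumes X_scaled and df['cluster'] from the K-Means snippet above.
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

pca = PCA(n_components=2)
coords = pca.fit_transform(X_scaled)

plt.scatter(coords[:, 0], coords[:, 1], c=df['cluster'], cmap='tab10', s=15)
plt.xlabel(f"PC1 ({pca.explained_variance_ratio_[0]:.0%} variance)")
plt.ylabel(f"PC2 ({pca.explained_variance_ratio_[1]:.0%} variance)")
plt.title("City lifestyle clusters in PCA space")
plt.show()
```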
| Cluster | Characteristics | Example Cities |
|---|---|---|
| Metropolitan Tech Hubs | High income, density, rent | Silicon Valley, Singapore |
| Eco-Friendly Towns | Low density, clean air, high happiness | Nordic cities |
| Developing Centers | Mid income, high density, poor air | Emerging markets |
| Low-Income Suburban | Low infrastructure, income | Rural areas |
| Industrial Mega-Cities | Very high density, pollution | Manufacturing hubs |
Unlike random synthetic data, this dataset was carefully engineered with:
- ✨ Realistic correlation structures based on urban research
- 🌍 Regional characteristics matching real-world patterns
- 🎯 Optimal cluster separability (validated via silhouette scores)
- 📚 Comprehensive documentation and starter code
✓ Learn clustering without data cleaning hassles
✓ Practice PCA and dimensionality reduction
✓ Create beautiful geographic visualizations
✓ Understand feature correlation in real-world contexts
✓ Build a portfolio project with clear business insights
This dataset was designed for educational purposes in machine learning and data science. While synthetic, it reflects real patterns observed in global urban development research.
Happy Clustering! 🎉
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Last Update: 10/10/2025

The statewide roads dataset is a multi-purpose statewide roads dataset for cartography and range-based address location. This dataset is also used as the base geometry for deriving the GIS representation of UDOT's highway linear referencing system (LRS). A network analysis dataset for route-finding can also be derived from this dataset.

This dataset utilizes a data model based on Next-Generation 911 standards and the Federal Highway Administration's All Roads Network Of Linear-referenced Data (ARNOLD) reporting requirements for state DOTs. UGRC adopted this data model on September 13th, 2017.

The statewide roads dataset is maintained by UGRC in partnership with local governments, the Utah 911 Committee, and UDOT. This dataset is updated monthly, with Davis, Salt Lake, Utah, Washington, and Weber counties represented every month, along with additional counties based on an annual update schedule. UGRC obtains the data from the authoritative data source (typically county agencies), projects the data and attributes into the current data model, spatially assigns polygon-based fields based on the appropriate SGID boundary, and then standardizes the attribute values to ensure statewide consistency. UGRC also generates a UNIQUE_ID field based on the segment's location in the US National Grid, with the street name then tacked on. The UNIQUE_ID field is static and is UGRC's current, ad hoc solution to a persistent global ID. More information about the data model can be found here: https://docs.google.com/spreadsheets/d/1jQ_JuRIEtzxj60F0FAGmdu5JrFpfYBbSt3YzzCjxpfI/edit#gid=811360546 More information about the data model transition can be found here: https://gis.utah.gov/major-updates-coming-to-roads-data-model/

We are currently working with the US Forest Service to improve the Forest Service roads in this dataset; however, for the most up-to-date and complete set of USFS roads, please visit their data portal, where you can download the "National Forest System Roads" dataset. More information can be found on the UGRC data page for this layer: https://gis.utah.gov/data/transportation/roads-system/
Custom license: https://dataverse.harvard.edu/api/datasets/:persistentId/versions/1.1/customlicense?persistentId=doi:10.7910/DVN/OQIPRW
Advancing Research on Nutrition and Agriculture (AReNA) is a 6-year, multi-country project in South Asia and sub-Saharan Africa funded by the Bill and Melinda Gates Foundation, being implemented from 2015 through 2020. The objective of AReNA is to close important knowledge gaps on the links between nutrition and agriculture, with a particular focus on conducting policy-relevant research at scale and crowding in more research on this issue by creating data sets and analytical tools that can benefit the broader research community. Much of the research on agriculture and nutrition is hindered by a lack of data, and many of the datasets that do contain both agriculture and nutrition information are often small in size and geographic scope. AReNA team constructed a large multi-level, multi-country dataset combining nutrition and nutrition-relevant information at the individual and household level from the Demographic and Health Surveys (DHS) with a wide variety of geo-referenced data on agricultural production, agroecology, climate, demography, and infrastructure (GIS data). This dataset includes 60 countries, 184 DHS, and 122,473 clusters. Over one thousand geospatial variables are linked with DHS. The entire dataset is organized into 13 individual files: DHS_distance, DHS_livestock, DHS_main, DHS_malaria, DHS NDVI, DHS_nightlight, DHS_pasture and climate (mean), DHS_rainfall, DHS_soil, DHS_SPAM, DHS_suit, DHS_temperature, and DHS_traveltime.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
The Job Market Insights Dataset offers a comprehensive view of job postings worldwide, providing critical data on job roles, salaries, qualifications, locations, and company profiles. This dataset serves as a valuable resource for understanding global employment trends and patterns in various industries.
The primary objective of analyzing this dataset is to gain actionable insights into job market dynamics, including in-demand skills, salary ranges by role, preferred qualifications, and geographical job distributions. This analysis can empower job seekers, recruiters, and businesses to make informed decisions.
This dataset is a goldmine for extracting insights that can optimize recruitment strategies, guide career planning, and inform educational initiatives.
The City of Norfolk Open GIS Data Site. This site contains various spatial data that can be used by anyone with an interest in geographic information systems (GIS) data for their applications. The City's datasets are updated regularly and can be downloaded or accessed for free from this site. If you don't see a particular dataset you are looking for, please check back often, as we will be providing additional data to the site in the future.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains a set of synthetic data that can be used to evaluate the efficiency of geospatial databases. The dataset is composed of four JSON files of different sizes, which can be used to analyze the scalability of geospatial databases with respect to database size. Each JSON file contains a set of "points", each one characterized by a set of random attributes (description, URL of a picture linked to the point, creation date, delete date, update date, identifier, partition identifier). The synthetically generated points are uniformly distributed across the globe.
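As an illustration of how such a file could be produced, here is a minimal Python sketch; the exact attribute names and JSON layout of the released files are assumptions, not the dataset's actual schema.

```python
# Hypothetical generator for uniformly distributed synthetic points with
# random attributes; field names are illustrative, not the dataset's schema.
import json
import random
import uuid
from datetime import datetime, timedelta

def random_point(partition_count=10):
    created = datetime(2020, 1, 1) + timedelta(days=random.randint(0, 365))
    return {
        "id": str(uuid.uuid4()),
        "partition_id": random.randrange(partition_count),
        "description": f"synthetic point {random.randrange(10**6)}",
        "picture_url": f"https://example.com/pics/{uuid.uuid4()}.jpg",
        "created": created.isoformat(),
        "updated": (created + timedelta(days=random.randint(0, 30))).isoformat(),
        "deleted": None,
        "lon": random.uniform(-180.0, 180.0),  # uniform over the globe
        "lat": random.uniform(-90.0, 90.0),
    }

with open("points_small.json", "w") as f:
    json.dump([random_point() for _ in range(10_000)], f)
```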
Dataset that contains information on archaeological remains of the prehistoric settlement of the Letolo valley on Savaii, Samoa. It is built in ArcMap from ESRI and is based on previously unpublished surveys made by the Peace Corps volunteer Gregory Jackmond in 1976-78, and to a lesser degree on excavations made by Helene Martinsson Wallin and Paul Wallin. The settlement was in use from at least 1000 AD to about 1700-1800. Since abandonment it has been covered by thick jungle; however, by the time of Jackmond's survey (1976-78) it was grazed by cattle and the remains were visible. The survey is on file at Auckland War Memorial Museum and has hitherto been unpublished. A copy of the survey was accessed by Olof Håkansson through Martinsson Wallin and Wallin and, as part of a Master's thesis in archaeology at Uppsala University, it has been digitised.
Olof Håkansson built the database structure in the ESRI software and digitised the data from 2015 to 2017. One of the aims of the Master's thesis was to discuss hierarchies. To do this, subsets of the data have been displayed in various ways on maps. Another aim was to discuss archaeological methodology when working with spatial data, but the data in itself can be used without regard to the questions asked in the thesis. All data that was unclear has been removed in an effort to avoid introducing errors. Even so, if there are mistakes in the dataset, they are to be blamed on the researcher, Olof Håkansson. A more comprehensive account of the aim, questions, purpose, and method, as well as the results of the research, is to be found in the Master's thesis itself. Direct link: http://uu.diva-portal.org/smash/record.jsf?pid=diva2%3A1149265&dswid=9472
Purpose:
The purpose is to examine hierarchies in prehistoric Samoa, and further to make the produced datasets available for study.
Prehistoric remains of the settlement of Letolo on the Island of Savaii in Samoa in Polynesia
This dataset contains model-based place (incorporated and census designated places) level estimates for the PLACES 2022 release in GIS-friendly format. PLACES covers the entire United States—50 states and the District of Columbia (DC)—at county, place, census tract, and ZIP Code Tabulation Area levels. It provides information uniformly on this large scale for local areas at 4 geographic levels. Estimates were provided by the Centers for Disease Control and Prevention (CDC), Division of Population Health, Epidemiology and Surveillance Branch. PLACES was funded by the Robert Wood Johnson Foundation in conjunction with the CDC Foundation. Data sources used to generate these model-based estimates include Behavioral Risk Factor Surveillance System (BRFSS) 2020 or 2019 data, Census Bureau 2010 population estimates, and American Community Survey (ACS) 2015–2019 estimates. The 2022 release uses 2020 BRFSS data for 25 measures and 2019 BRFSS data for 4 measures (high blood pressure, taking high blood pressure medication, high cholesterol, and cholesterol screening) that the survey collects data on every other year. These data can be joined with the 2019 Census TIGER/Line place boundary file in a GIS system to produce maps for 29 measures at the place level. An ArcGIS Online feature service is also available for users to make maps online or to add data to desktop GIS software. https://cdcarcgis.maps.arcgis.com/home/item.html?id=3b7221d4e47740cab9235b839fa55cd7
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This data was prepared as input for the Selkie GIS-TE tool. This GIS tool aids site selection, logistics optimization and financial analysis of wave or tidal farms in the Irish and Welsh maritime areas. Read more here: https://www.selkie-project.eu/selkie-tools-gis-technoeconomic-model/
This research was funded by the Science Foundation Ireland (SFI) through MaREI, the SFI Research Centre for Energy, Climate and the Marine and by the Sustainable Energy Authority of Ireland (SEAI). Support was also received from the European Union's European Regional Development Fund through the Ireland Wales Cooperation Programme as part of the Selkie project.
File Formats
Results are presented in three file formats:
- tif: can be imported into GIS software (such as ArcGIS)
- csv: human-readable text format, which can also be opened in Excel
- png: image files that can be viewed in standard desktop software and give a spatial view of results
Input Data
All calculations use open-source data from the Copernicus store and the open-source Python language. The Python xarray library is used to read the data (a minimal reading sketch follows the input list below).
Hourly Data from 2000 to 2019
Wind - Copernicus ERA5 dataset; 17 by 27.5 km grid; 10 m wind speed
Wave - Copernicus Atlantic - Iberian Biscay Irish - Ocean Wave Reanalysis dataset; 3 by 5 km grid
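As mentioned above, xarray is used to read the Copernicus files. Here is a minimal sketch; the file name and the variable names ('u10', 'v10' for the ERA5 10 m wind components) are assumptions about the downloaded NetCDF files.

```python
# Read hourly ERA5 data with xarray and compute 10 m wind speed.
# File and variable names are assumptions about the Copernicus download.
import numpy as np
import xarray as xr

ds = xr.open_dataset("era5_hourly_2000_2019.nc")
wind_speed = np.sqrt(ds["u10"] ** 2 + ds["v10"] ** 2)  # m/s from u/v components
monthly_mean = wind_speed.groupby("time.month").mean("time")
print(monthly_mean)
```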
Accessibility
The maximum limits for Hs and wind speed are applied when mapping the accessibility of a site.
The Accessibility layer shows the percentage of time the Hs (Atlantic - Iberian Biscay Irish - Ocean Wave Reanalysis) and wind speed (ERA5) are below these limits for the month.
Input data is 20 years of hourly wave and wind data from 2000 to 2019, partitioned by month. At each timestep, the accessibility of the site was determined by checking if the Hs and wind speed were below their respective limits. The percentage accessibility is the number of hours within limits divided by the total number of hours for the month.
Environmental data is from the Copernicus data store (https://cds.climate.copernicus.eu/). Wave hourly data is from the 'Atlantic - Iberian Biscay Irish - Ocean Wave Reanalysis' dataset. Wind hourly data is from the ERA5 dataset.
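The accessibility calculation described above reduces to a boolean mask averaged per month. A minimal xarray sketch, assuming hourly wave and wind DataArrays on a shared time axis; the file and variable names are assumptions, and the 2 m / 15 m/s limits echo the values quoted in the Availability section below.

```python
# Percentage of hours per month with Hs and wind speed below their limits.
# File and variable names are assumptions; limits are illustrative.
import xarray as xr

HS_LIMIT = 2.0     # m, significant wave height limit
WIND_LIMIT = 15.0  # m/s, wind speed limit

hs = xr.open_dataset("wave_reanalysis.nc")["VHM0"]          # assumed variable name
wind = xr.open_dataset("era5_wind_speed.nc")["wind_speed"]  # assumed variable name

# Align the coarser wind grid to the wave grid (nearest value, as described
# in the Weather Window section below).
wind = wind.interp_like(hs, method="nearest")

accessible = (hs < HS_LIMIT) & (wind < WIND_LIMIT)  # boolean per hour
accessibility_pct = accessible.groupby("time.month").mean("time") * 100
```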
Availability
A device's availability to produce electricity depends on the device's reliability and the time to repair any failures. The repair time depends on weather windows and other logistical factors (for example, the availability of repair vessels and personnel). A 2013 study by O'Connor et al. determined the relationship between the accessibility and availability of a wave energy device. The resulting graph (see Fig. 1 of their paper) shows the correlation between accessibility at Hs of 2 m and wind speed of 15.0 m/s and availability. This graph is used to calculate the availability layer from the accessibility layer.
The input value, accessibility, measures how accessible a site is for installation or operation and maintenance activities. It is the percentage of time the environmental conditions, i.e. the Hs (Atlantic - Iberian Biscay Irish - Ocean Wave Reanalysis) and wind speed (ERA5), are below operational limits. Input data is 20 years of hourly wave and wind data from 2000 to 2019, partitioned by month. At each timestep, the accessibility of the site was determined by checking if the Hs and wind speed were below their respective limits. The percentage accessibility is the number of hours within limits divided by the total number of hours for the month. Once the accessibility was known, the percentage availability was calculated using the O'Connor et al. graph of the relationship between the two. A mature technology reliability was assumed.
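A sketch of that lookup step: interpolating availability from accessibility along a digitized version of the O'Connor et al. curve. The curve points below are placeholders, not values taken from the paper.

```python
# Map monthly accessibility (%) to availability (%) via a digitized
# accessibility-availability curve; these sample points are placeholders,
# NOT the values from O'Connor et al. (2013).
import numpy as np

curve_accessibility = np.array([0, 20, 40, 60, 80, 100])  # placeholder x
curve_availability = np.array([0, 35, 60, 78, 90, 97])    # placeholder y

def availability_from_accessibility(acc_pct):
    """Linear interpolation along the digitized curve."""
    return np.interp(acc_pct, curve_accessibility, curve_availability)

print(availability_from_accessibility(65.0))
```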
Weather Window
The weather window availability is the percentage of possible x-duration windows where weather conditions (Hs, wind speed) are below maximum limits for the given duration for the month.
The resolution of the wave dataset (0.05° x 0.05°) is higher than that of the wind dataset (0.25° x 0.25°), so the nearest wind value is used for each wave data point. The weather window layer is at the resolution of the wave layer.
The first step in calculating the weather window for a particular set of inputs (Hs, wind speed and duration) is to calculate the accessibility at each timestep. The accessibility is based on a simple boolean evaluation: are the wave and wind conditions within the required limits at the given timestep?
Once the time series of accessibility is calculated, the next step is to look for periods of sustained favourable environmental conditions, i.e. the weather windows. Here all possible operating periods with a duration matching the required weather-window value are assessed to see if the weather conditions remain suitable for the entire period. The percentage availability of the weather window is calculated based on the percentage of x-duration windows with suitable weather conditions for their entire duration. The weather window availability can be considered as the probability of having the required weather window available at any given point in the month.
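This sliding-window check maps naturally onto a rolling minimum over the hourly boolean accessibility series. A minimal sketch, reusing the hourly 'accessible' DataArray from the accessibility sketch above, with an illustrative 12-hour window:

```python
# Fraction of x-hour windows whose every hour is accessible, per month.
# 'accessible' is the hourly boolean DataArray from the accessibility sketch.
WINDOW_HOURS = 12  # illustrative duration

# A window is viable only if all of its hours are accessible: rolling min == 1.
window_ok = accessible.astype(int).rolling(time=WINDOW_HOURS).min()
weather_window_pct = window_ok.groupby("time.month").mean("time") * 100
```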
Extreme Wind and Wave
The Extreme wave layers show the highest significant wave height expected to occur during the given return period. The Extreme wind layers show the highest wind speed expected to occur during the given return period.
To predict extreme values, we use Extreme Value Analysis (EVA). EVA focuses on the extreme part of the data and seeks to determine a model that fits this reduced portion accurately. EVA consists of three main stages. The first stage is the selection of extreme values from a time series. The next step is to fit a model that best approximates the selected extremes by determining the shape parameters for a suitable probability distribution. The model then predicts extreme values for the selected return period. All calculations use the Python pyextremes library. Two methods are used: Block Maxima and Peaks Over Threshold.
The Block Maxima method selects the annual maxima and fits a GEVD probability distribution.
The peaks_over_threshold method has two adjustable calculation parameters. The first is the percentile above which values must lie to be selected as extreme (0.9 or 0.998). The second is the minimum time difference between extreme values for them to be considered independent (3 days). A Generalised Pareto Distribution is fitted to the selected extremes and used to calculate the extreme value for the selected return period.
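A minimal pyextremes sketch of the Block Maxima workflow described above; the input file, series name, and the 100-year return period are illustrative, not values from this dataset.

```python
# Fit annual maxima (Block Maxima -> GEVD) and estimate a return value with
# pyextremes; the input file and 100-year period are illustrative.
import pandas as pd
from pyextremes import EVA

# Assumed: a single-column CSV of hourly Hs with a datetime index.
hs_series = pd.read_csv("hs_hourly.csv", index_col=0, parse_dates=True).squeeze()

model = EVA(hs_series)
model.get_extremes(method="BM", block_size="365.2425D")  # annual maxima
model.fit_model()                                        # fits GEVD for BM extremes
value, ci_lower, ci_upper = model.get_return_value(return_period=100, alpha=0.95)
print(f"100-year Hs: {value:.2f} m (95% CI {ci_lower:.2f}-{ci_upper:.2f})")
```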
HEPGIS is a web-based interactive geographic map server that allows users to navigate and view geospatial data, print maps, and obtain data on specific features using only a web browser. It includes geospatial data used for transportation planning. HEPGIS previously received ARRA funding for development of Economically Distressed Area maps. It is also being used to demonstrate emerging trends to address MPO and statewide planning regulations/requirements, the enhanced National Highway System, Primary Freight Networks, commodity flows, and safety data. HEPGIS has been used to help implement MAP-21 regulations and will help implement the Grow America Act, particularly related to Ladders of Opportunity and MPO reforms.
Roads data are intended to be used for a variety of mapping, resource management, planning, and analysis applications.
This dataset was updated May 2025.

This ownership dataset was generated primarily from CPAD data, which already tracks the majority of ownership information in California. CPAD is utilized without any snapping or clipping to FRA/SRA/LRA. CPAD has some important data gaps, so additional data sources are used to supplement the CPAD data. Currently this includes the most currently available data from BIA, DOD, and FWS. Additional sources may be added in subsequent versions. Decision rules were developed to identify priority layers in areas of overlap.

Starting in 2022, the ownership dataset was compiled using a new methodology. Previous versions attempted to match federal ownership boundaries to the FRA footprint, and used a manual process for checking and tracking federal ownership changes within the FRA, with CPAD ownership information only being used for SRA and LRA lands. The manual portion of that process was proving difficult to maintain, and the new method (described below) was developed in order to decrease the manual workload and increase accountability by using an automated process by which any final ownership designation can be traced back to a specific dataset.

The current process for compiling the data sources includes:
* Clipping input datasets to the California boundary
* Filtering the FWS data on the Primary Interest field to exclude lands that are managed by but not owned by FWS (ex: leases, easements, etc.)
* Supplementing the BIA Pacific Region Surface Trust lands data with the Western Region portion of the LAR dataset, which extends into California
* Filtering the BIA data on the Trust Status field to exclude areas that represent mineral rights only
* Filtering the CPAD data on the Ownership Level field to exclude areas that are privately owned (ex: HOAs)
* In the case of overlap, sources were prioritized as follows: FWS > BIA > CPAD > DOD
* As an exception to the above, DOD lands on FRA which overlapped with CPAD lands that were incorrectly coded as non-federal were treated as an override, such that the DOD designation could win out over CPAD

In addition to this ownership dataset, a supplemental _source dataset is available which designates the source that was used to determine the ownership in this dataset.

Data Sources:
* GreenInfo Network's California Protected Areas Database (CPAD2023a). https://www.calands.org/cpad/; https://www.calands.org/wp-content/uploads/2023/06/CPAD-2023a-Database-Manual.pdf
* US Fish and Wildlife Service FWSInterest dataset (updated December 2023). https://gis-fws.opendata.arcgis.com/datasets/9c49bd03b8dc4b9188a8c84062792cff_0/explore
* Department of Defense Military Bases dataset (updated September 2023). https://catalog.data.gov/dataset/military-bases
* Bureau of Indian Affairs, Pacific Region, Surface Trust and Pacific Region Office (PRO) land boundaries data (2023) via John Mosley John.Mosley@bia.gov
* Bureau of Indian Affairs, Land Area Representations (LAR) and BIA Regions datasets (updated Oct 2019). https://biamaps.doi.gov/bogs/datadownload.html

Data Gaps & Changes:
Known gaps include several BOR, ACE and Navy lands which were not included in CPAD nor the DOD MIRTA dataset. Our hope for future versions is to refine the process by pulling in additional data sources to fill in some of those data gaps. Additionally, any feedback received about missing or inaccurate data can be taken back to the appropriate source data where appropriate, so fixes can occur in the source data, instead of just in this dataset.

25_1: The CPAD input dataset was amended to merge large gaps in certain areas of the state known to be erroneous, such as Yosemite National Park, and to eliminate overlaps from the original input. The FWS input dataset was updated in February of 2025, and the DOD input dataset was updated in October of 2024. The BIA input dataset was the same as was used for the previous ownership version.

24_1: Input datasets this year included numerous changes since the previous version, particularly the CPAD and DOD inputs. Of particular note was the re-addition of Camp Pendleton to the DOD input dataset, which is reflected in this version of the ownership dataset. We were unable to obtain an updated input for tribal data, so the previous input was used for this version.

23_1: A few discrepancies were discovered between data changes that occurred in CPAD when compared with parcel data. These issues will be taken to CPAD for clarification for future updates, but ownership 23_1 reflects the data as it was coded in CPAD at the time. In addition, there was a change in the DOD input data between last year and this year, with the removal of Camp Pendleton. An inquiry was sent for clarification on this change, but ownership 23_1 reflects the data per the DOD input dataset.

22_1: Represents an initial version of ownership with a new methodology which was developed under a short timeframe. A comparison with previous versions of ownership highlighted some data gaps in this version, including several BOR, ACE and Navy lands which were not included in CPAD nor the DOD MIRTA dataset. Our hope for future versions is to refine the process by pulling in additional data sources to fill in some of those data gaps. In addition, any topological errors (like overlaps or gaps) that exist in the input datasets may carry over to the ownership dataset. Ideally, any feedback received about missing or inaccurate data can be taken back to the relevant source data where appropriate, so fixes can occur in the source data, instead of just in this dataset.