A National Geospatial Data Asset (NGDA) is defined as a geospatial dataset that has been designated by the FGDC Steering Committee and meets at least one of the following criteria: used by multiple agencies or with agency partners such as State, Tribal and local governments; applied to achieve Presidential priorities as expressed by OMB; required to meet shared mission goals of multiple Federal agencies; or expressly required by statutory mandate. Together, these datasets comprise the NGDA Portfolio. This metadata points to a spreadsheet that contains the official list of NGDAs, with a link to the specific NGDA metadata maintained by the dataset owners on Data.gov and GeoPlatform.gov, a link to the associated NGDA Theme, and the agency responsible for each NGDA.
https://creativecommons.org/publicdomain/zero/1.0/
This dataset contains comprehensive geospatial data detailing the geographical features and boundaries of India. It includes information on various geographic elements such as terrain, water bodies, administrative boundaries, and infrastructure, providing valuable insights for spatial analysis and mapping projects.
This dataset is used to produce the CNCS state profile map for use on our website.
Learn Geographic Mapping with Altair, Vega-Lite and Vega using Curated Datasets
Complete geographic and geophysical data collection for mapping and visualization. This consolidation includes 18 complementary datasets used by 31+ Vega, Vega-Lite, and Altair examples 📊. Perfect for learning geographic visualization techniques including projections, choropleths, point maps, vector fields, and interactive displays.
Source data lives on GitHub and can also be accessed via CDN. The vega-datasets project serves as a common repository for example datasets used across these visualization libraries and related projects.
[...] airports.csv), lines (like londonTubeLines.json), and polygons (like us-10m.json).
[...] windvectors.csv, annual-precip.json).
This pack includes 18 datasets covering base maps, reference points, statistical data for choropleths, and geophysical data.
| Dataset | File | Size | Format | License | Description | Key Fields / Join Info |
|---|---|---|---|---|---|---|
| US Map (1:10m) | us-10m.json | 627 KB | TopoJSON | CC-BY-4.0 | US state and county boundaries. Contains states and counties objects. Ideal for choropleths. | id (FIPS code) property on geometries |
| World Map (1:110m) | world-110m.json | 117 KB | TopoJSON | CC-BY-4.0 | World country boundaries. Contains countries object. Suitable for world-scale viz. | id property on geometries |
| London Boroughs | londonBoroughs.json | 14 KB | TopoJSON | CC-BY-4.0 | London borough boundaries. | properties.BOROUGHN (name) |
| London Centroids | londonCentroids.json | 2 KB | GeoJSON | CC-BY-4.0 | Center points for London boroughs. | properties.id, properties.name |
| London Tube Lines | londonTubeLines.json | 78 KB | GeoJSON | CC-BY-4.0 | London Underground network lines. | properties.name, properties.color |
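
The join information in the table above (an `id` FIPS property on the `us-10m.json` geometries) is what a choropleth relies on. The following is a minimal sketch, not the pack's own example: it builds a Vega-Lite specification as a Python dictionary and joins a statistics file to the county shapes through a `lookup` transform. The CDN base URL and its version pin, the statistics file name `my_county_stats.csv`, and the `rate` field are assumptions made for illustration.

```python
import json

# Assumed CDN location for vega-datasets; the version pin is a guess.
CDN_BASE = "https://cdn.jsdelivr.net/npm/vega-datasets@2/data/"


def choropleth_spec(stats_url, value_field, feature="counties"):
    """Build a Vega-Lite spec that colors us-10m.json shapes by a joined value.

    stats_url is a hypothetical table with an `id` column (FIPS code) and a
    numeric column named by value_field.
    """
    return {
        "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
        "data": {
            "url": CDN_BASE + "us-10m.json",
            # TopoJSON files hold named objects; pick "states" or "counties".
            "format": {"type": "topojson", "feature": feature},
        },
        "transform": [
            {
                # Join on the geometry id, which the table lists as a FIPS code.
                "lookup": "id",
                "from": {
                    "data": {"url": stats_url},
                    "key": "id",
                    "fields": [value_field],
                },
            }
        ],
        "projection": {"type": "albersUsa"},
        "mark": "geoshape",
        "encoding": {"color": {"field": value_field, "type": "quantitative"}},
    }


spec = choropleth_spec("my_county_stats.csv", "rate")
print(json.dumps(spec, indent=2))
```

The printed JSON can be pasted into the Vega-Lite editor once `my_county_stats.csv` is replaced with a real table keyed by FIPS code.
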
| Dataset | File | Size | Format | License | Description | Key Fields / Join Info |
|---|---|---|---|---|---|---|
| US Airports | airports.csv | 205 KB | CSV | Public Domain | US airports with codes and coordinates. | iata, state, `l... |
This data release contains the analytical results and evaluated source data files of geospatial analyses for identifying areas in Alaska that may be prospective for different types of lode gold deposits, including orogenic, reduced-intrusion-related, epithermal, and gold-bearing porphyry. The spatial analysis is based on queries of statewide source datasets of aeromagnetic surveys, Alaska Geochemical Database (AGDB3), Alaska Resource Data File (ARDF), and Alaska Geologic Map (SIM3340) within areas defined by 12-digit HUCs (subwatersheds) from the National Watershed Boundary dataset. The packages of files available for download are:
1. LodeGold_Results_gdb.zip - The analytical results in geodatabase polygon feature classes which contain the scores for each source dataset layer query, the accumulative score, and a designation for high, medium, or low potential and high, medium, or low certainty for a deposit type within the HUC. The data is described by FGDC metadata. An mxd file and cartographic feature classes are provided for display of the results in ArcMap. An included README file describes the complete contents of the zip file.
2. LodeGold_Results_shape.zip - Copies of the results from the geodatabase are also provided in shapefile and CSV formats. The included README file describes the complete contents of the zip file.
3. LodeGold_SourceData_gdb.zip - The source datasets in geodatabase and geotiff format. Data layers include aeromagnetic surveys, AGDB3, ARDF, lithology from SIM3340, and HUC subwatersheds. The data is described by FGDC metadata. An mxd file and cartographic feature classes are provided for display of the source data in ArcMap. Also included are the python scripts used to perform the analyses. Users may modify the scripts to design their own analyses. The included README files describe the complete contents of the zip file and explain the usage of the scripts.
4. LodeGold_SourceData_shape.zip - Copies of the geodatabase source dataset derivatives from ARDF and lithology from SIM3340 created for this analysis are also provided in shapefile and CSV formats. The included README file describes the complete contents of the zip file.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Optimized for Geospatial and Big Data Analysis
This dataset is a refined and enhanced version of the original DataCo SMART SUPPLY CHAIN FOR BIG DATA ANALYSIS dataset, specifically designed for advanced geospatial and big data analysis. It incorporates geocoded information, language translations, and cleaned data to enable applications in logistics optimization, supply chain visualization, and performance analytics.
src_points.geojson: Source point geometries.
dest_points.geojson: Destination point geometries.
routes.geojson: Line geometries representing source-destination routes.
DataCoSupplyChainDatasetRefined.csv
This dataset is based on the original dataset published by Fabian Constante, Fernando Silva, and António Pereira:
Constante, Fabian; Silva, Fernando; Pereira, António (2019), “DataCo SMART SUPPLY CHAIN FOR BIG DATA ANALYSIS”, Mendeley Data, V5, doi: 10.17632/8gx2fvg2k6.5.
Refinements include geospatial processing, translation, and additional cleaning by the uploader to enhance usability and analytical potential.
This dataset is designed to empower data scientists, researchers, and business professionals to explore the intersection of geospatial intelligence and supply chain optimization.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains a set of synthetic data that can be used to evaluate the efficiency of geospatial databases. The dataset is composed of four JSON files of different sizes, which can be used to analyze scalability with respect to the database size. Each JSON file contains a set of "points", each characterized by a set of random attributes (description, URL of a picture linked to the point, creation date, delete date, update date, identifier, partition identifier). The synthetically generated points are uniformly distributed across the world.
The Barrow Area Information Database (BAID) data collection is comprised of geospatial data for the research hubs of Barrow, Atqasuk and Ivotuk on Alaska's North Slope. Over 9600 research plots and instrument locations are included in the BAID research sites database. Updates to the project tracking database are ongoing through field mapping of new research locations and extant sampling sites dating back to the 1940s. Many ancillary data layers are also compiled to facilitate research activities and science communication. These geospatial data sets have been compiled through BAID and related NSF efforts. Geospatial data unique to this project are currently browseable via the BAID archive and include shapefiles of research information (sampling sites and instrumentation, the NOAA-CMDL clean air sector), administrative units (Barrow Environmental Observatory Science Research District plus adjacent federal lands, village districts, zoning, tax parcels, and the Ukpeagvik Inupiat Corporation boundary), infrastructure (power poles, snow fences, roads), erosion data for Elson Lagoon and imagery (declassified military imagery, air photo mosaics, IKONOS, Landsat, Quickbird, SAR and flight line indexes). Related data sets can be browsed via BAID’s web mapping tools and downloaded via the “Related links” section below. In addition, the BAID Internet Map Server (BAID-IMS) provides browse access to a number of additional layers which are available for download through catalog pages at the National Snow and Ice Data Center (NSIDC), the Alaska Geospatial Data Clearinghouse at USGS and the Alaska State Geo-Spatial Data Clearinghouse. Some layers are proprietary and are only available for browse access in BAID-IMS through special agreement. BAID provides a suite of user interfaces (Internet Map Server, Google Earth and Adobe Flex) and Open Geospatial Consortium web services for accessing the research plots and instrument locations. For more information on...
This integrated geospatial dataset combines 337 geospatial data layers derived from 35 sources to understand the complex interplay between human expansion into the ocean and the environment across the Exclusive Economic Zone surrounding the United Kingdom (UK-EEZ).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Cadaster data from PDOK, used to illustrate the use of geopandas and shapely, geospatial Python packages for manipulating vector data. The brpgewaspercelen_definitief_2020.gpkg file has been subset to keep the download manageable for workshops. The other datasets are copies of those available from PDOK.
Open Government Data (OGD) portals host thousands of geo-referenced datasets containing spatial information, which makes them of extreme interest for any analysis or process relating to the territory. For this potential to be realized, users must be able to access these datasets and reuse them. The quality of metadata is often considered an obstacle to the full dissemination of OGD data. Starting from an experimental investigation conducted on over 160,000 geospatial datasets belonging to six national and international OGD portals, this work has as its first objective to provide an overview of the usage of these portals, measured in terms of dataset views and downloads. Furthermore, to assess the possible influence of metadata quality on the use of geospatial datasets, the metadata of each dataset was assessed and the correlation between these two variables was measured. The results showed a significant underutilization of geospatial datasets and a generally poor quality of their metadata. In addition, only a weak correlation was found between use and metadata quality, not enough to assert with certainty that the latter is a determining factor of the former.
The dataset consists of six zipped CSV files, containing the collected datasets' usage data, full metadata, and computed quality values, for about 160,000 geospatial datasets belonging to the three national and three international portals considered in the study, i.e. US (catalog.data.gov), Colombia (datos.gov.co), Ireland (data.gov.ie), HDX (data.humdata.org), EUODP (data.europa.eu), and NASA (data.nasa.gov).
Data collection occurred in the period: 2019-12-19 -- 2019-12-23.
The header for each CSV file is:
[ ,portalid,id,downloaddate,metadata,overallq,qvalues,assessdate,dviews,downloads,engine,admindomain]
where, for each row (one dataset from a portal), the fields are defined as follows:
portalid: portal identifier
id: dataset identifier
downloaddate: date of data collection
metadata: the overall dataset's metadata downloaded via API from the portal according to the supporting platform schema
overallq: overall quality values computed by applying the methodology presented in [1]
qvalues: json object containing the quality values computed for the 17 metrics presented in [1]
assessdate: date of quality assessment
dviews: number of total views for the dataset
downloads: number of total downloads for the dataset (made available only by the Colombia, HDX, and NASA portals)
engine: identifier of the supporting portal platform: 1 (CKAN), 2 (Socrata)
admindomain: 1 (national), 2 (international)
[1] Neumaier, S.; Umbrich, J.; Polleres, A. Automated Quality Assessment of Metadata Across Open Data Portals. J. Data and Information Quality 2016, 8, 2:1–2:29. doi:10.1145/2964909
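
The field list above can be turned into a small reader. This is a minimal sketch using only the Python standard library, under several assumptions: `overallq` is a single number, `dviews` and `downloads` are integer strings, `downloads` is blank for portals that do not publish it, and the unnamed first column is a row index. The sample row (including the metric name `metric_a`) is made up for illustration and is not taken from the dataset.

```python
import csv
import io
import json

# Codes as documented in the field list.
ENGINES = {"1": "CKAN", "2": "Socrata"}
DOMAINS = {"1": "national", "2": "international"}

HEADER = ["", "portalid", "id", "downloaddate", "metadata", "overallq",
          "qvalues", "assessdate", "dviews", "downloads", "engine",
          "admindomain"]


def read_rows(handle):
    """Yield one dict per dataset row, decoding the coded and JSON fields."""
    for row in csv.DictReader(handle):
        yield {
            "portalid": row["portalid"],
            "id": row["id"],
            "overallq": float(row["overallq"]),
            # qvalues is documented as a JSON object of per-metric scores.
            "qvalues": json.loads(row["qvalues"]),
            "dviews": int(row["dviews"]),
            # Only some portals publish downloads, so tolerate blanks.
            "downloads": int(row["downloads"]) if row["downloads"] else None,
            "engine": ENGINES.get(row["engine"], row["engine"]),
            "admindomain": DOMAINS.get(row["admindomain"], row["admindomain"]),
        }


# Build a one-row in-memory file with made-up values to exercise the reader.
sample = io.StringIO()
writer = csv.writer(sample)
writer.writerow(HEADER)
writer.writerow(["0", "demo-portal", "demo-id", "2019-12-19", "{}", "0.5",
                 json.dumps({"metric_a": 0.5}), "2019-12-20", "10", "",
                 "1", "1"])
sample.seek(0)

rows = list(read_rows(sample))
print(rows[0]["engine"], rows[0]["admindomain"], rows[0]["qvalues"])
```

With the real files, the `metadata` column may hold very large strings, so `csv.field_size_limit` may need to be raised before reading.
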
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
giswqs/geospatial dataset hosted on Hugging Face and contributed by the HF Datasets community
The National Aggregates of Geospatial Data Collection: Population, Landscape, And Climate Estimates, Version 3 (PLACE III) data set contains estimates of national-level aggregations in urban, rural, and total designations of territorial extent and population size by biome, climate zone, coastal proximity zone, elevation zone, and population density zone, for 232 statistical areas (countries and other UN recognized territories). This data set is produced by the Columbia University Center for International Earth Science Information Network (CIESIN).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
MCGD_Data_V2.2 contains all the data that we have collected on locations in modern China, plus a number of locations outside of China that we encounter frequently in historical sources on China. All further updates will appear under the name "MCGD_Data" with a time stamp (e.g., MCGD_Data2023-06-21)
This dataset, along with all the other datasets that ENP-China makes available, can also be accessed on GitLab: https://gitlab.com/enpchina/IndexesEnp
Altogether there are 464,970 entries. The data include the name of locations and their variants in Chinese, pinyin, and any recorded transliteration; the name of the province in Chinese and in pinyin; Province ID; the latitude and longitude; the Name ID and Location ID, and NameID_Legacy. The Name IDs all start with H followed by seven digits. This is the internal ID system of MCGD (the NameID_Legacy column records the Name IDs in their original format depending on the source). Location IDs that start with "DH" are data points extracted from China Historical GIS (Harvard University); those that start with "D" are locations extracted from the data points in Geonames; those that have only digits (8 digits) are data points we have added from various map sources.
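
The ID conventions described above lend themselves to a quick check. The sketch below is an illustration rather than part of the MCGD tooling: the function names are hypothetical and the example IDs are invented. The one detail that matters is testing the "DH" prefix before the "D" prefix, since every "DH" ID also starts with "D".

```python
import re

# Name IDs: the letter H followed by exactly seven digits.
NAME_ID = re.compile(r"H\d{7}")
# Location IDs added from map sources: exactly eight digits.
MAP_SOURCE_ID = re.compile(r"\d{8}")


def is_valid_name_id(name_id):
    """Return True when the string follows the H + seven digits pattern."""
    return NAME_ID.fullmatch(name_id) is not None


def location_source(location_id):
    """Map a Location ID to the source it was extracted from."""
    # "DH" must be checked first, because those IDs also begin with "D".
    if location_id.startswith("DH"):
        return "China Historical GIS"
    if location_id.startswith("D"):
        return "Geonames"
    if MAP_SOURCE_ID.fullmatch(location_id):
        return "map sources"
    return "unknown"


# Invented IDs, used only to exercise each branch.
for example in ["DH0001", "D0001", "12345678", "X1"]:
    print(example, "->", location_source(example))
```
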
One of the main features of the MCGD Main Dataset is the systematic collection and compilation of place names from non-Chinese language historical sources. Locations were designated in transliteration systems that are hardly comprehensible today, which makes it very difficult to find the actual locations they correspond to. This dataset allows for the conversion from these obsolete transliterations to the current names and geocoordinates.
From June 2021 onward, we have adopted a different file naming system to keep track of versions. From MCGD_Data_V1 we have moved to MCGD_Data_V2. In June 2022, we introduced time stamps, which result in the following naming convention: MCGD_Data_YYYY.MM.DD.
UPDATES
MCGD_Data2025_02_28 includes a major change with the duplication of all the locations listed under Beijing, Shanghai, Tianjin, and Chongqing (北京, 上海, 天津, 重慶) and their listing under the name of the provinces to which they belonged originally, before the creation of the four special municipalities after 1949. This is meant to facilitate the matching of data from historical sources. Each location has a unique NameID. Altogether there are 472,818 entries.
MCGD_Data2025_02_27 includes an update on locations extracted from Minguo zhengfu ge yuanhui keyuan yishang zhiyuanlu 國民政府各院部會科員以上職員錄 (Directory of staff members and above in the ministries and committees of the National Government). Nanjing: Guomin zhengfu wenguanchu yinzhuju 國民政府文官處印鑄局, 1944. We also made corrections in the Prov_Py and Prov_Zh columns as there were some misalignments between the pinyin name and the name in Chinese characters. The file now includes 465,128 entries.
MCGD_Data2024_03_23 includes an update on locations in Taiwan from the Asia Directories. Altogether there are 465,603 entries (of which 187 place names without geocoordinates, labelled in the Lat Long columns as "Unknown").
MCGD_Data2023.12.22 contains all the data that we have collected on locations in China, whatever the period. Altogether there are 465,603 entries (of which 187 place names without geocoordinates, labelled in the Lat Long columns as "Unknown"). The dataset also includes locations outside of China for the purpose of matching such locations to the place names extracted from historical sources. For example, one may need to locate individuals born outside of China. Rather than maintaining two separate files, we made the decision to incorporate all the place names found in historical sources in the gazetteer. Such place names can easily be removed by selecting all the entries where the 'Province' data is missing.
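
The filtering step mentioned above (dropping entries whose province is missing) can be sketched as follows. The column names `Prov_Py`, `Lat` and `Long` are assumptions based on the column names mentioned in the update notes, `Name_Py` is hypothetical, and the three sample rows are placeholders rather than real gazetteer entries.

```python
import csv
import io


def split_gazetteer(rows, province_column="Prov_Py"):
    """Separate entries with a province from those without one."""
    with_province, without_province = [], []
    for row in rows:
        target = with_province if row[province_column].strip() else without_province
        target.append(row)
    return with_province, without_province


def has_coordinates(row):
    """Entries without geocoordinates carry the label "Unknown"."""
    return row["Lat"] != "Unknown" and row["Long"] != "Unknown"


# Placeholder rows: one ordinary entry, one without a province,
# and one without coordinates.
SAMPLE = """Name_Py,Prov_Py,Lat,Long
Place A,Province X,31.2,121.5
Place B,,48.9,2.3
Place C,Province Y,Unknown,Unknown
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
inside, outside = split_gazetteer(rows)
mappable = [row for row in inside if has_coordinates(row)]
print(len(inside), len(outside), len(mappable))
```

Keeping the entries without a province in a separate list, rather than discarding them, preserves the foreign place names for matching against historical sources.
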
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
In this course, you will learn to work within the free and open-source R environment with a specific focus on working with and analyzing geospatial data. We will cover a wide variety of data and spatial data analytics topics, and you will learn how to code in R along the way. The Introduction module provides more background info about the course and course set up.
This course is designed for someone with some prior GIS knowledge. For example, you should know the basics of working with maps, map projections, and vector and raster data. You should be able to perform common spatial analysis tasks and make map layouts. If you do not have a GIS background, we would recommend checking out the West Virginia View GIScience class. We do not assume that you have any prior experience with R or with coding. So, don't worry if you haven't developed these skill sets yet. That is a major goal in this course.
Background material will be provided using code examples, videos, and presentations. We have provided assignments to offer hands-on learning opportunities. Data links for the lecture modules are provided within each module while data for the assignments are linked to the assignment buttons below. Please see the sequencing document for our suggested order in which to work through the material.
After completing this course you will be able to:
prepare, manipulate, query, and generally work with data in R.
perform data summarization, comparisons, and statistical tests.
create quality graphs, map layouts, and interactive web maps to visualize data and findings.
present your research, methods, results, and code as web pages to foster reproducible research.
work with spatial data in R.
analyze vector and raster geospatial data to answer a question with a spatial component.
make spatial models and predictions using regression and machine learning.
code in the R language at an intermediate level.
U.S. Government Works https://www.usa.gov/government-works
License information was derived automatically
The Pacific Southwest Region has geospatial datasets available for download from this website. These datasets are zipped personal or file geodatabases created using ESRI ArcGIS 10.0 software. Additional descriptive information, as well as data steward contact information, for each geodatabase can be found under the metadata link.
State Level Datasets: Existing Vegetation, Fire History, Fire Return Interval Departure, Direct Protection Areas, and other California extent datasets.
Region Level Datasets: Forest Activities (FACTS), Vegetation Burn Severity, Allotments and other Regional extent datasets.
Forest Planning & Monitoring Datasets: Land Management Plans, including the Draft Early Adopters (Inyo, Sierra and Sequoia National Forests).
Forest Datasets: Transportation and land suitability class data are available.
Resources in this dataset:
Resource Title: Pacific Southwest Region Geospatial Data. File Name: Web Page, url: https://www.fs.usda.gov/main/r5/landmanagement/gis
Freeware for decompressing (unzipping) the geodatabases, such as 7-Zip, and for viewing the geospatial data, such as ArcGIS Explorer Desktop, can be found with a search engine.
Resource Software Recommended: 7-Zip, url: http://www.7-zip.org/
Resource Software Recommended: ArcGIS Explorer Desktop, url: http://www.esri.com/software/arcgis/explorer/index.html
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Dataset Information
Note that this is the full week of data that was sampled from Twitter. The 10,005,301 count mentioned in the introductory paper below refers to the weekday portion of the data (i.e., Monday through Friday). If you remove Saturday (Jan 12, 2013) and Sunday (Jan 13, 2013), then you will get the Monday through Friday portion that was analyzed in the paper.
Has Missing Values? No
Dataset Characteristics: Multivariate, Time-Series, Spatiotemporal
Subject Area: Social Science
Associated Tasks: Classification, Regression, Clustering
Variable Information
This dataset contains geospatial and timestamp data for one week's worth of Tweets in the contiguous United States. The Tweets were created between January 12, 2013 and January 18, 2013. The exact location (i.e., longitude and latitude) and timestamp (hour, minute, second) of each Tweet was recorded. All timestamps are reported in central standard time in the format "YYYY-MM-DD HH:MM:SS". The geo-tag information was used to assign each Tweet to one of the four standard time zones (for details see Helwig et al., 2015). The data were collected by the CyberGIS Center for Advanced Digital and Spatial Studies at the University of Illinois at Urbana-Champaign. Details on the data preprocessing and analysis can be found in Helwig et al. (2015).
Class Labels
1. longitude: exact longitude coordinate of Tweet (real valued)
2. latitude: exact latitude coordinate of Tweet (real valued)
3. timestamp: 20130112000000 = 2013-01-12 00:00:00 CST (integer)
4. timezone: 1 = Eastern, 2 = Central, 3 = Mountain, 4 = Pacific (integer)
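
The integer timestamp encoding and the time zone codes above can be decoded with the standard library. The sketch below is one possible approach, assuming the timestamp is stored as a 14-digit integer as in the example value; the function names and the two sample records are invented for illustration. It also shows the weekday filter that recovers the Monday through Friday portion.

```python
from datetime import datetime

# Codes from the variable list.
TIMEZONES = {1: "Eastern", 2: "Central", 3: "Mountain", 4: "Pacific"}


def parse_timestamp(value):
    """Turn an integer such as 20130112000000 into a naive datetime.

    The result is the Central Standard Time wall clock; no tzinfo is attached.
    """
    return datetime.strptime(str(value), "%Y%m%d%H%M%S")


def is_weekday(value):
    """True for Monday through Friday (weekday() numbers Monday as 0)."""
    return parse_timestamp(value).weekday() < 5


# Invented records: (longitude, latitude, timestamp, timezone code).
records = [
    (-88.2, 40.1, 20130112000000, 2),  # Saturday
    (-74.0, 40.7, 20130114093000, 1),  # Monday
]

weekday_records = [r for r in records if is_weekday(r[2])]
for lon, lat, stamp, zone in weekday_records:
    print(lon, lat, parse_timestamp(stamp).isoformat(), TIMEZONES[zone])
```
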
The U.S. Geological Survey developed this dataset as part of the Colorado Front Range Infrastructure Resources Project (FRIRP). One goal of the FRIRP was to provide information on the availability of those hydrogeologic resources that are either critical to maintaining infrastructure along the northern Front Range or that may become less available because of urban expansion in the northern Front Range. This dataset extends from the Boulder-Jefferson County line on the south, to the middle of Larimer and Weld Counties on the North. On the west, this dataset is bounded by the approximate mountain front of the Front Range of the Rocky Mountains; on the east, by an arbitrary north-south line extending through a point about 6.5 kilometers east of Greeley. This digital geospatial dataset consists of depth-to-water (unsaturated-thickness) contours that were generated from hydrogeologic data with Geographic Information System (GIS) software.
U.S. Government Works https://www.usa.gov/government-works
License information was derived automatically
The Southwestern Region is 20.6 million acres. There are six national forests in Arizona, five national forests and a national grassland in New Mexico, and one national grassland each in Oklahoma and the Texas panhandle. The region ranges in elevation from 1,600 feet above sea level, with annual rainfall of 8 inches, in Arizona's lower Sonoran Desert to 13,171-foot-high Wheeler Peak, with over 35 inches of precipitation a year, in northern New Mexico.
Geographic Information Systems, or GIS, are computer systems, software and data used to analyze and display spatial or locational data about surface features. One of the strengths of GIS is the capability to overlay or compare multiple feature layers. A user can then analyze the relationship between the layers. Data, reports and maps produced through GIS are used by managers and resource specialists to make decisions about land management activities on National Forests. The National Forests of the Southwestern Region maintain and utilize GIS data for various features on the ground. Some of these datasets are made available for download through this page.
Resources in this dataset:
Resource Title: GIS Datasets. File Name: Web Page, url: https://www.fs.usda.gov/detail/r3/landmanagement/gis/?cid=STELPRDB5202474 Selected GIS datasets for the Southwestern Region are available for download from this page.
Resource Software Recommended: ArcExplorer, url: http://www.esri.com/software/arcexplorer/index.html
This dataset was created by BBHawa