The MAR Web Geocoder is a web browser-based tool for geocoding locations, typically addresses, in Washington, DC. It is developed by the Office of the Chief Technology Officer (OCTO) and accepts Excel or CSV files as input, producing an Excel file as output. Geocoding is the process of assigning a location in the form of geographic coordinates (often expressed as latitude and longitude) to spreadsheet data. This is done by comparing the descriptive geographic data to known geographic locations such as addresses, blocks, intersections, or place names.
Created for use in Region 9
The resource-location extension for CKAN enhances data resources by automatically adding latitude and longitude coordinates to CSV files containing address data, using provided address, city and zipcode columns. This simplifies geocoding and location-based analysis directly within CKAN. The extension requires CKAN version 2.7.2 or higher.
Key Features:
- Automated Geocoding: Automatically converts address data within CSV files into latitude and longitude coordinates during resource upload.
- Address Field Configuration: Allows users to specify the CSV column numbers corresponding to address, city, and zipcode fields.
- Coordinate Appending: Adds new columns to the CSV file containing the calculated latitude and longitude coordinates, preserving the original data.
- CSV Processing during Upload: The geocoding process is integrated directly into the resource upload workflow.
- Language Management: Offers translation support and instructions for adding new translations.
How It Works: During CSV resource upload, the user is prompted to input the column numbers corresponding to the address, city, and zipcode. Upon submission of the upload form, the extension processes the file, geocodes the addresses using these column values, and appends latitude and longitude as new columns to the CSV. This modified CSV file, now containing geographic coordinates, is stored as the resource.
Benefits & Impact: By automatically adding geographic coordinates, the resource-location extension simplifies tasks such as mapping and spatial analysis of tabular data. This automated geocoding process enhances the usability and value of address-based datasets within CKAN.
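The extension's own implementation is not reproduced here; the following is a minimal standalone sketch of the same idea, assuming the geocoder Python library (used elsewhere on this page), a hypothetical input file name, and hypothetical zero-based column indices for the address, city, and zipcode fields.

import csv

import geocoder

ADDRESS_COL, CITY_COL, ZIP_COL = 0, 1, 2  # hypothetical column indices supplied by the user

with open('resource.csv', newline='') as src, open('resource_geocoded.csv', 'w', newline='') as dst:
    reader = csv.reader(src)
    writer = csv.writer(dst)
    header = next(reader)
    writer.writerow(header + ['latitude', 'longitude'])  # keep original columns, append coordinates
    for row in reader:
        address = ', '.join([row[ADDRESS_COL], row[CITY_COL], row[ZIP_COL]])
        g = geocoder.osm(address)  # any supported geocoding provider could be substituted
        lat, lng = (g.latlng if g.latlng else ('', ''))
        writer.writerow(row + [lat, lng])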
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
The Data Set
File: ufo-complete-geocoded-time-normalized.csv
The complete original data set, containing both resolved and unresolved locations, with normalized time both converted and not converted to seconds. 88874 total records; 724 locations not found or blank (0.8146%); 7131 records with erroneous or blank time (8.0237%).
File: ufo-scrubbed-geocode-time-normalized.csv
Scrubbed data set with only non-zero resolved locations and >0 normalized time. 81185 total records, 0 locations not found, 0 erroneous time or blank records.
https://creativecommons.org/publicdomain/zero/1.0/
Created for use in the Renewable and Appropriate Energy Lab at UC Berkeley and Lawrence Berkeley National Laboratory.
Geography: All 58 Counties of the American State of California
Time period: 2015
Unit of analysis: Tons per year
Variables:
Sources: All columns except for lat and lon were scraped from the California Air Resources Board Facility Search Tool using the request module from Python's urllib library. The script used is included below (download.py) in case you would like to retrieve additional columns.
The lat and lon columns were geocoded using the Geocoder library for Python with the Bing provider.
download.py
import pandas as pd

out_dir = 'ARB/'
file_ext = '.csv'

# Loop over the 58 California county codes used by the ARB Facility Search Tool
for i in range(1, 59):
    facilities = pd.read_csv("https://www.arb.ca.gov/app/emsinv/facinfo/faccrit_output.csv?&dbyr=2015&ab_=&dis_=&co_=" + str(i) + "&fname_=&city_=&sort=FacilityNameA&fzip_=&fsic_=&facid_=&all_fac=C&chapis_only=&CERR=&dd=")
    for index, row in facilities.iterrows():
        # Fetch the per-facility pollutant breakdown for 2015
        curr_facility = pd.read_csv("https://www.arb.ca.gov/app/emsinv/facinfo/facdet_output.csv?&dbyr=2015&ab_=" + str(row['AB']) + "&dis_=" + str(row['DIS']) + "&co_=" + str(row['CO']) + "&fname_=&city_=&sort=C&fzip_=&fsic_=&facid_=" + str(row['FACID']) + "&all_fac=&chapis_only=&CERR=&dd=")
        # Keep the PM2.5 emissions (tons/year) if the facility reports that pollutant
        pm25 = curr_facility.loc[curr_facility['POLLUTANT NAME'] == 'PM2.5']
        if not pm25.empty:
            facilities.at[index, 'PM2.5T'] = pm25.iloc[0]['EMISSIONS_TONS_YR']
    facilities.to_csv(out_dir + str(i) + file_ext)
geocode.py
import csv

import geocoder

directory = 'ARB/'
outdirectory = 'ARB_OUT/'

for i in range(1, 59):
    # newline='' is required when using the csv module in text mode
    with open(directory + str(i) + ".csv", newline='') as csvfile, open(outdirectory + str(i) + '.csv', 'a', newline='') as csvout:
        reader = csv.DictReader(csvfile)
        fieldnames = reader.fieldnames + ['lat', 'lon']  # Add new columns
        writer = csv.DictWriter(csvout, fieldnames)
        writer.writeheader()
        for row in reader:
            address = row['FSTREET'] + ', ' + row['FCITY'] + ', California ' + row['FZIP']
            g = geocoder.bing(address, key='API_KEY')
            newrow = dict(row)
            if g.latlng:
                newrow['lat'] = g.json['lat']
                newrow['lon'] = g.json['lng']
                writer.writerow(newrow)  # Only write the row if it was successfully geocoded
Senior_Centers_csv_Geocoded
https://creativecommons.org/publicdomain/zero/1.0/
This dataset provides rich detail on all operational Target locations as of December 2023. Dive into comprehensive columns featuring address, latitude/longitude coordinates, store opening dates, last remodel dates, capabilities, and various other intriguing data points.
About - Target Corporation is an American retail corporation that operates a chain of discount department stores and hypermarkets, headquartered in Minneapolis, Minnesota.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Geoscape G-NAF is the geocoded address database for Australian businesses and governments. It’s the trusted source of geocoded address data for Australia with over 50 million contributed addresses distilled into 15.4 million G-NAF addresses. It is built and maintained by Geoscape Australia using independently examined and validated government data.
From 22 August 2022, Geoscape Australia is making G-NAF available in an additional simplified table format. G-NAF Core makes accessing geocoded addresses easier by requiring less technical effort.
G-NAF Core will be updated on a quarterly basis along with G-NAF.
Further information about contributors to G-NAF is available here.
With more than 15 million Australian physical address records, G-NAF is one of the most ubiquitous and powerful spatial datasets. The records include geocodes, which are latitude and longitude map coordinates. G-NAF does not contain personal information or details relating to individuals.
Updated versions of G-NAF are published on a quarterly basis. Previous versions are available here
Users have the option to download datasets with feature coordinates referencing either GDA94 or GDA2020 datums.
Changes in the November 2025 release
Nationally, the November 2025 update of G-NAF shows an increase of 32,773 addresses overall (0.21%). The total number of addresses in G-NAF now stands at 15,827,416 of which 14,983,358 or 94.67% are principal.
There is one new locality for the November 2025 Release of G-NAF, the locality of Southwark in South Australia.
Geoscape has moved product descriptions, guides and reports online to https://docs.geoscape.com.au.
Further information on G-NAF, including FAQs on the data, is available here or through Geoscape Australia’s network of partners. They provide a range of commercial products based on G-NAF, including software solutions, consultancy and support.
Additional information: On 1 October 2020, PSMA Australia Limited began trading as Geoscape Australia.
Use of the G-NAF downloaded from data.gov.au is subject to the End User Licence Agreement (EULA)
The EULA terms are based on the Creative Commons Attribution 4.0 International license (CC BY 4.0). However, an important restriction relating to the use of the open G-NAF for the sending of mail has been added.
The open G-NAF data must not be used for the generation of an address or the compilation of an address for the sending of mail unless the user has verified that each address to be used for the sending of mail is capable of receiving mail by reference to a secondary source of information. Further information on this use restriction is available here.
End users must only use the data in ways that are consistent with the Australian Privacy Principles issued under the Privacy Act 1988 (Cth).
Users must also note the following attribution requirements:
Preferred attribution for the Licensed Material:
G-NAF © Geoscape Australia licensed by the Commonwealth of Australia under the Open Geo-coded National Address File (G-NAF) End User Licence Agreement.
Preferred attribution for Adapted Material:
Incorporates or developed using G-NAF © Geoscape Australia licensed by the Commonwealth of Australia under the Open Geo-coded National Address File (G-NAF) End User Licence Agreement.
G-NAF is a complex and large dataset (approximately 5GB unpacked), consisting of multiple tables that will need to be joined prior to use. The dataset is primarily designed for application developers and large-scale spatial integration. Users are advised to read the technical documentation, including product change notices and the individual product descriptions before downloading and using the product. A quick reference guide on unpacking the G-NAF is also available.
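As a rough illustration of the kind of join involved (a sketch, not official Geoscape documentation), the snippet below assumes the standard G-NAF pipe-separated tables ADDRESS_DETAIL and ADDRESS_DEFAULT_GEOCODE and joins them on ADDRESS_DETAIL_PID to attach latitude and longitude to each address record; the file paths are placeholders, and real releases split these tables per state and territory.

import pandas as pd

# Placeholder paths; G-NAF files are pipe-separated (.psv)
address_detail = pd.read_csv('ADDRESS_DETAIL.psv', sep='|', low_memory=False)
default_geocode = pd.read_csv('ADDRESS_DEFAULT_GEOCODE.psv', sep='|')

# Attach the default geocode (latitude/longitude) to each address record
addresses = address_detail.merge(
    default_geocode[['ADDRESS_DETAIL_PID', 'LATITUDE', 'LONGITUDE']],
    on='ADDRESS_DETAIL_PID',
    how='left',
)
print(addresses.head())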
https://creativecommons.org/publicdomain/zero/1.0/
Address geocoding, or simply geocoding, is the process of taking a text-based description of a location, such as an address or the name of a place, and returning geographic coordinates, frequently a latitude/longitude pair, to identify a location on the Earth's surface (Wikipedia).
What is meant by geocoding in GIS? Geocoding is typically preceded by a data-cleaning step of preprocessing and standardizing the format of the data. It is a crucial part of developing a GIS (Geographic Information System).
This dataset comes with three files of the same content - text, CSV, and JSON for ease of use.
Each address has 4 components:
- address string
- city
- state
- zipcode
Example - "777 Brockton Avenue, Abington MA 2351"
Address Geocoding Solutions (Coordinates From Text)
The dataset was collected from this GitHub gist : https://gist.github.com/HeroicEric/1102788
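As a quick illustration (not part of the dataset itself), an address string like the example above can be geocoded with the geocoder library used elsewhere on this page; the OSM/Nominatim provider here is just one possible choice.

import geocoder

# Geocode the example address string from the dataset description
g = geocoder.osm("777 Brockton Avenue, Abington MA 2351")
if g.latlng:
    lat, lng = g.latlng
    print(lat, lng)
else:
    print("Address could not be geocoded")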
Offices_on_Aging_csv_Geocoded
Mapping incident locations from a CSV file in a web map (YouTube video).
A comprehensive self-hosted geospatial database of international address data, including street names, coordinates, and address data ranges for Enterprise use. The address data are georeferenced with industry-standard WGS84 coordinates (geocoding).
All address data are provided in the official local languages. Names and other data in non-Roman languages are also made available in English through translations and transliterations.
Use cases for the Global Address Database (Geospatial data/Map Data)
Address Data Enrichment
Address capture and validation
Parcel delivery
Master Data Management
Logistics and Shipping
Sales and Marketing
Product Features
Fully and accurately geocoded
Multi-language support
Address ranges for streets covered by several zip codes
Comprehensive city definitions across countries
Administrative areas with a level range of 0-4
International Address Formats
For additional insights, you can combine the map data with:
UNLOCODE and IATA codes (geocoded)
Time zones and Daylight Saving Time (DST)
Population data: Past and future trends
Data export methodology
Our address data enrichment packages are offered in CSV format. All international address data are optimized for seamless integration with popular systems like Esri ArcGIS, Snowflake, QGIS, and more.
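As a hedged example of such an integration (not vendor code), a CSV export with coordinate columns can be turned into a GeoDataFrame and saved in a format that opens directly in QGIS or ArcGIS; the file name and the 'latitude'/'longitude' column names below are hypothetical.

import geopandas as gpd
import pandas as pd

# Hypothetical CSV export with 'latitude' and 'longitude' columns
df = pd.read_csv('global_addresses.csv')
gdf = gpd.GeoDataFrame(
    df,
    geometry=gpd.points_from_xy(df['longitude'], df['latitude']),
    crs='EPSG:4326',  # WGS84, matching the geocoding described above
)
gdf.to_file('global_addresses.gpkg', driver='GPKG')  # GeoPackage loads directly in QGIS/ArcGIS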
Why companies choose our location databases
Enterprise-grade service
Reduce integration time and cost by 30%
Frequent, consistent updates for the highest quality
Note: Custom international address data packages are available. Please submit a request via the above contact button for more details.
This repository contains the scripts required to implement the Wikidata-based geocoding pipeline described in the accompanying paper.
- geocode.sh: Shell script for setting up and executing Stanford CoreNLP with the required language models and entitylink annotator. Automates preprocessing, named entity recognition (NER), and wikification across a directory of plain-text (.txt) files. Configured for both local execution and high-performance computing (HPC) environments.
- geocode.py: Python script that processes the list of extracted location entities (entities.txt) and retrieves latitude/longitude coordinates from Wikidata using Pywikibot. Handles redirects, missing pages, and missing coordinate values, returning standardized placeholder codes where necessary. Outputs results as a CSV file with columns for place name, latitude, longitude, and source file.
- geocode.sbatch: Optional SLURM submission script for running run_corenlp.sh on HPC clusters. Includes configurable resource requests for scalable processing of large corpora.
- README.md: Detailed README file including a line-by-line explanation of the geocode.sh file.
Together, these files provide a reproducible workflow for geocoding textual corpora via wikification, suitable for projects ranging from small-scale literary analysis to large-scale archival datasets.
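The repository's geocode.py is not reproduced here; the following is a minimal sketch of the core Pywikibot lookup it describes, assuming the entity has already been resolved to a Wikidata item ID (the actual pipeline resolves entity names via wikification first and adds its own placeholder codes and error handling).

import pywikibot

site = pywikibot.Site('wikidata', 'wikidata')
repo = site.data_repository()

def coordinates_for(qid):
    """Return (lat, lon) for a Wikidata item, or None if no coordinate claim exists."""
    item = pywikibot.ItemPage(repo, qid)
    if item.isRedirectPage():          # follow redirects
        item = item.getRedirectTarget()
    item.get()
    claims = item.claims.get('P625')   # P625 = coordinate location
    if not claims:
        return None                    # item has no coordinate value
    coord = claims[0].getTarget()
    return coord.lat, coord.lon

print(coordinates_for('Q1492'))  # e.g. the item for Barcelona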
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset contains key characteristics about the data described in the Data Descriptor Geocoding of worldwide patent data. Contents:
1. human readable metadata summary table in CSV format
2. machine readable metadata file in JSON format
Versioning Note: Version 2 was generated when the metadata format was updated from JSON to JSON-LD. This was an automatic process that changed only the format, not the contents, of the metadata.
By GetTheData [source]
This Open Postcode Geo dataset contains a wealth of information about UK postcodes. For each postcode, there are several geospace attributes you can use to refine your analysis such as latitude, longitude, easting and northing. Moreover, the positional quality indicator provides a range of accuracy levels for each geospace attribute.
In addition to positioning data, this dataset has been derived from the Office for National Statistics' Postcode Directory which gives users extra insights into postcodes such as postcode areas, districts and sectors — enabling them to accurately group records into geographic hierarchies: perfect for mapping applications and statistical analysis!
And with data coming from multiple sources (Crown Copyright & Database Right 2016, Royal Mail Copyright & Database Right 2016, and ONS), users can be assured that Open Postcode Geo provides accurate and up-to-date results covering terminated postcodes as well as small user postcodes. All data are released under the UK Government's Open Government Licence v3, with attribution required per the ONS licensing information. You now have access to powerful spatial information about the United Kingdom, helping you gain insight in record time.
For more datasets, click here.
- Use this dataset to combine with other datasets to accurately geocode addresses in a variety of formats, such as full postcodes or postcodes with only one digit missing.
- Utilise the different hierarchy levels including postcode area, district and sector for data visualization and analysis on census data collected by specific area in the UK.
- Feed this dataset into a route optimization algorithm so delivery routes can be quickly optimized between different locations using accurate lat-long coordinates from each address
If you use this dataset in your research, please credit the original authors. Data Source
See the dataset description for more information.
File: open_postcode_geo.csv

| Column name | Description |
|:------------|:-------------------------------|
| AB1 0AA | Postcode (String) |
| terminated | Terminated postcode (String) |
| small | Small postcode (String) |
| 385386 | Easting coordinate (Integer) |
| 801193 | Northing coordinate (Integer) |
| Scotland | Country name (String) |
| 57.101474 | Latitude coordinate (Float) |
| -2.242851 | Longitude coordinate (Float) |
| AB10AA | Postcode area (String) |
| AB1 0AA.1 | Postcode district (String) |
| AB1 0AA | Postcode sector (String) |
| AB1.1 | Postcode area (String) |
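As a rough usage sketch (not from the dataset authors), a table of UK addresses can be joined to open_postcode_geo.csv on the postcode to pick up latitude and longitude; the column names and the my_addresses.csv file below are hypothetical and would need to match the actual headers.

import pandas as pd

# Hypothetical column names; adjust to the actual header of open_postcode_geo.csv
postcodes = pd.read_csv('open_postcode_geo.csv',
                        usecols=['postcode', 'latitude', 'longitude'])
addresses = pd.read_csv('my_addresses.csv')  # must contain a 'postcode' column

# Attach coordinates to each address via its postcode
geocoded = addresses.merge(postcodes, on='postcode', how='left')
geocoded.to_csv('my_addresses_geocoded.csv', index=False)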
If you use this dataset in your research, please credit the original authors, GetTheData.
A comprehensive self-hosted geospatial database of street names, coordinates, and address data ranges for Enterprise use. The address data are georeferenced with industry-standard WGS84 coordinates (geocoding).
All geospatial data are provided in the official local languages. Names and other data in non-Roman languages are also made available in English through translations and transliterations.
Use cases for the Global Address Database (Geospatial data)
Address capture and validation
Parcel delivery
Master Data Management
Logistics and Shipping
Sales and Marketing
Additional features
Fully and accurately geocoded
Multi-language support
Address ranges for streets covered by several zip codes
Comprehensive city definitions across countries
Administrative areas with a level range of 0-4
International Address Formats
For additional insights, you can combine the map data with:
UNLOCODE and IATA codes (geocoded)
Time zones and Daylight Saving Time (DST)
Population data: Past and future trends
Data export methodology
Our location data packages are offered in CSV format. All geospatial data are optimized for seamless integration with popular systems like Esri ArcGIS, Snowflake, QGIS, and more.
Why companies choose our location databases
Enterprise-grade service
Reduce integration time and cost by 30%
Frequent, consistent updates for the highest quality
Note: Custom geospatial data packages are available. Please submit a request via the above contact button for more details.
https://creativecommons.org/publicdomain/zero/1.0/
The Bigfoot Field Researchers Organization (BFRO) - www.bfro.net - is an organization dedicated to investigating the bigfoot / sasquatch mystery. This dataset contains sighting data publicly available on the BFRO website in a more digestible form.
There are three files:
The original data can be found here
Photo by Jon Sailer on Unsplash
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Social network services such as Twitter are important venues that can be used as rich data sources to mine public opinions about various topics. In this study, we used Twitter to collect data on one of the fastest-growing theories in education, Self-Regulated Learning (SRL), and carried out further analysis to investigate what Twitter says about SRL. This work uses three main analysis methods: descriptive analysis, topic modeling, and geocoding analysis. The collected dataset consists of 54,070 relevant SRL tweets posted between 2011 and 2021. The descriptive analysis uncovers a growing discussion of SRL on Twitter from 2011 to 2018, followed by a marked decrease until the collection date. For topic modeling, the text mining technique of Latent Dirichlet Allocation (LDA) was applied and revealed insights into the computationally derived topics. Finally, the geocoding analysis uncovers a diverse community from all over the world, yet a higher-density representation of users from the Global North was identified. Further implications are discussed in the paper.
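The paper's own pipeline is not shown here; the snippet below is a generic sketch of LDA topic modeling with gensim on a small list of already-tokenized tweets, only to illustrate the technique named above (the toy documents are invented).

from gensim import corpora, models

# tokenized_tweets: list of token lists, e.g. [['self', 'regulated', 'learning'], ...]
tokenized_tweets = [
    ['self', 'regulated', 'learning', 'strategies'],
    ['students', 'motivation', 'srl', 'classroom'],
]

dictionary = corpora.Dictionary(tokenized_tweets)                # map tokens to integer ids
corpus = [dictionary.doc2bow(doc) for doc in tokenized_tweets]   # bag-of-words vectors
lda = models.LdaModel(corpus, num_topics=2, id2word=dictionary, passes=10)

for topic_id, words in lda.print_topics():
    print(topic_id, words)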
This geocoded dataset represents all natural disaster records in EM-DAT's database for the Philippines between 1980 and 2012. The "disasters.csv" file contains 421 records pertaining to individual disasters, and the "locations.csv" file contains 1815 location records that relate (many to one) to corresponding records in the "disasters.csv" file.
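As an illustrative sketch (the actual key should be checked against the files), the many-to-one relation between locations.csv and disasters.csv can be resolved with a pandas merge; the 'disaster_id' key column name below is hypothetical.

import pandas as pd

disasters = pd.read_csv('disasters.csv')   # 421 disaster records
locations = pd.read_csv('locations.csv')   # 1815 geocoded location records

# Many locations map to one disaster; 'disaster_id' stands in for the real key column
merged = locations.merge(disasters, on='disaster_id', how='left')
print(merged.head())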
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset contains key characteristics about the data described in the Data Descriptor GDIS, a global dataset of geocoded disaster locations. Contents:
1. human readable metadata summary table in CSV format
2. machine readable metadata file in JSON format