100+ datasets found

d
Global Point of Interest (POI) Data | 230M+ Locations, 5000 Categories,...
datarade.ai
.json
Updated Sep 7, 2024
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Xverum (2024). Global Point of Interest (POI) Data | 230M+ Locations, 5000 Categories, Geographic & Location Intelligence, Regular Updates [Dataset]. https://datarade.ai/data-products/global-point-of-interest-poi-data-230m-locations-5000-c-xverum
Explore at:
.jsonAvailable download formats
Dataset updated
Sep 7, 2024
Dataset provided by
Xverum LLC
Authors
Xverum
Area covered
French Polynesia, Mauritania, Andorra, Kyrgyzstan, Vietnam, Costa Rica, Northern Mariana Islands, Bahamas, Antarctica, Guatemala
Description
Xverum’s Point of Interest (POI) Data is a comprehensive dataset containing 230M+ verified locations across 5000 business categories. Our dataset delivers structured geographic data, business attributes, location intelligence, and mapping insights, making it an essential tool for GIS applications, market research, urban planning, and competitive analysis.

With regular updates and continuous POI discovery, Xverum ensures accurate, up-to-date information on businesses, landmarks, retail stores, and more. Delivered in bulk to S3 Bucket and cloud storage, our dataset integrates seamlessly into mapping, geographic information systems, and analytics platforms.

🔥 Key Features:

Extensive POI Coverage: ✅ 230M+ Points of Interest worldwide, covering 5000 business categories. ✅ Includes retail stores, restaurants, corporate offices, landmarks, and service providers.

Geographic & Location Intelligence Data: ✅ Latitude & longitude coordinates for mapping and navigation applications. ✅ Geographic classification, including country, state, city, and postal code. ✅ Business status tracking – Open, temporarily closed, or permanently closed.

Continuous Discovery & Regular Updates: ✅ New POIs continuously added through discovery processes. ✅ Regular updates ensure data accuracy, reflecting new openings and closures.

Rich Business Insights: ✅ Detailed business attributes, including company name, category, and subcategories. ✅ Contact details, including phone number and website (if available). ✅ Consumer review insights, including rating distribution and total number of reviews (additional feature). ✅ Operating hours where available.

Ideal for Mapping & Location Analytics: ✅ Supports geospatial analysis & GIS applications. ✅ Enhances mapping & navigation solutions with structured POI data. ✅ Provides location intelligence for site selection & business expansion strategies.

Bulk Data Delivery (NO API): ✅ Delivered in bulk via S3 Bucket or cloud storage. ✅ Available in structured format (.json) for seamless integration.

🏆Primary Use Cases:

Mapping & Geographic Analysis: 🔹 Power GIS platforms & navigation systems with precise POI data. 🔹 Enhance digital maps with accurate business locations & categories.

Retail Expansion & Market Research: 🔹 Identify key business locations & competitors for market analysis. 🔹 Assess brand presence across different industries & geographies.

Business Intelligence & Competitive Analysis: 🔹 Benchmark competitor locations & regional business density. 🔹 Analyze market trends through POI growth & closure tracking.

Smart City & Urban Planning: 🔹 Support public infrastructure projects with accurate POI data. 🔹 Improve accessibility & zoning decisions for government & businesses.

💡 Why Choose Xverum’s POI Data?

230M+ Verified POI Records – One of the largest & most detailed location datasets available.

Global Coverage – POI data from 249+ countries, covering all major business sectors.

Regular Updates – Ensuring accurate tracking of business openings & closures.

Comprehensive Geographic & Business Data – Coordinates, addresses, categories, and more.

Bulk Dataset Delivery – S3 Bucket & cloud storage delivery for full dataset access.

100% Compliant – Ethically sourced, privacy-compliant data.

Access Xverum’s 230M+ POI dataset for mapping, geographic analysis, and location intelligence. Request a free sample or contact us to customize your dataset today!
🌎 Location Intelligence Data | From Google Map
kaggle.com
zip
Updated Apr 21, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Azhar Saleem (2024). 🌎 Location Intelligence Data | From Google Map [Dataset]. https://www.kaggle.com/datasets/azharsaleem/location-intelligence-data-from-google-map
Explore at:
zip(1911275 bytes)Available download formats
Dataset updated
Apr 21, 2024
Authors
Azhar Saleem
License
Apache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
👨‍💻 Author: Azhar Saleem

"https://github.com/azharsaleem18" target="_blank"> https://img.shields.io/badge/GitHub-Profile-blue?style=for-the-badge&logo=github" alt="GitHub Profile"> "https://www.kaggle.com/azharsaleem" target="_blank"> https://img.shields.io/badge/Kaggle-Profile-blue?style=for-the-badge&logo=kaggle" alt="Kaggle Profile"> "https://www.linkedin.com/in/azhar-saleem/" target="_blank"> https://img.shields.io/badge/LinkedIn-Profile-blue?style=for-the-badge&logo=linkedin" alt="LinkedIn Profile">
"https://www.youtube.com/@AzharSaleem19" target="_blank"> https://img.shields.io/badge/YouTube-Profile-red?style=for-the-badge&logo=youtube" alt="YouTube Profile"> "https://www.facebook.com/azhar.saleem1472/" target="_blank"> https://img.shields.io/badge/Facebook-Profile-blue?style=for-the-badge&logo=facebook" alt="Facebook Profile"> "https://www.tiktok.com/@azhar_saleem18" target="_blank"> https://img.shields.io/badge/TikTok-Profile-blue?style=for-the-badge&logo=tiktok" alt="TikTok Profile">
"https://twitter.com/azhar_saleem18" target="_blank"> https://img.shields.io/badge/Twitter-Profile-blue?style=for-the-badge&logo=twitter" alt="Twitter Profile"> "https://www.instagram.com/azhar_saleem18/" target="_blank"> https://img.shields.io/badge/Instagram-Profile-blue?style=for-the-badge&logo=instagram" alt="Instagram Profile"> "mailto:azharsaleem6@gmail.com"> https://img.shields.io/badge/Email-Contact%20Me-red?style=for-the-badge&logo=gmail" alt="Email Contact">

Dataset Overview

Welcome to the Google Places Comprehensive Business Dataset! This dataset has been meticulously scraped from Google Maps and presents extensive information about businesses across several countries. Each entry in the dataset provides detailed insights into business operations, location specifics, customer interactions, and much more, making it an invaluable resource for data analysts and scientists looking to explore business trends, geographic data analysis, or consumer behaviour patterns.

Key Features

Business Details: Includes unique identifiers, names, and contact information.

Geolocation Data: Precise latitude and longitude for pinpointing business locations on a map.

Operational Timings: Detailed opening and closing hours for each day of the week, allowing analysis of business activity patterns.

Customer Engagement: Data on review counts and ratings, offering insights into customer satisfaction and business popularity.

Additional Attributes: Links to business websites, time zone information, and country-specific details enrich the dataset for comprehensive analysis.

Potential Use Cases

This dataset is ideal for a variety of analytical projects, including: - Market Analysis: Understand business distribution and popularity across different regions. - Customer Sentiment Analysis: Explore relationships between customer ratings and business characteristics. - Temporal Trend Analysis: Analyze patterns of business activity throughout the week. - Geospatial Analysis: Integrate with mapping software to visualise business distribution or cluster businesses based on location.

Dataset Structure

The dataset contains 46 columns, providing a thorough profile for each listed business. Key columns include:

business_id: A unique Google Places identifier for each business, ensuring distinct entries.

phone_number: The contact number associated with the business. It provides a direct means of communication.

name: The official name of the business as listed on Google Maps.

full_address: The complete postal address of the business, including locality and geographic details.

latitude: The geographic latitude coordinate of the business location, useful for mapping and spatial analysis.

longitude: The geographic longitude coordinate of the business location.

review_count: The total number of reviews the business has received on Google Maps.

rating: The average user rating out of 5 for the business, reflecting customer satisfaction.

timezone: The world timezone the business is located in, important for temporal analysis.

website: The official website URL of the business, providing further information and contact options.

category: The category or type of service the business provides, such as restaurant, museum, etc.

claim_status: Indicates whether the business listing has been claimed by the owner on Google Maps.

plus_code: A sho...
d
Datasets for Computational Methods and GIS Applications in Social Science
search.dataone.org
Updated Oct 29, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Fahui Wang; Lingbo Liu (2025). Datasets for Computational Methods and GIS Applications in Social Science [Dataset]. http://doi.org/10.7910/DVN/4CM7V4
Explore at:
Unique identifier
https://doi.org/10.7910/DVN/4CM7V4
Dataset updated
Oct 29, 2025
Dataset provided by
Harvard Dataverse
Authors
Fahui Wang; Lingbo Liu
Description
Dataset for the textbook Computational Methods and GIS Applications in Social Science (3rd Edition), 2023 Fahui Wang, Lingbo Liu Main Book Citation: Wang, F., & Liu, L. (2023). Computational Methods and GIS Applications in Social Science (3rd ed.). CRC Press. https://doi.org/10.1201/9781003292302 KNIME Lab Manual Citation: Liu, L., & Wang, F. (2023). Computational Methods and GIS Applications in Social Science - Lab Manual. CRC Press. https://doi.org/10.1201/9781003304357 KNIME Hub Dataset and Workflow for Computational Methods and GIS Applications in Social Science-Lab Manual Update Log If Python package not found in Package Management, use ArcGIS Pro's Python Command Prompt to install them, e.g., conda install -c conda-forge python-igraph leidenalg NetworkCommDetPro in CMGIS-V3-Tools was updated on July 10,2024 Add spatial adjacency table into Florida on June 29,2024 The dataset and tool for ABM Crime Simulation were updated on August 3, 2023, The toolkits in CMGIS-V3-Tools was updated on August 3rd,2023. Report Issues on GitHub https://github.com/UrbanGISer/Computational-Methods-and-GIS-Applications-in-Social-Science Following the website of Fahui Wang : http://faculty.lsu.edu/fahui Contents Chapter 1. Getting Started with ArcGIS: Data Management and Basic Spatial Analysis Tools Case Study 1: Mapping and Analyzing Population Density Pattern in Baton Rouge, Louisiana Chapter 2. Measuring Distance and Travel Time and Analyzing Distance Decay Behavior Case Study 2A: Estimating Drive Time and Transit Time in Baton Rouge, Louisiana Case Study 2B: Analyzing Distance Decay Behavior for Hospitalization in Florida Chapter 3. Spatial Smoothing and Spatial Interpolation Case Study 3A: Mapping Place Names in Guangxi, China Case Study 3B: Area-Based Interpolations of Population in Baton Rouge, Louisiana Case Study 3C: Detecting Spatiotemporal Crime Hotspots in Baton Rouge, Louisiana Chapter 4. Delineating Functional Regions and Applications in Health Geography Case Study 4A: Defining Service Areas of Acute Hospitals in Baton Rouge, Louisiana Case Study 4B: Automated Delineation of Hospital Service Areas in Florida Chapter 5. GIS-Based Measures of Spatial Accessibility and Application in Examining Healthcare Disparity Case Study 5: Measuring Accessibility of Primary Care Physicians in Baton Rouge Chapter 6. Function Fittings by Regressions and Application in Analyzing Urban Density Patterns Case Study 6: Analyzing Population Density Patterns in Chicago Urban Area >Chapter 7. Principal Components, Factor and Cluster Analyses and Application in Social Area Analysis Case Study 7: Social Area Analysis in Beijing Chapter 8. Spatial Statistics and Applications in Cultural and Crime Geography Case Study 8A: Spatial Distribution and Clusters of Place Names in Yunnan, China Case Study 8B: Detecting Colocation Between Crime Incidents and Facilities Case Study 8C: Spatial Cluster and Regression Analyses of Homicide Patterns in Chicago Chapter 9. Regionalization Methods and Application in Analysis of Cancer Data Case Study 9: Constructing Geographical Areas for Mapping Cancer Rates in Louisiana Chapter 10. System of Linear Equations and Application of Garin-Lowry in Simulating Urban Population and Employment Patterns Case Study 10: Simulating Population and Service Employment Distributions in a Hypothetical City Chapter 11. Linear and Quadratic Programming and Applications in Examining Wasteful Commuting and Allocating Healthcare Providers Case Study 11A: Measuring Wasteful Commuting in Columbus, Ohio Case Study 11B: Location-Allocation Analysis of Hospitals in Rural China Chapter 12. Monte Carlo Method and Applications in Urban Population and Traffic Simulations Case Study 12A. Examining Zonal Effect on Urban Population Density Functions in Chicago by Monte Carlo Simulation Case Study 12B: Monte Carlo-Based Traffic Simulation in Baton Rouge, Louisiana Chapter 13. Agent-Based Model and Application in Crime Simulation Case Study 13: Agent-Based Crime Simulation in Baton Rouge, Louisiana Chapter 14. Spatiotemporal Big Data Analytics and Application in Urban Studies Case Study 14A: Exploring Taxi Trajectory in ArcGIS Case Study 14B: Identifying High Traffic Corridors and Destinations in Shanghai Dataset File Structure 1 BatonRouge Census.gdb BR.gdb 2A BatonRouge BR_Road.gdb Hosp_Address.csv TransitNetworkTemplate.xml BR_GTFS Google API Pro.tbx 2B Florida FL_HSA.gdb R_ArcGIS_Tools.tbx (RegressionR) 3A China_GX GX.gdb 3B BatonRouge BR.gdb 3C BatonRouge BRcrime R_ArcGIS_Tools.tbx (STKDE) 4A BatonRouge BRRoad.gdb 4B Florida FL_HSA.gdb HSA Delineation Pro.tbx Huff Model Pro.tbx FLplgnAdjAppend.csv 5 BRMSA BRMSA.gdb Accessibility Pro.tbx 6 Chicago ChiUrArea.gdb R_ArcGIS_Tools.tbx (RegressionR) 7 Beijing BJSA.gdb bjattr.csv R_ArcGIS_Tools.tbx (PCAandFA, BasicClustering) 8A Yunnan YN.gdb R_ArcGIS_Tools.tbx (SaTScanR) 8B Jiangsu JS.gdb 8C Chicago ChiCity.gdb cityattr.csv ...

🌆 City Lifestyle Segmentation Dataset

kaggle.com

zip

Updated Nov 15, 2025

Facebook

Twitter

Click to copy link

Link copied

Cite

UmutUygurr (2025). 🌆 City Lifestyle Segmentation Dataset [Dataset]. https://www.kaggle.com/datasets/umuttuygurr/city-lifestyle-segmentation-dataset

Explore at:

zip(11274 bytes)Available download formats

Dataset updated

Nov 15, 2025

Authors

UmutUygurr

License

https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/

Description

https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F22121490%2F7189944f8fc292a094c90daa799d08ca%2FChatGPT%20Image%2015%20Kas%202025%2014_07_37.png?generation=1763204959770660&alt=media" alt="">

🌆 About This Dataset

This synthetic dataset simulates 300 global cities across 6 major geographic regions, designed specifically for unsupervised machine learning and clustering analysis. It explores how economic status, environmental quality, infrastructure, and digital access shape urban lifestyles worldwide.

🎯 Perfect For:

📊 K-Means, DBSCAN, Agglomerative Clustering
🔬 PCA & t-SNE Dimensionality Reduction
🗺️ Geospatial Visualization (Plotly, Folium)
📈 Correlation Analysis & Feature Engineering
🎓 Educational Projects (Beginner to Intermediate)

📦 What's Inside?

Feature	Description	Range
10 Features	Economic, environmental & social indicators	Realistically scaled
300 Cities	Europe, Asia, Americas, Africa, Oceania	Diverse distributions
Strong Correlations	Income ↔ Rent (+0.8), Density ↔ Pollution (+0.6)	ML-ready
No Missing Values	Clean, preprocessed data	Ready for analysis
4-5 Natural Clusters	Metropolitan hubs, eco-towns, developing centers	Pre-validated

🔥 Key Features

✅ Realistic Correlations: Income strongly predicts rent (+0.8), internet access (+0.7), and happiness (+0.6)
✅ Regional Diversity: Each region has distinct economic and environmental characteristics
✅ Clustering-Ready: Naturally separable into 4-5 lifestyle archetypes
✅ Beginner-Friendly: No data cleaning required, includes example code
✅ Documented: Comprehensive README with methodology and use cases

🚀 Quick Start Example

import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Load and prepare
df = pd.read_csv('city_lifestyle_dataset.csv')
X = df.drop(['city_name', 'country'], axis=1)
X_scaled = StandardScaler().fit_transform(X)

# Cluster
kmeans = KMeans(n_clusters=5, random_state=42)
df['cluster'] = kmeans.fit_predict(X_scaled)

# Analyze
print(df.groupby('cluster').mean())

🎓 Learning Outcomes

After working with this dataset, you will be able to: 1. Apply K-Means, DBSCAN, and Hierarchical Clustering 2. Use PCA for dimensionality reduction and visualization 3. Interpret correlation matrices and feature relationships 4. Create geographic visualizations with cluster assignments 5. Profile and name discovered clusters based on characteristics

📚 Ideal For These Projects

🏆 Kaggle Competitions: Practice clustering techniques
📝 Academic Projects: Urban planning, sociology, environmental science
💼 Portfolio Work: Showcase ML skills to employers
🎓 Learning: Hands-on practice with unsupervised learning
🔬 Research: Urban lifestyle segmentation studies

🌍 Expected Clusters

Cluster	Characteristics	Example Cities
Metropolitan Tech Hubs	High income, density, rent	Silicon Valley, Singapore
Eco-Friendly Towns	Low density, clean air, high happiness	Nordic cities
Developing Centers	Mid income, high density, poor air	Emerging markets
Low-Income Suburban	Low infrastructure, income	Rural areas
Industrial Mega-Cities	Very high density, pollution	Manufacturing hubs

🛠️ Technical Details

Format: CSV (UTF-8)
Size: ~300 rows × 10 columns
Missing Values: 0%
Data Types: 2 categorical, 8 numerical
Target Variable: None (unsupervised)
Correlation Strength: Pre-validated (r: 0.4 to 0.8)

📖 What Makes This Dataset Special?

Unlike random synthetic data, this dataset was carefully engineered with: - ✨ Realistic correlation structures based on urban research - 🌍 Regional characteristics matching real-world patterns - 🎯 Optimal cluster separability (validated via silhouette scores) - 📚 Comprehensive documentation and starter code

🏅 Use This Dataset If You Want To:

✓ Learn clustering without data cleaning hassles
✓ Practice PCA and dimensionality reduction
✓ Create beautiful geographic visualizations
✓ Understand feature correlation in real-world contexts
✓ Build a portfolio project with clear business insights

📊 Acknowledgments

This dataset was designed for educational purposes in machine learning and data science. While synthetic, it reflects real patterns observed in global urban development research.

Happy Clustering! 🎉

Data from: Indoor GIS Solution for Space Use Assessment
ckan.americaview.org
Updated Aug 7, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
ckan.americaview.org (2023). Indoor GIS Solution for Space Use Assessment [Dataset]. https://ckan.americaview.org/dataset/indoor-gis-solution-for-space-use-assessment
Explore at:
Dataset updated
Aug 7, 2023
Dataset provided by
CKANhttps://ckan.org/
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
As GIS and computing technologies advanced rapidly, many indoor space studies began to adopt GIS technology, data models, and analysis methods. However, even with a considerable amount of research on indoor GIS and various indoor systems developed for different applications, there has not been much attention devoted to adopting indoor GIS for the evaluation space usage. Applying indoor GIS for space usage assessment can not only provide a map-based interface for data collection, but also brings spatial analysis and reporting capabilities for this purpose. This study aims to explore best practice of using an indoor GIS platform to assess space usage and design a complete indoor GIS solution to facilitate and streamline the data collection, a management and reporting workflow. The design has a user-friendly interface for data collectors and an automated mechanism to aggregate and visualize the space usage statistics. A case study was carried out at the Purdue University Libraries to assess study space usage. The system is efficient and effective in collecting student counts and activities and generating reports to interested parties in a timely manner. The analysis results of the collected data provide insights into the user preferences in terms of space usage. This study demonstrates the advantages of applying an indoor GIS solution to evaluate space usage as well as providing a framework to design and implement such a system. The system can be easily extended and applied to other buildings for space usage assessment purposes with minimal development efforts.
d
Point-of-Interest (POI) Data | Global Coverage | 250M Business Listings Data...
datarade.ai
.json, .csv, .xls
Updated Jan 30, 2022
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Quadrant (2022). Point-of-Interest (POI) Data | Global Coverage | 250M Business Listings Data with Custom On-Demand Attributes [Dataset]. https://datarade.ai/data-products/quadrant-point-of-interest-poi-data-business-listings-dat-quadrant
Explore at:
.json, .csv, .xlsAvailable download formats
Dataset updated
Jan 30, 2022
Dataset authored and provided by
Quadrant
Area covered
France
Description
We seek to mitigate the challenges with web-scraped and off-the-shelf POI data, and provide tailored, complete, and manually verified datasets with Geolancer. Our goal is to help represent the physical world accurately for applications and services dependent on precise POI data, and offer a reliable basis for geospatial analysis and intelligence.

Our POI database is powered by our proprietary POI collection and verification platform, Geolancer, which provides manually verified, authentic, accurate, and up-to-date POI datasets.

Enrich your geospatial applications with a contextual layer of comprehensive and actionable information on landmarks, key features, business areas, and many more granular, on-demand attributes. We offer on-demand data collection and verification services that fit unique use cases and business requirements. Using our advanced data acquisition techniques, we build and offer tailormade POI datasets. Combined with our expertise in location data solutions, we can be a holistic data partner for our customers.

KEY FEATURES - Our proprietary, industry-leading manual verification platform Geolancer delivers up-to-date, authentic data points

POI-as-a-Service with on-demand verification and collection in 170+ countries leveraging our network of 1M+ contributors

Customise your feed by specific refresh rate, location, country, category, and brand based on your specific needs

Data Noise Filtering Algorithms normalise and de-dupe POI data that is ready for analysis with minimal preparation

DATA QUALITY

Quadrant’s POI data are manually collected and verified by Geolancers. Our network of freelancers, maps cities and neighborhoods adding and updating POIs on our proprietary app Geolancer on their smartphone. Compared to other methods, this process guarantees accuracy and promises a healthy stream of POI data. This method of data collection also steers clear of infringement on users’ privacy and sale of their location data. These purpose-built apps do not store, collect, or share any data other than the physical location (without tying context back to an actual human being and their mobile device).

USE CASES

The main goal of POI data is to identify a place of interest, establish its accurate location, and help businesses understand the happenings around that place to make better, well-informed decisions. POI can be essential in assessing competition, improving operational efficiency, planning the expansion of your business, and more.

It can be used by businesses to power their apps and platforms for last-mile delivery, navigation, mapping, logistics, and more. Combined with mobility data, POI data can be employed by retail outlets to monitor traffic to one of their sites or of their competitors. Logistics businesses can save costs and improve customer experience with accurate address data. Real estate companies use POI data for site selection and project planning based on market potential. Governments can use POI data to enforce regulations, monitor public health and well-being, plan public infrastructure and services, and more. A few common and widespread use cases of POI data are:

Navigation and mapping for digital marketplaces and apps.

Logistics for online shopping, food delivery, last-mile delivery, and more.

Improving operational efficiency for rideshare and transportation platforms.

Demographic and human mobility studies for market consumption and competitive analysis.

Market assessment, site selection, and business expansion.

Disaster management and urban mapping for public welfare.

Advertising and marketing deployment and ROI assessment.

Real-estate mapping for online sales and renting platforms.About Geolancer

ABOUT GEOLANCER

Quadrant's POI-as-a-Service is powered by Geolancer, our industry-leading manual verification project. Geolancers, equipped with a smartphone running our proprietary app, manually add and verify POI data points, ensuring accuracy and authenticity. Geolancer helps data buyers acquire data with the update frequency suited for their specific use case.
Geographic Product Demand
kaggle.com
zip
Updated May 13, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Soumyadip Sarkar (2025). Geographic Product Demand [Dataset]. https://www.kaggle.com/datasets/neuralsorcerer/geographic-product-demand-dataset/discussion
Explore at:
zip(166660269 bytes)Available download formats
Dataset updated
May 13, 2025
Authors
Soumyadip Sarkar
License
MIT Licensehttps://opensource.org/licenses/MIT
License information was derived automatically
Description
Dataset Description

This dataset contains ten million synthetically generated sales transactions from various geographic locations across the globe. It includes details on product sales, revenue, geographic coordinates, and other relevant features that can be used for analyzing geographic influences on product demand.

File Information:

File Name: geographic_product_demand_dataset_10M.csv

Number of Records: 10,000,000

Size: Approximately 903 MB

Columns: 11

Columns Description:

Location ID: A unique identifier for each location.

City: The city where the sales occurred.

State: The state where the sales occurred, if applicable.

Country: The country where the sales occurred.

Latitude: Latitude coordinates for the sales location.

Longitude: Longitude coordinates for the sales location.

Product ID: A unique identifier for each product.

Product Category: The category of the product (e.g., Tablet, Washing Machine).

Sales Volume: The number of units sold in the transaction.

Sales Revenue: The revenue generated from the sale.

Date: The date of the sales transaction (in YYYY-MM-DD format).

Usage

This dataset is designed for geospatial analysis of product demand, sales forecasting, and machine learning tasks. You can explore geographic patterns in consumer demand and analyze how product categories and sales revenues vary across different regions.

Example Use Cases:

Sales Analysis: Explore how different regions vary in terms of demand for luxury goods versus essential goods.

Geospatial Analysis: Visualize the geographic distribution of sales volumes and revenues.

Time Series Analysis: Investigate how product demand changes over time and across different regions.

Machine Learning: Build models to predict sales revenue based on geographic and product-related factors.

Data Preprocessing Tips:

Convert the Date column to a datetime format before conducting temporal analysis.

Use one-hot encoding for categorical variables like Product Category if applying machine learning models.

Utilize latitude and longitude coordinates for geospatial visualizations.
d
GIS Data | Global Geospatial data | Postal/Administrative boundaries |...
datarade.ai
.json, .xml
Updated Mar 4, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
GeoPostcodes (2025). GIS Data | Global Geospatial data | Postal/Administrative boundaries | Countries, Regions, Cities, Suburbs, and more [Dataset]. https://datarade.ai/data-products/geopostcodes-gis-data-gesopatial-data-postal-administrati-geopostcodes
Explore at:
.json, .xmlAvailable download formats
Dataset updated
Mar 4, 2025
Dataset authored and provided by
GeoPostcodes
Area covered
France, United States
Description
Overview

Empower your location data visualizations with our edge-matched polygons, even in difficult geographies.

Our self-hosted GIS data cover administrative and postal divisions with up to 6 precision levels: a zip code layer and up to 5 administrative levels. All levels follow a seamless hierarchical structure with no gaps or overlaps.

The geospatial data shapes are offered in high-precision and visualization resolution and are easily customized on-premise.

Use cases for the Global Boundaries Database (GIS data, Geospatial data)

In-depth spatial analysis

Clustering

Geofencing

Reverse Geocoding

Reporting and Business Intelligence (BI)

Product Features

Coherence and precision at every level

Edge-matched polygons

High-precision shapes for spatial analysis

Fast-loading polygons for reporting and BI

Multi-language support

For additional insights, you can combine the GIS data with:

Population data: Historical and future trends

UNLOCODE and IATA codes

Time zones and Daylight Saving Time (DST)

Data export methodology

Our geospatial data packages are offered in variable formats, including - .shp - .gpkg - .kml - .shp - .gpkg - .kml - .geojson

All GIS data are optimized for seamless integration with popular systems like Esri ArcGIS, Snowflake, QGIS, and more.

Why companies choose our map data

Precision at every level

Coverage of difficult geographies

No gaps, nor overlaps

Note: Custom geospatial data packages are available. Please submit a request via the above contact button for more details.
Iraq: Road Surface Data
data.humdata.org
geojson, geopackage
Updated Aug 26, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
HeiGIT (Heidelberg Institute for Geoinformation Technology) (2025). Iraq: Road Surface Data [Dataset]. https://data.humdata.org/dataset/iraq-road-surface-data
Explore at:
geojson(802433366), geopackage(303861760)Available download formats
Dataset updated
Aug 26, 2025
Dataset provided by
HeiGIThttps://heigit.org/
License
Open Database License (ODbL) v1.0https://www.opendatacommons.org/licenses/odbl/1.0/
License information was derived automatically
Area covered
Iraq
Description
This dataset provides detailed information on road surfaces from OpenStreetMap (OSM) data, distinguishing between paved and unpaved surfaces across the region. This information is based on road surface prediction derived from hybrid deep learning approach. For more information on Methods, refer to the paper

Roughly 0.2839 million km of roads are mapped in OSM in this region. Based on AI-mapped estimates the share of paved and unpaved roads is approximately 0.026 and 0.0089 (in million kms), corressponding to 9.1664% and 3.1261% respectively of the total road length in the dataset region. 0.249 million km or 87.7075% of road surface information is missing in OSM. In order to fill this gap, Mapillary derived road surface dataset provides an additional 0.0003 million km of information (corressponding to 0.1046% of total missing information on road surface)

It is intended for use in transportation planning, infrastructure analysis, climate emissions and geographic information system (GIS) applications.

This dataset provides comprehensive information on road and urban area features, including location, surface quality, and classification metadata. This dataset includes attributes from OpenStreetMap (OSM) data, AI predictions for road surface, and urban classifications.

AI features:

pred_class: Model-predicted class for the road surface, with values "paved" or "unpaved."

pred_label: Binary label associated with pred_class (0 = paved, 1 = unpaved).

osm_surface_class: Classification of the surface type from OSM, categorized as "paved" or "unpaved."

combined_surface_osm_priority: Surface classification combining pred_label and surface(OSM) while prioritizing the OSM surface tag, classified as "paved" or "unpaved."

combined_surface_DL_priority: Surface classification combining pred_label and surface(OSM) while prioritizing DL prediction pred_label, classified as "paved" or "unpaved."

n_of_predictions_used: Number of predictions used for the feature length estimation.

predicted_length: Predicted length based on the DL model’s estimations, in meters.

DL_mean_timestamp: Mean timestamp of the predictions used, for comparison.

OSM features may have these attributes(Learn what tags mean here):

name: Name of the feature, if available in OSM.

name:en: Name of the feature in English, if available in OSM.

name:* (in local language): Name of the feature in the local official language, where available.

highway: Road classification based on OSM tags (e.g., residential, motorway, footway).

surface: Description of the surface material of the road (e.g., asphalt, gravel, dirt).

smoothness: Assessment of surface smoothness (e.g., excellent, good, intermediate, bad).

width: Width of the road, where available.

lanes: Number of lanes on the road.

oneway: Indicates if the road is one-way (yes or no).

bridge: Specifies if the feature is a bridge (yes or no).

layer: Indicates the layer of the feature in cases where multiple features are stacked (e.g., bridges, tunnels).

source: Source of the data, indicating the origin or authority of specific attributes.

Urban classification features may have these attributes:

continent: The continent where the data point is located (e.g., Europe, Asia).

country_iso_a2: The ISO Alpha-2 code representing the country (e.g., "US" for the United States).

urban: Binary indicator for urban areas based on the GHSU Urban Layer 2019. (0 = rural, 1 = urban)

urban_area: Name of the urban area or city where the data point is located.

osm_id: Unique identifier assigned by OpenStreetMap (OSM) to each feature.

osm_type: Type of OSM element (e.g., node, way, relation).

The data originates from OpenStreetMap (OSM) and is augmented with model predictions using images downloaded from Mapillary in combination with the GHSU Global Human Settlement Urban Layer 2019 and AFRICAPOLIS2020 urban layer.

This dataset is one of many HeiGIT exports on HDX. See the HeiGIT website for more information.

We are looking forward to hearing about your use-case! Feel free to reach out to us and tell us about your research at communications@heigit.org – we would be happy to amplify your work.
Enriched NYTimes COVID19 U.S. County Dataset
kaggle.com
zip
Updated Jun 14, 2020
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
ringhilterra17 (2020). Enriched NYTimes COVID19 U.S. County Dataset [Dataset]. https://www.kaggle.com/ringhilterra17/enrichednytimescovid19
Explore at:
zip(11291611 bytes)Available download formats
Dataset updated
Jun 14, 2020
Authors
ringhilterra17
License
Attribution-ShareAlike 3.0 (CC BY-SA 3.0)https://creativecommons.org/licenses/by-sa/3.0/
License information was derived automatically
Area covered
United States
Description
Overview and Inspiration

I wanted to make some geospatial visualizations to convey the current severity of COVID19 in different parts of the U.S..

I liked the NYTimes COVID dataset, but it was lacking information on county boundary shape data, population per county, new cases / deaths per day, and per capita calculations, and county demographics.

After a lot of work tracking down the different data sources I wanted and doing all of the data wrangling and joins in python, I wanted to open-source the final enriched data set in order to give others a head start in their COVID-19 related analytic, modeling, and visualization efforts.

This dataset is enriched with county shapes, county center point coordinates, 2019 census population estimates, county population densities, cases and deaths per capita, and calculated per day cases / deaths metrics. It contains daily data per county back to January, allowing for analyizng changes over time.

UPDATE: I have also included demographic information per county, including ages, races, and gender breakdown. This could help determine which counties are most susceptible to an outbreak.

How this data can be used

Geospatial analysis and visualization - Which counties are currently getting hit the hardest (per capita and totals)? - What patterns are there in the spread of the virus across counties? (network based spread simulations using county center lat / lons) -county population densities play a role in how quickly the virus spreads? -how does a specific county/state cases and deaths compare to other counties/states? Join with other county level datasets easily (with fips code column)

Content Details

See the column descriptions for more details on the dataset

Visualizations and Analysis Examples

COVID-19 U.S. Time-lapse: Confirmed Cases per County (per capita)

https://github.com/ringhilterra/enriched-covid19-data/blob/master/example_viz/covid-cases-final-04-06.gif?raw=true" alt="">-

Other Data Notes

Please review nytimes README for detailed notes on Covid-19 data - https://github.com/nytimes/covid-19-data/

The only update I made in regards to 'Geographic Exceptions', is that I took 'New York City' county provided in the Covid-19 data, which has all cases for 'for the five boroughs of New York City (New York, Kings, Queens, Bronx and Richmond counties) and replaced the missing FIPS for those rows with the 'New York County' fips code 36061. That way I could join to a geometry, and then I used the sum of those five boroughs population estimates for the 'New York City' estimate, which allowed me calculate 'per capita' metrics for 'New York City' entries in the Covid-19 dataset

Acknowledgements

Special thanks to NYTimes for all of their hard work gathering and consolidating all of the U.S. COVID19 related data on daily basis. Their git repo https://github.com/nytimes/covid-19-data/

Also, thanks to ykzeng for the county population density estimates: https://github.com/ykzeng/covid-19/tree/master/data-
Generalizing OD Maps to Explore Multi-dimensional Geospatial Datasets
tandf.figshare.com
png
Updated Feb 19, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Liqun Liu; Romain Vuillemot; Philippe Rivière; Jeremy Boy; Aurélien Tabard (2025). Generalizing OD Maps to Explore Multi-dimensional Geospatial Datasets [Dataset]. http://doi.org/10.6084/m9.figshare.27179524.v1
Explore at:
pngAvailable download formats
Unique identifier
https://doi.org/10.6084/m9.figshare.27179524.v1
Dataset updated
Feb 19, 2025
Dataset provided by
Taylor & Francishttps://taylorandfrancis.com/
Authors
Liqun Liu; Romain Vuillemot; Philippe Rivière; Jeremy Boy; Aurélien Tabard
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Understanding the mobility of entities in geospatial data is important to many fields, ranging from the social sciences to epidemiology, economics or air traffic control. Visualizing such entities can be challenging as it requires preserving both their explicit properties (spatial trajectories) and their implicit properties (abstract attributes of those trajectories). An existing technique called origin–destination maps preserves both explicit and implicit properties of datasets, using the spatial nesting technique. In this paper, we aim at generalizing this technique beyond an origins-and-destinations dataset (2-attribute datasets), to explore multi-dimensional datasets (N-attribute datasets) with the nesting approach. We present an abstraction framework – we call Gridify – and an interactive open-source tool implementing this framework using several levels of nested maps. We report on several case studies representative of the types of dimensions found in geospatial datasets (quantitative, temporal, discrete, boolean), showing the applicability of this approach to achieve visual exploratory analysis tasks in various application domains.
d
Geospatial data for object-based high-resolution classification of conifers...
catalog.data.gov
data.usgs.gov
+2more
Updated Nov 26, 2025
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
U.S. Geological Survey (2025). Geospatial data for object-based high-resolution classification of conifers within greater sage-grouse habitat across Nevada and a portion of northeastern California (ver. 2.0 July 2018) [Dataset]. https://catalog.data.gov/dataset/geospatial-data-for-object-based-high-resolution-classification-of-conifers-within-greater
Explore at:
Dataset updated
Nov 26, 2025
Dataset provided by
United States Geological Surveyhttp://www.usgs.gov/
Description
These products were developed to provide scientific and correspondingly spatially explicit information regarding the distribution and abundance of conifers (namely, singleleaf pinyon (Pinus monophylla), Utah juniper (Juniperus osteosperma), and western juniper (Juniperus occidentalis)) in Nevada and portions of northeastern California. Encroachment of these trees into sagebrush ecosystems of the Great Basin can present a threat to populations of greater sage-grouse (Centrocercus urophasianus). These data provide land managers and other interested parties with a high-resolution representation of conifers across the range of sage-grouse habitat in Nevada and northeastern California that can be used for a variety of management and research applications. We mapped conifer trees at 1 x 1 meter resolution across the extent of all Nevada Department of Wildlife Sage-grouse Population Management Units plus a 10 km buffer. Using 2010 and 2013 National Agriculture Imagery Program digital orthophoto quads (DOQQs) as our reference imagery, we applied object-based image analysis with Feature Analyst software (Overwatch, 2013) to classify conifer features across our study extent. This method relies on machine learning algorithms that extract features from imagery based on their spectral and spatial signatures. Conifers in 6230 DOQQs were classified and outputs were then tested for errors of omission and commission using stratified random sampling. Results of the random sampling were used to populate a confusion matrix and calculate the overall map accuracy of 84.3 percent. We provide 5 sets of products for this mapping process across the entire mapping extent: (1) a shapefile representing accuracy results linked to our mapping subunits; (2) binary rasters representing conifer presence or absence at a 1 x 1 meter resolution; (3) a 30 x 30 meter resolution raster representing percentage of conifer canopy cover within each cell from 0 to 100; (4) 1 x 1 meter resolution canopy cover classification rasters derived from a 50 meter radius moving window analysis; and (5) a raster prioritizing pinyon-juniper management for sage-grouse habitat restoration efforts. The latter three products can be reclassified into user-specified bins to meet different management or study objectives, which include approximations for phases of encroachment. These products complement, and in some cases improve upon, existing conifer maps in the western United States, and will help facilitate sage-grouse habitat management and sagebrush ecosystem restoration. These data support the following publication: Coates, P.S., Gustafson, K.B., Roth, C.L., Chenaille, M.P., Ricca, M.A., Mauch, Kimberly, Sanchez-Chopitea, Erika, Kroger, T.J., Perry, W.M., and Casazza, M.L., 2017, Using object-based image analysis to conduct high-resolution conifer extraction at regional spatial scales: U.S. Geological Survey Open-File Report 2017-1093, 40 p., https://doi.org/10.3133/ofr20171093. References: ESRI, 2013, ArcGIS Desktop: Release 10.2: Environmental Systems Research Institute. Overwatch, 2013, Feature Analyst Version 5.1.2.0 for ArcGIS: Overwatch Systems Ltd.
Data from: Crime Hot Spot Forecasting with Data from the Pittsburgh...
icpsr.umich.edu
datasets.ai
+1more
ascii, delimited, r +3
Updated Aug 7, 2015
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Gorr, Wilpen L.; Olligschlaeger, Andreas (2015). Crime Hot Spot Forecasting with Data from the Pittsburgh [Pennsylvania] Bureau of Police, 1990-1998 [Dataset]. http://doi.org/10.3886/ICPSR03469.v1
Explore at:
stata, delimited, spss, sas, r, asciiAvailable download formats
Unique identifier
https://doi.org/10.3886/ICPSR03469.v1
Dataset updated
Aug 7, 2015
Dataset provided by
Inter-university Consortium for Political and Social Researchhttps://www.icpsr.umich.edu/web/pages/
Authors
Gorr, Wilpen L.; Olligschlaeger, Andreas
License
https://www.icpsr.umich.edu/web/ICPSR/studies/3469/termshttps://www.icpsr.umich.edu/web/ICPSR/studies/3469/terms
Time period covered
1990 - 1998
Area covered
United States, Pennsylvania, Pittsburgh
Description
This study used crime count data from the Pittsburgh, Pennsylvania, Bureau of Police offense reports and 911 computer-aided dispatch (CAD) calls to determine the best univariate forecast method for crime and to evaluate the value of leading indicator crime forecast models. The researchers used the rolling-horizon experimental design, a design that maximizes the number of forecasts for a given time series at different times and under different conditions. Under this design, several forecast models are used to make alternative forecasts in parallel. For each forecast model included in an experiment, the researchers estimated models on training data, forecasted one month ahead to new data not previously seen by the model, and calculated and saved the forecast error. Then they added the observed value of the previously forecasted data point to the next month's training data, dropped the oldest historical data point, and forecasted the following month's data point. This process continued over a number of months. A total of 15 statistical datasets and 3 geographic information systems (GIS) shapefiles resulted from this study. The statistical datasets consist of Univariate Forecast Data by Police Precinct (Dataset 1) with 3,240 cases Output Data from the Univariate Forecasting Program: Sectors and Forecast Errors (Dataset 2) with 17,892 cases Multivariate, Leading Indicator Forecast Data by Grid Cell (Dataset 3) with 5,940 cases Output Data from the 911 Drug Calls Forecast Program (Dataset 4) with 5,112 cases Output Data from the Part One Property Crimes Forecast Program (Dataset 5) with 5,112 cases Output Data from the Part One Violent Crimes Forecast Program (Dataset 6) with 5,112 cases Input Data for the Regression Forecast Program for 911 Drug Calls (Dataset 7) with 10,011 cases Input Data for the Regression Forecast Program for Part One Property Crimes (Dataset 8) with 10,011 cases Input Data for the Regression Forecast Program for Part One Violent Crimes (Dataset 9) with 10,011 cases Output Data from Regression Forecast Program for 911 Drug Calls: Estimated Coefficients for Leading Indicator Models (Dataset 10) with 36 cases Output Data from Regression Forecast Program for Part One Property Crimes: Estimated Coefficients for Leading Indicator Models (Dataset 11) with 36 cases Output Data from Regression Forecast Program for Part One Violent Crimes: Estimated Coefficients for Leading Indicator Models (Dataset 12) with 36 cases Output Data from Regression Forecast Program for 911 Drug Calls: Forecast Errors (Dataset 13) with 4,936 cases Output Data from Regression Forecast Program for Part One Property Crimes: Forecast Errors (Dataset 14) with 4,936 cases Output Data from Regression Forecast Program for Part One Violent Crimes: Forecast Errors (Dataset 15) with 4,936 cases. The GIS Shapefiles (Dataset 16) are provided with the study in a single zip file: Included are polygon data for the 4,000 foot, square, uniform grid system used for much of the Pittsburgh crime data (grid400); polygon data for the 6 police precincts, alternatively called districts or zones, of Pittsburgh(policedist); and polygon data for the 3 major rivers in Pittsburgh the Allegheny, Monongahela, and Ohio (rivers).
d
GIS Data the Netherlands | Mapping Data | 3.7M+ Places in the Netherlands
datarade.ai
Updated Mar 3, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
InfobelPRO (2025). GIS Data the Netherlands | Mapping Data | 3.7M+ Places in the Netherlands [Dataset]. https://datarade.ai/data-products/gis-data-the-netherlands-mapping-data-3-7m-places-in-the-infobelpro
Explore at:
.bin, .json, .xml, .csv, .xls, .sql, .txtAvailable download formats
Dataset updated
Mar 3, 2025
Dataset authored and provided by
InfobelPRO
Area covered
Netherlands
Description
Unlock precise, high-quality GIS data covering 3.7M+ verified locations across the Netherlands. With 50+ enriched attributes including coordinates, building structures, and spatial geometry our dataset provides the granularity and accuracy needed for in-depth spatial analysis. Powered by AI-driven enrichment and deduplication, and backed by 30+ years of expertise, our GIS solutions support industries ranging from mapping and navigation to urban planning and market analysis, helping businesses and organizations make smarter, data-driven decisions.

Key use cases of GIS Data helping our customers :

Optimize Mapping & Spatial Analysis : Use GIS data to analyse landscapes, urban infrastructure, and competitor locations, ensuring data-driven planning and decision-making.

Enhance Navigation & Location-Based Services : Improve real-time route planning, asset tracking, and EV charging station discovery for seamless location-based experiences.

Identify Strategic Sites for Business Expansion : Leverage GIS intelligence to select optimal retail sites, franchise locations, and warehouses with precision.

Improve Logistics & Address Accuracy : Streamline delivery networks, validate addresses, and optimize courier routes to boost efficiency and customer satisfaction.

Support Environmental & Urban Development Initiatives : Utilize GIS insights for disaster preparedness, sustainable city planning, and land-use management.
c
California Overlapping Cities and Counties and Identifiers
gis.data.ca.gov
data.ca.gov
+1more
Updated Sep 16, 2024
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
California Department of Technology (2024). California Overlapping Cities and Counties and Identifiers [Dataset]. https://gis.data.ca.gov/datasets/california-overlapping-cities-and-counties-and-identifiers/about
Explore at:
Dataset updated
Sep 16, 2024
Dataset authored and provided by
California Department of Technology
License
MIT Licensehttps://opensource.org/licenses/MIT
License information was derived automatically
Area covered

Description
WARNING: This is a pre-release dataset and its fields names and data structures are subject to change. It should be considered pre-release until the end of 2024. Expected changes:Metadata is missing or incomplete for some layers at this time and will be continuously improved.We expect to update this layer roughly in line with CDTFA at some point, but will increase the update cadence over time as we are able to automate the final pieces of the process.This dataset is continuously updated as the source data from CDTFA is updated, as often as many times a month. If you require unchanging point-in-time data, export a copy for your own use rather than using the service directly in your applications.PurposeCounty and incorporated place (city) boundaries along with third party identifiers used to join in external data. Boundaries are from the authoritative source the California Department of Tax and Fee Administration (CDTFA), altered to show the counties as one polygon. This layer displays the city polygons on top of the County polygons so the area isn"t interrupted. The GEOID attribute information is added from the US Census. GEOID is based on merged State and County FIPS codes for the Counties. Abbreviations for Counties and Cities were added from Caltrans Division of Local Assistance (DLA) data. Place Type was populated with information extracted from the Census. Names and IDs from the US Board on Geographic Names (BGN), the authoritative source of place names as published in the Geographic Name Information System (GNIS), are attached as well. Finally, coastal buffers are removed, leaving the land-based portions of jurisdictions. This feature layer is for public use.Related LayersThis dataset is part of a grouping of many datasets:Cities: Only the city boundaries and attributes, without any unincorporated areasWith Coastal BuffersWithout Coastal BuffersCounties: Full county boundaries and attributes, including all cities within as a single polygonWith Coastal BuffersWithout Coastal BuffersCities and Full Counties: A merge of the other two layers, so polygons overlap within city boundaries. Some customers require this behavior, so we provide it as a separate service.With Coastal BuffersWithout Coastal Buffers (this dataset)Place AbbreviationsUnincorporated Areas (Coming Soon)Census Designated Places (Coming Soon)Cartographic CoastlinePolygonLine source (Coming Soon)Working with Coastal BuffersThe dataset you are currently viewing includes the coastal buffers for cities and counties that have them in the authoritative source data from CDTFA. In the versions where they are included, they remain as a second polygon on cities or counties that have them, with all the same identifiers, and a value in the COASTAL field indicating if it"s an ocean or a bay buffer. If you wish to have a single polygon per jurisdiction that includes the coastal buffers, you can run a Dissolve on the version that has the coastal buffers on all the fields except COASTAL, Area_SqMi, Shape_Area, and Shape_Length to get a version with the correct identifiers.Point of ContactCalifornia Department of Technology, Office of Digital Services, odsdataservices@state.ca.govField and Abbreviation DefinitionsCOPRI: county number followed by the 3-digit city primary number used in the Board of Equalization"s 6-digit tax rate area numbering systemPlace Name: CDTFA incorporated (city) or county nameCounty: CDTFA county name. For counties, this will be the name of the polygon itself. For cities, it is the name of the county the city polygon is within.Legal Place Name: Board on Geographic Names authorized nomenclature for area names published in the Geographic Name Information SystemGNIS_ID: The numeric identifier from the Board on Geographic Names that can be used to join these boundaries to other datasets utilizing this identifier.GEOID: numeric geographic identifiers from the US Census Bureau Place Type: Board on Geographic Names authorized nomenclature for boundary type published in the Geographic Name Information SystemPlace Abbr: CalTrans Division of Local Assistance abbreviations of incorporated area namesCNTY Abbr: CalTrans Division of Local Assistance abbreviations of county namesArea_SqMi: The area of the administrative unit (city or county) in square miles, calculated in EPSG 3310 California Teale Albers.COASTAL: Indicates if the polygon is a coastal buffer. Null for land polygons. Additional values include "ocean" and "bay".GlobalID: While all of the layers we provide in this dataset include a GlobalID field with unique values, we do not recommend you make any use of it. The GlobalID field exists to support offline sync, but is not persistent, so data keyed to it will be orphaned at our next update. Use one of the other persistent identifiers, such as GNIS_ID or GEOID instead.AccuracyCDTFA"s source data notes the following about accuracy:City boundary changes and county boundary line adjustments filed with the Board of Equalization per Government Code 54900. This GIS layer contains the boundaries of the unincorporated county and incorporated cities within the state of California. The initial dataset was created in March of 2015 and was based on the State Board of Equalization tax rate area boundaries. As of April 1, 2024, the maintenance of this dataset is provided by the California Department of Tax and Fee Administration for the purpose of determining sales and use tax rates. The boundaries are continuously being revised to align with aerial imagery when areas of conflict are discovered between the original boundary provided by the California State Board of Equalization and the boundary made publicly available by local, state, and federal government. Some differences may occur between actual recorded boundaries and the boundaries used for sales and use tax purposes. The boundaries in this map are representations of taxing jurisdictions for the purpose of determining sales and use tax rates and should not be used to determine precise city or county boundary line locations. COUNTY = county name; CITY = city name or unincorporated territory; COPRI = county number followed by the 3-digit city primary number used in the California State Board of Equalization"s 6-digit tax rate area numbering system (for the purpose of this map, unincorporated areas are assigned 000 to indicate that the area is not within a city).Boundary ProcessingThese data make a structural change from the source data. While the full boundaries provided by CDTFA include coastal buffers of varying sizes, many users need boundaries to end at the shoreline of the ocean or a bay. As a result, after examining existing city and county boundary layers, these datasets provide a coastline cut generally along the ocean facing coastline. For county boundaries in northern California, the cut runs near the Golden Gate Bridge, while for cities, we cut along the bay shoreline and into the edge of the Delta at the boundaries of Solano, Contra Costa, and Sacramento counties.In the services linked above, the versions that include the coastal buffers contain them as a second (or third) polygon for the city or county, with the value in the COASTAL field set to whether it"s a bay or ocean polygon. These can be processed back into a single polygon by dissolving on all the fields you wish to keep, since the attributes, other than the COASTAL field and geometry attributes (like areas) remain the same between the polygons for this purpose.SliversIn cases where a city or county"s boundary ends near a coastline, our coastline data may cross back and forth many times while roughly paralleling the jurisdiction"s boundary, resulting in many polygon slivers. We post-process the data to remove these slivers using a city/county boundary priority algorithm. That is, when the data run parallel to each other, we discard the coastline cut and keep the CDTFA-provided boundary, even if it extends into the ocean a small amount. This processing supports consistent boundaries for Fort Bragg, Point Arena, San Francisco, Pacifica, Half Moon Bay, and Capitola, in addition to others. More information on this algorithm will be provided soon.Coastline CaveatsSome cities have buffers extending into water bodies that we do not cut at the shoreline. These include South Lake Tahoe and Folsom, which extend into neighboring lakes, and San Diego and surrounding cities that extend into San Diego Bay, which our shoreline encloses. If you have feedback on the exclusion of these items, or others, from the shoreline cuts, please reach out using the contact information above.Offline UseThis service is fully enabled for sync and export using Esri Field Maps or other similar tools. Importantly, the GlobalID field exists only to support that use case and should not be used for any other purpose (see note in field descriptions).Updates and Date of ProcessingConcurrent with CDTFA updates, approximately every two weeks, Last Processed: 12/17/2024 by Nick Santos using code path at https://github.com/CDT-ODS-DevSecOps/cdt-ods-gis-city-county/ at commit 0bf269d24464c14c9cf4f7dea876aa562984db63. It incorporates updates from CDTFA as of 12/12/2024. Future updates will include improvements to metadata and update frequency.
d
Spatiotemporal Data Analysis with Codeless Visual Programming
search.dataone.org
dataverse.harvard.edu
Updated Feb 21, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Spatial Data Lab (2024). Spatiotemporal Data Analysis with Codeless Visual Programming [Dataset]. http://doi.org/10.7910/DVN/I0AWAM
Explore at:
Unique identifier
https://doi.org/10.7910/DVN/I0AWAM
Dataset updated
Feb 21, 2024
Dataset provided by
Harvard Dataverse
Authors
Spatial Data Lab
Description
This seminar will introduce the KNIME Analytics Platform and its Geospatial Analytics extension developed by the Spatial Data Lab (SDL) team at Harvard's Center for Geographic Analysis (CGA). The SDL team members will share the presentations, presenting the project's vision and demonstrating the new way of performing geospatial analysis in a codeless visual way with case studies.
Geospatial Deep Learning Seminar Online Course
ckan.americaview.org
Updated Nov 2, 2021
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
ckan.americaview.org (2021). Geospatial Deep Learning Seminar Online Course [Dataset]. https://ckan.americaview.org/dataset/geospatial-deep-learning-seminar-online-course
Explore at:
Dataset updated
Nov 2, 2021
Dataset provided by
CKANhttps://ckan.org/
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This seminar is an applied study of deep learning methods for extracting information from geospatial data, such as aerial imagery, multispectral imagery, digital terrain data, and other digital cartographic representations. We first provide an introduction and conceptualization of artificial neural networks (ANNs). Next, we explore appropriate loss and assessment metrics for different use cases followed by the tensor data model, which is central to applying deep learning methods. Convolutional neural networks (CNNs) are then conceptualized with scene classification use cases. Lastly, we explore semantic segmentation, object detection, and instance segmentation. The primary focus of this course is semantic segmenation for pixel-level classification. The associated GitHub repo provides a series of applied examples. We hope to continue to add examples as methods and technologies further develop. These examples make use of a vareity of datasets (e.g., SAT-6, topoDL, Inria, LandCover.ai, vfillDL, and wvlcDL). Please see the repo for links to the data and associated papers. All examples have associated videos that walk through the process, which are also linked to the repo. A variety of deep learning architectures are explored including UNet, UNet++, DeepLabv3+, and Mask R-CNN. Currenlty, two examples use ArcGIS Pro and require no coding. The remaining five examples require coding and make use of PyTorch, Python, and R within the RStudio IDE. It is assumed that you have prior knowledge of coding in the Python and R enviroinments. If you do not have experience coding, please take a look at our Open-Source GIScience and Open-Source Spatial Analytics (R) courses, which explore coding in Python and R, respectively. After completing this seminar you will be able to: explain how ANNs work including weights, bias, activation, and optimization. describe and explain different loss and assessment metrics and determine appropriate use cases. use the tensor data model to represent data as input for deep learning. explain how CNNs work including convolutional operations/layers, kernel size, stride, padding, max pooling, activation, and batch normalization. use PyTorch, Python, and R to prepare data, produce and assess scene classification models, and infer to new data. explain common semantic segmentation architectures and how these methods allow for pixel-level classification and how they are different from traditional CNNs. use PyTorch, Python, and R (or ArcGIS Pro) to prepare data, produce and assess semantic segmentation models, and infer to new data.
HIFLD California Dam Lines
gis-calema.opendata.arcgis.com
hub.arcgis.com
Updated Jan 9, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
CA Governor's Office of Emergency Services (2023). HIFLD California Dam Lines [Dataset]. https://gis-calema.opendata.arcgis.com/datasets/hifld-california-dam-lines/about
Explore at:
Dataset updated
Jan 9, 2023
Dataset provided by
California Governor's Office of Emergency Services
Authors
CA Governor's Office of Emergency Services
Area covered
California,
Description
Homeland Security Use Cases: Use cases describe how the data may be used and help to define and clarify requirements. 1) In the event of a threat against the dams infrastructure, this dataset may be used to locate dams needing protection. 2) An accident has occurred at a dam and emergency medical personnel / rescue personnel must quickly deploy to the dam. 3) A resource for situational awareness planning and response for federal government eventsDam locations were digitized using any combination of ortho imagery, topographic DRGs, NAVTEQ streets, NHD flowlines, NHD landmarklines, TIGER hydrography, contact with authoritative sources or web research. A line was created by tracing the crest of the dam using referencing imagery and the NHD flowlines. Entities classified as both MaxCapacity and High Hazard are represented once. Text fields in this dataset have been set to all upper case to facilitate consistent database engine search results. All diacritics (e.g., the German umlaut or the Spanish tilde) have been replaced with their closest equivalent English character to facilitate use with database systems that may not support diacritics. No entities for American Samoa, District of Columbia, the Northern Mariana Islands or the Virgin Islands are included in this dataset. The currentness of this dataset is indicated by the [GEODATE] attribute. HIFLD source.
n
SEN12TS: A SAR and Multispectral Dataset for Land Cover Classification
cmr.earthdata.nasa.gov
access.earthdata.nasa.gov
Updated Oct 10, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
(2023). SEN12TS: A SAR and Multispectral Dataset for Land Cover Classification [Dataset]. http://doi.org/10.34911/rdnt.9qh1mb
Explore at:
Unique identifier
https://doi.org/10.34911/rdnt.9qh1mb
Dataset updated
Oct 10, 2023
Time period covered
Jan 1, 2020 - Jan 1, 2023
Area covered
Description
The SEN12TS dataset contains Sentinel-1, Sentinel-2, and labeled land cover image triplets over six agro-ecologically diverse areas of interest: California, Iowa, Catalonia, Ethiopia, Uganda, and Sumatra. Using the Descartes Labs geospatial analytics platform, 246,400 triplets are produced at 10m resolution over 31,398 256-by-256-pixel unique spatial tiles for a total size of 1.69 TB. The image triplets include radiometric terrain corrected synthetic aperture radar (SAR) backscatter measurements; interferometric synthetic aperture radar (InSAR) coherence and phase layers; local incidence angle and ground slope values; multispectral optical imagery; and decameter-resolution land cover data. Moreover, sensed imagery is available in timeseries: Within an image triplet, radar-derived imagery is collected at four timesteps 12 days apart. For the same spatial extent, up to 16 image triplets are available across the calendar year of 2020.

The SEN12TS documentation demonstrates two initial use cases for the dataset. The first transforms radar imagery into enhanced vegetation indices by means of a generative adversarial network, and the second tests combinations of input imagery for cropland classification.
Socio-Demographic Predictors and Distribution of Pulmonary Tuberculosis (TB)...
plos.figshare.com
tiff
Updated May 31, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Atikaimu Wubuli; Feng Xue; Daobin Jiang; Xuemei Yao; Halmurat Upur; Qimanguli Wushouer (2023). Socio-Demographic Predictors and Distribution of Pulmonary Tuberculosis (TB) in Xinjiang, China: A Spatial Analysis [Dataset]. http://doi.org/10.1371/journal.pone.0144010
Explore at:
tiffAvailable download formats
Unique identifier
https://doi.org/10.1371/journal.pone.0144010
Dataset updated
May 31, 2023
Dataset provided by
PLOShttp://plos.org/
Authors
Atikaimu Wubuli; Feng Xue; Daobin Jiang; Xuemei Yao; Halmurat Upur; Qimanguli Wushouer
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Xinjiang, China
Description
ObjectivesXinjiang is one of the high TB burden provinces of China. A spatial analysis was conducted using geographical information system (GIS) technology to improve the understanding of geographic variation of the pulmonary TB occurrence in Xinjiang, its predictors, and to search for targeted interventions.MethodsNumbers of reported pulmonary TB cases were collected at county/district level from TB surveillance system database. Population data were extracted from Xinjiang Statistical Yearbook (2006~2014). Spatial autocorrelation (or dependency) was assessed using global Moran’s I statistic. Anselin’s local Moran’s I and local Getis-Ord statistics were used to detect local spatial clusters. Ordinary least squares (OLS) regression, spatial lag model (SLM) and geographically-weighted regression (GWR) models were used to explore the socio-demographic predictors of pulmonary TB incidence from global and local perspectives. SPSS17.0, ArcGIS10.2.2, and GeoDA software were used for data analysis.ResultsIncidence of sputum smear positive (SS+) TB and new SS+TB showed a declining trend from 2005 to 2013. Pulmonary TB incidence showed a declining trend from 2005 to 2010 and a rising trend since 2011 mainly caused by the rising trend of sputum smear negative (SS-) TB incidence (p

Facebook

Twitter

Click to copy link

Link copied

Cite

Xverum (2024). Global Point of Interest (POI) Data | 230M+ Locations, 5000 Categories, Geographic & Location Intelligence, Regular Updates [Dataset]. https://datarade.ai/data-products/global-point-of-interest-poi-data-230m-locations-5000-c-xverum

Global Point of Interest (POI) Data | 230M+ Locations, 5000 Categories, Geographic & Location Intelligence, Regular Updates

Explore at:

.jsonAvailable download formats

Dataset updated

Sep 7, 2024

Dataset provided by

Xverum LLC

Authors

Xverum

Area covered

French Polynesia, Mauritania, Andorra, Kyrgyzstan, Vietnam, Costa Rica, Northern Mariana Islands, Bahamas, Antarctica, Guatemala

Description

Xverum’s Point of Interest (POI) Data is a comprehensive dataset containing 230M+ verified locations across 5000 business categories. Our dataset delivers structured geographic data, business attributes, location intelligence, and mapping insights, making it an essential tool for GIS applications, market research, urban planning, and competitive analysis.

With regular updates and continuous POI discovery, Xverum ensures accurate, up-to-date information on businesses, landmarks, retail stores, and more. Delivered in bulk to S3 Bucket and cloud storage, our dataset integrates seamlessly into mapping, geographic information systems, and analytics platforms.

🔥 Key Features:

Extensive POI Coverage: ✅ 230M+ Points of Interest worldwide, covering 5000 business categories. ✅ Includes retail stores, restaurants, corporate offices, landmarks, and service providers.

Geographic & Location Intelligence Data: ✅ Latitude & longitude coordinates for mapping and navigation applications. ✅ Geographic classification, including country, state, city, and postal code. ✅ Business status tracking – Open, temporarily closed, or permanently closed.

Continuous Discovery & Regular Updates: ✅ New POIs continuously added through discovery processes. ✅ Regular updates ensure data accuracy, reflecting new openings and closures.

Rich Business Insights: ✅ Detailed business attributes, including company name, category, and subcategories. ✅ Contact details, including phone number and website (if available). ✅ Consumer review insights, including rating distribution and total number of reviews (additional feature). ✅ Operating hours where available.

Ideal for Mapping & Location Analytics: ✅ Supports geospatial analysis & GIS applications. ✅ Enhances mapping & navigation solutions with structured POI data. ✅ Provides location intelligence for site selection & business expansion strategies.

Bulk Data Delivery (NO API): ✅ Delivered in bulk via S3 Bucket or cloud storage. ✅ Available in structured format (.json) for seamless integration.

🏆Primary Use Cases:

Mapping & Geographic Analysis: 🔹 Power GIS platforms & navigation systems with precise POI data. 🔹 Enhance digital maps with accurate business locations & categories.

Retail Expansion & Market Research: 🔹 Identify key business locations & competitors for market analysis. 🔹 Assess brand presence across different industries & geographies.

Business Intelligence & Competitive Analysis: 🔹 Benchmark competitor locations & regional business density. 🔹 Analyze market trends through POI growth & closure tracking.

Smart City & Urban Planning: 🔹 Support public infrastructure projects with accurate POI data. 🔹 Improve accessibility & zoning decisions for government & businesses.

💡 Why Choose Xverum’s POI Data?

230M+ Verified POI Records – One of the largest & most detailed location datasets available.
Global Coverage – POI data from 249+ countries, covering all major business sectors.
Regular Updates – Ensuring accurate tracking of business openings & closures.
Comprehensive Geographic & Business Data – Coordinates, addresses, categories, and more.
Bulk Dataset Delivery – S3 Bucket & cloud storage delivery for full dataset access.
100% Compliant – Ethically sourced, privacy-compliant data.

Access Xverum’s 230M+ POI dataset for mapping, geographic analysis, and location intelligence. Request a free sample or contact us to customize your dataset today!

Clear search

Close search

Google apps

Main menu

Global Point of Interest (POI) Data | 230M+ Locations, 5000 Categories,...

🌎 Location Intelligence Data | From Google Map

👨‍💻 Author: Azhar Saleem

Dataset Overview

Key Features

Potential Use Cases

Dataset Structure

Datasets for Computational Methods and GIS Applications in Social Science

🌆 City Lifestyle Segmentation Dataset

🌆 About This Dataset

🎯 Perfect For:

📦 What's Inside?

🔥 Key Features

🚀 Quick Start Example

🎓 Learning Outcomes

📚 Ideal For These Projects

🌍 Expected Clusters

🛠️ Technical Details

📖 What Makes This Dataset Special?

🏅 Use This Dataset If You Want To:

📊 Acknowledgments

Data from: Indoor GIS Solution for Space Use Assessment

Point-of-Interest (POI) Data | Global Coverage | 250M Business Listings Data...

Geographic Product Demand

Dataset Description

File Information:

Columns Description:

Usage

Example Use Cases:

Data Preprocessing Tips:

GIS Data | Global Geospatial data | Postal/Administrative boundaries |...

Iraq: Road Surface Data

Enriched NYTimes COVID19 U.S. County Dataset

Overview and Inspiration

How this data can be used

Content Details

Visualizations and Analysis Examples

Other Data Notes

Acknowledgements

Generalizing OD Maps to Explore Multi-dimensional Geospatial Datasets

Geospatial data for object-based high-resolution classification of conifers...

Data from: Crime Hot Spot Forecasting with Data from the Pittsburgh...

GIS Data the Netherlands | Mapping Data | 3.7M+ Places in the Netherlands

California Overlapping Cities and Counties and Identifiers

Spatiotemporal Data Analysis with Codeless Visual Programming

Geospatial Deep Learning Seminar Online Course

HIFLD California Dam Lines

SEN12TS: A SAR and Multispectral Dataset for Land Cover Classification

Socio-Demographic Predictors and Distribution of Pulmonary Tuberculosis (TB)...

Global Point of Interest (POI) Data | 230M+ Locations, 5000 Categories, Geographic & Location Intelligence, Regular UpdatesSee More Versions

Global Point of Interest (POI) Data | 230M+ Locations, 5000 Categories, Geographic & Location Intelligence, Regular Updates