41 datasets found
  1. Geospatial Data Pack for Visualization

    • kaggle.com
    zip
    Updated Oct 21, 2025
    Cite
    Vega Datasets (2025). Geospatial Data Pack for Visualization [Dataset]. https://www.kaggle.com/datasets/vega-datasets/geospatial-data-pack
    Available download formats: zip (1422109 bytes)
    Dataset updated
    Oct 21, 2025
    Dataset authored and provided by
    Vega Datasets
    Description

    Geospatial Data Pack for Visualization 🗺️

    Learn Geographic Mapping with Altair, Vega-Lite and Vega using Curated Datasets

    Complete geographic and geophysical data collection for mapping and visualization. This consolidation includes 18 complementary datasets used by 31+ Vega, Vega-Lite, and Altair examples 📊. Perfect for learning geographic visualization techniques including projections, choropleths, point maps, vector fields, and interactive displays.

    Source data lives on GitHub and can also be accessed via CDN. The vega-datasets project serves as a common repository for example datasets used across these visualization libraries and related projects.

    Why Use This Dataset? 🤔

    • Comprehensive Geospatial Types: Explore a variety of core geospatial data models:
      • Vector Data: Includes points (like airports.csv), lines (like londonTubeLines.json), and polygons (like us-10m.json).
      • Raster-like Data: Work with gridded datasets (like windvectors.csv, annual-precip.json).
    • Diverse Formats: Gain experience with standard and efficient geospatial formats like GeoJSON (see Table 1, 2, 4), compressed TopoJSON (see Table 1), and plain CSV/TSV (see Table 2, 3, 4) for point data and attribute tables ready for joining.
    • Multi-Scale Coverage: Practice visualization across different geographic scales, from global and national (Table 1, 4) down to the city level (Table 1).
    • Rich Thematic Mapping: Includes multiple datasets (Table 3) specifically designed for joining attributes to geographic boundaries (like states or counties from Table 1) to create insightful choropleth maps.
    • Ready-to-Use & Example-Driven: Cleaned datasets tightly integrated with 31+ official examples (see Appendix) from Altair, Vega-Lite, and Vega, allowing you to immediately practice techniques like projections, point maps, network maps, and interactive displays.
    • Python Friendly: Works seamlessly with essential Python libraries like Altair (which can directly read TopoJSON/GeoJSON), Pandas, and GeoPandas, fitting perfectly into the Kaggle notebook environment.
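
    The attribute-to-boundary join mentioned in the list above can be sketched with the Python standard library alone. The feature collection, the `rate` column, and every value below are made-up placeholders, not contents of the pack; only the idea of matching rows to geometries on an `id` key comes from the description.

```python
import csv
import io
import json

# Hypothetical stand-ins for a boundary file and an attribute table.
GEOJSON_TEXT = json.dumps({
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "id": 1, "properties": {}, "geometry": None},
        {"type": "Feature", "id": 2, "properties": {}, "geometry": None},
        {"type": "Feature", "id": 3, "properties": {}, "geometry": None},
    ],
})
CSV_TEXT = "id,rate\n1,0.12\n2,0.30\n"


def join_attributes(collection, rows, key="id"):
    """Copy each CSV row onto the feature with the same key; return unmatched ids."""
    lookup = {str(row[key]): row for row in rows}
    unmatched = []
    for feature in collection["features"]:
        row = lookup.get(str(feature.get(key)))
        if row is None:
            unmatched.append(feature.get(key))
            continue
        for name, value in row.items():
            if name != key:
                # Assumes every non-key column is numeric.
                feature["properties"][name] = float(value)
    return unmatched


collection = json.loads(GEOJSON_TEXT)
rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
missing = join_attributes(collection, rows)
```

    Keys are compared as strings because CSV readers return text while GeoJSON ids are often numbers. The returned list of unmatched ids makes gaps in the attribute table visible before anything is drawn.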


    Dataset Inventory 🗂️

    This pack includes 18 datasets covering base maps, reference points, statistical data for choropleths, and geophysical data.

    1. BASE MAP BOUNDARIES (Topological Data)

    Dataset | File | Size | Format | License | Description | Key Fields / Join Info
    US Map (1:10m) | us-10m.json | 627 KB | TopoJSON | CC-BY-4.0 | US state and county boundaries. Contains states and counties objects. Ideal for choropleths. | id (FIPS code) property on geometries
    World Map (1:110m) | world-110m.json | 117 KB | TopoJSON | CC-BY-4.0 | World country boundaries. Contains countries object. Suitable for world-scale viz. | id property on geometries
    London Boroughs | londonBoroughs.json | 14 KB | TopoJSON | CC-BY-4.0 | London borough boundaries. | properties.BOROUGHN (name)
    London Centroids | londonCentroids.json | 2 KB | GeoJSON | CC-BY-4.0 | Center points for London boroughs. | properties.id, properties.name
    London Tube Lines | londonTubeLines.json | 78 KB | GeoJSON | CC-BY-4.0 | London Underground network lines. | properties.name, properties.color

    2. GEOGRAPHIC REFERENCE POINTS (Point Data) 📍

    Dataset | File | Size | Format | License | Description | Key Fields / Join Info
    US Airports | airports.csv | 205 KB | CSV | Public Domain | US airports with codes and coordinates. | iata, state, `l...
  2. Data from: A concentration-based approach to data classification for...

    • tandf.figshare.com
    • figshare.com
    txt
    Updated May 31, 2023
    Cite
    Robert G. Cromley; Shuowei Zhang; Natalia Vorotyntseva (2023). A concentration-based approach to data classification for choropleth mapping [Dataset]. http://doi.org/10.6084/m9.figshare.1456086.v2
    Available download formats: txt
    Dataset updated
    May 31, 2023
    Dataset provided by
    Taylor & Francis (https://taylorandfrancis.com/)
    Authors
    Robert G. Cromley; Shuowei Zhang; Natalia Vorotyntseva
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The choropleth map is a device used for the display of socioeconomic data associated with an areal partition of geographic space. Cartographers emphasize the need to standardize any raw count data by an area-based total before displaying the data in a choropleth map. The standardization process converts the raw data from an absolute measure into a relative measure. However, there is recognition that the standardizing process does not enable the map reader to distinguish between low–low and high–high numerator/denominator differences. This research uses concentration-based classification schemes using Lorenz curves to address some of these issues. A test data set of nonwhite birth rate by county in North Carolina is used to demonstrate how this approach differs from traditional mean–variance-based systems such as the Jenks’ optimal classification scheme.
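
    To make the Lorenz-curve idea concrete, here is a minimal sketch that orders areas by their rate (numerator over denominator) and accumulates the share of each. The counts are hypothetical, and this is not the authors' classification procedure, which the description does not detail; it only shows how such a curve can be built from numerator/denominator pairs.

```python
def lorenz_points(numerators, denominators):
    """Return cumulative (denominator share, numerator share) points.

    Areas are ordered by their rate, so the curve starts at (0, 0),
    ends at (1, 1), and is convex. Denominators must be positive.
    """
    areas = sorted(zip(numerators, denominators), key=lambda nd: nd[0] / nd[1])
    total_n = sum(numerators)
    total_d = sum(denominators)
    points = [(0.0, 0.0)]
    cum_n = cum_d = 0.0
    for n, d in areas:
        cum_n += n
        cum_d += d
        points.append((cum_d / total_d, cum_n / total_n))
    return points


# Hypothetical counts for four areas: events and population at risk.
events = [5, 20, 10, 65]
population = [100, 100, 100, 100]
curve = lorenz_points(events, population)
```

    Class breaks could then be placed along this curve rather than along the raw rate axis, which is one plausible reading of "concentration-based" classification.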

  3. Natural Earth 1:110m Countries

    • kaggle.com
    zip
    Updated Mar 14, 2020
    Cite
    Anton Poznyakovskiy (2020). Natural Earth 1:110m Countries [Dataset]. https://www.kaggle.com/datasets/poznyakovskiy/natural-earth-1110m-countries
    Available download formats: zip (197544 bytes)
    Dataset updated
    Mar 14, 2020
    Authors
    Anton Poznyakovskiy
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset contains geometry data for the countries of the world together with their names and country codes in various formats. The primary use case is choropleths (color-coded maps). The data can be read as a pandas DataFrame with geopandas and plotted with matplotlib. See the starter notebook for an example of how to do it.

    The data was created by Natural Earth. It is in the public domain and free to use for any purpose at the time of this writing; you might want to check their Terms of Use.

    Photo by KOBU Agency on Unsplash

  4. USA states GeoJson

    • kaggle.com
    zip
    Updated Aug 18, 2020
    Cite
    Kate Gallo (2020). USA states GeoJson [Dataset]. https://www.kaggle.com/pompelmo/usa-states-geojson
    Available download formats: zip (30298 bytes)
    Dataset updated
    Aug 18, 2020
    Authors
    Kate Gallo
    Area covered
    United States
    Description

    Context

    I created a dataset to help people create choropleth maps of United States states.

    Content

    One GeoJSON file to plot the state borders, and one CSV from the Census Bureau with the US population per state.

    Inspiration

    I think the best way to use this dataset is in joining it with other data. For example, I used this dataset to plot police killings using the data from https://www.kaggle.com/jpmiller/police-violence-in-the-us

  5. Pakistan Cities— 1,513 locations with lat/lon/pop

    • kaggle.com
    zip
    Updated Aug 17, 2025
    Cite
    Ikram Ul Hassan (2025). Pakistan Cities— 1,513 locations with lat/lon/pop [Dataset]. https://www.kaggle.com/datasets/ikramshah512/pakistan-cities-wikidata-linked-1513-locations
    Available download formats: zip (42829 bytes)
    Dataset updated
    Aug 17, 2025
    Authors
    Ikram Ul Hassan
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    Pakistan
    Description

    A comprehensive dataset of 1,513 Pakistani cities, towns, tehsils, districts and places with latitude/longitude, administrative region, population (when available) and Wikidata IDs — ideal for mapping, geospatial analysis, enrichment, and location-based ML.

    Why this dataset is valuable:

    • Full geocoordinates for every entry (100% coverage) — ready for mapping and spatial joins.
    • Wide geographic coverage across all 7 major regions of Pakistan (provinces / administrative regions).
    • Wikidata IDs included for reliable cross-referencing and automatic enrichment from external knowledge bases.
    • Useful for data scientists, GIS engineers, civic tech projects, academic research, and startups building Pakistan-focused location services.

    Highlights (fetched from the data):

    • Total rows: 1,513
    • Unique places (city field): 1,497
    • Rows with population > 0: 526 (≈34.8%)
    • Coordinate coverage: 1513 / 1513 (100%) — directly usable with mapping libraries.

    Column definitions (short):

    • id — Internal numeric row id (unique integer).
    • wikiDataId — Wikidata QID (e.g., Q####) for the place; use to fetch rich metadata.
    • type — Administrative/place type (e.g., ADM1, ADM2, city, district, tehsil).
    • city — Common/local city/place name (short label).
    • name — Full name / official name of the place (may include “District”, “Tehsil”, etc.).
    • country — Country name (Pakistan).
    • countryCode — ISO country code (e.g., PK).
    • region — Primary administrative region / province (e.g., Punjab, Sindh).
    • regionCode — Short code for region (e.g., PB, KP depending on your encoding).
    • regionWdId — Wikidata QID for the region.
    • latitude — Latitude in decimal degrees (float).
    • longitude — Longitude in decimal degrees (float).
    • population — Integer population (0 or NA where unknown).

    Typical & high-value use cases:

    • Mapping & visualization: choropleth maps, point overlays, heatmaps of population or density.
    • Geospatial analysis: distance calculations, nearest-neighbor queries, clustering of urban centers.
    • Data enrichment: join with other datasets (OpenStreetMap, Wikidata, census data) using wikiDataId and coordinates.
    • Machine learning & NLP: training geolocation models, geoparsing, toponym resolution, place name disambiguation.
    • Urban planning & research: analyze distribution of population-ready places vs administrative units.
    • Mobile / location-based apps: lookup & reverse geocoding fallback, seeding POI databases for Pakistan.
    • Humanitarian & disaster response: baseline location lists for logistics and situational awareness.
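
    The distance calculations and nearest-neighbor queries listed above can be sketched with a haversine function over the `latitude` and `longitude` columns. The three rows below are hand-typed placeholders with rounded coordinates, not rows read from the dataset, and the brute-force search is an assumption that suits a table of about 1,500 places.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest(rows, lat, lon):
    """Return the row closest to (lat, lon) by brute-force search."""
    return min(rows, key=lambda r: haversine_km(lat, lon, r["latitude"], r["longitude"]))


# Placeholder rows using the column names from the list above.
places = [
    {"city": "Karachi", "latitude": 24.86, "longitude": 67.01},
    {"city": "Lahore", "latitude": 31.55, "longitude": 74.34},
    {"city": "Islamabad", "latitude": 33.69, "longitude": 73.05},
]
closest = nearest(places, 33.6, 73.0)
karachi_lahore_km = haversine_km(24.86, 67.01, 31.55, 74.34)
```

    For much larger tables a spatial index would be the usual replacement for the linear scan.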
  6. Mapping 2021 Census Data using the Living Atlas

    • lecture-with-gis-esriukeducation.hub.arcgis.com
    • teachwithgis.co.uk
    Updated Apr 30, 2025
    Cite
    Esri UK Education (2025). Mapping 2021 Census Data using the Living Atlas [Dataset]. https://lecture-with-gis-esriukeducation.hub.arcgis.com/datasets/mapping-2021-census-data-using-the-living-atlas
    Dataset updated
    Apr 30, 2025
    Dataset provided by
    Esri (http://esri.com/)
    Authors
    Esri UK Education
    Description

    Anyone who has taught GIS using Census Data knows it is an invaluable data set for showing students how to take data stored in a table and join it to boundary data, transforming it into something that can be visualised and analysed spatially. Joins are a core GIS skill and need to be learnt, as not every data set is going to come neatly packaged as a shapefile or feature layer with all the data you need stored within. I don't know how many times I taught students to download data as a table from Nomis, load it into a GIS and then join that table data to the appropriate boundary data so they could produce choropleth maps to do some visual analysis, but it was a lot! Once students had gotten the hang of joins using census data, they'd often ask why this data doesn't exist as a prepackaged feature layer with all the data they wanted within it. Well, good news: now a lot of it does, and it's accessible through the Living Atlas! Don't get me wrong, I fully understand the importance of teaching students how to perform joins, but once you have this understanding, if you can access data that already contains all the information you need, then you should take advantage of it to save time. So in this exercise I am going to show you how to load English and Welsh Census Data from the 2021 Census into the ArcGIS Map Viewer from the Living Atlas and produce some choropleth maps for visual analysis without having to perform a single join.

  7. How to select appropriate hue ranges for sequential color schemes on...

    • search.dataone.org
    • data.niaid.nih.gov
    • +1 more
    Updated Apr 3, 2025
    Cite
    Tai sheng Chen; Xi Lv; Kun Hu; Meng lin Chen; Lu Cheng; Wei xing Jiang (2025). How to select appropriate hue ranges for sequential color schemes on choropleth maps? A quantitative evaluation using map reading experiments [Dataset]. http://doi.org/10.5061/dryad.c59zw3rdt
    Dataset updated
    Apr 3, 2025
    Dataset provided by
    Dryad Digital Repository
    Authors
    Tai sheng Chen; Xi Lv; Kun Hu; Meng lin Chen; Lu Cheng; Wei xing Jiang
    Time period covered
    Jan 1, 2023
    Description

    We propose map reading experiments to quantitatively evaluate the selection of hue ranges for sequential color schemes on choropleth maps. In these experiments, 60 sequential color schemes with six base hues and ten hue ranges were employed as experimental color schemes, and a total of 414 college students were invited to complete identification, comparison, and ranking tasks. Both controlled and real-map experiments were performed, each involving a web-based survey and an eye-tracking experiment. In the controlled experiments, the shapes of the map objects were relatively regular, and attribute data were randomized. In contrast, the shapes were complex in real-map experiments, and real data were employed. Our findings show that widely used color schemes with a hue range of 0° yield poor performance in all tasks; 15° hue ranges yield good performance in the comparison and ranking tasks but poor performance in the identification task. For large hue ranges of 120-360°, participants showed...
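
    As an illustration of what "base hue" and "hue range" can mean for a sequential scheme, the sketch below generates class colors whose hue shifts linearly across a given range while lightness falls. How the study actually constructed its 60 schemes is not stated in this excerpt, so the linear interpolation, the lightness limits, and the fixed saturation are all assumptions.

```python
import colorsys


def sequential_scheme(base_hue_deg, hue_range_deg, n_classes=5):
    """Return n_classes hex colours ordered from light to dark.

    Hue shifts linearly across hue_range_deg starting at base_hue_deg;
    lightness falls linearly from 0.9 to 0.3 (an arbitrary choice).
    """
    colours = []
    for i in range(n_classes):
        t = i / (n_classes - 1) if n_classes > 1 else 0.0
        hue = ((base_hue_deg + t * hue_range_deg) % 360) / 360.0
        lightness = 0.9 - 0.6 * t
        r, g, b = colorsys.hls_to_rgb(hue, lightness, 0.8)
        colours.append("#{:02x}{:02x}{:02x}".format(round(r * 255), round(g * 255), round(b * 255)))
    return colours


# A single-hue scheme (hue range 0) and one with a 15-degree hue range.
single_hue = sequential_scheme(base_hue_deg=210, hue_range_deg=0)
shifted_hue = sequential_scheme(base_hue_deg=210, hue_range_deg=15)
```

    Both schemes share their lightest class, since the hue shift is zero at the start of the ramp, and diverge toward the darkest class.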

  8. NYC zipcode geodata

    • kaggle.com
    zip
    Updated Sep 23, 2019
    Cite
    Saidakbarp (2019). NYC zipcode geodata [Dataset]. https://www.kaggle.com/saidakbarp/nyc-zipcode-geodata
    Available download formats: zip (552766 bytes)
    Dataset updated
    Sep 23, 2019
    Authors
    Saidakbarp
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    New York
    Description

    Context

    I used this publicly available data to make an interactive map visualization of NYC. Zip code geodata is useful for building interactive maps in which each zip code is drawn as a separate area.

    Content

    NYC zipcode geodata in geojson format

    Acknowledgements

    The rights belong to the original authors.

  9. Geographic data of Japan

    • kaggle.com
    zip
    Updated May 8, 2021
    Cite
    zhanhao h. (2021). Geographic data of Japan [Dataset]. https://www.kaggle.com/zhanhaoh/geographic-data-of-japan
    Available download formats: zip (1257833 bytes)
    Dataset updated
    May 8, 2021
    Authors
    zhanhao h.
    Area covered
    Japan
    Description

    Context

    The dataset defines the polygon shapes of the prefectures of Japan. It is convenient for plotting Mapbox choropleth maps with the plotly package. It is a small modification of the dataset at https://github.com/dataofjapan/land/blob/master/japan.geojson.

    Content

    Each prefecture is assigned an id. The id is the prefecture name, for example 'Kyoto' for Kyoto Prefecture and 'Okinawa' for Okinawa Prefecture.

    Acknowledgements

    It is a small modification of the original dataset at https://github.com/dataofjapan/land/blob/master/japan.geojson. I have added an id to each element so that it can be conveniently used for plotting Mapbox choropleth maps.
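
    The modification described here, giving every feature an id, can be sketched as follows. The two-feature collection and the `name` property are assumptions for illustration; the field names in the actual file may differ.

```python
import json


def add_feature_ids(collection, source_property):
    """Copy a property value to the top-level `id` member of every feature."""
    for feature in collection["features"]:
        feature["id"] = feature["properties"][source_property]
    return collection


# Hypothetical collection; `name` is an assumed property name.
raw = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"name": "Kyoto"}, "geometry": None},
        {"type": "Feature", "properties": {"name": "Okinawa"}, "geometry": None},
    ],
}
with_ids = add_feature_ids(raw, "name")
serialized = json.dumps(with_ids)
```

    A top-level `id` is convenient because Plotly's choropleth traces match their `locations` against the feature `id` unless `featureidkey` points elsewhere.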

  10. List of Countries and their Population

    • kaggle.com
    Updated Apr 12, 2025
    Cite
    Anah Chukwujekwu (2025). List of Countries and their Population [Dataset]. https://www.kaggle.com/datasets/anahchukwujekwu/list-of-countries-and-their-population/discussion
    Available download formats: Croissant. Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Apr 12, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Anah Chukwujekwu
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    🌍 Countries and Dependencies by Population (2025)

    This dataset provides a comprehensive list of countries and dependent territories worldwide, along with their most recent population estimates. The data is sourced from the Wikipedia page List of countries and dependencies by population, which compiles figures from national statistical offices and the United Nations Population Division.

    📄 Dataset Overview

    • Country/Territory Name Includes sovereign states, dependent territories, and regions with limited recognition.
    • Population Latest available estimates, primarily from national censuses or UN projection.
    • Percentage of World Population Each country's population as a percentage of the global total.
    • Date of Estimate The reference date for the population figure.
    • Notes Additional information, such as inclusion or exclusion of certain region.

    🧠 Potential Use Cases

    • Analyzing global population distribution and trends.
    • Creating visualizations like choropleth maps.
    • Normalizing other datasets by population for per capita analysis.
    • Educational purposes in demographics and geography.

    📌 Notes

    • The dataset includes territories and regions with limited recognition to provide a complete global perspective.
    • Population figures are based on the most recent estimates available as of 2025.
    • Data may be subject to revisions as new census information becomes available.
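
    The per capita normalization and percentage-of-total columns described above reduce to two small functions. The territories and figures below are made up for illustration and are not taken from the dataset.

```python
def population_shares(populations):
    """Map each name to its percentage of the summed population."""
    total = sum(populations.values())
    return {name: 100.0 * value / total for name, value in populations.items()}


def per_capita(counts, populations, per=100_000):
    """Normalize raw counts to a rate per `per` inhabitants."""
    return {name: per * counts[name] / populations[name] for name in counts}


# Made-up territories and figures, for illustration only.
populations = {"A": 1_000_000, "B": 3_000_000}
cases = {"A": 50, "B": 90}
shares = population_shares(populations)
rates = per_capita(cases, populations)
```

    Note that the share is relative to the sum of the listed rows, which equals the world total only if the list is complete. Here territory B has more cases but the lower rate, which is the reason to normalize before mapping.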
  11. Data from: CrimeMapTutorial Workbooks and Sample Data for ArcView and...

    • catalog.data.gov
    • icpsr.umich.edu
    • +1 more
    Updated Nov 14, 2025
    Cite
    National Institute of Justice (2025). CrimeMapTutorial Workbooks and Sample Data for ArcView and MapInfo, 2000 [Dataset]. https://catalog.data.gov/dataset/crimemaptutorial-workbooks-and-sample-data-for-arcview-and-mapinfo-2000-3c9be
    Dataset updated
    Nov 14, 2025
    Dataset provided by
    National Institute of Justice
    Description

    CrimeMapTutorial is a step-by-step tutorial for learning crime mapping using ArcView GIS or MapInfo Professional GIS. It was designed to give users a thorough introduction to most of the knowledge and skills needed to produce daily maps and spatial data queries that uniformed officers and detectives find valuable for crime prevention and enforcement. The tutorials can be used either for self-learning or in a laboratory setting. The geographic information system (GIS) and police data were supplied by the Rochester, New York, Police Department. For each mapping software package, there are three PDF tutorial workbooks and one WinZip archive containing sample data and maps. Workbook 1 was designed for GIS users who want to learn how to use a crime-mapping GIS and how to generate maps and data queries. Workbook 2 was created to assist data preparers in processing police data for use in a GIS. This includes address-matching of police incidents to place them on pin maps and aggregating crime counts by areas (like car beats) to produce area or choropleth maps. Workbook 3 was designed for map makers who want to learn how to construct useful crime maps, given police data that have already been address-matched and preprocessed by data preparers. It is estimated that the three tutorials take approximately six hours to complete in total, including exercises.

  12. FOLIUM_INDIA

    • kaggle.com
    zip
    Updated Jun 15, 2020
    Cite
    KD007 (2020). FOLIUM_INDIA [Dataset]. https://www.kaggle.com/krishcross/india-shape-map
    Available download formats: zip (16183750 bytes)
    Dataset updated
    Jun 15, 2020
    Authors
    KD007
    Area covered
    India
    Description

    Folium makes it easy to visualize data that’s been manipulated in Python on an interactive Leaflet map. It enables both the binding of data to a map for choropleth visualizations and the passing of rich vector/raster/HTML visualizations as markers on the map. These files can be used to draw the state boundaries on a map of India with the folium library; the CSV contains the state data for use in notebooks. I have used them in one of my kernels, which can be viewed.

    The library has a number of built-in tilesets from OpenStreetMap, Mapbox, and Stamen, and supports custom tilesets with Mapbox or Cloudmade API keys. Folium supports image, video, GeoJSON, and TopoJSON overlays. Because it is so extensible, I find folium the best map plotting library in Python. Do give it a try and use it in your kernels.

  13. Data from: geojson

    • kaggle.com
    zip
    Updated Mar 24, 2020
    Cite
    JP (2020). geojson [Dataset]. https://www.kaggle.com/para24/geojson
    Available download formats: zip (2792291 bytes)
    Dataset updated
    Mar 24, 2020
    Authors
    JP
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    GeoJSON file containing India state borders along with their non-spatial attributes (id, name, etc) for use in Plotly Choropleth maps.

  14. Spatial polygon data used for Fig 2.

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Dec 5, 2024
    Cite
    Chongsuvivatwong, Virasakdi; Chumchuen, Kemmapon; Wichaidit, Wit (2024). Spatial polygon data used for Fig 2. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001358357
    Dataset updated
    Dec 5, 2024
    Authors
    Chongsuvivatwong, Virasakdi; Chumchuen, Kemmapon; Wichaidit, Wit
    Description

    In June 2022, Thailand legalized recreational cannabis. Cannabis is now the most consumed drug. Cannabis usage can increase inflammatory responses in the respiratory tract. Sharing of cannabis waterpipes has been linked to increased tuberculosis risks. Using a national in-patient databank, we aimed to 1) describe the spatiotemporal correlation between cannabis-related and tuberculosis hospital admissions, and 2) compare the rate of subsequent pulmonary tuberculosis admission between those with prior admissions for cannabis-related causes and those without. Both admission types were aggregated to the number of admissions in monthly and provincial units. Temporal and spatial patterns were visualized using line plots and choropleth maps, respectively. A matched cohort analysis was conducted to compare the incidence density rate of subsequent tuberculosis admission and the hazard ratio. Throughout 2017–2022, we observed a gradual decline in tuberculosis admissions, in contrast to the increase in cannabis-related admissions. Both admissions shared a hotspot in Northeastern Thailand. Between matched cohorts of 6,773 in-patients, the incidence density rate per 100,000 person–years of subsequent tuberculosis admissions was 267.6 and 165.9 in in-patients with and without past cannabis-admission, respectively. After adjusting for covariates, we found that a cannabis-related admission history was associated with a hazard ratio of 1.48 (P = 0.268) for subsequent tuberculosis admission. Our findings failed to support the evidence that cannabis consumption increased pulmonary tuberculosis risk. Other study types are needed to further assess the association between cannabis consumption and pulmonary tuberculosis.
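
    For readers unfamiliar with incidence density rates, the sketch below shows the arithmetic and computes the crude ratio of the two rates reported above. The event and person-year counts passed to the function are hypothetical, since the description gives only the resulting rates. The crude ratio is not the study's hazard ratio of 1.48, which comes from a covariate-adjusted model.

```python
def incidence_density_rate(events, person_years, per=100_000):
    """Events per `per` person-years of follow-up."""
    return per * events / person_years


# Rates per 100,000 person-years as reported in the description.
rate_with_history = 267.6
rate_without_history = 165.9

# Crude (unadjusted) ratio of the two reported rates, roughly 1.61.
crude_rate_ratio = rate_with_history / rate_without_history

# Hypothetical counts, only to exercise the function:
# 12 events over 8,000 person-years.
example_rate = incidence_density_rate(12, 8_000)
```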

  15. Synthetic population for JOR

    • zenodo.org
    bin, csv, pdf, zip
    Updated Jul 16, 2024
    Cite
    Abhijin Adiga; Hannah Baek; Stephen Eubank; Przemyslaw Porebski; Madhav Marathe; Henning Mortveit; Samarth Swarup; Mandy Wilson; Dawen Xie (2024). Synthetic population for JOR [Dataset]. http://doi.org/10.5281/zenodo.6503398
    Available download formats: pdf, zip, csv, bin
    Dataset updated
    Jul 16, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Abhijin Adiga; Hannah Baek; Stephen Eubank; Przemyslaw Porebski; Madhav Marathe; Henning Mortveit; Samarth Swarup; Mandy Wilson; Dawen Xie
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Synthetic populations for regions of the World (SPW) | Jordan

    Dataset information

    A synthetic population of a region as provided here, captures the people of the region with selected demographic attributes, their organization into households, their assigned activities for a day, the locations where the activities take place and thus where interactions among population members happen (e.g., spread of epidemics).

    License

    CC-BY-4.0

    Acknowledgment

    This project was supported by the National Science Foundation under the NSF RAPID: COVID-19 Response Support: Building Synthetic Multi-scale Networks (PI: Madhav Marathe, Co-PIs: Henning Mortveit, Srinivasan Venkatramanan; Fund Number: OAC-2027541).

    Contact information

    Henning.Mortveit@virginia.edu

    Identifiers

    Region name: Jordan
    Region ID: jor
    Model: coarse
    Version: 0_9_0

    Statistics

    Name | Value
    Population | 5723567.0
    Average age | 23.5
    Households | 1235755.0
    Average household size | 4.6
    Residence locations | 1235755.0
    Activity locations | 131978.0
    Average number of activities | 6.4
    Average travel distance | 44.5
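
    The statistics above can be cross-checked with one division, assuming the average household size is simply the population divided by the number of households. The record does not state how the figure was computed, so that formula is an assumption.

```python
# Values copied from the Statistics table for Jordan.
population = 5723567.0
households = 1235755.0

# Assumption: average household size = population / households.
average_household_size = population / households
rounded = round(average_household_size, 1)
```

    Under that assumption the result rounds to the 4.6 listed in the table.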

    Sources

    Description | Name | Version | Url
    Activity template data | World Bank | 2021 | https://data.worldbank.org
    Administrative boundaries | ADCW | 7.6 | https://www.adci.com/adc-worldmap
    Curated POIs based on OSM | SLIPO/OSM POIs | | http://slipo.eu/?p=1551 https://www.openstreetmap.org/
    Household data | DHS | | https://dhsprogram.com
    Population count with demographic attributes | GPW | v4.11 | https://sedac.ciesin.columbia.edu/data/set/gpw-v4-admin-unit-center-points-population-estimates-rev11

    Files description

    Base data files (jor_data_v_0_9.zip)

    Filename | Description
    jor_person_v_0_9.csv | Data for each person including attributes such as age, gender, and household ID.
    jor_household_v_0_9.csv | Data at household level.
    jor_residence_locations_v_0_9.csv | Data about residence locations
    jor_activity_locations_v_0_9.csv | Data about activity locations, including what activity types are supported at these locations
    jor_activity_location_assignment_v_0_9.csv | For each person and for each of their activities, this file specifies the location where the activity takes place

    Derived data files

    Filename | Description
    jor_contact_matrix_v_0_9.csv | A POLYMOD-type contact matrix constructed from a network representation of the location assignment data and a within-location contact model.

    Validation and measures files

    Filename | Description
    jor_household_grouping_validation_v_0_9.pdf | Validation plots for household construction
    jor_activity_durations_{adult,child}_v_0_9.pdf | Comparison of time spent on generated activities with survey data
    jor_activity_patterns_{adult,child}_v_0_9.pdf | Comparison of generated activity patterns by the time of day with survey data
    jor_location_construction_0_9.pdf | Validation plots for location construction
    jor_location_assignement_0_9.pdf | Validation plots for location assignment, including travel distribution plots
    jor_jor_ver_0_9_0_avg_travel_distance.pdf | Choropleth map visualizing average travel distance
    jor_jor_ver_0_9_0_travel_distr_combined.pdf | Travel distance distribution
    jor_jor_ver_0_9_0_num_activity_loc.pdf | Choropleth map visualizing number of activity locations
    jor_jor_ver_0_9_0_avg_age.pdf | Choropleth map visualizing average age
    jor_jor_ver_0_9_0_pop_density_per_sqkm.pdf | Choropleth map visualizing population density
    jor_jor_ver_0_9_0_pop_size.pdf | Choropleth map visualizing population size

  16. Synthetic population for IND_DELHI

    • zenodo.org
    bin, pdf, zip
    Updated Jul 16, 2024
    Cite
    Abhijin Adiga; Hannah Baek; Stephen Eubank; Przemyslaw Porebski; Madhav Marathe; Henning Mortveit; Samarth Swarup; Mandy Wilson; Dawen Xie (2024). Synthetic population for IND_DELHI [Dataset]. http://doi.org/10.5281/zenodo.6505994
    Available download formats: pdf, zip, bin
    Dataset updated
    Jul 16, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Abhijin Adiga; Hannah Baek; Stephen Eubank; Przemyslaw Porebski; Madhav Marathe; Henning Mortveit; Samarth Swarup; Mandy Wilson; Dawen Xie
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Delhi, India
    Description

    Synthetic populations for regions of the World (SPW) | Delhi

    Dataset information

    A synthetic population of a region as provided here, captures the people of the region with selected demographic attributes, their organization into households, their assigned activities for a day, the locations where the activities take place and thus where interactions among population members happen (e.g., spread of epidemics).

    License

    CC-BY-4.0

    Acknowledgment

    This project was supported by the National Science Foundation under the NSF RAPID: COVID-19 Response Support: Building Synthetic Multi-scale Networks (PI: Madhav Marathe, Co-PIs: Henning Mortveit, Srinivasan Venkatramanan; Fund Number: OAC-2027541).

    Contact information

    Henning.Mortveit@virginia.edu

    Identifiers

    Region name | Delhi
    Region ID | ind_140001944
    Model | coarse
    Version | 0_9_0

    Statistics

    Name | Value
    Population | 15951510
    Average age | 28.2
    Households | 3625935
    Average household size | 4.4
    Residence locations | 3625935
    Activity locations | 1309377
    Average number of activities | 5.5
    Average travel distance | 26.6
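
The reported average household size can be cross-checked against the population and household counts in the same table. The following is a minimal sketch, assuming the average is simply population divided by households; the source does not state how the figure was computed.

```python
# Figures copied from the Statistics table for Delhi.
population = 15_951_510
households = 3_625_935

# Assumption: average household size = population / households.
avg_household_size = population / households

# Rounded to one decimal place, as in the table.
print(round(avg_household_size, 1))
```

Under that assumption the rounded result agrees with the 4.4 listed in the table.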

    Sources

    Description | Name | Version | Url
    Activity template data | World Bank | 2021 | https://data.worldbank.org
    Administrative boundaries | ADCW | 7.6 | https://www.adci.com/adc-worldmap
    Curated POIs based on OSM | SLIPO/OSM POIs | | http://slipo.eu/?p=1551 https://www.openstreetmap.org/
    Household data | DHS | | https://dhsprogram.com
    Population count with demographic attributes | GPW | v4.11 | https://sedac.ciesin.columbia.edu/data/set/gpw-v4-admin-unit-center-points-population-estimates-rev11

    Files description

    Base data files (ind_140001944_data_v_0_9.zip)

    Filename | Description
    ind_140001944_person_v_0_9.csv | Data for each person including attributes such as age, gender, and household ID.
    ind_140001944_household_v_0_9.csv | Data at household level.
    ind_140001944_residence_locations_v_0_9.csv | Data about residence locations
    ind_140001944_activity_locations_v_0_9.csv | Data about activity locations, including what activity types are supported at these locations
    ind_140001944_activity_location_assignment_v_0_9.csv | For each person and for each of their activities, this file specifies the location where the activity takes place
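
Because each person record carries a household ID, the person and household files can be linked on that key. The sketch below shows one way to do this with the standard library. The inline sample data and the column names `pid`, `hid`, `age`, and `gender` are hypothetical stand-ins, not the published schema.

```python
import csv
import io
from collections import Counter

# Hypothetical stand-ins for the person and household files; the column
# names (pid, hid, age, gender) are assumptions, not the published schema.
PERSON_CSV = """pid,hid,age,gender
1,100,34,F
2,100,36,M
3,100,5,F
4,200,61,M
"""
HOUSEHOLD_CSV = """hid
100
200
"""

def household_sizes(person_file, household_file):
    """Count persons per household ID; households with no persons get 0."""
    sizes = Counter(row["hid"] for row in csv.DictReader(person_file))
    return {row["hid"]: sizes.get(row["hid"], 0)
            for row in csv.DictReader(household_file)}

sizes = household_sizes(io.StringIO(PERSON_CSV), io.StringIO(HOUSEHOLD_CSV))
print(sizes)
```

With the real files, the `io.StringIO` objects would be replaced by opened file handles and the key column by whatever name the CSV headers actually use.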

    Derived data files

    Filename | Description
    ind_140001944_contact_matrix_v_0_9.csv | A POLYMOD-type contact matrix constructed from a network representation of the location assignment data and a within-location contact model.

    Validation and measures files

    Filename | Description
    ind_140001944_household_grouping_validation_v_0_9.pdf | Validation plots for household construction
    ind_140001944_activity_durations_{adult,child}_v_0_9.pdf | Comparison of time spent on generated activities with survey data
    ind_140001944_activity_patterns_{adult,child}_v_0_9.pdf | Comparison of generated activity patterns by the time of day with survey data
    ind_140001944_location_construction_0_9.pdf | Validation plots for location construction
    ind_140001944_location_assignement_0_9.pdf | Validation plots for location assignment, including travel distribution plots
    ind_140001944_ind_140001944_ver_0_9_0_avg_travel_distance.pdf | Choropleth map visualizing average travel distance
    ind_140001944_ind_140001944_ver_0_9_0_travel_distr_combined.pdf | Travel distance distribution
    ind_140001944_ind_140001944_ver_0_9_0_num_activity_loc.pdf | Choropleth map visualizing number of activity locations
    ind_140001944_ind_140001944_ver_0_9_0_avg_age.pdf | Choropleth map visualizing average age
    ind_140001944_ind_140001944_ver_0_9_0_pop_density_per_sqkm.pdf | Choropleth map visualizing population density
    ind_140001944_ind_140001944_ver_0_9_0_pop_size.pdf | Choropleth map visualizing population size

  17. World shapefile

    • kaggle.com
    zip
    Updated Jul 24, 2023
    Cite
    Kamile Novaes (2023). World shapefile [Dataset]. https://www.kaggle.com/datasets/kamilenovaes/world-shapefile/code
    Available download formats
    zip (206143 bytes)
    Dataset updated
    Jul 24, 2023
    Authors
    Kamile Novaes
    License

    ODC Public Domain Dedication and Licence (PDDL) v1.0: http://www.opendatacommons.org/licenses/pddl/1.0/
    License information was derived automatically

    Area covered
    World
    Description

    This dataset contains a comprehensive collection of geographic shapefiles representing the boundaries of countries and territories worldwide. The shapefiles define the outlines of each nation and are based on the most recent and accurate geographical data available. The dataset includes polygon geometries that accurately represent the territorial extent of each country, making it suitable for various geographical analyses, visualizations, and spatial applications.

    Content: The dataset comprises shapefiles in the ESRI shapefile format (.shp) along with associated files (.shx, .dbf, etc.) that contain the attributes of each country, such as country names, ISO codes, and other relevant information. The polygons in the shapefiles correspond to the land boundaries of each nation, enabling precise mapping and spatial analysis.

    Use Cases: This dataset can be utilized in a wide range of applications, including but not limited to:

    • Creating choropleth maps to visualize and analyze various socio-economic indicators by country.
    • Conducting spatial analysis to study population distribution, territorial areas, and geographic trends.
    • Performing geopolitical research and country-level comparisons.
    • Integrating with other datasets to enrich geographic analyses and insights.

    Source: The shapefile data is sourced from reputable and authoritative geographic databases, ensuring its accuracy and reliability for diverse applications.
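
The choropleth and integration use cases both come down to joining external values onto the country attribute records by a shared code. The sketch below illustrates such a join in plain Python. The field names `ISO3` and `NAME`, the sample records, and the indicator values are all hypothetical; the actual attribute table in the shapefile may use different names.

```python
# Hypothetical attribute records, as they might be read from the .dbf table;
# the field names "ISO3" and "NAME" are assumptions.
countries = [
    {"ISO3": "BRA", "NAME": "Brazil"},
    {"ISO3": "IND", "NAME": "India"},
    {"ISO3": "JOR", "NAME": "Jordan"},
]

# Made-up indicator values keyed by ISO code, for illustration only.
indicator = {"BRA": 1.5, "IND": 2.5, "XXX": 9.9}

def join_indicator(countries, indicator, key="ISO3"):
    """Left-join indicator values onto country records.

    Returns the joined records (value is None when no match exists) and
    the sorted list of indicator codes that matched no country.
    """
    joined = [{**c, "value": indicator.get(c[key])} for c in countries]
    known = {c[key] for c in countries}
    unmatched = sorted(set(indicator) - known)
    return joined, unmatched

joined, unmatched = join_indicator(countries, indicator)
print(joined)
print(unmatched)
```

Reporting the unmatched codes is a deliberate choice: mismatched or outdated country codes are a common reason for blank polygons on a choropleth.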

  18. Synthetic population for IND_MANIPUR

    • zenodo.org
    bin, pdf, zip
    Updated Jul 16, 2024
    Cite
    Abhijin Adiga; Hannah Baek; Stephen Eubank; Przemyslaw Porebski; Madhav Marathe; Henning Mortveit; Samarth Swarup; Mandy Wilson; Dawen Xie (2024). Synthetic population for IND_MANIPUR [Dataset]. http://doi.org/10.5281/zenodo.6506020
    Available download formats
    pdf, bin, zip
    Dataset updated
    Jul 16, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Abhijin Adiga; Hannah Baek; Stephen Eubank; Przemyslaw Porebski; Madhav Marathe; Henning Mortveit; Samarth Swarup; Mandy Wilson; Dawen Xie
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Manipur, India
    Description

    Synthetic populations for regions of the World (SPW) | Manipur

    Dataset information

    A synthetic population of a region, as provided here, captures the people of the region with selected demographic attributes, their organization into households, their assigned activities for a day, and the locations where those activities take place and thus where interactions among population members happen (e.g., the spread of epidemics).

    License

    CC-BY-4.0

    Acknowledgment

    This project was supported by the National Science Foundation under the NSF RAPID: COVID-19 Response Support: Building Synthetic Multi-scale Networks (PI: Madhav Marathe, Co-PIs: Henning Mortveit, Srinivasan Venkatramanan; Fund Number: OAC-2027541).

    Contact information

    Henning.Mortveit@virginia.edu

    Identifiers

    Region name | Manipur
    Region ID | ind_140001942
    Model | coarse
    Version | 0_9_0

    Statistics

    Name | Value
    Population | 2796700
    Average age | 27.5
    Households | 635806
    Average household size | 4.4
    Residence locations | 635806
    Activity locations | 192709
    Average number of activities | 5.5
    Average travel distance | 78.3

    Sources

    Description | Name | Version | Url
    Activity template data | World Bank | 2021 | https://data.worldbank.org
    Administrative boundaries | ADCW | 7.6 | https://www.adci.com/adc-worldmap
    Curated POIs based on OSM | SLIPO/OSM POIs | | http://slipo.eu/?p=1551 https://www.openstreetmap.org/
    Household data | DHS | | https://dhsprogram.com
    Population count with demographic attributes | GPW | v4.11 | https://sedac.ciesin.columbia.edu/data/set/gpw-v4-admin-unit-center-points-population-estimates-rev11

    Files description

    Base data files (ind_140001942_data_v_0_9.zip)

    Filename | Description
    ind_140001942_person_v_0_9.csv | Data for each person including attributes such as age, gender, and household ID.
    ind_140001942_household_v_0_9.csv | Data at household level.
    ind_140001942_residence_locations_v_0_9.csv | Data about residence locations
    ind_140001942_activity_locations_v_0_9.csv | Data about activity locations, including what activity types are supported at these locations
    ind_140001942_activity_location_assignment_v_0_9.csv | For each person and for each of their activities, this file specifies the location where the activity takes place

    Derived data files

    Filename | Description
    ind_140001942_contact_matrix_v_0_9.csv | A POLYMOD-type contact matrix constructed from a network representation of the location assignment data and a within-location contact model.
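
To make the idea of a contact matrix derived from location assignments more concrete, here is a toy sketch. It is not the authors' method: it assumes a deliberately simple within-location model in which every pair of people assigned to the same location counts as one contact, and it ignores time of day. The person IDs, age groups, and location names are invented for illustration.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy inputs; the real files have their own schema.
age_group = {"p1": "child", "p2": "adult", "p3": "adult", "p4": "child"}
visits = [("p1", "home1"), ("p2", "home1"), ("p3", "work1"),
          ("p2", "work1"), ("p4", "school1"), ("p1", "school1")]

def contact_matrix(age_group, visits):
    """Mean contacts per person, indexed by (own group, contact's group)."""
    # Build the person-location network: who is assigned to each location.
    by_location = defaultdict(set)
    for person, location in visits:
        by_location[location].add(person)

    # Assumed within-location model: each co-located pair is one contact,
    # counted once from each participant's point of view.
    totals = defaultdict(float)
    for people in by_location.values():
        for a, b in combinations(sorted(people), 2):
            totals[(age_group[a], age_group[b])] += 1
            totals[(age_group[b], age_group[a])] += 1

    # Normalise each row by the number of people in that age group.
    group_size = defaultdict(int)
    for group in age_group.values():
        group_size[group] += 1
    return {pair: n / group_size[pair[0]] for pair, n in totals.items()}

matrix = contact_matrix(age_group, visits)
print(matrix)
```

A realistic within-location model would also account for how long people overlap at a location and for its capacity, which this sketch leaves out.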

    Validation and measures files

    Filename | Description
    ind_140001942_household_grouping_validation_v_0_9.pdf | Validation plots for household construction
    ind_140001942_activity_durations_{adult,child}_v_0_9.pdf | Comparison of time spent on generated activities with survey data
    ind_140001942_activity_patterns_{adult,child}_v_0_9.pdf | Comparison of generated activity patterns by the time of day with survey data
    ind_140001942_location_construction_0_9.pdf | Validation plots for location construction
    ind_140001942_location_assignement_0_9.pdf | Validation plots for location assignment, including travel distribution plots
    ind_140001942_ind_140001942_ver_0_9_0_avg_travel_distance.pdf | Choropleth map visualizing average travel distance
    ind_140001942_ind_140001942_ver_0_9_0_travel_distr_combined.pdf | Travel distance distribution
    ind_140001942_ind_140001942_ver_0_9_0_num_activity_loc.pdf | Choropleth map visualizing number of activity locations
    ind_140001942_ind_140001942_ver_0_9_0_avg_age.pdf | Choropleth map visualizing average age
    ind_140001942_ind_140001942_ver_0_9_0_pop_density_per_sqkm.pdf | Choropleth map visualizing population density
    ind_140001942_ind_140001942_ver_0_9_0_pop_size.pdf | Choropleth map visualizing population size

  19. List of sociodemographic variables used in PCA analysis to create new...

    • plos.figshare.com
    xls
    Updated Jun 21, 2023
    Cite
    Julia Elizabeth Isaacson; Jinny Jing Ye; Lincoln Luís Silva; Thiago Augusto Hernandes Rocha; Luciano de Andrade; Joao Felipe Hermann Costa Scheidt; Fan Hui Wen; Jacqueline Sachett; Wuelton Marcelo Monteiro; Catherine Ann Staton; Joao Ricardo Nickenig Vissoci; Charles John Gerardo (2023). List of sociodemographic variables used in PCA analysis to create new indicators for spatial analysis. [Dataset]. http://doi.org/10.1371/journal.pntd.0011305.t002
    Available download formats
    xls
    Dataset updated
    Jun 21, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Julia Elizabeth Isaacson; Jinny Jing Ye; Lincoln Luís Silva; Thiago Augusto Hernandes Rocha; Luciano de Andrade; Joao Felipe Hermann Costa Scheidt; Fan Hui Wen; Jacqueline Sachett; Wuelton Marcelo Monteiro; Catherine Ann Staton; Joao Ricardo Nickenig Vissoci; Charles John Gerardo
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    List of sociodemographic variables used in PCA analysis to create new indicators for spatial analysis.

  20. NFIP Community Layer No Overlaps Whole

    • catalog.data.gov
    • gimi9.com
    • +1 more
    Updated Jun 7, 2025
    Cite
    FEMA/Resilience/Federal Insurance Directorate (2025). NFIP Community Layer No Overlaps Whole [Dataset]. https://catalog.data.gov/dataset/nfip-community-layer-no-overlaps-whole
    Dataset updated
    Jun 7, 2025
    Dataset provided by
    Federal Emergency Management Agency (http://www.fema.gov/)
    Description

    This dataset is flattened and multicounty communities are unsplit by county lines. Flattened means that there are no overlaps; larger shapes like counties are punched out or clipped where smaller communities are contained within them. This allows for choropleth shading and other mapping techniques such as calculating unincorporated county land area. Multicounty cities like Houston are a single feature, undivided by counties. This layer is derived from Census, State of Maine, and National Flood Hazard Layer political boundaries.

    The Community Layer datasets contain geospatial community boundaries associated with Census and NFIP data. The dataset does not contain personally identifiable information (PII). The Community Layer can be used to tie Community ID numbers (CID) to jurisdiction, tribal, and special land use area boundaries.

    A geodatabase (GDB) link is included in the Full Data section below. The compressed file contains a collection of files that can store, query, and manage both spatial and nonspatial data using software that can read such a file. It contains all of the community layers, not just the layer that this dataset page describes.
    This layer can also be accessed from the FEMA ArcGIS viewer online: https://fema.maps.arcgis.com/home/item.html?id=8dcf28fc5b97404bbd9d1bc6d3c9b3cf

    Citation: FEMA's citation requirements for datasets (API usage or file downloads) can be found on the OpenFEMA Terms and Conditions page, Citing Data section: https://www.fema.gov/about/openfema/terms-conditions.

    For answers to Frequently Asked Questions (FAQs) about the OpenFEMA program, API, and publicly available datasets, please visit: https://www.fema.gov/about/openfema/faq.
    If you have media inquiries about this dataset, please email the FEMA News Desk at FEMA-News-Desk@fema.dhs.gov or call (202) 646-3272. For inquiries about FEMA's data and Open Government program, please email the OpenFEMA team at OpenFEMA@fema.dhs.gov.
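
The unincorporated-area calculation that flattening enables can be illustrated with a small geometric sketch. It assumes planar coordinates in arbitrary units and a community that lies entirely inside its county; both polygons are invented for illustration, and real boundaries would first need a suitable projected coordinate system.

```python
def shoelace_area(ring):
    """Unsigned area of a simple polygon given as a list of (x, y) vertices."""
    total = 0.0
    # Pair each vertex with the next one, wrapping around to the first.
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

# Hypothetical shapes: a 10 x 10 county containing a 2 x 2 community.
county = [(0, 0), (10, 0), (10, 10), (0, 10)]
community = [(2, 2), (4, 2), (4, 4), (2, 4)]

# In a flattened layer the county polygon already has the community punched
# out, so its area is the unincorporated remainder computed here.
unincorporated = shoelace_area(county) - shoelace_area(community)
print(unincorporated)
```

In an overlapping (unflattened) layer the same subtraction would have to be done explicitly for every contained community, which is the work the flattened layer saves.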

Cite
Vega Datasets (2025). Geospatial Data Pack for Visualization [Dataset]. https://www.kaggle.com/datasets/vega-datasets/geospatial-data-pack

Geospatial Data Pack for Visualization

Learn Geographic Mapping with Altair, Vega-Lite and Vega using Curated Datasets

Available download formats
zip (1422109 bytes)
Dataset updated
Oct 21, 2025
Dataset authored and provided by
Vega Datasets
Description

Geospatial Data Pack for Visualization 🗺️

Learn Geographic Mapping with Altair, Vega-Lite and Vega using Curated Datasets

Complete geographic and geophysical data collection for mapping and visualization. This consolidation includes 18 complementary datasets used by 31+ Vega, Vega-Lite, and Altair examples 📊. Perfect for learning geographic visualization techniques including projections, choropleths, point maps, vector fields, and interactive displays.

Source data lives on GitHub and can also be accessed via CDN. The vega-datasets project serves as a common repository for example datasets used across these visualization libraries and related projects.

Why Use This Dataset? 🤔

  • Comprehensive Geospatial Types: Explore a variety of core geospatial data models:
    • Vector Data: Includes points (like airports.csv), lines (like londonTubeLines.json), and polygons (like us-10m.json).
    • Raster-like Data: Work with gridded datasets (like windvectors.csv, annual-precip.json).
  • Diverse Formats: Gain experience with standard and efficient geospatial formats like GeoJSON (see Table 1, 2, 4), compressed TopoJSON (see Table 1), and plain CSV/TSV (see Table 2, 3, 4) for point data and attribute tables ready for joining.
  • Multi-Scale Coverage: Practice visualization across different geographic scales, from global and national (Table 1, 4) down to the city level (Table 1).
  • Rich Thematic Mapping: Includes multiple datasets (Table 3) specifically designed for joining attributes to geographic boundaries (like states or counties from Table 1) to create insightful choropleth maps.
  • Ready-to-Use & Example-Driven: Cleaned datasets tightly integrated with 31+ official examples (see Appendix) from Altair, Vega-Lite, and Vega, allowing you to immediately practice techniques like projections, point maps, network maps, and interactive displays.
  • Python Friendly: Works seamlessly with essential Python libraries like Altair (which can directly read TopoJSON/GeoJSON), Pandas, and GeoPandas, fitting perfectly into the Kaggle notebook environment.
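
The attribute-to-boundary join described in the list above can be expressed as a Vega-Lite specification, built here as a plain Python dictionary. This is a sketch under assumptions: the attribute file name `attributes.csv` and its `value` column are hypothetical, and the relative path to `us-10m.json` depends on where the pack is unpacked. The `counties` object and the `id` join key follow the US Map entry in Table 1.

```python
import json

def choropleth_spec(topo_url, feature, attr_url, key="id", value="value"):
    """Build a Vega-Lite spec that colours TopoJSON features by a joined value."""
    return {
        "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
        # Load one named object from the TopoJSON file as GeoJSON features.
        "data": {
            "url": topo_url,
            "format": {"type": "topojson", "feature": feature},
        },
        # Look up each feature's id in the attribute table.
        "transform": [{
            "lookup": "id",
            "from": {"data": {"url": attr_url}, "key": key, "fields": [value]},
        }],
        "projection": {"type": "albersUsa"},
        "mark": "geoshape",
        "encoding": {"color": {"field": value, "type": "quantitative"}},
    }

# "attributes.csv" and its "value" column are hypothetical placeholders.
spec = choropleth_spec("us-10m.json", "counties", "attributes.csv")
print(json.dumps(spec, indent=2))
```

The printed JSON can be pasted into the Vega-Lite editor once the two URLs point at real files.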

Dataset Inventory 🗂️

This pack includes 18 datasets covering base maps, reference points, statistical data for choropleths, and geophysical data.

1. BASE MAP BOUNDARIES (Topological Data)

Dataset | File | Size | Format | License | Description | Key Fields / Join Info
US Map (1:10m) | us-10m.json | 627 KB | TopoJSON | CC-BY-4.0 | US state and county boundaries. Contains states and counties objects. Ideal for choropleths. | id (FIPS code) property on geometries
World Map (1:110m) | world-110m.json | 117 KB | TopoJSON | CC-BY-4.0 | World country boundaries. Contains countries object. Suitable for world-scale viz. | id property on geometries
London Boroughs | londonBoroughs.json | 14 KB | TopoJSON | CC-BY-4.0 | London borough boundaries. | properties.BOROUGHN (name)
London Centroids | londonCentroids.json | 2 KB | GeoJSON | CC-BY-4.0 | Center points for London boroughs. | properties.id, properties.name
London Tube Lines | londonTubeLines.json | 78 KB | GeoJSON | CC-BY-4.0 | London Underground network lines. | properties.name, properties.color
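
TopoJSON files are compact partly because shared boundaries are stored once as arcs, and because a topology with a `transform` stores arc positions as delta-encoded integers. The sketch below decodes one such arc and lists the named objects. The tiny topology is invented for illustration and is not taken from any file in this pack.

```python
def decode_arc(arc, transform):
    """Turn one quantized, delta-encoded TopoJSON arc into coordinates."""
    (sx, sy), (tx, ty) = transform["scale"], transform["translate"]
    x = y = 0
    points = []
    for dx, dy in arc:
        # Each position is an offset from the previous one; the first is
        # an offset from the origin.
        x += dx
        y += dy
        points.append((x * sx + tx, y * sy + ty))
    return points

# Hypothetical miniature topology with one object and one arc.
topology = {
    "type": "Topology",
    "transform": {"scale": [0.5, 0.5], "translate": [-10.0, 20.0]},
    "objects": {"states": {"type": "GeometryCollection", "geometries": []}},
    "arcs": [[[0, 0], [4, 0], [0, 2]]],
}

print(sorted(topology["objects"]))  # names usable as the "feature" to load
points = decode_arc(topology["arcs"][0], topology["transform"])
print(points)
```

Libraries such as Altair and Vega-Lite perform this decoding internally, so the sketch is mainly useful for inspecting a file by hand.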

2. GEOGRAPHIC REFERENCE POINTS (Point Data) 📍

Dataset | File | Size | Format | License | Description | Key Fields / Join Info
US Airports | airports.csv | 205 KB | CSV | Public Domain | US airports with codes and coordinates. | iata, state, `l...