  1. Data from: Deep learning four decades of human migration: datasets

    • zenodo.org
    csv, nc
    Updated Oct 13, 2025
    Cite: Thomas Gaskin; Guy Abel (2025). Deep learning four decades of human migration: datasets [Dataset]. http://doi.org/10.5281/zenodo.17344747
    Available download formats: csv, nc
    Dataset updated: Oct 13, 2025
    Dataset provided by: Zenodo (http://zenodo.org/)
    Authors: Thomas Gaskin; Guy Abel
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This Zenodo repository contains all migration flow estimates associated with the paper "Deep learning four decades of human migration." Evaluation code, training data, trained neural networks, and smaller flow datasets are available in the main GitHub repository, which also provides detailed instructions on data sourcing. Due to file size limits, the larger datasets are archived here.

    Data is available in both NetCDF (.nc) and CSV (.csv) formats. The NetCDF format is more compact and pre-indexed, making it suitable for large files. In Python, datasets can be opened as xarray.Dataset objects, enabling coordinate-based data selection.
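
As a rough illustration of coordinate-based selection, the sketch below builds a tiny in-memory stand-in for flows.nc and selects from it by label. It assumes that the dimension names in the files match the names given in this description verbatim (including the space, as in "Origin ISO") and that the variables are called mean and std; the values and the three country codes are made up. With the real file, the dataset would be obtained with xr.open_dataset("flows.nc") instead.

```python
import numpy as np
import xarray as xr

# Hypothetical stand-in for flows.nc with the documented dimensions and the
# documented `mean` and `std` variables. All values here are synthetic.
# With the real file one would call: ds = xr.open_dataset("flows.nc")
years = np.arange(1990, 1993)
countries = ["DEU", "FRA", "USA"]  # illustrative ISO3 codes
dims = ("Year", "Origin ISO", "Destination ISO")
shape = (len(years), len(countries), len(countries))
rng = np.random.default_rng(0)

ds = xr.Dataset(
    data_vars={
        "mean": (dims, rng.uniform(0, 1e4, shape)),
        "std": (dims, rng.uniform(0, 1e2, shape)),
    },
    coords={"Year": years, "Origin ISO": countries, "Destination ISO": countries},
)

# Use ds["mean"], not ds.mean: the attribute form is the Dataset.mean() method.
# Dimension names contain spaces, so pass a dict to .sel() instead of kwargs.
deu_to_fra = ds["mean"].sel({"Origin ISO": "DEU", "Destination ISO": "FRA"})

# Selecting on every dimension yields a zero-dimensional DataArray.
single = ds["mean"].sel(
    {"Year": 1991, "Origin ISO": "DEU", "Destination ISO": "FRA"}
)
```

Here deu_to_fra is a one-dimensional time series over Year, and float(single) gives the estimate for one year and one country pair.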

    Each dataset uses the following coordinate conventions:

    • Year: 1990–2023
    • Birth ISO: Country of birth (UN ISO3)
    • Origin ISO: Country of origin (UN ISO3)
    • Destination ISO: Destination country (UN ISO3)
    • Country ISO: Used for net migration data (UN ISO3)

    The following data files are provided:

    • T.nc: Full table of flows disaggregated by country of birth. Dimensions: Year, Birth ISO, Origin ISO, Destination ISO
    • flows.nc: Total origin-destination flows (equivalent to T summed over Birth ISO). Dimensions: Year, Origin ISO, Destination ISO
    • net_migration.nc: Net migration data by country. Dimensions: Year, Country ISO
    • stocks.nc: Stock estimates for each country pair. Dimensions: Year, Origin ISO (corresponding to Birth ISO), Destination ISO
    • test_flows.nc: Flow estimates on a randomly selected set of test edges, used for model validation
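
The relationship between the files can be sketched on synthetic data. The example below reduces a small stand-in for the mean variable of T.nc to origin-destination flows by summing over Birth ISO, as stated above, and then derives a per-country net figure as inflows minus outflows. That last step is the usual definition of net migration, but it is an assumption that net_migration.nc was produced this way. Only mean values are aggregated here; standard deviations do not simply add.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for the `mean` variable of T.nc (values are made up).
iso = ["DEU", "FRA", "USA"]  # illustrative ISO3 codes
years = [1990, 1991]
rng = np.random.default_rng(1)
T = xr.DataArray(
    rng.uniform(0, 100, (len(years), len(iso), len(iso), len(iso))),
    dims=("Year", "Birth ISO", "Origin ISO", "Destination ISO"),
    coords={
        "Year": years,
        "Birth ISO": iso,
        "Origin ISO": iso,
        "Destination ISO": iso,
    },
    name="mean",
)

# Total origin-destination flows: sum the full table over country of birth.
flows = T.sum("Birth ISO")

# Assumed definition of net migration: total inflow minus total outflow.
inflow = flows.sum("Origin ISO").rename({"Destination ISO": "Country ISO"})
outflow = flows.sum("Destination ISO").rename({"Origin ISO": "Country ISO"})
net = inflow - outflow
```

Because every flow leaves one country and enters another, net sums to zero over Country ISO in each year, which makes a convenient sanity check.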

    Additionally, two CSV files are provided for convenience:

    • mig_unilateral.csv: Unilateral migration estimates per country, comprising:
      • imm: Total immigration flows
      • emi: Total emigration flows
      • net: Net migration
      • imm_pop: Total immigrant population (non-native-born)
      • emi_pop: Total emigrant population (living abroad)
    • mig_bilateral.csv: Bilateral flow data, comprising:
      • mig_prev: Total origin-destination flows
      • mig_brth: Total birth-destination flows, where Origin ISO reflects place of birth
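
A minimal sketch of working with the unilateral CSV in pandas follows. The columns imm, emi and net come from the list above; the key columns year and iso3, the flat layout, and all values are hypothetical, since the description does not specify them. The consistency check assumes that net is defined as immigration minus emigration.

```python
import io

import pandas as pd

# Hypothetical excerpt in the spirit of mig_unilateral.csv. The `year` and
# `iso3` column names and every value are assumptions for illustration.
# With the real file one would call: df = pd.read_csv("mig_unilateral.csv")
csv_text = """year,iso3,imm,emi,net
1990,DEU,1200.0,800.0,400.0
1990,FRA,500.0,650.0,-150.0
1991,DEU,1100.0,900.0,200.0
"""
df = pd.read_csv(io.StringIO(csv_text))

# Check that net equals imm - emi (assumed definition of net migration).
df["net_check"] = df["imm"] - df["emi"]
consistent = bool(((df["net"] - df["net_check"]).abs() < 1e-6).all())

# Time series for a single country, indexed by year.
deu = df[df["iso3"] == "DEU"].set_index("year")
```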

    Each dataset includes a mean variable (mean estimate) and a std variable (standard deviation of the estimate).
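
One way to use the two variables together is to form an approximate uncertainty band around each estimate. The sketch below uses mean ± 2 std, clipped at zero; the factor of 2 and the implied roughly Gaussian error are assumptions made for illustration, not something the description states. The numbers are synthetic.

```python
import numpy as np
import xarray as xr

# Synthetic series with the documented `mean` and `std` variables.
ds = xr.Dataset(
    {
        "mean": ("Year", np.array([1000.0, 1500.0, 50.0])),
        "std": ("Year", np.array([100.0, 300.0, 40.0])),
    },
    coords={"Year": [1990, 1991, 1992]},
)

k = 2.0  # band half-width in standard deviations (an arbitrary choice)

# Clip at zero so the lower bound stays non-negative; this suits flows and
# stocks but would not suit signed quantities such as net migration.
lower = (ds["mean"] - k * ds["std"]).clip(min=0.0)
upper = ds["mean"] + k * ds["std"]

# Relative uncertainty, useful for flagging poorly constrained estimates.
rel_unc = ds["std"] / ds["mean"]
```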

    An ISO3 conversion table is also provided.
