Deep learning 40 years of human migration

📅 2025-06-28

🤖 AI Summary

Problem: Existing global migration datasets lack annual resolution, country-of-birth granularity, and temporal coverage spanning recent decades, hindering fine-grained analysis of migration dynamics. Method: We propose the first deep recurrent neural network framework tailored for long-term global migration modeling, integrating 18 geospatial, socioeconomic, and demographic covariates. The architecture incorporates uncertainty propagation and ensemble inference to yield interpretable prediction intervals. Contribution/Results: We construct a globally comprehensive, annually resolved migration flow and stock dataset covering 230 countries and territories from 1990 to 2023, the first of its kind with birth-country disaggregation. On held-out data, our model significantly outperforms conventional five-year interval estimates in both accuracy and timeliness. All data, source code, and trained model weights are fully open-sourced, establishing a reproducible, extensible foundational resource for migration research.

📝 Abstract

We present a novel and detailed dataset on annual origin-destination migration flows and stocks between 230 countries and regions, spanning the period from 1990 to the present. Our flow estimates are further disaggregated by country of birth, providing a comprehensive picture of migration over roughly the last 35 years. The estimates are obtained by training a deep recurrent neural network to learn flow patterns from 18 covariates for all countries, including geographic, economic, cultural, societal, and political information. The recurrent architecture of the neural network means that the entire past can influence current migration patterns, allowing us to learn long-range temporal correlations. By training an ensemble of neural networks and additionally pushing uncertainty on the covariates through the trained network, we obtain confidence bounds for all our estimates, allowing researchers to pinpoint the geographic regions most in need of additional data collection. We validate our approach on various test sets of unseen data, demonstrating that it significantly outperforms traditional methods for estimating five-year flows while substantially increasing temporal resolution. The model is fully open source: all training data, neural network weights, and training code are made public alongside the migration estimates, providing a valuable resource for future studies of human migration.
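The abstract's point that a recurrent architecture lets the entire past influence the current estimate can be illustrated with a minimal sketch of a single Elman-style recurrent cell in plain Python. This is not the authors' architecture: the weights, the hidden size, and the function names `rnn_step` and `run_sequence` are hypothetical, and the two toy covariates stand in for the 18 used in the paper.

```python
import math

def rnn_step(x, h, W_x, W_h, b):
    """One Elman-style update: h_new = tanh(W_x @ x + W_h @ h + b)."""
    h_new = []
    for i in range(len(h)):
        z = b[i]
        z += sum(W_x[i][j] * x[j] for j in range(len(x)))
        z += sum(W_h[i][k] * h[k] for k in range(len(h)))
        h_new.append(math.tanh(z))
    return h_new

def run_sequence(xs, W_x, W_h, b):
    """Feed yearly covariate vectors in order; return the hidden state per year."""
    h = [0.0] * len(b)  # start from an empty memory
    states = []
    for x in xs:
        h = rnn_step(x, h, W_x, W_h, b)
        states.append(h)
    return states

# Toy, made-up weights: 2 covariates, 2 hidden units (assumption for illustration).
W_x = [[0.5, -0.3], [0.2, 0.4]]
W_h = [[0.6, 0.1], [-0.2, 0.7]]
b = [0.0, 0.1]

# Two three-year covariate histories that differ only in the first year.
baseline = [[1.0, 0.5], [1.0, 0.5], [1.0, 0.5]]
shocked = [[3.0, 0.5], [1.0, 0.5], [1.0, 0.5]]

h_base = run_sequence(baseline, W_x, W_h, b)[-1]
h_shock = run_sequence(shocked, W_x, W_h, b)[-1]
```

Although the last two years of input are identical, `h_base` and `h_shock` differ, because the hidden state carries the first-year difference forward. A model that reads its output from this state can therefore respond to events several years in the past.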

Problem

Research questions and friction points this paper is trying to address.

Estimating annual global migration flows and stocks since 1990

Disaggregating migration data by origin, destination, and birth country

Providing confidence bounds for migration estimates using deep learning

Innovation

Methods, ideas, or system contributions that make the work stand out.

Deep recurrent neural network for migration patterns

Ensemble of networks with covariate uncertainty propagated into confidence bounds

Open-source training data and model weights
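The ensemble idea listed above can be sketched as a simple Monte Carlo procedure: draw noisy copies of an uncertain covariate, push each draw through every ensemble member, and read bounds off the pooled predictions. This is a minimal illustration under stated assumptions, not the paper's procedure. The ensemble members here are toy linear functions rather than trained neural networks, the Gaussian noise model and the 2.5% to 97.5% percentile interval are assumptions, and `make_member` and `prediction_interval` are hypothetical names.

```python
import random
import statistics

def make_member(slope, intercept):
    """Return a toy stand-in for one trained network: covariate -> flow estimate."""
    def predict(x):
        return max(0.0, slope * x + intercept)  # flows cannot be negative
    return predict

def prediction_interval(members, x_mean, x_sd, n_samples=500, seed=0):
    """Pool predictions over ensemble members and noisy covariate draws.

    Returns (lower, median, upper), where lower and upper are the empirical
    2.5% and 97.5% percentiles of the pooled predictions.
    """
    rng = random.Random(seed)
    preds = []
    for member in members:
        for _ in range(n_samples):
            x = rng.gauss(x_mean, x_sd)  # sample the uncertain covariate
            preds.append(member(x))
    cuts = statistics.quantiles(preds, n=40)  # 39 cut points in 2.5% steps
    return cuts[0], statistics.median(preds), cuts[-1]

# Three made-up ensemble members that disagree slightly with one another.
members = [make_member(s, i) for s, i in [(2.0, 1.0), (2.2, 0.8), (1.9, 1.1)]]

narrow = prediction_interval(members, x_mean=10.0, x_sd=0.1)  # well-measured covariate
wide = prediction_interval(members, x_mean=10.0, x_sd=2.0)    # poorly measured covariate
```

With a well-measured covariate the interval mostly reflects disagreement between ensemble members; with a poorly measured one it widens. In this toy setup, that widening is what would flag a region as needing additional data collection.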
