🤖 AI Summary
Problem: Traditional origin-destination (OD) commuting data are scarce because travel surveys are costly and privacy constraints are strict.
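Method: This paper proposes GlODGen, a novel end-to-end OD generation framework. It rests on the empirical finding, reported for the first time, that satellite imagery retains over 98% of the expressiveness of the multi-source urban sociodemographic, economic, land-use, and point-of-interest data traditionally used to model commuting patterns. GlODGen uses a vision-language geospatial foundation model to extract mobility-related urban semantic features from satellite imagery, combines them with population data into region-level representations, and generates OD flows with a graph diffusion model, so it can be applied to cities worldwide.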
Contribution/Results: Evaluated on six representative cities spanning Asia, Europe, North America, and Oceania, GlODGen generates OD flows highly consistent with ground-truth data (mean Pearson correlation >0.85). The framework introduces a survey-free paradigm for urban mobility sensing and releases a fully open-source toolchain, covering data preprocessing, model training, and inference, to support reproducible, privacy-preserving, and globally applicable mobility analysis.
📝 Abstract
Commuting origin-destination (OD) flows, which capture the daily mobility of citizens, are vital for sustainable development in cities around the world. However, such data are difficult to obtain because travel surveys are costly and raise privacy concerns. Surprisingly, we find that satellite imagery, which is publicly available across the globe, contains rich urban semantic signals that support high-quality OD flow generation, with over 98% of the expressiveness of traditional, hard-to-collect multi-source urban data (sociodemographic, economic, land use, and point of interest). This inspires us to design a novel data generator, GlODGen, which can generate OD flow data for any city of interest around the world. Specifically, GlODGen first leverages Vision-Language Geo-Foundation Models to extract urban semantic signals related to human mobility from satellite imagery. These features are then combined with population data to form region-level representations, which are used to generate OD flows via graph diffusion models. Extensive experiments on six representative cities across four continents show that GlODGen generalizes well across diverse urban environments and can generate OD flow data that are highly consistent with real-world mobility data. We implement GlODGen as an automated tool that integrates data acquisition and curation, urban semantic feature extraction, and OD flow generation. It has been released at https://github.com/tsinghua-fib-lab/generate-od-pubtools.
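To illustrate how the consistency between generated and real-world OD flows might be quantified, the sketch below computes a Pearson correlation over the entries of two OD matrices. It is a minimal sketch, not the authors' evaluation code: the function names, the toy matrices, and the choice to exclude intra-region (diagonal) flows are all assumptions made for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def od_flow_correlation(generated, observed):
    """Correlate two square OD matrices entry by entry.

    Assumption (hypothetical, for illustration only): intra-region flows on
    the diagonal are skipped, so only flows between distinct regions count.
    """
    n = len(observed)
    if len(generated) != n or any(len(row) != n for row in generated + observed):
        raise ValueError("both OD matrices must be square and the same size")
    gen_flat, obs_flat = [], []
    for i in range(n):
        for j in range(n):
            if i != j:
                gen_flat.append(generated[i][j])
                obs_flat.append(observed[i][j])
    return pearson(gen_flat, obs_flat)


# Toy data: entry [i][j] is the number of commuters from region i to region j.
observed = [[0, 10, 5], [8, 0, 2], [4, 3, 0]]
# A generated matrix that doubles every observed flow.
generated = [[2 * flow for flow in row] for row in observed]

r = od_flow_correlation(generated, observed)
print(f"Pearson correlation: {r:.3f}")
```

Because Pearson correlation is invariant to positive scaling, the doubled toy matrix correlates almost perfectly with the observed one. A real evaluation would therefore normally pair it with an error measure that is sensitive to flow magnitude.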