Downscaling human mobility data based on demographic socioeconomic and commuting characteristics using interpretable machine learning methods

📅 2025-09-25
🤖 AI Summary
This study addresses the challenge of coarse spatial resolution in urban human mobility data, which impedes the identification of fine-grained travel patterns. We propose an interpretable machine learning–based downscaling method for urban trip flows, using New York City taxi origin-destination (OD) data. Integrating demographic, socioeconomic, and commuting features, we develop four models—linear regression, random forest, support vector machine (SVM), and neural network—to accurately disaggregate flow estimates from coarse administrative units to fine-grained traffic analysis zones (TAZs). A novel perturbation sensitivity analysis is introduced to quantify the marginal contributions of heterogeneous urban features within nonlinear models, substantially enhancing model transparency and mechanistic interpretability. Results show that the neural network achieves the highest fitting accuracy, while SVM exhibits superior generalization performance. Multi-source urban features jointly enhance downscaling efficacy, demonstrating the scientific validity and practical promise of interpretable ML in modeling complex urban systems.

📝 Abstract
Understanding urban human mobility patterns at multiple spatial scales is essential for social science. This study presents a machine learning framework to downscale origin-destination (OD) taxi trip flows in New York City from a larger spatial unit to a smaller one. First, relationships between OD trips and demographic, socioeconomic, and commuting characteristics are modeled using four methods: Linear Regression (LR), Random Forest (RF), Support Vector Machine (SVM), and Neural Network (NN). Second, a perturbation-based sensitivity analysis is applied to interpret variable importance in the nonlinear models. The results show that the linear regression model fails to capture the complex variable interactions. Although NN performs best on the training and testing datasets, SVM shows the best generalization ability in downscaling performance. The methodology presented in this study offers both an analytical advance and practical applications for improving transportation services and urban development.
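The abstract does not spell out how the perturbation-based sensitivity analysis is computed. One common reading is sketched below, under stated assumptions: each feature is scaled by a small relative step while the others are held fixed, and the mean absolute change in the model's prediction is taken as that feature's sensitivity. The names `perturbation_sensitivity`, `rel_step`, and `toy_model` are hypothetical, and the toy model merely stands in for a trained RF, SVM, or NN.

```python
from typing import Callable, List, Sequence

Row = Sequence[float]
Predictor = Callable[[List[List[float]]], List[float]]


def perturbation_sensitivity(
    predict: Predictor, rows: Sequence[Row], rel_step: float = 0.1
) -> List[float]:
    """Return one score per feature: the mean absolute change in the
    prediction when that feature alone is scaled by (1 + rel_step)."""
    baseline = predict([list(r) for r in rows])
    n_features = len(rows[0])
    scores = []
    for j in range(n_features):
        # Perturb only feature j; every other feature keeps its value.
        perturbed = [
            [v * (1.0 + rel_step) if k == j else v for k, v in enumerate(r)]
            for r in rows
        ]
        shifted = predict(perturbed)
        diffs = [abs(s - b) for s, b in zip(shifted, baseline)]
        scores.append(sum(diffs) / len(diffs))
    return scores


def toy_model(rows: List[List[float]]) -> List[float]:
    # Hypothetical stand-in for a fitted nonlinear model: the third
    # feature has no effect on the output at all.
    return [2.0 * r[0] + r[1] ** 2 + 0.0 * r[2] for r in rows]


rows = [[1.0, 2.0, 5.0], [3.0, 1.0, 7.0]]
scores = perturbation_sensitivity(toy_model, rows, rel_step=0.1)
```

Because the function only needs a `predict` callable, the same loop could be pointed at any of the four models; the relative step size and the use of a mean absolute difference are illustrative choices, not details taken from the paper.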
Problem

Research questions and friction points this paper is trying to address.

Downscaling taxi trip flows from larger to smaller spatial units
Analyzing mobility patterns using demographic and socioeconomic factors
Comparing interpretable machine learning models for urban mobility prediction
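To make the downscaling task concrete, the sketch below shows one simple way model outputs for small zones could be turned into flows that stay consistent with a known coarse-unit total. This is an illustration under assumptions, not the paper's procedure: the source does not say whether fine-scale predictions are constrained to sum to the coarse total, and `allocate_to_fine_zones`, the zone names, and the numbers are all hypothetical.

```python
from typing import Dict


def allocate_to_fine_zones(
    coarse_total: float, fine_scores: Dict[str, float]
) -> Dict[str, float]:
    """Split a coarse-unit trip count across its fine zones in proportion
    to each zone's (non-negative) model score."""
    # Negative raw predictions are clipped to zero before sharing out trips.
    clipped = {zone: max(score, 0.0) for zone, score in fine_scores.items()}
    total = sum(clipped.values())
    if total == 0.0:
        # No usable signal: fall back to an equal split.
        share = coarse_total / len(clipped)
        return {zone: share for zone in clipped}
    return {zone: coarse_total * s / total for zone, s in clipped.items()}


# Hypothetical raw model outputs for three fine zones inside one coarse unit.
fine_scores = {"zone_a": 30.0, "zone_b": 10.0, "zone_c": -2.0}
flows = allocate_to_fine_zones(120.0, fine_scores)
```

The proportional rescaling keeps the sum of fine-zone flows equal to the observed coarse total, which is one common design choice when disaggregating counts; other constraints are equally plausible.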
Innovation

Methods, ideas, or system contributions that make the work stand out.

Downscaling mobility data using interpretable machine learning methods
Applying four models including neural networks and SVM
Using perturbation analysis to interpret variable importance
Yuqin Jiang
Assistant Professor, University of Hawaii at Manoa
Geographic Information Science, High Performance Computing, CyberGIS
Andrey A. Popov
Assistant Professor, Information and Computer Sciences, University of Hawaiʻi at Mānoa
Data Assimilation, Machine Learning, Reduced Order Modeling, Data Fusion, Digital Twins
Tianle Duan
School of Construction Management Technology, Purdue University, West Lafayette, Indiana, USA
Qingchun Li
School of Construction Management Technology, Purdue University, West Lafayette, Indiana, USA