Downscaling human mobility data based on demographic socioeconomic and commuting characteristics using interpretable machine learning methods

📅 2025-09-25

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses the coarse spatial resolution of urban human mobility data, which makes fine-grained travel patterns hard to identify. It proposes an interpretable machine learning method for downscaling urban trip flows, using New York City taxi origin-destination (OD) data. Four models, namely linear regression, random forest, support vector machine (SVM), and neural network, relate OD trips to demographic, socioeconomic, and commuting features and are used to disaggregate flow estimates from coarse administrative units to fine-grained traffic analysis zones (TAZs). A perturbation-based sensitivity analysis quantifies how much each urban feature contributes to the predictions of the nonlinear models, which makes those models easier to interpret. The neural network achieves the highest accuracy on the training and testing datasets, while SVM generalizes best in the downscaling task. Combining the demographic, socioeconomic, and commuting features improves downscaling performance.

📝 Abstract

Understanding urban human mobility patterns at multiple spatial levels is essential for social science. This study presents a machine learning framework to downscale origin-destination (OD) taxi trip flows in New York City from a larger spatial unit to a smaller one. First, the relationships between OD trips and demographic, socioeconomic, and commuting characteristics are modeled using four models: Linear Regression (LR), Random Forest (RF), Support Vector Machine (SVM), and Neural Network (NN). Second, a perturbation-based sensitivity analysis is applied to interpret variable importance in the nonlinear models. The results show that the linear regression model fails to capture the complex variable interactions. While NN performs best on the training and testing datasets, SVM shows the best generalization ability in downscaling. The methodology provides both an analytical advance and practical applications for improving transportation services and urban development.
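The downscaling idea in the abstract (fit a model on OD pairs between larger spatial units, then apply it to OD pairs between smaller units) can be sketched as follows. This is a minimal illustration and not the paper's implementation: the data are synthetic, the two features (origin population and destination jobs, in thousands) are hypothetical, and a plain least-squares model stands in for any of the four models. The helper names `fit_ols` and `predict` are made up for this sketch.

```python
from typing import List, Sequence


def fit_ols(X: Sequence[Sequence[float]], y: Sequence[float]) -> List[float]:
    """Least squares with an intercept, solved through the normal equations."""
    rows = [[1.0, *x] for x in X]  # prepend the intercept column
    k = len(rows[0])
    # Augmented matrix [A^T A | A^T y]
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


def predict(coef: Sequence[float], X: Sequence[Sequence[float]]) -> List[float]:
    """Apply the fitted coefficients to new feature rows."""
    return [coef[0] + sum(c * v for c, v in zip(coef[1:], x)) for x in X]


# Synthetic OD pairs between LARGE units: [origin population, destination jobs]
coarse_X = [[10.0, 4.0], [20.0, 6.0], [15.0, 10.0], [30.0, 2.0], [25.0, 8.0]]
coarse_trips = [34.0, 65.0, 52.0, 93.0, 81.0]  # synthetic observed trip counts

# Synthetic OD pairs between SMALL units, where trip counts are unknown
fine_X = [[2.0, 1.0], [5.0, 3.0]]

coef = fit_ols(coarse_X, coarse_trips)   # learn at the coarse level
fine_trips = predict(coef, fine_X)       # estimate flows at the fine level
print(fine_trips)
```

The sketch only shows the transfer step. How well a model trained at the coarse level carries over to the fine level is exactly the generalization question the abstract raises when it compares NN and SVM.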

Problem

Research questions and friction points this paper is trying to address.

Downscaling taxi trip flows from larger to smaller spatial units

Analyzing mobility patterns using demographic and socioeconomic factors

Comparing interpretable machine learning models for urban mobility prediction

Innovation

Methods, ideas, or system contributions that make the work stand out.

Downscaling mobility data using interpretable machine learning methods

Applying four models including neural networks and SVM

Using perturbation analysis to interpret variable importance
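One common form of perturbation-based sensitivity analysis is sketched below: shift one input feature at a time by a fixed relative amount and record how much the model's predictions move on average. The page does not describe the paper's exact procedure, so the 10% step, the mean-absolute-change score, the function name `perturbation_sensitivity`, and the toy model are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence

Rows = Sequence[Sequence[float]]


def perturbation_sensitivity(
    predict: Callable[[Rows], List[float]],
    X: Rows,
    rel_step: float = 0.10,  # assumed perturbation size: +10% of each value
) -> List[float]:
    """Mean absolute change in predictions when each feature is perturbed alone."""
    baseline = predict(X)
    scores = []
    for j in range(len(X[0])):
        # Copy the data, scaling only feature j
        perturbed = [
            [v * (1.0 + rel_step) if i == j else v for i, v in enumerate(row)]
            for row in X
        ]
        shifted = predict(perturbed)
        change = sum(abs(a - b) for a, b in zip(shifted, baseline))
        scores.append(change / len(baseline))
    return scores


def toy_model(X: Rows) -> List[float]:
    """Hypothetical nonlinear stand-in for a trained model; ignores feature 2."""
    return [3.0 * row[0] + 0.5 * row[1] ** 2 for row in X]


X = [[1.0, 2.0, 5.0], [2.0, 1.0, 7.0], [3.0, 4.0, 9.0]]
scores = perturbation_sensitivity(toy_model, X)
ranking = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
print(scores, ranking)
```

Because the function only needs a `predict` callable, the same loop applies to a random forest, SVM, or neural network without access to model internals, which is the main appeal of perturbation-based approaches for nonlinear models.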

🔎 Similar Papers

Human Mobility Modeling with Limited Information via Large Language Models