MixTTE: Multi-Level Mixture-of-Experts for Scalable and Adaptive Travel Time Estimation

📅 2026-01-06

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the limitations of existing travel time estimation methods in effectively modeling city-scale traffic dynamics and long-tail scenarios, which often result in suboptimal prediction performance across large-scale road networks. To overcome these challenges, we propose a scalable and adaptive travel time estimation framework that jointly captures local segment-level dependencies and global route-level dynamics through a hierarchical mixture-of-experts architecture. Key innovations include a spatiotemporal external attention mechanism to model cross-regional temporal correlations, a graph-structured stable Mixture-of-Experts network, and an asynchronous incremental learning strategy for efficient online updates. Evaluated on real-world large-scale datasets, our approach significantly outperforms seven strong baselines and has been deployed on the DiDi platform, substantially improving both estimation accuracy and system stability.

📝 Abstract

Accurate Travel Time Estimation (TTE) is critical for ride-hailing platforms, where errors directly impact user experience and operational efficiency. While existing production systems excel at holistic route-level dependency modeling, they struggle to capture city-scale traffic dynamics and long-tail scenarios, leading to unreliable predictions in large urban networks. In this paper, we propose MixTTE, a scalable and adaptive framework that synergistically integrates link-level modeling with industrial route-level TTE systems. Specifically, we propose a spatio-temporal external attention module to efficiently capture global dependencies in traffic dynamics across million-scale road networks. Moreover, we construct a stabilized graph mixture-of-experts network to handle heterogeneous traffic patterns while maintaining inference efficiency. Furthermore, an asynchronous incremental learning strategy is tailored to enable real-time and stable adaptation to dynamic traffic distribution shifts. Experiments on real-world datasets show that MixTTE significantly reduces prediction errors compared to seven baselines. MixTTE has been deployed in DiDi, substantially improving the accuracy and stability of the TTE service.
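To make the mixture-of-experts idea concrete, the following is a minimal, generic sketch of sparse top-k expert routing in plain Python. It is not the paper's stabilized graph mixture-of-experts network: the abstract does not specify the gate, the experts, or the stabilization, so the linear gate, the two toy experts, and the names `top_k_route` and `moe_predict` are all assumptions made for illustration.

```python
import math
from typing import Callable, List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a short list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_route(gate_logits: Sequence[float], k: int) -> List[Tuple[int, float]]:
    """Pick the k experts with the largest gate logits.

    Returns (expert_index, weight) pairs; the weights are a softmax over
    the selected logits only, so they sum to 1.
    """
    order = sorted(range(len(gate_logits)), key=lambda i: gate_logits[i], reverse=True)
    chosen = order[:k]
    weights = softmax([gate_logits[i] for i in chosen])
    return list(zip(chosen, weights))


def moe_predict(
    features: Sequence[float],
    gate_weights: Sequence[Sequence[float]],
    experts: Sequence[Callable[[Sequence[float]], float]],
    k: int = 2,
) -> float:
    """Combine the outputs of the k routed experts for one input.

    gate_weights holds one row per expert (a hypothetical linear gate).
    Only the routed experts are evaluated, which is what keeps inference
    cost low when the number of experts grows.
    """
    logits = [sum(w * x for w, x in zip(row, features)) for row in gate_weights]
    routed = top_k_route(logits, k)
    return sum(weight * experts[idx](features) for idx, weight in routed)


# Hypothetical link features: [length in metres, observed speed in m/s].
toy_experts = [
    lambda f: f[0] / f[1],          # "free-flow" expert: length / speed
    lambda f: f[0] / (0.5 * f[1]),  # "congested" expert: assumes half the speed
]
toy_gate = [[0.0, 1.0], [0.0, -1.0]]  # favours expert 0 at higher speeds

estimate_seconds = moe_predict([1000.0, 10.0], toy_gate, toy_experts, k=1)
```

With `k=1` the gate sends this link entirely to the first expert. With larger `k` the estimate becomes a weighted blend of several experts, which is one common way to cover heterogeneous patterns with specialized sub-models.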

Problem

Research questions and friction points this paper is trying to address.

Travel Time Estimation

traffic dynamics

long-tail scenarios

large urban networks

prediction reliability

Innovation

Methods, ideas, or system contributions that make the work stand out.

Mixture-of-Experts

Travel Time Estimation

Spatio-temporal Attention