🤖 AI Summary
Traditional uniform road segmentation in bus arrival time prediction overlooks physical constraints such as road conditions, intersections, and points of interest, which limits prediction accuracy and efficiency. To address this, the authors propose a reinforcement learning-based non-uniform road segmentation method. The approach formulates segmentation as a sequential decision-making problem in which a policy network identifies the road segments most influential for prediction, enabling adaptive, context-aware partitioning. Crucially, it decouples segmentation from prediction, so that a lightweight linear model can perform efficient inference. Extensive experiments on large-scale benchmarks show that the segmentation strategy improves prediction performance; notably, a linear model combined with non-uniform segmentation can outperform more complex methods. The code and dataset are publicly available.
📝 Abstract
In bus arrival time prediction, segmentation is the process of organizing road network data into homogeneous entities. It is widely regarded as the first and most critical step in building an arrival time prediction system, particularly for autoregressive approaches. Traditional methods typically segment roads uniformly, which ignores physical constraints that vary along a route, such as road conditions, intersections, and points of interest, and thereby limits prediction efficiency. In this paper, we propose a Reinforcement Learning (RL)-based approach that efficiently and adaptively learns non-uniform road segments for arrival time prediction. Our method decouples prediction into two stages: 1) the proposed RL framework extracts non-uniform road segments according to their impact scores; and 2) a linear prediction model is applied to the selected segments. This design selects segments by their impact on prediction while keeping computation low, and it improves substantially on traditional uniform approaches. Extensive experiments on large-scale benchmarks show that the proposed method improves both efficiency and prediction performance, and that the linear approach can even outperform more complex methods. The dataset and the code are publicly accessible at: https://github.com/pangjunbiao/Less-is-More.
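The two-stage decoupling described above can be illustrated with a minimal sketch. It is not the authors' implementation: the RL framework is not reproduced here, and the impact scores are replaced by a simple stand-in (absolute correlation between each segment's travel time and the target). The function names `select_segments`, `fit_linear`, and `predict_linear`, the synthetic data, and the choice of `k` are all hypothetical and serve only to show how a linear model can be fitted on a selected subset of segments.

```python
import numpy as np


def select_segments(impact_scores, k):
    """Stage 1 (stand-in): keep the k highest-scoring segments, in road order.

    In the paper the scores come from an RL framework; here they are
    supplied by the caller and treated as given.
    """
    top = np.argsort(impact_scores)[-k:]
    return np.sort(top)


def fit_linear(X, y):
    """Stage 2: ordinary least squares with a bias term."""
    A = np.hstack([X, np.ones((X.shape[0], 1))])
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return w


def predict_linear(w, X):
    """Apply the fitted linear model (weights followed by bias)."""
    A = np.hstack([X, np.ones((X.shape[0], 1))])
    return A @ w


# Synthetic example (assumed data): 2000 trips over 12 fine-grained segments,
# where only three segments actually drive the arrival time.
rng = np.random.default_rng(0)
n_trips, n_segments = 2000, 12
X = rng.uniform(20.0, 120.0, size=(n_trips, n_segments))
true_w = np.zeros(n_segments)
true_w[[2, 5, 9]] = [1.5, 1.2, 2.0]
y = X @ true_w + 30.0

# Stand-in impact score: absolute correlation with the target (NOT the RL score).
impact_scores = np.array(
    [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(n_segments)]
)

selected = select_segments(impact_scores, k=3)
w = fit_linear(X[:, selected], y)
pred = predict_linear(w, X[:, selected])
```

Because selection and prediction are separate steps, the scoring rule can be swapped (for example, for a learned policy) without touching the linear model, which is the property the abstract emphasizes.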