AI Summary
This work addresses key limitations of traditional multi-stage route recommendation systems: misalignment between training objectives and online metrics, reliance on rule-based deduplication, and suboptimal performance caused by decoupling fine-ranking from re-ranking. To overcome these issues, we propose SCASRec, the first framework that formulates route recommendation as a generative sequence modeling task, enabling end-to-end joint optimization of ranking and deduplication. SCASRec introduces a Stepwise Correction Reward (SCR) mechanism that focuses on hard negative samples and a learnable End-of-Recommendation (EOR) token to support self-correction and dynamic termination. Evaluated on two large-scale public datasets, SCASRec achieves state-of-the-art performance both offline and online, and has been fully deployed in a real-world navigation application, significantly enhancing user experience.
Abstract
Route recommendation systems commonly adopt a multi-stage pipeline involving fine-ranking and re-ranking to produce high-quality ordered recommendations. However, this paradigm faces three critical limitations. First, offline training objectives are misaligned with online metrics: offline gains do not necessarily translate into online improvements, so actual performance must be validated through A/B testing, which may compromise the user experience. Second, redundancy elimination relies on rigid, handcrafted rules that cannot adapt to the high variance in user intent and the unstructured complexity of real-world scenarios. Third, the strict separation between the fine-ranking and re-ranking stages leads to sub-optimal performance. Because each module is optimized in isolation, the fine-ranking stage remains oblivious to the list-level objectives (e.g., diversity) targeted by the re-ranker, which prevents the system from reaching a jointly optimized solution. To overcome these intertwined challenges, we propose SCASRec (Self-Correcting and Auto-Stopping Recommendation), a unified generative framework that integrates ranking and redundancy elimination into a single end-to-end process. SCASRec introduces a stepwise corrective reward (SCR) to guide list-wise refinement by focusing on hard samples, and employs a learnable End-of-Recommendation (EOR) token to terminate generation adaptively when no further improvement is expected. Experiments on two large-scale, open-sourced route recommendation datasets show that SCASRec achieves state-of-the-art performance in both offline and online settings. SCASRec has been fully deployed in a real-world navigation app, demonstrating its effectiveness.
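The auto-stopping idea can be illustrated with a minimal sketch. This is not the authors' implementation: the names `generate_route_list` and `make_toy_scorer`, the `EOR` sentinel value, and the hand-written scoring rule (route quality minus similarity to routes already chosen) are all assumptions standing in for the learned generative model, and SCR-based training is not shown. The sketch only shows how treating EOR as one more option at each step lets a single loop both suppress near-duplicate routes and decide when to stop.

```python
from typing import Callable, List, Sequence

# Hypothetical sentinel standing in for the learnable End-of-Recommendation token.
EOR = -1


def generate_route_list(
    num_candidates: int,
    score_fn: Callable[[List[int], int], float],
    max_len: int,
) -> List[int]:
    """Greedily build a route list, stopping once EOR outscores every route."""
    selected: List[int] = []
    remaining = set(range(num_candidates))
    while remaining and len(selected) < max_len:
        # EOR competes with the remaining routes, conditioned on the list so far.
        options = sorted(remaining) + [EOR]
        best = max(options, key=lambda c: score_fn(selected, c))
        if best == EOR:
            break  # no remaining route is expected to improve the list
        selected.append(best)
        remaining.remove(best)
    return selected


def make_toy_scorer(
    quality: Sequence[float],
    similarity: Sequence[Sequence[float]],
    eor_score: float,
) -> Callable[[List[int], int], float]:
    """Toy stand-in for a learned model: quality minus redundancy penalty."""
    def score(selected: List[int], candidate: int) -> float:
        if candidate == EOR:
            return eor_score
        # Penalize a candidate by its closest match among already chosen routes.
        redundancy = max((similarity[candidate][s] for s in selected), default=0.0)
        return quality[candidate] - redundancy
    return score


# Routes 0 and 1 are near-duplicates; route 3 is low quality.
quality = [0.9, 0.85, 0.6, 0.2]
similarity = [
    [1.0, 0.8, 0.1, 0.1],
    [0.8, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.1],
    [0.1, 0.1, 0.1, 1.0],
]
scorer = make_toy_scorer(quality, similarity, eor_score=0.3)
result = generate_route_list(4, scorer, max_len=4)
print(result)  # [0, 2]
```

In this toy setting, route 1 is skipped because it nearly duplicates route 0, and generation ends after two routes because the remaining candidates score below the EOR option. In the framework described above, both behaviors would come from a trained model rather than from fixed numbers like these.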