🤖 AI Summary
This work addresses key challenges in query suggestion for local-life service platforms: the difficulty of satisfying long-tail user demand, the lack of geographic awareness in large language models (LLMs), exposure bias, and online inference latency. To tackle these issues, we propose LocalSUG, a geographically aware LLM-based query suggestion framework that strengthens geographic grounding through city-aware term co-occurrence mining. We design a beam-search-driven GRPO algorithm that aligns training with inference-time decoding, thereby mitigating exposure bias, and introduce quality-aware beam acceleration and vocabulary pruning to reduce online latency while preserving generation quality. A multi-objective reward mechanism jointly optimizes relevance and business metrics. Extensive offline evaluations and large-scale online A/B tests demonstrate the effectiveness and deployability of our approach, yielding a 0.35% increase in click-through rate and a 2.56% reduction in the low/no-result rate.
📝 Abstract
On local-life service platforms, the query suggestion module plays a crucial role in user experience: it generates candidate queries from the prefix a user has typed, reducing user effort and accelerating search. Traditional multi-stage cascading systems rely heavily on historical top queries, which limits their ability to address long-tail demand. While LLMs offer strong semantic generalization, deploying them in local-life services introduces three key challenges: lack of geographic grounding, exposure bias in preference optimization, and online inference latency. To address these issues, we propose LocalSUG, an LLM-based query suggestion framework tailored for local-life service platforms. First, we introduce a city-aware candidate mining strategy based on term co-occurrence to inject geographic grounding into generation. Second, we propose a beam-search-driven GRPO algorithm that aligns training with inference-time decoding, reducing exposure bias in autoregressive generation. A multi-objective reward mechanism further optimizes both relevance and business-oriented metrics. Finally, we develop quality-aware beam acceleration and vocabulary pruning techniques that significantly reduce online latency while preserving generation quality. Extensive offline evaluations and large-scale online A/B testing demonstrate that LocalSUG improves click-through rate (CTR) by 0.35% and reduces the low/no-result rate by 2.56%, validating its effectiveness in real-world deployment.
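To make the beam-search-driven GRPO idea more concrete, the following is a minimal sketch, not the authors' implementation. It assumes that the candidates returned by beam search for one prefix form the GRPO group, that each candidate carries a relevance score and a business score, and that the two are mixed with fixed weights. The function names, the weights `w_rel` and `w_biz`, and the example queries and scores are all hypothetical. The only part taken from GRPO itself is the group-relative advantage: each reward is standardized against the mean and standard deviation of its own group.

```python
from statistics import mean, pstdev


def combined_reward(relevance, business, w_rel=0.7, w_biz=0.3):
    """Weighted multi-objective reward (weights are illustrative assumptions)."""
    return w_rel * relevance + w_biz * business


def group_relative_advantages(rewards, eps=1e-6):
    """Standardize rewards within one group, as in GRPO.

    Here the group is assumed to be the set of beam-search candidates
    generated for a single prefix, so training sees the same kind of
    candidates that inference-time decoding would produce.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std over the group
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical beam candidates for one prefix: (query, relevance, business score).
beam = [
    ("hotpot delivery", 0.9, 0.6),
    ("hotpot buffet", 0.7, 0.8),
    ("hotpot recipe", 0.2, 0.1),
]

rewards = [combined_reward(rel, biz) for _, rel, biz in beam]
advantages = group_relative_advantages(rewards)

for (query, _, _), adv in zip(beam, advantages):
    print(f"{query}: advantage = {adv:+.3f}")
```

Candidates with above-average reward in the beam receive positive advantages and the rest receive negative ones, so the policy update pushes probability mass toward the better beam outputs for that prefix. If every candidate in a group receives the same reward, all advantages are zero and the group contributes no gradient signal.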