Efficient Generative Retrieval for E-commerce Search with Semantic Cluster IDs and Expert-Guided RL

📅 2026-05-14
📈 Citations: 0
✨ Influential: 0

🤖 AI Summary
This work addresses three obstacles to applying generative retrieval in industrial e-commerce search: a vast and dynamic product catalog, stringent latency constraints, and misalignment with downstream ranking objectives. The authors propose CQ-SID (Category-and-Query constrained Semantic ID), a framework that treats generative retrieval as a complementary recall channel rather than an end-to-end replacement. It combines category-aware and query-item contrastive learning with a Residual Quantized VAE to encode items into hierarchical semantic cluster IDs, which substantially reduces beam search complexity. An expert-guided group relative policy optimization method (EG-GRPO) then aligns retrieval outputs with downstream ranking goals under sparse rewards by injecting ground-truth samples during training. Offline experiments on TmallAPP search logs show relative gains of up to 26.76% in semantic hit rate and 11.11% in personalized click hit rate over RQ-VAE baselines, while halving the beam size. Online A/B tests show a 1.15% increase in GMV and a 0.40% uplift in UCTCVR, with the generative recall channel accounting for 72.63% of purchases.
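The hierarchical IDs mentioned above come from residual quantization. The following is a minimal sketch of the assignment step only, assuming already-trained codebooks and plain Euclidean distance. The function names and toy codebooks are hypothetical, and the paper's contrastive training, VAE encoder, and cluster construction are not reproduced here.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def nearest(codebook: Sequence[Vector], residual: Vector) -> int:
    """Return the index of the codeword closest to the residual (squared L2)."""
    best_idx, best_dist = 0, float("inf")
    for idx, code in enumerate(codebook):
        dist = sum((r - c) ** 2 for r, c in zip(residual, code))
        if dist < best_dist:
            best_idx, best_dist = idx, dist
    return best_idx


def residual_quantize(
    embedding: Vector, codebooks: Sequence[Sequence[Vector]]
) -> Tuple[Tuple[int, ...], List[float]]:
    """Assign one code per level; each level quantizes what the previous left over."""
    residual = list(embedding)
    ids = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        ids.append(idx)
        # Subtract the chosen codeword so the next level refines the remainder.
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return tuple(ids), residual


# Toy two-level codebooks (illustrative values, not from the paper).
toy_codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],   # coarse level
    [[0.0, 0.0], [0.1, -0.1]],  # fine level
]
semantic_id, leftover = residual_quantize([1.1, 0.9], toy_codebooks)
```

Items that share a leading code fall into the same coarse cluster, which is why a decoder generating such IDs level by level can prune candidates early during beam search.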
📝 Abstract
Generative retrieval offers a promising alternative by unifying the fragmented multi-stage retrieval process into a single end-to-end model. However, its practical adoption in industrial e-commerce search remains challenging, given the massive and dynamic product catalogs, strict latency requirements, and the need to align retrieval with downstream ranking goals. In this work, we propose a retrieval framework tailored for real-world recall scenarios, positioning generative retrieval as a recall-stage supplement rather than an end-to-end replacement. Our method, CQ-SID (Category-and-Query constrained Semantic ID), employs category-aware and query-item contrastive learning along with Residual Quantized VAEs to encode items into hierarchical semantic cluster identifiers, significantly reducing beam search complexity. Additionally, we develop EG-GRPO (Expert-Guided Group Relative Policy Optimization), a reinforcement learning approach that aligns generative recall with downstream ranking under sparse rewards by injecting ground-truth samples to stabilize training. Offline experiments on TmallAPP search logs show that CQ-SID achieves up to 26.76% and 11.11% relative gains in semantic and personalized click hitrate over RQ-VAE baselines, while halving beam search size. EG-GRPO further improves multi-objective performance. Online A/B tests confirm gains in GMV (+1.15%) and UCTCVR (+0.40%). The generative recall channel now contributes substantially in production, accounting for over 50.25% of exposures, 58.96% of clicks, and 72.63% of purchases, demonstrating a viable path for deploying generative retrieval in real-world e-commerce systems.
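To illustrate why injecting ground-truth samples can help under sparse rewards, here is a small sketch of the standard group-relative advantage used in GRPO-style training. It assumes rewards are normalized within each sampled group and that the expert sample is simply appended to the group with a fixed reward; both the helper names and that injection scheme are assumptions for illustration, not the paper's stated procedure.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def group_relative_advantages(rewards: Sequence[float], eps: float = 1e-6) -> List[float]:
    """Normalize each reward by the mean and standard deviation of its group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def with_expert_sample(sampled_rewards: Sequence[float], expert_reward: float = 1.0) -> List[float]:
    """Append a ground-truth sample's reward to the group (hypothetical scheme)."""
    return list(sampled_rewards) + [expert_reward]


# Sparse-reward case: every sampled output missed, so the group carries no signal.
sampled = [0.0, 0.0, 0.0]
plain = group_relative_advantages(sampled)

# With one injected ground-truth sample, the group has a non-zero spread.
guided = group_relative_advantages(with_expert_sample(sampled))
```

In the all-zero group every advantage is zero, so a policy-gradient update driven by these advantages would receive no learning signal. Once the expert sample is added, it gets a positive advantage and the failed samples get negative ones.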
Problem

Research questions and friction points this paper is trying to address.

generative retrieval
e-commerce search
latency constraints
ranking alignment
dynamic product catalogs
Innovation

Methods, ideas, or system contributions that make the work stand out.

Generative Retrieval
Semantic Cluster IDs
Expert-Guided RL
E-commerce Search
Residual Quantized VAE