GRank: Towards Target-Aware and Streamlined Industrial Retrieval with a Generate-Rank Framework

📅 2025-10-17

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Industrial-scale recommender systems face two challenges in the retrieval stage: weak representation capability (e.g., two-tower models lack fine-grained target awareness) and high index maintenance overhead (tree-, graph-, or quantization-based structures struggle to adapt to dynamic user preferences). This paper proposes GRank, a structured-index-free generate-rank retrieval paradigm. Its core components are a target-aware generator, which retrieves candidates via GPU-accelerated maximum inner product search (MIPS), and a lightweight reranker, both optimized jointly end to end. Multi-task learning keeps the two stages semantically consistent, enabling user-centric, fine-grained intent modeling. Evaluated on a billion-item catalog, GRank improves Recall@500 by over 30% and delivers 1.7× the P99 QPS of state-of-the-art tree- and graph-based retrievers. It has been deployed at scale, serving 400 million users, with online A/B tests showing an increase in total app usage time.
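The two-stage flow described above can be illustrated with a minimal sketch. It is not the authors' implementation: the exhaustive CPU scan stands in for GPU-accelerated MIPS, and `toy_ranker` is a hypothetical placeholder for the learned, candidate-specific Ranker. All function names and parameters here are assumptions made for illustration.

```python
import heapq


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def generate(user_vec, item_vecs, k):
    """Exhaustive MIPS: score every item and keep the k largest inner products.

    No tree, graph, or quantization index is built; a GPU version would
    replace this loop with a batched matrix product plus top-k.
    """
    scored = ((dot(user_vec, v), i) for i, v in enumerate(item_vecs))
    return [i for _, i in heapq.nlargest(k, scored)]


def toy_ranker(user_vec, item_vec):
    """Hypothetical stand-in for a learned target-aware scoring model."""
    interaction = sum(abs(a * b) for a, b in zip(user_vec, item_vec))
    return dot(user_vec, item_vec) + 0.1 * interaction


def rerank(user_vec, item_vecs, candidates, score_fn, n):
    """Score only the small candidate subset and return the best n item ids."""
    ranked = sorted(
        candidates,
        key=lambda i: score_fn(user_vec, item_vecs[i]),
        reverse=True,
    )
    return ranked[:n]


def retrieve(user_vec, item_vecs, k, n, score_fn=toy_ranker):
    """Generate k candidates with MIPS, then rerank them down to n."""
    candidates = generate(user_vec, item_vecs, k)
    return rerank(user_vec, item_vecs, candidates, score_fn, n)
```

Calling `retrieve(user_vec, item_vecs, k=50, n=10)` scores the full catalog once with a cheap inner product and applies the more expensive scorer to only 50 items, which is the cost split the generate-rank design relies on.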

📝 Abstract

Industrial-scale recommender systems rely on a cascade pipeline in which the retrieval stage must return a high-recall candidate set from billions of items under tight latency. Existing solutions either (i) suffer from limited expressiveness in capturing fine-grained user-item interactions, as seen in decoupled dual-tower architectures that rely on separate encoders, or generative models that lack precise target-aware matching capabilities, or (ii) build structured indices (tree, graph, quantization) whose item-centric topologies struggle to incorporate dynamic user preferences and incur prohibitive construction and maintenance costs. We present GRank, a novel structured-index-free retrieval paradigm that seamlessly unifies target-aware learning with user-centric retrieval. Our key innovations include: (1) a target-aware Generator trained to perform personalized candidate generation via GPU-accelerated MIPS, eliminating the semantic drift and maintenance costs of structured indexing; (2) a lightweight but powerful Ranker that performs fine-grained, candidate-specific inference on small subsets; (3) an end-to-end multi-task learning framework that ensures semantic consistency between generation and ranking objectives. Extensive experiments on two public benchmarks and a billion-item production corpus demonstrate that GRank improves Recall@500 by over 30% and delivers 1.7× the P99 QPS of state-of-the-art tree- and graph-based retrievers. GRank has been fully deployed in production on our recommendation platform since Q2 2025, serving 400 million active users with 99.95% service availability. Online A/B tests confirm significant improvements in core engagement metrics, with Total App Usage Time increasing by 0.160% in the main app and 0.165% in the Lite version.
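For readers unfamiliar with the headline metric, the following sketch shows one common definition of Recall@K and of a relative gain between two retrievers. The function names are illustrative, and the page does not state whether the reported 30% improvement is relative or absolute, so `relative_gain` is only one possible reading.

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant items that appear in the top-k retrieved list."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & relevant)
    return hits / len(relevant)


def relative_gain(new, baseline):
    """Relative improvement of `new` over `baseline`, e.g. 0.30 for +30%."""
    return (new - baseline) / baseline
```

With `retrieved = [1, 2, 3, 4]` and `relevant = [2, 4, 9]`, `recall_at_k(retrieved, relevant, 3)` counts one hit out of three relevant items, while `k=4` counts two.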

Problem

Research questions and friction points this paper is trying to address.

Improving retrieval recall and latency in billion-scale recommender systems

Overcoming limitations of dual-tower architectures and structured indexing methods

Unifying target-aware learning with user-centric retrieval without structured indices

Innovation

Methods, ideas, or system contributions that make the work stand out.

Target-aware Generator using GPU-accelerated MIPS

Lightweight Ranker for fine-grained candidate inference

End-to-end multi-task learning framework
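One way to picture a joint objective that trains a generator and a ranker together is a weighted sum of a retrieval loss and a ranking loss. The sketch below is an assumption for illustration only: the softmax cross-entropy over sampled negatives, the binary cross-entropy for the ranker, and the `rank_weight` parameter are all hypothetical choices, not the losses the paper specifies.

```python
import math


def softmax_ce(logits, target_idx):
    """Cross-entropy of a softmax over the logits (target vs. sampled negatives)."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target_idx]


def bce(logit, label):
    """Numerically stable binary cross-entropy on a raw logit."""
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


def joint_loss(gen_logits, target_idx, rank_logits, rank_labels, rank_weight=1.0):
    """Generator loss plus weighted mean ranker loss over the candidate subset."""
    gen = softmax_ce(gen_logits, target_idx)
    rank = sum(bce(x, y) for x, y in zip(rank_logits, rank_labels)) / len(rank_logits)
    return gen + rank_weight * rank
```

Because both terms are minimized in one pass, gradients from the ranking objective can shape the same representations the generator retrieves with, which is the intuition behind keeping the two stages consistent.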

🔎 Similar Papers

A Comprehensive Survey on Retrieval Methods in Recommender Systems