🤖 AI Summary
To address four critical limitations of existing LLM-based recommender systems (computational redundancy, monolithic explanation generation, poor generalization, and misaligned evaluation), this paper proposes RecGPT-V2, an LLM-augmented intent reasoning framework for recommendation that extends the paradigm pioneered by RecGPT-V1. Methodologically, it introduces: (1) hierarchical multi-agent collaborative reasoning to decouple user interest modeling from item understanding; (2) a meta-prompting mechanism for dynamic, diverse explanation generation; (3) constraint-aware reinforcement learning for multi-objective decoupled optimization; and (4) an "agent-as-judge" evaluation paradigm aligned with human judgment criteria. Experiments demonstrate a 60% reduction in GPU consumption, a 1.6-percentage-point gain in exclusive recall (9.39% to 10.99%), a 24.1% improvement in tag prediction accuracy, and gains of 7.3% in explanation diversity and 13.0% in explanation acceptance rate. Online A/B tests on Taobao show significant improvements (+2.98% CTR, +3.71% IPV, +2.19% TV, and +11.46% NER), advancing recommendation from implicit behavioral matching to explicit intent understanding.
📝 Abstract
Large language models (LLMs) have demonstrated remarkable potential in transforming recommender systems from implicit behavioral pattern matching to explicit intent reasoning. While RecGPT-V1 pioneered this paradigm by integrating LLM-based reasoning into user interest mining and item tag prediction, it suffers from four fundamental limitations: (1) computational inefficiency and cognitive redundancy across multiple reasoning routes; (2) insufficient explanation diversity under fixed-template generation; (3) limited generalization under supervised learning paradigms; and (4) simplistic outcome-focused evaluation that fails to align with human judgment standards.
To address these challenges, we present RecGPT-V2 with four key innovations. First, a Hierarchical Multi-Agent System restructures intent reasoning through coordinated collaboration, eliminating cognitive duplication while enabling diverse intent coverage. Combined with Hybrid Representation Inference that compresses user-behavior contexts, our framework reduces GPU consumption by 60% and improves exclusive recall from 9.39% to 10.99%. Second, a Meta-Prompting framework dynamically generates contextually adaptive prompts, improving explanation diversity by +7.3%. Third, constrained reinforcement learning mitigates multi-reward conflicts, achieving +24.1% improvement in tag prediction and +13.0% in explanation acceptance. Fourth, an Agent-as-a-Judge framework decomposes assessment into multi-step reasoning, improving human preference alignment. Online A/B tests on Taobao demonstrate significant improvements: +2.98% CTR, +3.71% IPV, +2.19% TV, and +11.46% NER. RecGPT-V2 establishes both the technical feasibility and commercial viability of deploying LLM-powered intent reasoning at scale, bridging the gap between cognitive exploration and industrial utility.
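The abstract does not detail how the constrained reinforcement learning resolves multi-reward conflicts, but a common realization of "constraint-aware" optimization is a Lagrangian formulation: maximize a primary reward subject to auxiliary rewards staying above thresholds, with dual ascent on the multipliers. The sketch below is an illustrative assumption, not the paper's implementation; all names, thresholds, and the `diversity` constraint are hypothetical.

```python
class ConstrainedRewardCombiner:
    """Minimal sketch of Lagrangian-style multi-reward combination.

    Maximizes a primary reward (e.g. explanation acceptance) subject to
    auxiliary rewards (e.g. diversity) staying above per-constraint
    thresholds, using dual ascent on non-negative multipliers.
    """

    def __init__(self, thresholds, lr=0.01):
        self.thresholds = dict(thresholds)            # e.g. {"diversity": 0.6}
        self.lambdas = {k: 0.0 for k in thresholds}   # dual variables, >= 0
        self.lr = lr                                  # dual-ascent step size

    def combined_reward(self, primary, aux):
        # Scalar reward fed to the policy update:
        #   primary + sum_k lambda_k * (aux_k - threshold_k)
        # A violated constraint (aux_k < threshold_k) is penalized in
        # proportion to its current multiplier.
        return primary + sum(
            self.lambdas[k] * (aux[k] - self.thresholds[k])
            for k in self.thresholds
        )

    def update_duals(self, aux_batch_means):
        # Dual ascent: raise lambda_k when constraint k is violated on
        # average over the batch; decay it back toward zero otherwise.
        for k, t in self.thresholds.items():
            violation = t - aux_batch_means[k]
            self.lambdas[k] = max(0.0, self.lambdas[k] + self.lr * violation)
```

This decouples the objectives: the policy only ever sees one scalar reward, while the multipliers adaptively rebalance competing signals instead of relying on hand-tuned fixed weights.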