SAGER: Self-Evolving User Policy Skills for Recommendation Agent

📅 2026-04-16
📈 Citations: 0
✨ Influential: 0

🤖 AI Summary
Existing large language model (LLM) based recommendation agents rely on a static system prompt shared by all users, so they cannot personalize their reasoning logic: after a failed recommendation they update user memory but leave the decision-making that caused the failure unchanged. To overcome this, the authors equip each user with a natural-language policy skill that evolves through interaction. The approach combines a dual-representation skill architecture, incremental contrastive chain-of-thought reasoning, and skill-augmented listwise inference. The authors describe it as the first recommendation agent framework to give each user an evolvable natural-language skill; it decouples skill evolution from inference-time execution and contrasts accepted items with unchosen ones to diagnose and correct reasoning flaws. On four public benchmarks, the authors report state-of-the-art performance, with gains orthogonal to memory accumulation, which they take as evidence that personalizing the reasoning process is a distinct source of recommendation improvement.

📝 Abstract
Large language model (LLM) based recommendation agents personalize what they know through evolving per-user semantic memory, yet how they reason remains a universal, static system prompt shared identically across all users. This asymmetry is a fundamental bottleneck: when a recommendation fails, the agent updates its memory of user preferences but never interrogates the decision logic that produced the failure, leaving its reasoning process structurally unchanged regardless of how many mistakes it accumulates. To address this bottleneck, we propose SAGER (Self-Evolving Agent for Personalized Recommendation), the first recommendation agent framework in which each user is equipped with a dedicated policy skill, a structured natural-language document encoding personalized decision principles that evolves continuously through interaction. SAGER introduces a two-representation skill architecture that decouples a rich evolution substrate from a minimal inference-time injection, an incremental contrastive chain-of-thought engine that diagnoses reasoning flaws by contrasting accepted against unchosen items while preserving accumulated priors, and skill-augmented listwise reasoning that creates fine-grained decision boundaries where the evolved skill provides genuine discriminative value. Experiments on four public benchmarks demonstrate that SAGER achieves state-of-the-art performance, with gains orthogonal to memory accumulation, confirming that personalizing the reasoning process itself is a qualitatively distinct source of recommendation improvement.
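
The three components named in the abstract can be pictured with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract does not specify data structures, prompts, or update rules, so `PolicySkill`, `evolve`, `build_listwise_prompt`, and `toy_diagnose` are hypothetical names, and the LLM call is replaced by a plain function. It is not the authors' implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class PolicySkill:
    """Hypothetical per-user skill with two representations (an assumption)."""
    # Rich evolution substrate: every principle learned so far.
    full_document: List[str] = field(default_factory=list)
    # Assumed cap on how many principles are injected at inference time.
    max_injected: int = 3

    def injection(self) -> str:
        """Minimal inference-time view: here, simply the most recent principles."""
        return "\n".join(self.full_document[-self.max_injected:])


# A diagnose function stands in for an LLM contrastive chain-of-thought call.
# Arguments: current injected skill, accepted item, unchosen items.
Diagnose = Callable[[str, str, List[str]], str]


def evolve(skill: PolicySkill, accepted: str, unchosen: List[str],
           diagnose: Diagnose) -> PolicySkill:
    """Incremental update: append a new principle and keep all earlier ones."""
    principle = diagnose(skill.injection(), accepted, unchosen)
    if principle and principle not in skill.full_document:
        skill.full_document.append(principle)
    return skill


def build_listwise_prompt(skill: PolicySkill, memory: str,
                          candidates: List[str]) -> str:
    """Assemble a listwise ranking prompt that includes the injected skill."""
    lines = [
        "User memory: " + memory,
        "User policy skill:",
        skill.injection(),
        "Rank all candidates:",
    ]
    lines += [f"{i + 1}. {c}" for i, c in enumerate(candidates)]
    return "\n".join(lines)


def toy_diagnose(current_skill: str, accepted: str, unchosen: List[str]) -> str:
    """Toy stand-in for the LLM: contrast the accepted item with one unchosen item."""
    return f"Prefer items like '{accepted}' over items like '{unchosen[0]}'."
```

In this sketch, `evolve` would be called after each observed interaction and `build_listwise_prompt` before each recommendation, so the skill document grows separately from the shorter text that is actually injected into the prompt.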

Problem

Research questions and friction points this paper is trying to address.

recommendation agent
reasoning personalization
static system prompt
decision logic
user-specific policy

Innovation

Methods, ideas, or system contributions that make the work stand out.

Self-Evolving Policy Skill
Personalized Reasoning
Contrastive Chain-of-Thought
Skill-Augmented Recommendation
LLM-based Recommendation Agent