🤖 AI Summary
To address the limitations of conventional RAG methods in dynamic recommendation—specifically their reliance on static retrieval and inability to model fine-grained user intent—this paper proposes a multi-agent collaborative, intent-driven RAG framework. We introduce a novel four-role LLM agent architecture: User Understanding, Semantic Alignment Reasoning, Contextual Summarization, and Ranking & Generation. Active reasoning is deeply integrated throughout the entire retrieval-and-generation pipeline, synergistically incorporating natural language inference (NLI), conversational and long-term behavioral modeling, and context-aware ranking. Extensive experiments on three public benchmarks demonstrate substantial improvements: +42.1% in NDCG@5 and +35.5% in Hit@5 over standard RAG and timeliness-aware baselines. Ablation studies quantitatively validate the critical contribution of each agent component to overall performance.
📝 Abstract
Retrieval-Augmented Generation (RAG) has shown promise in enhancing recommendation systems by incorporating external context into large language model prompts. However, existing RAG-based approaches often rely on static retrieval heuristics and fail to capture nuanced user preferences in dynamic recommendation scenarios. In this work, we introduce ARAG, an Agentic Retrieval-Augmented Generation framework for Personalized Recommendation, which integrates a multi-agent collaboration mechanism into the RAG pipeline. To better model the long-term and session behavior of the user, ARAG leverages four specialized LLM-based agents: a User Understanding Agent that summarizes user preferences from long-term and session contexts, a Natural Language Inference (NLI) Agent that evaluates semantic alignment between candidate items retrieved by RAG and the inferred intent, a Context Summary Agent that summarizes the findings of the NLI Agent, and an Item Ranker Agent that generates a ranked list of recommendations based on contextual fit. We evaluate ARAG across three datasets. Experimental results demonstrate that ARAG significantly outperforms standard RAG and recency-based baselines, achieving up to 42.1% improvement in NDCG@5 and 35.5% in Hit@5. We also conduct an ablation study to analyze the contribution of each component of ARAG. Our findings highlight the effectiveness of integrating agentic reasoning into retrieval-augmented recommendation and suggest new directions for LLM-based personalization.
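The four-agent pipeline described above can be sketched as a simple sequential composition. This is a minimal illustrative sketch, not the paper's implementation: every function name is an assumption, and cheap string heuristics stand in for what ARAG realizes as LLM prompts at each stage.

```python
# Illustrative sketch of the ARAG four-agent flow. Each agent below would
# be an LLM call in the real system; simple heuristics stand in here.

def user_understanding_agent(long_term: list[str], session: list[str]) -> str:
    # Summarize user preferences from long-term and session contexts.
    # (Stand-in: concatenate recent behavior into a preference string.)
    return "prefers " + " ".join(session + long_term)

def nli_agent(preference_summary: str, candidate: str) -> float:
    # Score semantic alignment between a RAG-retrieved candidate item and
    # the inferred intent. (Stand-in: token overlap instead of LLM NLI.)
    pref = set(preference_summary.lower().split())
    cand = set(candidate.lower().split())
    return len(pref & cand) / max(len(cand), 1)

def context_summary_agent(scored: list[tuple[str, float]]) -> str:
    # Condense the NLI agent's findings into context for the ranker.
    aligned = [item for item, score in scored if score > 0]
    return "aligned candidates: " + "; ".join(aligned)

def item_ranker_agent(scored: list[tuple[str, float]], k: int = 5) -> list[str]:
    # Produce the final top-k list ordered by contextual fit.
    return [item for item, _ in sorted(scored, key=lambda p: -p[1])[:k]]

def arag_recommend(long_term: list[str], session: list[str],
                   retrieved: list[str], k: int = 5) -> list[str]:
    summary = user_understanding_agent(long_term, session)
    scored = [(item, nli_agent(summary, item)) for item in retrieved]
    context = context_summary_agent(scored)  # would condition the ranker LLM
    return item_ranker_agent(scored, k)
```

For example, with session items about running and a retrieved set mixing running gear and furniture, the running items rank ahead of the unrelated candidate because their NLI-style alignment scores are higher.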