Internalizing Multi-Agent Reasoning for Accurate and Efficient LLM-based Recommendation

📅 2026-02-10
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of integrating the semantic reasoning capabilities of large language models with collaborative filtering signals to enhance recommendation accuracy while maintaining low inference latency. The authors propose STAR, a novel framework that, for the first time, internalizes multi-agent collaborative reasoning trajectories—encompassing planning, tool invocation, and self-reflection—into a single efficient recommender model via trajectory-driven distillation. To bridge user behavior with natural language reasoning, STAR introduces a collaborative signal translation mechanism that converts interaction histories into textual evidence to augment the model’s reasoning process. Extensive experiments demonstrate that STAR outperforms its multi-agent teacher by 8.7%–39.5% across multiple metrics while eliminating iterative inference delays, thereby achieving a unified solution that delivers both high accuracy and real-time responsiveness.
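The collaborative signal translation step described above, converting interaction histories into textual evidence, could look roughly like the following minimal sketch. All names here (`Interaction`, `translate_signals`) and the summary format are illustrative assumptions, not the paper's actual implementation.

```python
# Hedged sketch of Collaborative Signal Translation: turning a user's
# interaction history into natural-language evidence that can be fed
# into an LLM recommender's prompt as reasoning context.
from dataclasses import dataclass

@dataclass
class Interaction:
    item_title: str
    category: str
    rating: float  # e.g. on a 1.0-5.0 scale

def translate_signals(history: list[Interaction]) -> str:
    """Summarize latent behavioral patterns as descriptive text."""
    if not history:
        return "The user has no recorded interactions."
    # Count interactions per category to find the dominant interest.
    counts: dict[str, int] = {}
    for i in history:
        counts[i.category] = counts.get(i.category, 0) + 1
    dominant = max(counts, key=counts.get)
    lines = [f"The user has interacted with {len(history)} items, "
             f"most often in the '{dominant}' category."]
    # Surface highly rated items as explicit positive evidence.
    liked = [i for i in history if i.rating >= 4.0]
    if liked:
        titles = ", ".join(i.item_title for i in liked[:3])
        lines.append(f"Highly rated items include: {titles}.")
    return " ".join(lines)
```

The resulting string would be appended to the teacher's prompt so the agent reasons over explicit textual evidence rather than opaque ID embeddings.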

Technology Category

Application Category

📝 Abstract
Large Language Models (LLMs) are reshaping recommender systems by leveraging extensive world knowledge and semantic reasoning to interpret user intent. However, effectively integrating these capabilities with collaborative signals while avoiding prohibitive inference latency remains a critical bottleneck. To address this, we propose a trajectory-driven internalization framework to develop a Single-agent Trajectory-Aligned Recommender (STAR). Specifically, to internalize complex reasoning capabilities into a single efficient model, we first design a multi-agent teacher system capable of multi-turn tool usage and reflection. This teacher utilizes a Collaborative Signal Translation mechanism to explicitly convert latent behavioral patterns into descriptive natural language evidence to enhance reasoning accuracy. Subsequently, a trajectory-driven distillation pipeline transfers this agentic logic, including planning, tool usage, and self-reflection, into the compact STAR model. Extensive experiments demonstrate that STAR surpasses its teacher by 8.7% to 39.5% while eliminating iterative latency, paving the way for real-time, reasoning-enhanced recommendation.
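The trajectory-driven distillation pipeline described in the abstract serializes the teacher's agentic steps (planning, tool usage, self-reflection) into supervised targets for the compact student. A minimal sketch of that data-preparation step might look as follows; the trajectory schema and tag format are assumptions for illustration, not the paper's actual format.

```python
# Hedged sketch of trajectory-driven distillation data preparation:
# flatten one multi-agent teacher trajectory into a single
# (prompt, target) pair the student model can be fine-tuned on.

def trajectory_to_example(user_query: str, trajectory: list[dict]) -> dict:
    """Serialize a teacher trajectory into one supervised example."""
    parts = []
    for step in trajectory:
        kind = step["type"]  # assumed: "plan" | "tool" | "reflect" | "answer"
        if kind == "tool":
            # Inline the tool call's result so the student learns to
            # internalize tool-derived evidence instead of re-invoking it.
            parts.append(f"<tool name={step['name']}>{step['result']}</tool>")
        else:
            parts.append(f"<{kind}>{step['text']}</{kind}>")
    return {"prompt": user_query, "target": "\n".join(parts)}
```

Training the student on such flattened targets is what lets a single forward pass reproduce the multi-turn agentic logic, which is where the latency savings come from.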
Problem

Research questions and friction points this paper is trying to address.

LLM-based recommendation
multi-agent reasoning
collaborative signals
inference latency
reasoning efficiency
Innovation

Methods, ideas, or system contributions that make the work stand out.

trajectory-driven distillation
multi-agent reasoning
collaborative signal translation
LLM-based recommendation
reasoning internalization
🔎 Similar Papers
2024-05-17 · Annual Meeting of the Association for Computational Linguistics · Citations: 4
Yang Wu (Tencent) · Computer Vision, Machine Learning, Computer Graphics
Haoze Wang (Tencent, Beijing, China)
Qian Li (Tencent, Beijing, China)
Jun Zhang (Tencent) · AI codec, image/video generation, medical image analysis
Huan Yu (Tencent, Beijing, China)
Jie Jiang (Tencent, Beijing, China)