Offline Reasoning for Efficient Recommendation: LLM-Empowered Persona-Profiled Item Indexing

📅 2026-02-25

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work proposes Persona4Rec, a framework that addresses the high online inference cost and latency that hinder practical deployment of large language model (LLM)-based recommender systems. Persona4Rec moves the LLM's semantic reasoning to the offline phase: it extracts multi-perspective, interpretable user motivations, referred to as “personas”, from product reviews and uses them to construct a persona-aware item index. During online inference, recommendations are generated through lightweight user–persona matching, without invoking LLM reasoning. The approach achieves recommendation performance comparable to recent LLM-based rerankers while substantially reducing online latency. It also offers intuitive, review-grounded explanations for its recommendations, combining efficiency, effectiveness, and interpretability.

📝 Abstract

Recent advances in large language models (LLMs) offer new opportunities for recommender systems by capturing the nuanced semantics of user interests and item characteristics through rich semantic understanding and contextual reasoning. In particular, LLMs have been employed as rerankers that reorder candidate items based on inferred user-item relevance. However, these approaches often require expensive online inference-time reasoning, leading to high latency that hampers real-world deployment. In this work, we introduce Persona4Rec, a recommendation framework that performs offline reasoning to construct interpretable persona representations of items, enabling lightweight and scalable real-time inference. In the offline stage, Persona4Rec leverages LLMs to reason over item reviews, inferring diverse user motivations that explain why different types of users may engage with an item; these inferred motivations are materialized as persona representations, providing multiple, human-interpretable views of each item. Unlike conventional approaches that rely on a single item representation, Persona4Rec learns to align user profiles with the most plausible item-side persona through a dedicated encoder, effectively transforming user-item relevance into user-persona relevance. At the online stage, this persona-profiled item index allows fast relevance computation without invoking expensive LLM reasoning. Extensive experiments show that Persona4Rec achieves performance comparable to recent LLM-based rerankers while substantially reducing inference time. Moreover, qualitative analysis confirms that persona representations not only drive efficient scoring but also provide intuitive, review-grounded explanations. These results demonstrate that Persona4Rec offers a practical and interpretable solution for next-generation recommender systems.
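
The online stage described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the persona texts, the three-dimensional toy embeddings, the names `PERSONA_INDEX`, `score_item`, and `rerank`, and the choice of cosine similarity with a max over personas are all assumptions made for illustration. In the paper, embeddings come from a dedicated learned encoder, and the personas are produced offline by an LLM reasoning over reviews.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical output of the offline stage: each item maps to several
# personas, each stored as (description, precomputed embedding).
# The vectors are hand-written toy values, not encoder outputs.
PERSONA_INDEX = {
    "tent": [
        ("weekend family camper", [0.9, 0.1, 0.0]),
        ("ultralight solo hiker", [0.1, 0.9, 0.0]),
    ],
    "espresso_machine": [
        ("home barista hobbyist", [0.0, 0.1, 0.9]),
    ],
}


def score_item(user_vec, personas):
    """Return (similarity, persona text) for the best-matching persona."""
    return max((cosine(user_vec, vec), text) for text, vec in personas)


def rerank(user_vec, candidates, index=PERSONA_INDEX):
    """Order candidates by their best user-persona similarity.

    Each result is (item, score, matched persona); the matched persona
    doubles as a short explanation for the recommendation.
    """
    scored = [(item, *score_item(user_vec, index[item])) for item in candidates]
    return sorted(scored, key=lambda t: t[1], reverse=True)


# Toy user profile embedding (assumed to live in the same space).
user = [0.2, 0.8, 0.1]
ranking = rerank(user, ["espresso_machine", "tent"])
for item, score, persona in ranking:
    print(f"{item}: {score:.3f} (matched persona: {persona})")
```

Because each item is scored with a handful of vector comparisons against a precomputed index, no LLM is called at request time, which is the source of the latency reduction the abstract claims. The matched persona text gives the review-grounded explanation for why an item was ranked highly.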

Problem

Research questions and friction points this paper is trying to address.

LLM-based recommendation

online inference latency

real-time recommendation

computational efficiency

scalable recommender systems

Innovation

Methods, ideas, or system contributions that make the work stand out.

offline reasoning

persona representation

LLM-empowered recommendation