🤖 AI Summary
This work addresses the computational cost and limited scalability of personalizing large language models through fine-tuning, which becomes impractical when every combination of user preferences (for example helpfulness, conciseness, and humor) must be covered. The authors propose CLIPer (Classifier-guided Inference-time Personalization), a lightweight approach in which a classifier model dynamically steers the LLM's generation toward the target preferences at inference time. Because the steering happens during generation, the method avoids extensive fine-tuning and adds negligible computational overhead. It supports both single- and multi-dimensional preference control, and the authors report empirical analyses indicating that the approach is scalable and effective for personalized language generation.
📝 Abstract
Personalized LLMs can significantly enhance user experiences by tailoring responses to preferences such as helpfulness, conciseness, and humor. However, fine-tuning models to address all possible combinations of user preferences is computationally expensive and impractical. In this paper, we introduce CLIPer (Classifier-guided Inference-time Personalization), a lightweight personalization approach that leverages a classifier model to dynamically steer LLM generation toward different user preferences at inference time. Our method eliminates the need for extensive fine-tuning and incurs negligible additional computational overhead, while enabling more controllable and nuanced personalization across single- and multi-dimensional preferences. Comprehensive empirical analyses demonstrate the scalability and effectiveness of our approach in delivering personalized language generation.
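As a rough illustration of what classifier-guided decoding can look like, the sketch below reweights a language model's next-token scores with a preference classifier's log-probabilities. This is a generic formulation and not necessarily the exact CLIPer procedure; the function name `guided_next_token_logprobs`, the `weights` guidance strengths, and the toy three-token vocabulary and numbers are all assumptions made for illustration.

```python
import math


def log_softmax(scores):
    """Numerically stable log-softmax over a list of floats."""
    m = max(scores)
    lse = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - lse for s in scores]


def guided_next_token_logprobs(lm_logits, classifier_logprobs, weights):
    """Combine LM logits with weighted classifier scores (hypothetical helper).

    lm_logits: one score per candidate next token from the language model.
    classifier_logprobs: preference name -> per-token log p(preference | prefix + token).
    weights: preference name -> guidance strength (an assumed tuning knob).
    """
    scores = list(lm_logits)
    for name, weight in weights.items():
        for i, logp in enumerate(classifier_logprobs[name]):
            scores[i] += weight * logp
    return log_softmax(scores)


# Toy example: three candidate tokens, one preference dimension ("concise").
lm_logits = [2.0, 1.0, 0.0]
classifier_logprobs = {"concise": [math.log(0.1), math.log(0.8), math.log(0.5)]}

unguided = guided_next_token_logprobs(lm_logits, classifier_logprobs, {"concise": 0.0})
guided = guided_next_token_logprobs(lm_logits, classifier_logprobs, {"concise": 2.0})

best_unguided = max(range(len(unguided)), key=lambda i: unguided[i])
best_guided = max(range(len(guided)), key=lambda i: guided[i])
print(best_unguided, best_guided)
```

With the weight set to zero, the language model's preferred token is chosen; raising the weight shifts the choice to the token the classifier associates with the preference. Several preference dimensions can be combined by adding one weighted term per classifier, and no model weights are updated at any point.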