🤖 AI Summary
This work tackles the tension between fine-grained personalization and scalable deployment when adapting large language models for personalized text generation. The authors propose CARD, a framework that clusters users by shared style, learns cluster-level LoRA adapters, and adds implicit preference-aware contrastive learning to inject per-user signals at decoding time in a lightweight manner. CARD requires no manual annotations and adjusts outputs dynamically at inference while keeping the base model frozen, which keeps it computationally efficient and easy to deploy at scale. On the LaMP and LongLaMP benchmarks, CARD matches or surpasses state-of-the-art generation quality while significantly improving inference efficiency and scalability.
📝 Abstract
Adapting large language models to individual users remains challenging due to the tension between fine-grained personalization and scalable deployment. We present CARD, a hierarchical framework that achieves effective personalization through progressive refinement. CARD first clusters users according to shared stylistic patterns and learns cluster-specific LoRA adapters, enabling robust generalization and strong low-resource performance. To capture individual differences within each cluster, we propose an implicit preference learning mechanism that contrasts user-authored text with cluster-level generations, allowing the model to infer user-specific style preferences without manual annotation. At inference time, CARD injects personalization exclusively at decoding via lightweight user preference vectors and low-rank logit corrections, while keeping the base model frozen. Experiments on the LaMP and LongLaMP benchmarks show that CARD achieves competitive or superior generation quality compared to state-of-the-art baselines, while significantly improving efficiency and scalability for practical personalized text generation.
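The abstract's decoding-time mechanism, a per-user preference vector combined with a low-rank logit correction on top of a frozen base model, can be sketched as follows. This is a minimal illustration, not the paper's implementation: all shapes, the `correction_matrix` name, and the scaling are assumptions for exposition.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, rank = 100, 4  # illustrative sizes, not from the paper

# Stand-in for next-token logits from the frozen base (or cluster-LoRA) model.
base_logits = rng.normal(size=vocab_size)

# Hypothetical lightweight user preference vector, inferred without annotations.
user_pref = rng.normal(size=rank)

# Low-rank correction: a (vocab_size x rank) matrix maps the preference
# vector to a per-token logit offset, so the extra cost per decoding step
# is O(vocab_size * rank) and the base model stays untouched.
correction_matrix = rng.normal(size=(vocab_size, rank)) * 0.1

personalized_logits = base_logits + correction_matrix @ user_pref

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

probs = softmax(personalized_logits)  # personalized next-token distribution
```

Because only `user_pref` and the low-rank correction are user-specific, serving many users amounts to swapping small vectors at decode time rather than loading per-user model weights, which is the scalability argument the abstract makes.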