🤖 AI Summary
This paper addresses the challenge of personalizing large language model (LLM)-based dialogue assistants when individual user preference data is extremely scarce, often limited to a few annotated examples per user, and introduces this setting as the task of *Personalized Preference Alignment with Limited Data* (PPALLI). To tackle it, we propose FaST (Feature-aware Sampling and Tuning), a framework that first automatically discovers high-level semantic features from the sparse user feedback and then performs parameter-efficient fine-tuning guided by these features. We introduce two datasets, DnD and ELIP, benchmark a variety of alignment techniques on them, and find that FaST achieves the best overall performance. The core contribution is the coupling of interpretable feature discovery with lightweight adaptation, which makes personalized alignment sample-efficient and transparent in low-data regimes.
📝 Abstract
LLM-powered conversational assistants are often deployed in a one-size-fits-all manner, which fails to accommodate individual user preferences. Recently, LLM personalization -- tailoring models to align with specific user preferences -- has gained increasing attention as a way to bridge this gap. In this work, we specifically focus on a practical yet challenging setting where only a small set of preference annotations can be collected per user -- a problem we define as Personalized Preference Alignment with Limited Data (PPALLI). To support research in this area, we introduce two datasets -- DnD and ELIP -- and benchmark a variety of alignment techniques on them. We further propose FaST, a highly parameter-efficient approach that leverages high-level features automatically discovered from the data, achieving the best overall performance.