Steerable Chatbots: Personalizing LLMs with Preference-Based Activation Steering

📅 2025-05-07
📈 Citations: 0
Influential: 0
🤖 AI Summary
Conventional prompting struggles to elicit users' implicit preferences, leading to homogeneous LLM outputs and limiting personalization for non-expert users. Method: The paper proposes a fine-tuning-free, memoryless inference-time alignment method that projects interpretable preference dimensions (e.g., formality, verbosity, and sentiment polarity) into the LLM's hidden-layer activation space, enabling real-time, user-controllable output steering via lightweight linear interventions. Contribution/Results: The authors design three interactive chatbot interfaces and conduct a within-subjects user study (n=14). Results show statistically significant improvements in response personalization; users exhibit systematic trade-offs among perceived controllability, usability, and transparency, with clear interface-dependent preferences. The approach offers low-barrier, controllable LLM personalization without model modification or reliance on conversation history.

📝 Abstract
As large language models (LLMs) improve in their capacity to serve as personal AI assistants, their ability to output uniquely tailored, personalized responses that align with the soft preferences of their users is essential for enhancing user satisfaction and retention. However, untrained lay users have poor prompt specification abilities and often struggle with conveying their latent preferences to AI assistants. To address this, we leverage activation steering to guide LLMs to align with interpretable preference dimensions during inference. In contrast to memory-based personalization methods that require longer user history, steering is extremely lightweight and can be easily controlled by the user via a linear strength factor. We embed steering into three different interactive chatbot interfaces and conduct a within-subjects user study (n=14) to investigate how end users prefer to personalize their conversations. The results demonstrate the effectiveness of preference-based steering for aligning real-world conversations with hidden user preferences, and highlight further insights on how diverse values around control, usability, and transparency lead users to prefer different interfaces.
Problem

Research questions and friction points this paper is trying to address.

Enhancing LLM personalization for user satisfaction via activation steering
Addressing poor prompt specification in untrained users for preference alignment
Evaluating lightweight steering interfaces for real-world conversation personalization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Activation steering aligns LLMs with preferences
Lightweight user control via linear strength factor
Interactive chatbot interfaces for personalization
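The linear intervention behind these contributions can be sketched in a few lines. This is a minimal illustration of activation steering in general, not the paper's actual implementation: the `steer_activations` function, the toy "formality" direction, and the hidden-state shapes are all assumptions for the example. In practice the direction would be extracted from a real model's hidden layers (e.g., from contrastive prompt pairs) and added during the forward pass.

```python
import numpy as np

def steer_activations(hidden, direction, strength):
    """Shift hidden activations along a preference direction.

    hidden:    (num_tokens, hidden_dim) activations at some layer
    direction: (hidden_dim,) preference vector, e.g. "formality"
    strength:  user-controlled linear strength factor; positive values
               push outputs toward the preference, negative away from it
    """
    unit = direction / np.linalg.norm(direction)  # unit-normalize the direction
    return hidden + strength * unit               # broadcast add to every token

# Toy data standing in for one layer of a model: 4 tokens, hidden size 8.
rng = np.random.default_rng(0)
hidden = rng.normal(size=(4, 8))
formality_dir = rng.normal(size=8)  # hypothetical "formality" direction

steered = steer_activations(hidden, formality_dir, strength=2.0)
```

Because the intervention is a single scaled vector addition, exposing `strength` as a slider (as in the paper's interfaces) gives the user continuous, real-time control with no fine-tuning or stored history.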
Jessica Y. Bo
University of Toronto
Human-AI Interactions
Tianyu Xu
Google AR
Ishan Chatterjee
Google, University of Washington
Augmented Reality · Spatial Computing · Input Methods
Katrina Passarella-Ward
Google AR
Achin Kulshrestha
Google AR
D Shin
Google DeepMind