Steerable Chatbots: Personalizing LLMs with Preference-Based Activation Steering

📅 2025-05-07
📈 Citations: 0
Influential: 0
🤖 AI Summary
Conventional prompting struggles to elicit users' implicit preferences, leading to homogeneous LLM outputs and limiting personalization for non-expert users. Method: The paper proposes a fine-tuning-free, memoryless inference-time alignment method that projects interpretable preference dimensions (e.g., formality, verbosity, and sentiment polarity) into the LLM's hidden-layer activation space, enabling real-time, user-controllable output steering via lightweight linear interventions. Contribution/Results: The authors design three interactive chatbot interfaces and conduct a within-subjects user study (n=14). Results show statistically significant improvements in response personalization; users exhibit systematic trade-offs among perceived controllability, usability, and transparency, with clear interface-dependent preferences. The approach offers low-barrier, controllable LLM personalization without model modification or reliance on conversation history.

📝 Abstract
As large language models (LLMs) improve in their capacity to serve as personal AI assistants, their ability to output uniquely tailored, personalized responses that align with the soft preferences of their users is essential for enhancing user satisfaction and retention. However, untrained lay users have poor prompt specification abilities and often struggle with conveying their latent preferences to AI assistants. To address this, we leverage activation steering to guide LLMs to align with interpretable preference dimensions during inference. In contrast to memory-based personalization methods that require longer user history, steering is extremely lightweight and can be easily controlled by the user via a linear strength factor. We embed steering into three different interactive chatbot interfaces and conduct a within-subjects user study (n=14) to investigate how end users prefer to personalize their conversations. The results demonstrate the effectiveness of preference-based steering for aligning real-world conversations with hidden user preferences, and highlight further insights on how diverse values around control, usability, and transparency lead users to prefer different interfaces.
Problem

Research questions and friction points this paper is trying to address.

Enhancing LLM personalization for user satisfaction via activation steering
Addressing poor prompt specification in untrained users for preference alignment
Evaluating lightweight steering interfaces for real-world conversation personalization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Activation steering aligns LLMs with preferences
Lightweight user control via linear strength factor
Interactive chatbot interfaces for personalization
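The linear intervention behind these contributions can be sketched in a few lines. This is a minimal illustration of activation steering in general, not the paper's actual implementation: the `steer_activations` function, the toy "formality" direction, and the hidden-state shapes are all assumptions for the example. In practice the direction would be extracted from a real model's hidden layers (e.g., from contrastive prompt pairs) and added during the forward pass.

```python
import numpy as np

def steer_activations(hidden, direction, strength):
    """Shift hidden activations along a preference direction.

    hidden:    (num_tokens, hidden_dim) activations at some layer
    direction: (hidden_dim,) preference vector, e.g. "formality"
    strength:  user-controlled linear strength factor; positive values
               push outputs toward the preference, negative away from it
    """
    unit = direction / np.linalg.norm(direction)  # unit-normalize the direction
    return hidden + strength * unit               # broadcast add to every token

# Toy data standing in for one layer of a model: 4 tokens, hidden size 8.
rng = np.random.default_rng(0)
hidden = rng.normal(size=(4, 8))
formality_dir = rng.normal(size=8)  # hypothetical "formality" direction

steered = steer_activations(hidden, formality_dir, strength=2.0)
```

Because the intervention is a single scaled vector addition, exposing `strength` as a slider (as in the paper's interfaces) gives the user continuous, real-time control with no fine-tuning or stored history.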
Jessica Y. Bo
University of Toronto
Human-AI Interactions
Tianyu Xu
Google AR
Ishan Chatterjee
Google, University of Washington
Augmented Reality · Spatial Computing · Input Methods
Katrina Passarella-Ward
Google AR
Achin Kulshrestha
Google AR
D Shin
Google DeepMind