🤖 AI Summary
This work addresses the limited efficacy of conventional persona prompting in steering large language models (LLMs) toward domain-specific expertise. We propose an intervenable role-vector mechanism: 29 domain-specific role vectors are extracted from intermediate-layer activations, and the model's internal representations are modulated directly via two complementary strategies, activation addition and directional ablation, to selectively enhance domain-relevant capabilities while suppressing irrelevant ones. Unlike external prompting, this internal-representation paradigm requires no modification of the input text, yielding consistent gains across multiple domain benchmarks (+3.2% average improvement) with negligible impact on out-of-domain tasks (Δ < 0.4%). To our knowledge, this is the first work to formulate persona modeling as an intervenable vector space and to empirically validate its cross-domain generalization. The approach establishes a novel paradigm for controllable reasoning through fine-grained, activation-level steering of internal representations.
📝 Abstract
The influence of personas on Large Language Models (LLMs) has been widely studied, yet their direct impact on performance remains uncertain. This work explores a novel approach to guiding LLM behaviour through role vectors, an alternative to persona-based prompting. We construct 29 role vectors derived from model activations and evaluate their impact on benchmark performance across multiple domains. Our analysis investigates whether these vectors can effectively steer models toward domain-specific expertise. We evaluate two key interventions: (i) activation addition, which reinforces role-specific directions, and (ii) directional ablation, which removes them. Results on well-established benchmarks indicate that role vectors do, in fact, influence model behaviour, improving task performance in relevant domains while only marginally affecting unrelated tasks. This, in turn, suggests that manipulating internal model representations has a greater impact on outcomes than persona-based prompting.
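The two interventions are simple vector operations on a hidden state: activation addition shifts the state along the role direction, while directional ablation projects out its component along that direction. The sketch below illustrates this on a toy NumPy vector; the function names, the scaling coefficient `alpha`, and the use of a random "role vector" are illustrative assumptions, not details from the paper.

```python
import numpy as np

def activation_addition(h, v, alpha=1.0):
    # Reinforce the role-specific direction: h' = h + alpha * v
    return h + alpha * v

def directional_ablation(h, v):
    # Remove the component of h along v: h' = h - (h . v_hat) v_hat
    v_hat = v / np.linalg.norm(v)
    return h - np.dot(h, v_hat) * v_hat

# Toy example: an 8-dimensional "activation" and a hypothetical role vector.
rng = np.random.default_rng(0)
h = rng.normal(size=8)   # stand-in for an intermediate-layer activation
v = rng.normal(size=8)   # stand-in for an extracted role vector

h_steered = activation_addition(h, v, alpha=2.0)
h_ablated = directional_ablation(h, v)

# After ablation, the state carries no component along the role direction.
v_hat = v / np.linalg.norm(v)
assert abs(np.dot(h_ablated, v_hat)) < 1e-9
```

In practice such edits would be applied to transformer residual-stream activations via forward hooks at chosen layers; the linear algebra, however, is exactly what is shown here.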