PERSONA: Dynamic and Compositional Inference-Time Personality Control via Activation Vector Algebra

📅 2026-02-17
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Existing approaches to personality control in large language models rely on static prompts or costly fine-tuning, which makes dynamic and composable personality expression difficult. This work proposes a training-free framework and reports, for the first time, that personality traits in the model's activation space are approximately orthogonal and amenable to vector arithmetic. Using contrastive activation analysis, the method extracts personality basis vectors and enables fine-grained control through scaling, addition, subtraction, and context-aware dynamic composition. On PersonalityBench, the approach achieves a mean score of 9.60, nearly matching the fine-tuned upper bound of 9.61, and it attains win rates of up to 91% on the Persona-Evolve benchmark, substantially outperforming existing training-free methods.
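
The orthogonality claim can be made concrete with a small check. The sketch below is illustrative only: the three-dimensional vectors are made-up toy values, not vectors from the paper, and `cosine` and `max_offdiag_cosine` are hypothetical helper names. It assumes that "approximately orthogonal" means pairwise cosine similarities close to zero.

```python
import math
from typing import Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; values near 0 indicate near-orthogonal directions."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def max_offdiag_cosine(vectors: dict[str, Sequence[float]]) -> float:
    """Largest absolute cosine similarity over all distinct trait pairs."""
    names = list(vectors)
    return max(
        abs(cosine(vectors[a], vectors[b]))
        for i, a in enumerate(names)
        for b in names[i + 1:]
    )


# Toy trait vectors (assumed values, for illustration only).
toy_traits = {
    "openness": [1.0, 0.05, 0.0],
    "extraversion": [0.0, 1.0, 0.1],
    "agreeableness": [0.02, 0.0, 1.0],
}

worst_overlap = max_offdiag_cosine(toy_traits)
```

A small `worst_overlap` means that adding or subtracting one trait vector changes the projection onto the others very little, which is what makes independent composition plausible.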

📝 Abstract
Current methods for personality control in Large Language Models rely on static prompting or expensive fine-tuning, failing to capture the dynamic and compositional nature of human traits. We introduce PERSONA, a training-free framework that achieves fine-tuning level performance through direct manipulation of personality vectors in activation space. Our key insight is that personality traits appear as extractable, approximately orthogonal directions in the model's representation space that support algebraic operations. The framework operates through three stages: Persona-Base extracts orthogonal trait vectors via contrastive activation analysis; Persona-Algebra enables precise control through vector arithmetic (scalar multiplication for intensity, addition for composition, subtraction for suppression); and Persona-Flow achieves context-aware adaptation by dynamically composing these vectors during inference. On PersonalityBench, our approach achieves a mean score of 9.60, nearly matching the supervised fine-tuning upper bound of 9.61 without any gradient updates. On our proposed Persona-Evolve benchmark for dynamic personality adaptation, we achieve up to 91% win rates across diverse model families. These results provide evidence that aspects of LLM personality are mathematically tractable, opening new directions for interpretable and efficient behavioral control.
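
The three stages described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's implementation: it assumes the contrastive step is a difference of mean activations between trait-expressing and trait-opposing prompts, that steering adds the composed vector to a hidden state, and that the context-dependent weights are supplied by hand. All function names and numbers are hypothetical.

```python
from typing import Sequence

Vector = list[float]


def mean(rows: Sequence[Sequence[float]]) -> Vector:
    """Column-wise mean of a batch of activation vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def trait_vector(pos_acts: Sequence[Sequence[float]],
                 neg_acts: Sequence[Sequence[float]]) -> Vector:
    """Assumed contrastive extraction: mean(positive) - mean(negative)."""
    return [p - q for p, q in zip(mean(pos_acts), mean(neg_acts))]


def scale(v: Sequence[float], alpha: float) -> Vector:
    """Scalar multiplication controls trait intensity."""
    return [alpha * x for x in v]


def add(u: Sequence[float], v: Sequence[float]) -> Vector:
    """Element-wise addition composes two directions."""
    return [a + b for a, b in zip(u, v)]


def compose(basis: dict[str, Vector], weights: dict[str, float]) -> Vector:
    """Weighted sum of trait vectors; a negative weight suppresses a trait."""
    dim = len(next(iter(basis.values())))
    out = [0.0] * dim
    for name, w in weights.items():
        out = add(out, scale(basis[name], w))
    return out


# Toy activations (assumed values) for two traits in a 2-d space.
basis = {
    "extraversion": trait_vector([[2.0, 0.0], [4.0, 0.0]],
                                 [[1.0, 0.0], [1.0, 0.0]]),
    "agreeableness": trait_vector([[0.0, 1.0], [0.0, 3.0]],
                                  [[0.0, 1.0], [0.0, 1.0]]),
}

# Hand-picked weights stand in for a context-aware choice at inference time.
steering = compose(basis, {"extraversion": 0.5, "agreeableness": -1.0})
steered_hidden = add([0.5, 0.5], steering)  # add to a toy hidden state
```

In a real model the activations would come from a chosen layer and the weights would be recomputed as the conversation changes; neither choice is specified here.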

Problem

Research questions and friction points this paper is trying to address.

personality control
dynamic personality
compositional traits
Large Language Models
activation space

Innovation

Methods, ideas, or system contributions that make the work stand out.

activation vector algebra
personality control
training-free inference
compositional traits
dynamic adaptation