PERSONA: Dynamic and Compositional Inference-Time Personality Control via Activation Vector Algebra

📅 2026-02-17

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing approaches to personality control in large language models rely on static prompts or costly fine-tuning, which makes dynamic, composable personality expression difficult. This work proposes a training-free framework built on the finding, which the authors present as new, that personality traits correspond to approximately orthogonal directions in the model's activation space and support vector arithmetic. Using contrastive activation analysis, the method extracts personality basis vectors and enables fine-grained control through scaling, addition, subtraction, and context-aware dynamic composition. On PersonalityBench, the approach achieves a mean score of 9.60, nearly matching the fine-tuned upper bound of 9.61, and it attains win rates of up to 91% on the Persona-Evolve benchmark, substantially outperforming existing training-free methods.
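As a rough illustration of what contrastive activation analysis could look like, the sketch below takes the difference between the mean activations of trait-positive and trait-negative examples and then checks how much two such directions overlap. This is not the authors' code: the helper names (`contrastive_direction`, `synthetic_acts`), the 8-dimensional toy activations, and the mean-difference estimator are all assumptions made for illustration, and the synthetic data are constructed so that the two traits lie on different axes.

```python
import math
import random


def mean_vector(rows):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def contrastive_direction(pos_acts, neg_acts):
    """Unit vector pointing from the mean negative to the mean positive activation.

    Assumption: a plain mean difference stands in for whatever estimator
    the paper actually uses.
    """
    mean_pos, mean_neg = mean_vector(pos_acts), mean_vector(neg_acts)
    diff = [p - q for p, q in zip(mean_pos, mean_neg)]
    norm = math.sqrt(sum(d * d for d in diff))
    return [d / norm for d in diff]


def cosine(u, v):
    """Cosine similarity; values near 0 mean the directions are nearly orthogonal."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def synthetic_acts(axis, sign, dim=8, n=50, seed=0):
    """Hypothetical stand-in for hidden states: noise plus a shift along one axis."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        row = [rng.gauss(0.0, 0.1) for _ in range(dim)]
        row[axis] += sign * 1.0
        rows.append(row)
    return rows


# Two toy traits, each planted on its own axis of the synthetic space.
extraversion = contrastive_direction(
    synthetic_acts(0, +1, seed=1), synthetic_acts(0, -1, seed=2)
)
agreeableness = contrastive_direction(
    synthetic_acts(1, +1, seed=3), synthetic_acts(1, -1, seed=4)
)

overlap = cosine(extraversion, agreeableness)
print("cosine overlap between trait directions:", round(overlap, 3))
```

In a real model the activations would come from a chosen layer's hidden states on contrasting prompts, and whether the resulting directions are close to orthogonal is an empirical question rather than something guaranteed by construction as it is here.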

📝 Abstract

Current methods for personality control in Large Language Models rely on static prompting or expensive fine-tuning, failing to capture the dynamic and compositional nature of human traits. We introduce PERSONA, a training-free framework that achieves fine-tuning level performance through direct manipulation of personality vectors in activation space. Our key insight is that personality traits appear as extractable, approximately orthogonal directions in the model's representation space that support algebraic operations. The framework operates through three stages: Persona-Base extracts orthogonal trait vectors via contrastive activation analysis; Persona-Algebra enables precise control through vector arithmetic (scalar multiplication for intensity, addition for composition, subtraction for suppression); and Persona-Flow achieves context-aware adaptation by dynamically composing these vectors during inference. On PersonalityBench, our approach achieves a mean score of 9.60, nearly matching the supervised fine-tuning upper bound of 9.61 without any gradient updates. On our proposed Persona-Evolve benchmark for dynamic personality adaptation, we achieve up to 91% win rates across diverse model families. These results provide evidence that aspects of LLM personality are mathematically tractable, opening new directions for interpretable and efficient behavioral control.
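The three algebraic operations named in the abstract (scalar multiplication for intensity, addition for composition, subtraction for suppression) and the idea of context-dependent composition can be sketched in a few lines. Everything concrete here is an assumption for illustration: the three-dimensional one-hot `basis`, the `weights_for` lookup, the context labels, and the weight values are hypothetical and are not taken from the paper, which does not specify in this summary how Persona-Flow chooses its weights.

```python
def compose(basis, weights):
    """Weighted sum of trait vectors.

    A positive weight adds a trait, a negative weight suppresses it,
    and the magnitude of the weight sets the intensity.
    """
    dim = len(next(iter(basis.values())))
    out = [0.0] * dim
    for trait, weight in weights.items():
        for i, x in enumerate(basis[trait]):
            out[i] += weight * x
    return out


def steer(hidden, steering):
    """Shift a hidden state by adding the composed steering vector."""
    return [h + s for h, s in zip(hidden, steering)]


# Hypothetical, exactly orthogonal toy basis (real trait vectors would be
# extracted from a model and only approximately orthogonal).
basis = {
    "extraversion": [1.0, 0.0, 0.0],
    "agreeableness": [0.0, 1.0, 0.0],
    "neuroticism": [0.0, 0.0, 1.0],
}


def weights_for(context):
    """Hypothetical context-to-weights rule standing in for dynamic composition."""
    if context == "condolence":
        # Amplify agreeableness, suppress extraversion.
        return {"agreeableness": 1.5, "extraversion": -0.5}
    # Default: outgoing and mildly agreeable.
    return {"extraversion": 1.0, "agreeableness": 0.5}


hidden = [0.2, 0.2, 0.2]
steered_default = steer(hidden, compose(basis, weights_for("small_talk")))
steered_condolence = steer(hidden, compose(basis, weights_for("condolence")))

print("default:   ", steered_default)
print("condolence:", steered_condolence)
```

Because the toy basis is orthogonal, each weight changes only its own coordinate, which is the property that makes this kind of composition predictable; with approximately orthogonal vectors some cross-talk between traits would be expected.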

Problem

Research questions and friction points this paper is trying to address.

personality control

dynamic personality

compositional traits

Large Language Models

activation space

Innovation

Methods, ideas, or system contributions that make the work stand out.

activation vector algebra

personality control

training-free inference

compositional traits

dynamic adaptation

🔎 Similar Papers

AutoPal: Autonomous Adaptation to Users for Personal AI Companionship

2024-06-20

Citations: 1