Contextual Linear Activation Steering of Language Models

📅 2026-04-27

🤖 AI Summary
Existing linear activation steering methods apply a fixed steering strength to all tokens, so they fail to adapt to varying input contexts and steer inconsistently across prompts. This work proposes Contextual Linear Activation Steering (CLAS), which introduces a dynamic steering-strength mechanism that adjusts the strength based on the input context without requiring additional trainable parameters. Experiments across 11 steering benchmarks and 4 model families show that CLAS consistently outperforms standard linear activation steering and, with limited labeled data, matches or exceeds parameter-efficient methods such as ReFT and LoRA. These results position CLAS as a more precise and data-efficient strategy for guiding model behavior.

📝 Abstract
Linear activation steering is a powerful approach for eliciting the capabilities of large language models and specializing their behavior using limited labeled data. While effective, existing methods often apply a fixed steering strength to all tokens, resulting in inconsistent steering quality across diverse input prompts. In this work, we introduce Contextual Linear Activation Steering (CLAS), a method that dynamically adapts linear activation steering to context-dependent steering strengths. Across eleven steering benchmarks and four model families, it consistently outperforms standard linear activation steering and matches or exceeds the performance of ReFT and LoRA in settings with limited labeled data. We therefore propose CLAS as a scalable, interpretable, and accurate method for specializing and steering large language models.
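To make the contrast between a fixed and a context-dependent steering strength concrete, here is a minimal NumPy sketch. The abstract does not specify how CLAS computes its strengths, so the per-token rule in `steer_contextual` (push each token's projection onto the steering direction to a common `target`) is purely an illustrative assumption and not the paper's method. The function names, the `target` parameter, and the toy activations are all hypothetical.

```python
import numpy as np


def steer_fixed(hidden, direction, strength):
    """Standard linear steering: add the same multiple of `direction` to every token."""
    return hidden + strength * direction


def steer_contextual(hidden, direction, target):
    """Illustrative context-dependent steering (an assumption, not the CLAS rule).

    Each token receives its own strength, chosen so that its projection onto
    the unit steering direction ends up equal to `target`.
    """
    unit = direction / np.linalg.norm(direction)
    projection = hidden @ unit              # shape: (tokens,)
    strengths = target - projection         # one strength per token
    steered = hidden + strengths[:, None] * unit
    return steered, strengths


# Toy activations: 3 tokens, hidden size 2 (hypothetical values).
hidden = np.array([[1.0, 0.0], [0.0, 2.0], [-1.0, 1.0]])
direction = np.array([2.0, 0.0])

fixed = steer_fixed(hidden, direction, strength=2.0)
contextual, strengths = steer_contextual(hidden, direction, target=3.0)
```

In this toy case the fixed variant shifts every token by the same vector, so tokens that already point along the direction overshoot, while the contextual variant applies a smaller strength to tokens that are already aligned and a larger one to tokens that are not.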

Problem

Research questions and friction points this paper is trying to address.

linear activation steering
context-dependent steering
language model specialization
steering consistency
limited labeled data

Innovation

Methods, ideas, or system contributions that make the work stand out.

Contextual Linear Activation Steering
dynamic steering strength
large language models
few-shot specialization
interpretable control