InfoSteer: Steering Information Utility in Language Model Post-Training

📅 2025-07-07
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the insufficient utilization of pretraining knowledge and the limited effectiveness of supervised fine-tuning in language model post-training, this paper proposes InfoSteer. Methodologically, it treats feed-forward network (FFN) layers as addressable key-value memories and regulates hidden-state information allocation via forward-pass intervention or backpropagation regularization. This mechanism steers the model to activate and compose pretrained semantic knowledge more efficiently on both in-distribution (ID) and out-of-distribution (OOD) tasks, thereby enhancing generation quality and the interpretability of internal representations. Evaluated across multiple architectures, including Qwen, Gemma, and Llama, InfoSteer achieves consistent performance gains on 15 downstream tasks while substantially improving parameter-wise information efficiency.

📝 Abstract
Recent advancements in language models (LMs) have gradually ushered in an era where post-training is crucial. Yet post-training approaches such as supervised fine-tuning (SFT) do not guarantee effective use of knowledge acquired during pretraining. We therefore present InfoSteer, a lightweight method that encourages parametric information utilization in LMs during post-training. This is achieved by treating FFN layers as associative key-value memories and promoting the use of stored memory vectors via forward-pass interventions or regularization during backpropagation. We find this simple guidance during the post-training phase delivers consistent performance improvements across diverse model families, including Qwen, Gemma, and Llama, spanning over 15 downstream tasks in both ID and OOD evaluations. Beyond performance gains, we also find that steered LMs can adaptively allocate information, placing more emphasis on generating semantically meaningful tokens while spending fewer resources on simple transition ones (e.g., "," or "and"). Our work underscores that vanilla post-training does not fully leverage pretraining potential, and that steering LMs in latent representation space offers a promising approach to enhancing both performance and interpretability.
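The key-value view of FFN layers that the abstract relies on can be sketched in a few lines: each key row scores the hidden state, and the FFN output is an activation-weighted sum of value (memory) vectors. The `boost` intervention below is a hypothetical stand-in for a forward-pass steering step (the page does not specify InfoSteer's exact intervention), written in plain NumPy for self-containment.

```python
import numpy as np

def ffn_as_memory(x, K, V, boost=None, k=4):
    """Transformer FFN viewed as an associative key-value memory:
    rows of K are keys, rows of V the corresponding value (memory)
    vectors, and the output is an activation-weighted sum of values.
    `boost` amplifies the top-k memory coefficients -- a hypothetical
    stand-in for a forward-pass steering intervention."""
    m = np.maximum(x @ K.T, 0.0)       # memory coefficients (ReLU over key scores)
    if boost is not None:
        top = np.argsort(m)[-k:]       # indices of the k strongest memories
        m = m.copy()
        m[top] *= boost                # amplify the most-activated memories
    return m @ V                       # weighted sum of value vectors

rng = np.random.default_rng(0)
d, n_mem = 8, 32
K = rng.standard_normal((n_mem, d))   # toy key matrix
V = rng.standard_normal((n_mem, d))   # toy value matrix
x = rng.standard_normal(d)            # toy hidden state

plain = ffn_as_memory(x, K, V)
steered = ffn_as_memory(x, K, V, boost=1.5)
```

A real intervention would act inside a trained transformer (e.g., via a forward hook on each FFN block) rather than on random toy matrices, but the coefficient-reweighting idea is the same.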
Problem

Research questions and friction points this paper is trying to address.

Insufficient utilization of pretraining knowledge in post-trained language models
Limited effectiveness of SFT at activating FFN-layer memories
Inefficient information allocation between semantic and simple transition tokens
Innovation

Methods, ideas, or system contributions that make the work stand out.

Treats FFN layers as associative key-value memories
Promotes memory use via forward-pass interventions or backpropagation regularization
Improves both performance and interpretability through adaptive information allocation
Chunyuan Deng
Dept. of Computer Science, Rice University, Houston, TX 77005
Ruidi Chang
Rice University
Natural Language Processing, Machine Learning Interpretability
Hanjie Chen
Dept. of Computer Science, Rice University, Houston, TX 77005