InjectRBP: Steering Large Language Model Reasoning Behavior via Pattern Injection

Date: 2026-02-12
AI Summary
Current approaches to reasoning optimization in large language models often rely on heuristic prompting and lack a systematic analysis of the models' reasoning behavior patterns. This work addresses that gap by modeling adaptive reasoning from a behavioral perspective and proposing two injection-based guidance mechanisms that require no parameter updates: InjectCorrect, which imitates behavioral patterns from the model's own past correct responses, and InjectRLOpt, which injects behaviors guided by a learned value function. To generate injection signals dynamically during inference, the authors introduce a reliability-aware softmax policy. Experiments show that both methods improve performance across diverse reasoning tasks, with gains of up to 5.34% for InjectCorrect and 8.67% for InjectRLOpt, without any changes to model parameters.
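The summary describes InjectCorrect only at a high level, so the following is a minimal sketch of the general idea rather than the authors' implementation. It assumes that each past response has already been annotated with a list of behavior labels and a correctness flag. The labels ("decompose", "verify", "guess"), the function names, and the wording of the injected hint are all hypothetical.

```python
from collections import Counter


def correct_behavior_distribution(history):
    """Empirical distribution of behaviors over past *correct* responses.

    `history` is an iterable of (behaviors, is_correct) pairs, where
    `behaviors` is a list of behavior labels (hypothetical annotation).
    """
    counts = Counter()
    for behaviors, is_correct in history:
        if is_correct:
            counts.update(behaviors)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: n / total for label, n in counts.items()}


def build_injection(distribution, top_k=2):
    """Turn the most frequent behaviors into a short textual hint.

    The hint format is an assumption made for illustration only.
    """
    # Sort by descending frequency, then by label so ties are deterministic.
    ranked = sorted(distribution.items(), key=lambda kv: (-kv[1], kv[0]))
    chosen = [label for label, _ in ranked[:top_k]]
    if not chosen:
        return ""
    return "While reasoning, favor these behaviors: " + ", ".join(chosen) + "."


history = [
    (["decompose", "verify"], True),
    (["guess"], False),
    (["verify"], True),
]
distribution = correct_behavior_distribution(history)
hint = build_injection(distribution, top_k=1)
```

In this toy history, only the two correct responses contribute, so "verify" dominates the distribution and is the behavior placed in the hint.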

Abstract
Reasoning can significantly enhance the performance of Large Language Models. While recent studies have used behavior-related prompt adjustments to enhance reasoning, these designs remain largely intuitive and lack a systematic analysis of the underlying behavioral patterns. Motivated by this, we investigate how models' reasoning behaviors shape reasoning from the perspective of behavioral patterns. We observe that models exhibit adaptive distributions of reasoning behaviors when responding to specific types of questions, and that structurally injecting these patterns can substantially influence the quality of the models' reasoning processes and outcomes. Building on these findings, we propose two optimization methods that require no parameter updates: InjectCorrect and InjectRLOpt. InjectCorrect guides the model by imitating behavioral patterns derived from its own past correct answers. InjectRLOpt learns a value function from historical behavior-pattern data and, via our proposed Reliability-Aware Softmax Policy, generates behavioral injectants during inference to steer the reasoning process. Our experiments demonstrate that both methods can improve model performance across various reasoning tasks without requiring any modifications to model parameters, achieving gains of up to 5.34% and 8.67%, respectively.
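The abstract names a Reliability-Aware Softmax Policy but does not give its formula. The sketch below shows one plausible way such a policy could work and should not be read as the paper's definition. It assumes each candidate behavior has a value estimate and a count of supporting observations, and it shrinks the value toward zero when the count is small. The shrinkage form n / (n + prior_strength), the parameter names, and the function name are all assumptions.

```python
import math


def reliability_aware_softmax(values, counts, temperature=1.0, prior_strength=5.0):
    """Softmax over value estimates, discounted by how well supported they are.

    values: estimated value of each candidate behavior.
    counts: number of historical observations behind each estimate.
    The reliability weight n / (n + prior_strength) is an assumed form,
    not taken from the paper.
    """
    if not values or len(values) != len(counts):
        raise ValueError("values and counts must be non-empty and equal in length")
    if temperature <= 0:
        raise ValueError("temperature must be positive")

    reliabilities = [n / (n + prior_strength) for n in counts]
    logits = [r * v / temperature for r, v in zip(reliabilities, values)]

    # Subtract the maximum logit for numerical stability.
    top = max(logits)
    exps = [math.exp(logit - top) for logit in logits]
    norm = sum(exps)
    return [e / norm for e in exps]


# A high value backed by no observations is ignored in favor of a
# lower value with strong support.
probs = reliability_aware_softmax(values=[5.0, 1.0], counts=[0, 100])
```

Under this assumed form, behaviors with no history all receive a logit of zero, so the policy falls back to a uniform choice among them rather than trusting unsupported estimates.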
Problem

Research questions and friction points this paper is trying to address.

reasoning behavior
behavioral patterns
large language models
prompt adjustment
reasoning enhancement
Innovation

Methods, ideas, or system contributions that make the work stand out.

behavioral pattern injection
parameter-free reasoning optimization
reliability-aware softmax
reasoning steering
large language model