The Hidden Signal of Verifier Strictness: Controlling and Improving Step-Wise Verification via Selective Latent Steering

📅 2026-05-20

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Generative verifiers used for step-wise verification are often poorly calibrated: they either accept incorrect reasoning too readily or reject valid steps too strictly. This work proposes VerifySteer, which builds on the observation that a verifier's tendency to accept or reject a solution step is encoded in its hidden states near the boundary of the corresponding verification paragraph. VerifySteer uses latent correctness signals for sample-level routing and intervenes selectively at paragraph boundaries, adjusting verification strictness without fine-tuning the model. On ProcessBench and Hard2Verify, VerifySteer outperforms prompt optimization and activation steering baselines, and is competitive with self-consistency while requiring 4–7× less inference compute. It is also complementary to verification fine-tuning, providing further gains on top of fine-tuned verifiers.

📝 Abstract

Generative verifiers have emerged as a promising paradigm for step-wise verification, but their verification behavior is often poorly calibrated: they may be under-critical and miss erroneous steps, or over-critical and reject correct reasoning. We refer to this tendency to be overly lenient or overly critical as verifier strictness. In this work, we study whether verifier strictness can be controlled through hidden-state intervention. We uncover a verification-specific hidden-state signal: in step-wise verification, a verifier's tendency to accept or reject a solution step is encoded near the boundary of the corresponding verification paragraph. Exploiting this signal, we show that hidden-state steering can directly modulate verifier strictness without fine-tuning. However, uniform steering induces a trade-off between error detection and correctness certification. To address this, we propose VerifySteer, which exploits latent correctness signals for sample-level routing and selectively intervenes on paragraph boundaries. Experiments on ProcessBench and Hard2Verify show that VerifySteer outperforms prompt optimization and activation steering baselines, and is competitive with self-consistency while requiring 4-7x less inference compute. VerifySteer is also complementary to verification fine-tuning, providing further gains on top of fine-tuned verifiers. The code is available at https://github.com/YefanZhou/VerifySteer.
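The idea of routing per sample and then steering only at paragraph boundaries can be illustrated with a toy sketch. This is not the authors' implementation (see the linked repository for that). It assumes hidden states are plain lists of floats, that a blank-line token marks the end of a verification paragraph, that a "strictness direction" vector and a scalar correctness score are already available, and that the thresholds `low`, `high`, and the strength `alpha` are hypothetical hyperparameters. All function names are illustrative.

```python
from typing import List, Sequence

Vector = List[float]


def boundary_positions(tokens: Sequence[str], marker: str = "\n\n") -> List[int]:
    """Indices of tokens that close a verification paragraph.

    Assumption: a blank-line token marks the paragraph boundary.
    """
    return [i for i, tok in enumerate(tokens) if tok == marker]


def route(correctness_score: float, low: float = -0.5, high: float = 0.5,
          alpha: float = 1.0) -> float:
    """Map a latent correctness score to a signed steering strength.

    Assumed convention: a low score suggests the solution contains an error,
    so the verifier is pushed to be stricter (+alpha); a high score suggests
    it is correct, so the verifier is pushed to be more lenient (-alpha);
    anything in between is left unsteered.
    """
    if correctness_score < low:
        return alpha
    if correctness_score > high:
        return -alpha
    return 0.0


def steer(hidden: Sequence[Vector], positions: Sequence[int],
          direction: Vector, strength: float) -> List[Vector]:
    """Add strength * direction to the hidden state at each boundary position.

    Returns a new list; the input hidden states are not modified.
    """
    steered = [list(h) for h in hidden]
    if strength == 0.0:
        return steered
    for i in positions:
        steered[i] = [h + strength * d for h, d in zip(steered[i], direction)]
    return steered


# Toy usage: two verification paragraphs, 2-dimensional hidden states.
tokens = ["Step", "1", "looks", "fine", "\n\n", "Step", "2", "\n\n"]
hidden = [[0.0, 0.0] for _ in tokens]
direction = [1.0, 0.0]  # hypothetical strictness direction

positions = boundary_positions(tokens)
strength = route(correctness_score=-0.8)  # low score: steer toward stricter
steered = steer(hidden, positions, direction, strength)
```

Applying the same strength to every sample corresponds to the uniform steering that the abstract reports as trading error detection against correctness certification; the `route` step is what makes the intervention selective in this sketch.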

Problem

Research questions and friction points this paper is trying to address.

verifier strictness

step-wise verification

generative verifiers

calibration

error detection

Innovation

Methods, ideas, or system contributions that make the work stand out.

verifier strictness

hidden-state steering

step-wise verification