FaithRL: Learning to Reason Faithfully through Step-Level Faithfulness Maximization

📅 2026-02-03
📈 Citations: 1
Influential: 0
🤖 AI Summary
This work addresses a limitation of existing reinforcement learning approaches to multi-step reasoning: they rely solely on sparse final rewards and provide no supervision over intermediate steps, which often leads to overconfidence, spurious reasoning, and hallucination. To tackle this, the authors propose FaithRL, a framework that, for the first time, formalizes the objective of maximizing reasoning faithfulness. FaithRL couples a geometric reward design with a faithfulness-aware advantage modulation mechanism, enabling fine-grained step-level credit assignment that suppresses unjustified inferences while preserving valid deductions. Theoretical analysis shows that the method mitigates overconfidence, and experiments across multiple large language models and benchmarks show a significant reduction in hallucination rates without compromising (and sometimes even improving) answer accuracy, demonstrating strong reasoning faithfulness and generalization.
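For intuition only, here is one plausible reading of the "geometric reward" idea; this is a sketch under assumptions, not the paper's stated formula. Assuming per-step faithfulness scores $f_1, \dots, f_T \in [0, 1]$ for the $T$ steps of a trajectory and a verifiable outcome reward $r$, a geometric aggregation of step scores would be

$$R \;=\; r \cdot \left( \prod_{t=1}^{T} f_t \right)^{1/T},$$

so that a single unsupported step ($f_t$ near 0) collapses the trajectory reward, while a uniformly faithful chain retains nearly the full outcome reward. The symbols $f_t$, $T$, and $r$ are introduced here for illustration; the paper's exact formulation may differ.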

📝 Abstract
Reinforcement Learning with Verifiable Rewards (RLVR) has markedly improved the performance of Large Language Models (LLMs) on tasks requiring multi-step reasoning. However, most RLVR pipelines rely on sparse outcome-based rewards, providing little supervision over intermediate steps and thus encouraging over-confidence and spurious reasoning, which in turn increases hallucinations. To address this, we propose FaithRL, a general reinforcement learning framework that directly optimizes reasoning faithfulness. We formalize a faithfulness-maximization objective and theoretically show that optimizing it mitigates over-confidence. To instantiate this objective, we introduce a geometric reward design and a faithfulness-aware advantage modulation mechanism that assigns step-level credit by penalizing unsupported steps while preserving valid partial derivations. Across diverse backbones and benchmarks, FaithRL consistently reduces hallucination rates while maintaining (and often improving) answer correctness. Further analysis confirms that FaithRL increases step-wise reasoning faithfulness and generalizes robustly. Our code is available at https://github.com/aintdoin/FaithRL.
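The advantage-modulation idea from the abstract can be illustrated with a minimal sketch. Assumptions not taken from the paper: a GRPO-style group-relative outcome advantage as the starting point, per-step faithfulness scores supplied by some external scorer, and the function name and scaling rule below, all of which are hypothetical rather than FaithRL's actual implementation.

```python
import numpy as np

def modulated_advantages(outcome_rewards, step_faith, eps=1e-8):
    """Hypothetical sketch of faithfulness-aware advantage modulation.

    outcome_rewards: shape (G,), verifiable rewards for G sampled answers.
    step_faith: list of length G; step_faith[i] holds per-step faithfulness
        scores in [0, 1] for trajectory i (assumed given by some scorer).
    Returns a list of per-step advantage arrays, one per trajectory.
    """
    r = np.asarray(outcome_rewards, dtype=float)
    # Group-relative outcome advantage (GRPO-style baseline).
    adv = (r - r.mean()) / (r.std() + eps)
    modulated = []
    for a, f in zip(adv, step_faith):
        f = np.asarray(f, dtype=float)
        # Positive credit is gated by faithfulness; negative credit is
        # concentrated on unsupported (low-f) steps, sparing valid ones.
        scale = np.where(a >= 0.0, f, 1.0 - f)
        modulated.append(a * scale)
    return modulated

# Example: three sampled answers, the first two correct.
advs = modulated_advantages([1.0, 1.0, 0.0],
                            [[0.9, 0.8], [0.2, 0.9], [0.1, 0.7]])
```

Under this scaling rule, positive credit flows only through steps judged faithful, while penalties concentrate on unsupported steps, which matches the abstract's stated goal of penalizing unsupported steps while preserving valid partial derivations.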
Problem

Research questions and friction points this paper is trying to address.

reasoning faithfulness · hallucination · reinforcement learning · step-level supervision · large language models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Faithful Reasoning · Reinforcement Learning · Step-Level Reward · Hallucination Reduction · Advantage Modulation
Authors

Runquan Gui (University of Science and Technology of China)
Yafu Li (The Chinese University of Hong Kong; Reasoning, Trustworthy AI, Multilinguality)
Xiaoye Qu (Shanghai AI Lab)
Ziyan Liu (University of Science and Technology of China)
Yeqiu Cheng (University of Science and Technology of China)
Yu Cheng (Professor of Computer Science and Engineering, The Chinese University of Hong Kong; Deep Generative Models, Multimodal Learning, Model Compression)