🤖 AI Summary
This work addresses the instability of chain-of-thought (CoT) reasoning across diverse tasks and its frequent lack of faithfulness—i.e., generated answers not being fully supported by the intermediate reasoning steps. To this end, the authors introduce a step-level causal modeling framework that explicitly distinguishes centralized versus distributed reasoning paradigms, uncovering causal dependencies among the context, reasoning chain, and final answer. They propose "inferential bridging", a method that combines attribution analysis and semantic consistency as dual criteria to jointly optimize CoT filtering and prompt enhancement. The pipeline comprises context retrieval, CoT generation, and step-aware re-ranking. Experiments show improvements in both reasoning faithfulness and answer accuracy across multiple benchmarks, along with cross-task generalization. The core contributions are: (1) step-granular causal modeling of CoT reasoning, and (2) a dual-criterion co-optimization framework for faithful, robust inference.
📝 Abstract
Large language models (LLMs) suffer from serious chain-of-thought (CoT) unfaithfulness issues. Previous work attempts to measure and explain this phenomenon but lacks in-depth analysis within CoTs and does not consider the interactions among all reasoning components jointly. In this paper, we first study the CoT faithfulness issue at the granularity of individual CoT steps, identify two reasoning paradigms—centralized reasoning and distributed reasoning—and characterize their relationship to faithfulness. We then conduct a joint analysis of the causal relevance among the context, the CoT, and the answer during reasoning. The results show that, when predicting an answer, the LLM can recall correct information that is missing from the CoT directly from the context, which leads to unfaithfulness. Finally, we propose the inferential bridging method to mitigate this issue: we use an attribution method to recall information from the context as hints for CoT generation, and we filter out noisy CoTs based on their semantic consistency and attribution scores. Extensive experiments demonstrate that our approach effectively alleviates the unfaithful CoT problem.
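The dual-criterion filtering step described above can be sketched in miniature. The snippet below is an illustrative stand-in, not the paper's implementation: it uses simple token overlap as a proxy for semantic consistency, takes precomputed attribution scores as given (the paper derives these from a model-based attribution method), and the weighting scheme, `alpha`, and `threshold` values are all hypothetical.

```python
def token_overlap(a: str, b: str) -> float:
    """Jaccard overlap of word sets: a crude stand-in for semantic consistency."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def filter_cots(context, cots, attribution_scores, alpha=0.5, threshold=0.3):
    """Keep CoTs whose combined consistency/attribution score clears a threshold.

    context            -- the retrieved context passage
    cots               -- candidate chain-of-thought strings
    attribution_scores -- one precomputed attribution score per CoT (0..1)
    alpha              -- hypothetical weight balancing the two criteria
    """
    kept = []
    for cot, attr in zip(cots, attribution_scores):
        consistency = token_overlap(context, cot)
        score = alpha * consistency + (1 - alpha) * attr
        if score >= threshold:
            kept.append((cot, score))
    # Rank surviving chains by combined score, analogous to step-aware re-ranking
    return sorted(kept, key=lambda item: item[1], reverse=True)
```

For example, given a context about a cat on a mat, a CoT that restates context tokens and carries a high attribution score survives, while an unrelated, low-attribution chain is filtered out.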