🤖 AI Summary
Language models often suffer from hidden errors in chain-of-thought (CoT) reasoning, undermining the reliability of their inference. To address this, we propose a self-correction framework that jointly models the truthfulness of each reasoning step, treated as a latent variable, and the final answer, enabling end-to-end localization and correction of erroneous steps. The approach introduces an efficient discrete search algorithm over Boolean truth assignments and a generalizable zero-shot truthfulness discriminator, which evaluates step-level credibility without requiring additional annotations. The method combines approximate posterior inference, joint likelihood modeling with the language model, pseudo-label generation, and supervised fine-tuning. Extensive experiments on ProntoQA and GSM8K demonstrate its effectiveness: it reliably identifies flawed reasoning steps and improves final-answer accuracy by up to 25% in zero-shot settings.
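The pseudo-label generation step described above can be sketched in a few lines. This is an illustrative toy, not the paper's implementation: `exhaustive_search` is a brute-force stand-in for the discrete search, and `score_fn` is a placeholder for the LM's joint likelihood over the veracity assignment and the final answer.

```python
from itertools import product

def exhaustive_search(n_steps, score):
    # Brute-force maximization over Boolean assignments (fine for short
    # chains; stands in for the paper's more efficient discrete search).
    best = max(product([True, False], repeat=n_steps), key=score)
    return list(best), score(best)

def make_pseudo_labels(chains, score_fn):
    """Turn search results into (step, veracity) pseudo-labels that could
    supervise fine-tuning of an amortized step-level discriminator.

    `score_fn(chain, assignment)` is a hypothetical placeholder for the
    LM's joint likelihood used as a proxy reward.
    """
    dataset = []
    for chain in chains:
        assignment, _ = exhaustive_search(len(chain), lambda a: score_fn(chain, a))
        dataset.extend(zip(chain, assignment))  # one (step_text, bool) per step
    return dataset
```

With a toy `score_fn` that rewards labeling a known-bad step as false, the returned pairs are exactly the supervised examples an amortized corrector would train on.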
📝 Abstract
Chain-of-Thought (CoT) reasoning has advanced the capabilities and transparency of language models (LMs); however, reasoning chains can contain inaccurate statements that reduce performance and trustworthiness. To address this, we introduce a new self-correction framework that augments each reasoning step in a CoT with a latent variable indicating its veracity, enabling modeling of all possible truth assignments rather than assuming correctness throughout. To explore this expanded space efficiently, we propose Search Corrector, a discrete search algorithm over Boolean-valued veracity assignments. It performs otherwise intractable inference in the posterior distribution over veracity assignments by leveraging the LM's joint likelihood over veracity and the final answer as a proxy reward. This efficient inference-time correction method facilitates supervised fine-tuning of an Amortized Corrector by providing pseudo-labels for veracity. The Amortized Corrector generalizes self-correction, enabling accurate zero-shot veracity inference in novel contexts. Empirical results demonstrate that Search Corrector reliably identifies errors in logical (ProntoQA) and mathematical (GSM8K) reasoning benchmarks. The Amortized Corrector achieves comparable zero-shot accuracy and improves final answer accuracy by up to 25%.
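The discrete search over Boolean veracity assignments can be sketched as a greedy bit-flip hill climb. This is a minimal illustration, not the paper's exact algorithm: the function name `search_corrector` and the one-bit-at-a-time strategy are assumptions, and `score` is a placeholder for the LM's joint likelihood over veracity and the final answer (the proxy reward).

```python
def search_corrector(n_steps, score, max_iters=50):
    """Greedy bit-flip search over Boolean veracity assignments.

    `score(assignment)` is a hypothetical stand-in for the LM's joint
    likelihood of the assignment together with the final answer.
    """
    current = [True] * n_steps               # start by trusting every step
    best = score(current)
    for _ in range(max_iters):
        improved = False
        for i in range(n_steps):
            candidate = current.copy()
            candidate[i] = not candidate[i]  # flip one step's veracity
            s = score(candidate)
            if s > best:
                current, best, improved = candidate, s, True
        if not improved:                     # local optimum reached: stop
            break
    return current, best
```

With a toy score that measures agreement with a known assignment, the search flips exactly the mislabeled bits; each evaluation of `score` here corresponds to one LM likelihood query, which is why an amortized discriminator is attractive at inference time.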