Stepwise Think-Critique: A Unified Framework for Robust and Interpretable LLM Reasoning

📅 2025-12-17

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Current large language models (LLMs) typically decouple reasoning from verification in complex reasoning tasks: they either omit self-checking entirely or rely on external verifiers, which leads to delayed feedback, architectural redundancy, and poor co-optimization. To address this, we propose Stepwise Think-Critique (STC), a unified framework that combines fine-grained, step-level reasoning and self-critique within a single model. Its core contributions are: (1) an interleaved Think-Critique architecture that enables immediate, verifiable self-assessment after each reasoning step; (2) a hybrid reinforcement learning objective that jointly optimizes reasoning quality and critique consistency; and (3) a critique-consistency reward coupled with multi-stage trajectory supervision. Experiments on mathematical reasoning benchmarks indicate that STC improves both answer accuracy and error detection, yielding more transparent, traceable, and robust reasoning chains and advancing the development of endogenous critical capabilities in LLMs.

📝 Abstract

Human beings solve complex problems through critical thinking, where reasoning and evaluation are intertwined to converge toward correct solutions. However, most existing large language models (LLMs) decouple reasoning from verification: they either generate reasoning without explicit self-checking or rely on external verifiers to detect errors post hoc. The former lacks immediate feedback, while the latter increases system complexity and hinders synchronized learning. Motivated by human critical thinking, we propose Stepwise Think-Critique (STC), a unified framework that interleaves reasoning and self-critique at each step within a single model. STC is trained with a hybrid reinforcement learning objective combining reasoning rewards and critique-consistency rewards to jointly optimize reasoning quality and self-evaluation. Experiments on mathematical reasoning benchmarks show that STC demonstrates strong critical-thinking capabilities and produces more interpretable reasoning traces, representing a step toward LLMs with built-in critical thinking.
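
The abstract describes the training signal only at a high level: a reasoning reward combined with a critique-consistency reward. A minimal sketch of how such a combination could look is given below. The function name `hybrid_reward`, the binary rewards, and the weights `w_reason` and `w_critique` are assumptions made for illustration; the summary does not state the paper's actual formulation.

```python
def hybrid_reward(answer_correct: bool,
                  critique_says_correct: bool,
                  w_reason: float = 1.0,
                  w_critique: float = 0.5) -> float:
    """Toy hybrid reward (hypothetical form, not the paper's definition).

    answer_correct:        whether the final answer matches the ground truth.
    critique_says_correct: the model's own verdict on its answer.
    w_reason, w_critique:  illustrative weights, chosen arbitrarily here.
    """
    # Reasoning reward: 1 if the final answer is right, else 0.
    reasoning_reward = 1.0 if answer_correct else 0.0
    # Critique-consistency reward: 1 if the self-critique agrees with
    # the actual outcome (it flags wrong answers and accepts right ones).
    critique_reward = 1.0 if critique_says_correct == answer_correct else 0.0
    return w_reason * reasoning_reward + w_critique * critique_reward


# A wrong answer that the model correctly flags still earns the critique term.
print(hybrid_reward(answer_correct=False, critique_says_correct=False))
```

Under this assumed form, a model is rewarded both for solving the problem and for judging its own output accurately, which is one way to read "jointly optimize reasoning quality and self-evaluation".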

Problem

Research questions and friction points this paper is trying to address.

Most LLMs decouple reasoning from self-critique instead of integrating them

External verifiers detect errors only post hoc and add system complexity

Reasoning traces offer limited interpretability and robustness in reasoning tasks

Innovation

Methods, ideas, or system contributions that make the work stand out.

Interleaves reasoning and self-critique per step

Uses hybrid reinforcement learning for joint optimization

Produces interpretable reasoning traces via built-in critique
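
The per-step interleaving listed above can be pictured as a loop in which every reasoning step is followed immediately by a verdict on that step. The sketch below is illustrative only: `think_critique`, `toy_think`, and `toy_critique` are hypothetical names, and the two toy callables stand in for what STC produces with a single model.

```python
from typing import Callable, List, Optional, Tuple

Trace = List[Tuple[str, bool]]  # (reasoning step, critique verdict)


def think_critique(problem: str,
                   think: Callable[[str, Trace], Optional[str]],
                   critique: Callable[[str, Trace, str], bool],
                   max_steps: int = 8) -> Trace:
    """Alternate one reasoning step with one critique of that step."""
    trace: Trace = []
    for _ in range(max_steps):
        step = think(problem, trace)
        if step is None:  # the thinker signals it has finished
            break
        verdict = critique(problem, trace, step)
        trace.append((step, verdict))
    return trace


# Toy stand-ins (assumptions): a scripted thinker and an arithmetic checker.
SCRIPT = ["2 + 3 = 5", "5 * 4 = 21"]


def toy_think(problem: str, trace: Trace) -> Optional[str]:
    return SCRIPT[len(trace)] if len(trace) < len(SCRIPT) else None


def toy_critique(problem: str, trace: Trace, step: str) -> bool:
    lhs, rhs = step.split("=")
    a, op, b = lhs.split()
    value = int(a) + int(b) if op == "+" else int(a) * int(b)
    return value == int(rhs)


for step, ok in think_critique("(2 + 3) * 4", toy_think, toy_critique):
    print(f"{step!r}: {'ok' if ok else 'flagged'}")
```

Keeping each verdict next to the step it judges is what makes such a trace easy to inspect: the faulty second step is flagged at the point where it occurs rather than after the final answer.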

🔎 Similar Papers

Semantic Self-Consistency: Enhancing Language Model Reasoning via Semantic Weighting