🤖 AI Summary
This work addresses the limitations of current large language models on symbolic, rule-intensive logical reasoning tasks, where performance is often hindered by loosely structured reasoning, limited error tolerance, or reliance on format-sensitive external solvers. To overcome these challenges, the authors propose MatrixCoT, a novel framework that structures reasoning as a verifiable dependency matrix. By integrating natural-language normalization and typing, explicit reference mechanisms, and feedback-driven replanning under semantic-equivalence constraints, MatrixCoT enables structured, self-correcting symbolic reasoning without external solvers. Extensive experiments across five logical reasoning benchmarks and five large language models demonstrate that MatrixCoT not only maintains competitive performance but also significantly enhances robustness and interpretability in complex symbolic reasoning.
📝 Abstract
As knowledge and semantics on the web grow increasingly complex, enhancing the comprehension and reasoning capabilities of Large Language Models (LLMs) has become particularly important. Chain-of-Thought (CoT) prompting has been shown to improve LLM reasoning, but it still falls short on logical reasoning tasks that rely on symbolic expressions and strict deductive rules. Neuro-symbolic methods address this gap by enforcing formal correctness through external solvers. Yet these solvers are highly format-sensitive, and small instabilities in model outputs can lead to frequent processing failures. LLM-driven approaches avoid this parsing brittleness, but they lack structured representations and process-level error-correction mechanisms. To further enhance the logical reasoning capabilities of LLMs, we propose MatrixCoT, a structured CoT framework built around a matrix-based plan. Specifically, we normalize and type natural-language expressions, attach explicit citation fields, and introduce a matrix-based planning method that preserves global relations among steps. The plan thus becomes a verifiable artifact, and execution becomes more stable. For verification, we also add a feedback-driven replanning mechanism: under semantic-equivalence constraints, it identifies omissions and defects, rewrites and compresses the dependency matrix, and produces a more trustworthy final answer. Experiments on five logical-reasoning benchmarks and five LLMs show that, without relying on external solvers, MatrixCoT enhances both the robustness and interpretability of LLMs on complex symbolic reasoning tasks while maintaining competitive performance.
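
To make the idea of a "verifiable dependency matrix" concrete, here is a minimal illustrative sketch (not the paper's implementation) of how typed steps with explicit citation fields could be turned into a dependency matrix and checked for simple structural defects. All field names (`kind`, `cites`, etc.) and the checks are assumptions introduced for illustration only.

```python
# Illustrative sketch only: field names and checks are assumptions,
# not the actual MatrixCoT implementation from the paper.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Step:
    """One typed reasoning step with explicit citations of prior steps."""
    idx: int                      # position in the plan
    kind: str                     # e.g. "premise", "derivation", "conclusion"
    statement: str                # normalized natural-language statement
    cites: List[int] = field(default_factory=list)  # indices this step depends on

def dependency_matrix(steps: List[Step]) -> List[List[int]]:
    """Build an n x n matrix M where M[i][j] = 1 iff step i cites step j."""
    n = len(steps)
    matrix = [[0] * n for _ in range(n)]
    for step in steps:
        for j in step.cites:
            matrix[step.idx][j] = 1
    return matrix

def check_plan(steps: List[Step]) -> List[str]:
    """Flag simple structural defects: forward references and uncited non-premises."""
    issues = []
    for step in steps:
        if any(j >= step.idx for j in step.cites):
            issues.append(f"step {step.idx} cites a later or same step")
        if step.kind != "premise" and not step.cites:
            issues.append(f"step {step.idx} has no citations")
    return issues

if __name__ == "__main__":
    plan = [
        Step(0, "premise", "All birds can fly."),
        Step(1, "premise", "Tweety is a bird."),
        Step(2, "derivation", "Tweety can fly.", cites=[0, 1]),
    ]
    print(dependency_matrix(plan))  # [[0, 0, 0], [0, 0, 0], [1, 1, 0]]
    print(check_plan(plan))         # [] -> no structural defects found
```

In this reading, the feedback-driven replanning described in the abstract would act on such a matrix: defects flagged by checks like the above trigger a rewrite and compression of the plan before the final answer is produced, though the paper's actual criteria may differ.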