🤖 AI Summary
This work addresses the challenge of achieving both high accuracy and interpretability when large language models (LLMs) perform complex reasoning tasks. To this end, the authors propose CaRing, a neuro-symbolic framework. CaRing translates Prolog symbolic execution logs into human-readable, causally consistent reasoning proofs through three core components: LLM-to-logic translation, symbolic execution log parsing, and joint neuro-symbolic optimization, which lets neural and symbolic reasoning reinforce each other. The framework targets the limitations of purely neural approaches (e.g., opacity) and of purely symbolic systems (e.g., poor readability and scalability). Evaluated on two logical reasoning benchmarks and one arithmetic reasoning benchmark, CaRing achieves significant improvements in both answer accuracy and proof correctness over strong baselines. The implementation is publicly available.
📝 Abstract
Two lines of work address complex reasoning with LLMs. The first prompts LLMs with various reasoning structures, whose structured outputs can naturally be regarded as intermediate reasoning steps. The second adopts LLM-free declarative solvers to carry out the reasoning, which yields higher reasoning accuracy but lacks interpretability because of the black-box nature of the solvers. To resolve this trade-off between answer accuracy and interpretability, we present a simple extension to the second line of work. Specifically, we show that the intermediate search logs generated by Prolog interpreters can be accessed and interpreted into human-readable reasoning proofs. As long as the LLM correctly translates the problem description into a Prolog representation, the corresponding reasoning proof is guaranteed to be causal and reliable. On two logical reasoning datasets and one arithmetic reasoning dataset, our framework obtains significant improvements in both answer accuracy and reasoning proof accuracy. Our code is released at https://github.com/DAMO-NLP-SG/CaRing
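To make the idea of turning a solver's search log into a proof concrete, here is a minimal sketch in Python. It is not the CaRing implementation and does not call a real Prolog interpreter. It assumes a toy knowledge base of ground (variable-free) Horn clauses, and the names `solve`, `log_to_proof`, and the example facts about `tweety` are all hypothetical. The solver records call/exit/fail events in the spirit of a Prolog trace, and the proof is rebuilt from the successful exit events only.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy knowledge base: each head maps to its alternative rule bodies.
# An empty body means the head is a given fact. Atoms are ground strings, so no
# unification is needed (a simplification compared with real Prolog).
Rules = Dict[str, List[List[str]]]
LogEntry = Tuple[str, int, str, Optional[Tuple[str, ...]]]


def solve(goal: str, rules: Rules, log: List[LogEntry], depth: int = 0) -> bool:
    """Depth-first backward chaining that appends call/exit/fail events to log.

    No cycle detection is done, so the rule set is assumed to be acyclic.
    """
    log.append(("call", depth, goal, None))
    for body in rules.get(goal, []):
        if all(solve(sub, rules, log, depth + 1) for sub in body):
            # Record which rule body succeeded so the proof can cite it later.
            log.append(("exit", depth, goal, tuple(body)))
            return True
    log.append(("fail", depth, goal, None))
    return False


def log_to_proof(goal: str, log: List[LogEntry]) -> List[str]:
    """Turn the successful part of the search log into readable proof steps."""
    used: Dict[str, Tuple[str, ...]] = {}
    for event, _depth, head, body in log:
        if event == "exit" and head not in used and body is not None:
            used[head] = body
    if goal not in used:
        return []  # the goal was never proved, so there is no proof to show

    steps: List[str] = []
    seen = set()

    def walk(atom: str) -> None:
        if atom in seen:
            return
        seen.add(atom)
        for sub in used[atom]:
            walk(sub)  # premises are listed before the conclusion they support
        if used[atom]:
            steps.append(f"{atom} holds because {' and '.join(used[atom])}.")
        else:
            steps.append(f"{atom} is given.")

    walk(goal)
    return steps


rules: Rules = {
    "bird(tweety)": [[]],
    "healthy(tweety)": [[]],
    "can_fly(tweety)": [["bird(tweety)", "healthy(tweety)"]],
    # The first alternative fails because fed(tweety) is not in the knowledge base.
    "happy(tweety)": [["fed(tweety)"], ["can_fly(tweety)"]],
}

log: List[LogEntry] = []
proved = solve("happy(tweety)", rules, log)
proof = log_to_proof("happy(tweety)", log)
for line in proof:
    print(line)
```

The failed attempt on `fed(tweety)` stays in the raw log but does not appear in the rendered proof, which keeps only the steps that actually support the answer. In this sketch the proof is only as trustworthy as the rules it is given, which mirrors the abstract's condition that the problem must first be translated correctly.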