🤖 AI Summary
Large language models (LLMs) excel at material reasoning (i.e., contextual plausibility) but lack formal rigour and verifiability. Method: This paper proposes PEIRCE, an LLM-driven neuro-symbolic iterative refinement framework grounded in Peircean conjecture-criticism cycles. It enables LLMs to jointly generate natural-language explanations and formal-language specifications, tightly coupling symbolic theorem provers (Lean/Coq) with soft evaluators of plausibility, coherence, and parsimony. Contribution/Results: Through neuro-symbolic co-refinement, the framework unifies material and formal reasoning. Evaluated on natural-language explanation generation, it achieves a 32.7% absolute gain in logical correctness and a 28.4% improvement in semantic plausibility, yielding end-to-end verifiable, interpretable, and iteratively refinable hybrid reasoning.
📝 Abstract
A persistent challenge in AI is the effective integration of material and formal inference: the former concerns the plausibility and contextual relevance of arguments, while the latter concerns their logical and structural validity. Large Language Models (LLMs), by virtue of their extensive pre-training on large textual corpora, exhibit strong capabilities in material inference. However, their reasoning often lacks formal rigour and verifiability. At the same time, LLMs' linguistic competence positions them as a promising bridge between natural and formal languages, opening new opportunities for combining these two modes of reasoning. In this paper, we introduce PEIRCE, a neuro-symbolic framework designed to unify material and formal inference through an iterative conjecture-criticism process. Within this framework, LLMs play the central role of generating candidate solutions in natural and formal languages, which are then evaluated and refined through interaction with external critique models. These critiques include symbolic provers, which assess formal validity, as well as soft evaluators that measure the quality of the generated arguments along linguistic and epistemic dimensions such as plausibility, coherence, and parsimony. While PEIRCE is a general-purpose framework, we demonstrate its capabilities in the domain of natural language explanation generation, a setting that inherently demands both material adequacy and formal correctness.
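The conjecture-criticism loop described above can be sketched as a simple generate-critique-refine cycle. The sketch below is illustrative only: all function names (`llm_generate`, `prover_check`, `soft_score`) are hypothetical stubs standing in for the LLM, the symbolic prover (Lean/Coq), and the soft evaluators, not the paper's actual implementation.

```python
# Minimal sketch of a conjecture-criticism refinement loop.
# All component names are hypothetical; stubs replace the real
# LLM, symbolic prover, and soft evaluators.

from dataclasses import dataclass

@dataclass
class Candidate:
    explanation: str   # natural-language explanation
    formal_spec: str   # formal-language counterpart (e.g. a Lean/Coq statement)

def llm_generate(problem, feedback=None):
    # Stub: a real system would prompt an LLM, conditioning on prior critiques.
    suffix = " (refined)" if feedback else ""
    return Candidate(f"explanation for {problem}{suffix}", f"spec for {problem}{suffix}")

def prover_check(candidate):
    # Stub: a real system would invoke a symbolic prover to check formal validity.
    return "(refined)" in candidate.formal_spec

def soft_score(candidate):
    # Stub: plausibility/coherence/parsimony evaluators, aggregated into [0, 1].
    return 0.9 if "(refined)" in candidate.explanation else 0.5

def conjecture_criticism(problem, max_iters=5, threshold=0.8):
    """Generate candidates and refine them until critiques are satisfied."""
    feedback = None
    cand = None
    for _ in range(max_iters):
        cand = llm_generate(problem, feedback)
        if prover_check(cand) and soft_score(cand) >= threshold:
            return cand  # formally valid and materially adequate
        feedback = "prover or soft evaluator rejected the candidate"
    return cand  # best effort once the refinement budget is exhausted
```

In this toy run, the first candidate fails the prover stub, the critique feedback triggers one refinement, and the second candidate passes both checks, mirroring the iterative structure the abstract describes.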