Improving Symbolic Translation of Language Models for Logical Reasoning

📅 2026-01-14
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of format and semantic errors produced by small language models when translating natural language into first-order logic (FOL), which undermines the reliability of symbolic reasoning. To mitigate this, the authors propose a staged incremental reasoning framework: first, a large language model synthesizes training data to supervise fine-tuning of the small model; then, the translation process is decoupled into predicate generation and FOL formulation stages. An external verification module is introduced to detect and correct predicate arity errors, thereby enhancing translation accuracy. Evaluated on four logical reasoning benchmarks, the approach significantly reduces error rates, improves predicate coverage, and boosts overall reasoning performance, bringing small models closer to reliable, verifiable symbolic reasoning systems.

📝 Abstract
The use of formal language for deductive logical reasoning aligns well with language models (LMs), where translating natural language (NL) into first-order logic (FOL) and employing an external solver results in a verifiable and therefore reliable reasoning system. However, smaller LMs often struggle with this translation task, frequently producing incorrect symbolic outputs due to formatting and translation errors. Existing approaches typically rely on self-iteration to correct these errors, but such methods depend heavily on the capabilities of the underlying model. To address this, we first categorize common errors and fine-tune smaller LMs using data synthesized by large language models. The evaluation is performed using the defined error categories. We introduce incremental inference, which divides inference into two stages, predicate generation and FOL translation, providing greater control over model behavior and enhancing generation quality as measured by predicate metrics. This decomposition framework also enables the use of a verification module that targets predicate-arity errors to further improve performance. Our study evaluates three families of models across four logical-reasoning datasets. The comprehensive fine-tuning, incremental inference, and verification modules reduce error rates, increase predicate coverage, and improve reasoning performance for smaller LMs, moving us closer to developing reliable and accessible symbolic-reasoning systems.
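The abstract's verification module targets predicate-arity errors: cases where a predicate such as `Likes` is declared with one arity but used with another in the generated FOL. As a minimal illustrative sketch (the function names and the regex-based parsing are assumptions, not the paper's actual implementation), such a check might look like:

```python
import re

def predicate_arities(formula: str) -> dict[str, set[int]]:
    """Collect each predicate symbol with the set of arities it is used with."""
    arities: dict[str, set[int]] = {}
    # Match Name(arg1, arg2, ...) with no nested parentheses (a simplification).
    for name, args in re.findall(r"([A-Z]\w*)\(([^()]*)\)", formula):
        arity = len([a for a in args.split(",") if a.strip()])
        arities.setdefault(name, set()).add(arity)
    return arities

def arity_errors(declared: dict[str, int], formulas: list[str]) -> list[str]:
    """Report uses of a predicate whose arity differs from its declaration."""
    errors = []
    for f in formulas:
        for name, used in predicate_arities(f).items():
            expected = declared.get(name)
            for a in sorted(used):
                if expected is not None and a != expected:
                    errors.append(f"{name} used with arity {a}, declared {expected}")
    return errors
```

For example, with `declared = {"Likes": 2}`, the formula `"Likes(a) & Likes(a, b)"` would be flagged because `Likes` appears once with arity 1. A real verifier would also need to handle nested terms and quantifier scope, which this sketch ignores.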
Problem

Research questions and friction points this paper is trying to address.

symbolic translation
logical reasoning
first-order logic
language models
translation errors
Innovation

Methods, ideas, or system contributions that make the work stand out.

incremental inference
symbolic translation
predicate-arity verification
fine-tuning with synthetic data
logical reasoning
Ramya Keerthy Thatikonda
Department of Data Science & AI, Monash University
Jiuzhou Han
Monash University
Large Language Models · Language Agent · Natural Language Processing · Artificial Intelligence
W. Buntine
College of Engineering and Computer Science, VinUniversity
Ehsan Shareghi
Monash University
Natural Language Processing