🤖 AI Summary
This work addresses the challenge of accurately translating natural-language logical problems into first-order logic (FOL) expressions with large language models (LLMs), a task where existing approaches fall short on both logical correctness and syntactic consistency. To bridge this gap, we introduce LogicPO, the first preference-optimization dataset designed specifically for logical formalization, and pioneer the application of preference-learning techniques, including Direct Preference Optimization (DPO) and Kahneman-Tversky Optimization (KTO), to this task. Using open-source models such as Phi-3.5, we combine supervised fine-tuning with preference optimization to model logical structure holistically. Experiments show that our method produces 10% more logically correct outputs than GPT-3.5-turbo (8-shot) and reduces the syntax error rate by 14%, validating the effectiveness of preference learning for logical formalization.
📝 Abstract
Logical reasoning is a key task for artificial intelligence due to its role in major downstream tasks such as Question Answering and Summarization. Recent methods for improving the reasoning ability of LLMs fall short in correctly converting a natural-language reasoning problem into an equivalent logical formulation, which limits the framework's overall ability to reason. To address this, we propose fine-tuning on a preference-optimization dataset to learn to parse and represent a natural-language problem as a whole into a consistent logical program, by 1) introducing LogicPO, a new supervised and preference-optimization dataset, and 2) adopting popular techniques such as Direct Preference Optimization (DPO) and Kahneman-Tversky Optimization (KTO) to fine-tune open-source LLMs. Our best model with Phi-3.5 consistently outperforms GPT-3.5-turbo (8-shot), producing 10% more logically correct outputs with 14% fewer syntax errors. Through our framework and improved evaluation metrics, we offer a promising direction for improving the logical reasoning of LLMs by better representing problems in their logical formulations.
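The preference-optimization step described above can be sketched with the standard DPO objective applied to a preference pair of FOL translations (a correct "chosen" formula vs. a flawed "rejected" one). The log-probability values below are purely illustrative, not from the paper; a real pipeline would score full output sequences under the policy and a frozen reference model.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Direct Preference Optimization loss for one preference pair.

    Inputs are sequence log-probabilities of the chosen (correct FOL)
    and rejected (flawed FOL) translations under the policy model and
    a frozen reference model; beta scales the implicit reward.
    """
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward
    # -log sigmoid(margin): small when the policy prefers the chosen FOL
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Hypothetical log-probs for one NL -> FOL preference pair:
loss_good = dpo_loss(-5.0, -9.0, -6.0, -8.0)  # policy widened the margin
loss_bad = dpo_loss(-9.0, -5.0, -8.0, -6.0)   # policy prefers the bad FOL
assert loss_good < loss_bad
```

The loss pushes the policy to assign a larger probability margin to the correct formalization than the reference model does, which is how the whole-program consistency preference in LogicPO would be learned.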