Can Transformers Reason Logically? A Study in SAT Solving

📅 2024-10-09
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work investigates whether decoder-only Transformers can acquire logical reasoning capabilities for Boolean satisfiability (SAT) through training rather than hard-coded rules. Theoretically, the paper gives a formal proof by construction that such models can simulate the DPLL algorithm, including backtracking and chain-of-thought (CoT) reasoning, for 3-SAT within a non-uniform model of computation. Methodologically, it introduces PARAT, a tool that materializes this theoretical construction as a concrete model and supplies algorithmic-trajectory supervision for training. Empirically, the trained model achieves high accuracy on out-of-distribution (OOD) instances at problem sizes seen during training, but its performance degrades under sequence-length extrapolation. The work thus bridges theory and practice, providing a rigorous, verifiable framework for interpretable logical reasoning in language models that unifies formal computability guarantees with empirical learnability.
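The DPLL procedure that the construction simulates can be sketched as follows. This is a generic recursive DPLL with unit propagation and backtracking, not the paper's Transformer encoding; the clause representation (signed integers, DIMACS-style) is our own choice for illustration.

```python
def dpll(clauses, assignment=None):
    """Decide satisfiability of a CNF formula by DPLL.

    clauses: list of clauses; each clause is a list of nonzero ints,
    where k means variable k and -k means its negation (DIMACS-style).
    Returns a satisfying assignment dict {var: bool}, or None if UNSAT.
    """
    if assignment is None:
        assignment = {}

    def value(lit):
        var = abs(lit)
        if var not in assignment:
            return None
        return assignment[var] == (lit > 0)

    # Simplify clauses under the current partial assignment.
    simplified = []
    for clause in clauses:
        vals = [value(l) for l in clause]
        if any(v is True for v in vals):
            continue                      # clause already satisfied
        remaining = [l for l, v in zip(clause, vals) if v is None]
        if not remaining:
            return None                   # empty clause: conflict, backtrack
        simplified.append(remaining)

    if not simplified:
        return assignment                 # every clause satisfied

    # Unit propagation: a single-literal clause forces its assignment.
    for clause in simplified:
        if len(clause) == 1:
            lit = clause[0]
            return dpll(simplified, {**assignment, abs(lit): lit > 0})

    # Decide: branch on the first unassigned variable, backtrack on failure.
    lit = simplified[0][0]
    for choice in (True, False):
        result = dpll(simplified, {**assignment, abs(lit): choice})
        if result is not None:
            return result
    return None
```

For example, `dpll([[1, 2, 3], [-1, 2], [-2, 3], [-3]])` propagates the unit clause `[-3]`, cascades further forced assignments, reaches a conflict, and returns `None` (UNSAT).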

📝 Abstract
We formally study the logical reasoning capabilities of decoder-only Transformers in the context of the Boolean satisfiability (SAT) problem. First, we prove by construction that decoder-only Transformers can decide 3-SAT, in a non-uniform model of computation, using backtracking and deduction via Chain-of-Thought (CoT); we prove the construction's correctness by showing trace equivalence to the well-known DPLL SAT-solving algorithm. Second, we implement our construction as a PyTorch model with a tool (PARAT) that we designed to empirically demonstrate its correctness and investigate its properties. Third, rather than *programming* a Transformer to reason, we evaluate empirically whether it can be *trained* to do so by learning directly from algorithmic traces ("reasoning paths") generated by our theoretical construction. The trained models demonstrate strong out-of-distribution generalization on problem sizes seen during training but have limited length generalization, which is consistent with the implications of our theoretical result.
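To make the idea of supervising on algorithmic traces concrete, the DPLL run itself can be serialized as a sequence of reasoning events (decide, unit-propagate, conflict/backtrack). The event names and format below are invented for illustration; the actual PARAT token serialization is not specified here.

```python
def dpll_trace(clauses, assignment=None, trace=None):
    """DPLL that also records its run as a list of event strings,
    i.e. a 'reasoning path' usable as training supervision.
    Returns (sat: bool, trace: list[str])."""
    if assignment is None:
        assignment, trace = {}, []

    def value(lit):
        var = abs(lit)
        return None if var not in assignment else assignment[var] == (lit > 0)

    # Simplify clauses under the current partial assignment.
    simplified = []
    for clause in clauses:
        vals = [value(l) for l in clause]
        if any(v is True for v in vals):
            continue
        remaining = [l for l, v in zip(clause, vals) if v is None]
        if not remaining:
            trace.append("Conflict")          # empty clause reached
            return False, trace
        simplified.append(remaining)

    if not simplified:
        trace.append("SAT")
        return True, trace

    # Forced step: unit propagation (no alternative, so no backtracking here).
    for clause in simplified:
        if len(clause) == 1:
            lit = clause[0]
            trace.append(f"UnitProp x{abs(lit)}={lit > 0}")
            return dpll_trace(simplified, {**assignment, abs(lit): lit > 0}, trace)

    # Decision step with explicit backtrack events.
    lit = simplified[0][0]
    for choice in (True, False):
        trace.append(f"Decide x{abs(lit)}={choice}")
        sat, trace = dpll_trace(simplified, {**assignment, abs(lit): choice}, trace)
        if sat:
            return True, trace
        trace.append(f"Backtrack x{abs(lit)}")
    return False, trace
```

For instance, `dpll_trace([[1, 2], [-1]])` yields the trace `["UnitProp x1=False", "UnitProp x2=True", "SAT"]`; pairing each input formula with such a trace gives the kind of (input, reasoning-path) data a model can be trained on.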
Problem

Research questions and friction points this paper is trying to address.

Assess Transformers' logical reasoning in SAT solving
Prove Transformers can decide 3-SAT via CoT
Evaluate if Transformers can learn reasoning from traces
Innovation

Methods, ideas, or system contributions that make the work stand out.

Transformers solve 3-SAT
Backtracking and Chain-of-Thought
Learning from algorithmic traces
Leyan Pan
Georgia Institute of Technology, Atlanta, GA 30332, USA
Vijay Ganesh
Professor, Georgia Institute of Technology, Atlanta, GA, USA
SAT/SMT Solvers, AI, software engineering, mathematical logic, quantum foundations
Jacob Abernethy
Georgia Institute of Technology, Atlanta, GA 30332, USA; Google Research
Chris Esposo
Georgia Institute of Technology, Atlanta, GA 30332, USA
Wenke Lee
Georgia Institute of Technology, Atlanta, GA 30332, USA