Can Transformers Reason Logically? A Study in SAT Solving

📅 2024-10-09

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work investigates whether decoder-only Transformers can acquire logical reasoning capabilities for Boolean satisfiability (SAT) through training rather than hard-coded rules. Theoretically, the paper proves by construction that such models can decide 3-SAT in a non-uniform model of computation, using backtracking and deduction via chain-of-thought (CoT) in the manner of the DPLL algorithm. Methodologically, it introduces PARAT, a tool that implements this construction as a PyTorch model so that its correctness and properties can be checked empirically. Empirically, models trained on algorithmic traces from the construction generalize well out of distribution on problem sizes seen during training, but show limited length generalization, which is consistent with the theoretical result. The work thereby connects a formal expressiveness result with an empirical study of learnability.

📝 Abstract

We formally study the logical reasoning capabilities of decoder-only Transformers in the context of the Boolean satisfiability (SAT) problem. First, we prove by construction that decoder-only Transformers can decide 3-SAT, in a non-uniform model of computation, using backtracking and deduction via Chain-of-Thought (CoT). We prove its correctness by showing trace equivalence to the well-known DPLL SAT-solving algorithm. Second, we implement our construction as a PyTorch model with a tool (PARAT) that we designed to empirically demonstrate its correctness and investigate its properties. Third, rather than programming a Transformer to reason, we evaluate empirically whether it can be trained to do so by learning directly from algorithmic traces ("reasoning paths") from our theoretical construction. The trained models demonstrate strong out-of-distribution generalization on problem sizes seen during training but have limited length generalization, which is consistent with the implications of our theoretical result.
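
For readers unfamiliar with DPLL, the following is a minimal sketch of the textbook algorithm (unit-propagation deduction plus decide-and-backtrack search) that also records a step-by-step trace. It is an illustration only, not the paper's Transformer construction or PARAT: the function names `simplify` and `dpll`, the DIMACS-style integer literals, and the trace token format ("decide", "deduce", "conflict", "backtrack") are all assumptions made for this example.

```python
from itertools import product
from typing import Dict, List, Optional

Clause = List[int]            # DIMACS-style literals: 3 means x3, -3 means NOT x3
Assignment = Dict[int, bool]


def simplify(clauses: List[Clause], lit: int) -> Optional[List[Clause]]:
    """Set `lit` to true. Return None if some clause becomes empty (conflict)."""
    out = []
    for clause in clauses:
        if lit in clause:
            continue                                  # clause satisfied, drop it
        reduced = [l for l in clause if l != -lit]    # remove the falsified literal
        if not reduced:
            return None                               # every literal is false
        out.append(reduced)
    return out


def dpll(clauses: List[Clause], assignment: Assignment,
         trace: List[str]) -> Optional[Assignment]:
    """Return a satisfying partial assignment, or None if unsatisfiable.

    `trace` collects a hypothetical textual record of each step taken.
    """
    assignment = dict(assignment)                     # never mutate the caller's dict
    # Deduction: apply unit propagation until no unit clause remains.
    while True:
        unit = next((c[0] for c in clauses if len(c) == 1), None)
        if unit is None:
            break
        trace.append(f"deduce {unit}")
        assignment[abs(unit)] = unit > 0
        clauses = simplify(clauses, unit)
        if clauses is None:
            trace.append("conflict")
            return None
    if not clauses:
        return assignment                             # all clauses satisfied
    # Decision: branch on a variable, trying true first and then false.
    var = abs(clauses[0][0])
    for lit in (var, -var):
        trace.append(f"decide {lit}")
        reduced = simplify(clauses, lit)
        if reduced is None:
            trace.append("conflict")
        else:
            result = dpll(reduced, {**assignment, var: lit > 0}, trace)
            if result is not None:
                return result
        trace.append(f"backtrack {lit}")              # undo the failed decision
    return None


# Usage: one satisfiable 3-SAT instance and one unsatisfiable instance
# (all eight sign patterns over three variables).
sat_formula = [[1, 2, -3], [-1, 3, 2], [-2, -3, 1]]
unsat_formula = [[s1 * 1, s2 * 2, s3 * 3] for s1, s2, s3 in product((1, -1), repeat=3)]

sat_trace: List[str] = []
unsat_trace: List[str] = []
sat_model = dpll(sat_formula, {}, sat_trace)
unsat_model = dpll(unsat_formula, {}, unsat_trace)
print(sat_model, sat_trace)
print(unsat_model, len(unsat_trace))
```

A trace such as `sat_trace` is the kind of sequence the abstract refers to as an algorithmic trace or reasoning path, although the exact token vocabulary used in the paper is not given in this summary.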

Problem

Research questions and friction points this paper is trying to address.

Assess Transformers' logical reasoning in SAT solving

Prove Transformers can decide 3-SAT via CoT

Evaluate if Transformers can learn reasoning from traces

Innovation

Methods, ideas, or system contributions that make the work stand out.

Constructive proof that decoder-only Transformers can decide 3-SAT

Backtracking and Chain-of-Thought

Learning from algorithmic traces
