🤖 AI Summary
Large language models (LLMs) exhibit a "syntactic blind spot" in mathematical reasoning: they systematically fail on semantically simple problems whose syntactic structures deviate from their training distribution, misapplying familiar heuristics. These failures stem not from insufficient mathematical competence but from a fragile coupling between surface syntax and internal representations.
Method: The authors apply Dependency Locality Theory (DLT) from psycholinguistics to quantify syntactic complexity, and propose a semantics-preserving syntactic template rewriting method that rephrases incorrectly answered problems using templates drawn from correctly answered ones.
Contribution/Results: Experiments show substantial accuracy gains after rewriting, and DLT scores correlate strongly with failure rates across multiple benchmarks. The work demonstrates that structural misalignment, not conceptual difficulty, is the primary bottleneck, and establishes syntax-aware intervention as a systematic approach to diagnosing and mitigating mathematical reasoning failures in LLMs.
📝 Abstract
Large Language Models (LLMs) demonstrate strong mathematical problem-solving abilities but frequently fail on problems that deviate syntactically from their training distribution. We identify a systematic failure mode, syntactic blind spots, in which models misapply familiar reasoning strategies to problems that are semantically straightforward but phrased in unfamiliar ways. These errors are not due to gaps in mathematical competence, but rather reflect a brittle coupling between surface form and internal representation. To test this, we rephrase incorrectly answered questions using syntactic templates drawn from correct examples. These rephrasings, which preserve semantics while reducing structural complexity, often lead to correct answers. We quantify syntactic complexity using a metric based on Dependency Locality Theory (DLT), and show that higher DLT scores are associated with increased failure rates across multiple datasets. Our findings suggest that many reasoning errors stem from structural misalignment rather than conceptual difficulty, and that syntax-aware interventions can reveal and mitigate these inductive failures.
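The abstract does not spell out how the DLT-based complexity metric is computed. A common simple proxy in the DLT literature is to score a sentence by the linear distance each dependent must span to reach its syntactic head, so that long-distance and center-embedded attachments cost more. The sketch below illustrates that idea on hand-built toy parses; the head-index arrays and the exact cost function are assumptions for illustration, not the paper's actual metric.

```python
def dlt_cost(heads):
    """DLT-style integration-cost proxy (assumed, not the paper's exact metric).

    heads[i] is the index of token i's syntactic head; the root points
    to itself and contributes no cost. Cost = sum of linear
    head-dependent distances, so long-distance attachments score higher.
    """
    return sum(abs(i - h) for i, h in enumerate(heads) if h != i)

# Toy 6-token parses (head indices are illustrative, not real parser output):
local = [1, 1, 1, 4, 1, 4]    # mostly short, local attachments; root at 1
distant = [5, 5, 5, 5, 5, 5]  # every token attaches to a far head; root at 5

print(dlt_cost(local))    # → 7
print(dlt_cost(distant))  # → 15
```

Under this proxy, a semantics-preserving rewrite that shortens dependencies (e.g. unwinding a center embedding into two clauses) lowers the score, matching the paper's claim that lower-complexity phrasings recover correct answers.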