🤖 AI Summary
Large language models (LLMs) can articulate grammatical rules but often fail to apply them when judging sentence acceptability. To address this, we propose *grammar prompting*, an explain-then-process paradigm: an LLM first generates a concise metalinguistic explanation of a targeted grammatical phenomenon; this explanation is then provided as contextual input to a target model -- either an LLM or a small language model (SLM) -- which judges which sentence of a minimal pair is grammatical. The mechanism converts metalinguistic explanation into discriminative behavior while requiring no fine-tuning, additional parameters, or architectural modification, and it is fully compatible with chain-of-thought (CoT) reasoning. Evaluated on English BLiMP, Chinese SLING, and Russian RuBLiMP, the approach substantially outperforms strong baselines: it narrows the average accuracy gap between SLMs and LLMs by about 20%, and when combined with CoT, by 56% (from 13.0 to 5.8 percentage points). The method incurs negligible computational overhead and generalizes across languages.
📝 Abstract
Large language models (LLMs) can explain grammatical rules, yet they often fail to apply those rules when judging sentence acceptability. We present "grammar prompting", an explain-then-process paradigm: a large LLM first produces a concise explanation of the relevant syntactic phenomenon; that explanation is then fed back as additional context to the target model -- either an LLM or a smaller language model (SLM) -- before it decides which sentence of a minimal pair is grammatical. On the English BLiMP, Chinese SLING, and Russian RuBLiMP benchmarks, this simple prompt design yields substantial improvements over strong baselines across many syntactic phenomena. Feeding an LLM's metalinguistic explanation back to the target model bridges the gap between knowing a rule and using it. On SLMs, grammar prompting alone trims the average LLM-SLM accuracy gap by about 20%, and when paired with chain-of-thought, by 56% (13.0 pp -> 5.8 pp), all at negligible cost. This lightweight, language-agnostic cue lets low-cost SLMs approach frontier-LLM performance in multilingual settings.
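The two-stage pipeline described above can be sketched as a pair of prompt builders. This is a minimal illustrative sketch, not the paper's exact prompts: the function names, wording, and the answer format (A/B) are assumptions, and the strings returned would be sent to a large LLM (stage 1) and to the target LLM or SLM (stage 2) via whatever API the practitioner uses.

```python
def build_explanation_prompt(phenomenon: str) -> str:
    """Stage 1: ask a large LLM for a concise metalinguistic explanation
    of the targeted grammatical phenomenon (hypothetical wording)."""
    return (
        f"In two or three sentences, explain the grammatical rule "
        f"governing {phenomenon}."
    )


def build_judgment_prompt(explanation: str, sent_a: str, sent_b: str) -> str:
    """Stage 2: feed the explanation back as context to the target model
    (LLM or SLM) before the minimal-pair acceptability judgment."""
    return (
        f"Grammar note: {explanation}\n\n"
        "Which of the following sentences is grammatical?\n"
        f"(A) {sent_a}\n"
        f"(B) {sent_b}\n"
        "Answer with A or B."
    )


# Example with a BLiMP-style anaphor-agreement minimal pair.
stage1 = build_explanation_prompt("anaphor agreement")
# ... send `stage1` to a large LLM; suppose it returns:
explanation = "A reflexive pronoun must agree with its antecedent in number."
stage2 = build_judgment_prompt(
    explanation,
    "The girl praised herself.",
    "The girl praised themselves.",
)
# ... send `stage2` to the target model and read off its A/B answer.
```

The key design point is that stage 2 adds only a short textual cue to the target model's context, which is why the method carries negligible overhead and transfers to SLMs and other languages without any retraining.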