Evaluating In-Context Translation with Synchronous Context-Free Grammar Transduction

📅 2026-04-08
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study investigates the capacity of large language models to perform translation in low-resource settings using only in-context descriptions of a language's grammar. To this end, it introduces a controlled evaluation framework based on pairs of formal languages generated by synchronous context-free grammars, which systematically emulate syntactic, morphological, and orthographic divergences observed in natural languages. Experimental results show that translation accuracy declines markedly as grammar complexity and sentence length increase, and that morphological and orthographic disparities between the source and target languages further degrade performance. Common failure modes include recalling the wrong target-language words, hallucinating new words, and leaving source-language segments untranslated. This work establishes a formal analytical framework for assessing the generalization capabilities and inherent limitations of in-context translation.
📝 Abstract
Low-resource languages pose a challenge for machine translation with large language models (LLMs), which require large amounts of training data. One potential way to circumvent this data dependence is to rely on LLMs' ability to use in-context descriptions of languages, like textbooks and dictionaries. To do so, LLMs must be able to infer the link between the languages' grammatical descriptions and the sentences in question. Here we isolate this skill using a formal analogue of the task: string transduction based on a formal grammar provided in-context. We construct synchronous context-free grammars which define pairs of formal languages designed to model particular aspects of natural language grammar, morphology, and written representation. Using these grammars, we measure how well LLMs can translate sentences from one formal language into another when given both the grammar and the source-language sentence. We vary the size of the grammar, the lengths of the sentences, the syntactic and morphological properties of the languages, and their written script. We note three key findings. First, LLMs' translation accuracy decreases markedly as a function of grammar size and sentence length. Second, differences in morphology and written representation between the source and target languages can strongly diminish model performance. Third, we examine the types of errors committed by models and find they are most prone to recall the wrong words from the target language vocabulary, hallucinate new words, or leave source-language words untranslated.
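The core device of the paper, a synchronous context-free grammar, can be illustrated with a minimal sketch. This is a hypothetical toy grammar, not one of the paper's actual grammars: each rule pairs a source-side and a target-side right-hand side, nonterminals shared by both sides are expanded with the same sub-derivation, and the target side may order them differently (here the toy target language is verb-final, modeling the kind of syntactic divergence the paper varies).

```python
import random

# Toy synchronous CFG (hypothetical example). Each rule is a pair
# (source RHS, target RHS); both sides share the same nonterminals,
# but the target side reorders them: source is V-NP, target is NP-V.
RULES = {
    "S":  [(["NP", "VP"], ["NP", "VP"])],
    "VP": [(["V", "NP"], ["NP", "V"])],   # verb-final target language
    "NP": [(["dax"], ["mo"]), (["blick"], ["ti"])],
    "V":  [(["zup"], ["ka"]), (["fep"], ["ru"])],
}

def derive(symbol, rng):
    """Synchronously expand `symbol`, returning (source_words, target_words)."""
    src_rhs, tgt_rhs = rng.choice(RULES[symbol])
    # Expand each shared nonterminal once so both sides reuse the same choice.
    subs = {sym: derive(sym, rng) for sym in src_rhs if sym in RULES}
    src = [w for sym in src_rhs for w in (subs[sym][0] if sym in subs else [sym])]
    tgt = [w for sym in tgt_rhs for w in (subs[sym][1] if sym in subs else [sym])]
    return src, tgt

rng = random.Random(0)
src, tgt = derive("S", rng)
print("source:", " ".join(src))   # e.g. an SVO sentence like "dax zup blick"
print("target:", " ".join(tgt))   # its SOV translation in the paired language
```

Generating paired sentences this way yields a gold-standard translation for every source string, which is what lets the paper score an LLM's in-context transduction exactly: the model is shown the grammar and the source sentence, and its output is compared against the grammar's own target side.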
Problem

Research questions and friction points this paper is trying to address.

in-context learning
machine translation
low-resource languages
synchronous context-free grammar
string transduction
Innovation

Methods, ideas, or system contributions that make the work stand out.

in-context learning
synchronous context-free grammar
string transduction
low-resource machine translation
large language models