Evaluating In-Context Translation with Synchronous Context-Free Grammar Transduction

📅 2026-04-08
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study investigates the capacity of large language models to perform translation in low-resource settings using only in-context descriptions of a language's grammar. To this end, it introduces a controlled evaluation framework based on pairs of formal languages generated by synchronous context-free grammars, which systematically emulate syntactic, morphological, and orthographic divergences observed in natural languages. Experimental results show that translation accuracy declines markedly as grammar complexity and sentence length increase, and that morphological and orthographic disparities between the source and target languages further degrade performance. Common failure modes include recalling the wrong target-language words, hallucinating new words, and leaving source-language segments untranslated. This work establishes a formal analytical framework for assessing the generalization capabilities and inherent limitations of in-context translation.
📝 Abstract
Low-resource languages pose a challenge for machine translation with large language models (LLMs), which require large amounts of training data. One potential way to circumvent this data dependence is to rely on LLMs' ability to use in-context descriptions of languages, like textbooks and dictionaries. To do so, LLMs must be able to infer the link between the languages' grammatical descriptions and the sentences in question. Here we isolate this skill using a formal analogue of the task: string transduction based on a formal grammar provided in-context. We construct synchronous context-free grammars which define pairs of formal languages designed to model particular aspects of natural language grammar, morphology, and written representation. Using these grammars, we measure how well LLMs can translate sentences from one formal language into another when given both the grammar and the source-language sentence. We vary the size of the grammar, the lengths of the sentences, the syntactic and morphological properties of the languages, and their written script. We note three key findings. First, LLMs' translation accuracy decreases markedly as a function of grammar size and sentence length. Second, differences in morphology and written representation between the source and target languages can strongly diminish model performance. Third, we examine the types of errors committed by models and find they are most prone to recall the wrong words from the target language vocabulary, hallucinate new words, or leave source-language words untranslated.
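The core device of the paper, a synchronous context-free grammar, can be illustrated with a minimal sketch. This is a hypothetical toy grammar, not one of the paper's actual grammars: each rule pairs a source-side and a target-side right-hand side, nonterminals shared by both sides are expanded with the same sub-derivation, and the target side may order them differently (here the toy target language is verb-final, modeling the kind of syntactic divergence the paper varies).

```python
import random

# Toy synchronous CFG (hypothetical example). Each rule is a pair
# (source RHS, target RHS); both sides share the same nonterminals,
# but the target side reorders them: source is V-NP, target is NP-V.
RULES = {
    "S":  [(["NP", "VP"], ["NP", "VP"])],
    "VP": [(["V", "NP"], ["NP", "V"])],   # verb-final target language
    "NP": [(["dax"], ["mo"]), (["blick"], ["ti"])],
    "V":  [(["zup"], ["ka"]), (["fep"], ["ru"])],
}

def derive(symbol, rng):
    """Synchronously expand `symbol`, returning (source_words, target_words)."""
    src_rhs, tgt_rhs = rng.choice(RULES[symbol])
    # Expand each shared nonterminal once so both sides reuse the same choice.
    subs = {sym: derive(sym, rng) for sym in src_rhs if sym in RULES}
    src = [w for sym in src_rhs for w in (subs[sym][0] if sym in subs else [sym])]
    tgt = [w for sym in tgt_rhs for w in (subs[sym][1] if sym in subs else [sym])]
    return src, tgt

rng = random.Random(0)
src, tgt = derive("S", rng)
print("source:", " ".join(src))   # e.g. an SVO sentence like "dax zup blick"
print("target:", " ".join(tgt))   # its SOV translation in the paired language
```

Generating paired sentences this way yields a gold-standard translation for every source string, which is what lets the paper score an LLM's in-context transduction exactly: the model is shown the grammar and the source sentence, and its output is compared against the grammar's own target side.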
Problem

Research questions and friction points this paper is trying to address.

in-context learning
machine translation
low-resource languages
synchronous context-free grammar
string transduction
Innovation

Methods, ideas, or system contributions that make the work stand out.

in-context learning
synchronous context-free grammar
string transduction
low-resource machine translation
large language models