MAS-LitEval : Multi-Agent System for Literary Translation Quality Assessment

📅 2025-06-17
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Traditional translation evaluation metrics (e.g., BLEU, METEOR) rely on lexical overlap and thus fail to capture cultural adaptation, narrative coherence, and stylistic fidelity—key dimensions in literary translation. To address this limitation, we propose the first multi-agent quality assessment framework specifically designed for literary translation. Leveraging large language models, our framework orchestrates three specialized agents—Terminology Expert, Narrative Analyst, and Stylistic Critic—via structured prompt engineering and consensus-based aggregation, enabling interpretable, fine-grained modeling of cultural context and aesthetic features. Crucially, it moves beyond the lexical-overlap paradigm. Evaluated on multiple translations of *The Little Prince* and *A Connecticut Yankee in King Arthur’s Court*, our method achieves a Pearson correlation of 0.890 with human judgments—significantly outperforming state-of-the-art baselines. This work establishes a novel, principled paradigm for automated literary translation evaluation.
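The orchestration described above—three specialized agents scoring a translation independently, then combined by consensus—can be sketched as follows. This is an illustrative reconstruction, not the paper's implementation: the prompt wordings, the `query_agent` stub (a length-ratio heuristic standing in for a real LLM call), and the mean-based consensus rule are all assumptions.

```python
from statistics import mean

# Hypothetical prompts for the three agent roles named in the paper:
# Terminology Expert, Narrative Analyst, and Stylistic Critic.
AGENT_PROMPTS = {
    "terminology_expert": (
        "Rate 0-1 how faithfully culture-specific terms and proper "
        "names are rendered in the translation."
    ),
    "narrative_analyst": (
        "Rate 0-1 how well the translation preserves narrative "
        "coherence across the passage."
    ),
    "stylistic_critic": (
        "Rate 0-1 how closely the translation matches the source's "
        "register, tone, and style."
    ),
}

def query_agent(role: str, source: str, translation: str) -> float:
    """Placeholder for an LLM call: a real system would send the role's
    prompt plus the text pair to a model and parse a numeric score.
    Here, a length-ratio heuristic lets the sketch run offline."""
    ratio = min(len(source), len(translation)) / max(len(source), len(translation), 1)
    return round(ratio, 3)

def evaluate(source: str, translation: str) -> dict:
    """Collect one score per agent, then aggregate by simple consensus
    (here: the arithmetic mean of the per-agent scores)."""
    scores = {role: query_agent(role, source, translation)
              for role in AGENT_PROMPTS}
    scores["consensus"] = round(mean(scores.values()), 3)
    return scores
```

A real deployment would replace `query_agent` with a chat-model call per role and might weight agents differently or iterate until the agents converge; the consensus step shown here is the simplest plausible aggregation.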

📝 Abstract
Literary translation requires preserving cultural nuances and stylistic elements, which traditional metrics like BLEU and METEOR fail to assess due to their focus on lexical overlap. This oversight neglects the narrative consistency and stylistic fidelity that are crucial for literary works. To address this, we propose MAS-LitEval, a multi-agent system using Large Language Models (LLMs) to evaluate translations based on terminology, narrative, and style. We tested MAS-LitEval on translations of The Little Prince and A Connecticut Yankee in King Arthur's Court, generated by various LLMs, and compared it to traditional metrics. MAS-LitEval outperformed these metrics, with top models reaching a correlation of up to 0.890 with human judgments in capturing literary nuances. This work introduces a scalable, nuanced framework for Translation Quality Assessment (TQA), offering a practical tool for translators and researchers.
Problem

Research questions and friction points this paper is trying to address.

Assessing literary translation quality beyond lexical overlap
Evaluating narrative consistency and stylistic fidelity in translations
Providing a scalable framework for nuanced translation assessment
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-agent system using LLMs for translation assessment
Evaluates terminology, narrative, and style fidelity
Outperforms traditional metrics like BLEU and METEOR
Junghwan Kim
Seoul National University, Seoul, Republic of Korea
Kieun Park
Seoul National University, Seoul, Republic of Korea
Sohee Park
Infiniction, Seoul, Republic of Korea
Hyunggug Kim
Infiniction, Seoul, Republic of Korea
Bongwon Suh
Seoul National University
Human Computer Interaction · Social Computing · Data Analytics