Towards Reliable Machine Translation: Scaling LLMs for Critical Error Detection and Safety

📅 2026-02-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work frames critical error detection as a core mechanism for responsible multilingual AI, targeting severe semantic errors in machine translation, such as factual distortions, intent reversals, and biased renderings, that undermine the reliability and fairness of multilingual systems. Leveraging instruction-tuned large language models (LLMs), the approach improves detection through zero-shot, few-shot, and fine-tuning adaptation strategies. Experiments show consistent gains over encoder-only baselines such as XLM-R and ModernBERT on public benchmarks, reducing the risk of misinformation and linguistic harm in high-stakes scenarios. The study thus offers a pathway toward safer, more trustworthy multilingual information systems.
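The zero-shot adaptation strategy mentioned above can be sketched as a simple prompt-and-parse step: build a classification prompt for a source/translation pair and map the model's reply onto a binary label. The prompt wording, the ERR/NOT label set, and the function names below are illustrative assumptions, not the paper's actual templates:

```python
# Hedged sketch of zero-shot critical error detection with an
# instruction-tuned LLM. No model is called here; the sketch covers
# prompt construction and output parsing only.

def build_zero_shot_prompt(source: str, translation: str) -> str:
    """Build a zero-shot classification prompt for one source/translation pair."""
    return (
        "You are a translation quality auditor. A critical error is a severe "
        "meaning error such as a factual distortion, intent reversal, or "
        "biased rendering.\n"
        f"Source: {source}\n"
        f"Translation: {translation}\n"
        "Does the translation contain a critical error? Answer ERR or NOT."
    )

def parse_label(model_output: str) -> str:
    """Map a free-form model reply onto the binary label set.

    Assumes the model leads with its verdict, as instruction-tuned
    models typically do when asked for a one-word answer.
    """
    return "ERR" if model_output.strip().upper().startswith("ERR") else "NOT"

# Example pair exhibiting an intent reversal (a critical error):
prompt = build_zero_shot_prompt(
    "Do not take this medication with alcohol.",
    "Take this medication with alcohol.",
)
```

A few-shot variant would simply prepend labeled source/translation examples to the same prompt before the query pair.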

📝 Abstract
Machine Translation (MT) plays a pivotal role in cross-lingual information access, public policy communication, and equitable knowledge dissemination. However, critical meaning errors, such as factual distortions, intent reversals, or biased translations, can undermine the reliability, fairness, and safety of multilingual systems. In this work, we explore the capacity of instruction-tuned Large Language Models (LLMs) to detect such critical errors, evaluating models across a range of parameter scales on publicly available datasets. Our findings show that model scaling and adaptation strategies (zero-shot, few-shot, fine-tuning) yield consistent improvements, outperforming encoder-only baselines like XLM-R and ModernBERT. We argue that improving critical error detection in MT contributes to safer, more trustworthy, and socially accountable information systems by reducing the risk of disinformation, miscommunication, and linguistic harm, especially in high-stakes or underrepresented contexts. This work positions error detection not merely as a technical challenge, but as a necessary safeguard in the pursuit of just and responsible multilingual AI. The code will be made available on GitHub.
Problem

Research questions and friction points this paper is trying to address.

Machine Translation
Critical Error Detection
Reliability
Safety
Bias
Innovation

Methods, ideas, or system contributions that make the work stand out.

Large Language Models
Critical Error Detection
Machine Translation Safety
Model Scaling
Instruction Tuning