Investigating Numerical Translation with Large Language Models

📅 2025-01-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study systematically exposes critical reliability deficiencies in large language models (LLMs) when translating large-unit numerals (e.g., million, billion, and the Chinese unit yi, 100 million) between Chinese and English. Addressing a gap in prior work, it introduces the first bilingual numerical translation benchmark covering ten realistic business scenarios, accompanied by multi-dimensional error analysis and model behavioral diagnostics. Experiments reveal mistranslation rates as high as 20% among mainstream open-source LLMs (e.g., Llama3.1-8B). The authors propose three targeted mitigation strategies: explicit unit annotation, stepwise numeric parsing, and context-constrained fine-tuning, each of which demonstrably reduces error rates. This work fills a fundamental gap in robustness research on numeral translation and delivers a reproducible evaluation framework alongside practical, implementable optimization pathways for high-fidelity cross-lingual numerical conversion.
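The page does not describe how explicit unit annotation is implemented; one plausible reading is a preprocessing pass that appends the literal digit value after each large-unit numeral before the text reaches the model. A minimal sketch under that assumption (the function name, annotation format, and unit table are illustrative, not the paper's):

```python
import re

# Scale of each Chinese large unit (assumed mapping; the paper's
# exact annotation scheme is not specified on this page).
UNITS = {"万": 10_000, "亿": 100_000_000}

def annotate_units(text: str) -> str:
    """Append the literal value after each large-unit numeral,
    e.g. '3.2亿' -> '3.2亿 (=320,000,000)'."""
    def repl(m: re.Match) -> str:
        value = float(m.group(1)) * UNITS[m.group(2)]
        return f"{m.group(0)} (={int(value):,})"
    return re.sub(r"(\d+(?:\.\d+)?)([万亿])", repl, text)

print(annotate_units("该公司营收达3.2亿元"))  # 该公司营收达3.2亿 (=320,000,000)元
```

The idea is that spelling out the digits removes the unit-conversion step the model otherwise has to perform implicitly during translation.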

📝 Abstract
The inaccurate translation of numbers can lead to significant security issues, ranging from financial setbacks to medical inaccuracies. While large language models (LLMs) have made significant advancements in machine translation, their capacity for translating numbers has not been thoroughly explored. This study focuses on evaluating the reliability of LLM-based machine translation systems when handling numerical data. To systematically test the numerical translation capabilities of current open-source LLMs, we have constructed a Chinese-English numerical translation dataset based on real business data, encompassing ten types of numerical translation. Experiments on the dataset indicate that errors in numerical translation are a common issue, with most open-source LLMs faltering when faced with our test scenarios. Especially for numerical types involving large units like "million", "billion", and "yi" (100 million), even the latest Llama3.1-8B model can have error rates as high as 20%. Finally, we introduce three potential strategies to mitigate numerical mistranslations for large units.
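A structural reason these large units are error-prone is that English groups digits by thousands (thousand, million, billion) while Chinese groups by ten-thousands (万 = 10^4, 亿 = 10^8), so "yi" has no single-word English counterpart: 1 yi is 100 million, but 10 yi is 1 billion. A minimal sketch of the required conversion (illustrative only, not the paper's method):

```python
# English large units by scale, largest first.
EN_UNITS = [(10**9, "billion"), (10**6, "million"), (10**3, "thousand")]

def yi_to_english(count: float) -> str:
    """Render a quantity given in 亿 (units of 10^8) with English units."""
    value = count * 10**8
    for scale, name in EN_UNITS:
        if value >= scale:
            return f"{value / scale:g} {name}"
    return f"{value:g}"

print(yi_to_english(1))   # 100 million
print(yi_to_english(12))  # 1.2 billion
```

The crossing of the 10^9 boundary between the two calls is exactly the kind of regrouping a translator, human or model, must get right.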
Problem

Research questions and friction points this paper is trying to address.

Large Language Models
Numeric Translation Accuracy
Error Prevention
Innovation

Methods, ideas, or system contributions that make the work stand out.

Large Language Models
Numerical Translation Accuracy
Error Reduction Techniques
Wei Tang
Huawei Translation Services Center, China
Jiawei Yu
Xiamen University
Speech, Natural Language Processing
Yuang Li
2012 Lab, Huawei
Speech, NLP
Yanqing Zhao
Huawei
AI, MT
Weidong Zhang
Samsung Research America
Computer Vision, Image Processing
Wei Feng
Huawei Translation Services Center, China
Min Zhang
Huawei Translation Services Center, China
Hao Yang
Huawei Translation Services Center, China