🤖 AI Summary
Large language models (LLMs) struggle with context-sensitive units (CSUs)—e.g., polysemous words—in machine translation, often yielding locally inaccurate translations and globally inconsistent semantics. To address this, we propose **Dynamic Focusing Anchoring (DFA)**, a prompt-based, zero-shot, fine-tuning-free mechanism: it first dynamically identifies CSU translation challenges, then structurally injects semantic focus and activates context-aware knowledge. DFA is the first method to enable real-time, interpretable semantic focusing specifically for CSUs—ensuring both local translation accuracy and global semantic coherence. Experiments across multilingual MT benchmarks demonstrate that DFA significantly outperforms mainstream open-source baselines, especially in polysemy resolution and low-resource or distant-language-pair translation. Moreover, DFA exhibits strong cross-task generalization, enhancing performance on diverse NLP tasks beyond MT.
📝 Abstract
Large language models (LLMs) have demonstrated exceptional performance across multiple cross-lingual NLP tasks, including machine translation (MT). However, persistent challenges remain in handling context-sensitive units (CSUs), such as polysemous words. These CSUs not only reduce LLMs' local translation accuracy but also impair their understanding of sentences and tasks, and can even lead to translation failure. To address this problem, we propose a simple but effective method that enhances LLMs' MT capabilities by identifying CSUs and applying semantic focus to them. Specifically, we dynamically analyze and identify translation challenges, then incorporate them into the LLM in a structured manner to mitigate the mistranslations or misunderstandings of CSUs caused by information flattening. In this way, the method efficiently activates the LLM to retrieve and apply relevant knowledge from its vast training data, yielding more accurate translations of difficult terms. On a benchmark MT dataset, the proposed method achieves competitive performance compared with multiple existing open-source MT baselines, and it proves effective and robust across multiple language pairs, both similar and distant. Notably, the method requires no additional model training and improves LLMs' performance across multiple NLP tasks with minimal resource consumption.
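Since DFA is described as a prompt-based, fine-tuning-free mechanism, its two stages (identify CSU translation challenges, then inject semantic focus into a structured translation prompt) can be pictured as a small prompting pipeline. The sketch below is only an illustration under that reading, not the authors' released prompts: the `call_llm` helper, the function names, and the prompt wording are all assumptions standing in for whatever chat-model backend and templates are actually used.

```python
import json


def call_llm(prompt: str) -> str:
    """Hypothetical placeholder for any chat-capable LLM backend;
    swap in your preferred API client. Not part of the paper."""
    raise NotImplementedError


def identify_csus(source: str, src_lang: str, tgt_lang: str) -> list[dict]:
    """Stage 1 (illustrative): ask the model to flag context-sensitive units
    (e.g. polysemous or idiomatic expressions) and the sense each takes here."""
    prompt = (
        f"List the words or phrases in the following {src_lang} sentence that are "
        f"context-sensitive (polysemous, ambiguous, or idiomatic) when translating "
        f"into {tgt_lang}. For each, give its intended sense in this context.\n"
        f'Answer as a JSON list of {{"unit": ..., "sense": ...}} objects.\n\n'
        f"Sentence: {source}"
    )
    # A real implementation would parse the reply defensively; the model may
    # wrap the JSON in extra prose.
    return json.loads(call_llm(prompt))


def translate_with_focus(source: str, src_lang: str, tgt_lang: str) -> str:
    """Stage 2 (illustrative): inject the identified CSUs back into a structured
    translation prompt so the model keeps semantic focus on them."""
    csus = identify_csus(source, src_lang, tgt_lang)
    focus_block = "\n".join(
        f'- "{c["unit"]}": translate according to the sense "{c["sense"]}"'
        for c in csus
    )
    prompt = (
        f"Translate the following {src_lang} sentence into {tgt_lang}.\n"
        f"Pay particular attention to these context-sensitive units:\n"
        f"{focus_block}\n\n"
        f"Sentence: {source}\n"
        f"Translation:"
    )
    return call_llm(prompt)
```

Because both stages are plain prompts, this kind of pipeline needs no parameter updates, which matches the paper's claim of zero additional training and minimal resource consumption.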