🤖 AI Summary
Cross-cultural translation frequently suffers from stylistic misalignment due to cultural disparities, particularly the loss of pragmatic features such as politeness markers and honorifics. To address this, we propose RASTA—a novel culture-aware translation framework that integrates retrieval-augmented generation (RAG) with explicit stylistic concept learning. RASTA retrieves culturally appropriate stylistic exemplars to construct style-informed prompts, thereby guiding large language models to faithfully reproduce source-language cultural norms in target-language outputs. This approach effectively mitigates the stylistic neutralization bias prevalent in general-purpose LLMs when translating non-Western languages. Empirical evaluation across multiple language pairs demonstrates significant improvements in the accuracy of key stylistic dimensions—including politeness level and honorific usage—while ensuring interpretability and scalability. RASTA thus establishes a new, explainable, and extensible paradigm for pragmatic alignment in machine translation.
📝 Abstract
Successful communication depends on the speaker's intended style (i.e., what the speaker is trying to convey) aligning with the listener's interpreted style (i.e., what the listener perceives). However, cultural differences often lead to misalignment between the two; for example, politeness is often lost in translation. We characterize the ways that LLMs fail to translate style: biasing translations toward neutrality and performing worse in non-Western languages. We mitigate these failures with RASTA (Retrieval-Augmented STylistic Alignment), a method that leverages learned stylistic concepts to encourage LLM translations to appropriately convey cultural communication norms and align style.
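The summary describes RASTA as retrieving culturally appropriate stylistic exemplars and assembling them into style-informed prompts for the translating LLM. A minimal sketch of that retrieve-then-prompt pattern might look like the following; all names, the toy exemplar store, and the exact-match retrieval (a stand-in for the paper's learned stylistic concepts) are illustrative assumptions, not the authors' implementation:

```python
from dataclasses import dataclass

@dataclass
class Exemplar:
    """One translation pair annotated with a style label."""
    source: str
    target: str
    language: str
    politeness: str  # e.g. "plain", "polite", "honorific"

# Tiny illustrative store; a real system would hold many annotated pairs.
EXEMPLARS = [
    Exemplar("Please wait a moment.", "Shou-shou omachi kudasai.",
             "Japanese", "honorific"),
    Exemplar("Wait a sec.", "Chotto matte.", "Japanese", "plain"),
]

def retrieve_exemplars(store, language, politeness, k=2):
    """Exact-match retrieval on the style label (a simplification of
    similarity search over learned stylistic concepts)."""
    return [e for e in store
            if e.language == language and e.politeness == politeness][:k]

def build_style_prompt(sentence, language, politeness, store):
    """Assemble a style-informed few-shot prompt for an LLM translator."""
    lines = [f"Translate into {language}, preserving a {politeness} register."]
    for e in retrieve_exemplars(store, language, politeness):
        lines.append(f"Source: {e.source}\nTarget: {e.target}")
    lines.append(f"Source: {sentence}\nTarget:")
    return "\n\n".join(lines)
```

The resulting prompt pairs the target register with in-register exemplars, so the model is steered toward, say, honorific forms rather than a neutral default.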