AI Summary
This study addresses the challenge of preserving original text styling when translating the text in graphic designs for global audiences, a task that depends on high-accuracy word alignment between the source and the translated text. The work explores three methods built on commercially available translation technologies: neural machine translation (NMT) with custom input and output tags for text styling, a large language model (LLM) with custom input and output tags, and a hybrid that uses NMT for translation followed by an LLM with unigram mappings. The alignments produced by these methods are compared against an attention-head baseline to gauge their usability in graphic design applications. The attention-head baseline proves more accurate than either the standalone LLM or the standalone NMT approach, and on par with the hybrid NMT+LLM approach.
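As a rough illustration of the tag-based idea, the sketch below wraps each run of identically styled source words in numbered tags before translation and reads the styled spans back from the tagged output. The tag format (`<s1>…</s1>`), the function names, and the style labels are assumptions made for this example. The summary does not specify the actual tag scheme, and a real translation engine may reorder, drop, or alter tags.

```python
import re

def wrap_styled_spans(words, styles):
    """Wrap each maximal run of identically styled words in a numbered tag.

    `styles[i]` is a hypothetical style label for `words[i]`, or None
    for unstyled text. Returns the tagged string and a tag-id -> style legend.
    """
    pieces, legend = [], {}
    i, tag_id = 0, 0
    while i < len(words):
        style = styles[i]
        j = i
        while j < len(words) and styles[j] == style:
            j += 1
        chunk = " ".join(words[i:j])
        if style is None:
            pieces.append(chunk)
        else:
            tag_id += 1
            legend[tag_id] = style
            pieces.append(f"<s{tag_id}>{chunk}</s{tag_id}>")
        i = j
    return " ".join(pieces), legend

# Matches a numbered tag pair and captures its id and enclosed text.
TAG_RE = re.compile(r"<s(\d+)>(.*?)</s\1>")

def read_styled_spans(tagged_translation, legend):
    """Recover (text, style) pairs from a translation that kept the tags."""
    return [(m.group(2), legend[int(m.group(1))])
            for m in TAG_RE.finditer(tagged_translation)]

tagged, legend = wrap_styled_spans(["Big", "summer", "sale"], ["bold", "bold", None])
# A translation engine (not shown) would receive `tagged`; the string below
# is a hand-written stand-in for its output.
spans = read_styled_spans("<s1>Grandes soldes</s1> d'été", legend)
```

Using numbered tags rather than style names keeps the markup short and lets the legend hold arbitrary style attributes outside the text sent for translation.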
Abstract
Globalization of graphic designs, such as those used in marketing materials and magazines, is increasingly important for communicating with broad audiences. To accomplish this, the textual content in a graphic design needs to be translated accurately, and its text styling must be preserved so that the translation fits visually into the design. Preserving text styling requires high-accuracy word alignment between the original and the translated text. The problem of word alignment between source and translated text has long been known. The industry standards for extracting word alignments are defined by GIZA++ and by attention probabilities from neural machine translation (NMT) models. In this paper, we explore three new methods to tackle the word alignment problem for transferring text styles from the source to the translated text. The proposed methods are developed on top of commercially available NMT and LLM translation technologies. They include: NMT with custom input and output tags for text styling; an LLM with custom input and output tags; and a hybrid that uses NMT for translation followed by an LLM using unigram mappings. To analyze the performance of these solutions, their alignment results are compared with those of an attention-head approach to gauge their usability in graphic design applications. Interestingly, the strong attention-head baseline proves more accurate than either the LLM or the NMT approach and on par with the hybrid NMT+LLM approach.
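To make the link between word alignment and style transfer concrete, here is a minimal sketch of the general attention-based idea: each target word is aligned to the source word it attends to most, and it inherits that word's style. The function names, the toy attention matrix, and the style labels are assumptions for illustration, not the paper's implementation. Production systems typically work on subword tokens and select specific attention heads or layers, which this sketch omits.

```python
def align_from_attention(attention):
    """Align each target token to the source token with the highest weight.

    `attention[t][s]` is the (hypothetical) weight that target token t
    places on source token s. Returns one source index per target token.
    """
    return [max(range(len(row)), key=row.__getitem__) for row in attention]

def transfer_styles(source_styles, alignment):
    """Give each target token the style of its aligned source token."""
    return [source_styles[s] for s in alignment]

# Toy example: 3 source words, 3 target words, with a word-order swap.
source_styles = ["bold", None, "italic"]
attention = [
    [0.1, 0.8, 0.1],  # target word 0 attends mostly to source word 1
    [0.7, 0.2, 0.1],  # target word 1 attends mostly to source word 0
    [0.1, 0.1, 0.8],  # target word 2 attends mostly to source word 2
]
alignment = align_from_attention(attention)
target_styles = transfer_styles(source_styles, alignment)
```

Here the bold styling follows the first source word to the second position in the translation, which is why a single misaligned word can visibly misplace styling in the rendered design.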